From patchwork Wed Sep 19 07:21:54 2018
X-Patchwork-Submitter: Yongseok Koh
X-Patchwork-Id: 44898
X-Patchwork-Delegate: shahafs@mellanox.com
From: Yongseok Koh
To: Shahaf Shuler
CC: "dev@dpdk.org", Yongseok Koh
Thread-Topic: [PATCH 1/3] net/mlx5: add abstraction for multiple flow drivers
Date: Wed, 19 Sep 2018 07:21:54 +0000
Message-ID: <20180919072143.23211-2-yskoh@mellanox.com>
References: <20180919072143.23211-1-yskoh@mellanox.com>
In-Reply-To: <20180919072143.23211-1-yskoh@mellanox.com>
Subject: [dpdk-dev] [PATCH 1/3] net/mlx5: add abstraction for multiple flow drivers
List-Id: DPDK patches and discussions

The flow engine has to support multiple driver paths: Verbs/DV for NIC flow steering and Linux TC flower for E-Switch flow steering. In the future, another flow driver (devX) could be added.
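The core of the diff below is a vtable-style dispatch: each backend exports a const struct mlx5_flow_driver_ops, a table indexed by enum mlx5_flow_drv_type selects the matching ops, and a null-ops driver guards the MIN/MAX slots so an unset type fails with ENOTSUP instead of crashing. The following minimal sketch illustrates the pattern only; drv_type, drv_ops, ops_tbl and the single-callback signature are simplified stand-ins, not the PMD's actual definitions.

    #include <errno.h>

    /* Simplified driver types, mirroring enum mlx5_flow_drv_type. */
    enum drv_type { TYPE_MIN, TYPE_DV, TYPE_VERBS, TYPE_MAX };

    /* Per-driver ops table; only one callback shown for brevity. */
    struct drv_ops {
            int (*validate)(void *dev);
    };

    /* Null driver: fails safely instead of dereferencing NULL. */
    static int null_validate(void *dev) { (void)dev; return -ENOTSUP; }
    static const struct drv_ops null_ops = { .validate = null_validate };

    static int dv_validate(void *dev) { (void)dev; return 0; }
    static const struct drv_ops dv_ops = { .validate = dv_validate };

    static int verbs_validate(void *dev) { (void)dev; return 0; }
    static const struct drv_ops verbs_ops = { .validate = verbs_validate };

    /* Dispatch table; the MIN/MAX sentinels map to the null driver. */
    static const struct drv_ops *ops_tbl[] = {
            [TYPE_MIN]   = &null_ops,
            [TYPE_DV]    = &dv_ops,
            [TYPE_VERBS] = &verbs_ops,
            [TYPE_MAX]   = &null_ops,
    };

    /* Generic entry point: pick the type per flow, call through the table. */
    static int flow_validate(void *dev, enum drv_type type)
    {
            return ops_tbl[type]->validate(dev);
    }

    int main(void)
    {
            return flow_validate((void *)0, TYPE_DV);
    }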
Signed-off-by: Yongseok Koh --- drivers/net/mlx5/mlx5.c | 1 - drivers/net/mlx5/mlx5_flow.c | 348 +++++++++++++++++++++++++++++++++---- drivers/net/mlx5/mlx5_flow.h | 17 +- drivers/net/mlx5/mlx5_flow_dv.c | 26 +-- drivers/net/mlx5/mlx5_flow_verbs.c | 20 +-- 5 files changed, 335 insertions(+), 77 deletions(-) diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index d5936091b..02324ef4f 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1192,7 +1192,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, if (err < 0) goto error; priv->config.flow_prio = err; - mlx5_flow_init_driver_ops(eth_dev); /* * Once the device is added to the list of memory event * callback, its global MR cache table cannot be expanded diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 677cc7a32..2d3158a6f 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -37,6 +37,23 @@ extern const struct eth_dev_ops mlx5_dev_ops; extern const struct eth_dev_ops mlx5_dev_ops_isolate; +/** Device flow drivers. */ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT +extern const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops; +#endif +extern const struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops; + +const struct mlx5_flow_driver_ops mlx5_flow_null_drv_ops; + +const struct mlx5_flow_driver_ops *flow_drv_ops[] = { + [MLX5_FLOW_TYPE_MIN] = &mlx5_flow_null_drv_ops, +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + [MLX5_FLOW_TYPE_DV] = &mlx5_flow_dv_drv_ops, +#endif + [MLX5_FLOW_TYPE_VERBS] = &mlx5_flow_verbs_drv_ops, + [MLX5_FLOW_TYPE_MAX] = &mlx5_flow_null_drv_ops +}; + enum mlx5_expansion { MLX5_EXPANSION_ROOT, MLX5_EXPANSION_ROOT_OUTER, @@ -282,9 +299,6 @@ static struct mlx5_flow_tunnel_info tunnels_info[] = { }, }; -/* Holds the nic operations that should be used. */ -struct mlx5_flow_driver_ops nic_ops; - /** * Discover the maximum number of priority available. 
* @@ -1511,6 +1525,284 @@ mlx5_flow_validate_item_mpls(const struct rte_flow_item *item __rte_unused, " update."); } +static int +flow_null_validate(struct rte_eth_dev *dev __rte_unused, + const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static struct mlx5_flow * +flow_null_prepare(const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + uint64_t *item_flags __rte_unused, + uint64_t *action_flags __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return NULL; +} + +static int +flow_null_translate(struct rte_eth_dev *dev __rte_unused, + struct mlx5_flow *dev_flow __rte_unused, + const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[] __rte_unused, + const struct rte_flow_action actions[] __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static int +flow_null_apply(struct rte_eth_dev *dev __rte_unused, + struct rte_flow *flow __rte_unused, + struct rte_flow_error *error __rte_unused) +{ + rte_errno = ENOTSUP; + return -rte_errno; +} + +static void +flow_null_remove(struct rte_eth_dev *dev __rte_unused, + struct rte_flow *flow __rte_unused) +{ +} + +static void +flow_null_destroy(struct rte_eth_dev *dev __rte_unused, + struct rte_flow *flow __rte_unused) +{ +} + +/* Void driver to protect from null pointer reference. */ +const struct mlx5_flow_driver_ops mlx5_flow_null_drv_ops = { + .validate = flow_null_validate, + .prepare = flow_null_prepare, + .translate = flow_null_translate, + .apply = flow_null_apply, + .remove = flow_null_remove, + .destroy = flow_null_destroy, +}; + +/** + * Select flow driver type according to flow attributes and device + * configuration. + * + * @param[in] dev + * Pointer to the dev structure. + * @param[in] attr + * Pointer to the flow attributes. + * + * @return + * flow driver type if supported, MLX5_FLOW_TYPE_MAX otherwise. + */ +static enum mlx5_flow_drv_type +flow_get_drv_type(struct rte_eth_dev *dev __rte_unused, + const struct rte_flow_attr *attr) +{ + struct priv *priv __rte_unused = dev->data->dev_private; + enum mlx5_flow_drv_type type = MLX5_FLOW_TYPE_MAX; + + if (!attr->transfer) { +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + type = priv->config.dv_flow_en ? MLX5_FLOW_TYPE_DV : + MLX5_FLOW_TYPE_VERBS; +#else + type = MLX5_FLOW_TYPE_VERBS; +#endif + } + return type; +} + +#define flow_get_drv_ops(type) flow_drv_ops[type] + +/** + * Flow driver validation API. This abstracts calling driver specific functions. + * The type of flow driver is determined according to flow attributes. + * + * @param[in] dev + * Pointer to the dev structure. + * @param[in] attr + * Pointer to the flow attributes. + * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. + * @param[out] error + * Pointer to the error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */ +static inline int +flow_drv_validate(struct rte_eth_dev *dev, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[], + struct rte_flow_error *error) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = flow_get_drv_type(dev, attr); + + fops = flow_get_drv_ops(type); + return fops->validate(dev, attr, items, actions, error); +} + +/** + * Flow driver preparation API. This abstracts calling driver specific + * functions. Parent flow (rte_flow) should have driver type (drv_type). It + * calculates the size of memory required for device flow, allocates the memory, + * initializes the device flow and returns the pointer. + * + * @param[in] attr + * Pointer to the flow attributes. + * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. + * @param[out] item_flags + * Pointer to bit mask of all items detected. + * @param[out] action_flags + * Pointer to bit mask of all actions detected. + * @param[out] error + * Pointer to the error structure. + * + * @return + * Pointer to device flow on success, otherwise NULL and rte_errno is set. + */ +static inline struct mlx5_flow * +flow_drv_prepare(struct rte_flow *flow, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[], + uint64_t *item_flags, + uint64_t *action_flags, + struct rte_flow_error *error) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = flow->drv_type; + + assert(type > MLX5_FLOW_TYPE_MIN && type < MLX5_FLOW_TYPE_MAX); + fops = flow_get_drv_ops(type); + return fops->prepare(attr, items, actions, item_flags, action_flags, + error); +} + +/** + * Flow driver translation API. This abstracts calling driver specific + * functions. Parent flow (rte_flow) should have driver type (drv_type). It + * translates a generic flow into a driver flow. flow_drv_prepare() must + * precede. + * + * @param[in] dev + * Pointer to the rte dev structure. + * @param[in, out] dev_flow + * Pointer to the mlx5 flow. + * @param[in] attr + * Pointer to the flow attributes. + * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. + * @param[out] error + * Pointer to the error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +static inline int +flow_drv_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[], + struct rte_flow_error *error) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = dev_flow->flow->drv_type; + + assert(type > MLX5_FLOW_TYPE_MIN && type < MLX5_FLOW_TYPE_MAX); + fops = flow_get_drv_ops(type); + return fops->translate(dev, dev_flow, attr, items, actions, error); +} + +/** + * Flow driver apply API. This abstracts calling driver specific functions. + * Parent flow (rte_flow) should have driver type (drv_type). It applies + * translated driver flows onto the device. flow_drv_translate() must precede. + * + * @param[in] dev + * Pointer to Ethernet device structure. + * @param[in, out] flow + * Pointer to flow structure. + * @param[out] error + * Pointer to error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */ +static inline int +flow_drv_apply(struct rte_eth_dev *dev, struct rte_flow *flow, + struct rte_flow_error *error) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = flow->drv_type; + + assert(type > MLX5_FLOW_TYPE_MIN && type < MLX5_FLOW_TYPE_MAX); + fops = flow_get_drv_ops(type); + return fops->apply(dev, flow, error); +} + +/** + * Flow driver remove API. This abstracts calling driver specific functions. + * Parent flow (rte_flow) should have driver type (drv_type). It removes a flow + * on device. All the resources of the flow should be freed by calling + * flow_drv_destroy(). + * + * @param[in] dev + * Pointer to Ethernet device. + * @param[in, out] flow + * Pointer to flow structure. + */ +static inline void +flow_drv_remove(struct rte_eth_dev *dev, struct rte_flow *flow) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = flow->drv_type; + + assert(type > MLX5_FLOW_TYPE_MIN && type < MLX5_FLOW_TYPE_MAX); + fops = flow_get_drv_ops(type); + fops->remove(dev, flow); +} + +/** + * Flow driver destroy API. This abstracts calling driver specific functions. + * Parent flow (rte_flow) should have driver type (drv_type). It removes a flow + * on device and releases resources of the flow. + * + * @param[in] dev + * Pointer to Ethernet device. + * @param[in, out] flow + * Pointer to flow structure. + */ +static inline void +flow_drv_destroy(struct rte_eth_dev *dev, struct rte_flow *flow) +{ + const struct mlx5_flow_driver_ops *fops; + enum mlx5_flow_drv_type type = flow->drv_type; + + assert(type > MLX5_FLOW_TYPE_MIN && type < MLX5_FLOW_TYPE_MAX); + fops = flow_get_drv_ops(type); + fops->destroy(dev, flow); +} + /** * Validate a flow supported by the NIC. * @@ -1526,7 +1818,7 @@ mlx5_flow_validate(struct rte_eth_dev *dev, { int ret; - ret = nic_ops.validate(dev, attr, items, actions, error); + ret = flow_drv_validate(dev, attr, items, actions, error); if (ret < 0) return ret; return 0; @@ -1616,7 +1908,7 @@ mlx5_flow_list_create(struct rte_eth_dev *dev, uint32_t i; uint32_t flow_size; - ret = mlx5_flow_validate(dev, attr, items, actions, error); + ret = flow_drv_validate(dev, attr, items, actions, error); if (ret < 0) return NULL; flow_size = sizeof(struct rte_flow); @@ -1627,6 +1919,9 @@ mlx5_flow_list_create(struct rte_eth_dev *dev, else flow_size += RTE_ALIGN_CEIL(sizeof(uint16_t), sizeof(void *)); flow = rte_calloc(__func__, 1, flow_size, 0); + flow->drv_type = flow_get_drv_type(dev, attr); + assert(flow->drv_type > MLX5_FLOW_TYPE_MIN && + flow->drv_type < MLX5_FLOW_TYPE_MAX); flow->queue = (void *)(flow + 1); LIST_INIT(&flow->dev_flows); if (rss && rss->types) { @@ -1644,21 +1939,21 @@ mlx5_flow_list_create(struct rte_eth_dev *dev, buf->entry[0].pattern = (void *)(uintptr_t)items; } for (i = 0; i < buf->entries; ++i) { - dev_flow = nic_ops.prepare(attr, buf->entry[i].pattern, - actions, &item_flags, - &action_flags, error); + dev_flow = flow_drv_prepare(flow, attr, buf->entry[i].pattern, + actions, &item_flags, &action_flags, + error); if (!dev_flow) goto error; dev_flow->flow = flow; LIST_INSERT_HEAD(&flow->dev_flows, dev_flow, next); - ret = nic_ops.translate(dev, dev_flow, attr, - buf->entry[i].pattern, - actions, error); + ret = flow_drv_translate(dev, dev_flow, attr, + buf->entry[i].pattern, + actions, error); if (ret < 0) goto error; } if (dev->data->dev_started) { - ret = nic_ops.apply(dev, flow, error); + ret = flow_drv_apply(dev, flow, error); if (ret < 0) goto error; } @@ -1668,7 +1963,7 @@ mlx5_flow_list_create(struct
rte_eth_dev *dev, error: ret = rte_errno; /* Save rte_errno before cleanup. */ assert(flow); - nic_ops.destroy(dev, flow); + flow_drv_destroy(dev, flow); rte_free(flow); rte_errno = ret; /* Restore rte_errno. */ return NULL; @@ -1706,7 +2001,7 @@ static void mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list, struct rte_flow *flow) { - nic_ops.destroy(dev, flow); + flow_drv_destroy(dev, flow); TAILQ_REMOVE(list, flow, next); /* * Update RX queue flags only if port is started, otherwise it is @@ -1750,7 +2045,7 @@ mlx5_flow_stop(struct rte_eth_dev *dev, struct mlx5_flows *list) struct rte_flow *flow; TAILQ_FOREACH_REVERSE(flow, list, mlx5_flows, next) - nic_ops.remove(dev, flow); + flow_drv_remove(dev, flow); mlx5_flow_rxq_flags_clear(dev); } @@ -1773,7 +2068,7 @@ mlx5_flow_start(struct rte_eth_dev *dev, struct mlx5_flows *list) int ret = 0; TAILQ_FOREACH(flow, list, next) { - ret = nic_ops.apply(dev, flow, &error); + ret = flow_drv_apply(dev, flow, &error); if (ret < 0) goto error; mlx5_flow_rxq_flags_set(dev, flow); @@ -2464,24 +2759,3 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev, } return 0; } - -/** - * Init the driver ops structure. - * - * @param dev - * Pointer to Ethernet device structure. - */ -void -mlx5_flow_init_driver_ops(struct rte_eth_dev *dev) -{ - struct priv *priv __rte_unused = dev->data->dev_private; - -#ifdef HAVE_IBV_FLOW_DV_SUPPORT - if (priv->config.dv_flow_en) - mlx5_flow_dv_get_driver_ops(&nic_ops); - else - mlx5_flow_verbs_get_driver_ops(&nic_ops); -#else - mlx5_flow_verbs_get_driver_ops(&nic_ops); -#endif -} diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 53c0eeb56..2bc3bee8c 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -128,6 +128,13 @@ /* Max number of actions per DV flow. */ #define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8 +enum mlx5_flow_drv_type { + MLX5_FLOW_TYPE_MIN, + MLX5_FLOW_TYPE_DV, + MLX5_FLOW_TYPE_VERBS, + MLX5_FLOW_TYPE_MAX, +}; + /* Matcher PRM representation */ struct mlx5_flow_dv_match_params { size_t size; @@ -210,7 +217,7 @@ struct mlx5_flow_counter { /* Flow structure. */ struct rte_flow { TAILQ_ENTRY(rte_flow) next; /**< Pointer to the next flow structure. */ - struct rte_flow_attr attributes; /**< User flow attribute. */ + enum mlx5_flow_drv_type drv_type; /**< Driver type. */ uint32_t layers; /**< Bit-fields of present layers see MLX5_FLOW_LAYER_*. */ struct mlx5_flow_counter *counter; /**< Holds flow counter. */ @@ -314,13 +321,5 @@ int mlx5_flow_validate_item_vxlan_gpe(const struct rte_flow_item *item, uint64_t item_flags, struct rte_eth_dev *dev, struct rte_flow_error *error); -void mlx5_flow_init_driver_ops(struct rte_eth_dev *dev); - -/* mlx5_flow_dv.c */ -void mlx5_flow_dv_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops); - -/* mlx5_flow_verbs.c */ - -void mlx5_flow_verbs_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops); #endif /* RTE_PMD_MLX5_FLOW_H_ */ diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 71af410b2..cf663cdb8 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -1351,23 +1351,13 @@ flow_dv_destroy(struct rte_eth_dev *dev, struct rte_flow *flow) } } -/** - * Fills the flow_ops with the function pointers. - * - * @param[out] flow_ops - * Pointer to driver_ops structure.
- */ -void -mlx5_flow_dv_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops) -{ - *flow_ops = (struct mlx5_flow_driver_ops) { - .validate = flow_dv_validate, - .prepare = flow_dv_prepare, - .translate = flow_dv_translate, - .apply = flow_dv_apply, - .remove = flow_dv_remove, - .destroy = flow_dv_destroy, - }; -} +const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops = { + .validate = flow_dv_validate, + .prepare = flow_dv_prepare, + .translate = flow_dv_translate, + .apply = flow_dv_apply, + .remove = flow_dv_remove, + .destroy = flow_dv_destroy, +}; #endif /* HAVE_IBV_FLOW_DV_SUPPORT */ diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c b/drivers/net/mlx5/mlx5_flow_verbs.c index f4a264232..05ab5fdad 100644 --- a/drivers/net/mlx5/mlx5_flow_verbs.c +++ b/drivers/net/mlx5/mlx5_flow_verbs.c @@ -1638,15 +1638,11 @@ flow_verbs_apply(struct rte_eth_dev *dev, struct rte_flow *flow, return -rte_errno; } -void -mlx5_flow_verbs_get_driver_ops(struct mlx5_flow_driver_ops *flow_ops) -{ - *flow_ops = (struct mlx5_flow_driver_ops) { - .validate = flow_verbs_validate, - .prepare = flow_verbs_prepare, - .translate = flow_verbs_translate, - .apply = flow_verbs_apply, - .remove = flow_verbs_remove, - .destroy = flow_verbs_destroy, - }; -} +const struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops = { + .validate = flow_verbs_validate, + .prepare = flow_verbs_prepare, + .translate = flow_verbs_translate, + .apply = flow_verbs_apply, + .remove = flow_verbs_remove, + .destroy = flow_verbs_destroy, +};
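The lifecycle contract these ops tables implement is visible in mlx5_flow_list_create() above: validate first, then prepare and translate once per expanded pattern, apply only if the port is started, and unwind with destroy on any failure. Below is a minimal sketch of that call sequence; the flow_ops struct and its zero-argument callbacks are simplified placeholders, not the PMD's actual signatures.

    #include <stdio.h>

    /* Placeholder lifecycle ops, echoing struct mlx5_flow_driver_ops. */
    struct flow_ops {
            int  (*validate)(void);
            int  (*prepare)(void);
            int  (*translate)(void);
            int  (*apply)(void);
            void (*destroy)(void);
    };

    static int ok(void) { return 0; }
    static void nop(void) { }

    static const struct flow_ops verbs_like_ops = {
            .validate = ok, .prepare = ok, .translate = ok,
            .apply = ok, .destroy = nop,
    };

    /* Creation sequence: validate -> prepare/translate per pattern ->
     * apply, with destroy as the single cleanup path on failure. */
    static int create_flow(const struct flow_ops *fops, int n_patterns,
                           int port_started)
    {
            int i;

            if (fops->validate() < 0)
                    return -1;
            for (i = 0; i < n_patterns; ++i)
                    if (fops->prepare() < 0 || fops->translate() < 0)
                            goto error;
            if (port_started && fops->apply() < 0)
                    goto error;
            return 0;
    error:
            fops->destroy();
            return -1;
    }

    int main(void)
    {
            printf("create_flow: %d\n", create_flow(&verbs_like_ops, 2, 1));
            return 0;
    }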
From patchwork Wed Sep 19 07:21:56 2018
X-Patchwork-Submitter: Yongseok Koh
X-Patchwork-Id: 44899
X-Patchwork-Delegate: shahafs@mellanox.com
From: Yongseok Koh
To: Shahaf Shuler
CC: "dev@dpdk.org", Yongseok Koh
Thread-Topic: [PATCH 2/3] net/mlx5: remove Netlink flow driver
Date: Wed, 19 Sep 2018 07:21:56 +0000
Message-ID: <20180919072143.23211-3-yskoh@mellanox.com>
References: <20180919072143.23211-1-yskoh@mellanox.com>
In-Reply-To: <20180919072143.23211-1-yskoh@mellanox.com>
Subject: [dpdk-dev] [PATCH 2/3] net/mlx5: remove Netlink flow driver
List-Id: DPDK patches and discussions

The Netlink-based E-Switch flow engine will be migrated to the new flow engine. nl_flow will be renamed to flow_tcf, as it goes through the Linux TC flower interface.

Signed-off-by: Yongseok Koh --- drivers/net/mlx5/Makefile | 1 - drivers/net/mlx5/mlx5.c | 32 - drivers/net/mlx5/mlx5.h | 25 - drivers/net/mlx5/mlx5_nl_flow.c | 1228 --------------------------------------- 4 files changed, 1286 deletions(-) delete mode 100644 drivers/net/mlx5/mlx5_nl_flow.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 4243b37ca..9c1044808 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -35,7 +35,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl_flow.c ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y) INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE) diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 02324ef4f..47cf538e2 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -286,8 +286,6 @@ mlx5_dev_close(struct rte_eth_dev *dev) close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); - if (priv->mnl_socket) - mlx5_nl_flow_socket_destroy(priv->mnl_socket); ret = mlx5_hrxq_ibv_verify(dev); if (ret) DRV_LOG(WARNING, "port %u some hash Rx queue still remain", @@ -1137,34 +1135,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0)); if (vf && config.vf_nl_en) mlx5_nl_mac_addr_sync(eth_dev); - priv->mnl_socket = mlx5_nl_flow_socket_create(); - if (!priv->mnl_socket) { - err = -rte_errno; - DRV_LOG(WARNING, - "flow rules relying on switch offloads will not be" - " supported: cannot open libmnl socket: %s", - strerror(rte_errno)); - } else { - struct rte_flow_error error; - unsigned int ifindex = mlx5_ifindex(eth_dev); - - if (!ifindex) { - err = -rte_errno; - error.message = - "cannot retrieve network interface index"; - } else { - err = mlx5_nl_flow_init(priv->mnl_socket, ifindex, - &error); - } - if (err) { - DRV_LOG(WARNING, - "flow rules relying on switch offloads will" - " not be supported: %s: %s", - error.message, strerror(rte_errno)); - mlx5_nl_flow_socket_destroy(priv->mnl_socket); - priv->mnl_socket = NULL; - } - } TAILQ_INIT(&priv->flows); TAILQ_INIT(&priv->ctrl_flows); /* Hint libmlx5 to use PMD allocator for data plane resources */ @@ -1217,8 +1187,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); - if (priv->mnl_socket) - mlx5_nl_flow_socket_destroy(priv->mnl_socket); if (own_domain_id) claim_zero(rte_eth_switch_domain_free(priv->domain_id));
rte_free(priv); diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 006cc8e06..5255a0270 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -157,12 +157,6 @@ struct mlx5_drop { struct mlx5_rxq_ibv *rxq; /* Verbs Rx queue. */ }; -/** DPDK port to network interface index (ifindex) conversion. */ -struct mlx5_nl_flow_ptoi { - uint16_t port_id; /**< DPDK port ID. */ - unsigned int ifindex; /**< Network interface index. */ -}; - struct mnl_socket; struct priv { @@ -398,23 +392,4 @@ unsigned int mlx5_nl_ifindex(int nl, const char *name); int mlx5_nl_switch_info(int nl, unsigned int ifindex, struct mlx5_switch_info *info); -/* mlx5_nl_flow.c */ - -int mlx5_nl_flow_transpose(void *buf, - size_t size, - const struct mlx5_nl_flow_ptoi *ptoi, - const struct rte_flow_attr *attr, - const struct rte_flow_item *pattern, - const struct rte_flow_action *actions, - struct rte_flow_error *error); -void mlx5_nl_flow_brand(void *buf, uint32_t handle); -int mlx5_nl_flow_create(struct mnl_socket *nl, void *buf, - struct rte_flow_error *error); -int mlx5_nl_flow_destroy(struct mnl_socket *nl, void *buf, - struct rte_flow_error *error); -int mlx5_nl_flow_init(struct mnl_socket *nl, unsigned int ifindex, - struct rte_flow_error *error); -struct mnl_socket *mlx5_nl_flow_socket_create(void); -void mlx5_nl_flow_socket_destroy(struct mnl_socket *nl); - #endif /* RTE_PMD_MLX5_H_ */ diff --git a/drivers/net/mlx5/mlx5_nl_flow.c b/drivers/net/mlx5/mlx5_nl_flow.c deleted file mode 100644 index beb03c911..000000000 --- a/drivers/net/mlx5/mlx5_nl_flow.c +++ /dev/null @@ -1,1228 +0,0 @@ -/* SPDX-License-Identifier: BSD-3-Clause - * Copyright 2018 6WIND S.A. - * Copyright 2018 Mellanox Technologies, Ltd - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -#include "mlx5.h" -#include "mlx5_autoconf.h" - -#ifdef HAVE_TC_ACT_VLAN - -#include - -#else /* HAVE_TC_ACT_VLAN */ - -#define TCA_VLAN_ACT_POP 1 -#define TCA_VLAN_ACT_PUSH 2 -#define TCA_VLAN_ACT_MODIFY 3 -#define TCA_VLAN_PARMS 2 -#define TCA_VLAN_PUSH_VLAN_ID 3 -#define TCA_VLAN_PUSH_VLAN_PROTOCOL 4 -#define TCA_VLAN_PAD 5 -#define TCA_VLAN_PUSH_VLAN_PRIORITY 6 - -struct tc_vlan { - tc_gen; - int v_action; -}; - -#endif /* HAVE_TC_ACT_VLAN */ - -/* Normally found in linux/netlink.h. */ -#ifndef NETLINK_CAP_ACK -#define NETLINK_CAP_ACK 10 -#endif - -/* Normally found in linux/pkt_sched.h. */ -#ifndef TC_H_MIN_INGRESS -#define TC_H_MIN_INGRESS 0xfff2u -#endif - -/* Normally found in linux/pkt_cls.h. 
*/ -#ifndef TCA_CLS_FLAGS_SKIP_SW -#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) -#endif -#ifndef HAVE_TCA_FLOWER_ACT -#define TCA_FLOWER_ACT 3 -#endif -#ifndef HAVE_TCA_FLOWER_FLAGS -#define TCA_FLOWER_FLAGS 22 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_ETH_TYPE -#define TCA_FLOWER_KEY_ETH_TYPE 8 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_ETH_DST -#define TCA_FLOWER_KEY_ETH_DST 4 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_ETH_DST_MASK -#define TCA_FLOWER_KEY_ETH_DST_MASK 5 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_ETH_SRC -#define TCA_FLOWER_KEY_ETH_SRC 6 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_ETH_SRC_MASK -#define TCA_FLOWER_KEY_ETH_SRC_MASK 7 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IP_PROTO -#define TCA_FLOWER_KEY_IP_PROTO 9 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV4_SRC -#define TCA_FLOWER_KEY_IPV4_SRC 10 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV4_SRC_MASK -#define TCA_FLOWER_KEY_IPV4_SRC_MASK 11 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV4_DST -#define TCA_FLOWER_KEY_IPV4_DST 12 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV4_DST_MASK -#define TCA_FLOWER_KEY_IPV4_DST_MASK 13 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV6_SRC -#define TCA_FLOWER_KEY_IPV6_SRC 14 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV6_SRC_MASK -#define TCA_FLOWER_KEY_IPV6_SRC_MASK 15 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV6_DST -#define TCA_FLOWER_KEY_IPV6_DST 16 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_IPV6_DST_MASK -#define TCA_FLOWER_KEY_IPV6_DST_MASK 17 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_TCP_SRC -#define TCA_FLOWER_KEY_TCP_SRC 18 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_TCP_SRC_MASK -#define TCA_FLOWER_KEY_TCP_SRC_MASK 35 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_TCP_DST -#define TCA_FLOWER_KEY_TCP_DST 19 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_TCP_DST_MASK -#define TCA_FLOWER_KEY_TCP_DST_MASK 36 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_UDP_SRC -#define TCA_FLOWER_KEY_UDP_SRC 20 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_UDP_SRC_MASK -#define TCA_FLOWER_KEY_UDP_SRC_MASK 37 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_UDP_DST -#define TCA_FLOWER_KEY_UDP_DST 21 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_UDP_DST_MASK -#define TCA_FLOWER_KEY_UDP_DST_MASK 38 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_VLAN_ID -#define TCA_FLOWER_KEY_VLAN_ID 23 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_VLAN_PRIO -#define TCA_FLOWER_KEY_VLAN_PRIO 24 -#endif -#ifndef HAVE_TCA_FLOWER_KEY_VLAN_ETH_TYPE -#define TCA_FLOWER_KEY_VLAN_ETH_TYPE 25 -#endif - -/** Parser state definitions for mlx5_nl_flow_trans[]. */ -enum mlx5_nl_flow_trans { - INVALID, - BACK, - ATTR, - PATTERN, - ITEM_VOID, - ITEM_PORT_ID, - ITEM_ETH, - ITEM_VLAN, - ITEM_IPV4, - ITEM_IPV6, - ITEM_TCP, - ITEM_UDP, - ACTIONS, - ACTION_VOID, - ACTION_PORT_ID, - ACTION_DROP, - ACTION_OF_POP_VLAN, - ACTION_OF_PUSH_VLAN, - ACTION_OF_SET_VLAN_VID, - ACTION_OF_SET_VLAN_PCP, - END, -}; - -#define TRANS(...) (const enum mlx5_nl_flow_trans []){ __VA_ARGS__, INVALID, } - -#define PATTERN_COMMON \ - ITEM_VOID, ITEM_PORT_ID, ACTIONS -#define ACTIONS_COMMON \ - ACTION_VOID, ACTION_OF_POP_VLAN, ACTION_OF_PUSH_VLAN, \ - ACTION_OF_SET_VLAN_VID, ACTION_OF_SET_VLAN_PCP -#define ACTIONS_FATE \ - ACTION_PORT_ID, ACTION_DROP - -/** Parser state transitions used by mlx5_nl_flow_transpose(). 
*/ -static const enum mlx5_nl_flow_trans *const mlx5_nl_flow_trans[] = { - [INVALID] = NULL, - [BACK] = NULL, - [ATTR] = TRANS(PATTERN), - [PATTERN] = TRANS(ITEM_ETH, PATTERN_COMMON), - [ITEM_VOID] = TRANS(BACK), - [ITEM_PORT_ID] = TRANS(BACK), - [ITEM_ETH] = TRANS(ITEM_IPV4, ITEM_IPV6, ITEM_VLAN, PATTERN_COMMON), - [ITEM_VLAN] = TRANS(ITEM_IPV4, ITEM_IPV6, PATTERN_COMMON), - [ITEM_IPV4] = TRANS(ITEM_TCP, ITEM_UDP, PATTERN_COMMON), - [ITEM_IPV6] = TRANS(ITEM_TCP, ITEM_UDP, PATTERN_COMMON), - [ITEM_TCP] = TRANS(PATTERN_COMMON), - [ITEM_UDP] = TRANS(PATTERN_COMMON), - [ACTIONS] = TRANS(ACTIONS_FATE, ACTIONS_COMMON), - [ACTION_VOID] = TRANS(BACK), - [ACTION_PORT_ID] = TRANS(ACTION_VOID, END), - [ACTION_DROP] = TRANS(ACTION_VOID, END), - [ACTION_OF_POP_VLAN] = TRANS(ACTIONS_FATE, ACTIONS_COMMON), - [ACTION_OF_PUSH_VLAN] = TRANS(ACTIONS_FATE, ACTIONS_COMMON), - [ACTION_OF_SET_VLAN_VID] = TRANS(ACTIONS_FATE, ACTIONS_COMMON), - [ACTION_OF_SET_VLAN_PCP] = TRANS(ACTIONS_FATE, ACTIONS_COMMON), - [END] = NULL, -}; - -/** Empty masks for known item types. */ -static const union { - struct rte_flow_item_port_id port_id; - struct rte_flow_item_eth eth; - struct rte_flow_item_vlan vlan; - struct rte_flow_item_ipv4 ipv4; - struct rte_flow_item_ipv6 ipv6; - struct rte_flow_item_tcp tcp; - struct rte_flow_item_udp udp; -} mlx5_nl_flow_mask_empty; - -/** Supported masks for known item types. */ -static const struct { - struct rte_flow_item_port_id port_id; - struct rte_flow_item_eth eth; - struct rte_flow_item_vlan vlan; - struct rte_flow_item_ipv4 ipv4; - struct rte_flow_item_ipv6 ipv6; - struct rte_flow_item_tcp tcp; - struct rte_flow_item_udp udp; -} mlx5_nl_flow_mask_supported = { - .port_id = { - .id = 0xffffffff, - }, - .eth = { - .type = RTE_BE16(0xffff), - .dst.addr_bytes = "\xff\xff\xff\xff\xff\xff", - .src.addr_bytes = "\xff\xff\xff\xff\xff\xff", - }, - .vlan = { - /* PCP and VID only, no DEI. */ - .tci = RTE_BE16(0xefff), - .inner_type = RTE_BE16(0xffff), - }, - .ipv4.hdr = { - .next_proto_id = 0xff, - .src_addr = RTE_BE32(0xffffffff), - .dst_addr = RTE_BE32(0xffffffff), - }, - .ipv6.hdr = { - .proto = 0xff, - .src_addr = - "\xff\xff\xff\xff\xff\xff\xff\xff" - "\xff\xff\xff\xff\xff\xff\xff\xff", - .dst_addr = - "\xff\xff\xff\xff\xff\xff\xff\xff" - "\xff\xff\xff\xff\xff\xff\xff\xff", - }, - .tcp.hdr = { - .src_port = RTE_BE16(0xffff), - .dst_port = RTE_BE16(0xffff), - }, - .udp.hdr = { - .src_port = RTE_BE16(0xffff), - .dst_port = RTE_BE16(0xffff), - }, -}; - -/** - * Retrieve mask for pattern item. - * - * This function does basic sanity checks on a pattern item in order to - * return the most appropriate mask for it. - * - * @param[in] item - * Item specification. - * @param[in] mask_default - * Default mask for pattern item as specified by the flow API. - * @param[in] mask_supported - * Mask fields supported by the implementation. - * @param[in] mask_empty - * Empty mask to return when there is no specification. - * @param[out] error - * Perform verbose error reporting if not NULL. - * - * @return - * Either @p item->mask or one of the mask parameters on success, NULL - * otherwise and rte_errno is set. - */ -static const void * -mlx5_nl_flow_item_mask(const struct rte_flow_item *item, - const void *mask_default, - const void *mask_supported, - const void *mask_empty, - size_t mask_size, - struct rte_flow_error *error) -{ - const uint8_t *mask; - size_t i; - - /* item->last and item->mask cannot exist without item->spec. 
*/ - if (!item->spec && (item->mask || item->last)) { - rte_flow_error_set - (error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, item, - "\"mask\" or \"last\" field provided without a" - " corresponding \"spec\""); - return NULL; - } - /* No spec, no mask, no problem. */ - if (!item->spec) - return mask_empty; - mask = item->mask ? item->mask : mask_default; - assert(mask); - /* - * Single-pass check to make sure that: - * - Mask is supported, no bits are set outside mask_supported. - * - Both item->spec and item->last are included in mask. - */ - for (i = 0; i != mask_size; ++i) { - if (!mask[i]) - continue; - if ((mask[i] | ((const uint8_t *)mask_supported)[i]) != - ((const uint8_t *)mask_supported)[i]) { - rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask, "unsupported field found in \"mask\""); - return NULL; - } - if (item->last && - (((const uint8_t *)item->spec)[i] & mask[i]) != - (((const uint8_t *)item->last)[i] & mask[i])) { - rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_LAST, - item->last, - "range between \"spec\" and \"last\" not" - " comprised in \"mask\""); - return NULL; - } - } - return mask; -} - -/** - * Transpose flow rule description to rtnetlink message. - * - * This function transposes a flow rule description to a traffic control - * (TC) filter creation message ready to be sent over Netlink. - * - * Target interface is specified as the first entry of the @p ptoi table. - * Subsequent entries enable this function to resolve other DPDK port IDs - * found in the flow rule. - * - * @param[out] buf - * Output message buffer. May be NULL when @p size is 0. - * @param size - * Size of @p buf. Message may be truncated if not large enough. - * @param[in] ptoi - * DPDK port ID to network interface index translation table. This table - * is terminated by an entry with a zero ifindex value. - * @param[in] attr - * Flow rule attributes. - * @param[in] pattern - * Pattern specification. - * @param[in] actions - * Associated actions. - * @param[out] error - * Perform verbose error reporting if not NULL. - * - * @return - * A positive value representing the exact size of the message in bytes - * regardless of the @p size parameter on success, a negative errno value - * otherwise and rte_errno is set. 
- */ -int -mlx5_nl_flow_transpose(void *buf, - size_t size, - const struct mlx5_nl_flow_ptoi *ptoi, - const struct rte_flow_attr *attr, - const struct rte_flow_item *pattern, - const struct rte_flow_action *actions, - struct rte_flow_error *error) -{ - alignas(struct nlmsghdr) - uint8_t buf_tmp[mnl_nlmsg_size(sizeof(struct tcmsg) + 1024)]; - const struct rte_flow_item *item; - const struct rte_flow_action *action; - unsigned int n; - uint32_t act_index_cur; - bool in_port_id_set; - bool eth_type_set; - bool vlan_present; - bool vlan_eth_type_set; - bool ip_proto_set; - struct nlattr *na_flower; - struct nlattr *na_flower_act; - struct nlattr *na_vlan_id; - struct nlattr *na_vlan_priority; - const enum mlx5_nl_flow_trans *trans; - const enum mlx5_nl_flow_trans *back; - - if (!size) - goto error_nobufs; -init: - item = pattern; - action = actions; - n = 0; - act_index_cur = 0; - in_port_id_set = false; - eth_type_set = false; - vlan_present = false; - vlan_eth_type_set = false; - ip_proto_set = false; - na_flower = NULL; - na_flower_act = NULL; - na_vlan_id = NULL; - na_vlan_priority = NULL; - trans = TRANS(ATTR); - back = trans; -trans: - switch (trans[n++]) { - union { - const struct rte_flow_item_port_id *port_id; - const struct rte_flow_item_eth *eth; - const struct rte_flow_item_vlan *vlan; - const struct rte_flow_item_ipv4 *ipv4; - const struct rte_flow_item_ipv6 *ipv6; - const struct rte_flow_item_tcp *tcp; - const struct rte_flow_item_udp *udp; - } spec, mask; - union { - const struct rte_flow_action_port_id *port_id; - const struct rte_flow_action_of_push_vlan *of_push_vlan; - const struct rte_flow_action_of_set_vlan_vid * - of_set_vlan_vid; - const struct rte_flow_action_of_set_vlan_pcp * - of_set_vlan_pcp; - } conf; - struct nlmsghdr *nlh; - struct tcmsg *tcm; - struct nlattr *act_index; - struct nlattr *act; - unsigned int i; - - case INVALID: - if (item->type) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, - item, "unsupported pattern item combination"); - else if (action->type) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION, - action, "unsupported action combination"); - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, - "flow rule lacks some kind of fate action"); - case BACK: - trans = back; - n = 0; - goto trans; - case ATTR: - /* - * Supported attributes: no groups, some priorities and - * ingress only. Don't care about transfer as it is the - * caller's problem. - */ - if (attr->group) - return rte_flow_error_set - (error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_GROUP, - attr, "groups are not supported"); - if (attr->priority > 0xfffe) - return rte_flow_error_set - (error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, - attr, "lowest priority level is 0xfffe"); - if (!attr->ingress) - return rte_flow_error_set - (error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, - attr, "only ingress is supported"); - if (attr->egress) - return rte_flow_error_set - (error, ENOTSUP, - RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, - attr, "egress is not supported"); - if (size < mnl_nlmsg_size(sizeof(*tcm))) - goto error_nobufs; - nlh = mnl_nlmsg_put_header(buf); - nlh->nlmsg_type = 0; - nlh->nlmsg_flags = 0; - nlh->nlmsg_seq = 0; - tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm)); - tcm->tcm_family = AF_UNSPEC; - tcm->tcm_ifindex = ptoi[0].ifindex; - /* - * Let kernel pick a handle by default. A predictable handle - * can be set by the caller on the resulting buffer through - * mlx5_nl_flow_brand(). 
- */ - tcm->tcm_handle = 0; - tcm->tcm_parent = TC_H_MAKE(TC_H_INGRESS, TC_H_MIN_INGRESS); - /* - * Priority cannot be zero to prevent the kernel from - * picking one automatically. - */ - tcm->tcm_info = TC_H_MAKE((attr->priority + 1) << 16, - RTE_BE16(ETH_P_ALL)); - break; - case PATTERN: - if (!mnl_attr_put_strz_check(buf, size, TCA_KIND, "flower")) - goto error_nobufs; - na_flower = mnl_attr_nest_start_check(buf, size, TCA_OPTIONS); - if (!na_flower) - goto error_nobufs; - if (!mnl_attr_put_u32_check(buf, size, TCA_FLOWER_FLAGS, - TCA_CLS_FLAGS_SKIP_SW)) - goto error_nobufs; - break; - case ITEM_VOID: - if (item->type != RTE_FLOW_ITEM_TYPE_VOID) - goto trans; - ++item; - break; - case ITEM_PORT_ID: - if (item->type != RTE_FLOW_ITEM_TYPE_PORT_ID) - goto trans; - mask.port_id = mlx5_nl_flow_item_mask - (item, &rte_flow_item_port_id_mask, - &mlx5_nl_flow_mask_supported.port_id, - &mlx5_nl_flow_mask_empty.port_id, - sizeof(mlx5_nl_flow_mask_supported.port_id), error); - if (!mask.port_id) - return -rte_errno; - if (mask.port_id == &mlx5_nl_flow_mask_empty.port_id) { - in_port_id_set = 1; - ++item; - break; - } - spec.port_id = item->spec; - if (mask.port_id->id && mask.port_id->id != 0xffffffff) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask.port_id, - "no support for partial mask on" - " \"id\" field"); - if (!mask.port_id->id) - i = 0; - else - for (i = 0; ptoi[i].ifindex; ++i) - if (ptoi[i].port_id == spec.port_id->id) - break; - if (!ptoi[i].ifindex) - return rte_flow_error_set - (error, ENODEV, RTE_FLOW_ERROR_TYPE_ITEM_SPEC, - spec.port_id, - "missing data to convert port ID to ifindex"); - tcm = mnl_nlmsg_get_payload(buf); - if (in_port_id_set && - ptoi[i].ifindex != (unsigned int)tcm->tcm_ifindex) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_SPEC, - spec.port_id, - "cannot match traffic for several port IDs" - " through a single flow rule"); - tcm->tcm_ifindex = ptoi[i].ifindex; - in_port_id_set = 1; - ++item; - break; - case ITEM_ETH: - if (item->type != RTE_FLOW_ITEM_TYPE_ETH) - goto trans; - mask.eth = mlx5_nl_flow_item_mask - (item, &rte_flow_item_eth_mask, - &mlx5_nl_flow_mask_supported.eth, - &mlx5_nl_flow_mask_empty.eth, - sizeof(mlx5_nl_flow_mask_supported.eth), error); - if (!mask.eth) - return -rte_errno; - if (mask.eth == &mlx5_nl_flow_mask_empty.eth) { - ++item; - break; - } - spec.eth = item->spec; - if (mask.eth->type && mask.eth->type != RTE_BE16(0xffff)) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask.eth, - "no support for partial mask on" - " \"type\" field"); - if (mask.eth->type) { - if (!mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_ETH_TYPE, - spec.eth->type)) - goto error_nobufs; - eth_type_set = 1; - } - if ((!is_zero_ether_addr(&mask.eth->dst) && - (!mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_ETH_DST, - ETHER_ADDR_LEN, - spec.eth->dst.addr_bytes) || - !mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_ETH_DST_MASK, - ETHER_ADDR_LEN, - mask.eth->dst.addr_bytes))) || - (!is_zero_ether_addr(&mask.eth->src) && - (!mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_ETH_SRC, - ETHER_ADDR_LEN, - spec.eth->src.addr_bytes) || - !mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_ETH_SRC_MASK, - ETHER_ADDR_LEN, - mask.eth->src.addr_bytes)))) - goto error_nobufs; - ++item; - break; - case ITEM_VLAN: - if (item->type != RTE_FLOW_ITEM_TYPE_VLAN) - goto trans; - mask.vlan = mlx5_nl_flow_item_mask - (item, &rte_flow_item_vlan_mask, - &mlx5_nl_flow_mask_supported.vlan, - 
&mlx5_nl_flow_mask_empty.vlan, - sizeof(mlx5_nl_flow_mask_supported.vlan), error); - if (!mask.vlan) - return -rte_errno; - if (!eth_type_set && - !mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_ETH_TYPE, - RTE_BE16(ETH_P_8021Q))) - goto error_nobufs; - eth_type_set = 1; - vlan_present = 1; - if (mask.vlan == &mlx5_nl_flow_mask_empty.vlan) { - ++item; - break; - } - spec.vlan = item->spec; - if ((mask.vlan->tci & RTE_BE16(0xe000) && - (mask.vlan->tci & RTE_BE16(0xe000)) != RTE_BE16(0xe000)) || - (mask.vlan->tci & RTE_BE16(0x0fff) && - (mask.vlan->tci & RTE_BE16(0x0fff)) != RTE_BE16(0x0fff)) || - (mask.vlan->inner_type && - mask.vlan->inner_type != RTE_BE16(0xffff))) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask.vlan, - "no support for partial masks on" - " \"tci\" (PCP and VID parts) and" - " \"inner_type\" fields"); - if (mask.vlan->inner_type) { - if (!mnl_attr_put_u16_check - (buf, size, TCA_FLOWER_KEY_VLAN_ETH_TYPE, - spec.vlan->inner_type)) - goto error_nobufs; - vlan_eth_type_set = 1; - } - if ((mask.vlan->tci & RTE_BE16(0xe000) && - !mnl_attr_put_u8_check - (buf, size, TCA_FLOWER_KEY_VLAN_PRIO, - (rte_be_to_cpu_16(spec.vlan->tci) >> 13) & 0x7)) || - (mask.vlan->tci & RTE_BE16(0x0fff) && - !mnl_attr_put_u16_check - (buf, size, TCA_FLOWER_KEY_VLAN_ID, - rte_be_to_cpu_16(spec.vlan->tci & RTE_BE16(0x0fff))))) - goto error_nobufs; - ++item; - break; - case ITEM_IPV4: - if (item->type != RTE_FLOW_ITEM_TYPE_IPV4) - goto trans; - mask.ipv4 = mlx5_nl_flow_item_mask - (item, &rte_flow_item_ipv4_mask, - &mlx5_nl_flow_mask_supported.ipv4, - &mlx5_nl_flow_mask_empty.ipv4, - sizeof(mlx5_nl_flow_mask_supported.ipv4), error); - if (!mask.ipv4) - return -rte_errno; - if ((!eth_type_set || !vlan_eth_type_set) && - !mnl_attr_put_u16_check(buf, size, - vlan_present ? - TCA_FLOWER_KEY_VLAN_ETH_TYPE : - TCA_FLOWER_KEY_ETH_TYPE, - RTE_BE16(ETH_P_IP))) - goto error_nobufs; - eth_type_set = 1; - vlan_eth_type_set = 1; - if (mask.ipv4 == &mlx5_nl_flow_mask_empty.ipv4) { - ++item; - break; - } - spec.ipv4 = item->spec; - if (mask.ipv4->hdr.next_proto_id && - mask.ipv4->hdr.next_proto_id != 0xff) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask.ipv4, - "no support for partial mask on" - " \"hdr.next_proto_id\" field"); - if (mask.ipv4->hdr.next_proto_id) { - if (!mnl_attr_put_u8_check - (buf, size, TCA_FLOWER_KEY_IP_PROTO, - spec.ipv4->hdr.next_proto_id)) - goto error_nobufs; - ip_proto_set = 1; - } - if ((mask.ipv4->hdr.src_addr && - (!mnl_attr_put_u32_check(buf, size, - TCA_FLOWER_KEY_IPV4_SRC, - spec.ipv4->hdr.src_addr) || - !mnl_attr_put_u32_check(buf, size, - TCA_FLOWER_KEY_IPV4_SRC_MASK, - mask.ipv4->hdr.src_addr))) || - (mask.ipv4->hdr.dst_addr && - (!mnl_attr_put_u32_check(buf, size, - TCA_FLOWER_KEY_IPV4_DST, - spec.ipv4->hdr.dst_addr) || - !mnl_attr_put_u32_check(buf, size, - TCA_FLOWER_KEY_IPV4_DST_MASK, - mask.ipv4->hdr.dst_addr)))) - goto error_nobufs; - ++item; - break; - case ITEM_IPV6: - if (item->type != RTE_FLOW_ITEM_TYPE_IPV6) - goto trans; - mask.ipv6 = mlx5_nl_flow_item_mask - (item, &rte_flow_item_ipv6_mask, - &mlx5_nl_flow_mask_supported.ipv6, - &mlx5_nl_flow_mask_empty.ipv6, - sizeof(mlx5_nl_flow_mask_supported.ipv6), error); - if (!mask.ipv6) - return -rte_errno; - if ((!eth_type_set || !vlan_eth_type_set) && - !mnl_attr_put_u16_check(buf, size, - vlan_present ? 
- TCA_FLOWER_KEY_VLAN_ETH_TYPE : - TCA_FLOWER_KEY_ETH_TYPE, - RTE_BE16(ETH_P_IPV6))) - goto error_nobufs; - eth_type_set = 1; - vlan_eth_type_set = 1; - if (mask.ipv6 == &mlx5_nl_flow_mask_empty.ipv6) { - ++item; - break; - } - spec.ipv6 = item->spec; - if (mask.ipv6->hdr.proto && mask.ipv6->hdr.proto != 0xff) - return rte_flow_error_set - (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM_MASK, - mask.ipv6, - "no support for partial mask on" - " \"hdr.proto\" field"); - if (mask.ipv6->hdr.proto) { - if (!mnl_attr_put_u8_check - (buf, size, TCA_FLOWER_KEY_IP_PROTO, - spec.ipv6->hdr.proto)) - goto error_nobufs; - ip_proto_set = 1; - } - if ((!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.src_addr) && - (!mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_IPV6_SRC, - sizeof(spec.ipv6->hdr.src_addr), - spec.ipv6->hdr.src_addr) || - !mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_IPV6_SRC_MASK, - sizeof(mask.ipv6->hdr.src_addr), - mask.ipv6->hdr.src_addr))) || - (!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.dst_addr) && - (!mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_IPV6_DST, - sizeof(spec.ipv6->hdr.dst_addr), - spec.ipv6->hdr.dst_addr) || - !mnl_attr_put_check(buf, size, - TCA_FLOWER_KEY_IPV6_DST_MASK, - sizeof(mask.ipv6->hdr.dst_addr), - mask.ipv6->hdr.dst_addr)))) - goto error_nobufs; - ++item; - break; - case ITEM_TCP: - if (item->type != RTE_FLOW_ITEM_TYPE_TCP) - goto trans; - mask.tcp = mlx5_nl_flow_item_mask - (item, &rte_flow_item_tcp_mask, - &mlx5_nl_flow_mask_supported.tcp, - &mlx5_nl_flow_mask_empty.tcp, - sizeof(mlx5_nl_flow_mask_supported.tcp), error); - if (!mask.tcp) - return -rte_errno; - if (!ip_proto_set && - !mnl_attr_put_u8_check(buf, size, - TCA_FLOWER_KEY_IP_PROTO, - IPPROTO_TCP)) - goto error_nobufs; - if (mask.tcp == &mlx5_nl_flow_mask_empty.tcp) { - ++item; - break; - } - spec.tcp = item->spec; - if ((mask.tcp->hdr.src_port && - (!mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_TCP_SRC, - spec.tcp->hdr.src_port) || - !mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_TCP_SRC_MASK, - mask.tcp->hdr.src_port))) || - (mask.tcp->hdr.dst_port && - (!mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_TCP_DST, - spec.tcp->hdr.dst_port) || - !mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_TCP_DST_MASK, - mask.tcp->hdr.dst_port)))) - goto error_nobufs; - ++item; - break; - case ITEM_UDP: - if (item->type != RTE_FLOW_ITEM_TYPE_UDP) - goto trans; - mask.udp = mlx5_nl_flow_item_mask - (item, &rte_flow_item_udp_mask, - &mlx5_nl_flow_mask_supported.udp, - &mlx5_nl_flow_mask_empty.udp, - sizeof(mlx5_nl_flow_mask_supported.udp), error); - if (!mask.udp) - return -rte_errno; - if (!ip_proto_set && - !mnl_attr_put_u8_check(buf, size, - TCA_FLOWER_KEY_IP_PROTO, - IPPROTO_UDP)) - goto error_nobufs; - if (mask.udp == &mlx5_nl_flow_mask_empty.udp) { - ++item; - break; - } - spec.udp = item->spec; - if ((mask.udp->hdr.src_port && - (!mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_UDP_SRC, - spec.udp->hdr.src_port) || - !mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_UDP_SRC_MASK, - mask.udp->hdr.src_port))) || - (mask.udp->hdr.dst_port && - (!mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_UDP_DST, - spec.udp->hdr.dst_port) || - !mnl_attr_put_u16_check(buf, size, - TCA_FLOWER_KEY_UDP_DST_MASK, - mask.udp->hdr.dst_port)))) - goto error_nobufs; - ++item; - break; - case ACTIONS: - if (item->type != RTE_FLOW_ITEM_TYPE_END) - goto trans; - assert(na_flower); - assert(!na_flower_act); - na_flower_act = - mnl_attr_nest_start_check(buf, size, TCA_FLOWER_ACT); - if (!na_flower_act) - goto error_nobufs; 
- act_index_cur = 1; - break; - case ACTION_VOID: - if (action->type != RTE_FLOW_ACTION_TYPE_VOID) - goto trans; - ++action; - break; - case ACTION_PORT_ID: - if (action->type != RTE_FLOW_ACTION_TYPE_PORT_ID) - goto trans; - conf.port_id = action->conf; - if (conf.port_id->original) - i = 0; - else - for (i = 0; ptoi[i].ifindex; ++i) - if (ptoi[i].port_id == conf.port_id->id) - break; - if (!ptoi[i].ifindex) - return rte_flow_error_set - (error, ENODEV, RTE_FLOW_ERROR_TYPE_ACTION_CONF, - conf.port_id, - "missing data to convert port ID to ifindex"); - act_index = - mnl_attr_nest_start_check(buf, size, act_index_cur++); - if (!act_index || - !mnl_attr_put_strz_check(buf, size, TCA_ACT_KIND, "mirred")) - goto error_nobufs; - act = mnl_attr_nest_start_check(buf, size, TCA_ACT_OPTIONS); - if (!act) - goto error_nobufs; - if (!mnl_attr_put_check(buf, size, TCA_MIRRED_PARMS, - sizeof(struct tc_mirred), - &(struct tc_mirred){ - .action = TC_ACT_STOLEN, - .eaction = TCA_EGRESS_REDIR, - .ifindex = ptoi[i].ifindex, - })) - goto error_nobufs; - mnl_attr_nest_end(buf, act); - mnl_attr_nest_end(buf, act_index); - ++action; - break; - case ACTION_DROP: - if (action->type != RTE_FLOW_ACTION_TYPE_DROP) - goto trans; - act_index = - mnl_attr_nest_start_check(buf, size, act_index_cur++); - if (!act_index || - !mnl_attr_put_strz_check(buf, size, TCA_ACT_KIND, "gact")) - goto error_nobufs; - act = mnl_attr_nest_start_check(buf, size, TCA_ACT_OPTIONS); - if (!act) - goto error_nobufs; - if (!mnl_attr_put_check(buf, size, TCA_GACT_PARMS, - sizeof(struct tc_gact), - &(struct tc_gact){ - .action = TC_ACT_SHOT, - })) - goto error_nobufs; - mnl_attr_nest_end(buf, act); - mnl_attr_nest_end(buf, act_index); - ++action; - break; - case ACTION_OF_POP_VLAN: - if (action->type != RTE_FLOW_ACTION_TYPE_OF_POP_VLAN) - goto trans; - conf.of_push_vlan = NULL; - i = TCA_VLAN_ACT_POP; - goto action_of_vlan; - case ACTION_OF_PUSH_VLAN: - if (action->type != RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN) - goto trans; - conf.of_push_vlan = action->conf; - i = TCA_VLAN_ACT_PUSH; - goto action_of_vlan; - case ACTION_OF_SET_VLAN_VID: - if (action->type != RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID) - goto trans; - conf.of_set_vlan_vid = action->conf; - if (na_vlan_id) - goto override_na_vlan_id; - i = TCA_VLAN_ACT_MODIFY; - goto action_of_vlan; - case ACTION_OF_SET_VLAN_PCP: - if (action->type != RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP) - goto trans; - conf.of_set_vlan_pcp = action->conf; - if (na_vlan_priority) - goto override_na_vlan_priority; - i = TCA_VLAN_ACT_MODIFY; - goto action_of_vlan; -action_of_vlan: - act_index = - mnl_attr_nest_start_check(buf, size, act_index_cur++); - if (!act_index || - !mnl_attr_put_strz_check(buf, size, TCA_ACT_KIND, "vlan")) - goto error_nobufs; - act = mnl_attr_nest_start_check(buf, size, TCA_ACT_OPTIONS); - if (!act) - goto error_nobufs; - if (!mnl_attr_put_check(buf, size, TCA_VLAN_PARMS, - sizeof(struct tc_vlan), - &(struct tc_vlan){ - .action = TC_ACT_PIPE, - .v_action = i, - })) - goto error_nobufs; - if (i == TCA_VLAN_ACT_POP) { - mnl_attr_nest_end(buf, act); - mnl_attr_nest_end(buf, act_index); - ++action; - break; - } - if (i == TCA_VLAN_ACT_PUSH && - !mnl_attr_put_u16_check(buf, size, - TCA_VLAN_PUSH_VLAN_PROTOCOL, - conf.of_push_vlan->ethertype)) - goto error_nobufs; - na_vlan_id = mnl_nlmsg_get_payload_tail(buf); - if (!mnl_attr_put_u16_check(buf, size, TCA_VLAN_PAD, 0)) - goto error_nobufs; - na_vlan_priority = mnl_nlmsg_get_payload_tail(buf); - if (!mnl_attr_put_u8_check(buf, size, TCA_VLAN_PAD, 0)) - 
goto error_nobufs; - mnl_attr_nest_end(buf, act); - mnl_attr_nest_end(buf, act_index); - if (action->type == RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID) { -override_na_vlan_id: - na_vlan_id->nla_type = TCA_VLAN_PUSH_VLAN_ID; - *(uint16_t *)mnl_attr_get_payload(na_vlan_id) = - rte_be_to_cpu_16 - (conf.of_set_vlan_vid->vlan_vid); - } else if (action->type == - RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP) { -override_na_vlan_priority: - na_vlan_priority->nla_type = - TCA_VLAN_PUSH_VLAN_PRIORITY; - *(uint8_t *)mnl_attr_get_payload(na_vlan_priority) = - conf.of_set_vlan_pcp->vlan_pcp; - } - ++action; - break; - case END: - if (item->type != RTE_FLOW_ITEM_TYPE_END || - action->type != RTE_FLOW_ACTION_TYPE_END) - goto trans; - if (na_flower_act) - mnl_attr_nest_end(buf, na_flower_act); - if (na_flower) - mnl_attr_nest_end(buf, na_flower); - nlh = buf; - return nlh->nlmsg_len; - } - back = trans; - trans = mlx5_nl_flow_trans[trans[n - 1]]; - n = 0; - goto trans; -error_nobufs: - if (buf != buf_tmp) { - buf = buf_tmp; - size = sizeof(buf_tmp); - goto init; - } - return rte_flow_error_set - (error, ENOBUFS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, - "generated TC message is too large"); -} - -/** - * Brand rtnetlink buffer with unique handle. - * - * This handle should be unique for a given network interface to avoid - * collisions. - * - * @param buf - * Flow rule buffer previously initialized by mlx5_nl_flow_transpose(). - * @param handle - * Unique 32-bit handle to use. - */ -void -mlx5_nl_flow_brand(void *buf, uint32_t handle) -{ - struct tcmsg *tcm = mnl_nlmsg_get_payload(buf); - - tcm->tcm_handle = handle; -} - -/** - * Send Netlink message with acknowledgment. - * - * @param nl - * Libmnl socket to use. - * @param nlh - * Message to send. This function always raises the NLM_F_ACK flag before - * sending. - * - * @return - * 0 on success, a negative errno value otherwise and rte_errno is set. - */ -static int -mlx5_nl_flow_nl_ack(struct mnl_socket *nl, struct nlmsghdr *nlh) -{ - alignas(struct nlmsghdr) - uint8_t ans[mnl_nlmsg_size(sizeof(struct nlmsgerr)) + - nlh->nlmsg_len - sizeof(*nlh)]; - uint32_t seq = random(); - int ret; - - nlh->nlmsg_flags |= NLM_F_ACK; - nlh->nlmsg_seq = seq; - ret = mnl_socket_sendto(nl, nlh, nlh->nlmsg_len); - if (ret != -1) - ret = mnl_socket_recvfrom(nl, ans, sizeof(ans)); - if (ret != -1) - ret = mnl_cb_run - (ans, ret, seq, mnl_socket_get_portid(nl), NULL, NULL); - if (!ret) - return 0; - rte_errno = errno; - return -rte_errno; -} - -/** - * Create a Netlink flow rule. - * - * @param nl - * Libmnl socket to use. - * @param buf - * Flow rule buffer previously initialized by mlx5_nl_flow_transpose(). - * @param[out] error - * Perform verbose error reporting if not NULL. - * - * @return - * 0 on success, a negative errno value otherwise and rte_errno is set. - */ -int -mlx5_nl_flow_create(struct mnl_socket *nl, void *buf, - struct rte_flow_error *error) -{ - struct nlmsghdr *nlh = buf; - - nlh->nlmsg_type = RTM_NEWTFILTER; - nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL; - if (!mlx5_nl_flow_nl_ack(nl, nlh)) - return 0; - return rte_flow_error_set - (error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, - "netlink: failed to create TC flow rule"); -} - -/** - * Destroy a Netlink flow rule. - * - * @param nl - * Libmnl socket to use. - * @param buf - * Flow rule buffer previously initialized by mlx5_nl_flow_transpose(). - * @param[out] error - * Perform verbose error reporting if not NULL. 
- * - * @return - * 0 on success, a negative errno value otherwise and rte_errno is set. - */ -int -mlx5_nl_flow_destroy(struct mnl_socket *nl, void *buf, - struct rte_flow_error *error) -{ - struct nlmsghdr *nlh = buf; - - nlh->nlmsg_type = RTM_DELTFILTER; - nlh->nlmsg_flags = NLM_F_REQUEST; - if (!mlx5_nl_flow_nl_ack(nl, nlh)) - return 0; - return rte_flow_error_set - (error, errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, - "netlink: failed to destroy TC flow rule"); -} - -/** - * Initialize ingress qdisc of a given network interface. - * - * @param nl - * Libmnl socket of the @p NETLINK_ROUTE kind. - * @param ifindex - * Index of network interface to initialize. - * @param[out] error - * Perform verbose error reporting if not NULL. - * - * @return - * 0 on success, a negative errno value otherwise and rte_errno is set. - */ -int -mlx5_nl_flow_init(struct mnl_socket *nl, unsigned int ifindex, - struct rte_flow_error *error) -{ - struct nlmsghdr *nlh; - struct tcmsg *tcm; - alignas(struct nlmsghdr) - uint8_t buf[mnl_nlmsg_size(sizeof(*tcm) + 128)]; - - /* Destroy existing ingress qdisc and everything attached to it. */ - nlh = mnl_nlmsg_put_header(buf); - nlh->nlmsg_type = RTM_DELQDISC; - nlh->nlmsg_flags = NLM_F_REQUEST; - tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm)); - tcm->tcm_family = AF_UNSPEC; - tcm->tcm_ifindex = ifindex; - tcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0); - tcm->tcm_parent = TC_H_INGRESS; - /* Ignore errors when qdisc is already absent. */ - if (mlx5_nl_flow_nl_ack(nl, nlh) && - rte_errno != EINVAL && rte_errno != ENOENT) - return rte_flow_error_set - (error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, - NULL, "netlink: failed to remove ingress qdisc"); - /* Create fresh ingress qdisc. */ - nlh = mnl_nlmsg_put_header(buf); - nlh->nlmsg_type = RTM_NEWQDISC; - nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL; - tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm)); - tcm->tcm_family = AF_UNSPEC; - tcm->tcm_ifindex = ifindex; - tcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0); - tcm->tcm_parent = TC_H_INGRESS; - mnl_attr_put_strz_check(nlh, sizeof(buf), TCA_KIND, "ingress"); - if (mlx5_nl_flow_nl_ack(nl, nlh)) - return rte_flow_error_set - (error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, - NULL, "netlink: failed to create ingress qdisc"); - return 0; -} - -/** - * Create and configure a libmnl socket for Netlink flow rules. - * - * @return - * A valid libmnl socket object pointer on success, NULL otherwise and - * rte_errno is set. - */ -struct mnl_socket * -mlx5_nl_flow_socket_create(void) -{ - struct mnl_socket *nl = mnl_socket_open(NETLINK_ROUTE); - - if (nl) { - mnl_socket_setsockopt(nl, NETLINK_CAP_ACK, &(int){ 1 }, - sizeof(int)); - if (!mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID)) - return nl; - } - rte_errno = errno; - if (nl) - mnl_socket_close(nl); - return NULL; -} - -/** - * Destroy a libmnl socket. 
- */
-void
-mlx5_nl_flow_socket_destroy(struct mnl_socket *nl)
-{
-	mnl_socket_close(nl);
-}

From patchwork Wed Sep 19 07:21:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yongseok Koh
X-Patchwork-Id: 44900
X-Patchwork-Delegate: shahafs@mellanox.com
From: Yongseok Koh
To: Shahaf Shuler
CC: "dev@dpdk.org", Yongseok Koh
Thread-Topic: [PATCH 3/3] net/mlx5: add Linux TC flower driver for E-Switch flow
Date: Wed, 19 Sep 2018 07:21:58 +0000
Message-ID: <20180919072143.23211-4-yskoh@mellanox.com>
References: <20180919072143.23211-1-yskoh@mellanox.com>
In-Reply-To: <20180919072143.23211-1-yskoh@mellanox.com>
Subject: [dpdk-dev] [PATCH 3/3] net/mlx5: add Linux TC flower driver for E-Switch flow
List-Id: DPDK patches and discussions
Sender: "dev"

Flows with the 'transfer' attribute have to be inserted into the E-Switch
on the NIC, and their control path uses the Linux TC flower interface via
a Netlink socket. This patch adds the flow driver on top of the new flow
engine.
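For illustration, here is a minimal application-side sketch of the usage this
driver enables (not part of the patch; port numbers and the helper name are
hypothetical): a rule created through rte_flow_create() with attr.transfer set
is dispatched to the new TCF path and programmed as a TC flower filter.

#include <stdio.h>
#include <rte_flow.h>

/*
 * Illustrative sketch, not part of the patch: redirect all ingress
 * traffic of 'src_port' to 'dst_port' through the E-Switch. With
 * attr.transfer set, mlx5 selects MLX5_FLOW_TYPE_TCF and translates
 * the rule into a TC flower filter with a mirred redirect action.
 */
static struct rte_flow *
eswitch_redirect(uint16_t src_port, uint16_t dst_port)
{
	struct rte_flow_error error = { 0 };
	const struct rte_flow_attr attr = {
		.ingress = 1,
		.transfer = 1,
	};
	const struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH }, /* match any Ethernet */
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	const struct rte_flow_action_port_id port_id = { .id = dst_port };
	const struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &port_id },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	struct rte_flow *flow;

	flow = rte_flow_create(src_port, &attr, pattern, actions, &error);
	if (!flow)
		fprintf(stderr, "flow creation failed: %s\n",
			error.message ? error.message : "unspecified");
	return flow;
}

The resulting Netlink request is roughly what a command such as "tc filter
add dev <ifname> ingress protocol all flower skip_sw action mirred egress
redirect dev <peer>" would ask of the kernel.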
Signed-off-by: Yongseok Koh --- drivers/net/mlx5/Makefile | 1 + drivers/net/mlx5/mlx5.c | 33 + drivers/net/mlx5/mlx5_flow.c | 6 +- drivers/net/mlx5/mlx5_flow.h | 20 + drivers/net/mlx5/mlx5_flow_tcf.c | 1608 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 1667 insertions(+), 1 deletion(-) create mode 100644 drivers/net/mlx5/mlx5_flow_tcf.c diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index 9c1044808..ca1de9f21 100644 --- a/drivers/net/mlx5/Makefile +++ b/drivers/net/mlx5/Makefile @@ -32,6 +32,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 47cf538e2..e3c36710c 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -44,6 +44,7 @@ #include "mlx5_rxtx.h" #include "mlx5_autoconf.h" #include "mlx5_defs.h" +#include "mlx5_flow.h" #include "mlx5_glue.h" #include "mlx5_mr.h" #include "mlx5_flow.h" @@ -286,6 +287,8 @@ mlx5_dev_close(struct rte_eth_dev *dev) close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); + if (priv->mnl_socket) + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); ret = mlx5_hrxq_ibv_verify(dev); if (ret) DRV_LOG(WARNING, "port %u some hash Rx queue still remain", @@ -1135,6 +1138,34 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0)); if (vf && config.vf_nl_en) mlx5_nl_mac_addr_sync(eth_dev); + priv->mnl_socket = mlx5_flow_tcf_socket_create(); + if (!priv->mnl_socket) { + err = -rte_errno; + DRV_LOG(WARNING, + "flow rules relying on switch offloads will not be" + " supported: cannot open libmnl socket: %s", + strerror(rte_errno)); + } else { + struct rte_flow_error error; + unsigned int ifindex = mlx5_ifindex(eth_dev); + + if (!ifindex) { + err = -rte_errno; + error.message = + "cannot retrieve network interface index"; + } else { + err = mlx5_flow_tcf_init(priv->mnl_socket, ifindex, + &error); + } + if (err) { + DRV_LOG(WARNING, + "flow rules relying on switch offloads will" + " not be supported: %s: %s", + error.message, strerror(rte_errno)); + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); + priv->mnl_socket = NULL; + } + } TAILQ_INIT(&priv->flows); TAILQ_INIT(&priv->ctrl_flows); /* Hint libmlx5 to use PMD allocator for data plane resources */ @@ -1187,6 +1218,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, close(priv->nl_socket_route); if (priv->nl_socket_rdma >= 0) close(priv->nl_socket_rdma); + if (priv->mnl_socket) + mlx5_flow_tcf_socket_destroy(priv->mnl_socket); if (own_domain_id) claim_zero(rte_eth_switch_domain_free(priv->domain_id)); rte_free(priv); diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 2d3158a6f..19d445596 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -41,6 +41,7 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate; #ifdef HAVE_IBV_FLOW_DV_SUPPORT extern const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops; #endif +extern const struct mlx5_flow_driver_ops mlx5_flow_tcf_drv_ops; extern const struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops; const struct mlx5_flow_driver_ops mlx5_flow_null_drv_ops; @@ -50,6 +51,7 @@ const struct 
mlx5_flow_driver_ops *flow_drv_ops[] = { #ifdef HAVE_IBV_FLOW_DV_SUPPORT [MLX5_FLOW_TYPE_DV] = &mlx5_flow_dv_drv_ops, #endif
+ [MLX5_FLOW_TYPE_TCF] = &mlx5_flow_tcf_drv_ops, [MLX5_FLOW_TYPE_VERBS] = &mlx5_flow_verbs_drv_ops,
[MLX5_FLOW_TYPE_MAX] = &mlx5_flow_null_drv_ops }; @@ -1610,7 +1612,9 @@ flow_get_drv_type(struct rte_eth_dev *dev __rte_unused,
struct priv *priv __rte_unused = dev->data->dev_private; enum mlx5_flow_drv_type type = MLX5_FLOW_TYPE_MAX;
- if (!attr->transfer) { + if (attr->transfer) { + type = MLX5_FLOW_TYPE_TCF; + } else { #ifdef HAVE_IBV_FLOW_DV_SUPPORT
type = priv->config.dv_flow_en ? MLX5_FLOW_TYPE_DV : MLX5_FLOW_TYPE_VERBS;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 2bc3bee8c..10d700a7f 100644
--- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -82,6 +82,11 @@
#define MLX5_ACTION_FLAG (1u << 3) #define MLX5_ACTION_MARK (1u << 4) #define MLX5_ACTION_COUNT (1u << 5)
+#define MLX5_ACTION_PORT_ID (1u << 6) +#define MLX5_ACTION_OF_POP_VLAN (1u << 7) +#define MLX5_ACTION_OF_PUSH_VLAN (1u << 8)
+#define MLX5_ACTION_OF_SET_VLAN_VID (1u << 9) +#define MLX5_ACTION_OF_SET_VLAN_PCP (1u << 10)
/* possible L3 layers protocols filtering. */ #define MLX5_IP_PROTOCOL_TCP 6 @@ -131,6 +136,7 @@
enum mlx5_flow_drv_type { MLX5_FLOW_TYPE_MIN, MLX5_FLOW_TYPE_DV, + MLX5_FLOW_TYPE_TCF, MLX5_FLOW_TYPE_VERBS, MLX5_FLOW_TYPE_MAX, };
@@ -170,6 +176,12 @@ struct mlx5_flow_dv { int actions_n; /**< number of actions. */ };
+/** Linux TC flower driver for E-Switch flow. */ +struct mlx5_flow_tcf { + struct nlmsghdr *nlh; + struct tcmsg *tcm; +}; +
/* Verbs specification header. */ struct ibv_spec_header { enum ibv_flow_spec_type type; @@ -199,6 +211,7 @@
struct mlx5_flow { #ifdef HAVE_IBV_FLOW_DV_SUPPORT struct mlx5_flow_dv dv; #endif + struct mlx5_flow_tcf tcf; struct mlx5_flow_verbs verbs; }; };
@@ -322,4 +335,11 @@ int mlx5_flow_validate_item_vxlan_gpe(const struct rte_flow_item *item, struct rte_eth_dev *dev, struct rte_flow_error *error);
+/* mlx5_flow_tcf.c */ + +int mlx5_flow_tcf_init(struct mnl_socket *nl, unsigned int ifindex, + struct rte_flow_error *error);
+struct mnl_socket *mlx5_flow_tcf_socket_create(void); +void mlx5_flow_tcf_socket_destroy(struct mnl_socket *nl); +
#endif /* RTE_PMD_MLX5_FLOW_H_ */
diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c new file mode 100644 index 000000000..14376188e
--- /dev/null +++ b/drivers/net/mlx5/mlx5_flow_tcf.c @@ -0,0 +1,1608 @@
+/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 6WIND S.A. + * Copyright 2018 Mellanox Technologies, Ltd + */ +
+#include <assert.h> +#include <errno.h> +#include <libmnl/libmnl.h> +#include <linux/if_ether.h>
+#include <linux/netlink.h> +#include <linux/pkt_cls.h> +#include <linux/pkt_sched.h> +#include <linux/rtnetlink.h>
+#include <linux/tc_act/tc_gact.h> +#include <linux/tc_act/tc_mirred.h> +#include <netinet/in.h>
+#include <stdalign.h> +#include <stdbool.h> +#include <stddef.h> +#include <stdint.h> +#include <stdlib.h> +#include <sys/socket.h> +
+#include <rte_byteorder.h> +#include <rte_errno.h> +#include <rte_ether.h> +#include <rte_flow.h> +#include <rte_malloc.h> +
+#include "mlx5.h" +#include "mlx5_flow.h" +#include "mlx5_autoconf.h" +
+#ifdef HAVE_TC_ACT_VLAN + +#include <linux/tc_act/tc_vlan.h> + +#else /* HAVE_TC_ACT_VLAN */ +
+#define TCA_VLAN_ACT_POP 1 +#define TCA_VLAN_ACT_PUSH 2 +#define TCA_VLAN_ACT_MODIFY 3 +#define TCA_VLAN_PARMS 2
+#define TCA_VLAN_PUSH_VLAN_ID 3 +#define TCA_VLAN_PUSH_VLAN_PROTOCOL 4 +#define TCA_VLAN_PAD 5 +#define TCA_VLAN_PUSH_VLAN_PRIORITY 6 +
+struct tc_vlan { + tc_gen; + int v_action; +}; + +#endif /* HAVE_TC_ACT_VLAN */ +
+/* Normally found in linux/netlink.h. */ +#ifndef NETLINK_CAP_ACK +#define NETLINK_CAP_ACK 10 +#endif +
+/* Normally found in linux/pkt_sched.
*/ +#ifndef TC_H_MIN_INGRESS +#define TC_H_MIN_INGRESS 0xfff2u +#endif + +/* Normally found in linux/pkt_cls.h. */ +#ifndef TCA_CLS_FLAGS_SKIP_SW +#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) +#endif +#ifndef HAVE_TCA_FLOWER_ACT +#define TCA_FLOWER_ACT 3 +#endif +#ifndef HAVE_TCA_FLOWER_FLAGS +#define TCA_FLOWER_FLAGS 22 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_ETH_TYPE +#define TCA_FLOWER_KEY_ETH_TYPE 8 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_ETH_DST +#define TCA_FLOWER_KEY_ETH_DST 4 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_ETH_DST_MASK +#define TCA_FLOWER_KEY_ETH_DST_MASK 5 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_ETH_SRC +#define TCA_FLOWER_KEY_ETH_SRC 6 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_ETH_SRC_MASK +#define TCA_FLOWER_KEY_ETH_SRC_MASK 7 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IP_PROTO +#define TCA_FLOWER_KEY_IP_PROTO 9 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV4_SRC +#define TCA_FLOWER_KEY_IPV4_SRC 10 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV4_SRC_MASK +#define TCA_FLOWER_KEY_IPV4_SRC_MASK 11 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV4_DST +#define TCA_FLOWER_KEY_IPV4_DST 12 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV4_DST_MASK +#define TCA_FLOWER_KEY_IPV4_DST_MASK 13 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV6_SRC +#define TCA_FLOWER_KEY_IPV6_SRC 14 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV6_SRC_MASK +#define TCA_FLOWER_KEY_IPV6_SRC_MASK 15 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV6_DST +#define TCA_FLOWER_KEY_IPV6_DST 16 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_IPV6_DST_MASK +#define TCA_FLOWER_KEY_IPV6_DST_MASK 17 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_TCP_SRC +#define TCA_FLOWER_KEY_TCP_SRC 18 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_TCP_SRC_MASK +#define TCA_FLOWER_KEY_TCP_SRC_MASK 35 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_TCP_DST +#define TCA_FLOWER_KEY_TCP_DST 19 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_TCP_DST_MASK +#define TCA_FLOWER_KEY_TCP_DST_MASK 36 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_UDP_SRC +#define TCA_FLOWER_KEY_UDP_SRC 20 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_UDP_SRC_MASK +#define TCA_FLOWER_KEY_UDP_SRC_MASK 37 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_UDP_DST +#define TCA_FLOWER_KEY_UDP_DST 21 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_UDP_DST_MASK +#define TCA_FLOWER_KEY_UDP_DST_MASK 38 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_VLAN_ID +#define TCA_FLOWER_KEY_VLAN_ID 23 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_VLAN_PRIO +#define TCA_FLOWER_KEY_VLAN_PRIO 24 +#endif +#ifndef HAVE_TCA_FLOWER_KEY_VLAN_ETH_TYPE +#define TCA_FLOWER_KEY_VLAN_ETH_TYPE 25 +#endif + +#ifndef IPV6_ADDR_LEN +#define IPV6_ADDR_LEN 16 +#endif + +/** Empty masks for known item types. */ +static const union { + struct rte_flow_item_port_id port_id; + struct rte_flow_item_eth eth; + struct rte_flow_item_vlan vlan; + struct rte_flow_item_ipv4 ipv4; + struct rte_flow_item_ipv6 ipv6; + struct rte_flow_item_tcp tcp; + struct rte_flow_item_udp udp; +} flow_tcf_mask_empty; + +/** Supported masks for known item types. */ +static const struct { + struct rte_flow_item_port_id port_id; + struct rte_flow_item_eth eth; + struct rte_flow_item_vlan vlan; + struct rte_flow_item_ipv4 ipv4; + struct rte_flow_item_ipv6 ipv6; + struct rte_flow_item_tcp tcp; + struct rte_flow_item_udp udp; +} flow_tcf_mask_supported = { + .port_id = { + .id = 0xffffffff, + }, + .eth = { + .type = RTE_BE16(0xffff), + .dst.addr_bytes = "\xff\xff\xff\xff\xff\xff", + .src.addr_bytes = "\xff\xff\xff\xff\xff\xff", + }, + .vlan = { + /* PCP and VID only, no DEI. 
*/ + .tci = RTE_BE16(0xefff), + .inner_type = RTE_BE16(0xffff), + }, + .ipv4.hdr = { + .next_proto_id = 0xff, + .src_addr = RTE_BE32(0xffffffff), + .dst_addr = RTE_BE32(0xffffffff), + }, + .ipv6.hdr = { + .proto = 0xff, + .src_addr = + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff", + .dst_addr = + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff", + }, + .tcp.hdr = { + .src_port = RTE_BE16(0xffff), + .dst_port = RTE_BE16(0xffff), + }, + .udp.hdr = { + .src_port = RTE_BE16(0xffff), + .dst_port = RTE_BE16(0xffff), + }, +}; + +#define SZ_NLATTR_HDR MNL_ALIGN(sizeof(struct nlattr)) +#define SZ_NLATTR_NEST SZ_NLATTR_HDR +#define SZ_NLATTR_DATA_OF(len) MNL_ALIGN(SZ_NLATTR_HDR + (len)) +#define SZ_NLATTR_TYPE_OF(typ) SZ_NLATTR_DATA_OF(sizeof(typ)) +#define SZ_NLATTR_STRZ_OF(str) SZ_NLATTR_DATA_OF(strlen(str) + 1) + +#define PTOI_TABLE_SZ_MAX(dev) (mlx5_dev_to_port_id((dev)->device, NULL, 0) + 2) + +/** DPDK port to network interface index (ifindex) conversion. */ +struct flow_tcf_ptoi { + uint16_t port_id; /**< DPDK port ID. */ + unsigned int ifindex; /**< Network interface index. */ +}; + +#define MLX5_TCF_FATE_ACTIONS (MLX5_ACTION_DROP | MLX5_ACTION_PORT_ID) + +/** + * Retrieve mask for pattern item. + * + * This function does basic sanity checks on a pattern item in order to + * return the most appropriate mask for it. + * + * @param[in] item + * Item specification. + * @param[in] mask_default + * Default mask for pattern item as specified by the flow API. + * @param[in] mask_supported + * Mask fields supported by the implementation. + * @param[in] mask_empty + * Empty mask to return when there is no specification. + * @param[out] error + * Perform verbose error reporting if not NULL. + * + * @return + * Either @p item->mask or one of the mask parameters on success, NULL + * otherwise and rte_errno is set. + */ +static const void * +flow_tcf_item_mask(const struct rte_flow_item *item, const void *mask_default, + const void *mask_supported, const void *mask_empty, + size_t mask_size, struct rte_flow_error *error) +{ + const uint8_t *mask; + size_t i; + + /* item->last and item->mask cannot exist without item->spec. */ + if (!item->spec && (item->mask || item->last)) { + rte_flow_error_set(error, EINVAL, + RTE_FLOW_ERROR_TYPE_ITEM, item, + "\"mask\" or \"last\" field provided without" + " a corresponding \"spec\""); + return NULL; + } + /* No spec, no mask, no problem. */ + if (!item->spec) + return mask_empty; + mask = item->mask ? item->mask : mask_default; + assert(mask); + /* + * Single-pass check to make sure that: + * - Mask is supported, no bits are set outside mask_supported. + * - Both item->spec and item->last are included in mask. + */ + for (i = 0; i != mask_size; ++i) { + if (!mask[i]) + continue; + if ((mask[i] | ((const uint8_t *)mask_supported)[i]) != + ((const uint8_t *)mask_supported)[i]) { + rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, mask, + "unsupported field found" + " in \"mask\""); + return NULL; + } + if (item->last && + (((const uint8_t *)item->spec)[i] & mask[i]) != + (((const uint8_t *)item->last)[i] & mask[i])) { + rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_LAST, + item->last, + "range between \"spec\" and \"last\"" + " not comprised in \"mask\""); + return NULL; + } + } + return mask; +} + +/** + * Build a conversion table between port ID and ifindex. + * + * @param[in] dev + * Pointer to Ethernet device. + * @param[out] ptoi + * Pointer to ptoi table. 
+ * @param[in] len + * Size of ptoi table provided. + * + * @return + * Size of ptoi table filled. + */
+static unsigned int +flow_tcf_build_ptoi_table(struct rte_eth_dev *dev, struct flow_tcf_ptoi *ptoi, + unsigned int len)
+{ + unsigned int n = mlx5_dev_to_port_id(dev->device, NULL, 0); + uint16_t port_id[n + 1]; + unsigned int i; + unsigned int own = 0; +
+ /* At least one port is needed when no switch domain is present. */ + if (!n) { + n = 1; + port_id[0] = dev->data->port_id;
+ } else { + n = RTE_MIN(mlx5_dev_to_port_id(dev->device, port_id, n), n); + } + if (n > len) + return 0;
+ for (i = 0; i != n; ++i) { + struct rte_eth_dev_info dev_info; + + rte_eth_dev_info_get(port_id[i], &dev_info);
+ if (port_id[i] == dev->data->port_id) + own = i; + ptoi[i].port_id = port_id[i]; + ptoi[i].ifindex = dev_info.if_index; + }
+ /* Ensure first entry of ptoi[] is the current device. */ + if (own) { + ptoi[n] = ptoi[0]; + ptoi[0] = ptoi[own]; + ptoi[own] = ptoi[n]; + }
+ /* An entry with zero ifindex terminates ptoi[]. */ + ptoi[n].port_id = 0; + ptoi[n].ifindex = 0; + return n; +} +
+/** + * Verify the @p attr will be correctly understood by the E-switch. + * + * @param[in] attr + * Pointer to flow attributes.
+ * @param[out] error + * Pointer to error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */
+static int +flow_tcf_validate_attributes(const struct rte_flow_attr *attr, + struct rte_flow_error *error) +{
+ /* + * Supported attributes: no groups, some priorities and ingress only. + * Don't care about transfer as it is the caller's problem. + */
+ if (attr->group) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ATTR_GROUP, attr, + "groups are not supported");
+ if (attr->priority > 0xfffe) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, + attr, + "lowest priority level is 0xfffe");
+ if (!attr->ingress) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, + attr, "only ingress is supported");
+ if (attr->egress) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, + attr, "egress is not supported");
+ return 0; +} +
+/** + * Validate flow for E-Switch. + * + * @param[in] priv + * Pointer to the priv structure. + * @param[in] attr + * Pointer to the flow attributes.
+ * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. + * @param[out] error + * Pointer to the error structure.
+ * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */ +static int +flow_tcf_validate(struct rte_eth_dev *dev, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[], + struct rte_flow_error *error) +{ + union { + const struct rte_flow_item_port_id *port_id; + const struct rte_flow_item_eth *eth; + const struct rte_flow_item_vlan *vlan; + const struct rte_flow_item_ipv4 *ipv4; + const struct rte_flow_item_ipv6 *ipv6; + const struct rte_flow_item_tcp *tcp; + const struct rte_flow_item_udp *udp; + } spec, mask; + union { + const struct rte_flow_action_port_id *port_id; + const struct rte_flow_action_of_push_vlan *of_push_vlan; + const struct rte_flow_action_of_set_vlan_vid * + of_set_vlan_vid; + const struct rte_flow_action_of_set_vlan_pcp * + of_set_vlan_pcp; + } conf; + uint32_t item_flags = 0; + uint32_t action_flags = 0; + uint8_t next_protocol = -1; + unsigned int tcm_ifindex = 0; + struct flow_tcf_ptoi ptoi[PTOI_TABLE_SZ_MAX(dev)]; + bool in_port_id_set; + int ret; + + claim_nonzero(flow_tcf_build_ptoi_table(dev, ptoi, + PTOI_TABLE_SZ_MAX(dev))); + ret = flow_tcf_validate_attributes(attr, error); + if (ret < 0) + return ret; + for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) { + unsigned int i; + + switch (items->type) { + case RTE_FLOW_ITEM_TYPE_VOID: + break; + case RTE_FLOW_ITEM_TYPE_PORT_ID: + mask.port_id = flow_tcf_item_mask + (items, &rte_flow_item_port_id_mask, + &flow_tcf_mask_supported.port_id, + &flow_tcf_mask_empty.port_id, + sizeof(flow_tcf_mask_supported.port_id), + error); + if (!mask.port_id) + return -rte_errno; + if (mask.port_id == &flow_tcf_mask_empty.port_id) { + in_port_id_set = 1; + break; + } + spec.port_id = items->spec; + if (mask.port_id->id && mask.port_id->id != 0xffffffff) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, + mask.port_id, + "no support for partial mask on" + " \"id\" field"); + if (!mask.port_id->id) + i = 0; + else + for (i = 0; ptoi[i].ifindex; ++i) + if (ptoi[i].port_id == spec.port_id->id) + break; + if (!ptoi[i].ifindex) + return rte_flow_error_set + (error, ENODEV, + RTE_FLOW_ERROR_TYPE_ITEM_SPEC, + spec.port_id, + "missing data to convert port ID to" + " ifindex"); + if (in_port_id_set && ptoi[i].ifindex != tcm_ifindex) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_SPEC, + spec.port_id, + "cannot match traffic for" + " several port IDs through" + " a single flow rule"); + tcm_ifindex = ptoi[i].ifindex; + in_port_id_set = 1; + break; + case RTE_FLOW_ITEM_TYPE_ETH: + ret = mlx5_flow_validate_item_eth(items, item_flags, + error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_L2; + /* TODO: + * Redundant check due to different supported mask. + * Same for the rest of items. 
+ */ + mask.eth = flow_tcf_item_mask + (items, &rte_flow_item_eth_mask, + &flow_tcf_mask_supported.eth, + &flow_tcf_mask_empty.eth, + sizeof(flow_tcf_mask_supported.eth), + error); + if (!mask.eth) + return -rte_errno; + if (mask.eth->type && mask.eth->type != + RTE_BE16(0xffff)) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, + mask.eth, + "no support for partial mask on" + " \"type\" field"); + break; + case RTE_FLOW_ITEM_TYPE_VLAN: + ret = mlx5_flow_validate_item_vlan(items, item_flags, + error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_VLAN; + mask.vlan = flow_tcf_item_mask + (items, &rte_flow_item_vlan_mask, + &flow_tcf_mask_supported.vlan, + &flow_tcf_mask_empty.vlan, + sizeof(flow_tcf_mask_supported.vlan), + error); + if (!mask.vlan) + return -rte_errno; + if ((mask.vlan->tci & RTE_BE16(0xe000) && + (mask.vlan->tci & RTE_BE16(0xe000)) != + RTE_BE16(0xe000)) || + (mask.vlan->tci & RTE_BE16(0x0fff) && + (mask.vlan->tci & RTE_BE16(0x0fff)) != + RTE_BE16(0x0fff)) || + (mask.vlan->inner_type && + mask.vlan->inner_type != RTE_BE16(0xffff))) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, + mask.vlan, + "no support for partial masks on" + " \"tci\" (PCP and VID parts) and" + " \"inner_type\" fields"); + break; + case RTE_FLOW_ITEM_TYPE_IPV4: + ret = mlx5_flow_validate_item_ipv4(items, item_flags, + error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV4; + mask.ipv4 = flow_tcf_item_mask + (items, &rte_flow_item_ipv4_mask, + &flow_tcf_mask_supported.ipv4, + &flow_tcf_mask_empty.ipv4, + sizeof(flow_tcf_mask_supported.ipv4), + error); + if (!mask.ipv4) + return -rte_errno; + if (mask.ipv4->hdr.next_proto_id && + mask.ipv4->hdr.next_proto_id != 0xff) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, + mask.ipv4, + "no support for partial mask on" + " \"hdr.next_proto_id\" field"); + else if (mask.ipv4->hdr.next_proto_id) + next_protocol = + ((const struct rte_flow_item_ipv4 *) + (items->spec))->hdr.next_proto_id; + break; + case RTE_FLOW_ITEM_TYPE_IPV6: + ret = mlx5_flow_validate_item_ipv6(items, item_flags, + error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV6; + mask.ipv6 = flow_tcf_item_mask + (items, &rte_flow_item_ipv6_mask, + &flow_tcf_mask_supported.ipv6, + &flow_tcf_mask_empty.ipv6, + sizeof(flow_tcf_mask_supported.ipv6), + error); + if (!mask.ipv6) + return -rte_errno; + if (mask.ipv6->hdr.proto && + mask.ipv6->hdr.proto != 0xff) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM_MASK, + mask.ipv6, + "no support for partial mask on" + " \"hdr.proto\" field"); + else if (mask.ipv6->hdr.proto) + next_protocol = + ((const struct rte_flow_item_ipv6 *) + (items->spec))->hdr.proto; + break; + case RTE_FLOW_ITEM_TYPE_UDP: + ret = mlx5_flow_validate_item_udp(items, item_flags, + next_protocol, error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_L4_UDP; + mask.udp = flow_tcf_item_mask + (items, &rte_flow_item_udp_mask, + &flow_tcf_mask_supported.udp, + &flow_tcf_mask_empty.udp, + sizeof(flow_tcf_mask_supported.udp), + error); + if (!mask.udp) + return -rte_errno; + break; + case RTE_FLOW_ITEM_TYPE_TCP: + ret = mlx5_flow_validate_item_tcp(items, item_flags, + next_protocol, error); + if (ret < 0) + return ret; + item_flags |= MLX5_FLOW_LAYER_OUTER_L4_TCP; + mask.tcp = flow_tcf_item_mask + (items, &rte_flow_item_tcp_mask, + &flow_tcf_mask_supported.tcp, + 
&flow_tcf_mask_empty.tcp, + sizeof(flow_tcf_mask_supported.tcp), + error); + if (!mask.tcp) + return -rte_errno; + break; + default: + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM, + NULL, "item not supported"); + } + } + for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) { + unsigned int i; + + switch (actions->type) { + case RTE_FLOW_ACTION_TYPE_VOID: + break; + case RTE_FLOW_ACTION_TYPE_PORT_ID: + if (action_flags & MLX5_TCF_FATE_ACTIONS) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ACTION, actions, + "can't have multiple fate actions"); + conf.port_id = actions->conf; + if (conf.port_id->original) + i = 0; + else + for (i = 0; ptoi[i].ifindex; ++i) + if (ptoi[i].port_id == conf.port_id->id) + break; + if (!ptoi[i].ifindex) + return rte_flow_error_set + (error, ENODEV, + RTE_FLOW_ERROR_TYPE_ACTION_CONF, + conf.port_id, + "missing data to convert port ID to" + " ifindex"); + action_flags |= MLX5_ACTION_PORT_ID; + break; + case RTE_FLOW_ACTION_TYPE_DROP: + if (action_flags & MLX5_TCF_FATE_ACTIONS) + return rte_flow_error_set + (error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ACTION, actions, + "can't have multiple fate actions"); + action_flags |= MLX5_ACTION_DROP; + break; + case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN: + action_flags |= MLX5_ACTION_OF_POP_VLAN; + break; + case RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN: + action_flags |= MLX5_ACTION_OF_PUSH_VLAN; + break; + case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID: + action_flags |= MLX5_ACTION_OF_SET_VLAN_VID; + break; + case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP: + action_flags |= MLX5_ACTION_OF_SET_VLAN_PCP; + break; + default: + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ACTION, + actions, + "action not supported"); + } + } + return 0; +} + +/** + * Calculate maximum size of memory for flow items of Linux TC flower and + * extract specified items. + * + * @param[in] items + * Pointer to the list of items. + * @param[out] item_flags + * Pointer to the detected items. + * + * @return + * Maximum size of memory for items. + */ +static int +flow_tcf_get_items_and_size(const struct rte_flow_item items[], + uint64_t *item_flags) +{ + int size = 0; + uint64_t flags = 0; + + size += SZ_NLATTR_STRZ_OF("flower") + + SZ_NLATTR_NEST + /* TCA_OPTIONS. */ + SZ_NLATTR_TYPE_OF(uint32_t); /* TCA_CLS_FLAGS_SKIP_SW. */ + for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) { + switch (items->type) { + case RTE_FLOW_ITEM_TYPE_VOID: + break; + case RTE_FLOW_ITEM_TYPE_PORT_ID: + break; + case RTE_FLOW_ITEM_TYPE_ETH: + size += SZ_NLATTR_TYPE_OF(uint16_t) + /* Ether type. */ + SZ_NLATTR_DATA_OF(ETHER_ADDR_LEN) * 4; + /* dst/src MAC addr and mask. */ + flags |= MLX5_FLOW_LAYER_OUTER_L2; + break; + case RTE_FLOW_ITEM_TYPE_VLAN: + size += SZ_NLATTR_TYPE_OF(uint16_t) + /* Ether type. */ + SZ_NLATTR_TYPE_OF(uint16_t) + + /* VLAN Ether type. */ + SZ_NLATTR_TYPE_OF(uint8_t) + /* VLAN prio. */ + SZ_NLATTR_TYPE_OF(uint16_t); /* VLAN ID. */ + flags |= MLX5_FLOW_LAYER_OUTER_VLAN; + break; + case RTE_FLOW_ITEM_TYPE_IPV4: + size += SZ_NLATTR_TYPE_OF(uint16_t) + /* Ether type. */ + SZ_NLATTR_TYPE_OF(uint8_t) + /* IP proto. */ + SZ_NLATTR_TYPE_OF(uint32_t) * 4; + /* dst/src IP addr and mask. */ + flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV4; + break; + case RTE_FLOW_ITEM_TYPE_IPV6: + size += SZ_NLATTR_TYPE_OF(uint16_t) + /* Ether type. */ + SZ_NLATTR_TYPE_OF(uint8_t) + /* IP proto. */ + SZ_NLATTR_TYPE_OF(IPV6_ADDR_LEN) * 4; + /* dst/src IP addr and mask. 
*/ + flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV6; + break;
+ case RTE_FLOW_ITEM_TYPE_UDP: + size += SZ_NLATTR_TYPE_OF(uint8_t) + /* IP proto. */ + SZ_NLATTR_TYPE_OF(uint16_t) * 4;
+ /* dst/src port and mask. */ + flags |= MLX5_FLOW_LAYER_OUTER_L4_UDP; + break;
+ case RTE_FLOW_ITEM_TYPE_TCP: + size += SZ_NLATTR_TYPE_OF(uint8_t) + /* IP proto. */ + SZ_NLATTR_TYPE_OF(uint16_t) * 4;
+ /* dst/src port and mask. */ + flags |= MLX5_FLOW_LAYER_OUTER_L4_TCP; + break;
+ default: + DRV_LOG(WARNING, + "unsupported item %p type %d," + " items must be validated before flow creation",
+ (const void *)items, items->type); + break; + } + } + *item_flags = flags; + return size; +} +
+/** + * Calculate maximum size of memory for flow actions of Linux TC flower and + * extract specified actions. + *
+ * @param[in] actions + * Pointer to the list of actions. + * @param[out] action_flags + * Pointer to the detected actions. + *
+ * @return + * Maximum size of memory for actions. + */
+static int +flow_tcf_get_actions_and_size(const struct rte_flow_action actions[], + uint64_t *action_flags) +{
+ int size = 0; + uint64_t flags = 0; + + size += SZ_NLATTR_NEST; /* TCA_FLOWER_ACT. */
+ for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) { + switch (actions->type) { + case RTE_FLOW_ACTION_TYPE_VOID: + break;
+ case RTE_FLOW_ACTION_TYPE_PORT_ID: + size += SZ_NLATTR_NEST + /* na_act_index. */ + SZ_NLATTR_STRZ_OF("mirred") +
+ SZ_NLATTR_NEST + /* TCA_ACT_OPTIONS. */ + SZ_NLATTR_TYPE_OF(struct tc_mirred); + flags |= MLX5_ACTION_PORT_ID; + break;
+ case RTE_FLOW_ACTION_TYPE_DROP: + size += SZ_NLATTR_NEST + /* na_act_index. */ + SZ_NLATTR_STRZ_OF("gact") +
+ SZ_NLATTR_NEST + /* TCA_ACT_OPTIONS. */ + SZ_NLATTR_TYPE_OF(struct tc_gact); + flags |= MLX5_ACTION_DROP; + break;
+ case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN: + flags |= MLX5_ACTION_OF_POP_VLAN; + goto action_of_vlan;
+ case RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN: + flags |= MLX5_ACTION_OF_PUSH_VLAN; + goto action_of_vlan;
+ case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID: + flags |= MLX5_ACTION_OF_SET_VLAN_VID; + goto action_of_vlan;
+ case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP: + flags |= MLX5_ACTION_OF_SET_VLAN_PCP; + goto action_of_vlan;
+action_of_vlan: + size += SZ_NLATTR_NEST + /* na_act_index. */ + SZ_NLATTR_STRZ_OF("vlan") +
+ SZ_NLATTR_NEST + /* TCA_ACT_OPTIONS. */ + SZ_NLATTR_TYPE_OF(struct tc_vlan) + + SZ_NLATTR_TYPE_OF(uint16_t) +
+ /* VLAN protocol. */ + SZ_NLATTR_TYPE_OF(uint16_t) + /* VLAN ID. */ + SZ_NLATTR_TYPE_OF(uint8_t); /* VLAN prio. */ + break;
+ default: + DRV_LOG(WARNING, + "unsupported action %p type %d," + " actions must be validated before flow creation",
+ (const void *)actions, actions->type); + break; + } + } + *action_flags = flags; + return size; +} +
+/** + * Brand rtnetlink buffer with unique handle. + * + * This handle should be unique for a given network interface to avoid + * collisions. + *
+ * @param nlh + * Pointer to Netlink message. + * @param handle + * Unique 32-bit handle to use. + */
+static void +flow_tcf_nl_brand(struct nlmsghdr *nlh, uint32_t handle) +{ + struct tcmsg *tcm = mnl_nlmsg_get_payload(nlh); +
+ tcm->tcm_handle = handle; + DRV_LOG(DEBUG, "Netlink msg %p is branded with handle %x", + (void *)nlh, handle); +} +
+/** + * Prepare a flow object for Linux TC flower. It calculates the maximum size of + * memory required, allocates the memory, initializes Netlink message headers
+ * and sets a unique TC message handle. + * + * @param[in] attr + * Pointer to the flow attributes. + * @param[in] items + * Pointer to the list of items.
+ * @param[in] actions + * Pointer to the list of actions. + * @param[out] item_flags + * Pointer to bit mask of all items detected.
+ * @param[out] action_flags + * Pointer to bit mask of all actions detected. + * @param[out] error + * Pointer to the error structure. + *
+ * @return + * Pointer to mlx5_flow object on success, + * otherwise NULL and rte_errno is set. + */
+static struct mlx5_flow * +flow_tcf_prepare(const struct rte_flow_attr *attr __rte_unused, + const struct rte_flow_item items[],
+ const struct rte_flow_action actions[], + uint64_t *item_flags, uint64_t *action_flags, + struct rte_flow_error *error) +{
+ size_t size = sizeof(struct mlx5_flow) + + MNL_ALIGN(sizeof(struct nlmsghdr)) + + MNL_ALIGN(sizeof(struct tcmsg));
+ struct mlx5_flow *dev_flow; + struct nlmsghdr *nlh; + struct tcmsg *tcm; +
+ size += flow_tcf_get_items_and_size(items, item_flags); + size += flow_tcf_get_actions_and_size(actions, action_flags);
+ dev_flow = rte_zmalloc(__func__, size, MNL_ALIGNTO); + if (!dev_flow) { + rte_flow_error_set(error, ENOMEM,
+ RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, + "not enough memory to create E-Switch flow"); + return NULL; + }
+ nlh = mnl_nlmsg_put_header((void *)(dev_flow + 1)); + tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));
+ *dev_flow = (struct mlx5_flow){ + .tcf = (struct mlx5_flow_tcf){ + .nlh = nlh, + .tcm = tcm, + }, + };
+ /* + * Generate a reasonably unique handle based on the address of the + * target buffer. + *
+ * This is straightforward on 32-bit systems where the flow pointer can + * be used directly. Otherwise, its least significant part is taken
+ * after shifting it by the previous power of two of the pointed buffer + * size. + */
+ if (sizeof(dev_flow) <= 4) + flow_tcf_nl_brand(nlh, (uintptr_t)dev_flow);
+ else + flow_tcf_nl_brand(nlh, (uintptr_t)dev_flow >> + rte_log2_u32(rte_align32prevpow2(size)));
+ return dev_flow; +} +
+/** + * Translate flow for Linux TC flower and construct Netlink message. + *
+ * @param[in] priv + * Pointer to the priv structure. + * @param[in, out] flow + * Pointer to the sub flow. + * @param[in] attr + * Pointer to the flow attributes.
+ * @param[in] items + * Pointer to the list of items. + * @param[in] actions + * Pointer to the list of actions. + * @param[out] error + * Pointer to the error structure.
+ * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */ +static int +flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[], + struct rte_flow_error *error) +{ + union { + const struct rte_flow_item_port_id *port_id; + const struct rte_flow_item_eth *eth; + const struct rte_flow_item_vlan *vlan; + const struct rte_flow_item_ipv4 *ipv4; + const struct rte_flow_item_ipv6 *ipv6; + const struct rte_flow_item_tcp *tcp; + const struct rte_flow_item_udp *udp; + } spec, mask; + union { + const struct rte_flow_action_port_id *port_id; + const struct rte_flow_action_of_push_vlan *of_push_vlan; + const struct rte_flow_action_of_set_vlan_vid * + of_set_vlan_vid; + const struct rte_flow_action_of_set_vlan_pcp * + of_set_vlan_pcp; + } conf; + struct flow_tcf_ptoi ptoi[PTOI_TABLE_SZ_MAX(dev)]; + struct nlmsghdr *nlh = dev_flow->tcf.nlh; + struct tcmsg *tcm = dev_flow->tcf.tcm; + uint32_t na_act_index_cur; + bool eth_type_set = 0; + bool vlan_present = 0; + bool vlan_eth_type_set = 0; + bool ip_proto_set = 0; + struct nlattr *na_flower; + struct nlattr *na_flower_act; + struct nlattr *na_vlan_id = NULL; + struct nlattr *na_vlan_priority = NULL; + + claim_nonzero(flow_tcf_build_ptoi_table(dev, ptoi, + PTOI_TABLE_SZ_MAX(dev))); + nlh = dev_flow->tcf.nlh; + tcm = dev_flow->tcf.tcm; + /* Prepare API must have been called beforehand. */ + assert(nlh != NULL && tcm != NULL); + tcm->tcm_family = AF_UNSPEC; + tcm->tcm_ifindex = ptoi[0].ifindex; + tcm->tcm_parent = TC_H_MAKE(TC_H_INGRESS, TC_H_MIN_INGRESS); + /* + * Priority cannot be zero to prevent the kernel from picking one + * automatically. + */ + tcm->tcm_info = TC_H_MAKE((attr->priority + 1) << 16, + RTE_BE16(ETH_P_ALL)); + mnl_attr_put_strz(nlh, TCA_KIND, "flower"); + na_flower = mnl_attr_nest_start(nlh, TCA_OPTIONS); + mnl_attr_put_u32(nlh, TCA_FLOWER_FLAGS, TCA_CLS_FLAGS_SKIP_SW); + for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) { + unsigned int i; + + switch (items->type) { + case RTE_FLOW_ITEM_TYPE_VOID: + break; + case RTE_FLOW_ITEM_TYPE_PORT_ID: + mask.port_id = flow_tcf_item_mask + (items, &rte_flow_item_port_id_mask, + &flow_tcf_mask_supported.port_id, + &flow_tcf_mask_empty.port_id, + sizeof(flow_tcf_mask_supported.port_id), + error); + assert(mask.port_id); + if (mask.port_id == &flow_tcf_mask_empty.port_id) + break; + spec.port_id = items->spec; + if (!mask.port_id->id) + i = 0; + else + for (i = 0; ptoi[i].ifindex; ++i) + if (ptoi[i].port_id == spec.port_id->id) + break; + assert(ptoi[i].ifindex); + tcm->tcm_ifindex = ptoi[i].ifindex; + break; + case RTE_FLOW_ITEM_TYPE_ETH: + mask.eth = flow_tcf_item_mask + (items, &rte_flow_item_eth_mask, + &flow_tcf_mask_supported.eth, + &flow_tcf_mask_empty.eth, + sizeof(flow_tcf_mask_supported.eth), + error); + assert(mask.eth); + if (mask.eth == &flow_tcf_mask_empty.eth) + break; + spec.eth = items->spec; + if (mask.eth->type) { + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_ETH_TYPE, + spec.eth->type); + eth_type_set = 1; + } + if (!is_zero_ether_addr(&mask.eth->dst)) { + mnl_attr_put(nlh, TCA_FLOWER_KEY_ETH_DST, + ETHER_ADDR_LEN, + spec.eth->dst.addr_bytes); + mnl_attr_put(nlh, TCA_FLOWER_KEY_ETH_DST_MASK, + ETHER_ADDR_LEN, + mask.eth->dst.addr_bytes); + } + if (!is_zero_ether_addr(&mask.eth->src)) { + mnl_attr_put(nlh, TCA_FLOWER_KEY_ETH_SRC, + ETHER_ADDR_LEN, + spec.eth->src.addr_bytes); + mnl_attr_put(nlh, TCA_FLOWER_KEY_ETH_SRC_MASK, + ETHER_ADDR_LEN, + mask.eth->src.addr_bytes); + } + break; + case 
RTE_FLOW_ITEM_TYPE_VLAN: + mask.vlan = flow_tcf_item_mask + (items, &rte_flow_item_vlan_mask, + &flow_tcf_mask_supported.vlan, + &flow_tcf_mask_empty.vlan, + sizeof(flow_tcf_mask_supported.vlan), + error); + assert(mask.vlan); + if (!eth_type_set) + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_ETH_TYPE, + RTE_BE16(ETH_P_8021Q)); + eth_type_set = 1; + vlan_present = 1; + if (mask.vlan == &flow_tcf_mask_empty.vlan) + break; + spec.vlan = items->spec; + if (mask.vlan->inner_type) { + mnl_attr_put_u16(nlh, + TCA_FLOWER_KEY_VLAN_ETH_TYPE, + spec.vlan->inner_type); + vlan_eth_type_set = 1; + } + if (mask.vlan->tci & RTE_BE16(0xe000)) + mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_VLAN_PRIO, + (rte_be_to_cpu_16 + (spec.vlan->tci) >> 13) & 0x7); + if (mask.vlan->tci & RTE_BE16(0x0fff)) + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_VLAN_ID, + rte_be_to_cpu_16 + (spec.vlan->tci & + RTE_BE16(0x0fff))); + break; + case RTE_FLOW_ITEM_TYPE_IPV4: + mask.ipv4 = flow_tcf_item_mask + (items, &rte_flow_item_ipv4_mask, + &flow_tcf_mask_supported.ipv4, + &flow_tcf_mask_empty.ipv4, + sizeof(flow_tcf_mask_supported.ipv4), + error); + assert(mask.ipv4); + if (!eth_type_set || !vlan_eth_type_set) + mnl_attr_put_u16(nlh, + vlan_present ? + TCA_FLOWER_KEY_VLAN_ETH_TYPE : + TCA_FLOWER_KEY_ETH_TYPE, + RTE_BE16(ETH_P_IP)); + eth_type_set = 1; + vlan_eth_type_set = 1; + if (mask.ipv4 == &flow_tcf_mask_empty.ipv4) + break; + spec.ipv4 = items->spec; + if (mask.ipv4->hdr.next_proto_id) { + mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO, + spec.ipv4->hdr.next_proto_id); + ip_proto_set = 1; + } + if (mask.ipv4->hdr.src_addr) { + mnl_attr_put_u32(nlh, TCA_FLOWER_KEY_IPV4_SRC, + spec.ipv4->hdr.src_addr); + mnl_attr_put_u32(nlh, + TCA_FLOWER_KEY_IPV4_SRC_MASK, + mask.ipv4->hdr.src_addr); + } + if (mask.ipv4->hdr.dst_addr) { + mnl_attr_put_u32(nlh, TCA_FLOWER_KEY_IPV4_DST, + spec.ipv4->hdr.dst_addr); + mnl_attr_put_u32(nlh, + TCA_FLOWER_KEY_IPV4_DST_MASK, + mask.ipv4->hdr.dst_addr); + } + break; + case RTE_FLOW_ITEM_TYPE_IPV6: + mask.ipv6 = flow_tcf_item_mask + (items, &rte_flow_item_ipv6_mask, + &flow_tcf_mask_supported.ipv6, + &flow_tcf_mask_empty.ipv6, + sizeof(flow_tcf_mask_supported.ipv6), + error); + assert(mask.ipv6); + if (!eth_type_set || !vlan_eth_type_set) + mnl_attr_put_u16(nlh, + vlan_present ? 
+ TCA_FLOWER_KEY_VLAN_ETH_TYPE : + TCA_FLOWER_KEY_ETH_TYPE, + RTE_BE16(ETH_P_IPV6)); + eth_type_set = 1; + vlan_eth_type_set = 1; + if (mask.ipv6 == &flow_tcf_mask_empty.ipv6) + break; + spec.ipv6 = items->spec; + if (mask.ipv6->hdr.proto) { + mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO, + spec.ipv6->hdr.proto); + ip_proto_set = 1; + } + if (!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.src_addr)) { + mnl_attr_put(nlh, TCA_FLOWER_KEY_IPV6_SRC, + sizeof(spec.ipv6->hdr.src_addr), + spec.ipv6->hdr.src_addr); + mnl_attr_put(nlh, TCA_FLOWER_KEY_IPV6_SRC_MASK, + sizeof(mask.ipv6->hdr.src_addr), + mask.ipv6->hdr.src_addr); + } + if (!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.dst_addr)) { + mnl_attr_put(nlh, TCA_FLOWER_KEY_IPV6_DST, + sizeof(spec.ipv6->hdr.dst_addr), + spec.ipv6->hdr.dst_addr); + mnl_attr_put(nlh, TCA_FLOWER_KEY_IPV6_DST_MASK, + sizeof(mask.ipv6->hdr.dst_addr), + mask.ipv6->hdr.dst_addr); + } + break; + case RTE_FLOW_ITEM_TYPE_UDP: + mask.udp = flow_tcf_item_mask + (items, &rte_flow_item_udp_mask, + &flow_tcf_mask_supported.udp, + &flow_tcf_mask_empty.udp, + sizeof(flow_tcf_mask_supported.udp), + error); + assert(mask.udp); + if (!ip_proto_set) + mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO, + IPPROTO_UDP); + if (mask.udp == &flow_tcf_mask_empty.udp) + break; + spec.udp = items->spec; + if (mask.udp->hdr.src_port) { + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_UDP_SRC, + spec.udp->hdr.src_port); + mnl_attr_put_u16(nlh, + TCA_FLOWER_KEY_UDP_SRC_MASK, + mask.udp->hdr.src_port); + } + if (mask.udp->hdr.dst_port) { + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_UDP_DST, + spec.udp->hdr.dst_port); + mnl_attr_put_u16(nlh, + TCA_FLOWER_KEY_UDP_DST_MASK, + mask.udp->hdr.dst_port); + } + break; + case RTE_FLOW_ITEM_TYPE_TCP: + mask.tcp = flow_tcf_item_mask + (items, &rte_flow_item_tcp_mask, + &flow_tcf_mask_supported.tcp, + &flow_tcf_mask_empty.tcp, + sizeof(flow_tcf_mask_supported.tcp), + error); + assert(mask.tcp); + if (!ip_proto_set) + mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO, + IPPROTO_TCP); + if (mask.tcp == &flow_tcf_mask_empty.tcp) + break; + spec.tcp = items->spec; + if (mask.tcp->hdr.src_port) { + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_TCP_SRC, + spec.tcp->hdr.src_port); + mnl_attr_put_u16(nlh, + TCA_FLOWER_KEY_TCP_SRC_MASK, + mask.tcp->hdr.src_port); + } + if (mask.tcp->hdr.dst_port) { + mnl_attr_put_u16(nlh, TCA_FLOWER_KEY_TCP_DST, + spec.tcp->hdr.dst_port); + mnl_attr_put_u16(nlh, + TCA_FLOWER_KEY_TCP_DST_MASK, + mask.tcp->hdr.dst_port); + } + break; + default: + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM, + NULL, "item not supported"); + } + } + na_flower_act = mnl_attr_nest_start(nlh, TCA_FLOWER_ACT); + na_act_index_cur = 1; + for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) { + struct nlattr *na_act_index; + struct nlattr *na_act; + unsigned int vlan_act; + unsigned int i; + + switch (actions->type) { + case RTE_FLOW_ACTION_TYPE_VOID: + break; + case RTE_FLOW_ACTION_TYPE_PORT_ID: + conf.port_id = actions->conf; + if (conf.port_id->original) + i = 0; + else + for (i = 0; ptoi[i].ifindex; ++i) + if (ptoi[i].port_id == conf.port_id->id) + break; + assert(ptoi[i].ifindex); + na_act_index = + mnl_attr_nest_start(nlh, na_act_index_cur++); + assert(na_act_index); + mnl_attr_put_strz(nlh, TCA_ACT_KIND, "mirred"); + na_act = mnl_attr_nest_start(nlh, TCA_ACT_OPTIONS); + assert(na_act); + mnl_attr_put(nlh, TCA_MIRRED_PARMS, + sizeof(struct tc_mirred), + &(struct tc_mirred){ + .action = TC_ACT_STOLEN, + .eaction = TCA_EGRESS_REDIR, + .ifindex = ptoi[i].ifindex, + }); 
+	na_flower_act = mnl_attr_nest_start(nlh, TCA_FLOWER_ACT);
+	na_act_index_cur = 1;
+	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
+		struct nlattr *na_act_index;
+		struct nlattr *na_act;
+		unsigned int vlan_act;
+		unsigned int i;
+
+		switch (actions->type) {
+		case RTE_FLOW_ACTION_TYPE_VOID:
+			break;
+		case RTE_FLOW_ACTION_TYPE_PORT_ID:
+			conf.port_id = actions->conf;
+			if (conf.port_id->original)
+				i = 0;
+			else
+				for (i = 0; ptoi[i].ifindex; ++i)
+					if (ptoi[i].port_id == conf.port_id->id)
+						break;
+			assert(ptoi[i].ifindex);
+			na_act_index =
+				mnl_attr_nest_start(nlh, na_act_index_cur++);
+			assert(na_act_index);
+			mnl_attr_put_strz(nlh, TCA_ACT_KIND, "mirred");
+			na_act = mnl_attr_nest_start(nlh, TCA_ACT_OPTIONS);
+			assert(na_act);
+			mnl_attr_put(nlh, TCA_MIRRED_PARMS,
+				     sizeof(struct tc_mirred),
+				     &(struct tc_mirred){
+					.action = TC_ACT_STOLEN,
+					.eaction = TCA_EGRESS_REDIR,
+					.ifindex = ptoi[i].ifindex,
+				     });
+			mnl_attr_nest_end(nlh, na_act);
+			mnl_attr_nest_end(nlh, na_act_index);
+			break;
+		case RTE_FLOW_ACTION_TYPE_DROP:
+			na_act_index =
+				mnl_attr_nest_start(nlh, na_act_index_cur++);
+			assert(na_act_index);
+			mnl_attr_put_strz(nlh, TCA_ACT_KIND, "gact");
+			na_act = mnl_attr_nest_start(nlh, TCA_ACT_OPTIONS);
+			assert(na_act);
+			mnl_attr_put(nlh, TCA_GACT_PARMS,
+				     sizeof(struct tc_gact),
+				     &(struct tc_gact){
+					.action = TC_ACT_SHOT,
+				     });
+			mnl_attr_nest_end(nlh, na_act);
+			mnl_attr_nest_end(nlh, na_act_index);
+			break;
+		case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN:
+			conf.of_push_vlan = NULL;
+			vlan_act = TCA_VLAN_ACT_POP;
+			goto action_of_vlan;
+		case RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN:
+			conf.of_push_vlan = actions->conf;
+			vlan_act = TCA_VLAN_ACT_PUSH;
+			goto action_of_vlan;
+		case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID:
+			conf.of_set_vlan_vid = actions->conf;
+			if (na_vlan_id)
+				goto override_na_vlan_id;
+			vlan_act = TCA_VLAN_ACT_MODIFY;
+			goto action_of_vlan;
+		case RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP:
+			conf.of_set_vlan_pcp = actions->conf;
+			if (na_vlan_priority)
+				goto override_na_vlan_priority;
+			vlan_act = TCA_VLAN_ACT_MODIFY;
+			goto action_of_vlan;
+action_of_vlan:
+			na_act_index =
+				mnl_attr_nest_start(nlh, na_act_index_cur++);
+			assert(na_act_index);
+			mnl_attr_put_strz(nlh, TCA_ACT_KIND, "vlan");
+			na_act = mnl_attr_nest_start(nlh, TCA_ACT_OPTIONS);
+			assert(na_act);
+			mnl_attr_put(nlh, TCA_VLAN_PARMS,
+				     sizeof(struct tc_vlan),
+				     &(struct tc_vlan){
+					.action = TC_ACT_PIPE,
+					.v_action = vlan_act,
+				     });
+			if (vlan_act == TCA_VLAN_ACT_POP) {
+				mnl_attr_nest_end(nlh, na_act);
+				mnl_attr_nest_end(nlh, na_act_index);
+				break;
+			}
+			if (vlan_act == TCA_VLAN_ACT_PUSH)
+				mnl_attr_put_u16(nlh,
+						 TCA_VLAN_PUSH_VLAN_PROTOCOL,
+						 conf.of_push_vlan->ethertype);
+			na_vlan_id = mnl_nlmsg_get_payload_tail(nlh);
+			mnl_attr_put_u16(nlh, TCA_VLAN_PAD, 0);
+			na_vlan_priority = mnl_nlmsg_get_payload_tail(nlh);
+			mnl_attr_put_u8(nlh, TCA_VLAN_PAD, 0);
+			mnl_attr_nest_end(nlh, na_act);
+			mnl_attr_nest_end(nlh, na_act_index);
+			if (actions->type ==
+			    RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID) {
+override_na_vlan_id:
+				na_vlan_id->nla_type = TCA_VLAN_PUSH_VLAN_ID;
+				*(uint16_t *)mnl_attr_get_payload(na_vlan_id) =
+					rte_be_to_cpu_16
+					(conf.of_set_vlan_vid->vlan_vid);
+			} else if (actions->type ==
+				   RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP) {
+override_na_vlan_priority:
+				na_vlan_priority->nla_type =
+					TCA_VLAN_PUSH_VLAN_PRIORITY;
+				*(uint8_t *)mnl_attr_get_payload
+					(na_vlan_priority) =
+					conf.of_set_vlan_pcp->vlan_pcp;
+			}
+			break;
+		default:
+			return rte_flow_error_set(error, ENOTSUP,
+						  RTE_FLOW_ERROR_TYPE_ACTION,
+						  actions,
+						  "action not supported");
+		}
+	}
+	assert(na_flower);
+	assert(na_flower_act);
+	mnl_attr_nest_end(nlh, na_flower_act);
+	mnl_attr_nest_end(nlh, na_flower);
+	return 0;
+}
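
[The VLAN handling above relies on a trick worth spelling out:
OF_PUSH_VLAN/OF_SET_VLAN_* emit a single TC "vlan" act whose VLAN ID and
priority slots are first written as TCA_VLAN_PAD placeholders; a following
OF_SET_VLAN_VID or OF_SET_VLAN_PCP then retargets the placeholder's nla_type
in place (the override_na_vlan_id/override_na_vlan_priority paths) instead
of emitting a second act. A sketch of an action list that takes this path —
illustration only, values are arbitrary:

	/* Push an 802.1Q tag, then set its VID; both rte_flow actions
	 * collapse into one TC "vlan" act in the translated message. */
	struct rte_flow_action_of_push_vlan push = {
		.ethertype = RTE_BE16(0x8100),	/* 802.1Q TPID */
	};
	struct rte_flow_action_of_set_vlan_vid vid = {
		.vlan_vid = RTE_BE16(42),
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN, .conf = &push },
		{ .type = RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID, .conf = &vid },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
]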
+
+/**
+ * Send Netlink message with acknowledgment.
+ *
+ * @param nl
+ *   Libmnl socket to use.
+ * @param nlh
+ *   Message to send. This function always raises the NLM_F_ACK flag before
+ *   sending.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_tcf_nl_ack(struct mnl_socket *nl, struct nlmsghdr *nlh)
+{
+	alignas(struct nlmsghdr)
+	uint8_t ans[mnl_nlmsg_size(sizeof(struct nlmsgerr)) +
+		    nlh->nlmsg_len - sizeof(*nlh)];
+	uint32_t seq = random();
+	int ret;
+
+	nlh->nlmsg_flags |= NLM_F_ACK;
+	nlh->nlmsg_seq = seq;
+	ret = mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);
+	if (ret != -1)
+		ret = mnl_socket_recvfrom(nl, ans, sizeof(ans));
+	if (ret != -1)
+		ret = mnl_cb_run
+			(ans, ret, seq, mnl_socket_get_portid(nl), NULL, NULL);
+	if (ret > 0)
+		return 0;
+	rte_errno = errno;
+	return -rte_errno;
+}
+
+/**
+ * Apply flow to E-Switch by sending Netlink message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param[in, out] flow
+ *   Pointer to the sub flow.
+ * @param[out] error
+ *   Pointer to the error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_tcf_apply(struct rte_eth_dev *dev, struct rte_flow *flow,
+	       struct rte_flow_error *error)
+{
+	struct priv *priv = dev->data->dev_private;
+	struct mnl_socket *nl = priv->mnl_socket;
+	struct mlx5_flow *dev_flow;
+	struct nlmsghdr *nlh;
+
+	dev_flow = LIST_FIRST(&flow->dev_flows);
+	/* E-Switch flow can't be expanded. */
+	assert(!LIST_NEXT(dev_flow, next));
+	nlh = dev_flow->tcf.nlh;
+	nlh->nlmsg_type = RTM_NEWTFILTER;
+	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+	if (!flow_tcf_nl_ack(nl, nlh))
+		return 0;
+	return rte_flow_error_set(error, rte_errno,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				  "netlink: failed to create TC flow rule");
+}
+
+/**
+ * Remove flow from E-Switch by sending Netlink message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param[in, out] flow
+ *   Pointer to the sub flow.
+ */
+static void
+flow_tcf_remove(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct priv *priv = dev->data->dev_private;
+	struct mnl_socket *nl = priv->mnl_socket;
+	struct mlx5_flow *dev_flow;
+	struct nlmsghdr *nlh;
+
+	if (!flow)
+		return;
+	dev_flow = LIST_FIRST(&flow->dev_flows);
+	if (!dev_flow)
+		return;
+	/* E-Switch flow can't be expanded. */
+	assert(!LIST_NEXT(dev_flow, next));
+	nlh = dev_flow->tcf.nlh;
+	nlh->nlmsg_type = RTM_DELTFILTER;
+	nlh->nlmsg_flags = NLM_F_REQUEST;
+	flow_tcf_nl_ack(nl, nlh);
+}
+
+/**
+ * Remove flow from E-Switch and release resources of the device flow.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param[in, out] flow
+ *   Pointer to the sub flow.
+ */
+static void
+flow_tcf_destroy(struct rte_eth_dev *dev, struct rte_flow *flow)
+{
+	struct mlx5_flow *dev_flow;
+
+	if (!flow)
+		return;
+	flow_tcf_remove(dev, flow);
+	dev_flow = LIST_FIRST(&flow->dev_flows);
+	if (!dev_flow)
+		return;
+	/* E-Switch flow can't be expanded. */
+	assert(!LIST_NEXT(dev_flow, next));
+	LIST_REMOVE(dev_flow, next);
+	rte_free(dev_flow);
+}
+
+const struct mlx5_flow_driver_ops mlx5_flow_tcf_drv_ops = {
+	.validate = flow_tcf_validate,
+	.prepare = flow_tcf_prepare,
+	.translate = flow_tcf_translate,
+	.apply = flow_tcf_apply,
+	.remove = flow_tcf_remove,
+	.destroy = flow_tcf_destroy,
+};
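
[This ops table is what the driver abstraction in this series keys on: the
generic mlx5 flow layer selects a mlx5_flow_driver_ops instance per flow and
drives the whole lifecycle through it. A rough sketch of that call sequence,
with illustrative code only — the actual dispatch lives in mlx5_flow.c, not
in this file:

	/* Illustration only: flow lifecycle as seen through the ops table. */
	const struct mlx5_flow_driver_ops *fops = &mlx5_flow_tcf_drv_ops;
	int ret;

	ret = fops->apply(dev, flow, &error);
	if (ret)
		return ret;		/* rte_errno set by the driver. */
	/* ... traffic is steered until teardown: */
	fops->destroy(dev, flow);	/* Deletes the TC rule, frees flow. */
]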
+
+/**
+ * Initialize ingress qdisc of a given network interface.
+ *
+ * @param nl
+ *   Libmnl socket of the @p NETLINK_ROUTE kind.
+ * @param ifindex
+ *   Index of network interface to initialize.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_flow_tcf_init(struct mnl_socket *nl, unsigned int ifindex,
+		   struct rte_flow_error *error)
+{
+	struct nlmsghdr *nlh;
+	struct tcmsg *tcm;
+	alignas(struct nlmsghdr)
+	uint8_t buf[mnl_nlmsg_size(sizeof(*tcm) + 128)];
+
+	/* Destroy existing ingress qdisc and everything attached to it. */
+	nlh = mnl_nlmsg_put_header(buf);
+	nlh->nlmsg_type = RTM_DELQDISC;
+	nlh->nlmsg_flags = NLM_F_REQUEST;
+	tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));
+	tcm->tcm_family = AF_UNSPEC;
+	tcm->tcm_ifindex = ifindex;
+	tcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);
+	tcm->tcm_parent = TC_H_INGRESS;
+	/* Ignore errors when qdisc is already absent. */
+	if (flow_tcf_nl_ack(nl, nlh) &&
+	    rte_errno != EINVAL && rte_errno != ENOENT)
+		return rte_flow_error_set(error, rte_errno,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+					  "netlink: failed to remove ingress"
+					  " qdisc");
+	/* Create fresh ingress qdisc. */
+	nlh = mnl_nlmsg_put_header(buf);
+	nlh->nlmsg_type = RTM_NEWQDISC;
+	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+	tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));
+	tcm->tcm_family = AF_UNSPEC;
+	tcm->tcm_ifindex = ifindex;
+	tcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);
+	tcm->tcm_parent = TC_H_INGRESS;
+	mnl_attr_put_strz_check(nlh, sizeof(buf), TCA_KIND, "ingress");
+	if (flow_tcf_nl_ack(nl, nlh))
+		return rte_flow_error_set(error, rte_errno,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+					  "netlink: failed to create ingress"
+					  " qdisc");
+	return 0;
+}
+
+/**
+ * Create and configure a libmnl socket for Netlink flow rules.
+ *
+ * @return
+ *   A valid libmnl socket object pointer on success, NULL otherwise and
+ *   rte_errno is set.
+ */
+struct mnl_socket *
+mlx5_flow_tcf_socket_create(void)
+{
+	struct mnl_socket *nl = mnl_socket_open(NETLINK_ROUTE);
+
+	if (nl) {
+		mnl_socket_setsockopt(nl, NETLINK_CAP_ACK, &(int){ 1 },
+				      sizeof(int));
+		if (!mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID))
+			return nl;
+	}
+	rte_errno = errno;
+	if (nl)
+		mnl_socket_close(nl);
+	return NULL;
+}
+
+/**
+ * Destroy a libmnl socket.
+ *
+ * @param nl
+ *   Libmnl socket of the @p NETLINK_ROUTE kind.
+ */
+void
+mlx5_flow_tcf_socket_destroy(struct mnl_socket *nl)
+{
+	mnl_socket_close(nl);
+}
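
[For completeness, the expected bring-up order of the helpers above, as a
sketch only — error handling abbreviated; "ifindex" and "priv" stand for the
device's E-Switch interface index and private data, whose wiring is outside
this hunk:

	struct rte_flow_error error;
	struct mnl_socket *nl = mlx5_flow_tcf_socket_create();

	if (!nl)
		return -rte_errno;
	if (mlx5_flow_tcf_init(nl, ifindex, &error)) {
		mlx5_flow_tcf_socket_destroy(nl);
		return -rte_errno;	/* error.message has the details. */
	}
	priv->mnl_socket = nl;	/* Used later by flow_tcf_apply/remove. */

Creating the ingress qdisc up front lets every subsequent RTM_NEWTFILTER
request attach flower rules to it without further per-flow setup.]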