Message ID | 20210923112012.14595-1-ivan.malov@oktetlabs.ru (mailing list archive) |
---|---|
Headers |
From: Ivan Malov <ivan.malov@oktetlabs.ru>
To: dev@dpdk.org
Cc: Andy Moreton <amoreton@xilinx.com>
Date: Thu, 23 Sep 2021 14:20:07 +0300
Message-Id: <20210923112012.14595-1-ivan.malov@oktetlabs.ru>
In-Reply-To: <20210902142359.28138-1-ivan.malov@oktetlabs.ru>
References: <20210902142359.28138-1-ivan.malov@oktetlabs.ru>
Subject: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data |
Series |
A means to negotiate delivery of Rx meta data
Message
Ivan Malov
Sept. 23, 2021, 11:20 a.m. UTC
In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace intending to add new flags, RSS_HASH and FLOW_MARK. Since then, only the former has been added. The problem hasn't been solved. Applications still assume that no efforts are needed to enable flow mark and similar meta data delivery.

The team behind net/sfc driver has to take over the efforts since the problem has started impacting us. Riverhead, a cutting edge Xilinx smart NIC family, has two Rx prefix types. Rx meta data is available only from long Rx prefix. Switching between the prefix formats can't happen in started state. Hence, we run into the same problem which [1] was aiming to solve.

Rx meta data (mark, flag, tunnel ID) delivery is not an offload on its own since the corresponding flows must be active to set the data in the first place. Hence, adding offload flags similar to RSS_HASH is not a good idea.

Patch [1/5] of this series adds a generic API to let applications negotiate delivery of Rx meta data during initialisation period. This way, an application knows right from the start which parts of Rx meta data won't be delivered. Hence, no necessity to try inserting flows requesting such data and handle the failures. Major clients of the flow library like Open vSwitch will have to be patched separately to benefit from the new approach.
[1] c5b2e78d1172 ("doc: announce ethdev API changes in offload flags")

Changes in v2:
* [1/5] has review notes from Jerin Jacob applied and the ack from Ray Kinsella added
* [2/5] has minor adjustments incorporated to follow changes in [1/5]

Changes in v3:
* [1/5] through [5/5] have review notes from Andy Moreton applied (mostly rewording)
* [1/5] has the ack from Jerin Jacob added

Ivan Malov (5):
  ethdev: add API to negotiate delivery of Rx meta data
  net/sfc: support API to negotiate delivery of Rx meta data
  net/sfc: support flow mark delivery on EF100 native datapath
  common/sfc_efx/base: add RxQ flag to use Rx prefix user flag
  net/sfc: report user flag on EF100 native datapath

 app/test-flow-perf/main.c              | 21 ++++++++++
 app/test-pmd/testpmd.c                 | 26 +++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  9 +++++
 drivers/common/sfc_efx/base/ef10_rx.c  | 54 ++++++++++++++++----------
 drivers/common/sfc_efx/base/efx.h      |  4 ++
 drivers/common/sfc_efx/base/rhead_rx.c |  3 ++
 drivers/net/sfc/sfc.h                  |  2 +
 drivers/net/sfc/sfc_ef100_rx.c         | 19 +++++++++
 drivers/net/sfc/sfc_ethdev.c           | 29 ++++++++++++++
 drivers/net/sfc/sfc_flow.c             | 11 ++++++
 drivers/net/sfc/sfc_mae.c              | 22 ++++++++++-
 drivers/net/sfc/sfc_rx.c               |  6 +++
 lib/ethdev/ethdev_driver.h             | 19 +++++++++
 lib/ethdev/rte_ethdev.c                | 25 ++++++++++++
 lib/ethdev/rte_ethdev.h                | 45 +++++++++++++++++++++
 lib/ethdev/rte_flow.h                  | 12 ++++++
 lib/ethdev/version.map                 |  3 ++
 17 files changed, 287 insertions(+), 23 deletions(-)
Comments
23/09/2021 13:20, Ivan Malov:
> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
> only the former has been added. The problem hasn't been solved.
> Applications still assume that no efforts are needed to enable
> flow mark and similar meta data delivery.
>
> The team behind net/sfc driver has to take over the efforts since
> the problem has started impacting us. Riverhead, a cutting edge
> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
> is available only from long Rx prefix. Switching between the
> prefix formats can't happen in started state. Hence, we run
> into the same problem which [1] was aiming to solve.

Sorry I don't understand what is Rx prefix?

> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
> on its own since the corresponding flows must be active to set
> the data in the first place. Hence, adding offload flags
> similar to RSS_HASH is not a good idea.

What means "active" here?

> Patch [1/5] of this series adds a generic API to let applications
> negotiate delivery of Rx meta data during initialisation period.
> This way, an application knows right from the start which parts
> of Rx meta data won't be delivered. Hence, no necessity to try
> inserting flows requesting such data and handle the failures.

Sorry I don't understand the problem you want to solve.
And sorry for not noticing earlier.
Hi Thomas,

On 30/09/2021 19:18, Thomas Monjalon wrote:
> 23/09/2021 13:20, Ivan Malov:
>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>> only the former has been added. The problem hasn't been solved.
>> Applications still assume that no efforts are needed to enable
>> flow mark and similar meta data delivery.
>>
>> The team behind net/sfc driver has to take over the efforts since
>> the problem has started impacting us. Riverhead, a cutting edge
>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>> is available only from long Rx prefix. Switching between the
>> prefix formats can't happen in started state. Hence, we run
>> into the same problem which [1] was aiming to solve.
>
> Sorry I don't understand what is Rx prefix?

A small chunk of per-packet metadata in Rx packet buffer preceding the
actual packet data. In terms of mbuf, this could be something lying
before m->data_off.

>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>> on its own since the corresponding flows must be active to set
>> the data in the first place. Hence, adding offload flags
>> similar to RSS_HASH is not a good idea.
>
> What means "active" here?

Active = inserted and functional. What this paragraph is trying to say
is that when you enable, say, RSS_HASH, that implies both computation of
the hash and the driver's ability to extract it from packets
("delivery"). But when it comes to MARK, it's just "delivery". No
"offload" here: the NIC won't set any mark in packets unless you create
a flow rule to make it do so. That's the gist of it.

>> Patch [1/5] of this series adds a generic API to let applications
>> negotiate delivery of Rx meta data during initialisation period.
>> This way, an application knows right from the start which parts
>> of Rx meta data won't be delivered. Hence, no necessity to try
>> inserting flows requesting such data and handle the failures.
>
> Sorry I don't understand the problem you want to solve.
> And sorry for not noticing earlier.

No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
packets by default (for performance reasons). If the application tries
to insert a flow with action MARK, the PMD may not be able to enable
delivery of Rx mark without the need to re-start Rx sub-system. And
that's fraught with traffic disruption and similar bad consequences. In
order to address it, we need to let the application express its interest
in receiving mark with packets as early as possible. This way, the PMD
can enable Rx mark delivery in advance. And, as an additional benefit,
the application can learn *from the very beginning* whether it will be
possible to use the feature or not. If this API tells the application
that no mark delivery will be enabled, then the application can just
skip many unnecessary attempts to insert wittingly unsupported flows
during runtime.
On 9/30/21 10:30 PM, Ivan Malov wrote:
> Hi Thomas,
>
> On 30/09/2021 19:18, Thomas Monjalon wrote:
>> 23/09/2021 13:20, Ivan Malov:
>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>> only the former has been added. The problem hasn't been solved.
>>> Applications still assume that no efforts are needed to enable
>>> flow mark and similar meta data delivery.
>>>
>>> The team behind net/sfc driver has to take over the efforts since
>>> the problem has started impacting us. Riverhead, a cutting edge
>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>> is available only from long Rx prefix. Switching between the
>>> prefix formats can't happen in started state. Hence, we run
>>> into the same problem which [1] was aiming to solve.
>>
>> Sorry I don't understand what is Rx prefix?
>
> A small chunk of per-packet metadata in Rx packet buffer preceding the
> actual packet data. In terms of mbuf, this could be something lying
> before m->data_off.
>
>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>> on its own since the corresponding flows must be active to set
>>> the data in the first place. Hence, adding offload flags
>>> similar to RSS_HASH is not a good idea.
>>
>> What means "active" here?
>
> Active = inserted and functional. What this paragraph is trying to say
> is that when you enable, say, RSS_HASH, that implies both computation of
> the hash and the driver's ability to extract it from packets
> ("delivery"). But when it comes to MARK, it's just "delivery". No
> "offload" here: the NIC won't set any mark in packets unless you create
> a flow rule to make it do so. That's the gist of it.
>
>>> Patch [1/5] of this series adds a generic API to let applications
>>> negotiate delivery of Rx meta data during initialisation period.
>>> This way, an application knows right from the start which parts
>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>> inserting flows requesting such data and handle the failures.
>>
>> Sorry I don't understand the problem you want to solve.
>> And sorry for not noticing earlier.
>
> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> packets by default (for performance reasons). If the application tries
> to insert a flow with action MARK, the PMD may not be able to enable
> delivery of Rx mark without the need to re-start Rx sub-system. And
> that's fraught with traffic disruption and similar bad consequences. In
> order to address it, we need to let the application express its interest
> in receiving mark with packets as early as possible. This way, the PMD
> can enable Rx mark delivery in advance. And, as an additional benefit,
> the application can learn *from the very beginning* whether it will be
> possible to use the feature or not. If this API tells the application
> that no mark delivery will be enabled, then the application can just
> skip many unnecessary attempts to insert wittingly unsupported flows
> during runtime.

Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
is vendor-specific way to address the same problem.
01/10/2021 08:47, Andrew Rybchenko:
> On 9/30/21 10:30 PM, Ivan Malov wrote:
>> Hi Thomas,
>>
>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>> 23/09/2021 13:20, Ivan Malov:
>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>>> only the former has been added. The problem hasn't been solved.
>>>> Applications still assume that no efforts are needed to enable
>>>> flow mark and similar meta data delivery.
>>>>
>>>> The team behind net/sfc driver has to take over the efforts since
>>>> the problem has started impacting us. Riverhead, a cutting edge
>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>>> is available only from long Rx prefix. Switching between the
>>>> prefix formats can't happen in started state. Hence, we run
>>>> into the same problem which [1] was aiming to solve.
>>>
>>> Sorry I don't understand what is Rx prefix?
>>
>> A small chunk of per-packet metadata in Rx packet buffer preceding the
>> actual packet data. In terms of mbuf, this could be something lying
>> before m->data_off.

I've never seen the word "Rx prefix".
In general we talk about mbuf headroom and mbuf metadata,
the rest being the mbuf payload and mbuf tailroom.
I guess you mean mbuf metadata in the space of the struct rte_mbuf?

>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>>> on its own since the corresponding flows must be active to set
>>>> the data in the first place. Hence, adding offload flags
>>>> similar to RSS_HASH is not a good idea.
>>>
>>> What means "active" here?
>>
>> Active = inserted and functional. What this paragraph is trying to say
>> is that when you enable, say, RSS_HASH, that implies both computation of
>> the hash and the driver's ability to extract it from packets
>> ("delivery"). But when it comes to MARK, it's just "delivery". No
>> "offload" here: the NIC won't set any mark in packets unless you create
>> a flow rule to make it do so. That's the gist of it.

OK
Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
Same for RTE_FLOW_ACTION_TYPE_SET_META.

>>>> Patch [1/5] of this series adds a generic API to let applications
>>>> negotiate delivery of Rx meta data during initialisation period.

What is a metadata?
Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
Metadata word could cover any field in the mbuf struct so it is vague.

>>>> This way, an application knows right from the start which parts
>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>> inserting flows requesting such data and handle the failures.
>>>
>>> Sorry I don't understand the problem you want to solve.
>>> And sorry for not noticing earlier.
>>
>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>> packets by default (for performance reasons). If the application tries
>> to insert a flow with action MARK, the PMD may not be able to enable
>> delivery of Rx mark without the need to re-start Rx sub-system. And
>> that's fraught with traffic disruption and similar bad consequences. In
>> order to address it, we need to let the application express its interest
>> in receiving mark with packets as early as possible. This way, the PMD
>> can enable Rx mark delivery in advance. And, as an additional benefit,
>> the application can learn *from the very beginning* whether it will be
>> possible to use the feature or not. If this API tells the application
>> that no mark delivery will be enabled, then the application can just
>> skip many unnecessary attempts to insert wittingly unsupported flows
>> during runtime.

I'm puzzled, because we could have the same reasoning for any offload.
I don't understand why we are focusing on mark only.
I would prefer we find a generic solution using the rte_flow API.
Can we make rte_flow_validate() working before port start?
If validating a fake rule doesn't make sense,
why not having a new function accepting a single action as parameter?

> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
> is vendor-specific way to address the same problem.

Not exactly, it is configuring the capabilities:
+------+-----------+-----------+-------------+-------------+
| Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
+======+===========+===========+=============+=============+
| 0    | 24 bits   | 32 bits   | 32 bits     | no          |
+------+-----------+-----------+-------------+-------------+
| 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
+------+-----------+-----------+-------------+-------------+
| 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
+------+-----------+-----------+-------------+-------------+
On 10/1/21 11:11 AM, Thomas Monjalon wrote:
> 01/10/2021 08:47, Andrew Rybchenko:
>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>> Hi Thomas,
>>>
>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>> 23/09/2021 13:20, Ivan Malov:
>>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>>>> only the former has been added. The problem hasn't been solved.
>>>>> Applications still assume that no efforts are needed to enable
>>>>> flow mark and similar meta data delivery.
>>>>>
>>>>> The team behind net/sfc driver has to take over the efforts since
>>>>> the problem has started impacting us. Riverhead, a cutting edge
>>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>>>> is available only from long Rx prefix. Switching between the
>>>>> prefix formats can't happen in started state. Hence, we run
>>>>> into the same problem which [1] was aiming to solve.
>>>>
>>>> Sorry I don't understand what is Rx prefix?
>>>
>>> A small chunk of per-packet metadata in Rx packet buffer preceding the
>>> actual packet data. In terms of mbuf, this could be something lying
>>> before m->data_off.
>
> I've never seen the word "Rx prefix".

Yes, I agree. The term is vendor-specific.

> In general we talk about mbuf headroom and mbuf metadata,
> the rest being the mbuf payload and mbuf tailroom.
> I guess you mean mbuf metadata in the space of the struct rte_mbuf?

Not exactly. It is rather lower level, but finally yes, it goes to
extra data represented by one or another field in mbuf structure.

Broadly Rx metadata is all per-packet extra information available in HW
and could be delivered to SW:
 - Rx checksum offloads information
 - Rx packet classification
 - RSS hash
 - flow mark/flag
 - flow meta
 - tunnel offload information
 - source e-Switch port

Delivering everything is expensive. That's why we have offload flags,
possibility to reduce required Rx packet classification etc.
Some metadata are not covered yet and the series suggests an approach
how to cover it.

>>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>>>> on its own since the corresponding flows must be active to set
>>>>> the data in the first place. Hence, adding offload flags
>>>>> similar to RSS_HASH is not a good idea.
>>>>
>>>> What means "active" here?
>>>
>>> Active = inserted and functional. What this paragraph is trying to say
>>> is that when you enable, say, RSS_HASH, that implies both computation of
>>> the hash and the driver's ability to extract it from packets
>>> ("delivery"). But when it comes to MARK, it's just "delivery". No
>>> "offload" here: the NIC won't set any mark in packets unless you create
>>> a flow rule to make it do so. That's the gist of it.
>
> OK
> Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
> Same for RTE_FLOW_ACTION_TYPE_SET_META.
>
>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>> negotiate delivery of Rx meta data during initialisation period.
>
> What is a metadata?

See above.

> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> Metadata word could cover any field in the mbuf struct so it is vague.

We failed to find a better term. Yes, it overlaps with other Rx
features. We can document exceptions and add references to existing
ways to control these exceptions. If you have an idea how to name it,
you're welcome.

>>>>> This way, an application knows right from the start which parts
>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>> inserting flows requesting such data and handle the failures.
>>>>
>>>> Sorry I don't understand the problem you want to solve.
>>>> And sorry for not noticing earlier.
>>>
>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>> packets by default (for performance reasons). If the application tries
>>> to insert a flow with action MARK, the PMD may not be able to enable
>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>> that's fraught with traffic disruption and similar bad consequences. In
>>> order to address it, we need to let the application express its interest
>>> in receiving mark with packets as early as possible. This way, the PMD
>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>> the application can learn *from the very beginning* whether it will be
>>> possible to use the feature or not. If this API tells the application
>>> that no mark delivery will be enabled, then the application can just
>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>> during runtime.
>
> I'm puzzled, because we could have the same reasoning for any offload.
> I don't understand why we are focusing on mark only.
> I would prefer we find a generic solution using the rte_flow API.
> Can we make rte_flow_validate() working before port start?
> If validating a fake rule doesn't make sense,
> why not having a new function accepting a single action as parameter?

IMHO, it will be misuse of the rte_flow_validate(). It will be complex
from application point of view and driver implementation point of view
since most likely implemented in an absolutely different code branch.
Also what should be checked for tunnel offload?

>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>> is vendor-specific way to address the same problem.
>
> Not exactly, it is configuring the capabilities:
> +------+-----------+-----------+-------------+-------------+
> | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
> +======+===========+===========+=============+=============+
> | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
> +------+-----------+-----------+-------------+-------------+
> | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
> +------+-----------+-----------+-------------+-------------+
> | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
> +------+-----------+-----------+-------------+-------------+

Sorry, but I don't understand the difference. Negotiate is
exactly about capabilities which we want to use.
Hi Thomas,

On 01/10/2021 11:11, Thomas Monjalon wrote:
> 01/10/2021 08:47, Andrew Rybchenko:
>> On 9/30/21 10:30 PM, Ivan Malov wrote:
>>> Hi Thomas,
>>>
>>> On 30/09/2021 19:18, Thomas Monjalon wrote:
>>>> 23/09/2021 13:20, Ivan Malov:
>>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
>>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then,
>>>>> only the former has been added. The problem hasn't been solved.
>>>>> Applications still assume that no efforts are needed to enable
>>>>> flow mark and similar meta data delivery.
>>>>>
>>>>> The team behind net/sfc driver has to take over the efforts since
>>>>> the problem has started impacting us. Riverhead, a cutting edge
>>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
>>>>> is available only from long Rx prefix. Switching between the
>>>>> prefix formats can't happen in started state. Hence, we run
>>>>> into the same problem which [1] was aiming to solve.
>>>>
>>>> Sorry I don't understand what is Rx prefix?
>>>
>>> A small chunk of per-packet metadata in Rx packet buffer preceding the
>>> actual packet data. In terms of mbuf, this could be something lying
>>> before m->data_off.
>
> I've never seen the word "Rx prefix".
> In general we talk about mbuf headroom and mbuf metadata,
> the rest being the mbuf payload and mbuf tailroom.
> I guess you mean mbuf metadata in the space of the struct rte_mbuf?

In this paragraph I describe the two ways how the NIC itself can provide
metadata buffers of different sizes. Hence the term "Rx prefix". As you
understand, the NIC HW is unaware of DPDK, mbufs and whatever else SW
concepts. To NIC, this is "Rx prefix", that is, a chunk of per-packet
metadata *preceding* the actual packet data. It's responsibility of the
PMD to treat this the right way, care about headroom, payload and
tailroom. I describe the two Rx prefix formats in NIC terminology just
to provide the gist of the problem.

>>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
>>>>> on its own since the corresponding flows must be active to set
>>>>> the data in the first place. Hence, adding offload flags
>>>>> similar to RSS_HASH is not a good idea.
>>>>
>>>> What means "active" here?
>>>
>>> Active = inserted and functional. What this paragraph is trying to say
>>> is that when you enable, say, RSS_HASH, that implies both computation of
>>> the hash and the driver's ability to extract it from packets
>>> ("delivery"). But when it comes to MARK, it's just "delivery". No
>>> "offload" here: the NIC won't set any mark in packets unless you create
>>> a flow rule to make it do so. That's the gist of it.
>
> OK
> Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
> Same for RTE_FLOW_ACTION_TYPE_SET_META.
>
>>>>> Patch [1/5] of this series adds a generic API to let applications
>>>>> negotiate delivery of Rx meta data during initialisation period.
>
> What is a metadata?
> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
> Metadata word could cover any field in the mbuf struct so it is vague.

Metadata here is *any* additional information provided by the NIC for
each received packet. For example, Rx flag, Rx mark, RSS hash, packet
classification info, you name it. I'd like to stress out that the
suggested API comes with flags each of which is crystal clear on what
concrete kind of metadata it covers, eg. Rx mark.

>>>>> This way, an application knows right from the start which parts
>>>>> of Rx meta data won't be delivered. Hence, no necessity to try
>>>>> inserting flows requesting such data and handle the failures.
>>>>
>>>> Sorry I don't understand the problem you want to solve.
>>>> And sorry for not noticing earlier.
>>>
>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
>>> packets by default (for performance reasons). If the application tries
>>> to insert a flow with action MARK, the PMD may not be able to enable
>>> delivery of Rx mark without the need to re-start Rx sub-system. And
>>> that's fraught with traffic disruption and similar bad consequences. In
>>> order to address it, we need to let the application express its interest
>>> in receiving mark with packets as early as possible. This way, the PMD
>>> can enable Rx mark delivery in advance. And, as an additional benefit,
>>> the application can learn *from the very beginning* whether it will be
>>> possible to use the feature or not. If this API tells the application
>>> that no mark delivery will be enabled, then the application can just
>>> skip many unnecessary attempts to insert wittingly unsupported flows
>>> during runtime.
>
> I'm puzzled, because we could have the same reasoning for any offload.

We're not discussing *offloads*. An offload is when NIC *computes
something* and *delivers* it. We are discussing precisely *delivery*.

> I don't understand why we are focusing on mark only.

We are not focusing on mark on purpose. It's just how our discussion
goes. I chose mark (could've chosen flag or anything else) just to show
you an example.

> I would prefer we find a generic solution using the rte_flow API.
> Can we make rte_flow_validate() working before port start?
> If validating a fake rule doesn't make sense,
> why not having a new function accepting a single action as parameter?

A noble idea, but if we feed the entire flow rule to the driver for
validation, then the driver must not look specifically for actions FLAG
or MARK in it (to enable or disable metadata delivery). This way, the
driver is obliged to also validate match criteria, attributes, etc. And,
if something is unsupported (say, some specific item), the driver will
have to reject the rule as a whole thus leaving the application to join
the dots itself.

Say, you ask the driver to validate the following rule:
pattern blah-blah-1 / blah-blah-2 / end action flag / end
intending to check support for FLAG delivery. Suppose, the driver
doesn't support pattern item "blah-blah-1". It will throw an error right
after seeing this unsupported item and won't even go further to see the
action FLAG. How can application know whether its request for FLAG was
heard or not?

And I'd not bind delivery of metadata to flow API. Consider the
following example. We have a DPDK application sitting at the *host* and
we have a *guest* with its *own* DPDK instance. The guest DPDK has asked
the NIC (by virtue of flow API) to mark all outgoing packets. These
packets reach the *host* DPDK. Say, the host application just wants to
see the marked packets from the guest. Its own, (the host's) use of flow
API is a don't care here. The host doesn't want to mark packets itself,
it wants to see packets marked by the guest.

>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>> is vendor-specific way to address the same problem.
>
> Not exactly, it is configuring the capabilities:
> +------+-----------+-----------+-------------+-------------+
> | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
> +======+===========+===========+=============+=============+
> | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
> +------+-----------+-----------+-------------+-------------+
> | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
> +------+-----------+-----------+-------------+-------------+
> | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
> +------+-----------+-----------+-------------+-------------+
01/10/2021 10:54, Andrew Rybchenko:
>>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
>>> is vendor-specific way to address the same problem.
>>
>> Not exactly, it is configuring the capabilities:
>> +------+-----------+-----------+-------------+-------------+
>> | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
>> +======+===========+===========+=============+=============+
>> | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
>> +------+-----------+-----------+-------------+-------------+
>> | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
>> +------+-----------+-----------+-------------+-------------+
>> | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
>> +------+-----------+-----------+-------------+-------------+
>
> Sorry, but I don't understand the difference. Negotiate is
> exactly about capabilities which we want to use.

The difference is that dv_xmeta_en is not enabling/disabling
the delivery of mark/meta.
It is just extending the scope (cross-domain) and the size.
I think this is not covered in your proposal.
On 10/1/21 12:32 PM, Thomas Monjalon wrote: > 01/10/2021 10:54, Andrew Rybchenko: >>>> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option >>>> is vendor-specific way to address the same problem. >>> >>> Not exactly, it is configuring the capabilities: >>> +------+-----------+-----------+-------------+-------------+ >>> | Mode | ``MARK`` | ``META`` | ``META`` Tx | FDB/Through | >>> +======+===========+===========+=============+=============+ >>> | 0 | 24 bits | 32 bits | 32 bits | no | >>> +------+-----------+-----------+-------------+-------------+ >>> | 1 | 24 bits | vary 0-32 | 32 bits | yes | >>> +------+-----------+-----------+-------------+-------------+ >>> | 2 | vary 0-24 | 32 bits | 32 bits | yes | >>> +------+-----------+-----------+-------------+-------------+ >> >> Sorry, but I don't understand the difference. Negotiate is >> exactly about capabilities which we want to use. > > The difference is that dv_xmeta_en is not enabling/disabling > the delivery of mark/meta. > It is just extending the scope (cross-domain) and the size. Enabling/disabling delivery of some bits... > I think this is not covered in your proposal. Yes, that's true, since it is too specific this way, but our proposal can help to make the best automatic choice from the above options depending on the negotiation request. Of course, you can always enforce a mode via the driver option and reply to negotiation requests in accordance with the enforced configuration.
01/10/2021 10:55, Ivan Malov: > On 01/10/2021 11:11, Thomas Monjalon wrote: > > 01/10/2021 08:47, Andrew Rybchenko: > >> On 9/30/21 10:30 PM, Ivan Malov wrote: > >>> On 30/09/2021 19:18, Thomas Monjalon wrote: > >>>> 23/09/2021 13:20, Ivan Malov: > >>>>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace > >>>>> intending to add new flags, RSS_HASH and FLOW_MARK. Since then, > >>>>> only the former has been added. The problem hasn't been solved. > >>>>> Applications still assume that no efforts are needed to enable > >>>>> flow mark and similar meta data delivery. > >>>>> > >>>>> The team behind net/sfc driver has to take over the efforts since > >>>>> the problem has started impacting us. Riverhead, a cutting edge > >>>>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data > >>>>> is available only from long Rx prefix. Switching between the > >>>>> prefix formats can't happen in started state. Hence, we run > >>>>> into the same problem which [1] was aiming to solve. > >>>> > >>>> Sorry I don't understand what is Rx prefix? > >>> > >>> A small chunk of per-packet metadata in Rx packet buffer preceding the > >>> actual packet data. In terms of mbuf, this could be something lying > >>> before m->data_off. > > > > I've never seen the word "Rx prefix". > > In general we talk about mbuf headroom and mbuf metadata, > > the rest being the mbuf payload and mbuf tailroom. > > I guess you mean mbuf metadata in the space of the struct rte_mbuf? > > In this paragraph I describe the two ways in which the NIC itself can provide > metadata buffers of different sizes. Hence the term "Rx prefix". As you > understand, the NIC HW is unaware of DPDK, mbufs and other SW > concepts. To the NIC, this is the "Rx prefix", that is, a chunk of per-packet > metadata *preceding* the actual packet data. It's the responsibility of the > PMD to treat this the right way and take care of the headroom, payload and > tailroom. I describe the two Rx prefix formats in NIC terminology just > to provide the gist of the problem. OK but it is confusing as it is vendor-specific. Please stick with DPDK terms if possible. > >>>>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload > >>>>> on its own since the corresponding flows must be active to set > >>>>> the data in the first place. Hence, adding offload flags > >>>>> similar to RSS_HASH is not a good idea. > >>>> > >>>> What does "active" mean here? > >>> > >>> Active = inserted and functional. What this paragraph is trying to say > >>> is that when you enable, say, RSS_HASH, that implies both computation of > >>> the hash and the driver's ability to extract it from packets > >>> ("delivery"). But when it comes to MARK, it's just "delivery". No > >>> "offload" here: the NIC won't set any mark in packets unless you create > >>> a flow rule to make it do so. That's the gist of it. > > > > OK > > Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag. > > Same for RTE_FLOW_ACTION_TYPE_SET_META. > > > >>>>> Patch [1/5] of this series adds a generic API to let applications > >>>>> negotiate delivery of Rx meta data during initialisation period. > > > > What is a metadata? > > Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK? > > Metadata word could cover any field in the mbuf struct so it is vague. > > Metadata here is *any* additional information provided by the NIC for > each received packet. For example, Rx flag, Rx mark, RSS hash, packet > classification info, you name it. I'd like to stress that the > suggested API comes with flags each of which is crystal clear on what > concrete kind of metadata it covers, eg. Rx mark. I missed the flags. You mean these 3 flags? +/** The ethdev sees flagged packets if there are flows with action FLAG. */ +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0) + +/** The ethdev sees mark IDs in packets if there are flows with action MARK.
*/ +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1) + +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */ +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2) It is not crystal clear because it does not reference the API, like RTE_FLOW_ACTION_TYPE_MARK. And it covers a limited set of metadata. Do you intend to extend to all mbuf metadata? > >>>>> This way, an application knows right from the start which parts > >>>>> of Rx meta data won't be delivered. Hence, no necessity to try > >>>>> inserting flows requesting such data and handle the failures. > >>>> > >>>> Sorry I don't understand the problem you want to solve. > >>>> And sorry for not noticing earlier. > >>> > >>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the > >>> packets by default (for performance reasons). If the application tries > >>> to insert a flow with action MARK, the PMD may not be able to enable > >>> delivery of Rx mark without the need to re-start Rx sub-system. And > >>> that's fraught with traffic disruption and similar bad consequences. In > >>> order to address it, we need to let the application express its interest > >>> in receiving mark with packets as early as possible. This way, the PMD > >>> can enable Rx mark delivery in advance. And, as an additional benefit, > >>> the application can learn *from the very beginning* whether it will be > >>> possible to use the feature or not. If this API tells the application > >>> that no mark delivery will be enabled, then the application can just > >>> skip many unnecessary attempts to insert wittingly unsupported flows > >>> during runtime. > > > > I'm puzzled, because we could have the same reasoning for any offload. > > We're not discussing *offloads*. An offload is when NIC *computes > something* and *delivers* it. We are discussing precisely *delivery*. OK but still, there are a lot more mbuf metadata delivered. 
> > I don't understand why we are focusing on mark only > > We are not focusing on mark on purpose. It's just how our discussion > goes. I chose mark (could've chosen flag or anything else) just to show > you an example. > > > I would prefer we find a generic solution using the rte_flow API. > Can we make rte_flow_validate() working before port start? > > If validating a fake rule doesn't make sense, > > why not having a new function accepting a single action as parameter? > > A noble idea, but if we feed the entire flow rule to the driver for > validation, then the driver must not look specifically for actions FLAG > or MARK in it (to enable or disable metadata delivery). This way, the > driver is obliged to also validate match criteria, attributes, etc. And, > if something is unsupported (say, some specific item), the driver will > have to reject the rule as a whole thus leaving the application to join > the dots itself. > > Say, you ask the driver to validate the following rule: > pattern blah-blah-1 / blah-blah-2 / end action flag / end > intending to check support for FLAG delivery. Suppose, the driver > doesn't support pattern item "blah-blah-1". It will throw an error right > after seeing this unsupported item and won't even go further to see the > action FLAG. How can application know whether its request for FLAG was > heard or not? No, I'm proposing a new function to validate the action alone, without any match etc. Example: rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK) > And I'd not bind delivery of metadata to flow API. Consider the > following example. We have a DPDK application sitting at the *host* and > we have a *guest* with its *own* DPDK instance. The guest DPDK has asked > the NIC (by virtue of flow API) to mark all outgoing packets. This > packets reach the *host* DPDK. Say, the host application just wants to > see the marked packets from the guest. Its own, (the host's) use of flow > API is a don't care here. 
The host doesn't want to mark packets itself, > it wants to see packets marked by the guest. It does not make sense to me. We are talking about a DPDK API. My concern is to avoid redefining new flags while we already have rte_flow actions.
On 10/1/21 12:48 PM, Thomas Monjalon wrote: > 01/10/2021 10:55, Ivan Malov: >> On 01/10/2021 11:11, Thomas Monjalon wrote: >>> 01/10/2021 08:47, Andrew Rybchenko: >>>> On 9/30/21 10:30 PM, Ivan Malov wrote: >>>>> On 30/09/2021 19:18, Thomas Monjalon wrote: >>>>>> 23/09/2021 13:20, Ivan Malov: >>>>>>> Patch [1/5] of this series adds a generic API to let applications >>>>>>> negotiate delivery of Rx meta data during initialisation period. >>> >>> What is a metadata? >>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK? >>> Metadata word could cover any field in the mbuf struct so it is vague. >> >> Metadata here is *any* additional information provided by the NIC for >> each received packet. For example, Rx flag, Rx mark, RSS hash, packet >> classification info, you name it. I'd like to stress out that the >> suggested API comes with flags each of which is crystal clear on what >> concrete kind of metadata it covers, eg. Rx mark. > > I missed the flags. > You mean these 3 flags? Yes > +/** The ethdev sees flagged packets if there are flows with action FLAG. */ > +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0) > + > +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */ > +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1) > + > +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */ > +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2) > > It is not crystal clear because it does not reference the API, > like RTE_FLOW_ACTION_TYPE_MARK. Thanks, it is easy to fix. Please, note that there is no action for tunnel ID case. > And it covers a limited set of metadata. Yes which are not covered by offloads, packet classification etc. Anything else? > Do you intend to extend to all mbuf metadata? No. It should be discussed case-by-case separately. > >>>>>>> This way, an application knows right from the start which parts >>>>>>> of Rx meta data won't be delivered. 
Hence, no necessity to try >>>>>>> inserting flows requesting such data and handle the failures. >>>>>> >>>>>> Sorry I don't understand the problem you want to solve. >>>>>> And sorry for not noticing earlier. >>>>> >>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the >>>>> packets by default (for performance reasons). If the application tries >>>>> to insert a flow with action MARK, the PMD may not be able to enable >>>>> delivery of Rx mark without the need to re-start Rx sub-system. And >>>>> that's fraught with traffic disruption and similar bad consequences. In >>>>> order to address it, we need to let the application express its interest >>>>> in receiving mark with packets as early as possible. This way, the PMD >>>>> can enable Rx mark delivery in advance. And, as an additional benefit, >>>>> the application can learn *from the very beginning* whether it will be >>>>> possible to use the feature or not. If this API tells the application >>>>> that no mark delivery will be enabled, then the application can just >>>>> skip many unnecessary attempts to insert wittingly unsupported flows >>>>> during runtime. >>> >>> I'm puzzled, because we could have the same reasoning for any offload. >> >> We're not discussing *offloads*. An offload is when NIC *computes >> something* and *delivers* it. We are discussing precisely *delivery*. > > OK but still, there are a lot more mbuf metadata delivered. Yes, and some are not controlled yet early enough, and we do here. > >>> I don't understand why we are focusing on mark only >> >> We are not focusing on mark on purpose. It's just how our discussion >> goes. I chose mark (could've chosen flag or anything else) just to show >> you an example. >> >>> I would prefer we find a generic solution using the rte_flow API. > Can we make rte_flow_validate() working before port start? >>> If validating a fake rule doesn't make sense, >>> why not having a new function accepting a single action as parameter? 
>> >> A noble idea, but if we feed the entire flow rule to the driver for >> validation, then the driver must not look specifically for actions FLAG >> or MARK in it (to enable or disable metadata delivery). This way, the >> driver is obliged to also validate match criteria, attributes, etc. And, >> if something is unsupported (say, some specific item), the driver will >> have to reject the rule as a whole thus leaving the application to join >> the dots itself. >> >> Say, you ask the driver to validate the following rule: >> pattern blah-blah-1 / blah-blah-2 / end action flag / end >> intending to check support for FLAG delivery. Suppose, the driver >> doesn't support pattern item "blah-blah-1". It will throw an error right >> after seeing this unsupported item and won't even go further to see the >> action FLAG. How can application know whether its request for FLAG was >> heard or not? > > No, I'm proposing a new function to validate the action alone, > without any match etc. > Example: > rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK) What about tunnel ID? Also, negotiation in terms of a bitmask naturally allows the application to request everything it needs at once, and it simplifies the implementation in the driver. No dependency on the order of checks, etc. It also allows renegotiation without any extra API functions. > > >> And I'd not bind delivery of metadata to flow API. Consider the >> following example. We have a DPDK application sitting at the *host* and >> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked >> the NIC (by virtue of flow API) to mark all outgoing packets. This >> packets reach the *host* DPDK. Say, the host application just wants to >> see the marked packets from the guest. Its own, (the host's) use of flow >> API is a don't care here. The host doesn't want to mark packets itself, >> it wants to see packets marked by the guest. > > It does not make sense to me. We are talking about a DPDK API.
> My concern is to avoid redefining new flags > while we already have rte_flow actions. See above.
01/10/2021 12:15, Andrew Rybchenko: > On 10/1/21 12:48 PM, Thomas Monjalon wrote: > > 01/10/2021 10:55, Ivan Malov: > >> On 01/10/2021 11:11, Thomas Monjalon wrote: > >>> 01/10/2021 08:47, Andrew Rybchenko: > >>>> On 9/30/21 10:30 PM, Ivan Malov wrote: > >>>>> On 30/09/2021 19:18, Thomas Monjalon wrote: > >>>>>> 23/09/2021 13:20, Ivan Malov: > >>>>>>> Patch [1/5] of this series adds a generic API to let applications > >>>>>>> negotiate delivery of Rx meta data during initialisation period. > >>> > >>> What is a metadata? > >>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK? > >>> Metadata word could cover any field in the mbuf struct so it is vague. > >> > >> Metadata here is *any* additional information provided by the NIC for > >> each received packet. For example, Rx flag, Rx mark, RSS hash, packet > >> classification info, you name it. I'd like to stress out that the > >> suggested API comes with flags each of which is crystal clear on what > >> concrete kind of metadata it covers, eg. Rx mark. > > > > I missed the flags. > > You mean these 3 flags? > > Yes > > > +/** The ethdev sees flagged packets if there are flows with action FLAG. */ > > +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0) > > + > > +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */ > > +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1) > > + > > +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */ > > +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2) > > > > It is not crystal clear because it does not reference the API, > > like RTE_FLOW_ACTION_TYPE_MARK. > > Thanks, it is easy to fix. Please, note that there is no action > for tunnel ID case. I don't understand the tunnel ID meta. Is it an existing offload? API? > > And it covers a limited set of metadata. > > Yes which are not covered by offloads, packet classification > etc. Anything else? > > > Do you intend to extend to all mbuf metadata? > > No. 
It should be discussed case-by-case separately. Ah, it makes the intent clearer. Why not plan to do something truly generic? > >>>>>>> This way, an application knows right from the start which parts > >>>>>>> of Rx meta data won't be delivered. Hence, no necessity to try > >>>>>>> inserting flows requesting such data and handle the failures. > >>>>>> > >>>>>> Sorry I don't understand the problem you want to solve. > >>>>>> And sorry for not noticing earlier. > >>>>> > >>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the > >>>>> packets by default (for performance reasons). If the application tries > >>>>> to insert a flow with action MARK, the PMD may not be able to enable > >>>>> delivery of Rx mark without the need to re-start Rx sub-system. And > >>>>> that's fraught with traffic disruption and similar bad consequences. In > >>>>> order to address it, we need to let the application express its interest > >>>>> in receiving mark with packets as early as possible. This way, the PMD > >>>>> can enable Rx mark delivery in advance. And, as an additional benefit, > >>>>> the application can learn *from the very beginning* whether it will be > >>>>> possible to use the feature or not. If this API tells the application > >>>>> that no mark delivery will be enabled, then the application can just > >>>>> skip many unnecessary attempts to insert wittingly unsupported flows > >>>>> during runtime. > >>> > >>> I'm puzzled, because we could have the same reasoning for any offload. > >> > >> We're not discussing *offloads*. An offload is when NIC *computes > >> something* and *delivers* it. We are discussing precisely *delivery*. > > > > OK but still, there are a lot more mbuf metadata delivered. > > Yes, and some are not controlled yet early enough, and > we do here. > > > > >>> I don't understand why we are focusing on mark only > >> > >> We are not focusing on mark on purpose. It's just how our discussion > >> goes.
I chose mark (could've chosen flag or anything else) just to show > >> you an example. > >> > >>> I would prefer we find a generic solution using the rte_flow API. > Can we make rte_flow_validate() working before port start? > >>> If validating a fake rule doesn't make sense, > >>> why not having a new function accepting a single action as parameter? > >> > >> A noble idea, but if we feed the entire flow rule to the driver for > >> validation, then the driver must not look specifically for actions FLAG > >> or MARK in it (to enable or disable metadata delivery). This way, the > >> driver is obliged to also validate match criteria, attributes, etc. And, > >> if something is unsupported (say, some specific item), the driver will > >> have to reject the rule as a whole thus leaving the application to join > >> the dots itself. > >> > >> Say, you ask the driver to validate the following rule: > >> pattern blah-blah-1 / blah-blah-2 / end action flag / end > >> intending to check support for FLAG delivery. Suppose, the driver > >> doesn't support pattern item "blah-blah-1". It will throw an error right > >> after seeing this unsupported item and won't even go further to see the > >> action FLAG. How can application know whether its request for FLAG was > >> heard or not? > > > > No, I'm proposing a new function to validate the action alone, > > without any match etc. > > Example: > > rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK) > > When about tunnel ID? > > Also negotiation in terms of bitmask natively allows to > provide everything required at once and it simplifies > implementation in the driver. No dependency on order of > checks etc. Also it allows to renegotiate without any > extra API functions. You mean there is a single function call with all bits set? > >> And I'd not bind delivery of metadata to flow API. Consider the > >> following example. We have a DPDK application sitting at the *host* and > >> we have a *guest* with its *own* DPDK instance. 
The guest DPDK has asked > >> the NIC (by virtue of flow API) to mark all outgoing packets. This > >> packets reach the *host* DPDK. Say, the host application just wants to > >> see the marked packets from the guest. Its own, (the host's) use of flow > >> API is a don't care here. The host doesn't want to mark packets itself, > >> it wants to see packets marked by the guest. > > > > It does not make sense to me. We are talking about a DPDK API. > > My concern is to avoid redefining new flags > > while we already have rte_flow actions. > > See above.
On 10/1/21 3:10 PM, Thomas Monjalon wrote: > 01/10/2021 12:15, Andrew Rybchenko: >> On 10/1/21 12:48 PM, Thomas Monjalon wrote: >>> 01/10/2021 10:55, Ivan Malov: >>>> On 01/10/2021 11:11, Thomas Monjalon wrote: >>>>> 01/10/2021 08:47, Andrew Rybchenko: >>>>>> On 9/30/21 10:30 PM, Ivan Malov wrote: >>>>>>> On 30/09/2021 19:18, Thomas Monjalon wrote: >>>>>>>> 23/09/2021 13:20, Ivan Malov: >>>>>>>>> Patch [1/5] of this series adds a generic API to let applications >>>>>>>>> negotiate delivery of Rx meta data during initialisation period. >>>>> >>>>> What is a metadata? >>>>> Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK? >>>>> Metadata word could cover any field in the mbuf struct so it is vague. >>>> >>>> Metadata here is *any* additional information provided by the NIC for >>>> each received packet. For example, Rx flag, Rx mark, RSS hash, packet >>>> classification info, you name it. I'd like to stress out that the >>>> suggested API comes with flags each of which is crystal clear on what >>>> concrete kind of metadata it covers, eg. Rx mark. >>> >>> I missed the flags. >>> You mean these 3 flags? >> >> Yes >> >>> +/** The ethdev sees flagged packets if there are flows with action FLAG. */ >>> +#define RTE_ETH_RX_META_USER_FLAG (UINT64_C(1) << 0) >>> + >>> +/** The ethdev sees mark IDs in packets if there are flows with action MARK. */ >>> +#define RTE_ETH_RX_META_USER_MARK (UINT64_C(1) << 1) >>> + >>> +/** The ethdev detects missed packets if there are "tunnel_set" flows in use. */ >>> +#define RTE_ETH_RX_META_TUNNEL_ID (UINT64_C(1) << 2) >>> >>> It is not crystal clear because it does not reference the API, >>> like RTE_FLOW_ACTION_TYPE_MARK. >> >> Thanks, it is easy to fix. Please, note that there is no action >> for tunnel ID case. > > I don't understand the tunnel ID meta. > Is it an existing offload? API? rte_flow_tunnel_*() API and "Tunneled traffic offload" in flow API documentation. > >>> And it covers a limited set of metadata. 
>> >> Yes which are not covered by offloads, packet classification >> etc. Anything else? >> >>> Do you intend to extend to all mbuf metadata? >> >> No. It should be discussed case-by-case separately. > > Ah, it makes the intent clearer. > Why not planning to do something truly generic? IMHO, it is generic enough for the purpose. > >>>>>>>>> This way, an application knows right from the start which parts >>>>>>>>> of Rx meta data won't be delivered. Hence, no necessity to try >>>>>>>>> inserting flows requesting such data and handle the failures. >>>>>>>> >>>>>>>> Sorry I don't understand the problem you want to solve. >>>>>>>> And sorry for not noticing earlier. >>>>>>> >>>>>>> No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the >>>>>>> packets by default (for performance reasons). If the application tries >>>>>>> to insert a flow with action MARK, the PMD may not be able to enable >>>>>>> delivery of Rx mark without the need to re-start Rx sub-system. And >>>>>>> that's fraught with traffic disruption and similar bad consequences. In >>>>>>> order to address it, we need to let the application express its interest >>>>>>> in receiving mark with packets as early as possible. This way, the PMD >>>>>>> can enable Rx mark delivery in advance. And, as an additional benefit, >>>>>>> the application can learn *from the very beginning* whether it will be >>>>>>> possible to use the feature or not. If this API tells the application >>>>>>> that no mark delivery will be enabled, then the application can just >>>>>>> skip many unnecessary attempts to insert wittingly unsupported flows >>>>>>> during runtime. >>>>> >>>>> I'm puzzled, because we could have the same reasoning for any offload. >>>> >>>> We're not discussing *offloads*. An offload is when NIC *computes >>>> something* and *delivers* it. We are discussing precisely *delivery*. >>> >>> OK but still, there are a lot more mbuf metadata delivered. 
>> >> Yes, and some are not controlled yet early enough, and >> we do here. >> >>> >>>>> I don't understand why we are focusing on mark only >>>> >>>> We are not focusing on mark on purpose. It's just how our discussion >>>> goes. I chose mark (could've chosen flag or anything else) just to show >>>> you an example. >>>> >>>>> I would prefer we find a generic solution using the rte_flow API. > Can we make rte_flow_validate() working before port start? >>>>> If validating a fake rule doesn't make sense, >>>>> why not having a new function accepting a single action as parameter? >>>> >>>> A noble idea, but if we feed the entire flow rule to the driver for >>>> validation, then the driver must not look specifically for actions FLAG >>>> or MARK in it (to enable or disable metadata delivery). This way, the >>>> driver is obliged to also validate match criteria, attributes, etc. And, >>>> if something is unsupported (say, some specific item), the driver will >>>> have to reject the rule as a whole thus leaving the application to join >>>> the dots itself. >>>> >>>> Say, you ask the driver to validate the following rule: >>>> pattern blah-blah-1 / blah-blah-2 / end action flag / end >>>> intending to check support for FLAG delivery. Suppose, the driver >>>> doesn't support pattern item "blah-blah-1". It will throw an error right >>>> after seeing this unsupported item and won't even go further to see the >>>> action FLAG. How can application know whether its request for FLAG was >>>> heard or not? >>> >>> No, I'm proposing a new function to validate the action alone, >>> without any match etc. >>> Example: >>> rte_flow_action_request(RTE_FLOW_ACTION_TYPE_MARK) Also, please note that sometimes it makes sense to use action MARK at the transfer level, match it in flow rules at the non-transfer level, but not require delivery of the mark to the host. >> >> When about tunnel ID?
>> >> Also negotiation in terms of bitmask natively allows to >> provide everything required at once and it simplifies >> implementation in the driver. No dependency on order of >> checks etc. Also it allows to renegotiate without any >> extra API functions. > > You mean there is a single function call with all bits set? Yes, though not with all bits set, only the required ones. > >>>> And I'd not bind delivery of metadata to flow API. Consider the >>>> following example. We have a DPDK application sitting at the *host* and >>>> we have a *guest* with its *own* DPDK instance. The guest DPDK has asked >>>> the NIC (by virtue of flow API) to mark all outgoing packets. This >>>> packets reach the *host* DPDK. Say, the host application just wants to >>>> see the marked packets from the guest. Its own, (the host's) use of flow >>>> API is a don't care here. The host doesn't want to mark packets itself, >>>> it wants to see packets marked by the guest. >>> >>> It does not make sense to me. We are talking about a DPDK API. >>> My concern is to avoid redefining new flags >>> while we already have rte_flow actions. >> >> See above. >