[v6,1/3] ethdev: introduce IP reassembly offload
Commit Message
IP reassembly is a costly operation when it is done in software.
The operation becomes even costlier if the IP fragments are encrypted.
However, if it is offloaded to HW, it can save a considerable number of
application cycles.
Hence, a new offload feature is exposed in eth_dev ops for devices which
can attempt IP reassembly of packets in hardware.
- rte_eth_ip_reassembly_capability_get() - to get the maximum values
of reassembly configuration which can be set.
- rte_eth_ip_reassembly_conf_set() - to set IP reassembly configuration
and to enable the feature in the PMD (to be called before
rte_eth_dev_start()).
- rte_eth_ip_reassembly_conf_get() - to get the current configuration
set in PMD.
Now when the offload is enabled using rte_eth_ip_reassembly_conf_set(),
the resulting reassembled IP packet would be a typical segmented mbuf in
case of success.
And if reassembly of IP fragments fails or is incomplete (if
fragments do not arrive before the reass_timeout, fragments overlap, etc.),
the mbuf dynamic flags can be updated by the PMD. This is added in a
subsequent patch.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
---
doc/guides/nics/features.rst | 13 ++++
doc/guides/nics/features/default.ini | 1 +
doc/guides/rel_notes/release_22_03.rst | 6 ++
lib/ethdev/ethdev_driver.h | 55 ++++++++++++++
lib/ethdev/rte_ethdev.c | 96 ++++++++++++++++++++++++
lib/ethdev/rte_ethdev.h | 100 +++++++++++++++++++++++++
lib/ethdev/version.map | 5 ++
7 files changed, 276 insertions(+)
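The call order described in the commit message (query the capability, then
set a configuration within it, all before rte_eth_dev_start()) can be
sketched as below. This is a minimal illustration, not code from the patch.
It does not link against DPDK: the struct and flag definitions are local
copies of the ones this patch adds to rte_ethdev.h, while the stub_*()
functions, the limits in stub_capa and the helper enable_ip_reassembly() are
hypothetical stand-ins for rte_eth_ip_reassembly_capability_get() and
rte_eth_ip_reassembly_conf_set(). It also assumes that values equal to the
reported maximum are acceptable.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the definitions this patch adds to rte_ethdev.h,
 * so the sketch builds without DPDK headers. */
#define RTE_ETH_DEV_REASSEMBLY_F_IPV4 (UINT32_C(1) << 0)
#define RTE_ETH_DEV_REASSEMBLY_F_IPV6 (UINT32_C(1) << 1)

struct rte_eth_ip_reassembly_params {
	uint32_t timeout_ms; /* max time to wait for other fragments */
	uint16_t max_frags;  /* max fragments per reassembled packet */
	uint16_t flags;      /* RTE_ETH_DEV_REASSEMBLY_F_xxx */
};

/* Hypothetical device limits and state, for illustration only. */
static const struct rte_eth_ip_reassembly_params stub_capa = {
	.timeout_ms = 2000,
	.max_frags = 4,
	.flags = RTE_ETH_DEV_REASSEMBLY_F_IPV4 | RTE_ETH_DEV_REASSEMBLY_F_IPV6,
};
static struct rte_eth_ip_reassembly_params stub_conf;
static int stub_started; /* non-zero once the port has been started */

/* Stands in for rte_eth_ip_reassembly_capability_get(). */
static int
stub_capability_get(uint16_t port_id, struct rte_eth_ip_reassembly_params *capa)
{
	(void)port_id;
	if (capa == NULL)
		return -EINVAL;
	*capa = stub_capa;
	return 0;
}

/* Stands in for rte_eth_ip_reassembly_conf_set(). */
static int
stub_conf_set(uint16_t port_id, const struct rte_eth_ip_reassembly_params *conf)
{
	(void)port_id;
	if (stub_started != 0 || conf == NULL)
		return -EINVAL; /* mirrors the checks done in the ethdev layer */
	stub_conf = *conf;
	return 0;
}

/* Clamp the requested values to the reported capability, then apply them.
 * Intended to run after device configure and before device start. */
int
enable_ip_reassembly(uint16_t port_id, struct rte_eth_ip_reassembly_params *req)
{
	struct rte_eth_ip_reassembly_params capa;
	int ret;

	ret = stub_capability_get(port_id, &capa);
	if (ret != 0)
		return ret;

	if (req->timeout_ms > capa.timeout_ms)
		req->timeout_ms = capa.timeout_ms;
	if (req->max_frags > capa.max_frags)
		req->max_frags = capa.max_frags;
	req->flags &= capa.flags; /* keep only the supported packet types */
	if (req->flags == 0)
		return -ENOTSUP;

	return stub_conf_set(port_id, req);
}
```

In a real application the helper would call the rte_eth_* functions directly
and run between rte_eth_dev_configure() and rte_eth_dev_start(). On -ENOTSUP
the application would presumably keep reassembling in software.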
Comments
On 2/8/2022 10:20 PM, Akhil Goyal wrote:
> Signed-off-by: Akhil Goyal<gakhil@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
On 2/9/22 01:20, Akhil Goyal wrote:
> Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Just one nit below, sorry that I'm so late
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 147cc1ced3..0215f9d854 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -5202,6 +5202,106 @@ int rte_eth_representor_info_get(uint16_t port_id,
> __rte_experimental
> int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
>
> +/* Flag to offload IP reassembly for IPv4 packets. */
> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV4 (RTE_BIT32(0))
> +/* Flag to offload IP reassembly for IPv6 packets. */
> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV6 (RTE_BIT32(1))
Doxygen style comments should be above: /**
On 2/10/2022 10:08 AM, Andrew Rybchenko wrote:
> On 2/9/22 01:20, Akhil Goyal wrote:
>> Signed-off-by: Akhil Goyal <gakhil@marvell.com>
>
> Just one nit below, sorry that I'm so late
>
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 147cc1ced3..0215f9d854 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -5202,6 +5202,106 @@ int rte_eth_representor_info_get(uint16_t port_id,
>> __rte_experimental
>> int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
>> +/* Flag to offload IP reassembly for IPv4 packets. */
>> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV4 (RTE_BIT32(0))
>> +/* Flag to offload IP reassembly for IPv6 packets. */
>> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV6 (RTE_BIT32(1))
>
> Doxygen style comments should be above: /**
ack. Let me fix that in next-net.
On 2/10/2022 10:20 AM, Ferruh Yigit wrote:
> On 2/10/2022 10:08 AM, Andrew Rybchenko wrote:
>> On 2/9/22 01:20, Akhil Goyal wrote:
>>> Signed-off-by: Akhil Goyal <gakhil@marvell.com>
>>
>> Just one nit below, sorry that I'm so late
>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 147cc1ced3..0215f9d854 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -5202,6 +5202,106 @@ int rte_eth_representor_info_get(uint16_t port_id,
>>> __rte_experimental
>>> int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
>>> +/* Flag to offload IP reassembly for IPv4 packets. */
>>> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV4 (RTE_BIT32(0))
>>> +/* Flag to offload IP reassembly for IPv6 packets. */
>>> +#define RTE_ETH_DEV_REASSEMBLY_F_IPV6 (RTE_BIT32(1))
>>
>> Doxygen style comments should be above: /**
>
> ack. Let me fix that in next-net.
done, please verify in next-net
@@ -602,6 +602,19 @@ Supports inner packet L4 checksum.
``tx_offload_capa,tx_queue_offload_capa:RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM``.
+.. _nic_features_ip_reassembly:
+
+IP reassembly
+-------------
+
+Supports IP reassembly in hardware.
+
+* **[provides] eth_dev_ops**: ``ip_reassembly_capability_get``,
+ ``ip_reassembly_conf_get``, ``ip_reassembly_conf_set``.
+* **[related] API**: ``rte_eth_ip_reassembly_capability_get()``,
+ ``rte_eth_ip_reassembly_conf_get()``, ``rte_eth_ip_reassembly_conf_set()``.
+
+
.. _nic_features_shared_rx_queue:
Shared Rx queue
@@ -52,6 +52,7 @@ Timestamp offload =
MACsec offload =
Inner L3 checksum =
Inner L4 checksum =
+IP reassembly =
Packet type parsing =
Timesync =
Rx descriptor status =
@@ -55,6 +55,12 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added IP reassembly Ethernet offload API, to get and set config.**
+
+ Added IP reassembly offload APIs which provide functions to query IP
+ reassembly capabilities, to set configuration and to get currently set
+ reassembly configuration.
+
* **Updated Cisco enic driver.**
* Added rte_flow support for matching GENEVE packets.
@@ -990,6 +990,54 @@ typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev,
typedef int (*eth_rx_metadata_negotiate_t)(struct rte_eth_dev *dev,
uint64_t *features);
+/**
+ * @internal
+ * Get IP reassembly offload capability of a PMD.
+ *
+ * @param dev
+ * Port (ethdev) handle
+ *
+ * @param[out] capa
+ * IP reassembly capability supported by the PMD
+ *
+ * @return
+ * Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_ip_reassembly_capability_get_t)(struct rte_eth_dev *dev,
+ struct rte_eth_ip_reassembly_params *capa);
+
+/**
+ * @internal
+ * Get IP reassembly offload configuration parameters set in PMD.
+ *
+ * @param dev
+ * Port (ethdev) handle
+ *
+ * @param[out] conf
+ * Configuration parameters for IP reassembly.
+ *
+ * @return
+ * Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_ip_reassembly_conf_get_t)(struct rte_eth_dev *dev,
+ struct rte_eth_ip_reassembly_params *conf);
+
+/**
+ * @internal
+ * Set configuration parameters for enabling IP reassembly offload in hardware.
+ *
+ * @param dev
+ * Port (ethdev) handle
+ *
+ * @param[in] conf
+ * Configuration parameters for IP reassembly.
+ *
+ * @return
+ * Negative errno value on error, zero otherwise
+ */
+typedef int (*eth_ip_reassembly_conf_set_t)(struct rte_eth_dev *dev,
+ const struct rte_eth_ip_reassembly_params *conf);
+
/**
* @internal A structure containing the functions exported by an Ethernet driver.
*/
@@ -1186,6 +1234,13 @@ struct eth_dev_ops {
* kinds of metadata to the PMD
*/
eth_rx_metadata_negotiate_t rx_metadata_negotiate;
+
+ /** Get IP reassembly capability */
+ eth_ip_reassembly_capability_get_t ip_reassembly_capability_get;
+ /** Get IP reassembly configuration */
+ eth_ip_reassembly_conf_get_t ip_reassembly_conf_get;
+ /** Set IP reassembly configuration */
+ eth_ip_reassembly_conf_set_t ip_reassembly_conf_set;
};
/**
@@ -6474,6 +6474,102 @@ rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features)
(*dev->dev_ops->rx_metadata_negotiate)(dev, features));
}
+int
+rte_eth_ip_reassembly_capability_get(uint16_t port_id,
+ struct rte_eth_ip_reassembly_params *reassembly_capa)
+{
+ struct rte_eth_dev *dev;
+
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+ dev = &rte_eth_devices[port_id];
+
+ if (dev->data->dev_configured == 0) {
+ RTE_ETHDEV_LOG(ERR,
+ "Device with port_id=%u is not configured.\n"
+ "Cannot get IP reassembly capability\n",
+ port_id);
+ return -EINVAL;
+ }
+
+ if (reassembly_capa == NULL) {
+ RTE_ETHDEV_LOG(ERR, "Cannot get reassembly capability to NULL\n");
+ return -EINVAL;
+ }
+
+ RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->ip_reassembly_capability_get,
+ -ENOTSUP);
+ memset(reassembly_capa, 0, sizeof(struct rte_eth_ip_reassembly_params));
+
+ return eth_err(port_id, (*dev->dev_ops->ip_reassembly_capability_get)
+ (dev, reassembly_capa));
+}
+
+int
+rte_eth_ip_reassembly_conf_get(uint16_t port_id,
+ struct rte_eth_ip_reassembly_params *conf)
+{
+ struct rte_eth_dev *dev;
+
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+ dev = &rte_eth_devices[port_id];
+
+ if (dev->data->dev_configured == 0) {
+ RTE_ETHDEV_LOG(ERR,
+ "Device with port_id=%u is not configured.\n"
+ "Cannot get IP reassembly configuration\n",
+ port_id);
+ return -EINVAL;
+ }
+
+ if (conf == NULL) {
+ RTE_ETHDEV_LOG(ERR, "Cannot get reassembly info to NULL\n");
+ return -EINVAL;
+ }
+
+ RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->ip_reassembly_conf_get,
+ -ENOTSUP);
+ memset(conf, 0, sizeof(struct rte_eth_ip_reassembly_params));
+ return eth_err(port_id,
+ (*dev->dev_ops->ip_reassembly_conf_get)(dev, conf));
+}
+
+int
+rte_eth_ip_reassembly_conf_set(uint16_t port_id,
+ const struct rte_eth_ip_reassembly_params *conf)
+{
+ struct rte_eth_dev *dev;
+
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+ dev = &rte_eth_devices[port_id];
+
+ if (dev->data->dev_configured == 0) {
+ RTE_ETHDEV_LOG(ERR,
+ "Device with port_id=%u is not configured.\n"
+ "Cannot set IP reassembly configuration\n",
+ port_id);
+ return -EINVAL;
+ }
+
+ if (dev->data->dev_started != 0) {
+ RTE_ETHDEV_LOG(ERR,
+ "Device with port_id=%u started,\n"
+ "cannot configure IP reassembly params.\n",
+ port_id);
+ return -EINVAL;
+ }
+
+ if (conf == NULL) {
+ RTE_ETHDEV_LOG(ERR,
+ "Invalid IP reassembly configuration (NULL)\n");
+ return -EINVAL;
+ }
+
+ RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->ip_reassembly_conf_set,
+ -ENOTSUP);
+ return eth_err(port_id,
+ (*dev->dev_ops->ip_reassembly_conf_set)(dev, conf));
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
RTE_INIT(ethdev_init_telemetry)
@@ -5202,6 +5202,106 @@ int rte_eth_representor_info_get(uint16_t port_id,
__rte_experimental
int rte_eth_rx_metadata_negotiate(uint16_t port_id, uint64_t *features);
+/* Flag to offload IP reassembly for IPv4 packets. */
+#define RTE_ETH_DEV_REASSEMBLY_F_IPV4 (RTE_BIT32(0))
+/* Flag to offload IP reassembly for IPv6 packets. */
+#define RTE_ETH_DEV_REASSEMBLY_F_IPV6 (RTE_BIT32(1))
+/**
+ * A structure used to get/set IP reassembly configuration. It is also used
+ * to get the maximum capability values that a PMD can support.
+ *
+ * If rte_eth_ip_reassembly_capability_get() returns 0, IP reassembly can be
+ * enabled using rte_eth_ip_reassembly_conf_set() and params values lower than
+ * capability params can be set in the PMD.
+ */
+struct rte_eth_ip_reassembly_params {
+ /** Maximum time in ms which PMD can wait for other fragments. */
+ uint32_t timeout_ms;
+ /** Maximum number of fragments that can be reassembled. */
+ uint16_t max_frags;
+ /**
+ * Flags to enable reassembly of packet types -
+ * RTE_ETH_DEV_REASSEMBLY_F_xxx.
+ */
+ uint16_t flags;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Get IP reassembly capabilities supported by the PMD. This is the first API
+ * to be called for enabling the IP reassembly offload feature. PMD will return
+ * the maximum values of parameters that PMD can support and user can call
+ * rte_eth_ip_reassembly_conf_set() with param values lower than capability.
+ *
+ * @param port_id
+ * The port identifier of the device.
+ * @param capa
+ * A pointer to rte_eth_ip_reassembly_params structure.
+ * @return
+ * - (-ENOTSUP) if offload configuration is not supported by device.
+ * - (-ENODEV) if *port_id* invalid.
+ * - (-EIO) if device is removed.
+ * - (-EINVAL) if device is not configured or *capa* passed is NULL.
+ * - (0) on success.
+ */
+__rte_experimental
+int rte_eth_ip_reassembly_capability_get(uint16_t port_id,
+ struct rte_eth_ip_reassembly_params *capa);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Get IP reassembly configuration parameters currently set in PMD.
+ * The API will return error if the configuration is not already
+ * set using rte_eth_ip_reassembly_conf_set() before calling this API or if
+ * the device is not configured.
+ *
+ * @param port_id
+ * The port identifier of the device.
+ * @param conf
+ * A pointer to rte_eth_ip_reassembly_params structure.
+ * @return
+ * - (-ENOTSUP) if offload configuration is not supported by device.
+ * - (-ENODEV) if *port_id* invalid.
+ * - (-EIO) if device is removed.
+ * - (-EINVAL) if device is not configured or if *conf* passed is NULL or if
+ * configuration is not set using rte_eth_ip_reassembly_conf_set().
+ * - (0) on success.
+ */
+__rte_experimental
+int rte_eth_ip_reassembly_conf_get(uint16_t port_id,
+ struct rte_eth_ip_reassembly_params *conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Set IP reassembly configuration parameters if the PMD supports IP reassembly
+ * offload. User should first call rte_eth_ip_reassembly_capability_get() to
+ * check the maximum values supported by the PMD before setting the
+ * configuration. The use of this API is mandatory to enable this feature and
+ * should be called before rte_eth_dev_start().
+ *
+ * @param port_id
+ * The port identifier of the device.
+ * @param conf
+ * A pointer to rte_eth_ip_reassembly_params structure.
+ * @return
+ * - (-ENOTSUP) if offload configuration is not supported by device.
+ * - (-ENODEV) if *port_id* invalid.
+ * - (-EIO) if device is removed.
+ * - (-EINVAL) if device is not configured or if device is already started or
+ * if *conf* passed is NULL.
+ * - (0) on success.
+ */
+__rte_experimental
+int rte_eth_ip_reassembly_conf_set(uint16_t port_id,
+ const struct rte_eth_ip_reassembly_params *conf);
+
+
#include <rte_ethdev_core.h>
/**
@@ -256,6 +256,11 @@ EXPERIMENTAL {
rte_flow_flex_item_create;
rte_flow_flex_item_release;
rte_flow_pick_transfer_proxy;
+
+ # added in 22.03
+ rte_eth_ip_reassembly_capability_get;
+ rte_eth_ip_reassembly_conf_get;
+ rte_eth_ip_reassembly_conf_set;
};
INTERNAL {