
[v8,2/4] ethdev: introduce protocol hdr based buffer split

Message ID 20221005231836.215112-3-yuanx.wang@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Andrew Rybchenko
Series: support protocol based buffer split

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Yuan Wang Oct. 5, 2022, 11:18 p.m. UTC
Currently, Rx buffer split supports length based split. With Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
configured, PMD will be able to split the received packets into
multiple segments.

However, length based buffer split is not suitable for NICs that split
based on protocol headers. Because the length given in an Rx packet
segment is arbitrary, it is almost impossible to describe a fixed protocol
header boundary to the driver. Besides, tunneling makes the composition of
a packet vary, which makes the situation even worse.

This patch extends current buffer split to support protocol header based
buffer split. A new proto_hdr field is introduced in the reserved field
of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
field defines the split position of packet, splitting will always happen
after the protocol header defined in the Rx packet segment. When Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
protocol header is configured, driver will split the ingress packets into
multiple segments.

Examples of proto_hdr field definitions:
To split after ETH-IPV4-UDP, it should be defined as
proto_hdr = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
            RTE_PTYPE_L4_UDP

For inner ETH-IPV4-UDP, it should be defined as
proto_hdr = RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
            RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_INNER_L4_UDP

If part of a protocol header sequence repeats one defined in a previous
segment, the repeated part can be omitted. For example, to split after
ETH, ETH-IPV4 and ETH-IPV4-UDP, it should be defined as
proto_hdr0 = RTE_PTYPE_L2_ETHER
proto_hdr1 = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN
proto_hdr2 = RTE_PTYPE_L4_UDP

struct rte_eth_rxseg_split {
        struct rte_mempool *mp; /* memory pools to allocate segment from */
        uint16_t length; /* segment maximal data length,
                            configures split point */
        uint16_t offset; /* data offset from beginning
                            of mbuf data buffer */
        /**
         * Proto_hdr defines a bit mask of the protocol sequence as
         * RTE_PTYPE_*, configures split point. The last RTE_PTYPE*
         * in the mask indicates the split position.
         * If one protocol header is defined to split packets into two
         * segments, for non-tunneling packets, the complete protocol
         * sequence should be defined.
         * For tunneling packets, for simplicity,
         * only the tunnel and inner part of the complete protocol
         * sequence is required.
         * If several protocol headers are defined to split packets into
         * multi-segments, the repeated parts of adjacent segments
         * should be omitted.
         */
        uint32_t proto_hdr;
};

If a PMD supports protocol header split, the
rte_eth_buffer_split_get_supported_hdr_ptypes function can
be used to obtain a list of the supported protocol headers.

For example, let's suppose we configured the Rx queue with the
following segments:
        seg0 - pool0, proto_hdr0=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
               off0=2B
        seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
        seg2 - pool2, off2=0B

A packet consisting of ETH_IPV4_UDP_PAYLOAD will be split as
follows:
        seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
        seg1 - udp header @ 128 in mbuf from pool1
        seg2 - payload @ 0 in mbuf from pool2
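
As a rough illustration, the three-segment configuration above could be filled in as follows. This is only a sketch under stated assumptions: `struct ex_rxseg_split` is a local stand-in that mirrors the proposed `rte_eth_rxseg_split` layout, the `EX_PTYPE_*` constants are placeholders for the corresponding `RTE_PTYPE_*` values, and `ex_fill_segs()` is a hypothetical helper that is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholders; real code would use RTE_PTYPE_* from rte_mbuf_ptype.h. */
#define EX_PTYPE_L2_ETHER 0x00000001u
#define EX_PTYPE_L3_IPV4  0x00000010u
#define EX_PTYPE_L4_UDP   0x00000200u

struct ex_mempool; /* opaque stand-in for struct rte_mempool */

/* Stand-in mirroring the proposed rte_eth_rxseg_split layout. */
struct ex_rxseg_split {
	struct ex_mempool *mp;
	uint16_t length;
	uint16_t offset;
	uint32_t proto_hdr;
};

/* Hypothetical helper: describe seg0/seg1/seg2 from the example above. */
static void
ex_fill_segs(struct ex_rxseg_split segs[3], struct ex_mempool *pool0,
	     struct ex_mempool *pool1, struct ex_mempool *pool2)
{
	assert(segs != NULL);

	/* seg0: split after ETH + IPv4, data starts 2 bytes past the headroom. */
	segs[0] = (struct ex_rxseg_split){
		.mp = pool0, .length = 0, .offset = 2,
		.proto_hdr = EX_PTYPE_L2_ETHER | EX_PTYPE_L3_IPV4,
	};
	/* seg1: split after UDP; the repeated ETH + IPv4 part is omitted. */
	segs[1] = (struct ex_rxseg_split){
		.mp = pool1, .length = 0, .offset = 128,
		.proto_hdr = EX_PTYPE_L4_UDP,
	};
	/* seg2: remaining payload; proto_hdr stays 0 in the last segment. */
	segs[2] = (struct ex_rxseg_split){
		.mp = pool2, .length = 0, .offset = 0, .proto_hdr = 0,
	};
}
```

In real code the array would then be referenced from the Rx queue configuration passed to rte_eth_rx_queue_setup(), with RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT set in the queue offloads.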

Note: NIC will only do split when the packets exactly match all the
protocol headers in the segments. For example, if ARP packets are received
with the above config, the NIC won't split them since
they do not contain an ipv4 header and a udp header. These packets will be put
into the last valid mempool, with zero offset.

Now buffer split can be configured in two modes. For length based
buffer split, the mp, length and offset fields in the Rx packet segment
should be configured, while the proto_hdr field will be ignored.
For protocol header based buffer split, the mp, offset and proto_hdr fields
in the Rx packet segment should be configured, while the length field will
be ignored.

The split limitations imposed by the underlying driver are reported in the
rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
parts may differ as well, for example DPDK memory and external memory.

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
---
 doc/guides/rel_notes/release_22_11.rst |  4 ++
 lib/ethdev/rte_ethdev.c                | 89 ++++++++++++++++++++++----
 lib/ethdev/rte_ethdev.h                | 34 +++++++++-
 3 files changed, 115 insertions(+), 12 deletions(-)

Comments

Andrew Rybchenko Oct. 6, 2022, 10:11 a.m. UTC | #1
On 10/6/22 02:18, Yuan Wang wrote:
> Currently, Rx buffer split supports length based split. With Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
> configured, PMD will be able to split the received packets into
> multiple segments.
> 
> However, length based buffer split is not suitable for NICs that do split
> based on protocol headers. Given an arbitrarily variable length in Rx
> packet segment, it is almost impossible to pass a fixed protocol header to
> driver. Besides, the existence of tunneling results in the composition of
> a packet is various, which makes the situation even worse.
> 
> This patch extends current buffer split to support protocol header based
> buffer split. A new proto_hdr field is introduced in the reserved field
> of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
> field defines the split position of packet, splitting will always happen
> after the protocol header defined in the Rx packet segment. When Rx queue
> offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol header is configured, driver will split the ingress packets into
> multiple segments.
> 
> Examples for proto_hdr field defines:
> To split after ETH-IPV4-UDP, it should be defined as
> proto_hdr = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
>              RTE_PTYPE_L4_UDP
> 
> For inner ETH-IPV4-UDP, it should be defined as
> proto_hdr = RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
>              RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_INNER_L4_UDP
> 
> If the protocol header is repeated with the previously defined one,
> the repeated part can be omitted. For example, split after ETH, ETH-IPV4
> and ETH-IPV4-UDP, it should be defined as
> proto_hdr0 = RTE_PTYPE_L2_ETHER
> proto_hdr1 = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN
> proto_hdr2 = RTE_PTYPE_L4_UDP

Ack

> 
> struct rte_eth_rxseg_split {
>          struct rte_mempool *mp; /* memory pools to allocate segment from */
>          uint16_t length; /* segment maximal data length,
>                              configures split point */
>          uint16_t offset; /* data offset from beginning
>                              of mbuf data buffer */
>          /**
> 	 * Proto_hdr defines a bit mask of the protocol sequence as
>           * RTE_PTYPE_*, configures split point. The last RTE_PTYPE*
>           * in the mask indicates the split position.
>           * If one protocol header is defined to split packets into two
>           * segments, for non-tunneling packets, the complete protocol
>           * sequence should be defined.
>           * For tunneling packets, for simplicity,
>           * only the tunnel and inner part of comple protocol sequence
>           * is required.
>           * If several protocol headers are defined to split packets into
>           * multi-segments, the repeated parts of adjacent segments
>           * should be omitted.
> 	 */
>          uint32_t proto_hdr;
> };

Sorry, but I see no reason to repeat it in the description.
What is the purpose of the duplication?

> 
> If protocol header split can be supported by a PMD, the
> rte_eth_buffer_split_get_supported_hdr_ptypes function can
> be use to obtain a list of these protocol headers.
> 
> For example, let's suppose we configured the Rx queue with the
> following segments:
>          seg0 - pool0, proto_hdr0=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
>                 off0=2B
>          seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
>          seg2 - pool2, off1=0B
> 
> The packet consists of ETH_IPV4_UDP_PAYLOAD will be split like
> following:
>          seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>          seg1 - udp header @ 128 in mbuf from pool1
>          seg2 - payload @ 0 in mbuf from pool2
> 
> Note: NIC will only do split when the packets exactly match all the
> protocol headers in the segments. For example, if ARP packets received
> with above config, the NIC won't do split for ARP packets since
> it does not contains ipv4 header and udp header. These packets will be put

ipv4 -> IPv4, udp -> UDP.

> into the last valid mempool, with zero offset.

What should happen if we have seg1 -> ETH, seg2 -> IPv4, seg3 - remaining
and receive ARP? Will we see the ETH header split into seg1
and everything else in seg3? I would say yes.

Maybe we should define the intended behavior using pseudo-code?

> 
> Now buffer split can be configured in two modes. For length based
> buffer split, the mp, length, offset field in Rx packet segment should
> be configured, while the proto_hdr field will be ignored.
> For protocol header based buffer split, the mp, offset, proto_hdr field
> in Rx packet segment should be configured, while the length field will
> be ignored.
> 
> The split limitations imposed by underlying driver is reported in the
> rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
> parts may differ either, dpdk memory and external memory, respectively.
> 
> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> Signed-off-by: Xuan Ding <xuan.ding@intel.com>
> Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
> ---
>   doc/guides/rel_notes/release_22_11.rst |  4 ++
>   lib/ethdev/rte_ethdev.c                | 89 ++++++++++++++++++++++----
>   lib/ethdev/rte_ethdev.h                | 34 +++++++++-
>   3 files changed, 115 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
> index 141fd9442b..4c3a7f8b8b 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -127,6 +127,10 @@ New Features
>   
>     * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
>       header protocols of a PMD to split.
> +  * Ethdev: The ``reserved`` field in the ``rte_eth_rxseg_split`` structure is
> +    replaced with ``proto_hdr`` to support protocol header based buffer split.
> +    User can choose length or protocol header to configure buffer split
> +    according to NIC's capability.

It sounds like it should be mentioned in API change section as
well. Here I'd concentrate on top level feature overview only.
I.e. Supported protocol-based buffer split using added
``proto_hdr`` in structure ``rte_eth_rxseg_split``.

>   
>   
>   Removed Items
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index ee3b490889..60fe6eb2bd 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1650,14 +1650,18 @@ rte_eth_dev_is_removed(uint16_t port_id)
>   }
>   
>   static int
> -rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
> -			     uint16_t n_seg, uint32_t *mbp_buf_size,
> -			     const struct rte_eth_dev_info *dev_info)
> +rte_eth_rx_queue_check_split(uint16_t port_id,
> +			const struct rte_eth_rxseg_split *rx_seg,
> +			uint16_t n_seg, uint32_t *mbp_buf_size,
> +			const struct rte_eth_dev_info *dev_info)
>   {
>   	const struct rte_eth_rxseg_capa *seg_capa = &dev_info->rx_seg_capa;
>   	struct rte_mempool *mp_first;
>   	uint32_t offset_mask;
>   	uint16_t seg_idx;
> +	int ptype_cnt;
> +	uint32_t *ptypes;
> +	int i;
>   
>   	if (n_seg > seg_capa->max_nseg) {
>   		RTE_ETHDEV_LOG(ERR,
> @@ -1675,6 +1679,7 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
>   		uint32_t length = rx_seg[seg_idx].length;
>   		uint32_t offset = rx_seg[seg_idx].offset;
> +		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
>   
>   		if (mpl == NULL) {
>   			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
> @@ -1708,13 +1713,75 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
>   		}
>   		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
>   		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> -		length = length != 0 ? length : *mbp_buf_size;
> -		if (*mbp_buf_size < length + offset) {
> -			RTE_ETHDEV_LOG(ERR,
> -				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> -				       mpl->name, *mbp_buf_size,
> -				       length + offset, length, offset);
> -			return -EINVAL;
> +
> +		if (proto_hdr > 0) {

proto_hdr != 0, please. I know that it is the same, but != 0
raises a bit less question if the field is signed or unsigned.

As the first condition here we should check if protocol-based
split is supported at all (see note about separate helper
function below).

> +			/* Split based on protocol headers. */
> +			if (length != 0) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"Do not set length split and protocol split within a segment\n"
> +					);
> +				return -EINVAL;
> +			}
> +
> +			if (seg_idx == n_seg - 1) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"The proto_hdr in the last segment should be 0\n"
> +					);
> +				return -EINVAL;
> +			}

I think here we should check if we have seen any segment
with proto_hdr == 0 before. If so, we can't do protocol
based split any more. Since we need to collect already
split protocols (prev_proto_hdrs), I would use the variable
as a marker and set it to an all-ones mask as soon as
proto_hdr == 0 is met.

So, the condition will be
if ((proto_hdr & prev_proto_hdrs) != 0)

So, it will check two things: no repeat of previous
protocol headers which are already split, and no
proto-split after length-based split.
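
Pulling the suggestions in this review together, the per-segment check might look roughly like the sketch below. It is not the patch's code: `ex_check_proto_split()` is a hypothetical function that works on plain arrays of proto_hdr values, `EX_PTYPE_ALL_MASK` is a placeholder for RTE_PTYPE_ALL_MASK, and `supported[]` is assumed to hold the cumulative header masks reported by the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PTYPE_ALL_MASK 0x0fffffffu /* placeholder for RTE_PTYPE_ALL_MASK */

/*
 * Validate the proto_hdr values of n_seg segments.
 * supported[] lists the cumulative header masks the driver can split after.
 */
static int
ex_check_proto_split(const uint32_t *proto_hdrs, uint16_t n_seg,
		     const uint32_t *supported, int n_supported)
{
	uint32_t prev_proto_hdrs = 0;
	uint16_t seg_idx;
	int i;

	assert(proto_hdrs != NULL && supported != NULL);

	for (seg_idx = 0; seg_idx < n_seg; seg_idx++) {
		uint32_t proto_hdr = proto_hdrs[seg_idx];

		if (proto_hdr == 0) {
			/* Length-based segment: block any later protocol split. */
			prev_proto_hdrs = EX_PTYPE_ALL_MASK;
			continue;
		}
		/* The last segment must not request a protocol split. */
		if (seg_idx == n_seg - 1)
			return -EINVAL;
		/* Rejects repeated headers and proto-split after length split. */
		if ((proto_hdr & prev_proto_hdrs) != 0)
			return -EINVAL;
		/* The accumulated sequence must be one the driver supports. */
		for (i = 0; i < n_supported; i++)
			if ((prev_proto_hdrs | proto_hdr) == supported[i])
				break;
		if (i == n_supported)
			return -EINVAL;
		prev_proto_hdrs |= proto_hdr;
	}
	return 0;
}
```

Using a single accumulator for both purposes keeps the loop body short: once a length-based segment is seen, every later non-zero proto_hdr overlaps the all-ones mask and is rejected by the same test that catches repeated headers.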

> +
> +			if (*mbp_buf_size < offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +						"%s mbuf_data_room_size %u < %u segment offset)\n",
> +						mpl->name, *mbp_buf_size,
> +						offset);
> +				return -EINVAL;
> +			}
> +

(separate helper function starts here)

> +			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id, NULL, 0);

There is no point in doing it in a loop. It should be done
outside. Moreover, it should be a helper function
which does it, to make this function short.
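
A helper along these lines could be called once before the segment loop. This is a sketch only: `ex_get_ptypes_helper()` is a hypothetical name, and `ex_get_supported_hdr_ptypes()` is a stub standing in for rte_eth_buffer_split_get_supported_hdr_ptypes() so that the two-call pattern (count first, then fill) can be shown in isolation; the table it returns is made up.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stub standing in for rte_eth_buffer_split_get_supported_hdr_ptypes():
 * returns the number of supported masks and copies up to num of them.
 * The table contents are illustrative only.
 */
static int
ex_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num)
{
	static const uint32_t table[] = { 0x1u, 0x11u, 0x211u };
	const int total = (int)(sizeof(table) / sizeof(table[0]));
	int i;

	(void)port_id;
	for (i = 0; ptypes != NULL && i < num && i < total; i++)
		ptypes[i] = table[i];
	return total;
}

/*
 * Query the supported list once. On success, returns the number of
 * entries and stores a malloc()ed array in *out, which the caller frees.
 */
static int
ex_get_ptypes_helper(uint16_t port_id, uint32_t **out)
{
	uint32_t *ptypes;
	int cnt;

	assert(out != NULL);
	*out = NULL;

	/* First call: only ask how many entries there are. */
	cnt = ex_get_supported_hdr_ptypes(port_id, NULL, 0);
	if (cnt <= 0)
		return -EINVAL; /* protocol-based split not available */

	ptypes = malloc(sizeof(*ptypes) * (size_t)cnt);
	if (ptypes == NULL)
		return -ENOMEM;

	/* Second call: fill the array. */
	cnt = ex_get_supported_hdr_ptypes(port_id, ptypes, cnt);
	if (cnt <= 0) {
		free(ptypes);
		return -EINVAL;
	}

	*out = ptypes;
	return cnt;
}
```

With this shape the allocation and both queries happen once per queue setup rather than once per segment, and the caller frees the array after the loop.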

> +			if (ptype_cnt <= 0) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"Port %u failed to supported buffer split header protocols\n",
> +					port_id);
> +				return -EINVAL;
> +			}
> +
> +			ptypes = malloc(sizeof(uint32_t) * ptype_cnt);
> +			if (ptypes == NULL)
> +				return -ENOMEM;
> +
> +			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id,
> +										ptypes, ptype_cnt);
> +			if (ptype_cnt < 0) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"Port %u failed to supported buffer split header protocols\n",
> +					port_id);
> +				free(ptypes);
> +				return -EINVAL;
> +			}

(separate helper function ends here)

> +
> +			for (i = 0; i < ptype_cnt; i++)
> +				if (ptypes[i] == proto_hdr)

It should be if ((prev_proto_hdrs | proto_hdr) == ptypes[i])

> +					break;
> +
> +			free(ptypes);
> +
> +			if (i == ptype_cnt) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"Requested Rx split header protocols 0x%x is not supported.\n",
> +					proto_hdr);
> +				return -EINVAL;
> +			}

prev_proto_hdrs |= proto_hdr;

> +		} else {

NOTE: If the driver does not support length-based split,
it should reject such a configuration itself.

> +			/* Split at fixed length. */
> +			length = length != 0 ? length : *mbp_buf_size;
> +			if (*mbp_buf_size < length + offset) {
> +				RTE_ETHDEV_LOG(ERR,
> +					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> +					mpl->name, *mbp_buf_size,
> +					length + offset, length, offset);
> +				return -EINVAL;
> +			}

prev_proto_hdrs = RTE_PTYPE_ALL_MASK;

>   		}
>   	}
>   	return 0;
> @@ -1794,7 +1861,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>   		n_seg = rx_conf->rx_nseg;
>   
>   		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
> -			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
> +			ret = rte_eth_rx_queue_check_split(port_id, rx_seg, n_seg,
>   							   &mbp_buf_size,
>   							   &dev_info);
>   			if (ret != 0)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c51c1f3fa0..4c9b121355 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -994,6 +994,9 @@ struct rte_eth_txmode {
>    *   specified in the first array element, the second buffer, from the
>    *   pool in the second element, and so on.
>    *
> + * - The proto_hdrs in the elements define the split position of
> + *   received packets.
> + *
>    * - The offsets from the segment description elements specify
>    *   the data offset from the buffer beginning except the first mbuf.
>    *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
> @@ -1015,12 +1018,41 @@ struct rte_eth_txmode {
>    *     - pool from the last valid element
>    *     - the buffer size from this pool
>    *     - zero offset
> + *
> + * - Length based buffer split:
> + *     - mp, length, offset should be configured.
> + *     - The proto_hdr field must be 0.
> + *
> + * - Protocol header based buffer split:
> + *     - mp, offset, proto_hdr should be configured.
> + *     - The length field must be 0.
> + *     - The proto_hdr field in the last segment should be 0.
> + *
> + * - For Protocol header based buffer split, if the received packets
> + *   don't exactly match all protocol headers in the elements, packets
> + *   will not be split.
> + *   These packets will be put into:
> + *     - pool from the last valid element
> + *     - the buffer size from this pool
> + *     - zero offset
>    */
>   struct rte_eth_rxseg_split {
>   	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
>   	uint16_t length; /**< Segment data length, configures split point. */
>   	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
> -	uint32_t reserved; /**< Reserved field. */
> +	/**
> +	 * Proto_hdr defines a bit mask of the protocol sequence as RTE_PTYPE_*,
> +	 * configures split point. The last RTE_PTYPE* in the mask indicates the
> +	 * split position.
> +	 *
> +	 * If one protocol header is defined to split packets into two segments,
> +	 * for non-tunneling packets, the complete protocol sequence should be defined.
> +	 * For tunneling packets, for simplicity, only the tunnel and inner part of
> +	 * comple protocol sequence is required.
> +	 * If several protocol headers are defined to split packets into multi-segments,
> +	 * the repeated parts of adjacent segments should be omitted.
> +	 */
> +	uint32_t proto_hdr;
>   };
>   
>   /**

Ding, Xuan Oct. 8, 2022, 2:30 p.m. UTC | #2
Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Thursday, October 6, 2022 6:12 PM
> To: Wang, YuanX <yuanx.wang@intel.com>; dev@dpdk.org; Thomas
> Monjalon <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: ferruh.yigit@xilinx.com; mdr@ashroe.eu; Li, Xiaoyun
> <xiaoyun.li@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>;
> Zhang, Yuying <yuying.zhang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Yang, Qiming <qiming.yang@intel.com>;
> jerinjacobk@gmail.com; viacheslavo@nvidia.com;
> stephen@networkplumber.org; Ding, Xuan <xuan.ding@intel.com>;
> hpothula@marvell.com; Tang, Yaqi <yaqi.tang@intel.com>; Wenxuan Wu
> <wenxuanx.wu@intel.com>
> Subject: Re: [PATCH v8 2/4] ethdev: introduce protocol hdr based buffer split
> 
> On 10/6/22 02:18, Yuan Wang wrote:
> > Currently, Rx buffer split supports length based split. With Rx queue
> > offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet
> segment
> > configured, PMD will be able to split the received packets into
> > multiple segments.
> >
> > However, length based buffer split is not suitable for NICs that do
> > split based on protocol headers. Given an arbitrarily variable length
> > in Rx packet segment, it is almost impossible to pass a fixed protocol
> > header to driver. Besides, the existence of tunneling results in the
> > composition of a packet is various, which makes the situation even worse.
> >
> > This patch extends current buffer split to support protocol header
> > based buffer split. A new proto_hdr field is introduced in the
> > reserved field of rte_eth_rxseg_split structure to specify protocol
> > header. The proto_hdr field defines the split position of packet,
> > splitting will always happen after the protocol header defined in the
> > Rx packet segment. When Rx queue offload
> > RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
> protocol
> > header is configured, driver will split the ingress packets into multiple
> segments.
> >
> > Examples for proto_hdr field defines:
> > To split after ETH-IPV4-UDP, it should be defined as proto_hdr =
> > RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
> >              RTE_PTYPE_L4_UDP
> >
> > For inner ETH-IPV4-UDP, it should be defined as proto_hdr =
> > RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
> >              RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
> > RTE_PTYPE_INNER_L4_UDP
> >
> > If the protocol header is repeated with the previously defined one,
> > the repeated part can be omitted. For example, split after ETH,
> > ETH-IPV4 and ETH-IPV4-UDP, it should be defined as
> > proto_hdr0 = RTE_PTYPE_L2_ETHER
> > proto_hdr1 = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN
> > proto_hdr2 = RTE_PTYPE_L4_UDP
> 
> Ack
> 
> >
> > struct rte_eth_rxseg_split {
> >          struct rte_mempool *mp; /* memory pools to allocate segment from
> */
> >          uint16_t length; /* segment maximal data length,
> >                              configures split point */
> >          uint16_t offset; /* data offset from beginning
> >                              of mbuf data buffer */
> >          /**
> > 	 * Proto_hdr defines a bit mask of the protocol sequence as
> >           * RTE_PTYPE_*, configures split point. The last RTE_PTYPE*
> >           * in the mask indicates the split position.
> >           * If one protocol header is defined to split packets into two
> >           * segments, for non-tunneling packets, the complete protocol
> >           * sequence should be defined.
> >           * For tunneling packets, for simplicity,
> >           * only the tunnel and inner part of comple protocol sequence
> >           * is required.
> >           * If several protocol headers are defined to split packets into
> >           * multi-segments, the repeated parts of adjacent segments
> >           * should be omitted.
> > 	 */
> >          uint32_t proto_hdr;
> > };
> 
> Sorry, but I see no reason to repeat in the descrtion.
> What is the purpose of the duplication?

The intention of repeating it here is to make the commit log more comprehensive.
We can remove these lines to make the log cleaner.

> 
> >
> > If protocol header split can be supported by a PMD, the
> > rte_eth_buffer_split_get_supported_hdr_ptypes function can be use to
> > obtain a list of these protocol headers.
> >
> > For example, let's suppose we configured the Rx queue with the
> > following segments:
> >          seg0 - pool0, proto_hdr0=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
> >                 off0=2B
> >          seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
> >          seg2 - pool2, off1=0B
> >
> > The packet consists of ETH_IPV4_UDP_PAYLOAD will be split like
> > following:
> >          seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from
> pool0
> >          seg1 - udp header @ 128 in mbuf from pool1
> >          seg2 - payload @ 0 in mbuf from pool2
> >
> > Note: NIC will only do split when the packets exactly match all the
> > protocol headers in the segments. For example, if ARP packets received
> > with above config, the NIC won't do split for ARP packets since it
> > does not contains ipv4 header and udp header. These packets will be
> > put
> 
> ipv4 -> IPv4, udp -> UDP.
> 
> > into the last valid mempool, with zero offset.
> 
> What should happen if we have seg1 -> ETH, seg2 -> IPv4, seg3 - remaining
> and receive ARP? Will we see ETH header split in seg1 and everything else in
> the seg3? I would say yes.
> 
> May be we should define intended behavior using pseudo-code?

When the NIC receives such packets (like ARP), we think the expected split behavior is decided not by the library but by the driver itself.
A NIC may do the split in either an exact match or a longest match fashion.

Exact match means the NIC only splits when the packets exactly match all the protocol headers in the segments.
Otherwise, these packets won't be split and the whole packet will be put into the last valid mempool; that's what we defined.
Longest match means the NIC will split as long as the packets match some of the protocol headers.

Since both cases are possible, IMO both scenarios should be defined. The final result of the split will always be one of them.
Hope to get your insights.

Attached pseudo-code for two cases below:
Exact match:
FOR each seg in segs except last one
    IF proto_hdr is not matched THEN
        BREAK
    END IF
END FOR
IF loop broke THEN
    put whole pkt in last seg
ELSE
    put protocol header in each seg
    put everything else in last seg
END IF

Longest match:
FOR each seg in segs except last one
    IF proto_hdr is matched THEN
        put protocol header in seg
    ELSE
        BREAK
    END IF
END FOR
put everything else in last seg
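
For comparison, the two pseudo-code variants can be modelled in C as below. This is a simplified sketch, not driver code: it assumes one header per segment, represents headers with a made-up `enum ex_hdr` instead of RTE_PTYPE_* masks, and the hypothetical functions only return how many leading segments receive a header (the rest of the packet goes to the last segment).

```c
#include <assert.h>
#include <stddef.h>

/* Made-up header kinds, used only for this illustration. */
enum ex_hdr { EX_HDR_NONE, EX_HDR_ETH, EX_HDR_IPV4, EX_HDR_UDP, EX_HDR_ARP };

/*
 * Exact match: split only if every configured header matches.
 * Returns the number of leading segments that receive a header;
 * 0 means the whole packet goes to the last segment.
 */
static int
ex_split_exact(const enum ex_hdr *seg_hdrs, int n_seg,
	       const enum ex_hdr *pkt_hdrs, int n_pkt_hdrs)
{
	int i;

	assert(seg_hdrs != NULL && pkt_hdrs != NULL && n_seg >= 1);

	for (i = 0; i < n_seg - 1; i++)
		if (i >= n_pkt_hdrs || pkt_hdrs[i] != seg_hdrs[i])
			return 0; /* mismatch: no split at all */
	return n_seg - 1;
}

/*
 * Longest match: split off headers until the first mismatch.
 * Returns the number of leading segments that receive a header.
 */
static int
ex_split_longest(const enum ex_hdr *seg_hdrs, int n_seg,
		 const enum ex_hdr *pkt_hdrs, int n_pkt_hdrs)
{
	int i;

	assert(seg_hdrs != NULL && pkt_hdrs != NULL && n_seg >= 1);

	for (i = 0; i < n_seg - 1; i++)
		if (i >= n_pkt_hdrs || pkt_hdrs[i] != seg_hdrs[i])
			break; /* keep what matched so far */
	return i;
}
```

For the seg1 -> ETH, seg2 -> IPv4, seg3 - remaining case with an ARP packet, this model gives 0 for exact match (whole packet in the last segment) and 1 for longest match (ETH header split off, everything else in the last segment).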

> 
> >
> > Now buffer split can be configured in two modes. For length based
> > buffer split, the mp, length, offset field in Rx packet segment should
> > be configured, while the proto_hdr field will be ignored.
> > For protocol header based buffer split, the mp, offset, proto_hdr
> > field in Rx packet segment should be configured, while the length
> > field will be ignored.
> >
> > The split limitations imposed by underlying driver is reported in the
> > rte_eth_dev_info->rx_seg_capa field. The memory attributes for the
> > split parts may differ either, dpdk memory and external memory,
> respectively.
> >
> > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > Signed-off-by: Xuan Ding <xuan.ding@intel.com>
> > Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
> > ---
> >   doc/guides/rel_notes/release_22_11.rst |  4 ++
> >   lib/ethdev/rte_ethdev.c                | 89 ++++++++++++++++++++++----
> >   lib/ethdev/rte_ethdev.h                | 34 +++++++++-
> >   3 files changed, 115 insertions(+), 12 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_22_11.rst
> > b/doc/guides/rel_notes/release_22_11.rst
> > index 141fd9442b..4c3a7f8b8b 100644
> > --- a/doc/guides/rel_notes/release_22_11.rst
> > +++ b/doc/guides/rel_notes/release_22_11.rst
> > @@ -127,6 +127,10 @@ New Features
> >
> >     * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get
> supported
> >       header protocols of a PMD to split.
> > +  * Ethdev: The ``reserved`` field in the ``rte_eth_rxseg_split`` structure is
> > +    replaced with ``proto_hdr`` to support protocol header based buffer
> split.
> > +    User can choose length or protocol header to configure buffer split
> > +    according to NIC's capability.
> 
> It sounds like it should be mentioned in API change section as well. Here I'd
> concentrate on top level feature overview only.
> I.e. Supported protocol-based buffer split using added ``proto_hdr`` in
> structure ``rte_eth_rxseg_split``.

Thanks for your suggestion, please see next version.

> 
> >
> >
> >   Removed Items
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > ee3b490889..60fe6eb2bd 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -1650,14 +1650,18 @@ rte_eth_dev_is_removed(uint16_t port_id)
> >   }
> >
> >   static int
> > -rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
> > -			     uint16_t n_seg, uint32_t *mbp_buf_size,
> > -			     const struct rte_eth_dev_info *dev_info)
> > +rte_eth_rx_queue_check_split(uint16_t port_id,
> > +			const struct rte_eth_rxseg_split *rx_seg,
> > +			uint16_t n_seg, uint32_t *mbp_buf_size,
> > +			const struct rte_eth_dev_info *dev_info)
> >   {
> >   	const struct rte_eth_rxseg_capa *seg_capa = &dev_info-
> >rx_seg_capa;
> >   	struct rte_mempool *mp_first;
> >   	uint32_t offset_mask;
> >   	uint16_t seg_idx;
> > +	int ptype_cnt;
> > +	uint32_t *ptypes;
> > +	int i;
> >
> >   	if (n_seg > seg_capa->max_nseg) {
> >   		RTE_ETHDEV_LOG(ERR,
> > @@ -1675,6 +1679,7 @@ rte_eth_rx_queue_check_split(const struct
> rte_eth_rxseg_split *rx_seg,
> >   		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
> >   		uint32_t length = rx_seg[seg_idx].length;
> >   		uint32_t offset = rx_seg[seg_idx].offset;
> > +		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
> >
> >   		if (mpl == NULL) {
> >   			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
> @@ -1708,13
> > +1713,75 @@ rte_eth_rx_queue_check_split(const struct
> rte_eth_rxseg_split *rx_seg,
> >   		}
> >   		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
> >   		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> > -		length = length != 0 ? length : *mbp_buf_size;
> > -		if (*mbp_buf_size < length + offset) {
> > -			RTE_ETHDEV_LOG(ERR,
> > -				       "%s mbuf_data_room_size %u < %u
> (segment length=%u + segment offset=%u)\n",
> > -				       mpl->name, *mbp_buf_size,
> > -				       length + offset, length, offset);
> > -			return -EINVAL;
> > +
> > +		if (proto_hdr > 0) {
> 
> proto_hdr != 0, please. I know that it is the same, but != 0 raises a bit less
> question if the field is signed or unsigned.

Got it.

> 
> As the first condition here we should check if protocol-based split is
> supported at all (see note about separate helper function below).
> 
> > +			/* Split based on protocol headers. */
> > +			if (length != 0) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"Do not set length split and protocol
> split within a segment\n"
> > +					);
> > +				return -EINVAL;
> > +			}
> > +
> > +			if (seg_idx == n_seg - 1) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"The proto_hdr in the last segment
> should be 0\n"
> > +					);
> > +				return -EINVAL;
> > +			}
> 
> I think here we should check if we have seen any segment with proto_hdr ==
> 0 before. If so, we can't do protocol based split any more. Since we need to
> collect already split protcols (prev_proto_hdrs), I would use the variable as a
> marker and set it to all 1's MASK as soon as
> proto_hdr==0 met.
> 
> So, the condition will be
> if ((proto_hdr & prev_proto_hdrs) != 0)
> 
> So, it will check two since no repeat of previou protocol headers which are
> already split and no ptoto-split after length-based split.

Thanks for your suggestion.
The introduction of prev_proto_hdrs is a good idea, which helps to solve the two issues above.
We will adopt this implementation, please see next version.

> 
> > +
> > +			if (*mbp_buf_size < offset) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +						"%s
> mbuf_data_room_size %u < %u segment offset)\n",
> > +						mpl->name, *mbp_buf_size,
> > +						offset);
> > +				return -EINVAL;
> > +			}
> > +
> 
> (separate helper function starts here)
> 
> > +			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id, NULL, 0);
> 
> There is no point in doing it in a loop. It should be done outside. Moreover, it
> should be a helper function which does it to make this function short.

A new helper function eth_dev_buffer_split_get_supported_hdrs_helper() will be added in the next version.
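
For illustration, a helper of this kind could wrap the two-call pattern from the patch (query the count with a NULL array, allocate, then query again to fill). This is only a sketch under assumptions: ex_query_fn, ex_fake_query and ex_get_supported_hdrs are hypothetical names, the query is passed as a function pointer so the example builds without DPDK, and the stub's table values are made up.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Assumed shape of the query: returns the number of supported ptypes and,
 * when ptypes is non-NULL, fills at most num entries.
 */
typedef int (*ex_query_fn)(uint16_t port_id, uint32_t *ptypes, int num);

/* Stub standing in for a driver; the table contents are invented. */
static int
ex_fake_query(uint16_t port_id, uint32_t *ptypes, int num)
{
	static const uint32_t table[] = { 0x1, 0x3, 0x7 };
	const int total = (int)(sizeof(table) / sizeof(table[0]));
	int i;

	(void)port_id;
	for (i = 0; ptypes != NULL && i < num && i < total; i++)
		ptypes[i] = table[i];
	return total;
}

/*
 * Returns the number of supported ptypes and stores a malloc'ed array
 * in *out (the caller frees it), or a negative errno value on failure.
 */
static int
ex_get_supported_hdrs(ex_query_fn query, uint16_t port_id, uint32_t **out)
{
	uint32_t *ptypes;
	int cnt;

	/* First call: ask only for the count. */
	cnt = query(port_id, NULL, 0);
	if (cnt <= 0)
		return -EINVAL;

	ptypes = malloc(sizeof(*ptypes) * (size_t)cnt);
	if (ptypes == NULL)
		return -ENOMEM;

	/* Second call: fill the array. */
	cnt = query(port_id, ptypes, cnt);
	if (cnt <= 0) {
		free(ptypes);
		return -EINVAL;
	}

	*out = ptypes;
	return cnt;
}
```

Calling the helper once before the per-segment loop keeps the allocation and both queries out of the loop body, so the loop only has to compare against the returned array.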

> 
> > +			if (ptype_cnt <= 0) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"Port %u failed to supported buffer split header protocols\n",
> > +					port_id);
> > +				return -EINVAL;
> > +			}
> > +
> > +			ptypes = malloc(sizeof(uint32_t) * ptype_cnt);
> > +			if (ptypes == NULL)
> > +				return -ENOMEM;
> > +
> > +			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id,
> > +										ptypes, ptype_cnt);
> > +			if (ptype_cnt < 0) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"Port %u failed to supported buffer split header protocols\n",
> > +					port_id);
> > +				free(ptypes);
> > +				return -EINVAL;
> > +			}
> 
> (separate helper function ends here)
> 
> > +
> > +			for (i = 0; i < ptype_cnt; i++)
> > +				if (ptypes[i] == proto_hdr)
> 
> It should be if ((prev_proto_hdrs | proto_hdr) == ptypes[i])
> 
> > +					break;
> > +
> > +			free(ptypes);
> > +
> > +			if (i == ptype_cnt) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"Requested Rx split header protocols 0x%x is not supported.\n",
> > +					proto_hdr);
> > +				return -EINVAL;
> > +			}
> 
> prev_proto_hdrs |= proto_hdr;
> 
> > +		} else {
> 
> NOTE: If the driver does not support length-based split, it should reject such
> a configuration itself.

Here we have a question. From the code perspective, how can we know
whether length-based split is supported by the driver, so that such a configuration can be rejected?
We have the get_support_ptypes() API to learn whether a driver supports proto-based split, but there is no such API for length-based split.
Or are you referring to a doc update?

Thanks,
Xuan

> 
> > +			/* Split at fixed length. */
> > +			length = length != 0 ? length : *mbp_buf_size;
> > +			if (*mbp_buf_size < length + offset) {
> > +				RTE_ETHDEV_LOG(ERR,
> > +					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> > +					mpl->name, *mbp_buf_size,
> > +					length + offset, length, offset);
> > +				return -EINVAL;
> > +			}
> 
> prev_proto_hdrs = RTE_PTYPE_ALL_MASK;
> 
> >   		}
> >   	}
> >   	return 0;
> > @@ -1794,7 +1861,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
> >   		n_seg = rx_conf->rx_nseg;
> >
> >   		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
> > -			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
> > +			ret = rte_eth_rx_queue_check_split(port_id, rx_seg, n_seg,
> >   							   &mbp_buf_size,
> >   							   &dev_info);
> >   			if (ret != 0)
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index c51c1f3fa0..4c9b121355 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -994,6 +994,9 @@ struct rte_eth_txmode {
> >    *   specified in the first array element, the second buffer, from the
> >    *   pool in the second element, and so on.
> >    *
> > + * - The proto_hdrs in the elements define the split position of
> > + *   received packets.
> > + *
> >    * - The offsets from the segment description elements specify
> >    *   the data offset from the buffer beginning except the first mbuf.
> >    *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
> > @@ -1015,12 +1018,41 @@ struct rte_eth_txmode {
> >    *     - pool from the last valid element
> >    *     - the buffer size from this pool
> >    *     - zero offset
> > + *
> > + * - Length based buffer split:
> > + *     - mp, length, offset should be configured.
> > + *     - The proto_hdr field must be 0.
> > + *
> > + * - Protocol header based buffer split:
> > + *     - mp, offset, proto_hdr should be configured.
> > + *     - The length field must be 0.
> > + *     - The proto_hdr field in the last segment should be 0.
> > + *
> > + * - For Protocol header based buffer split, if the received packets
> > + *   don't exactly match all protocol headers in the elements, packets
> > + *   will not be split.
> > + *   These packets will be put into:
> > + *     - pool from the last valid element
> > + *     - the buffer size from this pool
> > + *     - zero offset
> >    */
> >   struct rte_eth_rxseg_split {
> >   	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
> >   	uint16_t length; /**< Segment data length, configures split point. */
> >   	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
> > -	uint32_t reserved; /**< Reserved field. */
> > +	/**
> > +	 * Proto_hdr defines a bit mask of the protocol sequence as RTE_PTYPE_*,
> > +	 * configures split point. The last RTE_PTYPE* in the mask indicates the
> > +	 * split position.
> > +	 *
> > +	 * If one protocol header is defined to split packets into two segments,
> > +	 * for non-tunneling packets, the complete protocol sequence should be defined.
> > +	 * For tunneling packets, for simplicity, only the tunnel and inner part of
> > +	 * comple protocol sequence is required.
> > +	 * If several protocol headers are defined to split packets into multi-segments,
> > +	 * the repeated parts of adjacent segments should be omitted.
> > +	 */
> > +	uint32_t proto_hdr;
> >   };
> >
> >   /**
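
The two configuration modes described in the struct comment can be sketched as follows. This is an illustration under assumptions, not code from the patch: struct ex_rxseg_split mirrors the field layout quoted above, struct ex_mempool stands in for struct rte_mempool, and the EX_PTYPE_* constants carry placeholder values. A real application would use the RTE_PTYPE_* definitions and struct rte_eth_rxseg_split from the DPDK headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values; real code would use the RTE_PTYPE_* definitions. */
#define EX_PTYPE_L2_ETHER UINT32_C(0x1)
#define EX_PTYPE_L3_IPV4  UINT32_C(0x2)
#define EX_PTYPE_L4_UDP   UINT32_C(0x4)

/* Stand-in for struct rte_mempool. */
struct ex_mempool {
	const char *name;
};

/* Local mirror of the segment description fields quoted above. */
struct ex_rxseg_split {
	struct ex_mempool *mp;
	uint16_t length;
	uint16_t offset;
	uint32_t proto_hdr;
};

/* Length based split: mp, length, offset configured; proto_hdr must be 0. */
static void
ex_fill_length_split(struct ex_rxseg_split segs[2], struct ex_mempool *hdr_pool,
		     struct ex_mempool *pay_pool, uint16_t hdr_len)
{
	segs[0] = (struct ex_rxseg_split){ hdr_pool, hdr_len, 0, 0 };
	/* Zero length: the buffer size of the pool is used. */
	segs[1] = (struct ex_rxseg_split){ pay_pool, 0, 0, 0 };
}

/*
 * Protocol header based split after ETH-IPV4-UDP: mp, offset, proto_hdr
 * configured; length must be 0 and the last segment's proto_hdr must be 0.
 */
static void
ex_fill_proto_split(struct ex_rxseg_split segs[2], struct ex_mempool *hdr_pool,
		    struct ex_mempool *pay_pool)
{
	segs[0] = (struct ex_rxseg_split){ hdr_pool, 0, 0,
		EX_PTYPE_L2_ETHER | EX_PTYPE_L3_IPV4 | EX_PTYPE_L4_UDP };
	segs[1] = (struct ex_rxseg_split){ pay_pool, 0, 0, 0 };
}
```

In both cases the array would then be handed to the Rx queue setup together with the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload; that step is omitted here because it needs a configured port.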

Patch

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 141fd9442b..4c3a7f8b8b 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -127,6 +127,10 @@  New Features
 
   * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
     header protocols of a PMD to split.
+  * Ethdev: The ``reserved`` field in the ``rte_eth_rxseg_split`` structure is
+    replaced with ``proto_hdr`` to support protocol header based buffer split.
+    User can choose length or protocol header to configure buffer split
+    according to NIC's capability.
 
 
 Removed Items
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index ee3b490889..60fe6eb2bd 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1650,14 +1650,18 @@  rte_eth_dev_is_removed(uint16_t port_id)
 }
 
 static int
-rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
-			     uint16_t n_seg, uint32_t *mbp_buf_size,
-			     const struct rte_eth_dev_info *dev_info)
+rte_eth_rx_queue_check_split(uint16_t port_id,
+			const struct rte_eth_rxseg_split *rx_seg,
+			uint16_t n_seg, uint32_t *mbp_buf_size,
+			const struct rte_eth_dev_info *dev_info)
 {
 	const struct rte_eth_rxseg_capa *seg_capa = &dev_info->rx_seg_capa;
 	struct rte_mempool *mp_first;
 	uint32_t offset_mask;
 	uint16_t seg_idx;
+	int ptype_cnt;
+	uint32_t *ptypes;
+	int i;
 
 	if (n_seg > seg_capa->max_nseg) {
 		RTE_ETHDEV_LOG(ERR,
@@ -1675,6 +1679,7 @@  rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 		struct rte_mempool *mpl = rx_seg[seg_idx].mp;
 		uint32_t length = rx_seg[seg_idx].length;
 		uint32_t offset = rx_seg[seg_idx].offset;
+		uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr;
 
 		if (mpl == NULL) {
 			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
@@ -1708,13 +1713,75 @@  rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
 		}
 		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
 		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
-		length = length != 0 ? length : *mbp_buf_size;
-		if (*mbp_buf_size < length + offset) {
-			RTE_ETHDEV_LOG(ERR,
-				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
-				       mpl->name, *mbp_buf_size,
-				       length + offset, length, offset);
-			return -EINVAL;
+
+		if (proto_hdr > 0) {
+			/* Split based on protocol headers. */
+			if (length != 0) {
+				RTE_ETHDEV_LOG(ERR,
+					"Do not set length split and protocol split within a segment\n"
+					);
+				return -EINVAL;
+			}
+
+			if (seg_idx == n_seg - 1) {
+				RTE_ETHDEV_LOG(ERR,
+					"The proto_hdr in the last segment should be 0\n"
+					);
+				return -EINVAL;
+			}
+
+			if (*mbp_buf_size < offset) {
+				RTE_ETHDEV_LOG(ERR,
+						"%s mbuf_data_room_size %u < %u segment offset)\n",
+						mpl->name, *mbp_buf_size,
+						offset);
+				return -EINVAL;
+			}
+
+			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id, NULL, 0);
+			if (ptype_cnt <= 0) {
+				RTE_ETHDEV_LOG(ERR,
+					"Port %u failed to supported buffer split header protocols\n",
+					port_id);
+				return -EINVAL;
+			}
+
+			ptypes = malloc(sizeof(uint32_t) * ptype_cnt);
+			if (ptypes == NULL)
+				return -ENOMEM;
+
+			ptype_cnt = rte_eth_buffer_split_get_supported_hdr_ptypes(port_id,
+										ptypes, ptype_cnt);
+			if (ptype_cnt < 0) {
+				RTE_ETHDEV_LOG(ERR,
+					"Port %u failed to supported buffer split header protocols\n",
+					port_id);
+				free(ptypes);
+				return -EINVAL;
+			}
+
+			for (i = 0; i < ptype_cnt; i++)
+				if (ptypes[i] == proto_hdr)
+					break;
+
+			free(ptypes);
+
+			if (i == ptype_cnt) {
+				RTE_ETHDEV_LOG(ERR,
+					"Requested Rx split header protocols 0x%x is not supported.\n",
+					proto_hdr);
+				return -EINVAL;
+			}
+		} else {
+			/* Split at fixed length. */
+			length = length != 0 ? length : *mbp_buf_size;
+			if (*mbp_buf_size < length + offset) {
+				RTE_ETHDEV_LOG(ERR,
+					"%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
+					mpl->name, *mbp_buf_size,
+					length + offset, length, offset);
+				return -EINVAL;
+			}
 		}
 	}
 	return 0;
@@ -1794,7 +1861,7 @@  rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		n_seg = rx_conf->rx_nseg;
 
 		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
-			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
+			ret = rte_eth_rx_queue_check_split(port_id, rx_seg, n_seg,
 							   &mbp_buf_size,
 							   &dev_info);
 			if (ret != 0)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c51c1f3fa0..4c9b121355 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -994,6 +994,9 @@  struct rte_eth_txmode {
  *   specified in the first array element, the second buffer, from the
  *   pool in the second element, and so on.
  *
+ * - The proto_hdrs in the elements define the split position of
+ *   received packets.
+ *
  * - The offsets from the segment description elements specify
  *   the data offset from the buffer beginning except the first mbuf.
  *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
@@ -1015,12 +1018,41 @@  struct rte_eth_txmode {
  *     - pool from the last valid element
  *     - the buffer size from this pool
  *     - zero offset
+ *
+ * - Length based buffer split:
+ *     - mp, length, offset should be configured.
+ *     - The proto_hdr field must be 0.
+ *
+ * - Protocol header based buffer split:
+ *     - mp, offset, proto_hdr should be configured.
+ *     - The length field must be 0.
+ *     - The proto_hdr field in the last segment should be 0.
+ *
+ * - For Protocol header based buffer split, if the received packets
+ *   don't exactly match all protocol headers in the elements, packets
+ *   will not be split.
+ *   These packets will be put into:
+ *     - pool from the last valid element
+ *     - the buffer size from this pool
+ *     - zero offset
  */
 struct rte_eth_rxseg_split {
 	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
 	uint16_t length; /**< Segment data length, configures split point. */
 	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
-	uint32_t reserved; /**< Reserved field. */
+	/**
+	 * Proto_hdr defines a bit mask of the protocol sequence as RTE_PTYPE_*,
+	 * configures split point. The last RTE_PTYPE* in the mask indicates the
+	 * split position.
+	 *
+	 * If one protocol header is defined to split packets into two segments,
+	 * for non-tunneling packets, the complete protocol sequence should be defined.
+	 * For tunneling packets, for simplicity, only the tunnel and inner part of
+	 * comple protocol sequence is required.
+	 * If several protocol headers are defined to split packets into multi-segments,
+	 * the repeated parts of adjacent segments should be omitted.
+	 */
+	uint32_t proto_hdr;
 };
 
 /**