[dpdk-dev,v2] pmd: Add generic support for TCP TSO (Transmit Segmentation Offload)

Message ID 20141021112946.30069.67329.stgit@gklab-18-011.igk.intel.com (mailing list archive)
State Superseded, archived

Commit Message

miroslaw.walukiewicz@intel.com Oct. 21, 2014, 11:29 a.m. UTC
  From: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>

The NICs supported by DPDK can accelerate TCP traffic with
segmentation offload. The application prepares a packet of up to
64K with a valid TCP header, and the NIC segments it, generating
valid TCP segments and checksums.

The patch defines generic support for TSO.
- Add new PKT_TX_TCP_SEG flag.
  Only packets with this flag set in ol_flags will be handled as
  TSO packets.

- Add new mbuf fields indicating the TCP TSO segment size and TCP header
  length. TSO requires the application to set the following mbuf fields:
  1. L2 header len, including MAC/VLANs/SNAP if present
  2. L3 header len, including IP options
  3. L4 header len (new field), including TCP options
  4. tso_segsz (new field), the size of a TCP segment
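
As a rough illustration of the fields listed above, the following sketch
fills them in for one packet. The struct is a hypothetical stand-in for
the TX offload part of struct rte_mbuf (only the field names, their order
and the flag value come from the patch); request_tso() is an invented
helper, not a DPDK function.

```c
#include <stdint.h>

/* Flag value taken from the patch (bit 49 of ol_flags). */
#define PKT_TX_TCP_SEG (1ULL << 49)

/* Hypothetical stand-in for the TX offload part of struct rte_mbuf.
 * Field names and order follow the patch; everything else is omitted. */
struct tx_offload_sketch {
	uint64_t ol_flags;
	union {
		uint64_t l2_l3_l4_tso_seg; /* combined for easy fetch */
		struct {
			uint16_t l3_len;    /* L3 (IP) header, with options */
			uint16_t l2_len;    /* L2 header, with VLANs/SNAP */
			uint16_t l4_len;    /* TCP header, with options */
			uint16_t tso_segsz; /* TCP TSO segment size */
		};
	};
};

/* Hypothetical helper: record header lengths and segment size,
 * then mark the packet as a TSO packet. */
static void
request_tso(struct tx_offload_sketch *m, uint16_t l2_len, uint16_t l3_len,
	    uint16_t l4_len, uint16_t segsz)
{
	m->l2_l3_l4_tso_seg = 0;   /* clear all four lengths at once */
	m->l2_len = l2_len;
	m->l3_len = l3_len;
	m->l4_len = l4_len;
	m->tso_segsz = segsz;
	m->ol_flags |= PKT_TX_TCP_SEG;
}
```

For an untagged Ethernet/IPv4/TCP packet without options, a caller
would pass 14, 20 and 20 as the header lengths, plus the desired
segment size (for example 1460).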

The application must compute the pseudo-header checksum instead of
the full TCP checksum and put it in the TCP header csum field.
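
A minimal sketch of that pseudo-header sum for IPv4/TCP is shown below.
The function name is hypothetical. It assumes addresses and length are
passed in host byte order and that the folded sum is stored without the
final one's-complement inversion, so the hardware can continue the sum
over the TCP header and payload. Whether the TCP length belongs in the
sum for TSO packets is device-dependent and is left to the caller here.

```c
#include <stdint.h>

/* Hypothetical helper: one's-complement sum over the IPv4 TCP
 * pseudo header (src, dst, zero byte, protocol 6, TCP length).
 * Inputs and result are in host byte order; the caller converts the
 * result to network byte order before writing it to the csum field. */
static uint16_t
ipv4_tcp_pseudo_sum(uint32_t src, uint32_t dst, uint16_t tcp_len)
{
	uint32_t sum = 0;

	sum += (src >> 16) + (src & 0xffff);
	sum += (dst >> 16) + (dst & 0xffff);
	sum += 6;        /* zero byte followed by IPPROTO_TCP */
	sum += tcp_len;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```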

To hide the complexity of building the combined l2_l3_len value,
a new macro RTE_MBUF_TO_L2_L3_LEN() is defined to retrieve this
part of rte_mbuf.
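
To make the packing concrete, the sketch below reproduces the macro from
the patch next to a hypothetical 16-bit variant. The variant assumes the
consumer still expects the layout of the removed bitfields (l3_len in the
low 9 bits, l2_len in the high 7 bits, as a typical little-endian compiler
lays them out). The struct and pack_l2_l3_legacy() are illustrative names,
not part of DPDK.

```c
#include <stdint.h>

/* Hypothetical mbuf subset holding the split 16-bit fields. */
struct lens_sketch {
	uint16_t l3_len;
	uint16_t l2_len;
};

/* Macro as defined in the patch: l2_len in bits 16..31, l3_len in 0..15. */
#define RTE_MBUF_TO_L2_L3_LEN(mb) (uint32_t)(((mb)->l2_len << 16) | (mb)->l3_len)

/* Hypothetical alternative for a 16-bit destination that keeps the
 * former 7-bit l2_len / 9-bit l3_len layout (an assumption). */
static inline uint16_t
pack_l2_l3_legacy(const struct lens_sketch *mb)
{
	return (uint16_t)(((mb->l2_len & 0x7f) << 9) | (mb->l3_len & 0x1ff));
}
```

With l2_len = 14 and l3_len = 20 the macro yields 0xE0014. If that value
is stored in a 16-bit field, only the low half (20, i.e. l3_len) is kept,
so the width of the destination is worth checking in each driver.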

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
---
 app/test-pmd/testpmd.c            |    3 ++-
 lib/librte_mbuf/rte_mbuf.h        |   27 +++++++++++++++++++++------
 lib/librte_pmd_e1000/igb_rxtx.c   |    2 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |    2 +-
 4 files changed, 25 insertions(+), 9 deletions(-)
  

Comments

Olivier Matz Oct. 31, 2014, 3:49 p.m. UTC | #1
Hello Miroslaw,

The patch you submitted does not include any changes in a driver
(let's say ixgbe as it is the reference) taking advantage of it.
So it's difficult to validate that your modifications are sufficient
to implement TSO.

In addition to a driver, I think that testpmd should be modified
to validate your changes and give an example to people wanting to use
this feature.

Based on your patch, I'll try to submit a series in the coming days
(maybe today if the winds are favourable) that includes the remaining
patches from the original TSO series [1] that were not applied by
Bruce's mbuf rework.

Regards,
Olivier

[1] http://dpdk.org/ml/archives/dev/2014-May/002537.html
  
miroslaw.walukiewicz@intel.com Nov. 3, 2014, 1:14 p.m. UTC | #2
Hello Olivier, 

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Friday, October 31, 2014 4:50 PM
> To: Walukiewicz, Miroslaw; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] pmd: Add generic support for TCP TSO
> (Transmit Segmentation Offload)
>
> Hello Miroslaw,
>
> The patch you submitted does not include any changes in a driver
> (let's say ixgbe as it is the reference) taking advantage of it.
> So it's difficult to validate that your modifications are sufficient
> to implement TSO.

I wanted to agree on the generic TSO API first and then make the driver
modifications in follow-up patches.

> In addition to a driver, I think that testpmd should be modified
> to validate your changes and give an example to people wanting to use
> this feature.
>
> Based on your patch, I'll try to submit a series in the coming days
> (maybe today if the winds are favourable) that includes the remaining
> patches from the original TSO series [1] that were not applied by
> Bruce's mbuf rework.

Thank you Olivier,
I will follow your patches with my i40e TSO changes once the generic API
is accepted by the community.

> Regards,
> Olivier
>
> [1] http://dpdk.org/ml/archives/dev/2014-May/002537.html
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index f76406f..d8fd025 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -408,7 +408,8 @@  testpmd_mbuf_ctor(struct rte_mempool *mp,
 	mb->ol_flags     = 0;
 	mb->data_off     = RTE_PKTMBUF_HEADROOM;
 	mb->nb_segs      = 1;
-	mb->l2_l3_len       = 0;
+	mb->l2_len       = 0;
+	mb->l3_len       = 0;
 	mb->vlan_tci     = 0;
 	mb->hash.rss     = 0;
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..2e2e315 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -114,6 +114,9 @@  extern "C" {
 /* Bit 51 - IEEE1588*/
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */
 
+/* Bit 49 - TCP transmit segmentation offload */
+#define PKT_TX_TCP_SEG (1ULL << 49) /**< TX TSO offload */
+ 
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG       (1ULL << 63) /**< Mbuf contains control data */
 
@@ -189,16 +192,28 @@  struct rte_mbuf {
 	struct rte_mbuf *next;    /**< Next segment of scattered packet. */
 
 	/* fields to support TX offloads */
-	union {
-		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
+	/* two bytes - l2 len (including MAC/VLANs/SNAP if present)
+ 	 * two bytes - l3 len (including IP options)
+	 * two bytes - l4 len TCP/UDP header len - including TCP options
+	 * two bytes - TCP tso segment size
+ 	 */
+	union{
+		uint64_t l2_l3_l4_tso_seg; /**< combined for easy fetch */
 		struct {
-			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
-			uint16_t l2_len:7;      /**< L2 (MAC) Header Length. */
+			uint16_t l3_len; /**< L3 (IP) Header */
+			uint16_t l2_len; /**< L2 (MAC) Header */
+			uint16_t l4_len; /**< TCP/UDP header len */
+			uint16_t tso_segsz; /**< TCP TSO segment size */
 		};
 	};
 } __rte_cache_aligned;
 
 /**
+ * Given the rte_mbuf returns the l2_l3_len combined
+ */
+#define RTE_MBUF_TO_L2_L3_LEN(mb) (uint32_t)(((mb)->l2_len << 16) | (mb)->l3_len)
+
+/**
  * Given the buf_addr returns the pointer to corresponding mbuf.
  */
 #define RTE_MBUF_FROM_BADDR(ba)     (((struct rte_mbuf *)(ba)) - 1)
@@ -545,7 +560,7 @@  static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 {
 	m->next = NULL;
 	m->pkt_len = 0;
-	m->l2_l3_len = 0;
+	m->l2_l3_l4_tso_seg = 0;
 	m->vlan_tci = 0;
 	m->nb_segs = 1;
 	m->port = 0xff;
@@ -613,7 +628,7 @@  static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
 	mi->data_len = md->data_len;
 	mi->port = md->port;
 	mi->vlan_tci = md->vlan_tci;
-	mi->l2_l3_len = md->l2_l3_len;
+	mi->l2_l3_l4_tso_seg = md->l2_l3_l4_tso_seg;
 	mi->hash = md->hash;
 
 	mi->next = NULL;
diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index f09c525..0f3248e 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -399,7 +399,7 @@  eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 
 		ol_flags = tx_pkt->ol_flags;
 		vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
-		vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
+		vlan_macip_lens.f.l2_l3_len = RTE_MBUF_TO_L2_L3_LEN(tx_pkt);
 		tx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;
 
 		/* If a Context Descriptor need be built . */
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 123b8b3..5e92224 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -583,7 +583,7 @@  ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		tx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;
 		if (tx_ol_req) {
 			vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
-			vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
+			vlan_macip_lens.f.l2_l3_len = RTE_MBUF_TO_L2_L3_LEN(tx_pkt);
 
 			/* If new context need be built or reuse the exist ctx. */
 			ctx = what_advctx_update(txq, tx_ol_req,