[dpdk-dev,RFC,1/2] net/tap: calculate checksum for multi segs packets

Message ID: 1520629826-23055-2-git-send-email-ophirmu@mellanox.com
State: Superseded, archived
Delegated to: Ferruh Yigit

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Ophir Munk March 9, 2018, 9:10 p.m. UTC
Previously, the TAP PMD skipped checksum offload calculations (for
IP/UDP/TCP) when a packet consisted of multiple segments.
This commit enables checksum calculation for multi-segment packets.
The only remaining restriction is that the first segment must contain
all layer 2, 3 and 4 headers (where the layer 4 header size is taken as
that of TCP).

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
---
 drivers/net/tap/rte_eth_tap.c | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)
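
The first-segment restriction from the commit message, and the iovec layout that follows from it, can be sketched in standalone C roughly as below. This is an illustration only, not the driver code: struct seg, hdrs_in_first_seg() and build_iov() are hypothetical names, the 20-byte constant stands in for sizeof(struct tcp_hdr), and unlike the patch the sketch leaves the segment chain unmodified instead of adjusting the mbuf.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical stand-in for one mbuf segment; NOT struct rte_mbuf. */
struct seg {
	uint8_t *data;
	uint16_t len;
	struct seg *next;
};

/* Assumption: TCP header without options, as sizeof(struct tcp_hdr). */
#define TCP_HDR_MIN_LEN 20

/* True when the first segment holds the L2, L3 and TCP-sized L4 headers. */
static bool
hdrs_in_first_seg(uint16_t seg_len, uint16_t l2_len, uint16_t l3_len)
{
	return seg_len >= (uint32_t)l2_len + l3_len + TCP_HDR_MIN_LEN;
}

/*
 * Lay out iov[] for a chain whose first hdr_len bytes were copied to
 * hdr_copy (so checksums can be patched without touching the original).
 * iov[0] is reserved for the caller (the packet-info header in the patch).
 * Returns the number of entries used, including iov[0], or -1 when the
 * headers do not fit in the first segment or iov[] is too small.
 */
static int
build_iov(struct iovec *iov, int iov_max, const struct seg *first,
	  uint8_t *hdr_copy, uint16_t hdr_len)
{
	const struct seg *s;
	int n = 1;

	if (first == NULL || first->len < hdr_len || iov_max < 2)
		return -1;

	/* Entry 1: the private copy of the headers. */
	iov[n].iov_base = hdr_copy;
	iov[n].iov_len = hdr_len;
	n++;

	/* Remainder of the first segment; skipped when it becomes empty. */
	if (first->len > hdr_len) {
		if (n >= iov_max)
			return -1;
		iov[n].iov_base = first->data + hdr_len;
		iov[n].iov_len = (size_t)(first->len - hdr_len);
		n++;
	}

	/* All following segments are passed through unchanged. */
	for (s = first->next; s != NULL; s = s->next) {
		if (n >= iov_max)
			return -1;
		iov[n].iov_base = s->data;
		iov[n].iov_len = s->len;
		n++;
	}
	return n;
}
```

With a 60-byte first segment, a 100-byte second segment and 54 bytes of headers (14 + 20 + 20), build_iov() would use four entries: the reserved slot, the header copy, the 6 remaining bytes of the first segment, and the second segment. If the first segment is exactly 54 bytes, the empty remainder is skipped and three entries are used.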
  

Comments

Ophir Munk April 9, 2018, 10:33 p.m. UTC | #1
This patch implements TAP TSO (TCP segmentation offload) in software.
It uses the DPDK library librte_gso, which segments large TCP payloads
(e.g. 64K bytes) into smaller buffers.
By supporting the TSO offload capability in software, a TAP device can be
used as a failsafe sub-device and paired with another PCI device that
supports TSO in hardware.

This patch includes 2 commits:
1. Calculation of IP/TCP/UDP checksums for multi-segment packets.
Previously, checksum offload was skipped if the number of packet segments
was greater than 1.
This commit removes that limitation. It is required before supporting TAP
TSO, since a generated TCP TSO packet may be composed of two segments, where
the first segment includes all headers up to layer 4 with their calculated
checksums (this is how librte_gso builds TCP segments).
2. TAP TSO implementation: calling rte_gso_segment() to segment large TCP
packets.
This commit creates a small private mbuf pool in the TAP PMD, as required by
librte_gso. The pool will hold 64 buffers of 128 bytes each.
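
As background for commit 1, the sketch below shows how the standard Internet checksum (RFC 1071 one's-complement sum) can be accumulated over data that is split across several non-contiguous segments, including splits at odd byte offsets. It is generic arithmetic under stated assumptions, not the code in tap_tx_offload(): struct buf_seg and cksum_segs() are hypothetical names, and the result is returned as a host-order value of the big-endian checksum word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one data segment; NOT struct rte_mbuf. */
struct buf_seg {
	const uint8_t *data;
	size_t len;
};

/*
 * One's-complement checksum over a list of segments, treated as one
 * contiguous byte stream of big-endian 16-bit words. A trailing odd
 * byte is implicitly padded with a zero low byte.
 */
static uint16_t
cksum_segs(const struct buf_seg *segs, size_t nb_segs)
{
	uint32_t sum = 0;
	int low_half = 0; /* 1 when the next byte is the low byte of a word */
	size_t i, j;

	for (i = 0; i < nb_segs; i++) {
		for (j = 0; j < segs[i].len; j++) {
			uint8_t b = segs[i].data[j];

			/* The byte position carries across segment borders. */
			sum += low_half ? b : (uint32_t)b << 8;
			low_half = !low_half;
			/* Fold the carry back in so sum stays small. */
			sum = (sum & 0xffff) + (sum >> 16);
		}
	}
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

For the RFC 1071 example bytes 00 01 f2 03 f4 f5 f6 f7, the function yields 0x220d whether the bytes are passed as one segment or split as 3 + 5 bytes, which is the property a multi-segment checksum needs.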

Ophir Munk (2):
  net/tap: calculate checksums of multi segs packets
  net/tap: support TSO (TCP Segment Offload)

 drivers/net/tap/Makefile      |   2 +-
 drivers/net/tap/rte_eth_tap.c | 205 ++++++++++++++++++++++++++++++++++--------
 drivers/net/tap/rte_eth_tap.h |   4 +
 mk/rte.app.mk                 |   4 +-
 4 files changed, 174 insertions(+), 41 deletions(-)
  

Patch

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index f09db0e..f312084 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -496,6 +496,9 @@  pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		char m_copy[mbuf->data_len];
 		int n;
 		int j;
+		int k; /* first index in iovecs for copying segments */
+		uint16_t l234_len; /* length of layers 2,3,4 headers */
+		uint16_t seg_len; /* length of first segment */
 
 		/* stats.errs will be incremented */
 		if (rte_pktmbuf_pkt_len(mbuf) > max_size)
@@ -503,25 +506,44 @@  pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 
 		iovecs[0].iov_base = &pi;
 		iovecs[0].iov_len = sizeof(pi);
-		for (j = 1; j <= mbuf->nb_segs; j++) {
-			iovecs[j].iov_len = rte_pktmbuf_data_len(seg);
-			iovecs[j].iov_base =
-				rte_pktmbuf_mtod(seg, void *);
-			seg = seg->next;
-		}
+		k = 1;
 		if (txq->csum &&
 		    ((mbuf->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4) ||
 		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM ||
 		     (mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM))) {
-			/* Support only packets with all data in the same seg */
-			if (mbuf->nb_segs > 1)
+			/* Only support packets with at least layer 4
+			 * header included in the first segment
+			 */
+			seg_len = rte_pktmbuf_data_len(mbuf);
+			l234_len = mbuf->l2_len + mbuf->l3_len +
+				sizeof(struct tcp_hdr);
+			if (seg_len < l234_len)
 				break;
-			/* To change checksums, work on a copy of data. */
+
+			/* To change checksums, work on a
+			 * copy of l2, l3 l4 headers.
+			 */
 			rte_memcpy(m_copy, rte_pktmbuf_mtod(mbuf, void *),
-				   rte_pktmbuf_data_len(mbuf));
+					l234_len);
 			tap_tx_offload(m_copy, mbuf->ol_flags,
 				       mbuf->l2_len, mbuf->l3_len);
 			iovecs[1].iov_base = m_copy;
+			iovecs[1].iov_len = l234_len;
+			k++;
+			/* Adjust data pointer beyond l2, l3, l4 headers.
+			 * If this segment becomes empty - skip it
+			 */
+			if (seg_len > l234_len) {
+				rte_pktmbuf_adj(mbuf, l234_len);
+			} else {
+				seg = seg->next;
+				mbuf->nb_segs--;
+			}
+		}
+		for (j = k; j <= mbuf->nb_segs; j++) {
+			iovecs[j].iov_len = rte_pktmbuf_data_len(seg);
+			iovecs[j].iov_base = rte_pktmbuf_mtod(seg, void *);
+			seg = seg->next;
 		}
 		/* copy the tx frame data */
 		n = writev(txq->fd, iovecs, mbuf->nb_segs + 1);