[dpdk-dev,RFC,1/2] net/tap: calculate checksum for multi segs packets
Commit Message
In the past TAP implementation, checksum offload calculations (for
IP/UDP/TCP) were skipped for multi-segment packets.
This commit improves TAP functionality by enabling checksum calculations
in the multi-segment case.
The only remaining restriction is that the first segment must contain
all headers of layers 2, 3 and 4 (where the layer 4 header size is taken
as that of TCP).
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
---
drivers/net/tap/rte_eth_tap.c | 42 ++++++++++++++++++++++++++++++++----------
1 file changed, 32 insertions(+), 10 deletions(-)
Comments
This patch implements TAP TSO (TCP Segmentation Offload) in SW.
It uses the DPDK library librte_gso.
The librte_gso library segments large TCP payloads (e.g. 64K bytes)
into smaller-sized buffers.
By supporting the TSO offload capability in software, a TAP device can
be used as a fail-safe sub-device and paired with another PCI device
which supports TSO capability in HW.
This patch series includes 2 commits:
1. Calculation of IP/TCP/UDP checksums for multi-segment packets.
Previously checksum offload was skipped if the number of packet segments
was greater than 1.
This commit removes that limitation. It is required before supporting TAP TSO,
since a generated TCP TSO packet may be composed of two segments, where the first
segment includes all headers up to layer 4 with their calculated checksums
(this is the librte_gso way of building TCP segments).
2. TAP TSO implementation: calling rte_gso_segment() to segment large TCP packets.
This commit creates a small private mbuf pool in the TAP PMD, required by librte_gso.
The number of buffers will be 64, each of 128 bytes length.
Ophir Munk (2):
net/tap: calculate checksums of multi segs packets
net/tap: support TSO (TCP Segment Offload)
drivers/net/tap/Makefile | 2 +-
drivers/net/tap/rte_eth_tap.c | 205 ++++++++++++++++++++++++++++++++++--------
drivers/net/tap/rte_eth_tap.h | 4 +
mk/rte.app.mk | 4 +-
4 files changed, 174 insertions(+), 41 deletions(-)
@@ -496,6 +496,9 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
char m_copy[mbuf->data_len];
int n;
int j;
+ int k; /* first index in iovecs for copying segments */
+ uint16_t l234_len; /* length of layers 2,3,4 headers */
+ uint16_t seg_len; /* length of first segment */
/* stats.errs will be incremented */
if (rte_pktmbuf_pkt_len(mbuf) > max_size)
@@ -503,25 +506,44 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
iovecs[0].iov_base = &pi;
iovecs[0].iov_len = sizeof(pi);
- for (j = 1; j <= mbuf->nb_segs; j++) {
- iovecs[j].iov_len = rte_pktmbuf_data_len(seg);
- iovecs[j].iov_base =
- rte_pktmbuf_mtod(seg, void *);
- seg = seg->next;
- }
+ k = 1;
if (txq->csum &&
((mbuf->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4) ||
(mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_UDP_CKSUM ||
(mbuf->ol_flags & PKT_TX_L4_MASK) == PKT_TX_TCP_CKSUM))) {
- /* Support only packets with all data in the same seg */
- if (mbuf->nb_segs > 1)
+ /* Only support packets with at least layer 4
+ * header included in the first segment
+ */
+ seg_len = rte_pktmbuf_data_len(mbuf);
+ l234_len = mbuf->l2_len + mbuf->l3_len +
+ sizeof(struct tcp_hdr);
+ if (seg_len < l234_len)
break;
- /* To change checksums, work on a copy of data. */
+
+ /* To change checksums, work on a
+ * copy of the l2, l3, l4 headers.
+ */
rte_memcpy(m_copy, rte_pktmbuf_mtod(mbuf, void *),
- rte_pktmbuf_data_len(mbuf));
+ l234_len);
tap_tx_offload(m_copy, mbuf->ol_flags,
mbuf->l2_len, mbuf->l3_len);
iovecs[1].iov_base = m_copy;
+ iovecs[1].iov_len = l234_len;
+ k++;
+ /* Adjust data pointer beyond l2, l3, l4 headers.
+ * If this segment becomes empty - skip it
+ */
+ if (seg_len > l234_len) {
+ rte_pktmbuf_adj(mbuf, l234_len);
+ } else {
+ seg = seg->next;
+ mbuf->nb_segs--;
+ }
+ }
+ for (j = k; j <= mbuf->nb_segs; j++) {
+ iovecs[j].iov_len = rte_pktmbuf_data_len(seg);
+ iovecs[j].iov_base = rte_pktmbuf_mtod(seg, void *);
+ seg = seg->next;
}
/* copy the tx frame data */
n = writev(txq->fd, iovecs, mbuf->nb_segs + 1);