From patchwork Fri Sep 10 18:18:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stephen Hemminger
X-Patchwork-Id: 98673
X-Patchwork-Delegate: thomas@monjalon.net
From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Date: Fri, 10 Sep 2021 11:18:31 -0700
Message-Id: <20210910181841.530280-2-stephen@networkplumber.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org>
References: <20210903004732.109023-1-stephen@networkplumber.org>
 <20210910181841.530280-1-stephen@networkplumber.org>
Subject: [dpdk-dev] [PATCH v7 01/11] librte_pcapng: add new library for
 writing pcapng files
List-Id: DPDK patches and discussions

This is a utility library for writing pcapng format files, as used by
the Wireshark family of utilities. Older tcpdump also knows how to
read (but not write) this format.

See draft RFC
  https://www.ietf.org/id/draft-tuexen-opsawg-pcapng-03.html
and
  https://github.com/pcapng/pcapng/

Signed-off-by: Stephen Hemminger
---
 lib/meson.build           |   1 +
 lib/pcapng/meson.build    |   8 +
 lib/pcapng/pcapng_proto.h | 129 +++++++++
 lib/pcapng/rte_pcapng.c   | 574 ++++++++++++++++++++++++++++++++++++++
 lib/pcapng/rte_pcapng.h   | 194 +++++++++++++
 lib/pcapng/version.map    |  12 +
 6 files changed, 918 insertions(+)
 create mode 100644 lib/pcapng/meson.build
 create mode 100644 lib/pcapng/pcapng_proto.h
 create mode 100644 lib/pcapng/rte_pcapng.c
 create mode 100644 lib/pcapng/rte_pcapng.h
 create mode 100644 lib/pcapng/version.map

diff --git a/lib/meson.build b/lib/meson.build
index 1673ca4323c0..51bf9c2d11f0 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -41,6 +41,7 @@ libraries = [
         'latencystats',
         'lpm',
         'member',
+        'pcapng',
         'power',
         'pdump',
         'rawdev',
diff --git a/lib/pcapng/meson.build b/lib/pcapng/meson.build
new file mode 100644
index 000000000000..fe636bdf3c0b
--- /dev/null
+++ b/lib/pcapng/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Microsoft Corporation
+
+version = 1
+sources = files('rte_pcapng.c')
+headers = files('rte_pcapng.h')
+
+deps += ['ethdev']
diff --git a/lib/pcapng/pcapng_proto.h b/lib/pcapng/pcapng_proto.h
new file mode 100644
index 000000000000..47161d8a1213
--- /dev/null
+++ b/lib/pcapng/pcapng_proto.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019-2020 Microsoft Corporation
+ *
+ * PCAP Next Generation Capture File writer
+ *
+ * See: https://github.com/pcapng/pcapng/ for the file format.
+ */
+
+enum pcapng_block_types {
+	PCAPNG_INTERFACE_BLOCK = 1,
+	PCAPNG_PACKET_BLOCK,	/* Obsolete */
+	PCAPNG_SIMPLE_PACKET_BLOCK,
+	PCAPNG_NAME_RESOLUTION_BLOCK,
+	PCAPNG_INTERFACE_STATS_BLOCK,
+	PCAPNG_ENHANCED_PACKET_BLOCK,
+
+	PCAPNG_SECTION_BLOCK = 0x0A0D0D0A,
+};
+
+struct pcapng_option {
+	uint16_t code;
+	uint16_t length;
+	uint8_t data[];
+};
+
+#define PCAPNG_BYTE_ORDER_MAGIC 0x1A2B3C4D
+#define PCAPNG_MAJOR_VERS 1
+#define PCAPNG_MINOR_VERS 0
+
+enum pcapng_opt {
+	PCAPNG_OPT_END = 0,
+	PCAPNG_OPT_COMMENT = 1,
+};
+
+struct pcapng_section_header {
+	uint32_t block_type;
+	uint32_t block_length;
+	uint32_t byte_order_magic;
+	uint16_t major_version;
+	uint16_t minor_version;
+	uint64_t section_length;
+};
+
+enum pcapng_section_opt {
+	PCAPNG_SHB_HARDWARE = 2,
+	PCAPNG_SHB_OS = 3,
+	PCAPNG_SHB_USERAPPL = 4,
+};
+
+struct pcapng_interface_block {
+	uint32_t block_type;	/* 1 */
+	uint32_t block_length;
+	uint16_t link_type;
+	uint16_t reserved;
+	uint32_t snap_len;
+};
+
+enum pcapng_interface_options {
+	PCAPNG_IFB_NAME = 2,
+	PCAPNG_IFB_DESCRIPTION,
+	PCAPNG_IFB_IPV4ADDR,
+	PCAPNG_IFB_IPV6ADDR,
+	PCAPNG_IFB_MACADDR,
+	PCAPNG_IFB_EUIADDR,
+	PCAPNG_IFB_SPEED,
+	PCAPNG_IFB_TSRESOL,
+	PCAPNG_IFB_TZONE,
+	PCAPNG_IFB_FILTER,
+	PCAPNG_IFB_OS,
+	PCAPNG_IFB_FCSLEN,
+	PCAPNG_IFB_TSOFFSET,
+	PCAPNG_IFB_HARDWARE,
+};
+
+struct pcapng_enhance_packet_block {
+	uint32_t block_type;	/* 6 */
+	uint32_t block_length;
+	uint32_t interface_id;
+	uint32_t timestamp_hi;
+	uint32_t timestamp_lo;
+	uint32_t capture_length;
+	uint32_t original_length;
+};
+
+/* Flags values */
+#define PCAPNG_IFB_INBOUND   0b01
+#define PCAPNG_IFB_OUTBOUND  0b10
+
+enum pcapng_epb_options {
+	PCAPNG_EPB_FLAGS = 2,
+	PCAPNG_EPB_HASH,
+	PCAPNG_EPB_DROPCOUNT,
+	PCAPNG_EPB_PACKETID,
+	PCAPNG_EPB_QUEUE,
+	PCAPNG_EPB_VERDICT,
+};
+
+enum pcapng_epb_hash {
+	PCAPNG_HASH_2COMP = 0,
+	PCAPNG_HASH_XOR,
+	PCAPNG_HASH_CRC32,
+	PCAPNG_HASH_MD5,
+	PCAPNG_HASH_SHA1,
+	PCAPNG_HASH_TOEPLITZ,
+};
+
+struct pcapng_simple_packet {
+	uint32_t block_type;	/* 3 */
+	uint32_t block_length;
+	uint32_t packet_length;
+};
+
+struct pcapng_statistics {
+	uint32_t block_type;	/* 5 */
+	uint32_t block_length;
+	uint32_t interface_id;
+	uint32_t timestamp_hi;
+	uint32_t timestamp_lo;
+};
+
+enum pcapng_isb_options {
+	PCAPNG_ISB_STARTTIME = 2,
+	PCAPNG_ISB_ENDTIME,
+	PCAPNG_ISB_IFRECV,
+	PCAPNG_ISB_IFDROP,
+	PCAPNG_ISB_FILTERACCEPT,
+	PCAPNG_ISB_OSDROP,
+	PCAPNG_ISB_USRDELIV,
+};
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
new file mode 100644
index 000000000000..f8280a8b01f4
--- /dev/null
+++ b/lib/pcapng/rte_pcapng.c
@@ -0,0 +1,574 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Microsoft Corporation
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "pcapng_proto.h"
+
+/* conversion from DPDK speed to PCAPNG */
+#define PCAPNG_MBPS_SPEED 1000000ull
+
+/* Format of the capture file handle */
+struct rte_pcapng {
+	int outfd;	/* output file */
+	/* DPDK port id to interface index in file */
+	uint32_t port_index[RTE_MAX_ETHPORTS];
+};
+
+/* For converting TSC cycles to PCAPNG ns format */
+struct pcapng_time {
+	uint64_t ns;
+	uint64_t cycles;
+} pcapng_time;
+
+RTE_INIT(pcapng_init)
+{
+	struct timespec ts;
+
+	pcapng_time.cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	pcapng_time.ns = rte_timespec_to_ns(&ts);
+}
+
+/* PCAPNG timestamps are in nanoseconds */
+static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
+{
+	uint64_t delta;
+
+	delta = cycles - pcapng_time.cycles;
+	return pcapng_time.ns + (delta * NSEC_PER_SEC) / rte_get_tsc_hz();
+}
+
+/* length of option including padding */
+static uint16_t pcapng_optlen(uint16_t len)
+{
+	return RTE_ALIGN(sizeof(struct pcapng_option) + len,
+			 sizeof(uint32_t));
+}
+
+/* build TLV option and return location of next */
+static struct pcapng_option *
+pcapng_add_option(struct pcapng_option *popt, uint16_t code,
+		  const void *data, uint16_t len)
+{
+	popt->code = code;
+	popt->length = len;
+	memcpy(popt->data, data, len);
+
+	return (struct pcapng_option *)((uint8_t *)popt + pcapng_optlen(len));
+}
+
+/*
+ * Write required initial section header describing the capture
+ */
+static int
+pcapng_section_block(rte_pcapng_t *self,
+		     const char *os, const char *hw,
+		     const char *app, const char *comment)
+{
+	struct pcapng_section_header *hdr;
+	struct pcapng_option *opt;
+	void *buf;
+	uint32_t len;
+	ssize_t cc;
+
+	len = sizeof(*hdr);
+	if (hw)
+		len += pcapng_optlen(strlen(hw));
+	if (os)
+		len += pcapng_optlen(strlen(os));
+	if (app)
+		len += pcapng_optlen(strlen(app));
+	if (comment)
+		len += pcapng_optlen(strlen(comment));
+
+	len += pcapng_optlen(0);
+	len += sizeof(uint32_t);
+
+	buf = calloc(1, len);
+	if (!buf)
+		return -1;
+
+	hdr = (struct pcapng_section_header *)buf;
+	*hdr = (struct pcapng_section_header) {
+		.block_type = PCAPNG_SECTION_BLOCK,
+		.block_length = len,
+		.byte_order_magic = PCAPNG_BYTE_ORDER_MAGIC,
+		.major_version = PCAPNG_MAJOR_VERS,
+		.minor_version = PCAPNG_MINOR_VERS,
+		.section_length = UINT64_MAX,
+	};
+	hdr->block_length = len;
+
+	opt = (struct pcapng_option *)(hdr + 1);
+	if (comment)
+		opt = pcapng_add_option(opt, PCAPNG_OPT_COMMENT,
+					comment, strlen(comment));
+	if (hw)
+		opt = pcapng_add_option(opt, PCAPNG_SHB_HARDWARE,
+					hw, strlen(hw));
+	if (os)
+		opt = pcapng_add_option(opt, PCAPNG_SHB_OS,
+					os, strlen(os));
+	if (app)
+		opt = pcapng_add_option(opt, PCAPNG_SHB_USERAPPL,
+					app, strlen(app));
+
+	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
+	/* clone block_length after option */
+	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
+
+	cc = write(self->outfd, buf, len);
+	free(buf);
+
+	return cc;
+}
+
+/* Write an interface description block for one port */
+static ssize_t
+pcapng_interface_block(rte_pcapng_t *self, const char *if_name,
+		       uint64_t if_speed, const uint8_t *mac_addr,
+		       const char *if_hw, const char *comment)
+{
+	struct pcapng_interface_block *hdr;
+	struct pcapng_option *opt;
+	const uint8_t tsresol = 9;	/* nanosecond resolution */
+	uint32_t len = sizeof(*hdr);
+	ssize_t cc;
+	void *buf;
+
+	len += pcapng_optlen(sizeof(tsresol));
+	if (if_name)
+		len += pcapng_optlen(strlen(if_name));
+	if (mac_addr)
+		len += pcapng_optlen(6);
+	if (if_speed)
+		len += pcapng_optlen(sizeof(uint64_t));
+	if (if_hw)
+		len += pcapng_optlen(strlen(if_hw));
+	if (comment)
+		len += pcapng_optlen(strlen(comment));
+
+	len += pcapng_optlen(0);
+	len += sizeof(uint32_t);
+	buf = calloc(1, len);
+	if (!buf)
+		return -ENOMEM;
+
+	hdr = (struct pcapng_interface_block *)buf;
+	hdr->block_type = PCAPNG_INTERFACE_BLOCK;
+	hdr->link_type = 1;	/* Ethernet */
+	hdr->block_length = len;
+
+	opt = (struct pcapng_option *)(hdr + 1);
+	if (if_name)
+		opt = pcapng_add_option(opt, PCAPNG_IFB_NAME,
+					if_name, strlen(if_name));
+	if (mac_addr)
+		opt = pcapng_add_option(opt, PCAPNG_IFB_MACADDR,
+					mac_addr, RTE_ETHER_ADDR_LEN);
+	if (if_speed)
+		opt = pcapng_add_option(opt, PCAPNG_IFB_SPEED,
+					&if_speed, sizeof(uint64_t));
+	opt = pcapng_add_option(opt, PCAPNG_IFB_TSRESOL,
+				&tsresol, sizeof(tsresol));
+	if (if_hw)
+		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
+					if_hw, strlen(if_hw));
+	if (comment)
+		opt = pcapng_add_option(opt, PCAPNG_OPT_COMMENT,
+					comment, strlen(comment));
+
+	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
+
+	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
+	cc = write(self->outfd, buf, len);
+	free(buf);
+
+	return cc;
+}
+
+static int
+pcapng_add_interface(rte_pcapng_t *self, uint16_t port)
+{
+	struct rte_eth_dev_info dev_info;
+	struct rte_ether_addr macaddr;
+	const struct rte_device *dev;
+	struct rte_eth_link link;
+	char ifname[IF_NAMESIZE];
+	char ifhw[256];
+	uint64_t speed = 0;
+
+	if (rte_eth_dev_info_get(port, &dev_info) < 0)
+		return -1;
+
+	/* make something like an interface name */
+	if (if_indextoname(dev_info.if_index, ifname) == NULL)
+		snprintf(ifname, IF_NAMESIZE, "dpdk:%u", port);
+
+	/* make a useful device hardware string */
+	dev = dev_info.device;
+	if (dev)
+		snprintf(ifhw, sizeof(ifhw),
+			 "%s-%s", dev->bus->name, dev->name);
+
+	/* DPDK reports in units of Mbps */
+	rte_eth_link_get(port, &link);
+	if (link.link_status == ETH_LINK_UP)
+		speed = link.link_speed * PCAPNG_MBPS_SPEED;
+
+	rte_eth_macaddr_get(port, &macaddr);
+
+	return pcapng_interface_block(self, ifname, speed,
+				      macaddr.addr_bytes,
+				      dev ? ifhw : NULL, NULL);
+}
+
+/*
+ * Write the list of possible interfaces at the start
+ * of the file.
+ */
+static int
+pcapng_interfaces(rte_pcapng_t *self)
+{
+	uint16_t port_id;
+	uint16_t index = 0;
+
+	RTE_ETH_FOREACH_DEV(port_id) {
+		/* The list of ports in pcapng needs to be contiguous */
+		self->port_index[port_id] = index++;
+		if (pcapng_add_interface(self, port_id) < 0)
+			return -1;
+	}
+	return 0;
+}
+
+/*
+ * Write an Interface statistics block at the end of capture.
+ */
+ssize_t
+rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
+		       const char *comment,
+		       uint64_t start_time, uint64_t end_time,
+		       uint64_t ifrecv, uint64_t ifdrop)
+{
+	struct pcapng_statistics *hdr;
+	struct pcapng_option *opt;
+	uint32_t optlen, len;
+	uint8_t *buf;
+	uint64_t ns;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+
+	optlen = 0;
+
+	if (ifrecv != UINT64_MAX)
+		optlen += pcapng_optlen(sizeof(ifrecv));
+	if (ifdrop != UINT64_MAX)
+		optlen += pcapng_optlen(sizeof(ifdrop));
+	if (start_time != 0)
+		optlen += pcapng_optlen(sizeof(start_time));
+	if (end_time != 0)
+		optlen += pcapng_optlen(sizeof(end_time));
+	if (comment)
+		optlen += pcapng_optlen(strlen(comment));
+	if (optlen != 0)
+		optlen += pcapng_optlen(0);
+
+	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
+	buf = alloca(len);
+	if (buf == NULL)
+		return -1;
+
+	hdr = (struct pcapng_statistics *)buf;
+	opt = (struct pcapng_option *)(hdr + 1);
+
+	if (comment)
+		opt = pcapng_add_option(opt, PCAPNG_OPT_COMMENT,
+					comment, strlen(comment));
+	if (start_time != 0)
+		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
+					&start_time, sizeof(start_time));
+	if (end_time != 0)
+		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
+					&end_time, sizeof(end_time));
+	if (ifrecv != UINT64_MAX)
+		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
+					&ifrecv, sizeof(ifrecv));
+	if (ifdrop != UINT64_MAX)
+		opt = pcapng_add_option(opt, PCAPNG_ISB_IFDROP,
+					&ifdrop, sizeof(ifdrop));
+	if (optlen != 0)
+		opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
+
+	hdr->block_type = PCAPNG_INTERFACE_STATS_BLOCK;
+	hdr->block_length = len;
+	hdr->interface_id = self->port_index[port_id];
+
+	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
+	hdr->timestamp_hi = ns >> 32;
+	hdr->timestamp_lo = (uint32_t)ns;
+
+	/* clone block_length after option */
+	memcpy(opt, &len, sizeof(uint32_t));
+
+	return write(self->outfd, buf, len);
+}
+
+uint32_t
+rte_pcapng_mbuf_size(uint32_t length)
+{
+	/* The VLAN and EPB header must fit in the mbuf headroom. */
+	RTE_ASSERT(sizeof(struct pcapng_enhance_packet_block) +
+		   sizeof(struct rte_vlan_hdr) <= RTE_PKTMBUF_HEADROOM);
+
+	/* The flags and queue information are added at the end. */
+	return sizeof(struct rte_mbuf)
+		+ RTE_ALIGN(length, sizeof(uint32_t))
+		+ pcapng_optlen(sizeof(uint32_t)) /* flag option */
+		+ pcapng_optlen(sizeof(uint32_t)) /* queue option */
+		+ sizeof(uint32_t);		  /* length */
+}
+
+/*
+ * The mbufs created use the Pcapng standard enhanced packet block.
+ *
+ *                         1                   2                   3
+ *     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  0 |                    Block Type = 0x00000006                    |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  4 |                      Block Total Length                       |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  8 |                         Interface ID                          |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * 12 |                       Timestamp (High)                        |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * 16 |                        Timestamp (Low)                        |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * 20 |                    Captured Packet Length                     |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * 24 |                    Original Packet Length                     |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * 28 /                                                               /
+ *    /                          Packet Data                          /
+ *    /              variable length, padded to 32 bits               /
+ *    /                                                               /
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *    |     Option Code = 0x0002      |     Option Length = 0x004     |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *    |                       Flags (direction)                       |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *    |     Option Code = 0x0006      |     Option Length = 0x004     |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *    |                           Queue id                            |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *    |                      Block Total Length                       |
+ *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ */
+
+/* Make a copy of original mbuf with pcapng header and options */
+struct rte_mbuf *
+rte_pcapng_copy(uint16_t port_id, uint32_t queue,
+		const struct rte_mbuf *md,
+		struct rte_mempool *mp,
+		uint32_t length, uint64_t cycles,
+		enum rte_pcapng_direction direction)
+{
+	struct pcapng_enhance_packet_block *epb;
+	uint32_t orig_len, data_len, padding, flags;
+	struct pcapng_option *opt;
+	const uint16_t optlen = pcapng_optlen(sizeof(flags)) + pcapng_optlen(sizeof(queue));
+	struct rte_mbuf *mc;
+	uint64_t ns;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
+#endif
+	ns = pcapng_tsc_to_ns(cycles);
+
+	orig_len = rte_pktmbuf_pkt_len(md);
+
+	/* Take snapshot of the data */
+	mc = rte_pktmbuf_copy(md, mp, 0, length);
+	if (unlikely(mc == NULL))
+		return NULL;
+
+	/* If packet had offloaded VLAN, expand it */
+	if (md->ol_flags & (PKT_RX_VLAN_STRIPPED | PKT_TX_VLAN)) {
+		if (rte_vlan_insert(&mc) != 0)
+			goto fail;
+
+		orig_len += sizeof(struct rte_vlan_hdr);
+	}
+
+	/* pad the packet to 32 bit boundary */
+	data_len = rte_pktmbuf_data_len(mc);
+	padding = RTE_ALIGN(data_len, sizeof(uint32_t)) - data_len;
+	if (padding > 0) {
+		void *tail = rte_pktmbuf_append(mc, padding);
+
+		if (tail == NULL)
+			goto fail;
+		memset(tail, 0, padding);
+	}
+
+	/* reserve trailing options and block length */
+	opt = (struct pcapng_option *)
+		rte_pktmbuf_append(mc, optlen + sizeof(uint32_t));
+	if (unlikely(opt == NULL))
+		goto fail;
+
+	switch (direction) {
+	case RTE_PCAPNG_DIRECTION_IN:
+		flags = PCAPNG_IFB_INBOUND;
+		break;
+	case RTE_PCAPNG_DIRECTION_OUT:
+		flags = PCAPNG_IFB_OUTBOUND;
+		break;
+	default:
+		flags = 0;
+	}
+
+	opt = pcapng_add_option(opt, PCAPNG_EPB_FLAGS,
+				&flags, sizeof(flags));
+
+	opt = pcapng_add_option(opt, PCAPNG_EPB_QUEUE,
+				&queue, sizeof(queue));
+
+	/* Add PCAPNG packet header */
+	epb = (struct pcapng_enhance_packet_block *)
+		rte_pktmbuf_prepend(mc, sizeof(*epb));
+	if (unlikely(epb == NULL))
+		goto fail;
+
+	epb->block_type = PCAPNG_ENHANCED_PACKET_BLOCK;
+	epb->block_length = rte_pktmbuf_data_len(mc);
+
+	/* Interface index is filled in later during write */
+	mc->port = port_id;
+
+	epb->timestamp_hi = ns >> 32;
+	epb->timestamp_lo = (uint32_t)ns;
+	epb->capture_length = data_len;
+	epb->original_length = orig_len;
+
+	/* set trailer of block length */
+	*(uint32_t *)opt = epb->block_length;
+
+	return mc;
+
+fail:
+	rte_pktmbuf_free(mc);
+	return NULL;
+}
+
+/* Count how many segments are in this array of mbufs */
+static unsigned int
+mbuf_burst_segs(struct rte_mbuf *pkts[], unsigned int n)
+{
+	unsigned int i, iovcnt;
+
+	for (iovcnt = 0, i = 0; i < n; i++) {
+		const struct rte_mbuf *m = pkts[i];
+
+		__rte_mbuf_sanity_check(m, 1);
+
+		iovcnt += m->nb_segs;
+	}
+	return iovcnt;
+}
+
+/* Write pre-formatted packets to file. */
+ssize_t
+rte_pcapng_write_packets(rte_pcapng_t *self,
+			 struct rte_mbuf *pkts[], uint16_t nb_pkts)
+{
+	int iovcnt = mbuf_burst_segs(pkts, nb_pkts);
+	struct iovec iov[iovcnt];
+	unsigned int i, cnt;
+	ssize_t ret;
+
+	for (i = cnt = 0; i < nb_pkts; i++) {
+		struct rte_mbuf *m = pkts[i];
+		struct pcapng_enhance_packet_block *epb;
+
+		/* sanity check that this is really a pcapng mbuf */
+		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
+		if (unlikely(epb->block_type != PCAPNG_ENHANCED_PACKET_BLOCK ||
+			     epb->block_length != rte_pktmbuf_data_len(m))) {
+			rte_errno = EINVAL;
+			return -1;
+		}
+
+		/*
+		 * The DPDK port is recorded during pcapng_copy.
+		 * Map that to PCAPNG interface in file.
+		 */
+		epb->interface_id = self->port_index[m->port];
+		do {
+			iov[cnt].iov_base = rte_pktmbuf_mtod(m, void *);
+			iov[cnt].iov_len = rte_pktmbuf_data_len(m);
+			++cnt;
+		} while ((m = m->next));
+	}
+
+	ret = writev(self->outfd, iov, iovcnt);
+	if (unlikely(ret < 0))
+		rte_errno = errno;
+	return ret;
+}
+
+/* Create new pcapng writer handle */
+rte_pcapng_t *
+rte_pcapng_fdopen(int fd,
+		  const char *osname, const char *hardware,
+		  const char *appname, const char *comment)
+{
+	rte_pcapng_t *self;
+
+	self = malloc(sizeof(*self));
+	if (!self) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	self->outfd = fd;
+
+	if (pcapng_section_block(self, osname, hardware, appname, comment) < 0)
+		goto fail;
+
+	if (pcapng_interfaces(self) < 0)
+		goto fail;
+
+	return self;
+fail:
+	free(self);
+	return NULL;
+}
+
+void
+rte_pcapng_close(rte_pcapng_t *self)
+{
+	close(self->outfd);
+	free(self);
+}
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
new file mode 100644
index 000000000000..2f1bb073df08
--- /dev/null
+++ b/lib/pcapng/rte_pcapng.h
@@ -0,0 +1,194 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Microsoft Corporation
+ */
+
+/**
+ * @file
+ * RTE pcapng
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * Pcapng is an evolution from the pcap format, created to address some of
+ * its deficiencies. Namely, the lack of extensibility and inability to store
+ * additional information.
+ *
+ * For details about the file format see RFC:
+ *   https://www.ietf.org/id/draft-tuexen-opsawg-pcapng-03.html
+ *  and
+ *   https://github.com/pcapng/pcapng/
+ */
+
+#ifndef _RTE_PCAPNG_H_
+#define _RTE_PCAPNG_H_
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Opaque handle used for functions in this library. */
+typedef struct rte_pcapng rte_pcapng_t;
+
+/**
+ * Write data to existing open file
+ *
+ * @param fd
+ *   file descriptor
+ * @param osname
+ *   Optional description of the operating system.
+ *   Examples: "Debian 11", "Windows Server 22"
+ * @param hardware
+ *   Optional description of the hardware used to create this file.
+ *   Examples: "x86 Virtual Machine"
+ * @param appname
+ *   Optional: application name recorded in the pcapng file.
+ *   Example: "dpdk-dumpcap 1.0 (DPDK 20.11)"
+ * @param comment
+ *   Optional comment to add to file header.
+ * @return
+ *   handle to library, or NULL in case of error (and rte_errno is set).
+ */
+__rte_experimental
+rte_pcapng_t *
+rte_pcapng_fdopen(int fd,
+		  const char *osname, const char *hardware,
+		  const char *appname, const char *comment);
+
+/**
+ * Close capture file
+ *
+ * @param self
+ *  handle to library
+ */
+__rte_experimental
+void
+rte_pcapng_close(rte_pcapng_t *self);
+
+/**
+ * Direction flag
+ * These should match Enhanced Packet Block flag bits
+ */
+enum rte_pcapng_direction {
+	RTE_PCAPNG_DIRECTION_UNKNOWN = 0,
+	RTE_PCAPNG_DIRECTION_IN  = 1,
+	RTE_PCAPNG_DIRECTION_OUT = 2,
+};
+
+/**
+ * Format an mbuf for writing to file.
+ *
+ * @param port_id
+ *   The Ethernet port on which packet was received
+ *   or is going to be transmitted.
+ * @param queue
+ *   The queue on the Ethernet port where packet was received
+ *   or is going to be transmitted.
+ * @param mp
+ *   The mempool from which the "clone" mbufs are allocated.
+ * @param m
+ *   The mbuf to copy
+ * @param length
+ *   The upper limit on bytes to copy.  Passing UINT32_MAX
+ *   means all data (after offset).
+ * @param timestamp
+ *   The timestamp in TSC cycles.
+ * @param direction
+ *   The direction of the packet: receive, transmit or unknown.
+ *
+ * @return
+ *   - The pointer to the new mbuf formatted for pcapng_write
+ *   - NULL if allocation fails.
+ *
+ */
+__rte_experimental
+struct rte_mbuf *
+rte_pcapng_copy(uint16_t port_id, uint32_t queue,
+		const struct rte_mbuf *m, struct rte_mempool *mp,
+		uint32_t length, uint64_t timestamp,
+		enum rte_pcapng_direction direction);
+
+
+/**
+ * Determine optimum mbuf data size.
+ *
+ * @param length
+ *   The upper limit on bytes to copy.  Passing UINT32_MAX
+ *   means all data (after offset).
+ * @return
+ *   The minimum size of mbuf data to handle packet with length bytes.
+ *   Accounting for required header and trailer fields
+ */
+__rte_experimental
+uint32_t
+rte_pcapng_mbuf_size(uint32_t length);
+
+/**
+ * Write packets to the capture file.
+ *
+ * Packets to be captured are copied by rte_pcapng_copy()
+ * and then this function is called to write them to the file.
+ * @warning
+ * Do not pass original mbufs
+ *
+ * @param self
+ *  The handle to the packet capture file
+ * @param pkts
+ *  The address of an array of *nb_pkts* pointers to *rte_mbuf* structures
+ *  which contain the output packets
+ * @param nb_pkts
+ *  The number of packets to write to the file.
+ * @return
+ *  The number of bytes written to file, -1 on failure to write file.
+ *  The mbuf's in *pkts* are always freed.
+ */
+__rte_experimental
+ssize_t
+rte_pcapng_write_packets(rte_pcapng_t *self,
+			 struct rte_mbuf *pkts[], uint16_t nb_pkts);
+
+/**
+ * Write an Interface statistics block.
+ * For statistics, use 0 if don't know or care to report it.
+ * Should be called before closing capture to report results.
+ *
+ * @param self
+ *  The handle to the packet capture file
+ * @param port
+ *  The Ethernet port to report stats on.
+ * @param comment
+ *   Optional comment to add to statistics.
+ * @param start_time
+ *  The time when packet capture was started in nanoseconds.
+ *  Optional: can be zero if not known.
+ * @param end_time
+ *  The time when packet capture was stopped in nanoseconds.
+ *  Optional: can be zero if not finished.
+ * @param ifrecv
+ *  The number of packets received by capture.
+ *  Optional: use UINT64_MAX if not known.
+ * @param ifdrop
+ *  The number of packets missed by the capture process.
+ *  Optional: use UINT64_MAX if not known.
+ * @return
+ *  number of bytes written to file, -1 on failure to write file
+ */
+__rte_experimental
+ssize_t
+rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
+		       const char *comment,
+		       uint64_t start_time, uint64_t end_time,
+		       uint64_t ifrecv, uint64_t ifdrop);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PCAPNG_H_ */
diff --git a/lib/pcapng/version.map b/lib/pcapng/version.map
new file mode 100644
index 000000000000..05a9c86a7d91
--- /dev/null
+++ b/lib/pcapng/version.map
@@ -0,0 +1,12 @@
+EXPERIMENTAL {
+	global:
+
+	rte_pcapng_close;
+	rte_pcapng_copy;
+	rte_pcapng_fdopen;
+	rte_pcapng_mbuf_size;
+	rte_pcapng_write_packets;
+	rte_pcapng_write_stats;
+
+	local: *;
+};

From patchwork Fri Sep 10 18:18:32 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stephen Hemminger
X-Patchwork-Id: 98674
X-Patchwork-Delegate: thomas@monjalon.net
From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger, Dmitry Kozlyuk, Narcisa Ana Maria Vasile,
 Dmitry Malloy, Pallavi Kadam
Date: Fri, 10 Sep 2021 11:18:32 -0700
Message-Id: <20210910181841.530280-3-stephen@networkplumber.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org>
References: <20210903004732.109023-1-stephen@networkplumber.org>
 <20210910181841.530280-1-stephen@networkplumber.org>
Subject: [dpdk-dev] [PATCH v7 02/11] lib: pdump is not supported on Windows
List-Id: DPDK patches and discussions

The current version of the pdump library was building on Windows,
but it was useless since the pdump utility was not being built
and Windows does not have multi-process support.

The new version of pdump with filtering now has a dependency
on bpf, but the bpf library is not available on Windows.

Signed-off-by: Stephen Hemminger Acked-by: Dmitry Kozlyuk Cc: Dmitry Kozlyuk Cc: Narcisa Ana Maria Vasile Cc: Dmitry Malloy Cc: Pallavi Kadam --- lib/meson.build | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/meson.build b/lib/meson.build index 51bf9c2d11f0..ba88e9eabc58 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -85,7 +85,6 @@ if is_windows 'gro', 'gso', 'latencystats', - 'pdump', ] # only supported libraries for windows endif From patchwork Fri Sep 10 18:18:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98675 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D16EEA0547; Fri, 10 Sep 2021 20:19:06 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0D85941148; Fri, 10 Sep 2021 20:18:53 +0200 (CEST) Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) by mails.dpdk.org (Postfix) with ESMTP id 47505410EF for ; Fri, 10 Sep 2021 20:18:49 +0200 (CEST) Received: by mail-pj1-f48.google.com with SMTP id c13-20020a17090a558d00b00198e6497a4fso2092842pji.4 for ; Fri, 10 Sep 2021 11:18:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=B2uAon4cvKoOj+QnKe/KWMgPSrjtWc91Ug3S5m+/n+k=; b=CTJWQNsk6pkPbpmUaZkWQ0ylrFBt9tMS0Avtx5HzwIp5E/ILTk7IroBXMEH0xODa/5 /4UjoqOe3qNTksz8a7pnxE53orHzu4WyopHSc3xJK69ktu3OyVsDKdars7cQSGJ18vFj D+EnQT+PB0bINBTwqbKRIst3FXyLlYwgUlJqtkAoS6EjxF9jZmXf/7Lt5T0WKrtomE0q Tq53mhOkoL0UrUsTFVK3P0SBHZazN7MCF8SIxKh/uY5VrvIthITPu/+9aT6oCPKy5+yL 
aW25r6V56hhkEj35lUV0Zb24Qat7iC5r++O1yPbZSNK+Q4Ml7Sf4vTOLmqowKGUy/U9r i9hA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=B2uAon4cvKoOj+QnKe/KWMgPSrjtWc91Ug3S5m+/n+k=; b=rZ1TC+8IeSTXDCp8DFkoJW1MGcoDssWH2B6F8W65OoN6mrEnQDOe6tGhTLK2ShvWrP C3xinlb/G7z6p3PPv1hFXegbYJJ8X+Hd1cP/e46y7NlJ6XoSvxUXTt52bSlrPKoWRNkt B40T2QzyxiMtZ9LV2KpUz1sEv4dQft4jotLU86zH8hlDwC+ddtjCs1+4x6DyxfqYOYP5 fokJCqzl8SC05TJHq5OJDMF4gTec/Ws6z1HMsncPhWIq56omKCusZAyuIc9kWVDiSkSa GIOp8ET5kz3v485Kljzmc0d9XBEQkpsho4QBNU4eiRr4STO9mpaXa7E6AjNZJ6+x0fCD rrbA== X-Gm-Message-State: AOAM531jTTagb6IJX796GQlx9+D32GRl41wBzElk5mQFwOka9h9GIUPx eE+gi1Fbi5JsXtFrPhvknEn83X+AuBLpFw== X-Google-Smtp-Source: ABdhPJzoqoj6kPDzk8mur3NBl5IRsKOWj02shPOZ4Qyha8jjP8Xv7h0CqEO16B6jjw6v+0G8l1LBUg== X-Received: by 2002:a17:90a:5886:: with SMTP id j6mr10981259pji.238.1631297928095; Fri, 10 Sep 2021 11:18:48 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. 
[204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:47 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:33 -0700 Message-Id: <20210910181841.530280-4-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 03/11] bpf: allow self-xor operation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When doing BPF filter program conversion, a common way to zero a register in a single instruction is: xor r7,r7 The BPF validator would not allow this because the value of r7 was undefined before the operation. But after this operation it is always zero. Signed-off-by: Stephen Hemminger --- lib/bpf/bpf_validate.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c index 7b1291b382e9..7647a7454dc2 100644 --- a/lib/bpf/bpf_validate.c +++ b/lib/bpf/bpf_validate.c @@ -661,8 +661,12 @@ eval_alu(struct bpf_verifier *bvf, const struct ebpf_insn *ins) op = BPF_OP(ins->code); - err = eval_defined((op != EBPF_MOV) ? rd : NULL, - (op != BPF_NEG) ? &rs : NULL); + /* Allow self-xor as way to zero register */ + if (op == BPF_XOR && ins->src_reg == ins->dst_reg) + err = NULL; + else + err = eval_defined((op != EBPF_MOV) ? rd : NULL, + (op != BPF_NEG) ? 
&rs : NULL); if (err != NULL) return err; From patchwork Fri Sep 10 18:18:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98676 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 16A7FA0547; Fri, 10 Sep 2021 20:19:13 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 33BB541151; Fri, 10 Sep 2021 20:18:54 +0200 (CEST) Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by mails.dpdk.org (Postfix) with ESMTP id 3FBAB4113D for ; Fri, 10 Sep 2021 20:18:51 +0200 (CEST) Received: by mail-pf1-f172.google.com with SMTP id s29so2583808pfw.5 for ; Fri, 10 Sep 2021 11:18:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Kt8ZlOZNuJmkYBLEeCKkBERWlrqnaqrTua1JK7vwSMw=; b=RXa1oC5xyF4A+AAbjQNQS4gehFDKblr/3lwUanLJJzgXCCPkGa4XewnGNshm+407Cm VY2MS6jPZHntouzmrUDF/WZsMTn8cTBxS4e358CFrMtmC0bQ8r/p+R4tVxnnUrnI/F/M vWRJBX/iJj/DgQmwTABfkbZnCYFBfNK0h+1XoiesNmBEOn/EtlhpDxEaxDapMKaSNYVQ MYLzNqALIXvBj5PC6PFBK7En0UsfT4QFySdVf4ZRWcsVdIAFhas3fgQ7aezQB+ZCRko7 urZ+r7qE2Fi+nCHKrRZg43V9yX3HSM+5IuawrMk5oS5tvBo+YSa+1+WPePB+PieBTRJ6 no0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Kt8ZlOZNuJmkYBLEeCKkBERWlrqnaqrTua1JK7vwSMw=; b=z4LbJhXYLsexQ2WTaFQIuYUj+tjRp+ZLTpi9AQ/czHuWMA2xabnogJ9g5Pcb92bJfU CP8FO6gV7aH3/EXIM3n8gygIoV+FTYQ7k1PyKHixzpoCFruTlAOmuhdDyJVbdbS8UZi4 
+r5gqAyP7b1rU93ydZJPp9dlZWAOQlXDXGMm3cu6Yq7TaU/oEh5uOhfg+cFE6Tx2iurY xUSKJWzDMF+7JQ56QMoi5WZBN/jlkKViTiPNhu2t0maJhol+7inzCWPDb6MQzZtiHsii Zray3qJQTVo15mKk5ZPqycoEhUoE3OnzGQeh6g3S+LOZXgVr4U282fV++X2Fmwp7PTkc I3nQ== X-Gm-Message-State: AOAM533BcoQG9X1ytE9Lvdx2O5YZDRZjtkwkKg6siyoGv5o6cVwGmMEb RkIIhD8UO2a4Dh32TcBQ3miCQ7jkb2CTNA== X-Google-Smtp-Source: ABdhPJxdk/aKaRMOwvI4Hq/1ahuWNQYoFdj/d3TDI5nRlCz7MC0pd8QcCJJWpvBWgvrEZ5vXiSDJcg== X-Received: by 2002:a63:5555:: with SMTP id f21mr8372960pgm.18.1631297929288; Fri, 10 Sep 2021 11:18:49 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:48 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:34 -0700 Message-Id: <20210910181841.530280-5-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 04/11] bpf: add function to convert classic BPF to DPDK BPF X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The pcap library emits classic BPF (32-bit) and is useful for creating filter programs. The DPDK BPF library only implements extended BPF (eBPF). Add a function to convert from the old format to the new one. The rte_bpf_convert function uses rte_malloc to put the resulting program in hugepage shared memory so it can be passed from a secondary process to a primary process. The conversion code was originally written as part of the Linux kernel implementation and then converted to a userspace program.
Both authors have agreed that it is allowable to license this as BSD licensed in DPDK. Signed-off-by: Stephen Hemminger --- lib/bpf/bpf_convert.c | 570 ++++++++++++++++++++++++++++++++++++++++++ lib/bpf/meson.build | 5 + lib/bpf/rte_bpf.h | 25 ++ lib/bpf/version.map | 6 + 4 files changed, 606 insertions(+) create mode 100644 lib/bpf/bpf_convert.c diff --git a/lib/bpf/bpf_convert.c b/lib/bpf/bpf_convert.c new file mode 100644 index 000000000000..198e6d359042 --- /dev/null +++ b/lib/bpf/bpf_convert.c @@ -0,0 +1,570 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Microsoft Corporation + * + * Based on bpf_convert_filter() in the Linux kernel sources + * and filter2xdp. + * + * Licensed as BSD with permission original authors. + * Copyright (C) 2017 Tobias Klauser + * Copyright (c) 2011 - 2014 PLUMgrid, http://plumgrid.com + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +/* Workaround name conflicts with libpcap */ +#define bpf_validate(f, len) bpf_validate_libpcap(f, len) +#include +#include +#undef bpf_validate + +#include "bpf_impl.h" +#include "bpf_def.h" + +#ifndef BPF_MAXINSNS +#define BPF_MAXINSNS 4096 +#endif + +/* + * Linux socket filter uses negative absolute offsets to + * reference ancillary data. + * + */ +#define SKF_AD_OFF (-0x1000) +#define SKF_AD_PROTOCOL 0 +#define SKF_AD_PKTTYPE 4 +#define SKF_AD_IFINDEX 8 +#define SKF_AD_NLATTR 12 +#define SKF_AD_NLATTR_NEST 16 +#define SKF_AD_MARK 20 +#define SKF_AD_QUEUE 24 +#define SKF_AD_HATYPE 28 +#define SKF_AD_RXHASH 32 +#define SKF_AD_CPU 36 +#define SKF_AD_ALU_XOR_X 40 +#define SKF_AD_VLAN_TAG 44 +#define SKF_AD_VLAN_TAG_PRESENT 48 +#define SKF_AD_PAY_OFFSET 52 +#define SKF_AD_RANDOM 56 +#define SKF_AD_VLAN_TPID 60 +#define SKF_AD_MAX 64 + +/* ArgX, context and stack frame pointer register positions. Note, + * Arg1, Arg2, Arg3, etc are used as argument mappings of function + * calls in BPF_CALL instruction. 
+ */ +#define BPF_REG_ARG1 EBPF_REG_1 +#define BPF_REG_ARG2 EBPF_REG_2 +#define BPF_REG_ARG3 EBPF_REG_3 +#define BPF_REG_ARG4 EBPF_REG_4 +#define BPF_REG_ARG5 EBPF_REG_5 +#define BPF_REG_CTX EBPF_REG_6 +#define BPF_REG_FP EBPF_REG_10 + +/* Additional register mappings for converted user programs. */ +#define BPF_REG_A EBPF_REG_0 +#define BPF_REG_X EBPF_REG_7 +#define BPF_REG_TMP EBPF_REG_8 + +/* Helper macros for filter block array initializers. */ + +/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */ + +#define EBPF_ALU64_REG(OP, DST, SRC) \ + ((struct ebpf_insn) { \ + .code = EBPF_ALU64 | BPF_OP(OP) | BPF_X, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = 0, \ + .imm = 0 }) + +#define BPF_ALU32_REG(OP, DST, SRC) \ + ((struct ebpf_insn) { \ + .code = BPF_ALU | BPF_OP(OP) | BPF_X, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = 0, \ + .imm = 0 }) + +/* ALU ops on immediates, bpf_add|sub|...: dst_reg += imm32 */ + +#define BPF_ALU32_IMM(OP, DST, IMM) \ + ((struct ebpf_insn) { \ + .code = BPF_ALU | BPF_OP(OP) | BPF_K, \ + .dst_reg = DST, \ + .src_reg = 0, \ + .off = 0, \ + .imm = IMM }) + +/* Short form of mov, dst_reg = src_reg */ + +#define BPF_MOV64_REG(DST, SRC) \ + ((struct ebpf_insn) { \ + .code = EBPF_ALU64 | EBPF_MOV | BPF_X, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = 0, \ + .imm = 0 }) + +#define BPF_MOV32_REG(DST, SRC) \ + ((struct ebpf_insn) { \ + .code = BPF_ALU | EBPF_MOV | BPF_X, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = 0, \ + .imm = 0 }) + +/* Short form of mov, dst_reg = imm32 */ + +#define BPF_MOV32_IMM(DST, IMM) \ + ((struct ebpf_insn) { \ + .code = BPF_ALU | EBPF_MOV | BPF_K, \ + .dst_reg = DST, \ + .src_reg = 0, \ + .off = 0, \ + .imm = IMM }) + +/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */ + +#define BPF_MOV32_RAW(TYPE, DST, SRC, IMM) \ + ((struct ebpf_insn) { \ + .code = BPF_ALU | EBPF_MOV | BPF_SRC(TYPE), \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = 0, \ + 
.imm = IMM }) + +/* Direct packet access, R0 = *(uint *) (skb->data + imm32) */ + +#define BPF_LD_ABS(SIZE, IMM) \ + ((struct ebpf_insn) { \ + .code = BPF_LD | BPF_SIZE(SIZE) | BPF_ABS, \ + .dst_reg = 0, \ + .src_reg = 0, \ + .off = 0, \ + .imm = IMM }) + +/* Memory load, dst_reg = *(uint *) (src_reg + off16) */ + +#define BPF_LDX_MEM(SIZE, DST, SRC, OFF) \ + ((struct ebpf_insn) { \ + .code = BPF_LDX | BPF_SIZE(SIZE) | BPF_MEM, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = OFF, \ + .imm = 0 }) + +/* Memory store, *(uint *) (dst_reg + off16) = src_reg */ + +#define BPF_STX_MEM(SIZE, DST, SRC, OFF) \ + ((struct ebpf_insn) { \ + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_MEM, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = OFF, \ + .imm = 0 }) + +/* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */ + +#define BPF_JMP_IMM(OP, DST, IMM, OFF) \ + ((struct ebpf_insn) { \ + .code = BPF_JMP | BPF_OP(OP) | BPF_K, \ + .dst_reg = DST, \ + .src_reg = 0, \ + .off = OFF, \ + .imm = IMM }) + +/* Raw code statement block */ + +#define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM) \ + ((struct ebpf_insn) { \ + .code = CODE, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = OFF, \ + .imm = IMM }) + +/* Program exit */ + +#define BPF_EXIT_INSN() \ + ((struct ebpf_insn) { \ + .code = BPF_JMP | EBPF_EXIT, \ + .dst_reg = 0, \ + .src_reg = 0, \ + .off = 0, \ + .imm = 0 }) + +/* + * Placeholder to convert BPF extensions like length and VLAN tag + * If and when DPDK BPF supports them. 
+ */ +static bool convert_bpf_load(const struct bpf_insn *fp, + struct ebpf_insn **new_insnp __rte_unused) +{ + switch (fp->k) { + case SKF_AD_OFF + SKF_AD_PROTOCOL: + case SKF_AD_OFF + SKF_AD_PKTTYPE: + case SKF_AD_OFF + SKF_AD_IFINDEX: + case SKF_AD_OFF + SKF_AD_HATYPE: + case SKF_AD_OFF + SKF_AD_MARK: + case SKF_AD_OFF + SKF_AD_RXHASH: + case SKF_AD_OFF + SKF_AD_QUEUE: + case SKF_AD_OFF + SKF_AD_VLAN_TAG: + case SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT: + case SKF_AD_OFF + SKF_AD_VLAN_TPID: + case SKF_AD_OFF + SKF_AD_PAY_OFFSET: + case SKF_AD_OFF + SKF_AD_NLATTR: + case SKF_AD_OFF + SKF_AD_NLATTR_NEST: + case SKF_AD_OFF + SKF_AD_CPU: + case SKF_AD_OFF + SKF_AD_RANDOM: + case SKF_AD_OFF + SKF_AD_ALU_XOR_X: + /* TODO: Not implemented yet */ + RTE_BPF_LOG(ERR, "BPF extension LOAD ABS %u not supported\n", + fp->k); + return true; + default: + return false; + } +} + +static int bpf_convert_filter(const struct bpf_insn *prog, size_t len, + struct ebpf_insn *new_prog, uint32_t *new_len) +{ + unsigned int pass = 0; + size_t new_flen = 0, target, i; + struct ebpf_insn *new_insn; + const struct bpf_insn *fp; + int *addrs = NULL; + uint8_t bpf_src; + + if (len > BPF_MAXINSNS) { + RTE_BPF_LOG(ERR, "%s: cBPF program too long (%zu insns)\n", + __func__, len); + return -EINVAL; + } + + /* On second pass, allocate the new program */ + if (new_prog) { + addrs = calloc(len, sizeof(*addrs)); + if (addrs == NULL) + return -ENOMEM; + } + +do_pass: + new_insn = new_prog; + fp = prog; + + /* Classic BPF related prologue emission. */ + if (new_insn) { + /* Classic BPF expects A and X to be reset first. These need + * to be guaranteed to be the first two instructions. + */ + *new_insn++ = EBPF_ALU64_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); + *new_insn++ = EBPF_ALU64_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); + + /* All programs must keep CTX in callee saved BPF_REG_CTX. + * In eBPF case it's done by the compiler, here we need to + * do this ourself. Initial CTX is present in BPF_REG_ARG1. 
+ */ + *new_insn++ = BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1); + } else { + new_insn += 3; + } + + for (i = 0; i < len; fp++, i++) { + struct ebpf_insn tmp_insns[6] = { }; + struct ebpf_insn *insn = tmp_insns; + + if (addrs) + addrs[i] = new_insn - new_prog; + + switch (fp->code) { + /* Absolute loads are how classic BPF accesses skb */ + case BPF_LD | BPF_ABS | BPF_W: + case BPF_LD | BPF_ABS | BPF_H: + case BPF_LD | BPF_ABS | BPF_B: + if (convert_bpf_load(fp, &insn)) + goto err; + + *insn = BPF_RAW_INSN(fp->code, 0, 0, 0, fp->k); + break; + + case BPF_ALU | BPF_DIV | BPF_X: + case BPF_ALU | BPF_MOD | BPF_X: + /* For cBPF, don't cause floating point exception */ + *insn++ = BPF_MOV32_REG(BPF_REG_X, BPF_REG_X); + *insn++ = BPF_JMP_IMM(EBPF_JNE, BPF_REG_X, 0, 2); + *insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); + *insn++ = BPF_EXIT_INSN(); + /* fallthrough */ + case BPF_ALU | BPF_ADD | BPF_X: + case BPF_ALU | BPF_ADD | BPF_K: + case BPF_ALU | BPF_SUB | BPF_X: + case BPF_ALU | BPF_SUB | BPF_K: + case BPF_ALU | BPF_AND | BPF_X: + case BPF_ALU | BPF_AND | BPF_K: + case BPF_ALU | BPF_OR | BPF_X: + case BPF_ALU | BPF_OR | BPF_K: + case BPF_ALU | BPF_LSH | BPF_X: + case BPF_ALU | BPF_LSH | BPF_K: + case BPF_ALU | BPF_RSH | BPF_X: + case BPF_ALU | BPF_RSH | BPF_K: + case BPF_ALU | BPF_XOR | BPF_X: + case BPF_ALU | BPF_XOR | BPF_K: + case BPF_ALU | BPF_MUL | BPF_X: + case BPF_ALU | BPF_MUL | BPF_K: + case BPF_ALU | BPF_DIV | BPF_K: + case BPF_ALU | BPF_MOD | BPF_K: + case BPF_ALU | BPF_NEG: + case BPF_LD | BPF_IND | BPF_W: + case BPF_LD | BPF_IND | BPF_H: + case BPF_LD | BPF_IND | BPF_B: + /* All arithmetic insns map as-is. */ + *insn = BPF_RAW_INSN(fp->code, BPF_REG_A, BPF_REG_X, 0, fp->k); + break; + + /* Jump transformation cannot use BPF block macros + * everywhere as offset calculation and target updates + * require a bit more work than the rest, i.e. jump + * opcodes map as-is, but offsets need adjustment. 
+ */ + +#define BPF_EMIT_JMP \ + do { \ + if (target >= len) \ + goto err; \ + insn->off = addrs ? addrs[target] - addrs[i] - 1 : 0; \ + /* Adjust pc relative offset for 2nd or 3rd insn. */ \ + insn->off -= insn - tmp_insns; \ + } while (0) + + case BPF_JMP | BPF_JA: + target = i + fp->k + 1; + insn->code = fp->code; + BPF_EMIT_JMP; + break; + + case BPF_JMP | BPF_JEQ | BPF_K: + case BPF_JMP | BPF_JEQ | BPF_X: + case BPF_JMP | BPF_JSET | BPF_K: + case BPF_JMP | BPF_JSET | BPF_X: + case BPF_JMP | BPF_JGT | BPF_K: + case BPF_JMP | BPF_JGT | BPF_X: + case BPF_JMP | BPF_JGE | BPF_K: + case BPF_JMP | BPF_JGE | BPF_X: + if (BPF_SRC(fp->code) == BPF_K && (int) fp->k < 0) { + /* BPF immediates are signed, zero extend + * immediate into tmp register and use it + * in compare insn. + */ + *insn++ = BPF_MOV32_IMM(BPF_REG_TMP, fp->k); + + insn->dst_reg = BPF_REG_A; + insn->src_reg = BPF_REG_TMP; + bpf_src = BPF_X; + } else { + insn->dst_reg = BPF_REG_A; + insn->imm = fp->k; + bpf_src = BPF_SRC(fp->code); + insn->src_reg = bpf_src == BPF_X ? BPF_REG_X : 0; + } + + /* Common case where 'jump_false' is next insn. */ + if (fp->jf == 0) { + insn->code = BPF_JMP | BPF_OP(fp->code) | bpf_src; + target = i + fp->jt + 1; + BPF_EMIT_JMP; + break; + } + + /* Convert JEQ into JNE when 'jump_true' is next insn. */ + if (fp->jt == 0 && BPF_OP(fp->code) == BPF_JEQ) { + insn->code = BPF_JMP | EBPF_JNE | bpf_src; + target = i + fp->jf + 1; + BPF_EMIT_JMP; + break; + } + + /* Other jumps are mapped into two insns: Jxx and JA. */ + target = i + fp->jt + 1; + insn->code = BPF_JMP | BPF_OP(fp->code) | bpf_src; + BPF_EMIT_JMP; + insn++; + + insn->code = BPF_JMP | BPF_JA; + target = i + fp->jf + 1; + BPF_EMIT_JMP; + break; + + /* ldxb 4 * ([14] & 0xf) is remapped into 6 insns. 
*/ + case BPF_LDX | BPF_MSH | BPF_B: + /* tmp = A */ + *insn++ = BPF_MOV64_REG(BPF_REG_TMP, BPF_REG_A); + /* A = BPF_R0 = *(u8 *) (skb->data + K) */ + *insn++ = BPF_LD_ABS(BPF_B, fp->k); + /* A &= 0xf */ + *insn++ = BPF_ALU32_IMM(BPF_AND, BPF_REG_A, 0xf); + /* A <<= 2 */ + *insn++ = BPF_ALU32_IMM(BPF_LSH, BPF_REG_A, 2); + /* X = A */ + *insn++ = BPF_MOV64_REG(BPF_REG_X, BPF_REG_A); + /* A = tmp */ + *insn = BPF_MOV64_REG(BPF_REG_A, BPF_REG_TMP); + break; + + /* RET_K is remapped into 2 insns. RET_A case doesn't need an + * extra mov as EBPF_REG_0 is already mapped into BPF_REG_A. + */ + case BPF_RET | BPF_A: + case BPF_RET | BPF_K: + if (BPF_RVAL(fp->code) == BPF_K) { + *insn++ = BPF_MOV32_RAW(BPF_K, EBPF_REG_0, + 0, fp->k); + } + *insn = BPF_EXIT_INSN(); + break; + + /* Store to stack. */ + case BPF_ST: + case BPF_STX: + *insn = BPF_STX_MEM(BPF_W, BPF_REG_FP, BPF_CLASS(fp->code) == + BPF_ST ? BPF_REG_A : BPF_REG_X, + -(BPF_MEMWORDS - fp->k) * 4); + break; + + /* Load from stack. */ + case BPF_LD | BPF_MEM: + case BPF_LDX | BPF_MEM: + *insn = BPF_LDX_MEM(BPF_W, BPF_CLASS(fp->code) == BPF_LD ? + BPF_REG_A : BPF_REG_X, BPF_REG_FP, + -(BPF_MEMWORDS - fp->k) * 4); + break; + + /* A = K or X = K */ + case BPF_LD | BPF_IMM: + case BPF_LDX | BPF_IMM: + *insn = BPF_MOV32_IMM(BPF_CLASS(fp->code) == BPF_LD ? + BPF_REG_A : BPF_REG_X, fp->k); + break; + + /* X = A */ + case BPF_MISC | BPF_TAX: + *insn = BPF_MOV64_REG(BPF_REG_X, BPF_REG_A); + break; + + /* A = X */ + case BPF_MISC | BPF_TXA: + *insn = BPF_MOV64_REG(BPF_REG_A, BPF_REG_X); + break; + + /* A = mbuf->len or X = mbuf->len */ + case BPF_LD | BPF_W | BPF_LEN: + case BPF_LDX | BPF_W | BPF_LEN: + /* BPF_ABS/BPF_IND implicitly expect mbuf ptr in R6 */ + + *insn = BPF_LDX_MEM(BPF_W, BPF_CLASS(fp->code) == BPF_LD ? + BPF_REG_A : BPF_REG_X, BPF_REG_CTX, + offsetof(struct rte_mbuf, pkt_len)); + break; + + /* Unknown instruction. 
*/ + default: + RTE_BPF_LOG(ERR, "%s: Unknown instruction!: %#x\n", + __func__, fp->code); + goto err; + } + + insn++; + if (new_prog) + memcpy(new_insn, tmp_insns, + sizeof(*insn) * (insn - tmp_insns)); + new_insn += insn - tmp_insns; + } + + if (!new_prog) { + /* Only calculating new length. */ + *new_len = new_insn - new_prog; + return 0; + } + + pass++; + if ((ptrdiff_t)new_flen != new_insn - new_prog) { + new_flen = new_insn - new_prog; + if (pass > 2) + goto err; + goto do_pass; + } + + free(addrs); + assert(*new_len == new_flen); + + return 0; +err: + free(addrs); + return -1; +} + +struct rte_bpf_prm * +rte_bpf_convert(const struct bpf_program *prog) +{ + struct rte_bpf_prm *prm = NULL; + struct ebpf_insn *ebpf = NULL; + uint32_t ebpf_len = 0; + int ret; + + if (prog == NULL) { + RTE_BPF_LOG(ERR, "%s: NULL program\n", __func__); + rte_errno = EINVAL; + return NULL; + } + + /* 1st pass: calculate the eBPF program length */ + ret = bpf_convert_filter(prog->bf_insns, prog->bf_len, NULL, &ebpf_len); + if (ret < 0) { + RTE_BPF_LOG(ERR, "%s: cannot get eBPF length\n", __func__); + rte_errno = -ret; + return NULL; + } + + RTE_BPF_LOG(DEBUG, "%s: prog len cBPF=%u -> eBPF=%u\n", + __func__, prog->bf_len, ebpf_len); + + prm = rte_zmalloc("bpf_filter", + sizeof(*prm) + ebpf_len * sizeof(*ebpf), 0); + if (prm == NULL) { + rte_errno = ENOMEM; + return NULL; + } + + /* The eBPF instructions in this case are right after the header */ + ebpf = (void *)(prm + 1); + + /* 2nd pass: remap cBPF to eBPF instructions */ + ret = bpf_convert_filter(prog->bf_insns, prog->bf_len, ebpf, &ebpf_len); + if (ret < 0) { + RTE_BPF_LOG(ERR, "%s: cannot convert cBPF to eBPF\n", __func__); + rte_free(prm); + rte_errno = -ret; + return NULL; + } + + prm->ins = ebpf; + prm->nb_ins = ebpf_len; + + /* Classic BPF programs use mbufs */ + prm->prog_arg.type = RTE_BPF_ARG_PTR_MBUF; + prm->prog_arg.size = sizeof(struct rte_mbuf); + + return prm; +} diff --git a/lib/bpf/meson.build b/lib/bpf/meson.build 
index 63cbd60185e0..54f7610ae990 100644 --- a/lib/bpf/meson.build +++ b/lib/bpf/meson.build @@ -25,3 +25,8 @@ if dep.found() sources += files('bpf_load_elf.c') ext_deps += dep endif + +if dpdk_conf.has('RTE_PORT_PCAP') + sources += files('bpf_convert.c') + ext_deps += pcap_dep +endif diff --git a/lib/bpf/rte_bpf.h b/lib/bpf/rte_bpf.h index 69116f36ba8b..2f23e272a376 100644 --- a/lib/bpf/rte_bpf.h +++ b/lib/bpf/rte_bpf.h @@ -198,6 +198,31 @@ rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[], uint64_t rc[], int rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit); +#ifdef RTE_PORT_PCAP + +struct bpf_program; + +/** + * Convert a Classic BPF program from libpcap into a DPDK BPF code. + * + * @param prog + * Classic BPF program from pcap_compile(). + * @param prm + * Result Extended BPF program. + * @return + * Pointer to BPF program (allocated with *rte_malloc*) + * that is used in future BPF operations, + * or NULL on error, with error code set in rte_errno. + * Possible rte_errno errors include: + * - EINVAL - invalid parameter passed to function + * - ENOMEM - can't reserve enough memory + */ +__rte_experimental +struct rte_bpf_prm * +rte_bpf_convert(const struct bpf_program *prog); + +#endif + #ifdef __cplusplus } #endif diff --git a/lib/bpf/version.map b/lib/bpf/version.map index 0bf35f487666..47082d5003ef 100644 --- a/lib/bpf/version.map +++ b/lib/bpf/version.map @@ -14,3 +14,9 @@ DPDK_22 { local: *; }; + +EXPERIMENTAL { + global: + + rte_bpf_convert; +}; From patchwork Fri Sep 10 18:18:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98677 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D108FA0547; Fri, 10 Sep 2021 20:19:19 +0200 (CEST) 
Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5A29F41154; Fri, 10 Sep 2021 20:18:55 +0200 (CEST) Received: from mail-pg1-f179.google.com (mail-pg1-f179.google.com [209.85.215.179]) by mails.dpdk.org (Postfix) with ESMTP id A0F5241140 for ; Fri, 10 Sep 2021 20:18:51 +0200 (CEST) Received: by mail-pg1-f179.google.com with SMTP id t1so2580719pgv.3 for ; Fri, 10 Sep 2021 11:18:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ZxzxLoQP0SUm+Bqk8X1ceBPXhUZPFhLr74/7ty41ZNs=; b=DyJntPD7HQkQhRArNvvLuoNAjnGot6NCv6ujM7c1yt8llfqoSmRA5DYZorGuLGTbxh DYlpvz5SbZEFrIJ23pQFp1eoJmvTH5Rlz9PeRbh3pMsGMYANw6ENKzQZUeAMViTH+uYZ 5tSmxI8pNvTLzCGorlCcjWilU/eS9QmgqOU6Q6fOzT3vDOeUiFoCPo5B/Ja0D2I2MM+s JG9IAHq5PHU0bIl/Dze6qws+wHFGdCLdgOG3LgrweJkDRJQxfpSKhJHl+/UEFMpE3uN0 O3DsZXU5MC7Rp/oKm3gTkmBf+ZQyxQpG+KaEVZDy3/4fMKb9JGtxqqaziY5ugeSmvfnB fUKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZxzxLoQP0SUm+Bqk8X1ceBPXhUZPFhLr74/7ty41ZNs=; b=vaFkr/Hr6d9xdg4i72qz6OJzd7CX5avmPQXoaEN3ajsEgeoYn5h04lyWPJU5ijDPNd 7WE9KO04hEdTtkG9aJkQaWV+8V5lOInJf14WeGnLps76hrdK/kEjvrczRSgzIQw/g2jp cDE1aWdkUxSYxFFzq+0JfcKnPDzYx6bhRnEBk92SUOWi1unA0mA9E3Pmiu86YVyd9Liq m5aGL95TaIpnMMMnkxB4/U6NeYJnCaiRvu4t3BrLJ9SSp8AVlMvyt85wWrxp33oADeou s9zXliAoVyXpy/IpDh0edajQCEpHnSVkoj8bVzxwA3OmlZDmgw9DHljyZ7xqORi4PXlD 3DaQ== X-Gm-Message-State: AOAM530w4iqK2NQQXmYEhAM76wq9Kt5gBTHmLwcCRCrCDXikykGUslk5 DEnRH5twIo3yDymMsAxmDfmLpLhFIxhbrQ== X-Google-Smtp-Source: ABdhPJxk0HKoLg0lA2354PTgbT08Y5Vf9wgXG9U6yZHP2d9TN8ekLiBRKm08gVrueiDmujSAcV/FlQ== X-Received: by 2002:a63:aa4c:: with SMTP id x12mr8531419pgo.211.1631297930311; Fri, 10 Sep 2021 11:18:50 -0700 (PDT) 
Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:49 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:35 -0700 Message-Id: <20210910181841.530280-6-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 05/11] bpf: add function to dump eBPF instructions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When debugging converted (and other) programs it is useful to see disassembled eBPF output. Signed-off-by: Stephen Hemminger --- lib/bpf/bpf_convert.c | 7 ++- lib/bpf/bpf_dump.c | 118 ++++++++++++++++++++++++++++++++++++++++++ lib/bpf/meson.build | 1 + lib/bpf/rte_bpf.h | 14 +++++ lib/bpf/version.map | 1 + 5 files changed, 140 insertions(+), 1 deletion(-) create mode 100644 lib/bpf/bpf_dump.c diff --git a/lib/bpf/bpf_convert.c b/lib/bpf/bpf_convert.c index 198e6d359042..f649fa663edf 100644 --- a/lib/bpf/bpf_convert.c +++ b/lib/bpf/bpf_convert.c @@ -331,7 +331,12 @@ static int bpf_convert_filter(const struct bpf_insn *prog, size_t len, case BPF_LD | BPF_IND | BPF_H: case BPF_LD | BPF_IND | BPF_B: /* All arithmetic insns map as-is. */ - *insn = BPF_RAW_INSN(fp->code, BPF_REG_A, BPF_REG_X, 0, fp->k); + insn->code = fp->code; + insn->dst_reg = BPF_REG_A; + bpf_src = BPF_SRC(fp->code); + insn->src_reg = bpf_src == BPF_X ? 
BPF_REG_X : 0; + insn->off = 0; + insn->imm = fp->k; break; /* Jump transformation cannot use BPF block macros diff --git a/lib/bpf/bpf_dump.c b/lib/bpf/bpf_dump.c new file mode 100644 index 000000000000..a6a431e64903 --- /dev/null +++ b/lib/bpf/bpf_dump.c @@ -0,0 +1,118 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2021 Stephen Hemminger + * Based on filter2xdp + * Copyright (C) 2017 Tobias Klauser + */ + +#include +#include + +#include "rte_bpf.h" + +#define BPF_OP_INDEX(x) (BPF_OP(x) >> 4) +#define BPF_SIZE_INDEX(x) (BPF_SIZE(x) >> 3) + +static const char *const class_tbl[] = { + [BPF_LD] = "ld", [BPF_LDX] = "ldx", [BPF_ST] = "st", + [BPF_STX] = "stx", [BPF_ALU] = "alu", [BPF_JMP] = "jmp", + [BPF_RET] = "ret", [BPF_MISC] = "alu64", +}; + +static const char *const alu_op_tbl[16] = { + [BPF_ADD >> 4] = "add", [BPF_SUB >> 4] = "sub", + [BPF_MUL >> 4] = "mul", [BPF_DIV >> 4] = "div", + [BPF_OR >> 4] = "or", [BPF_AND >> 4] = "and", + [BPF_LSH >> 4] = "lsh", [BPF_RSH >> 4] = "rsh", + [BPF_NEG >> 4] = "neg", [BPF_MOD >> 4] = "mod", + [BPF_XOR >> 4] = "xor", [EBPF_MOV >> 4] = "mov", + [EBPF_ARSH >> 4] = "arsh", [EBPF_END >> 4] = "endian", +}; + +static const char *const size_tbl[] = { + [BPF_W >> 3] = "w", + [BPF_H >> 3] = "h", + [BPF_B >> 3] = "b", + [EBPF_DW >> 3] = "dw", +}; + +static const char *const jump_tbl[16] = { + [BPF_JA >> 4] = "ja", [BPF_JEQ >> 4] = "jeq", + [BPF_JGT >> 4] = "jgt", [BPF_JGE >> 4] = "jge", + [BPF_JSET >> 4] = "jset", [EBPF_JNE >> 4] = "jne", + [EBPF_JSGT >> 4] = "jsgt", [EBPF_JSGE >> 4] = "jsge", + [EBPF_CALL >> 4] = "call", [EBPF_EXIT >> 4] = "exit", +}; + +static void ebpf_dump(FILE *f, const struct ebpf_insn insn, size_t n) +{ + const char *op, *postfix = ""; + uint8_t cls = BPF_CLASS(insn.code); + + fprintf(f, " L%zu:\t", n); + + switch (cls) { + default: + fprintf(f, "unimp 0x%x // class: %s\n", insn.code, + class_tbl[cls]); + break; + case BPF_ALU: + postfix = "32"; + /* fall through */ + case EBPF_ALU64: + op = 
alu_op_tbl[BPF_OP_INDEX(insn.code)]; + if (BPF_SRC(insn.code) == BPF_X) + fprintf(f, "%s%s r%u, r%u\n", op, postfix, insn.dst_reg, + insn.src_reg); + else + fprintf(f, "%s%s r%u, #0x%x\n", op, postfix, + insn.dst_reg, insn.imm); + break; + case BPF_LD: + op = "ld"; + postfix = size_tbl[BPF_SIZE_INDEX(insn.code)]; + if (BPF_MODE(insn.code) == BPF_IMM) + fprintf(f, "%s%s r%d, #0x%x\n", op, postfix, + insn.dst_reg, insn.imm); + else if (BPF_MODE(insn.code) == BPF_ABS) + fprintf(f, "%s%s r%d, [%d]\n", op, postfix, + insn.dst_reg, insn.imm); + else if (BPF_MODE(insn.code) == BPF_IND) + fprintf(f, "%s%s r%d, [r%u + %d]\n", op, postfix, + insn.dst_reg, insn.src_reg, insn.imm); + else + fprintf(f, "// BUG: LD opcode 0x%02x in eBPF insns\n", + insn.code); + break; + case BPF_LDX: + op = "ldx"; + postfix = size_tbl[BPF_SIZE_INDEX(insn.code)]; + fprintf(f, "%s%s r%d, [r%u + %d]\n", op, postfix, insn.dst_reg, + insn.src_reg, insn.off); + break; +#define L(pc, off) ((int)(pc) + 1 + (off)) + case BPF_JMP: + op = jump_tbl[BPF_OP_INDEX(insn.code)]; + if (op == NULL) + fprintf(f, "invalid jump opcode: %#x\n", insn.code); + else if (BPF_OP(insn.code) == BPF_JA) + fprintf(f, "%s L%d\n", op, L(n, insn.off)); + else if (BPF_OP(insn.code) == EBPF_EXIT) + fprintf(f, "%s\n", op); + else + fprintf(f, "%s r%u, #0x%x, L%d\n", op, insn.dst_reg, + insn.imm, L(n, insn.off)); + break; + case BPF_RET: + fprintf(f, "// BUG: RET opcode 0x%02x in eBPF insns\n", + insn.code); + break; + } +} + +void rte_bpf_dump(FILE *f, const struct ebpf_insn *buf, uint32_t len) +{ + uint32_t i; + + for (i = 0; i < len; ++i) + ebpf_dump(f, buf[i], i); +} diff --git a/lib/bpf/meson.build b/lib/bpf/meson.build index 54f7610ae990..5b5585173aeb 100644 --- a/lib/bpf/meson.build +++ b/lib/bpf/meson.build @@ -2,6 +2,7 @@ # Copyright(c) 2018 Intel Corporation sources = files('bpf.c', + 'bpf_dump.c', 'bpf_exec.c', 'bpf_load.c', 'bpf_pkt.c', diff --git a/lib/bpf/rte_bpf.h b/lib/bpf/rte_bpf.h index 2f23e272a376..0d0a84b130a0 
100644 --- a/lib/bpf/rte_bpf.h +++ b/lib/bpf/rte_bpf.h @@ -198,6 +198,20 @@ rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[], uint64_t rc[], int rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit); +/** + * Dump eBPF instructions to a file. + * + * @param f + * A pointer to a file for output + * @param buf + * A pointer to BPF instructions + * @param len + * Number of BPF instructions to dump. + */ +__rte_experimental +void +rte_bpf_dump(FILE *f, const struct ebpf_insn *buf, uint32_t len); + #ifdef RTE_PORT_PCAP struct bpf_program; diff --git a/lib/bpf/version.map b/lib/bpf/version.map index 47082d5003ef..3b953f2f4592 100644 --- a/lib/bpf/version.map +++ b/lib/bpf/version.map @@ -19,4 +19,5 @@ EXPERIMENTAL { global: rte_bpf_convert; + rte_bpf_dump; }; From patchwork Fri Sep 10 18:18:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98678 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 68556A0547; Fri, 10 Sep 2021 20:19:27 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E1A5741155; Fri, 10 Sep 2021 20:18:58 +0200 (CEST) Received: from mail-pg1-f175.google.com (mail-pg1-f175.google.com [209.85.215.175]) by mails.dpdk.org (Postfix) with ESMTP id 93B974114E for ; Fri, 10 Sep 2021 20:18:53 +0200 (CEST) Received: by mail-pg1-f175.google.com with SMTP id r2so2558934pgl.10 for ; Fri, 10 Sep 2021 11:18:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ILYXBi6oR68ploilVtGOYTguVitUDnZ6veSRxnLdhVo=;
b=G5xT+2c1t5Fu+RI5mfEanFLJk6GZkCBZg8w/1uMTeXoxgF4QiN9NkerU0S7ItNIio1 pupuxV+EKyGfKQNe+xb5HX2SyFCxWaIbbXUamkZJXJUgns8uWOsOMuxkvIugjzuxEpXP 6c9ePV+NWtDNwD6uikT3s51POmSWFbtDAN/tLpJePfXLjOK3FI8vcMx0NlSJYZcF7O+D ePKn3D2Nnh8vQUyQKrnvAEjt+6ioTeSDScewwL5AfWKk6EqQhyo4kh+UEIQgSLcIxeQf oFw7AXWjh1OnXodYy7qolei1NSYcaIeTs6eSAdXUQssMTKEy255fEtrnA+yiuDeO5Q2C 4QxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ILYXBi6oR68ploilVtGOYTguVitUDnZ6veSRxnLdhVo=; b=pqPt6Qcpwox3DPP9F0Eb4S0ROT0ZN7eHD8QOM5od5YrQiPwWmAe5T6QyzB2WWkUv91 SQ70zvW1Tib/e3WlCOFJ8kj3tkGZakcG1pK2hNOa8f1Tc4DWkclHdyUMv5WIURVo6Sf4 3bUJ7GQEB7i6C6nUcpwO14eymtX0ELTCqRO/EsSmfAqIaCtAo7S4h6cwEslsVi1s1Jiv hQDxp+Sl8KGALi9aFBUAEM6yPfXO/3JTAhWXbzuu0Qnpby14k/pgl6sl5svPQgRSsQEz T6wsBJ/Z20XNo5W+mxjdtwU3dm0qq6of2wFwwsSluMaKGQr1joV0LLwmzOxqtcND+kbJ /Egg== X-Gm-Message-State: AOAM532/EHnVVDB8FmgolZZAqDLu/W79TvmLrOfpdZWM+6W6swqkAzdL wWu5g3PfdaG0FaXXopYPLeWoUzUsShxm2A== X-Google-Smtp-Source: ABdhPJzp0OIPyRj+tBnGhIVqk51i7pxt88ibcndKm3baNM1yvDJI/icg2iXsJPC3/kIlRn5GHVgaLg== X-Received: by 2002:a63:3449:: with SMTP id b70mr8340516pga.315.1631297931631; Fri, 10 Sep 2021 11:18:51 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. 
[204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:50 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:36 -0700 Message-Id: <20210910181841.530280-7-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 06/11] pdump: support pcapng and filtering X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This enhances the DPDK pdump library to support the new pcapng format and filtering via BPF. The internal client/server protocol is changed to support two versions: the original pdump basic version and a new pcapng version. The internal version number (not part of the exposed API or ABI) is intentionally increased so that any attempt to pair a mismatched primary/secondary process fails. Add a new API to allow filtering of captured packets with a DPDK BPF (eBPF) filter program. The library also keeps statistics on packets captured, filtered, and missed (because the ring was full).
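As a rough illustration of the accounting described above, the following standalone C sketch models how one burst could be sorted into the four counters. It is not the library code: `capture_stats`, `account_burst`, `verdicts`, `alloc_ok` and `ring_free` are hypothetical names, and the mbuf copy and the ring enqueue are reduced to plain inputs. It assumes, as the capture callback in this patch does, that a zero filter verdict drops the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue counters kept by pdump. */
struct capture_stats {
	uint64_t accepted;  /* copies handed to the ring enqueue step */
	uint64_t filtered;  /* packets rejected by the filter */
	uint64_t nombuf;    /* copies that could not be allocated */
	uint64_t ringfull;  /* copies dropped because the ring was full */
};

/*
 * Classify one burst of n packets (simplified model, no real mbufs).
 *   verdicts[i]  filter result for packet i; 0 means drop.
 *                NULL means no filter is installed.
 *   alloc_ok[i]  non-zero if the copy of packet i could be allocated.
 *   ring_free    number of free slots in the capture ring.
 * Returns the number of copies that fit into the ring.
 */
static unsigned int
account_burst(struct capture_stats *st, const uint64_t *verdicts,
	      const int *alloc_ok, unsigned int n, unsigned int ring_free)
{
	unsigned int i, copies = 0, enq;

	assert(st != NULL && alloc_ok != NULL);

	for (i = 0; i < n; i++) {
		if (verdicts != NULL && verdicts[i] == 0) {
			st->filtered++;
			continue;
		}
		if (!alloc_ok[i]) {
			st->nombuf++;
			continue;
		}
		copies++;
	}

	/* Counted before the enqueue, so it includes later ring drops. */
	st->accepted += copies;

	enq = copies < ring_free ? copies : ring_free;
	st->ringfull += copies - enq;
	return enq;
}
```

In this model, as in the patch, `accepted` is updated before the enqueue attempt, so a copy that is later dropped for lack of ring space is counted in both `accepted` and `ringfull`.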
Signed-off-by: Stephen Hemminger --- lib/meson.build | 4 +- lib/pdump/meson.build | 2 +- lib/pdump/rte_pdump.c | 437 ++++++++++++++++++++++++++++++------------ lib/pdump/rte_pdump.h | 110 ++++++++++- lib/pdump/version.map | 8 + 5 files changed, 435 insertions(+), 126 deletions(-) diff --git a/lib/meson.build b/lib/meson.build index ba88e9eabc58..1da521ea6185 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -26,6 +26,7 @@ libraries = [ 'timer', # eventdev depends on this 'acl', 'bbdev', + 'bpf', 'bitratestats', 'cfgfile', 'compressdev', @@ -43,7 +44,6 @@ libraries = [ 'member', 'pcapng', 'power', - 'pdump', 'rawdev', 'regexdev', 'rib', @@ -55,10 +55,10 @@ libraries = [ 'ipsec', # ipsec lib depends on net, crypto and security 'fib', #fib lib depends on rib 'port', # pkt framework libs which use other libs from above + 'pdump', # pdump lib depends on bpf and pcapng 'table', 'pipeline', 'flow_classify', # flow_classify lib depends on pkt framework table lib - 'bpf', 'graph', 'node', ] diff --git a/lib/pdump/meson.build b/lib/pdump/meson.build index 3a95eabde6a6..51ceb2afdec5 100644 --- a/lib/pdump/meson.build +++ b/lib/pdump/meson.build @@ -3,4 +3,4 @@ sources = files('rte_pdump.c') headers = files('rte_pdump.h') -deps += ['ethdev'] +deps += ['ethdev', 'bpf', 'pcapng'] diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c index 382217bc1564..f2047ad9f001 100644 --- a/lib/pdump/rte_pdump.c +++ b/lib/pdump/rte_pdump.c @@ -7,8 +7,10 @@ #include #include #include +#include #include #include +#include #include "rte_pdump.h" @@ -27,30 +29,26 @@ enum pdump_operation { ENABLE = 2 }; +/* + * Note: version numbers intentionally start at 3 + * in order to catch any application built with an older + * version of DPDK using incompatible client request format.
+ */ enum pdump_version { - V1 = 1 + PDUMP_CLIENT_LEGACY = 3, + PDUMP_CLIENT_PCAPNG = 4, }; struct pdump_request { uint16_t ver; uint16_t op; - uint32_t flags; - union pdump_data { - struct enable_v1 { - char device[RTE_DEV_NAME_MAX_LEN]; - uint16_t queue; - struct rte_ring *ring; - struct rte_mempool *mp; - void *filter; - } en_v1; - struct disable_v1 { - char device[RTE_DEV_NAME_MAX_LEN]; - uint16_t queue; - struct rte_ring *ring; - struct rte_mempool *mp; - void *filter; - } dis_v1; - } data; + uint16_t flags; + uint16_t queue; + struct rte_ring *ring; + struct rte_mempool *mp; + const struct rte_bpf_prm *prm; + uint32_t snaplen; + char device[RTE_DEV_NAME_MAX_LEN]; }; struct pdump_response { @@ -63,80 +61,140 @@ static struct pdump_rxtx_cbs { struct rte_ring *ring; struct rte_mempool *mp; const struct rte_eth_rxtx_callback *cb; - void *filter; + const struct rte_bpf *filter; + enum pdump_version ver; + uint32_t snaplen; } rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT], tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT]; - -static inline void -pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params) +static const char *MZ_RTE_PDUMP_STATS = "rte_pdump_stats"; + +/* Shared memory between primary and secondary processes. */ +static struct { + struct rte_pdump_stats rx[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT]; + struct rte_pdump_stats tx[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT]; +} *pdump_stats; + +/* Create a clone of mbuf to be placed into ring. 
*/ +static void +pdump_copy(uint16_t port_id, uint16_t queue, + enum rte_pcapng_direction direction, + struct rte_mbuf **pkts, uint16_t nb_pkts, + const struct pdump_rxtx_cbs *cbs, + struct rte_pdump_stats *stats) { unsigned int i; int ring_enq; uint16_t d_pkts = 0; struct rte_mbuf *dup_bufs[nb_pkts]; - struct pdump_rxtx_cbs *cbs; + uint64_t ts; struct rte_ring *ring; struct rte_mempool *mp; struct rte_mbuf *p; + uint64_t rcs[nb_pkts]; + + if (cbs->filter && + rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts) == 0) { + /* All packets were filtered out */ + __atomic_fetch_add(&stats->filtered, nb_pkts, + __ATOMIC_RELAXED); + return; + } - cbs = user_params; + ts = rte_get_tsc_cycles(); ring = cbs->ring; mp = cbs->mp; for (i = 0; i < nb_pkts; i++) { - p = rte_pktmbuf_copy(pkts[i], mp, 0, UINT32_MAX); - if (p) + /* + * Similar behavior to rte_bpf_eth callback. + * if BPF program returns zero value for a given packet, + * then it will be ignored. + */ + if (cbs->filter && rcs[i] == 0) { + __atomic_fetch_add(&stats->filtered, + 1, __ATOMIC_RELAXED); + continue; + } + + /* + * If using pcapng then want to wrap packets + * otherwise a simple copy. 
+ */ + if (cbs->ver == PDUMP_CLIENT_PCAPNG) + p = rte_pcapng_copy(port_id, queue, + pkts[i], mp, cbs->snaplen, + ts, direction); + else + p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen); + + if (unlikely(p == NULL)) + __atomic_fetch_add(&stats->nombuf, 1, __ATOMIC_RELAXED); + else dup_bufs[d_pkts++] = p; } + __atomic_fetch_add(&stats->accepted, d_pkts, __ATOMIC_RELAXED); + ring_enq = rte_ring_enqueue_burst(ring, (void *)dup_bufs, d_pkts, NULL); if (unlikely(ring_enq < d_pkts)) { unsigned int drops = d_pkts - ring_enq; - PDUMP_LOG(DEBUG, - "only %d of packets enqueued to ring\n", ring_enq); + __atomic_fetch_add(&stats->ringfull, drops, __ATOMIC_RELAXED); rte_pktmbuf_free_bulk(&dup_bufs[ring_enq], drops); } } static uint16_t -pdump_rx(uint16_t port __rte_unused, uint16_t qidx __rte_unused, +pdump_rx(uint16_t port, uint16_t queue, struct rte_mbuf **pkts, uint16_t nb_pkts, - uint16_t max_pkts __rte_unused, - void *user_params) + uint16_t max_pkts __rte_unused, void *user_params) { - pdump_copy(pkts, nb_pkts, user_params); + const struct pdump_rxtx_cbs *cbs = user_params; + struct rte_pdump_stats *stats = &pdump_stats->rx[port][queue]; + + pdump_copy(port, queue, RTE_PCAPNG_DIRECTION_IN, + pkts, nb_pkts, cbs, stats); return nb_pkts; } static uint16_t -pdump_tx(uint16_t port __rte_unused, uint16_t qidx __rte_unused, +pdump_tx(uint16_t port, uint16_t queue, struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params) { - pdump_copy(pkts, nb_pkts, user_params); + const struct pdump_rxtx_cbs *cbs = user_params; + struct rte_pdump_stats *stats = &pdump_stats->tx[port][queue]; + + pdump_copy(port, queue, RTE_PCAPNG_DIRECTION_OUT, + pkts, nb_pkts, cbs, stats); return nb_pkts; } static int -pdump_register_rx_callbacks(uint16_t end_q, uint16_t port, uint16_t queue, - struct rte_ring *ring, struct rte_mempool *mp, - uint16_t operation) +pdump_register_rx_callbacks(enum pdump_version ver, + uint16_t end_q, uint16_t port, uint16_t queue, + struct rte_ring *ring, struct 
rte_mempool *mp, + struct rte_bpf *filter, + uint16_t operation, uint32_t snaplen) { uint16_t qid; - struct pdump_rxtx_cbs *cbs = NULL; qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 0 : queue; for (; qid < end_q; qid++) { - cbs = &rx_cbs[port][qid]; - if (cbs && operation == ENABLE) { + struct pdump_rxtx_cbs *cbs = &rx_cbs[port][qid]; + + if (operation == ENABLE) { if (cbs->cb) { PDUMP_LOG(ERR, "rx callback for port=%d queue=%d, already exists\n", port, qid); return -EEXIST; } + cbs->ver = ver; cbs->ring = ring; cbs->mp = mp; + cbs->snaplen = snaplen; + cbs->filter = filter; + cbs->cb = rte_eth_add_first_rx_callback(port, qid, pdump_rx, cbs); if (cbs->cb == NULL) { @@ -145,8 +203,7 @@ pdump_register_rx_callbacks(uint16_t end_q, uint16_t port, uint16_t queue, rte_errno); return rte_errno; } - } - if (cbs && operation == DISABLE) { + } else if (operation == DISABLE) { int ret; if (cbs->cb == NULL) { @@ -170,26 +227,32 @@ pdump_register_rx_callbacks(uint16_t end_q, uint16_t port, uint16_t queue, } static int -pdump_register_tx_callbacks(uint16_t end_q, uint16_t port, uint16_t queue, - struct rte_ring *ring, struct rte_mempool *mp, - uint16_t operation) +pdump_register_tx_callbacks(enum pdump_version ver, + uint16_t end_q, uint16_t port, uint16_t queue, + struct rte_ring *ring, struct rte_mempool *mp, + struct rte_bpf *filter, + uint16_t operation, uint32_t snaplen) { uint16_t qid; - struct pdump_rxtx_cbs *cbs = NULL; qid = (queue == RTE_PDUMP_ALL_QUEUES) ? 
0 : queue; for (; qid < end_q; qid++) { - cbs = &tx_cbs[port][qid]; - if (cbs && operation == ENABLE) { + struct pdump_rxtx_cbs *cbs = &tx_cbs[port][qid]; + + if (operation == ENABLE) { if (cbs->cb) { PDUMP_LOG(ERR, "tx callback for port=%d queue=%d, already exists\n", port, qid); return -EEXIST; } + cbs->ver = ver; cbs->ring = ring; cbs->mp = mp; + cbs->snaplen = snaplen; + cbs->filter = filter; + cbs->cb = rte_eth_add_tx_callback(port, qid, pdump_tx, cbs); if (cbs->cb == NULL) { @@ -198,8 +261,7 @@ pdump_register_tx_callbacks(uint16_t end_q, uint16_t port, uint16_t queue, rte_errno); return rte_errno; } - } - if (cbs && operation == DISABLE) { + } else if (operation == DISABLE) { int ret; if (cbs->cb == NULL) { @@ -228,37 +290,47 @@ set_pdump_rxtx_cbs(const struct pdump_request *p) uint16_t nb_rx_q = 0, nb_tx_q = 0, end_q, queue; uint16_t port; int ret = 0; + struct rte_bpf *filter = NULL; uint32_t flags; uint16_t operation; struct rte_ring *ring; struct rte_mempool *mp; - flags = p->flags; - operation = p->op; - if (operation == ENABLE) { - ret = rte_eth_dev_get_port_by_name(p->data.en_v1.device, - &port); - if (ret < 0) { + if (!(p->ver == PDUMP_CLIENT_LEGACY || + p->ver == PDUMP_CLIENT_PCAPNG)) { + PDUMP_LOG(ERR, + "incorrect client version %u\n", p->ver); + return -EINVAL; + } + + if (p->prm) { + if (p->prm->prog_arg.type != RTE_BPF_ARG_PTR_MBUF) { PDUMP_LOG(ERR, - "failed to get port id for device id=%s\n", - p->data.en_v1.device); + "invalid BPF program type: %u\n", + p->prm->prog_arg.type); return -EINVAL; } - queue = p->data.en_v1.queue; - ring = p->data.en_v1.ring; - mp = p->data.en_v1.mp; - } else { - ret = rte_eth_dev_get_port_by_name(p->data.dis_v1.device, - &port); - if (ret < 0) { - PDUMP_LOG(ERR, - "failed to get port id for device id=%s\n", - p->data.dis_v1.device); - return -EINVAL; + + filter = rte_bpf_load(p->prm); + if (filter == NULL) { + PDUMP_LOG(ERR, "cannot load BPF filter: %s\n", + rte_strerror(rte_errno)); + return -rte_errno; } - queue 
= p->data.dis_v1.queue; - ring = p->data.dis_v1.ring; - mp = p->data.dis_v1.mp; + } + + flags = p->flags; + operation = p->op; + queue = p->queue; + ring = p->ring; + mp = p->mp; + + ret = rte_eth_dev_get_port_by_name(p->device, &port); + if (ret < 0) { + PDUMP_LOG(ERR, + "failed to get port id for device id=%s\n", + p->device); + return -EINVAL; } /* validation if packet capture is for all queues */ @@ -296,8 +368,9 @@ set_pdump_rxtx_cbs(const struct pdump_request *p) /* register RX callback */ if (flags & RTE_PDUMP_FLAG_RX) { end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_rx_q : queue + 1; - ret = pdump_register_rx_callbacks(end_q, port, queue, ring, mp, - operation); + ret = pdump_register_rx_callbacks(p->ver, end_q, port, queue, + ring, mp, filter, + operation, p->snaplen); if (ret < 0) return ret; } @@ -305,8 +378,9 @@ set_pdump_rxtx_cbs(const struct pdump_request *p) /* register TX callback */ if (flags & RTE_PDUMP_FLAG_TX) { end_q = (queue == RTE_PDUMP_ALL_QUEUES) ? nb_tx_q : queue + 1; - ret = pdump_register_tx_callbacks(end_q, port, queue, ring, mp, - operation); + ret = pdump_register_tx_callbacks(p->ver, end_q, port, queue, + ring, mp, filter, + operation, p->snaplen); if (ret < 0) return ret; } @@ -347,8 +421,18 @@ pdump_server(const struct rte_mp_msg *mp_msg, const void *peer) int rte_pdump_init(void) { + const struct rte_memzone *mz; int ret; + mz = rte_memzone_reserve(MZ_RTE_PDUMP_STATS, sizeof(*pdump_stats), + rte_socket_id(), 0); + if (mz == NULL) { + PDUMP_LOG(ERR, "cannot allocate pdump statistics\n"); + rte_errno = ENOMEM; + return -1; + } + pdump_stats = mz->addr; + ret = rte_mp_action_register(PDUMP_MP, pdump_server); if (ret && rte_errno != ENOTSUP) return -1; @@ -392,14 +476,21 @@ pdump_validate_ring_mp(struct rte_ring *ring, struct rte_mempool *mp) static int pdump_validate_flags(uint32_t flags) { - if (flags != RTE_PDUMP_FLAG_RX && flags != RTE_PDUMP_FLAG_TX && - flags != RTE_PDUMP_FLAG_RXTX) { + if ((flags & RTE_PDUMP_FLAG_RXTX) == 0) { 
PDUMP_LOG(ERR, "invalid flags, should be either rx/tx/rxtx\n"); rte_errno = EINVAL; return -1; } + /* mask off the flags we know about */ + if (flags & ~(RTE_PDUMP_FLAG_RXTX | RTE_PDUMP_FLAG_PCAPNG)) { + PDUMP_LOG(ERR, + "unknown flags: %#x\n", flags); + rte_errno = ENOTSUP; + return -1; + } + return 0; } @@ -426,12 +517,12 @@ pdump_validate_port(uint16_t port, char *name) } static int -pdump_prepare_client_request(char *device, uint16_t queue, - uint32_t flags, - uint16_t operation, - struct rte_ring *ring, - struct rte_mempool *mp, - void *filter) +pdump_prepare_client_request(const char *device, uint16_t queue, + uint32_t flags, uint32_t snaplen, + uint16_t operation, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *prm) { int ret = -1; struct rte_mp_msg mp_req, *mp_rep; @@ -440,23 +531,23 @@ pdump_prepare_client_request(char *device, uint16_t queue, struct pdump_request *req = (struct pdump_request *)mp_req.param; struct pdump_response *resp; - req->ver = 1; - req->flags = flags; + memset(req, 0, sizeof(*req)); + if (flags & RTE_PDUMP_FLAG_PCAPNG) + req->ver = PDUMP_CLIENT_PCAPNG; + else + req->ver = PDUMP_CLIENT_LEGACY; + + req->flags = flags & RTE_PDUMP_FLAG_RXTX; req->op = operation; + req->queue = queue; + strlcpy(req->device, device, sizeof(req->device)); + if ((operation & ENABLE) != 0) { - strlcpy(req->data.en_v1.device, device, - sizeof(req->data.en_v1.device)); - req->data.en_v1.queue = queue; - req->data.en_v1.ring = ring; - req->data.en_v1.mp = mp; - req->data.en_v1.filter = filter; - } else { - strlcpy(req->data.dis_v1.device, device, - sizeof(req->data.dis_v1.device)); - req->data.dis_v1.queue = queue; - req->data.dis_v1.ring = NULL; - req->data.dis_v1.mp = NULL; - req->data.dis_v1.filter = NULL; + req->queue = queue; + req->ring = ring; + req->mp = mp; + req->prm = prm; + req->snaplen = snaplen; } strlcpy(mp_req.name, PDUMP_MP, RTE_MP_MAX_NAME_LEN); @@ -477,11 +568,17 @@ pdump_prepare_client_request(char *device, 
uint16_t queue, return ret; } -int -rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags, - struct rte_ring *ring, - struct rte_mempool *mp, - void *filter) +/* + * There are two versions of this function, because although original API + * left place holder for future filter, it never checked the value. + * Therefore the API can't depend on application passing a non + * bogus value. + */ +static int +pdump_enable(uint16_t port, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, struct rte_mempool *mp, + const struct rte_bpf_prm *prm) { int ret; char name[RTE_DEV_NAME_MAX_LEN]; @@ -496,20 +593,42 @@ rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags, if (ret < 0) return ret; - ret = pdump_prepare_client_request(name, queue, flags, - ENABLE, ring, mp, filter); + if (snaplen == 0) + snaplen = UINT32_MAX; - return ret; + return pdump_prepare_client_request(name, queue, flags, snaplen, + ENABLE, ring, mp, prm); } int -rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue, - uint32_t flags, - struct rte_ring *ring, - struct rte_mempool *mp, - void *filter) +rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags, + struct rte_ring *ring, + struct rte_mempool *mp, + void *filter __rte_unused) { - int ret = 0; + return pdump_enable(port, queue, flags, 0, + ring, mp, NULL); +} + +int +rte_pdump_enable_bpf(uint16_t port, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *prm) +{ + return pdump_enable(port, queue, flags, snaplen, + ring, mp, prm); +} + +static int +pdump_enable_by_deviceid(const char *device_id, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *prm) +{ + int ret; ret = pdump_validate_ring_mp(ring, mp); if (ret < 0) @@ -518,10 +637,30 @@ rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue, if (ret < 0) return ret; - ret = 
pdump_prepare_client_request(device_id, queue, flags, - ENABLE, ring, mp, filter); + return pdump_prepare_client_request(device_id, queue, flags, snaplen, + ENABLE, ring, mp, prm); +} - return ret; +int +rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue, + uint32_t flags, + struct rte_ring *ring, + struct rte_mempool *mp, + void *filter __rte_unused) +{ + return pdump_enable_by_deviceid(device_id, queue, flags, 0, + ring, mp, NULL); +} + +int +rte_pdump_enable_bpf_by_deviceid(const char *device_id, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *prm) +{ + return pdump_enable_by_deviceid(device_id, queue, flags, snaplen, + ring, mp, prm); } int @@ -537,8 +676,8 @@ rte_pdump_disable(uint16_t port, uint16_t queue, uint32_t flags) if (ret < 0) return ret; - ret = pdump_prepare_client_request(name, queue, flags, - DISABLE, NULL, NULL, NULL); + ret = pdump_prepare_client_request(name, queue, flags, 0, + DISABLE, NULL, NULL, NULL); return ret; } @@ -553,8 +692,66 @@ rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue, if (ret < 0) return ret; - ret = pdump_prepare_client_request(device_id, queue, flags, - DISABLE, NULL, NULL, NULL); + ret = pdump_prepare_client_request(device_id, queue, flags, 0, + DISABLE, NULL, NULL, NULL); return ret; } + +static void +pdump_sum_stats(uint16_t port, uint16_t nq, + struct rte_pdump_stats stats[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT], + struct rte_pdump_stats *total) +{ + uint64_t *sum = (uint64_t *)total; + unsigned int i; + uint64_t val; + uint16_t qid; + + for (qid = 0; qid < nq; qid++) { + const uint64_t *perq = (const uint64_t *)&stats[port][qid]; + + for (i = 0; i < sizeof(*total) / sizeof(uint64_t); i++) { + val = __atomic_load_n(&perq[i], __ATOMIC_RELAXED); + sum[i] += val; + } + } +} + +int +rte_pdump_stats(uint16_t port, struct rte_pdump_stats *stats) +{ + struct rte_eth_dev_info dev_info; + const struct rte_memzone *mz; + 
int ret; + + memset(stats, 0, sizeof(*stats)); + ret = rte_eth_dev_info_get(port, &dev_info); + if (ret != 0) { + PDUMP_LOG(ERR, + "Error during getting device (port %u) info: %s\n", + port, strerror(-ret)); + return ret; + } + + if (pdump_stats == NULL) { + if (rte_eal_process_type() == RTE_PROC_PRIMARY) { + PDUMP_LOG(ERR, + "pdump not initialized\n"); + rte_errno = EINVAL; + return -1; + } + + mz = rte_memzone_lookup(MZ_RTE_PDUMP_STATS); + if (mz == NULL) { + PDUMP_LOG(ERR, "can not find pdump stats\n"); + rte_errno = EINVAL; + return -1; + } + pdump_stats = mz->addr; + } + + pdump_sum_stats(port, dev_info.nb_rx_queues, pdump_stats->rx, stats); + pdump_sum_stats(port, dev_info.nb_tx_queues, pdump_stats->tx, stats); + return 0; +} diff --git a/lib/pdump/rte_pdump.h b/lib/pdump/rte_pdump.h index 6b00fc17aeb2..be3fd14c4bd3 100644 --- a/lib/pdump/rte_pdump.h +++ b/lib/pdump/rte_pdump.h @@ -15,6 +15,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -26,7 +27,9 @@ enum { RTE_PDUMP_FLAG_RX = 1, /* receive direction */ RTE_PDUMP_FLAG_TX = 2, /* transmit direction */ /* both receive and transmit directions */ - RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX) + RTE_PDUMP_FLAG_RXTX = (RTE_PDUMP_FLAG_RX|RTE_PDUMP_FLAG_TX), + + RTE_PDUMP_FLAG_PCAPNG = 4, /* format for pcapng */ }; /** @@ -68,7 +71,7 @@ rte_pdump_uninit(void); * @param mp * mempool on to which original packets will be mirrored or duplicated. * @param filter - * place holder for packet filtering. + * Unused should be NULL. * * @return * 0 on success, -1 on error, rte_errno is set accordingly. @@ -80,6 +83,41 @@ rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags, struct rte_mempool *mp, void *filter); +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * Enables packet capturing on given port and queue with filtering. + * + * @param port_id + * The Ethernet port on which packet capturing should be enabled. 
+ * @param queue + * The queue on the Ethernet port on which packet capturing + * should be enabled. Pass UINT16_MAX to enable packet capturing on all + * queues of a given port. + * @param flags + * Pdump library flags that specify direction and packet format. + * @param snaplen + * The upper limit on bytes to copy. + * Passing UINT32_MAX means capture all the possible data. + * @param ring + * The ring on which captured packets will be enqueued for user. + * @param mp + * The mempool on to which original packets will be mirrored or duplicated. + * @param prm + * BPF program parameters used to filter packets (can be NULL) + * + * @return + * 0 on success, -1 on error, rte_errno is set accordingly. + */ +__rte_experimental +int +rte_pdump_enable_bpf(uint16_t port_id, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *prm); + /** * Disables packet capturing on given port and queue. * @@ -118,7 +156,7 @@ rte_pdump_disable(uint16_t port, uint16_t queue, uint32_t flags); * @param mp * mempool on to which original packets will be mirrored or duplicated. * @param filter - * place holder for packet filtering. + * Unused; should be NULL. * * @return * 0 on success, -1 on error, rte_errno is set accordingly. @@ -131,6 +169,43 @@ rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue, struct rte_mempool *mp, void *filter); +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * Enables packet capturing on given device id and queue with filtering. + * device_id can be name or pci address of device. + * + * @param device_id + * device id on which packet capturing should be enabled. + * @param queue + * The queue on the Ethernet port on which packet capturing + * should be enabled. Pass UINT16_MAX to enable packet capturing on all + * queues of a given port. + * @param flags + * Pdump library flags that specify direction and packet format.
+ * @param snaplen + * The upper limit on bytes to copy. + * Passing UINT32_MAX means capture all the possible data. + * @param ring + * The ring on which captured packets will be enqueued for user. + * @param mp + * The mempool on to which original packets will be mirrored or duplicated. + * @param filter + * BPF program parameters used to filter packets (can be NULL) + * + * @return + * 0 on success, -1 on error, rte_errno is set accordingly. + */ +__rte_experimental +int +rte_pdump_enable_bpf_by_deviceid(const char *device_id, uint16_t queue, + uint32_t flags, uint32_t snaplen, + struct rte_ring *ring, + struct rte_mempool *mp, + const struct rte_bpf_prm *filter); + + /** * Disables packet capturing on given device_id and queue. * device_id can be name or pci address of device. @@ -153,6 +228,35 @@ int rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue, uint32_t flags); + +/** + * A structure used to retrieve statistics from packet capture. + * The statistics are the sum of both receive and transmit queues. + */ +struct rte_pdump_stats { + uint64_t accepted; /**< Number of packets accepted by filter. */ + uint64_t filtered; /**< Number of packets rejected by filter. */ + uint64_t nombuf; /**< Number of mbuf allocation failures. */ + uint64_t ringfull; /**< Number of missed packets due to ring full. */ + + uint64_t reserved[4]; /**< Reserved and pad to cache line */ +}; + +/** + * Retrieve the packet capture statistics for a port. + * + * @param port_id + * The port identifier of the Ethernet device. + * @param stats + * A pointer to structure of type *rte_pdump_stats* to be filled in. + * @return + * Zero if successful. -1 on error and rte_errno is set.
+ */ +__rte_experimental +int +rte_pdump_stats(uint16_t port_id, struct rte_pdump_stats *stats); + + #ifdef __cplusplus } #endif diff --git a/lib/pdump/version.map b/lib/pdump/version.map index f0a9d12c9a9e..ce5502d9cdf4 100644 --- a/lib/pdump/version.map +++ b/lib/pdump/version.map @@ -10,3 +10,11 @@ DPDK_22 { local: *; }; + +EXPERIMENTAL { + global: + + rte_pdump_enable_bpf; + rte_pdump_enable_bpf_by_deviceid; + rte_pdump_stats; +}; From patchwork Fri Sep 10 18:18:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98679 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D1C2AA0547; Fri, 10 Sep 2021 20:19:33 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1775941167; Fri, 10 Sep 2021 20:19:00 +0200 (CEST) Received: from mail-pg1-f172.google.com (mail-pg1-f172.google.com [209.85.215.172]) by mails.dpdk.org (Postfix) with ESMTP id A05D340DFD for ; Fri, 10 Sep 2021 20:18:54 +0200 (CEST) Received: by mail-pg1-f172.google.com with SMTP id w8so2575188pgf.5 for ; Fri, 10 Sep 2021 11:18:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SAmiFwWrp5bJuD9m8yr4+wGC+OG+L3x3whbYE1ZuLag=; b=VEZeu3AALR4tQtLs1yfuMYjZgk0mJOYgmbkiuOhp1dT06+ITufdIDuvGmut6kN+oHw e0+j1+t87nYQwALtTIuGwT6vGh3FfVlFwjzfT1ZiYNGJxX7mM/ItFjACnR/nVf/rIlRP +FQ0Uy0BbezR1O7exDOmfy0u4fH1cm4hptwYFyh6PMk1cHI+Q9IajLuRpNwow1EyOHJi 5OSTNYsi7ZmlWBmN2Kr929zTPta8uBZcTafS3eVDUOPtn9JJ8df91RXNNrPfb1qimxav vxigcAhryZw8kZQo3nnPFxGs/lSXde3Llbe1xNCKzkJoMATricEgiR8C7U5jbCs+j59l HYFA== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SAmiFwWrp5bJuD9m8yr4+wGC+OG+L3x3whbYE1ZuLag=; b=4n8BvTQCdRWe1qvy58YkolOGJ+CEUReG8ReJkrSsuErUJD1KbqwIlbyumo1kq4akrG ylfnQCIKMd+BeuHvf6H/1fYdV58XowuDfWvpYkqfnNTrRdp5pG9NhBXly2EUOH7IghE0 pec+bWuo4Q4i6uVIw9ZQGcNTpv+TCDSdhepn9WgIbVs+id8dvTyUpPdhPFnPezGQtByB 2tqSf2XiMvKyhv6Wz3O3lJ11vtpxnGwCSTi5etYZsHSrCZ7vd9TDhlF/lBACD2qkYaSp Qg3rUP8xZ0E5/GHKjCkQV9TjVfA9UbZI5Hx4WcNdbrh39iAxqHhlCsHs61rVYNIOfSQU MptQ== X-Gm-Message-State: AOAM532OlET4wzWWFa/dcfmWmq7HV85BptCR9h7MWYjn6VOJbS7WpVBl 7Gsnifsxgq2t+Bqbln9eSE/00cITsEffRQ== X-Google-Smtp-Source: ABdhPJyrvvIlqgIdhQy6fHEqR/BWtbeapscVgO8TcuFsLnoOlkQX9ZjgFH9rIF4Hs5D7lGfZKJ8sCw== X-Received: by 2002:a63:7a05:: with SMTP id v5mr8365554pgc.387.1631297932884; Fri, 10 Sep 2021 11:18:52 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:52 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:37 -0700 Message-Id: <20210910181841.530280-8-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 07/11] app/dumpcap: add new packet capture application X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This is a new packet capture application to replace existing pdump. 
The new application works like the Wireshark dumpcap program and supports the pdump API features. It is not complete yet; some features, such as filtering, are not implemented. Signed-off-by: Stephen Hemminger --- app/dumpcap/main.c | 844 ++++++++++++++++++++++++++++++++++++++++ app/dumpcap/meson.build | 16 + app/meson.build | 1 + 3 files changed, 861 insertions(+) create mode 100644 app/dumpcap/main.c create mode 100644 app/dumpcap/meson.build diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c new file mode 100644 index 000000000000..91a508e7af12 --- /dev/null +++ b/app/dumpcap/main.c @@ -0,0 +1,844 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019-2020 Microsoft Corporation + * + * DPDK application to dump network traffic + * This is designed to look and act like the Wireshark + * dumpcap program. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define RING_NAME "capture-ring" +#define MONITOR_INTERVAL (500 * 1000) +#define MBUF_POOL_CACHE_SIZE 32 +#define BURST_SIZE 32 +#define SLEEP_THRESHOLD 1000 + +/* command line flags */ +static const char *progname; +static bool quit_signal; +static bool group_read; +static bool quiet; +static bool promiscuous_mode = true; +static bool use_pcapng = true; +static char *output_name; +static const char *filter_str; +static unsigned int ring_size = 2048; +static const char *capture_comment; +static uint32_t snaplen = RTE_MBUF_DEFAULT_BUF_SIZE; +static bool dump_bpf; +static struct { + uint64_t duration; /* nanoseconds */ + unsigned long packets; /* number of packets in file */ + size_t size; /* file size (bytes) */ +} stop; + +/* Running state */ +static struct rte_bpf_prm *bpf_prm; +static uint64_t 
start_time, end_time; +static uint64_t packets_received; +static size_t file_size; + +struct interface { + TAILQ_ENTRY(interface) next; + uint16_t port; + char name[RTE_ETH_NAME_MAX_LEN]; + + struct rte_rxtx_callback *rx_cb[RTE_MAX_QUEUES_PER_PORT]; +}; + +TAILQ_HEAD(interface_list, interface); +static struct interface_list interfaces = TAILQ_HEAD_INITIALIZER(interfaces); +static struct interface *port2intf[RTE_MAX_ETHPORTS]; + +/* Can do either pcap or pcapng format output */ +typedef union { + rte_pcapng_t *pcapng; + pcap_dumper_t *dumper; +} dumpcap_out_t; + +static void usage(void) +{ + printf("Usage: %s [options] ...\n\n", progname); + printf("Capture Interface:\n" + " -i name or port index of interface\n" + " -f packet filter in libpcap filter syntax\n"); + printf(" -s , --snapshot-length \n" + " packet snapshot length (def: %u)\n", + RTE_MBUF_DEFAULT_BUF_SIZE); + printf(" -p, --no-promiscuous-mode\n" + " don't capture in promiscuous mode\n" + " -D, --list-interfaces print list of interfaces and exit\n" + " -d print generated BPF code for capture filter\n" + "\n" + "Stop conditions:\n" + " -c stop after n packets (def: infinite)\n" + " -a ..., --autostop ...\n" + " duration:NUM - stop after NUM seconds\n" + " filesize:NUM - stop this file after NUM kB\n" + " packets:NUM - stop after NUM packets\n" + "Output (files):\n" + " -w name of file to save (def: tempfile)\n" + " -g enable group read access on the output file(s)\n" + " -n use pcapng format instead of pcap (default)\n" + " -P use libpcap format instead of pcapng\n" + " --capture-comment \n" + " add a capture comment to the output file\n" + "\n" + "Miscellaneous:\n" + " -q don't report packet capture counts\n" + " -v, --version print version information and exit\n" + " -h, --help display this help and exit\n" + "\n" + "Use Ctrl-C to stop capturing at any time.\n"); +} + +static const char *version(void) +{ + static char str[128]; + + snprintf(str, sizeof(str), + "%s 1.0 (%s)\n", progname, rte_version()); 
+ return str; +} + +/* Parse numeric argument from command line */ +static unsigned long get_uint(const char *arg, const char *name, + unsigned int limit) +{ + unsigned long u; + char *endp; + + u = strtoul(arg, &endp, 0); + if (*arg == '\0' || *endp != '\0') + rte_exit(EXIT_FAILURE, + "Specified %s \"%s\" is not a valid number\n", + name, arg); + if (limit && u > limit) + rte_exit(EXIT_FAILURE, + "Specified %s \"%s\" is too large (greater than %u)\n", + name, arg, limit); + + return u; +} + +/* Set auto stop values */ +static void auto_stop(char *opt) +{ + char *value, *endp; + + value = strchr(opt, ':'); + if (value == NULL) + rte_exit(EXIT_FAILURE, + "Missing colon in auto stop parameter\n"); + + *value++ = '\0'; + if (strcmp(opt, "duration") == 0) { + double interval = strtod(value, &endp); + + if (*value == '\0' || *endp != '\0' || interval <= 0) + rte_exit(EXIT_FAILURE, + "Invalid duration \"%s\"\n", value); + stop.duration = NSEC_PER_SEC * interval; + } else if (strcmp(opt, "filesize") == 0) { + stop.size = get_uint(value, "filesize", 0) * 1024; + } else if (strcmp(opt, "packets") == 0) { + stop.packets = get_uint(value, "packets", 0); + } else { + rte_exit(EXIT_FAILURE, + "Unknown autostop parameter \"%s\"\n", opt); + } +} + +/* Add interface to list of interfaces to capture */ +static void add_interface(uint16_t port, const char *name) +{ + struct interface *intf; + + intf = malloc(sizeof(*intf)); + if (!intf) + rte_exit(EXIT_FAILURE, "no memory for interface\n"); + + memset(intf, 0, sizeof(*intf)); + strlcpy(intf->name, name, sizeof(intf->name)); + + printf("Capturing on '%s'\n", name); + intf->port = port; + port2intf[port] = intf; + TAILQ_INSERT_TAIL(&interfaces, intf, next); +} + +/* Select all valid DPDK interfaces */ +static void select_all_interfaces(void) +{ + char name[RTE_ETH_NAME_MAX_LEN]; + uint16_t p; + + RTE_ETH_FOREACH_DEV(p) { + if (rte_eth_dev_get_name_by_port(p, name) < 0) + continue; + add_interface(p, name); + } +} + +/* + * Choose interface to capture if 
no -i option given. + * Select the first DPDK port; this matches what dumpcap does. + */ +static void set_default_interface(void) +{ + char name[RTE_ETH_NAME_MAX_LEN]; + uint16_t p; + + RTE_ETH_FOREACH_DEV(p) { + if (rte_eth_dev_get_name_by_port(p, name) < 0) + continue; + add_interface(p, name); + return; + } + rte_exit(EXIT_FAILURE, "No usable interfaces found\n"); +} + +/* Lookup interface by name or port and add it to the list */ +static void select_interface(const char *arg) +{ + uint16_t port; + + if (strcmp(arg, "*") == 0) + select_all_interfaces(); + else if (rte_eth_dev_get_port_by_name(arg, &port) == 0) + add_interface(port, arg); + else { + char name[RTE_ETH_NAME_MAX_LEN]; + + port = get_uint(arg, "port_number", UINT16_MAX); + if (rte_eth_dev_get_name_by_port(port, name) < 0) + rte_exit(EXIT_FAILURE, "Invalid port number %u\n", + port); + add_interface(port, name); + } +} + +/* Display list of possible interfaces that can be used. */ +static void show_interfaces(void) +{ + char name[RTE_ETH_NAME_MAX_LEN]; + uint16_t p; + + RTE_ETH_FOREACH_DEV(p) { + if (rte_eth_dev_get_name_by_port(p, name) < 0) + continue; + printf("%u. %s\n", p, name); + } +} + +static void compile_filter(void) +{ + struct bpf_program bf; + pcap_t *pcap; + + pcap = pcap_open_dead(DLT_EN10MB, snaplen); + if (!pcap) + rte_exit(EXIT_FAILURE, "can not open pcap\n"); + + if (pcap_compile(pcap, &bf, filter_str, + 1, PCAP_NETMASK_UNKNOWN) != 0) + rte_exit(EXIT_FAILURE, "pcap filter string not valid (%s)\n", + pcap_geterr(pcap)); + + bpf_prm = rte_bpf_convert(&bf); + if (bpf_prm == NULL) + rte_exit(EXIT_FAILURE, + "bpf convert failed\n"); + + if (dump_bpf) { + printf("cBPF program (%u insns)\n", bf.bf_len); + bpf_dump(&bf, 1); + printf("\neBPF program (%u insns)\n", bpf_prm->nb_ins); + rte_bpf_dump(stdout, bpf_prm->ins, bpf_prm->nb_ins); + exit(0); + } + + /* Don't care about original program any more */ + pcap_freecode(&bf); + pcap_close(pcap); +} + +/* + * Parse command line options. 
+ * These are chosen to be similar to the dumpcap command. + */ +static void parse_opts(int argc, char **argv) +{ + static const struct option long_options[] = { + { "autostop", required_argument, NULL, 'a' }, + { "capture-comment", required_argument, NULL, 0 }, + { "help", no_argument, NULL, 'h' }, + { "interface", required_argument, NULL, 'i' }, + { "list-interfaces", no_argument, NULL, 'D' }, + { "no-promiscuous-mode", no_argument, NULL, 'p' }, + { "output-file", required_argument, NULL, 'w' }, + { "ring-buffer", required_argument, NULL, 'b' }, + { "snapshot-length", required_argument, NULL, 's' }, + { "version", no_argument, NULL, 'v' }, + { NULL }, + }; + int option_index, c; + + for (;;) { + c = getopt_long(argc, argv, "a:b:c:dDf:ghi:nN:pPqs:vw:", + long_options, &option_index); + if (c == -1) + break; + + switch (c) { + case 0: + switch (option_index) { + case 1: /* "capture-comment" */ + capture_comment = optarg; + break; + default: + usage(); + exit(1); + } + break; + case 'a': + auto_stop(optarg); + break; + case 'b': + rte_exit(EXIT_FAILURE, + "multiple files not implemented\n"); + break; + case 'c': + stop.packets = get_uint(optarg, "packet_count", 0); + break; + case 'd': + dump_bpf = true; + break; + case 'D': + show_interfaces(); + exit(0); + case 'f': + filter_str = optarg; + break; + case 'g': + group_read = true; + break; + case 'h': + printf("%s\n\n", version()); + usage(); + exit(0); + case 'i': + select_interface(optarg); + break; + case 'n': + use_pcapng = true; + break; + case 'N': + ring_size = get_uint(optarg, "packet_limit", 0); + break; + case 'p': + promiscuous_mode = false; + break; + case 'P': + use_pcapng = false; + break; + case 'q': + quiet = true; + break; + case 's': + snaplen = get_uint(optarg, "snap_len", 0); + break; + case 'w': + output_name = optarg; + break; + case 'v': + printf("%s\n", version()); + exit(0); + default: + fprintf(stderr, "Invalid option: %s\n", + argv[optind - 1]); + usage(); + exit(1); + } + } +} + +static void +signal_handler(int 
sig_num __rte_unused) +{ + __atomic_store_n(&quit_signal, true, __ATOMIC_RELAXED); +} + +/* Return the time since 1/1/1970 in nanoseconds */ +static uint64_t create_timestamp(void) +{ + struct timespec now; + + clock_gettime(CLOCK_MONOTONIC, &now); + return rte_timespec_to_ns(&now); +} + +static void +cleanup_pdump_resources(void) +{ + struct interface *intf; + + TAILQ_FOREACH(intf, &interfaces, next) { + rte_pdump_disable(intf->port, + RTE_PDUMP_ALL_QUEUES, RTE_PDUMP_FLAG_RXTX); + if (promiscuous_mode) + rte_eth_promiscuous_disable(intf->port); + } +} + +/* Alarm handler, used to check that the primary process is still alive */ +static void +monitor_primary(void *arg __rte_unused) +{ + if (__atomic_load_n(&quit_signal, __ATOMIC_RELAXED)) + return; + + if (rte_eal_primary_proc_alive(NULL)) { + rte_eal_alarm_set(MONITOR_INTERVAL, monitor_primary, NULL); + } else { + fprintf(stderr, + "Primary process is no longer active, exiting...\n"); + __atomic_store_n(&quit_signal, true, __ATOMIC_RELAXED); + } +} + +/* Setup handler to check when primary exits. */ +static void +enable_primary_monitor(void) +{ + int ret; + + /* Once primary exits, so will pdump. 
*/ + ret = rte_eal_alarm_set(MONITOR_INTERVAL, monitor_primary, NULL); + if (ret < 0) + fprintf(stderr, "Fail to enable monitor:%d\n", ret); +} + +static void +disable_primary_monitor(void) +{ + int ret; + + ret = rte_eal_alarm_cancel(monitor_primary, NULL); + if (ret < 0) + fprintf(stderr, "Fail to disable monitor:%d\n", ret); +} + +static void +report_packet_stats(dumpcap_out_t out) +{ + struct rte_pdump_stats pdump_stats; + struct interface *intf; + uint64_t ifrecv, ifdrop; + double percent; + + fputc('\n', stderr); + TAILQ_FOREACH(intf, &interfaces, next) { + if (rte_pdump_stats(intf->port, &pdump_stats) < 0) + continue; + + /* do what Wiretap does */ + ifrecv = pdump_stats.accepted + pdump_stats.filtered; + ifdrop = pdump_stats.nombuf + pdump_stats.ringfull; + + if (use_pcapng) + rte_pcapng_write_stats(out.pcapng, intf->port, NULL, + start_time, end_time, + ifrecv, ifdrop); + + if (ifrecv == 0) + percent = 0; + else + percent = 100. * ifrecv / (ifrecv + ifdrop); + + fprintf(stderr, + "Packets received/dropped on interface '%s': " + "%"PRIu64 "/%" PRIu64 " (%.1f)\n", + intf->name, ifrecv, ifdrop, percent); + } +} + +/* + * Start DPDK EAL with arguments. + * Unlike most DPDK programs, this application does not use the + * typical EAL command line arguments. + * We don't want to expose all the DPDK internals to the user. + */ +static void dpdk_init(void) +{ + static const char * const args[] = { + "dumpcap", "--proc-type", "secondary", + "--log-level", "notice" + + }; + const int eal_argc = RTE_DIM(args); + char **eal_argv; + unsigned int i; + + /* DPDK API requires mutable versions of command line arguments. 
*/ + eal_argv = calloc(eal_argc + 1, sizeof(char *)); + if (eal_argv == NULL) + rte_panic("No memory\n"); + + eal_argv[0] = strdup(progname); + for (i = 1; i < RTE_DIM(args); i++) + eal_argv[i] = strdup(args[i]); + + if (rte_eal_init(eal_argc, eal_argv) < 0) + rte_exit(EXIT_FAILURE, "EAL init failed: is primary process running?\n"); + + if (rte_eth_dev_count_avail() == 0) + rte_exit(EXIT_FAILURE, "No Ethernet ports found\n"); +} + +/* Create packet ring shared between callbacks and process */ +static struct rte_ring *create_ring(void) +{ + struct rte_ring *ring; + size_t size, log2; + + /* Find next power of 2 >= size. */ + size = ring_size; + log2 = sizeof(size) * 8 - __builtin_clzl(size - 1); + size = 1u << log2; + + if (size != ring_size) { + fprintf(stderr, "Ring size %u rounded up to %zu\n", + ring_size, size); + ring_size = size; + } + + ring = rte_ring_lookup(RING_NAME); + if (ring == NULL) { + ring = rte_ring_create(RING_NAME, ring_size, + rte_socket_id(), 0); + if (ring == NULL) + rte_exit(EXIT_FAILURE, "Could not create ring :%s\n", + rte_strerror(rte_errno)); + } + return ring; +} + +static struct rte_mempool *create_mempool(void) +{ + static const char pool_name[] = "capture_mbufs"; + size_t num_mbufs = 2 * ring_size; + struct rte_mempool *mp; + + mp = rte_mempool_lookup(pool_name); + if (mp) + return mp; + + mp = rte_pktmbuf_pool_create_by_ops(pool_name, num_mbufs, + MBUF_POOL_CACHE_SIZE, 0, + rte_pcapng_mbuf_size(snaplen), + rte_socket_id(), "ring_mp_sc"); + if (mp == NULL) + rte_exit(EXIT_FAILURE, + "Mempool (%s) creation failed: %s\n", pool_name, + rte_strerror(rte_errno)); + + return mp; +} + +/* + * Get Operating System information. + * Returns a string allocated via malloc(). 
+ */ +static char *get_os_info(void) +{ + struct utsname uts; + char *osname = NULL; + + if (uname(&uts) < 0) + return NULL; + + if (asprintf(&osname, "%s %s", + uts.sysname, uts.release) == -1) + return NULL; + + return osname; +} + +static dumpcap_out_t create_output(void) +{ + dumpcap_out_t ret; + static char tmp_path[PATH_MAX]; + int fd; + + /* If no filename specified make a tempfile name */ + if (output_name == NULL) { + struct interface *intf; + struct tm *tm; + time_t now; + char ts[32]; + + intf = TAILQ_FIRST(&interfaces); + now = time(NULL); + tm = localtime(&now); + if (!tm) + rte_panic("localtime failed\n"); + + strftime(ts, sizeof(ts), "%Y%m%d%H%M%S", tm); + + snprintf(tmp_path, sizeof(tmp_path), + "/tmp/%s_%u_%s_%s.%s", + progname, intf->port, intf->name, ts, + use_pcapng ? "pcapng" : "pcap"); + output_name = tmp_path; + } + + if (strcmp(output_name, "-") == 0) + fd = STDOUT_FILENO; + else { + mode_t mode = group_read ? 0640 : 0600; + + fd = open(output_name, O_WRONLY | O_CREAT, mode); + if (fd < 0) + rte_exit(EXIT_FAILURE, "Can not open \"%s\": %s\n", + output_name, strerror(errno)); + } + + if (use_pcapng) { + char *os = get_os_info(); + + ret.pcapng = rte_pcapng_fdopen(fd, os, NULL, + version(), capture_comment); + if (ret.pcapng == NULL) + rte_exit(EXIT_FAILURE, "pcapng_fdopen failed: %s\n", + strerror(rte_errno)); + free(os); + } else { + pcap_t *pcap; + + pcap = pcap_open_dead_with_tstamp_precision(DLT_EN10MB, snaplen, + PCAP_TSTAMP_PRECISION_NANO); + if (pcap == NULL) + rte_exit(EXIT_FAILURE, "pcap_open_dead failed\n"); + + ret.dumper = pcap_dump_fopen(pcap, fdopen(fd, "w")); + if (ret.dumper == NULL) + rte_exit(EXIT_FAILURE, "pcap_dump_fopen failed: %s\n", + pcap_geterr(pcap)); + } + + return ret; +} + +static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp) +{ + struct interface *intf; + uint32_t flags; + int ret; + + flags = RTE_PDUMP_FLAG_RXTX; + if (use_pcapng) + flags |= RTE_PDUMP_FLAG_PCAPNG; + + TAILQ_FOREACH(intf, 
&interfaces, next) { + if (promiscuous_mode) + rte_eth_promiscuous_enable(intf->port); + + ret = rte_pdump_enable_bpf(intf->port, RTE_PDUMP_ALL_QUEUES, + flags, snaplen, + r, mp, bpf_prm); + if (ret < 0) + rte_exit(EXIT_FAILURE, + "Packet dump enable failed: %s\n", + rte_strerror(-ret)); + } +} + +/* + * Show current count of captured packets + * with backspaces to overwrite last value. + */ +static void show_count(uint64_t count) +{ + unsigned int i; + static unsigned int bt; + + for (i = 0; i < bt; i++) + fputc('\b', stderr); + + bt = fprintf(stderr, "%"PRIu64" ", count); +} + +/* Write multiple packets in older pcap format */ +static ssize_t +pcap_write_packets(pcap_dumper_t *dumper, + struct rte_mbuf *pkts[], uint16_t n) +{ + uint8_t temp_data[snaplen]; + struct pcap_pkthdr header; + uint16_t i; + size_t total = 0; + + gettimeofday(&header.ts, NULL); + + for (i = 0; i < n; i++) { + struct rte_mbuf *m = pkts[i]; + + header.len = rte_pktmbuf_pkt_len(m); + header.caplen = RTE_MIN(header.len, snaplen); + + pcap_dump((u_char *)dumper, &header, + rte_pktmbuf_read(m, 0, header.caplen, temp_data)); + + total += sizeof(header) + header.len; + } + + return total; +} + +/* Process all packets in ring and dump to capture file */ +static int process_ring(dumpcap_out_t out, struct rte_ring *r) +{ + struct rte_mbuf *pkts[BURST_SIZE]; + unsigned int avail, n; + static unsigned int empty_count; + ssize_t written; + + n = rte_ring_sc_dequeue_burst(r, (void **) pkts, BURST_SIZE, + &avail); + if (n == 0) { + /* don't consume endless amounts of cpu if idle */ + if (empty_count < SLEEP_THRESHOLD) + ++empty_count; + else + usleep(10); + return 0; + } + + empty_count = (avail == 0); + + if (use_pcapng) + written = rte_pcapng_write_packets(out.pcapng, pkts, n); + else + written = pcap_write_packets(out.dumper, pkts, n); + + rte_pktmbuf_free_bulk(pkts, n); + + if (written < 0) + return -1; + + file_size += written; + packets_received += n; + if (!quiet) + show_count(packets_received); + 
+ return 0; +} + +int main(int argc, char **argv) +{ + struct rte_ring *r; + struct rte_mempool *mp; + dumpcap_out_t out; + + progname = argv[0]; + + dpdk_init(); + parse_opts(argc, argv); + + if (filter_str) + compile_filter(); + + if (TAILQ_EMPTY(&interfaces)) + set_default_interface(); + + r = create_ring(); + mp = create_mempool(); + out = create_output(); + + start_time = create_timestamp(); + enable_pdump(r, mp); + + signal(SIGINT, signal_handler); + signal(SIGPIPE, SIG_IGN); + + enable_primary_monitor(); + + if (!quiet) { + fprintf(stderr, "Packets captured: "); + show_count(0); + } + + while (!__atomic_load_n(&quit_signal, __ATOMIC_RELAXED)) { + if (process_ring(out, r) < 0) { + fprintf(stderr, "pcapng file write failed; %s\n", + strerror(errno)); + break; + } + + if (stop.size && file_size >= stop.size) + break; + + if (stop.packets && packets_received >= stop.packets) + break; + + if (stop.duration != 0 && + create_timestamp() - start_time > stop.duration) + break; + } + + end_time = create_timestamp(); + disable_primary_monitor(); + + if (rte_eal_primary_proc_alive(NULL)) + report_packet_stats(out); + + if (use_pcapng) + rte_pcapng_close(out.pcapng); + else + pcap_dump_close(out.dumper); + + cleanup_pdump_resources(); + rte_free(bpf_prm); + rte_ring_free(r); + rte_mempool_free(mp); + + return rte_eal_cleanup() ? 
EXIT_FAILURE : 0; +} diff --git a/app/dumpcap/meson.build b/app/dumpcap/meson.build new file mode 100644 index 000000000000..794336211eff --- /dev/null +++ b/app/dumpcap/meson.build @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2019 Microsoft Corporation + +if not dpdk_conf.has('RTE_PORT_PCAP') + build = false + reason = 'missing dependency, "libpcap"' +endif + +if is_windows + build = false + reason = 'not supported on Windows' + subdir_done() +endif + +sources = files('main.c') +deps += ['ethdev', 'pdump', 'pcapng', 'bpf'] diff --git a/app/meson.build b/app/meson.build index 4c6049807cc3..e41a2e390236 100644 --- a/app/meson.build +++ b/app/meson.build @@ -2,6 +2,7 @@ # Copyright(c) 2017-2019 Intel Corporation apps = [ + 'dumpcap', 'pdump', 'proc-info', 'test-acl', From patchwork Fri Sep 10 18:18:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98680 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 29094A0547; Fri, 10 Sep 2021 20:19:40 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 55FC54116B; Fri, 10 Sep 2021 20:19:01 +0200 (CEST) Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) by mails.dpdk.org (Postfix) with ESMTP id 2C16140DFD for ; Fri, 10 Sep 2021 20:18:55 +0200 (CEST) Received: by mail-pj1-f44.google.com with SMTP id j10-20020a17090a94ca00b00181f17b7ef7so2103045pjw.2 for ; Fri, 10 Sep 2021 11:18:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=PDdY4oBU1gcR2+3EtAxIcQ34wYOai6D2reU0A2iPtss=; b=FTjPY1yYjUvJzNalYCkyKPU+ruZf6bBhxZeKwGPOl+7aIGXp8sJitlYIer0lHfMLiz 9QmaCIamt7UcdXde+9vz3PTP/zCei99G8X0NvL3Zu+sner4CqhKtrAKVnzdDdBgc9B6I eA0/175rEUdTNNawqg0AUHbsbuKp6/9ut/PWlnIop7+jUmtA1/+/7jNWwbfuwDm4EnIW NgSMD+pWnCXzwwAqeIkXYS0lShc7/Ozs1RfdQgbDdMOR5HhIZsUewzKdjVHxIbPlm163 oveYGe/We6kuZpVql72o300dZd981RF+zfqismXMXh12wb+YuV5S3sisMkbDDGJXTEgw EkfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=PDdY4oBU1gcR2+3EtAxIcQ34wYOai6D2reU0A2iPtss=; b=yEMAT1ra25kYn1LjojvIm/w8HA/nsvwZTMklxsoaIlShCeBSf6KwLTaiEnwQ2DaJE7 tZlD32U6rbNBun0Vi7XhUwdi3PkzZ9eeOoZqoey9inTdqhxpcPygyKYjHR9DVZ+C5PGd N83FdFHGIvkFivmcu05kqkIoFByTnYzmO1q6OSLMOTJNApd2g9KeMWL9wkm0t+dT2xMs ogQwtAGEXhEwklPr67vUmdLjzLi98bRIWO3ePDRpGSMxdon6qnWy6ZlQ0407W8GFqLlE tQSVR9xYiEQaVIyWc5evynj0anGwOdUEfCo54OYhGjPELN9RX0nUjCXC4lNdxdjIvmaL cWmA== X-Gm-Message-State: AOAM5308CRFl4QhzQm2w98N+kLBZ79Ixn50JaB0lVXMzw+MB+eKrYqrn 4y51hvYAwRMZCIyWmVKcl2DtBC7yL0S9DA== X-Google-Smtp-Source: ABdhPJz1xcJRQ6tX9Kgg3G0FezIzq70SK962onCsSzVRaJHVuZCfwGi8Cn94qk63wpMJPSCjpkA6rg== X-Received: by 2002:a17:902:780f:b0:13a:3919:e365 with SMTP id p15-20020a170902780f00b0013a3919e365mr8683713pll.63.1631297933933; Fri, 10 Sep 2021 11:18:53 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. 
[204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:53 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:38 -0700 Message-Id: <20210910181841.530280-9-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 08/11] test: add test for bpf_convert X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add some functional tests for the Classic BPF to DPDK BPF converter. Signed-off-by: Stephen Hemminger --- app/test/test_bpf.c | 148 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 148 insertions(+) diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c index 527c06b80708..565b19653a77 100644 --- a/app/test/test_bpf.c +++ b/app/test/test_bpf.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -3233,3 +3234,150 @@ test_bpf(void) } REGISTER_TEST_COMMAND(bpf_autotest, test_bpf); + +#ifdef RTE_PORT_PCAP +#include + +static void +test_bpf_dump(struct bpf_program *cbf, const struct rte_bpf_prm *prm) +{ + printf("cBPF program (%u insns)\n", cbf->bf_len); + bpf_dump(cbf, 1); + + printf("\neBPF program (%u insns)\n", prm->nb_ins); + rte_bpf_dump(stdout, prm->ins, prm->nb_ins); +} + +static int +test_bpf_match(pcap_t *pcap, const char *str, bool expected) +{ + struct bpf_program fcode; + struct rte_bpf_prm *prm = NULL; + struct rte_bpf *bpf = NULL; + uint8_t tbuf[sizeof(struct dummy_mbuf)]; + int ret = -1; + uint64_t rc; + + if (pcap_compile(pcap, &fcode, str, 
1, PCAP_NETMASK_UNKNOWN)) { + printf("%s@%d: pcap_compile(\"%s\") failed: %s;\n", + __func__, __LINE__, str, pcap_geterr(pcap)); + return -1; + } + + prm = rte_bpf_convert(&fcode); + if (prm == NULL) { + printf("%s@%d: bpf_convert('%s') failed, error=%d(%s);\n", + __func__, __LINE__, str, rte_errno, strerror(rte_errno)); + goto error; + } + + bpf = rte_bpf_load(prm); + if (bpf == NULL) { + printf("%s@%d: failed to load bpf code, error=%d(%s);\n", + __func__, __LINE__, rte_errno, strerror(rte_errno)); + goto error; + } + + test_ld_mbuf1_prepare(tbuf); + rc = rte_bpf_exec(bpf, tbuf); + if ((rc == 0) == expected) + ret = 0; + else + printf("%s@%d: failed match: expect %s 0 got %"PRIu64"\n", + __func__, __LINE__, expected ? "==" : "<>", rc); +error: + if (bpf) + rte_bpf_destroy(bpf); + rte_free(prm); + pcap_freecode(&fcode); + return ret; +} + +/* Basic sanity test: can we match an IP packet */ +static int +test_bpf_filter_sanity(pcap_t *pcap) +{ + int ret; + + ret = test_bpf_match(pcap, "ip", true); + ret |= test_bpf_match(pcap, "not ip", false); + + return ret; +} + +/* Some sample pcap filter strings from tcpdump man page */ +static const char * const sample_filters[] = { + "host 192.168.1.100", + "src net 10", + "not stp", + "len = 128", + "ip host 1.1.1.1 and not 1.1.1.2", + "ip and not net 127.0.0.1", + "tcp[tcpflags] & (tcp-syn|tcp-fin) != 0 and not src and dst net 127.0.0.1", + "ether[0] & 1 = 0 and ip[16] >= 224", + "icmp[icmptype] != icmp-echo and icmp[icmptype] != icmp-echoreply", +}; + +static int +test_bpf_filter(pcap_t *pcap, const char *s) +{ + struct bpf_program fcode; + struct rte_bpf_prm *prm = NULL; + struct rte_bpf *bpf = NULL; + + if (pcap_compile(pcap, &fcode, s, 1, PCAP_NETMASK_UNKNOWN)) { + printf("%s@%d: pcap_compile('%s') failed: %s;\n", + __func__, __LINE__, s, pcap_geterr(pcap)); + return -1; + } + + prm = rte_bpf_convert(&fcode); + if (prm == NULL) { + printf("%s@%d: bpf_convert('%s') failed, error=%d(%s);\n", + __func__, __LINE__, s, 
rte_errno, strerror(rte_errno)); + goto error; + } + + bpf = rte_bpf_load(prm); + if (bpf == NULL) { + printf("%s@%d: failed to load bpf code, error=%d(%s);\n", + __func__, __LINE__, rte_errno, strerror(rte_errno)); + goto error; + } + +error: + if (bpf) + rte_bpf_destroy(bpf); + else { + printf("%s \"%s\"\n", __func__, s); + test_bpf_dump(&fcode, prm); + } + + rte_free(prm); + pcap_freecode(&fcode); + return (bpf == NULL) ? -1 : 0; +} + +static int +test_bpf_convert(void) +{ + unsigned int i; + pcap_t *pcap; + int rc; + + pcap = pcap_open_dead(DLT_EN10MB, 262144); + if (!pcap) { + printf("pcap_open_dead failed\n"); + return -1; + } + + rc = test_bpf_filter_sanity(pcap); + for (i = 0; i < RTE_DIM(sample_filters); i++) + rc |= test_bpf_filter(pcap, sample_filters[i]); + + pcap_close(pcap); + return rc; +} + +REGISTER_TEST_COMMAND(bpf_convert_autotest, test_bpf_convert); +#endif /* RTE_PORT_PCAP */ From patchwork Fri Sep 10 18:18:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98681 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E2BE5A0547; Fri, 10 Sep 2021 20:19:45 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6DDF541172; Fri, 10 Sep 2021 20:19:02 +0200 (CEST) Received: from mail-pg1-f182.google.com (mail-pg1-f182.google.com [209.85.215.182]) by mails.dpdk.org (Postfix) with ESMTP id 513D94115A for ; Fri, 10 Sep 2021 20:18:56 +0200 (CEST) Received: by mail-pg1-f182.google.com with SMTP id s11so2557443pgr.11 for ; Fri, 10 Sep 2021 11:18:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=S0wgL5DgJ+cdmzD3DfnVIW8B/xg3YdbgKzAOs1++iHM=; b=nZBrZ1cj6K1w0kLgrpNWOrcEPMBoy22JzgW0Il2K6miwzH3PazwgOf4poVnqvMC0xH Jdfs0M8T2KAzrcmzA4ckpJPDbKeiMeTAEbEcvKOqYBSgy8h8YbHRj+6SxB0Yw5OtK52J TY0R/VtuzD+u5Ql3SRX/DfYdAMfuDd6Zgggq4ab+op9hKE/OnhcsUMSBuHneXu00VKtY uEWAtazZCUn/viI26wcUna5hNVC62ILxuOhZA11HEH5KKWT/lizX/6Q2H8x1kcufaWAq a3/8aBHfOTlA1DF7zKIfMGJr9aAQCYnK4ef0vsu6a32SQIvRXwnB789EACljHdd7VsgD fJzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=S0wgL5DgJ+cdmzD3DfnVIW8B/xg3YdbgKzAOs1++iHM=; b=Ax9gwTMaUVya9noPrygOHQW9aGsmIzyIj4hQR0fVYCU79VTusTOuWPAx9nAyHkqI54 iiHPbJZyqi5HtSb1CjCA8AlJ8G/g9V6xWonWFHpo7K5+39pj0/JMAj26yDO+3moqFK/Q bOWoDfrqSfGAtBmpz+vzVcFyiHMw9eMJ8VEQbR5Sw94b0QoJhwKZqWmjuv3uCFaFkd5m 7sz4YNp8ZfqQQtmxmn3ulF1tHn4tRALn5btfsJwXPlH0dkpDmTMjE2JCRDfMNUcxWuvW AveF+4Rlbwz1hdWYuMsjKSAeidkaGiQ6R9GY7PueIUj/kthO9r9Ralqqh3j1KqWQ4Ec3 o3/w== X-Gm-Message-State: AOAM533ZbQDgpKl0a5zjdDaDpYQnOJwIBJXvjMa4LEeCbTtd5hu2IudZ xdWaM7/9mV0WeVgBk2iEbAawwriKXDZeoA== X-Google-Smtp-Source: ABdhPJy81x0dW7j96Rs2iqtR4yr+UERhXRsIIv8VfzAiajUpmnfwxjH/mOeQ+mTiDWwRXeevU3bZ6A== X-Received: by 2002:a63:5f8f:: with SMTP id t137mr8265657pgb.420.1631297935032; Fri, 10 Sep 2021 11:18:55 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. 
[204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:54 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:39 -0700 Message-Id: <20210910181841.530280-10-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 09/11] test: add a test for pcapng library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Simple unit test that creates a pcapng file using the API. To run this test, you need at least one device. For example: DPDK_TEST=pcapng_autotest ./build/app/test/dpdk-test -l 0-15 \ --no-huge -m 2048 --vdev=net_tap,iface=dummy Signed-off-by: Stephen Hemminger --- app/test/meson.build | 1 + app/test/test_pcapng.c | 190 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 191 insertions(+) create mode 100644 app/test/test_pcapng.c diff --git a/app/test/meson.build b/app/test/meson.build index a7611686adcb..0d551ac9c2b2 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -97,6 +97,7 @@ test_sources = files( 'test_metrics.c', 'test_mcslock.c', 'test_mp_secondary.c', + 'test_pcapng.c', 'test_per_lcore.c', 'test_pflock.c', 'test_pmd_perf.c', diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c new file mode 100644 index 000000000000..6bf993ad30f6 --- /dev/null +++ b/app/test/test_pcapng.c @@ -0,0 +1,190 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2021 Microsoft Corporation + */ + +#include +#include +#include + +#include +#include +#include +#include 
+#include
+
+#include "test.h"
+
+#define DUMMY_MBUF_NUM 3
+
+static rte_pcapng_t *pcapng;
+static struct rte_mempool *mp;
+static const uint32_t pkt_len = 200;
+static uint64_t ifrecv, ifdrop;
+static uint16_t port_id;
+
+/* first mbuf in the packet, should always be at offset 0 */
+struct dummy_mbuf {
+	struct rte_mbuf mb[DUMMY_MBUF_NUM];
+	uint8_t buf[DUMMY_MBUF_NUM][RTE_MBUF_DEFAULT_BUF_SIZE];
+};
+
+static void
+dummy_mbuf_prep(struct rte_mbuf *mb, uint8_t buf[], uint32_t buf_len,
+	uint32_t data_len)
+{
+	uint32_t i;
+	uint8_t *db;
+
+	mb->buf_addr = buf;
+	mb->buf_iova = (uintptr_t)buf;
+	mb->buf_len = buf_len;
+	rte_mbuf_refcnt_set(mb, 1);
+
+	/* set pool pointer to dummy value, test doesn't use it */
+	mb->pool = (void *)buf;
+
+	rte_pktmbuf_reset(mb);
+	db = (uint8_t *)rte_pktmbuf_append(mb, data_len);
+
+	for (i = 0; i != data_len; i++)
+		db[i] = i;
+}
+
+/* Make an IPv4 packet held in a single dummy mbuf */
+static void
+mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
+{
+	struct rte_ipv4_hdr *ph;
+	const struct rte_ipv4_hdr iph = {
+		.version_ihl = RTE_IPV4_VHL_DEF,
+		.total_length = rte_cpu_to_be_16(plen),
+		.time_to_live = IPDEFTTL,
+		.next_proto_id = IPPROTO_RAW,
+		.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
+		.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
+	};
+
+	memset(dm, 0, sizeof(*dm));
+	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
+
+	ph = rte_pktmbuf_mtod(dm->mb, typeof(ph));
+	memcpy(ph, &iph, sizeof(iph));
+}
+
+static int
+test_setup(void)
+{
+	char file_template[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	int tmp_fd;
+
+	port_id = rte_eth_find_next(0);
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		fprintf(stderr, "No valid Ether port\n");
+		return -1;
+	}
+
+	tmp_fd = mkstemps(file_template, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		return -1;
+	}
+
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		return -1;
+	}
+
+	/* Make a pool for copied packets */
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", DUMMY_MBUF_NUM,
+					    0, 0,
+					    rte_pcapng_mbuf_size(pkt_len),
+					    SOCKET_ID_ANY, "ring_mp_sc");
+	if (mp == NULL) {
+		fprintf(stderr, "Cannot create mempool\n");
+		return -1;
+	}
+	return 0;
+
+}
+
+static int
+test_basic_packets(void)
+{
+	struct rte_mbuf *orig, *clone = NULL;
+	struct dummy_mbuf mbfs;
+	ssize_t len;
+
+	/* make a dummy packet */
+	mbuf1_prepare(&mbfs, pkt_len);
+
+	/* make a copy formatted for the capture file */
+	orig = &mbfs.mb[0];
+	clone = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+				rte_get_tsc_cycles(), 0);
+	if (clone == NULL) {
+		fprintf(stderr, "Cannot copy packet\n");
+		return -1;
+	}
+
+	/* write it to capture file */
+	len = rte_pcapng_write_packets(pcapng, &clone, 1);
+	rte_pktmbuf_free(clone);
+
+	if (len <= 0) {
+		fprintf(stderr, "Write of packets failed\n");
+		return -1;
+	}
+
+	++ifrecv;
+	return 0;
+}
+
+static int
+test_write_stats(void)
+{
+	ssize_t len;
+
+	/* write a statistics block */
+	len = rte_pcapng_write_stats(pcapng, port_id,
+				     NULL, 0, 0,
+				     ifrecv, ifdrop);
+	if (len <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		return -1;
+	}
+	return 0;
+}
+
+static void
+test_cleanup(void)
+{
+	if (mp)
+		rte_mempool_free(mp);
+
+	if (pcapng)
+		rte_pcapng_close(pcapng);
+
+}
+
+static struct
+unit_test_suite test_pcapng_suite = {
+	.setup = test_setup,
+	.teardown = test_cleanup,
+	.suite_name = "Test Pcapng Unit Test Suite",
+	.unit_test_cases = {
+		TEST_CASE(test_basic_packets),
+		TEST_CASE(test_write_stats),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_pcapng(void)
+{
+	return unit_test_suite_runner(&test_pcapng_suite);
+}
+
+REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);

From patchwork Fri Sep 10 18:18:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id:
98682 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 07444A0547; Fri, 10 Sep 2021 20:19:51 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 858E941177; Fri, 10 Sep 2021 20:19:06 +0200 (CEST) Received: from mail-pg1-f171.google.com (mail-pg1-f171.google.com [209.85.215.171]) by mails.dpdk.org (Postfix) with ESMTP id 4E22E41104 for ; Fri, 10 Sep 2021 20:18:58 +0200 (CEST) Received: by mail-pg1-f171.google.com with SMTP id t1so2580933pgv.3 for ; Fri, 10 Sep 2021 11:18:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+vkeq9G4kEH23sOuf26gc9ZgFod9x5H9VZ1fsvuCTQ0=; b=0zpZWC09ut2osOEC3VK+KB20H+9f9xFcFImNJq5PhIpAZPveGssyGYakypkDBgOzID wbEyKdsN1ofvgQu+LIdm6aO2N224hp/zFy0GjlWqyGhwuWOdjjQJy8gwMOr9o3M/r2cL PGbTZvaxSiR4o5rjHZ2QLbOx7bscJwnOh2x+AM5nUi4hAc0Ja0en9Wu2KUm/kC2r2r5R xWlLcVa17FkKeb2l4J4GAhUinFwF4tCicIm13I3bY7dz6tBKx7g4dARfumvRodIfzd/q dAmo8x07VpQuRqJjb4uu4pP9haaIDhW7BiImYPH65obdDJWB+pxS/2uzam/R9UyegOIW REFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+vkeq9G4kEH23sOuf26gc9ZgFod9x5H9VZ1fsvuCTQ0=; b=csGfCa3XSjKsA/umAA5vo7DPzfbZyuUShgxUDDNlIwC7ESCvYgdQbqI2yQ+O96PKbU m47lkNG62rjhAbqTlkaKDgKtjgjcLxBQzR8jhuUuOmb2DI/C4liXrFYcHSVBDpZ84tZC 4QKUM6X2TIwNdaqZMFW6EvGUhFgeUGw+ndu6eCfttSJ0pGzxuuBuSzY9Q23MhyLy5CEH Fb3G7IOtDbqw9/ry+yO5f1LtfikUjAewgcpRp1eKwW5TdR7m7hERqhjMPsnHl4Z95lXl 7CfWgz74UqwNjYAFY/6TlM3OnmjUlaLG76lOTyrO8WSVEtK7G00nmZveBjPWsuTm6N+2 FryA== X-Gm-Message-State: 
AOAM5326X6Ss7glXwrBqRZmY0Ell3C+Ods5s1q1Qyd8I3LQcoD/+Exxc NQyU+JtU4c5iNhoG5tJVS7tg9GXNPrBv9g== X-Google-Smtp-Source: ABdhPJyFtYQVyhm7ygi9C+jN1rzJiIfWN+SfYoj+K6DmtJhpc5gMXRJEFoIS/Sb+8ZxCUlPJ/qKJvg== X-Received: by 2002:aa7:9282:0:b0:3e2:800a:b423 with SMTP id j2-20020aa79282000000b003e2800ab423mr9210984pfa.21.1631297936279; Fri, 10 Sep 2021 11:18:56 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:55 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:40 -0700 Message-Id: <20210910181841.530280-11-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 10/11] doc: changes for new pcapng and dumpcap X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Describe the new packet capture library and utilities Signed-off-by: Stephen Hemminger --- doc/api/doxy-api-index.md | 1 + doc/api/doxy-api.conf.in | 1 + .../howto/img/packet_capture_framework.svg | 96 +++++++++---------- doc/guides/howto/packet_capture_framework.rst | 67 ++++++------- doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/pcapng_lib.rst | 24 +++++ doc/guides/prog_guide/pdump_lib.rst | 28 ++++-- doc/guides/rel_notes/release_21_11.rst | 10 ++ doc/guides/tools/dumpcap.rst | 86 +++++++++++++++++ doc/guides/tools/index.rst | 1 + 10 files changed, 228 insertions(+), 87 deletions(-) create mode 100644 doc/guides/prog_guide/pcapng_lib.rst create mode 100644 
doc/guides/tools/dumpcap.rst

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 1992107a0356..ee07394d1c78 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -223,3 +223,4 @@ The public API headers are grouped by topics:
   [experimental APIs] (@ref rte_compat.h),
   [ABI versioning] (@ref rte_function_versioning.h),
   [version] (@ref rte_version.h)
+  [pcapng] (@ref rte_pcapng.h)
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 325a0195c6ab..aba17799a9a1 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/metrics \
                           @TOPDIR@/lib/node \
                           @TOPDIR@/lib/net \
+                          @TOPDIR@/lib/pcapng \
                           @TOPDIR@/lib/pci \
                           @TOPDIR@/lib/pdump \
                           @TOPDIR@/lib/pipeline \
diff --git a/doc/guides/howto/img/packet_capture_framework.svg b/doc/guides/howto/img/packet_capture_framework.svg
index a76baf71fdee..1c2646a81096 100644
--- a/doc/guides/howto/img/packet_capture_framework.svg
+++ b/doc/guides/howto/img/packet_capture_framework.svg
[SVG diff body not recoverable. Figure labels after the change: "DPDK Primary Application", "dpdk-dumpcap tool" (was "dpdk-pdump tool"), "librte_pcapng" (was "PCAP PMD"), "dpdk_port0", "librte_pdump", "capture.pcapng" (was "capture.pcap"), "Traffic Generator".]

` tool is developed based on the
-``librte_pdump`` library. It runs as a DPDK secondary process and is capable
-of enabling or disabling packet capture on DPDK ports. The ``dpdk-pdump`` tool
-provides command-line options with which users can request enabling or
-disabling of the packet capture on DPDK ports.
+The :ref:`librte_pcapng ` library provides the APIs to format
+packets and write them to a file in Pcapng format.
+
+
+The :ref:`dpdk-dumpcap ` is a tool that captures packets in
+the same way that Wireshark dumpcap does on Linux. It runs as a DPDK secondary
+process, captures packets from one or more interfaces and writes them to a file
+in Pcapng format. The ``dpdk-dumpcap`` tool is designed to take
+most of the same options as the Wireshark ``dumpcap`` command.
 
-The application which initializes the packet capture framework will be a primary process
-and the application that enables or disables the packet capture will
-be a secondary process. The primary process sends the Rx and Tx packets from the DPDK ports
-to the secondary process.
+Without any options it will use the packet capture framework to
+capture traffic from the first available DPDK port.
 
 In DPDK the ``testpmd`` application can be used to initialize the packet
-capture framework and acts as a server, and the ``dpdk-pdump`` tool acts as a
+capture framework and acts as a server, and the ``dpdk-dumpcap`` tool acts as a
 client. To view Rx or Tx packets of ``testpmd``, the application should be
-launched first, and then the ``dpdk-pdump`` tool. Packets from ``testpmd``
-will be sent to the tool, which then sends them on to the Pcap PMD device and
-that device writes them to the Pcap file or to an external interface depending
-on the command-line option used.
+launched first, and then the ``dpdk-dumpcap`` tool. Packets from ``testpmd``
+will be sent to the tool, and then to the Pcapng file.
 
 Some things to note:
 
-* The ``dpdk-pdump`` tool can only be used in conjunction with a primary
+* All tools using ``librte_pdump`` can only be used in conjunction with a primary
   application which has the packet capture framework initialized already. In
   dpdk, only ``testpmd`` is modified to initialize packet capture framework,
-  other applications remain untouched. So, if the ``dpdk-pdump`` tool has to
+  other applications remain untouched. So, if the ``dpdk-dumpcap`` tool has to
   be used with any application other than the testpmd, the user needs to
   explicitly modify that application to call the packet capture framework
   initialization code. Refer to the ``app/test-pmd/testpmd.c`` code and look
   for ``pdump`` keyword to see how this is done.
 
-* The ``dpdk-pdump`` tool depends on the libpcap based PMD.
+* The ``dpdk-pdump`` tool is an older tool created as a demonstration of the
+  ``librte_pdump`` library. The ``dpdk-pdump`` tool provides more limited
+  functionality and depends on the Pcap PMD. It is retained only for
+  compatibility reasons; users should use ``dpdk-dumpcap`` instead.
 
 
 Test Environment
 ----------------
 
-The overview of using the Packet Capture Framework and the ``dpdk-pdump`` tool
+The overview of using the Packet Capture Framework and the ``dpdk-dumpcap`` utility
 for packet capturing on the DPDK port in
 :numref:`figure_packet_capture_framework`.
 
@@ -66,13 +70,13 @@ for packet capturing on the DPDK port in
 
 .. figure:: img/packet_capture_framework.*
 
-   Packet capturing on a DPDK port using the dpdk-pdump tool.
+   Packet capturing on a DPDK port using the dpdk-dumpcap utility.
 
 
 Running the Application
 -----------------------
 
-The following steps demonstrate how to run the ``dpdk-pdump`` tool to capture
+The following steps demonstrate how to run the ``dpdk-dumpcap`` tool to capture
 Rx side packets on dpdk_port0 in :numref:`figure_packet_capture_framework` and
 inspect them using ``tcpdump``.
 
@@ -80,16 +84,15 @@ inspect them using ``tcpdump``.
 
      sudo /app/dpdk-testpmd -c 0xf0 -n 4 -- -i --port-topology=chained
 
-#. Launch the pdump tool as follows::
+#. Launch the dpdk-dumpcap tool as follows::
 
-     sudo /app/dpdk-pdump -- \
-            --pdump 'port=0,queue=*,rx-dev=/tmp/capture.pcap'
+     sudo /app/dpdk-dumpcap -w /tmp/capture.pcapng
 
 #. Send traffic to dpdk_port0 from traffic generator.
-   Inspect packets captured in the file capture.pcap using a tool
-   that can interpret Pcap files, for example tcpdump::
+   Inspect packets captured in the file capture.pcapng using a tool such as
+   tcpdump or tshark that can interpret Pcapng files::
 
-     $tcpdump -nr /tmp/capture.pcap
+     $ tcpdump -nr /tmp/capture.pcapng
      reading from file /tmp/capture.pcap, link-type EN10MB (Ethernet)
      11:11:36.891404 IP 4.4.4.4.whois++ > 3.3.3.3.whois++: UDP, length 18
      11:11:36.891442 IP 4.4.4.4.whois++ > 3.3.3.3.whois++: UDP, length 18
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 2dce507f46a3..b440c77c2ba1 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -43,6 +43,7 @@ Programmer's Guide
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
     generic_segmentation_offload_lib
+    pcapng_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
diff --git a/doc/guides/prog_guide/pcapng_lib.rst b/doc/guides/prog_guide/pcapng_lib.rst
new file mode 100644
index 000000000000..36379b530a57
--- /dev/null
+++ b/doc/guides/prog_guide/pcapng_lib.rst
@@ -0,0 +1,24 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2016 Intel Corporation.
+
+.. _pcapng_library:
+
+Packet Capture File Writer
+==========================
+
+Pcapng is a library for creating files in Pcapng file format.
+The Pcapng file format is the default capture file format for modern
+network capture processing tools. It can be read by Wireshark and tcpdump.
+
+Usage
+-----
+
+Before the library can be used the function ``rte_pcapng_init``
+should be called once to initialize timestamp computation.
+
+
+References
+----------
+* Draft RFC https://www.ietf.org/id/draft-tuexen-opsawg-pcapng-03.html
+
+* Project repository https://github.com/pcapng/pcapng/
diff --git a/doc/guides/prog_guide/pdump_lib.rst b/doc/guides/prog_guide/pdump_lib.rst
index 62c0b015b2fe..9af91415e5ea 100644
--- a/doc/guides/prog_guide/pdump_lib.rst
+++ b/doc/guides/prog_guide/pdump_lib.rst
@@ -3,10 +3,10 @@
 
 .. _pdump_library:
 
-The librte_pdump Library
-========================
+The Packet Capture Library
+==========================
 
-The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The DPDK ``pdump`` library provides a framework for packet capturing in DPDK.
 The library does the complete copy of the Rx and Tx mbufs to a new mempool and
 hence it slows down the performance of the applications, so it is recommended
 to use this library for debugging purposes.
@@ -23,11 +23,19 @@ or disable the packet capture, and to uninitialize it.
 
 * ``rte_pdump_enable()``:
   This API enables the packet capture on a given port and queue.
-  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_enable_bpf()``:
+  This API enables the packet capture on a given port and queue.
+  It also allows setting an optional filter using the DPDK BPF interpreter and
+  setting the captured packet length.
 
 * ``rte_pdump_enable_by_deviceid()``:
   This API enables the packet capture on a given device id (``vdev name or pci address``) and queue.
-  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_enable_bpf_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci address``) and queue.
+  It also allows setting an optional filter using the DPDK BPF interpreter and
+  setting the captured packet length.
 
 * ``rte_pdump_disable()``:
   This API disables the packet capture on a given port and queue.
@@ -61,6 +69,12 @@ and enables the packet capture by registering the Ethernet RX and TX callbacks f
 and queue combinations. Then the primary process will mirror the packets to the new mempool and enqueue them to
 the rte_ring that secondary process have passed to these APIs.
 
+The packet ring supports one of two formats. The default format enqueues copies of the original packets
+into the rte_ring. If the ``RTE_PDUMP_FLAG_PCAPNG`` flag is set, the mbuf data is extended with a header and
+trailer to match the format of a Pcapng enhanced packet block. The enhanced packet block has meta-data such as
+the timestamp, port and queue the packet was captured on. It is up to the application consuming the
+packets from the ring to select the format desired.
+
 The library APIs ``rte_pdump_disable()`` and ``rte_pdump_disable_by_deviceid()`` disables the packet capture.
 For the calls to these APIs from secondary process, the library creates the "pdump disable" request and sends
 the request to the primary process over the multi process channel. The primary process takes this request and
@@ -74,5 +88,5 @@ function.
 Use Case: Packet Capturing
 --------------------------
 
-The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK.
-Users can use this as an example to develop their own packet capturing tools.
+The DPDK ``app/dpdk-dumpcap`` utility uses this library
+to capture packets in DPDK.
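Note on the enhanced packet block mentioned in the pdump_lib.rst text: for readers who have not seen the layout, the following is a minimal sketch of how such a block wraps packet data with a header and a trailer, following the generic layout in the pcapng draft referenced by this series. It is not the code from lib/pcapng. The function name `epb_build` and its parameters are hypothetical, all fields are written in host byte order, and no options (for example a queue id) are appended.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EPB_BLOCK_TYPE 0x00000006U /* Enhanced Packet Block type per the pcapng draft */
#define EPB_FIXED_LEN  32U         /* 28-byte header plus 4-byte trailing length */

/*
 * Hypothetical helper (not a DPDK API): wrap cap_len bytes of packet data
 * in an Enhanced Packet Block. Returns the block length, or 0 if the
 * output buffer is too small.
 */
static size_t
epb_build(uint8_t *out, size_t out_len, uint32_t interface_id, uint64_t ts,
	  const uint8_t *pkt, uint32_t cap_len, uint32_t orig_len)
{
	uint32_t padded, total;
	uint32_t hdr[7];

	if (cap_len > 0xFFFF0000U)	/* keep the length arithmetic from wrapping */
		return 0;

	padded = (cap_len + 3U) & ~3U;	/* packet data is padded to 32 bits */
	total = EPB_FIXED_LEN + padded;
	if (out_len < total)
		return 0;

	hdr[0] = EPB_BLOCK_TYPE;
	hdr[1] = total;			/* block total length (leading copy) */
	hdr[2] = interface_id;
	hdr[3] = (uint32_t)(ts >> 32);	/* timestamp, upper 32 bits */
	hdr[4] = (uint32_t)ts;		/* timestamp, lower 32 bits */
	hdr[5] = cap_len;		/* captured packet length */
	hdr[6] = orig_len;		/* original packet length */

	memcpy(out, hdr, sizeof(hdr));
	memcpy(out + sizeof(hdr), pkt, cap_len);
	memset(out + sizeof(hdr) + cap_len, 0, padded - cap_len);
	/* trailer: block total length repeated */
	memcpy(out + sizeof(hdr) + padded, &total, sizeof(total));
	return total;
}
```

The repeated length at the end of the block is what lets a reader walk a capture file backwards as well as forwards, which is why the text speaks of both a header and a trailer.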
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 675b5738348b..ee24cbfdb99d 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -62,6 +62,16 @@ New Features
   * Added bus-level parsing of the devargs syntax.
   * Kept compatibility with the legacy syntax as parsing fallback.
 
+* **Enhanced packet capture.**
+
+  * New dpdk-dumpcap program that has most of the features of the
+    Wireshark dumpcap utility including capture of multiple interfaces,
+    and stopping after a number of bytes or packets.
+  * New library for writing pcapng packet capture files.
+  * Enhancement to the pdump library to support:
+    * Packet filter with BPF.
+    * Pcapng format with timestamps and meta-data.
+    * Fixes packet capture with stripped VLAN tags.
 
 Removed Items
 -------------
diff --git a/doc/guides/tools/dumpcap.rst b/doc/guides/tools/dumpcap.rst
new file mode 100644
index 000000000000..664ea0c79802
--- /dev/null
+++ b/doc/guides/tools/dumpcap.rst
@@ -0,0 +1,86 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Microsoft Corporation.
+
+.. _dumpcap_tool:
+
+dpdk-dumpcap Application
+========================
+
+The ``dpdk-dumpcap`` tool is a Data Plane Development Kit (DPDK)
+network traffic dump tool. The interface is similar to the dumpcap tool in Wireshark.
+It runs as a secondary DPDK process and lets you capture packets that are
+coming into and out of a DPDK primary process.
+The ``dpdk-dumpcap`` tool writes files in the
+Pcapng capture file format.
+
+Without any options set it will use DPDK to capture traffic from the first
+available DPDK interface and write the received raw packet data, along
+with timestamps, into a pcapng file.
+
+If the ``-w`` option is not specified, ``dpdk-dumpcap`` writes to a newly
+created file with a name chosen based on interface name and timestamp.
+If the ``-w`` option is specified, then that file is used.
+
+   .. Note::
+      * The ``dpdk-dumpcap`` tool can only be used in conjunction with a primary
+        application which has the packet capture framework initialized already.
+        In dpdk, only the ``testpmd`` is modified to initialize packet capture
+        framework, other applications remain untouched. So, if the ``dpdk-dumpcap``
+        tool has to be used with any application other than the testpmd, the user
+        needs to explicitly modify that application to call packet capture
+        framework initialization code. Refer to the ``app/test-pmd/testpmd.c``
+        code to see how this is done.
+
+      * The ``dpdk-dumpcap`` tool runs as a DPDK secondary process. It exits when
+        the primary application exits.
+
+
+Running the Application
+-----------------------
+
+To list interfaces available for capture use ``--list-interfaces``.
+
+To filter packets in the style of *tshark* use the ``-f`` flag.
+
+To capture on multiple interfaces at once, use multiple ``-i`` flags.
+
+Example
+-------
+
+.. code-block:: console
+
+   # .//app/dpdk-dumpcap --list-interfaces
+   0. 000:00:03.0
+   1. 000:00:03.1
+
+   # .//app/dpdk-dumpcap -i 0000:00:03.0 -c 6 -w /tmp/sample.pcapng
+   Packets captured: 6
+   Packets received/dropped on interface '0000:00:03.0' 6/0
+
+   # .//app/dpdk-dumpcap -f 'tcp port 80'
+   Packets captured: 6
+   Packets received/dropped on interface '0000:00:03.0' 10/8
+
+
+Limitations
+-----------
+The following option of Wireshark ``dumpcap`` is not yet implemented:
+
+   * ``-b|--ring-buffer`` -- more complex file management.
+
+The following options do not make sense in the context of DPDK:
+
+   * ``-C `` -- it's a kernel thing
+
+   * ``-t`` -- use a thread per interface
+
+   * Timestamp type.
+
+   * Link data types. Only EN10MB (Ethernet) is supported.
+
+   * Wireless related options: ``-I|--monitor-mode`` and ``-k ``
+
+
+.. Note::
+   * The options to ``dpdk-dumpcap`` are like the Wireshark dumpcap program and
+     are not the same as ``dpdk-pdump`` and other DPDK applications.
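Since the dumpcap.rst text above states that output is always in pcapng format, a quick way to sanity-check a produced file without Wireshark is to look at its first 12 bytes. The sketch below is an illustration only: `pcapng_probe` is a hypothetical helper and not part of DPDK or of this series. The constants are the Section Header Block type and byte-order magic from the pcapng draft cited in pcapng_lib.rst.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCAPNG_SHB_TYPE    0x0A0D0D0AU /* Section Header Block type (same in either byte order) */
#define PCAPNG_ORDER_MAGIC 0x1A2B3C4DU /* byte-order magic as written by the capturing host */
#define PCAPNG_SWAP_MAGIC  0x4D3C2B1AU /* the same magic seen from the opposite endianness */

/*
 * Hypothetical helper (not a DPDK API): inspect the start of a capture
 * file. Returns 1 for pcapng in host byte order, 2 for pcapng in the
 * opposite byte order, and 0 if the data does not start with a
 * Section Header Block.
 */
static int
pcapng_probe(const uint8_t *buf, size_t len)
{
	uint32_t type, magic;

	/* block type (4) + block length (4) + byte-order magic (4) */
	if (len < 12)
		return 0;

	memcpy(&type, buf, sizeof(type));
	memcpy(&magic, buf + 8, sizeof(magic));

	if (type != PCAPNG_SHB_TYPE)
		return 0;
	if (magic == PCAPNG_ORDER_MAGIC)
		return 1;
	if (magic == PCAPNG_SWAP_MAGIC)
		return 2;
	return 0;
}
```

A caller would read the first 12 bytes of a file such as /tmp/sample.pcapng into a buffer and pass it in. A legacy pcap file starts with a different magic number and is reported as 0.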
diff --git a/doc/guides/tools/index.rst b/doc/guides/tools/index.rst index 93dde4148e90..b71c12b8f2dd 100644 --- a/doc/guides/tools/index.rst +++ b/doc/guides/tools/index.rst @@ -8,6 +8,7 @@ DPDK Tools User Guides :maxdepth: 2 :numbered: + dumpcap proc_info pdump pmdinfo From patchwork Fri Sep 10 18:18:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Hemminger X-Patchwork-Id: 98683 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id DDBB0A0547; Fri, 10 Sep 2021 20:19:56 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id ADE254117F; Fri, 10 Sep 2021 20:19:07 +0200 (CEST) Received: from mail-pg1-f171.google.com (mail-pg1-f171.google.com [209.85.215.171]) by mails.dpdk.org (Postfix) with ESMTP id 7E9C841155 for ; Fri, 10 Sep 2021 20:18:58 +0200 (CEST) Received: by mail-pg1-f171.google.com with SMTP id r2so2559134pgl.10 for ; Fri, 10 Sep 2021 11:18:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XyqQtx1VKvwB71wOpLr2p4bGL/oB0DPWsDpls3n0BsQ=; b=egq5qiLm+DTkWcE1xN71K0DU/vfsj7GNBlze7Uqsp89n7odTbKPrtzjVV23A2DVDsT nr4szKv3NRPGg7l/uMEi+4QgXtlHpRt4QLoPkf6iilkjQ33+7Or0GWwPN4cTp3p8DJT3 SJw113rv43xoC/SBfCn7fMNinrl3TcGA1cnooRpV+w60R8PpXQDIUS1nadQ2nbl/JjLZ CUoiE90q9yX8jo4Hsdq7ccyUCTvkY2Z/GSLJlmWLmV8BF6Lv4Bt3B1bXWFQ/qifZZwjX g4ivpzxq8G5BM0LsN3mPp/pukLk/ArhhmuoycrX8b0vRF6oHixvfraKhq8JSTtDfk5/A yRpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=XyqQtx1VKvwB71wOpLr2p4bGL/oB0DPWsDpls3n0BsQ=; b=f3Lv9oaBFdeTd+CgZrYYLp9idBc6XnsINAmMU9BIZV8TqrLvpqOcSX5aWG880zoHqS 6IjRgs1ALT/ywWQ/4j83qSD4gvCGiIWCgutFb2iXwW/wq1A0bxMwAWjfd4Fs9VOQDXob hV9sBhuPc2ltGjNFbz3TgN0dawtjQuFSCr/916tadbqinKajuGBWxOBAEO53t78tH8Sq OtJTcGmi93VGE1YU2AAXzgEL8aGZhsx0tXPJ5tr3jqICP08+s0aNtQMvjdJVDYC0GSSJ cR8tD2EPHwYONjQr9mHlWg4HrWcXkS6DuPgM2ok2Qzj5LSMMgUSBCNnGBk6w4NcZx+Dv dn6Q== X-Gm-Message-State: AOAM531TYhgu1C2ni66d0YPHvPFOqVTa3yYH74Se2/hITeoSMT0fwRuc IgrlvUcwpU0OPsuFEs2NS5068M17a6P7RA== X-Google-Smtp-Source: ABdhPJyaSIv2Zq3ub0YbjNujtsRSDfTGl5MDiZHCXOFQmg6JacTPKVYuuvWc9oeYPXjUhM1a+7dTbg== X-Received: by 2002:a63:5942:: with SMTP id j2mr8459041pgm.78.1631297937252; Fri, 10 Sep 2021 11:18:57 -0700 (PDT) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id p13sm5652857pjo.9.2021.09.10.11.18.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 11:18:56 -0700 (PDT) From: Stephen Hemminger To: dev@dpdk.org Cc: Stephen Hemminger Date: Fri, 10 Sep 2021 11:18:41 -0700 Message-Id: <20210910181841.530280-12-stephen@networkplumber.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210910181841.530280-1-stephen@networkplumber.org> References: <20210903004732.109023-1-stephen@networkplumber.org> <20210910181841.530280-1-stephen@networkplumber.org> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v7 11/11] MAINTAINERS: add entry for new pcapng and dumper X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Claim responsibility for the new code. 
Signed-off-by: Stephen Hemminger --- MAINTAINERS | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 266f5ac1dae8..06384ac2702d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1429,6 +1429,12 @@ F: app/test/test_pdump.* F: app/pdump/ F: doc/guides/tools/pdump.rst +Packet dump +M: Stephen Hemminger +F: lib/pcapng/ +F: doc/guides/prog_guide/pcapng_lib.rst +F: app/dumpcap/ +F: doc/guides/tools/dumpcap.rst Packet Framework ----------------