From patchwork Thu Sep 24 11:34:14 2020
X-Patchwork-Submitter: Igor Russkikh
X-Patchwork-Id: 78674
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Igor Russkikh
CC: Rasesh Mody, Devendra Singh Rawat, Wenzhuo Lu, Beilei Xing,
 Bernard Iremonger, Igor Russkikh
Date: Thu, 24 Sep 2020 14:34:14 +0300
Message-ID: <20200924113414.483-1-irusskikh@marvell.com>
X-Mailer: git-send-email 2.17.1
Subject: [dpdk-dev] [RFC PATCH] app/testpmd: tx pkt clones parameter in flowgen
List-Id: DPDK patches and discussions

When testing for high performance numbers, it is often the case that CPU
performance limits the maximum rate a device can reach (both in pps and in
Gbps).

Here, instead of recreating each packet separately, we use a clones counter
to resend the same mbuf to the wire multiple times. PMDs handle this
transparently thanks to reference counting inside the mbuf.

Verified on Marvell qede and atlantic PMDs.
Signed-off-by: Igor Russkikh
---
 app/test-pmd/flowgen.c                | 100 ++++++++++++++------------
 app/test-pmd/parameters.c             |  12 ++++
 app/test-pmd/testpmd.c                |   1 +
 app/test-pmd/testpmd.h                |   1 +
 doc/guides/testpmd_app_ug/run_app.rst |   7 ++
 5 files changed, 74 insertions(+), 47 deletions(-)

diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index acf3e2460..b6f6e7a0e 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -94,6 +94,7 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
 	uint16_t nb_rx;
 	uint16_t nb_tx;
 	uint16_t nb_pkt;
+	uint16_t nb_clones = nb_pkt_clones;
 	uint16_t i;
 	uint32_t retry;
 	uint64_t tx_offloads;
@@ -123,53 +124,58 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
 		ol_flags |= PKT_TX_MACSEC;
 
 	for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
-		pkt = rte_mbuf_raw_alloc(mbp);
-		if (!pkt)
-			break;
-
-		pkt->data_len = pkt_size;
-		pkt->next = NULL;
-
-		/* Initialize Ethernet header. */
-		eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
-		rte_ether_addr_copy(&cfg_ether_dst, &eth_hdr->d_addr);
-		rte_ether_addr_copy(&cfg_ether_src, &eth_hdr->s_addr);
-		eth_hdr->ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4);
-
-		/* Initialize IP header. */
-		ip_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
-		memset(ip_hdr, 0, sizeof(*ip_hdr));
-		ip_hdr->version_ihl = RTE_IPV4_VHL_DEF;
-		ip_hdr->type_of_service = 0;
-		ip_hdr->fragment_offset = 0;
-		ip_hdr->time_to_live = IP_DEFTTL;
-		ip_hdr->next_proto_id = IPPROTO_UDP;
-		ip_hdr->packet_id = 0;
-		ip_hdr->src_addr = rte_cpu_to_be_32(cfg_ip_src);
-		ip_hdr->dst_addr = rte_cpu_to_be_32(cfg_ip_dst +
-						    next_flow);
-		ip_hdr->total_length = RTE_CPU_TO_BE_16(pkt_size -
-							sizeof(*eth_hdr));
-		ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr,
-					      sizeof(*ip_hdr));
-
-		/* Initialize UDP header. */
-		udp_hdr = (struct rte_udp_hdr *)(ip_hdr + 1);
-		udp_hdr->src_port = rte_cpu_to_be_16(cfg_udp_src);
-		udp_hdr->dst_port = rte_cpu_to_be_16(cfg_udp_dst);
-		udp_hdr->dgram_cksum = 0; /* No UDP checksum. */
-		udp_hdr->dgram_len = RTE_CPU_TO_BE_16(pkt_size -
-						      sizeof(*eth_hdr) -
-						      sizeof(*ip_hdr));
-		pkt->nb_segs = 1;
-		pkt->pkt_len = pkt_size;
-		pkt->ol_flags &= EXT_ATTACHED_MBUF;
-		pkt->ol_flags |= ol_flags;
-		pkt->vlan_tci = vlan_tci;
-		pkt->vlan_tci_outer = vlan_tci_outer;
-		pkt->l2_len = sizeof(struct rte_ether_hdr);
-		pkt->l3_len = sizeof(struct rte_ipv4_hdr);
-		pkts_burst[nb_pkt] = pkt;
+		if (!nb_pkt || !nb_clones) {
+			nb_clones = nb_pkt_clones;
+			pkt = rte_mbuf_raw_alloc(mbp);
+			if (!pkt)
+				break;
+
+			pkt->data_len = pkt_size;
+			pkt->next = NULL;
+
+			/* Initialize Ethernet header. */
+			eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
+			rte_ether_addr_copy(&cfg_ether_dst, &eth_hdr->d_addr);
+			rte_ether_addr_copy(&cfg_ether_src, &eth_hdr->s_addr);
+			eth_hdr->ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4);
+
+			/* Initialize IP header. */
+			ip_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
+			memset(ip_hdr, 0, sizeof(*ip_hdr));
+			ip_hdr->version_ihl = RTE_IPV4_VHL_DEF;
+			ip_hdr->type_of_service = 0;
+			ip_hdr->fragment_offset = 0;
+			ip_hdr->time_to_live = IP_DEFTTL;
+			ip_hdr->next_proto_id = IPPROTO_UDP;
+			ip_hdr->packet_id = 0;
+			ip_hdr->src_addr = rte_cpu_to_be_32(cfg_ip_src);
+			ip_hdr->dst_addr = rte_cpu_to_be_32(cfg_ip_dst +
+							    next_flow);
+			ip_hdr->total_length = RTE_CPU_TO_BE_16(pkt_size -
+								sizeof(*eth_hdr));
+			ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr,
+						      sizeof(*ip_hdr));
+
+			/* Initialize UDP header. */
+			udp_hdr = (struct rte_udp_hdr *)(ip_hdr + 1);
+			udp_hdr->src_port = rte_cpu_to_be_16(cfg_udp_src);
+			udp_hdr->dst_port = rte_cpu_to_be_16(cfg_udp_dst);
+			udp_hdr->dgram_cksum = 0; /* No UDP checksum. */
+			udp_hdr->dgram_len = RTE_CPU_TO_BE_16(pkt_size -
+							      sizeof(*eth_hdr) -
+							      sizeof(*ip_hdr));
+			pkt->nb_segs = 1;
+			pkt->pkt_len = pkt_size;
+			pkt->ol_flags &= EXT_ATTACHED_MBUF;
+			pkt->ol_flags |= ol_flags;
+			pkt->vlan_tci = vlan_tci;
+			pkt->vlan_tci_outer = vlan_tci_outer;
+			pkt->l2_len = sizeof(struct rte_ether_hdr);
+			pkt->l3_len = sizeof(struct rte_ipv4_hdr);
+		} else {
+			nb_clones--;
+		}
+		pkts_burst[nb_pkt] = pkt;
 		next_flow = (next_flow + 1) % cfg_n_flows;
 	}

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 1ead59579..a2863bf8d 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -161,6 +161,7 @@ usage(char* progname)
 	printf("  --hairpinq=N: set the number of hairpin queues per port to "
 	       "N.\n");
 	printf("  --burst=N: set the number of packets per burst to N.\n");
+	printf("  --clones=N: set the number of single packet clones to send. Should not exceed the burst value.\n");
 	printf("  --mbcache=N: set the cache of mbuf memory pool to N.\n");
 	printf("  --rxpt=N: set prefetch threshold register of RX rings to N.\n");
 	printf("  --rxht=N: set the host threshold register of RX rings to N.\n");
@@ -645,6 +646,7 @@ launch_args_parse(int argc, char** argv)
 		{ "txd",			1, 0, 0 },
 		{ "hairpinq",			1, 0, 0 },
 		{ "burst",			1, 0, 0 },
+		{ "clones",			1, 0, 0 },
 		{ "mbcache",			1, 0, 0 },
 		{ "txpt",			1, 0, 0 },
 		{ "txht",			1, 0, 0 },
@@ -1151,6 +1153,16 @@ launch_args_parse(int argc, char** argv)
 				else
 					nb_pkt_per_burst = (uint16_t) n;
 			}
+			if (!strcmp(lgopts[opt_idx].name, "clones")) {
+				n = atoi(optarg);
+				if ((n >= 0) &&
+				    (n <= nb_pkt_per_burst))
+					nb_pkt_clones = (uint16_t) n;
+				else
+					rte_exit(EXIT_FAILURE,
+						 "clones must be >= 0 and <= %d (burst)\n",
+						 nb_pkt_per_burst);
+			}
 			if (!strcmp(lgopts[opt_idx].name, "mbcache")) {
 				n = atoi(optarg);
 				if ((n >= 0) &&

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index fe6450cc0..18b4b63d1 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -228,6 +228,7 @@ uint32_t tx_pkt_times_intra;
 /**< Timings for send scheduling in TXONLY mode, time between packets. */
 
 uint16_t nb_pkt_per_burst = DEF_PKT_BURST; /**< Number of packets per burst. */
+uint16_t nb_pkt_clones; /**< Number of tx packet clones to send. */
 uint16_t mb_mempool_cache = DEF_MBUF_CACHE; /**< Size of mbuf mempool cache. */
 
 /* current configuration is in DCB or not,0 means it is not in DCB mode */

diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index f139fe7a0..7337b5b94 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -431,6 +431,7 @@ extern enum tx_pkt_split tx_pkt_split;
 
 extern uint8_t txonly_multi_flow;
 extern uint16_t nb_pkt_per_burst;
+extern uint16_t nb_pkt_clones;
 extern uint16_t mb_mempool_cache;
 extern int8_t rx_pthresh;
 extern int8_t rx_hthresh;

diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index e2539f693..42c2efb1f 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -296,6 +296,13 @@ The command line options are:
   If set to 0, driver default is used if defined. Else, if driver default
   is not defined, default of 32 is used.
 
+* ``--clones=N``
+
+  Set the number of clones of each packet to send in `flowgen` mode.
+  Sending clones reduces the host CPU load of packet creation and may help
+  when testing extreme speeds or maxing out Tx packet performance.
+  N should be non-zero and no greater than the ``burst`` parameter.
+
 * ``--mbcache=N``
 
   Set the cache of mbuf memory pools to N, where 0 <= N <= 512.
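Assuming the patch above is applied, a flowgen run exercising the new option might look like the following. This is an illustrative invocation only: the core list, memory-channel count, device address, and the testpmd binary path are placeholders to be adapted to your build and hardware.

```shell
# flowgen forwarding, 64-packet bursts; each allocated mbuf is sent
# 32 additional times, cutting packet-construction CPU cost ~33x.
# EAL arguments (-l, -n, -w) are placeholders for your setup.
testpmd -l 0-3 -n 4 -w 0000:01:00.0 -- \
    --forward-mode=flowgen --burst=64 --clones=32
```

Per the validation added in parameters.c, `--clones` must be between 0 and the `--burst` value, so `--clones=32` with `--burst=64` is accepted; `--clones=0` disables cloning and restores the old one-allocation-per-packet behavior.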