From patchwork Tue Oct 18 19:41:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Boyer
X-Patchwork-Id: 118488
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Andrew Boyer
To:
CC: Andrew Boyer
Subject: [PATCH v2 21/36] net/ionic: overhaul transmit side for performance
Date: Tue, 18 Oct 2022 12:41:16 -0700
Message-ID: <20221018194131.23006-22-andrew.boyer@amd.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221011005032.47584-1-andrew.boyer@amd.com>
References: <20221011005032.47584-1-andrew.boyer@amd.com>
List-Id: DPDK patches and discussions

Linearize Tx mbuf chains in the info array. This avoids walking the
mbuf chain during flush. Move a few branches out of the hot path.

Signed-off-by: Andrew Boyer
---
 drivers/net/ionic/ionic_lif.c  |   2 +-
 drivers/net/ionic/ionic_rxtx.c | 143 ++++++++++++++++++++-------------
 2 files changed, 87 insertions(+), 58 deletions(-)

diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index ede368d8ca..4305587fe0 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -817,7 +817,7 @@ ionic_tx_qcq_alloc(struct ionic_lif *lif, uint32_t socket_id, uint32_t index,
 		"tx",
 		flags,
 		ntxq_descs,
-		1,
+		num_segs_fw,
 		sizeof(struct ionic_txq_desc),
 		sizeof(struct ionic_txq_comp),
 		sizeof(struct ionic_txq_sg_desc_v1),
diff --git a/drivers/net/ionic/ionic_rxtx.c b/drivers/net/ionic/ionic_rxtx.c
index a4082e9ba4..56701e90d4 100644
--- a/drivers/net/ionic/ionic_rxtx.c
+++ b/drivers/net/ionic/ionic_rxtx.c
@@ -64,7 +64,7 @@ ionic_tx_empty(struct ionic_tx_qcq *txq)
 {
 	struct ionic_queue *q = &txq->qcq.q;
 
-	ionic_empty_array(q->info, q->num_descs, 0);
+	ionic_empty_array(q->info, q->num_descs * q->num_segs, 0);
 }
 
 static void __rte_cold
@@ -102,50 +102,49 @@ ionic_tx_flush(struct ionic_tx_qcq *txq)
 {
 	struct ionic_cq *cq = &txq->qcq.cq;
 	struct ionic_queue *q = &txq->qcq.q;
-	struct rte_mbuf *txm, *next;
-	struct ionic_txq_comp *cq_desc_base = cq->base;
-	struct ionic_txq_comp *cq_desc;
+	struct rte_mbuf *txm;
+	struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
 	void **info;
-	u_int32_t comp_index = (u_int32_t)-1;
+	uint32_t i;
 
 	cq_desc = &cq_desc_base[cq->tail_idx];
+
 	while (color_match(cq_desc->color, cq->done_color)) {
 		cq->tail_idx = Q_NEXT_TO_SRVC(cq, 1);
-
-		/* Prefetch the next 4 descriptors (not really useful here) */
-		if ((cq->tail_idx & 0x3) == 0)
-			rte_prefetch0(&cq_desc_base[cq->tail_idx]);
-
 		if (cq->tail_idx == 0)
 			cq->done_color = !cq->done_color;
 
-		comp_index = cq_desc->comp_index;
+		/* Prefetch 4 x 16B comp at cq->tail_idx + 4 */
+		if ((cq->tail_idx & 0x3) == 0)
+			rte_prefetch0(&cq_desc_base[Q_NEXT_TO_SRVC(cq, 4)]);
 
-		cq_desc = &cq_desc_base[cq->tail_idx];
-	}
+		while (q->tail_idx != rte_le_to_cpu_16(cq_desc->comp_index)) {
+			/* Prefetch 8 mbuf ptrs at q->tail_idx + 2 */
+			rte_prefetch0(IONIC_INFO_PTR(q, Q_NEXT_TO_SRVC(q, 2)));
 
-	if (comp_index != (u_int32_t)-1) {
-		while (q->tail_idx != comp_index) {
-			info = IONIC_INFO_PTR(q, q->tail_idx);
+			/* Prefetch next mbuf */
+			void **next_info =
+				IONIC_INFO_PTR(q, Q_NEXT_TO_SRVC(q, 1));
+			if (next_info[0])
+				rte_mbuf_prefetch_part2(next_info[0]);
+			if (next_info[1])
+				rte_mbuf_prefetch_part2(next_info[1]);
 
-			q->tail_idx = Q_NEXT_TO_SRVC(q, 1);
+			info = IONIC_INFO_PTR(q, q->tail_idx);
+			for (i = 0; i < q->num_segs; i++) {
+				txm = info[i];
+				if (!txm)
+					break;
 
-			/* Prefetch the next 4 descriptors */
-			if ((q->tail_idx & 0x3) == 0)
-				/* q desc info */
-				rte_prefetch0(&q->info[q->tail_idx]);
-
-			/*
-			 * Note: you can just use rte_pktmbuf_free,
-			 * but this loop is faster
-			 */
-			txm = info[0];
-			while (txm != NULL) {
-				next = txm->next;
 				rte_pktmbuf_free_seg(txm);
-				txm = next;
+
+				info[i] = NULL;
 			}
+
+			q->tail_idx = Q_NEXT_TO_SRVC(q, 1);
 		}
+
+		cq_desc = &cq_desc_base[cq->tail_idx];
 	}
 }
 
@@ -327,9 +326,12 @@ ionic_tx_tso_post(struct ionic_queue *q, struct ionic_txq_desc *desc,
 		uint16_t vlan_tci, bool has_vlan,
 		bool start, bool done)
 {
+	struct rte_mbuf *txm_seg;
 	void **info;
 	uint64_t cmd;
 	uint8_t flags = 0;
+	int i;
+
 	flags |= has_vlan ? IONIC_TXQ_DESC_FLAG_VLAN : 0;
 	flags |= encap ? IONIC_TXQ_DESC_FLAG_ENCAP : 0;
 	flags |= start ? IONIC_TXQ_DESC_FLAG_TSO_SOT : 0;
@@ -345,7 +347,13 @@ ionic_tx_tso_post(struct ionic_queue *q, struct ionic_txq_desc *desc,
 
 	if (done) {
 		info = IONIC_INFO_PTR(q, q->head_idx);
-		info[0] = txm;
+
+		/* Walk the mbuf chain to stash pointers in the array */
+		txm_seg = txm;
+		for (i = 0; i < txm->nb_segs; i++) {
+			info[i] = txm_seg;
+			txm_seg = txm_seg->next;
+		}
 	}
 
 	q->head_idx = Q_NEXT_TO_POST(q, 1);
@@ -497,8 +505,7 @@ ionic_tx(struct ionic_tx_qcq *txq, struct rte_mbuf *txm)
 	struct ionic_tx_stats *stats = &txq->stats;
 	struct rte_mbuf *txm_seg;
 	void **info;
-	bool encap;
-	bool has_vlan;
+	rte_iova_t data_iova;
 	uint64_t ol_flags = txm->ol_flags;
 	uint64_t addr, cmd;
 	uint8_t opcode = IONIC_TXQ_DESC_OPCODE_CSUM_NONE;
@@ -524,32 +531,44 @@ ionic_tx(struct ionic_tx_qcq *txq, struct rte_mbuf *txm)
 	if (opcode == IONIC_TXQ_DESC_OPCODE_CSUM_NONE)
 		stats->no_csum++;
 
-	has_vlan = (ol_flags & RTE_MBUF_F_TX_VLAN);
-	encap = ((ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) ||
-			(ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM)) &&
-			((ol_flags & RTE_MBUF_F_TX_OUTER_IPV4) ||
-			(ol_flags & RTE_MBUF_F_TX_OUTER_IPV6));
+	if (((ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) ||
+	     (ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM)) &&
+	    ((ol_flags & RTE_MBUF_F_TX_OUTER_IPV4) ||
+	     (ol_flags & RTE_MBUF_F_TX_OUTER_IPV6))) {
+		flags |= IONIC_TXQ_DESC_FLAG_ENCAP;
+	}
 
-	flags |= has_vlan ? IONIC_TXQ_DESC_FLAG_VLAN : 0;
-	flags |= encap ? IONIC_TXQ_DESC_FLAG_ENCAP : 0;
+	if (ol_flags & RTE_MBUF_F_TX_VLAN) {
+		flags |= IONIC_TXQ_DESC_FLAG_VLAN;
+		desc->vlan_tci = rte_cpu_to_le_16(txm->vlan_tci);
+	}
 
 	addr = rte_cpu_to_le_64(rte_mbuf_data_iova(txm));
 
 	cmd = encode_txq_desc_cmd(opcode, flags, txm->nb_segs - 1, addr);
 	desc->cmd = rte_cpu_to_le_64(cmd);
 	desc->len = rte_cpu_to_le_16(txm->data_len);
-	desc->vlan_tci = rte_cpu_to_le_16(txm->vlan_tci);
 
 	info[0] = txm;
 
-	elem = sg_desc_base[q->head_idx].elems;
+	if (txm->nb_segs > 1) {
+		txm_seg = txm->next;
 
-	txm_seg = txm->next;
-	while (txm_seg != NULL) {
-		elem->len = rte_cpu_to_le_16(txm_seg->data_len);
-		elem->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(txm_seg));
-		elem++;
-		txm_seg = txm_seg->next;
+		elem = sg_desc_base[q->head_idx].elems;
+
+		while (txm_seg != NULL) {
+			/* Stash the mbuf ptr in the array */
+			info++;
+			*info = txm_seg;
+
+			/* Configure the SGE */
+			data_iova = rte_mbuf_data_iova(txm_seg);
+			elem->len = rte_cpu_to_le_16(txm_seg->data_len);
+			elem->addr = rte_cpu_to_le_64(data_iova);
+			elem++;
+
+			txm_seg = txm_seg->next;
+		}
 	}
 
 	q->head_idx = Q_NEXT_TO_POST(q, 1);
@@ -565,11 +584,19 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	struct ionic_queue *q = &txq->qcq.q;
 	struct ionic_tx_stats *stats = &txq->stats;
 	struct rte_mbuf *mbuf;
-	uint32_t next_q_head_idx;
 	uint32_t bytes_tx = 0;
 	uint16_t nb_avail, nb_tx = 0;
 	int err;
 
+	struct ionic_txq_desc *desc_base = q->base;
+	rte_prefetch0(&desc_base[q->head_idx]);
+	rte_prefetch0(IONIC_INFO_PTR(q, q->head_idx));
+
+	if (tx_pkts) {
+		rte_mbuf_prefetch_part1(tx_pkts[0]);
+		rte_mbuf_prefetch_part2(tx_pkts[0]);
+	}
+
 	/* Cleaning old buffers */
 	ionic_tx_flush(txq);
 
@@ -580,11 +607,13 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	}
 
 	while (nb_tx < nb_pkts) {
-		next_q_head_idx = Q_NEXT_TO_POST(q, 1);
-		if ((next_q_head_idx & 0x3) == 0) {
-			struct ionic_txq_desc *desc_base = q->base;
-			rte_prefetch0(&desc_base[next_q_head_idx]);
-			rte_prefetch0(&q->info[next_q_head_idx]);
+		uint16_t next_idx = Q_NEXT_TO_POST(q, 1);
+		rte_prefetch0(&desc_base[next_idx]);
+		rte_prefetch0(IONIC_INFO_PTR(q, next_idx));
+
+		if (nb_tx + 1 < nb_pkts) {
+			rte_mbuf_prefetch_part1(tx_pkts[nb_tx + 1]);
+			rte_mbuf_prefetch_part2(tx_pkts[nb_tx + 1]);
 		}
 
 		mbuf = tx_pkts[nb_tx];
@@ -605,10 +634,10 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	if (nb_tx > 0) {
 		rte_wmb();
 		ionic_q_flush(q);
-	}
 
-	stats->packets += nb_tx;
-	stats->bytes += bytes_tx;
+		stats->packets += nb_tx;
+		stats->bytes += bytes_tx;
+	}
 
 	return nb_tx;
 }
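
Note (not part of the patch): a minimal standalone sketch of the
linearization idea from the commit message, assuming a simplified queue
model. Everything named here (struct seg, struct txq, info_ptr, txq_post,
txq_flush, NUM_DESCS, NUM_SEGS) is a hypothetical stand-in and not an
ionic or DPDK identifier. It only illustrates stashing one pointer per
segment at post time so that the flush side can free from a flat array
without dereferencing each segment's next pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf segment; only what the sketch needs. */
struct seg {
	struct seg *next;
	uint16_t nb_segs;	/* meaningful in the first segment only */
	int freed;		/* set by seg_free() so results can be inspected */
};

#define NUM_DESCS 4		/* ring size, assumed to be a power of two */
#define NUM_SEGS  4		/* info slots reserved per descriptor */

struct txq {
	void *info[NUM_DESCS * NUM_SEGS];
	uint16_t head_idx;
	uint16_t tail_idx;
};

/* Return the first info slot belonging to descriptor idx. */
static void **info_ptr(struct txq *q, uint16_t idx)
{
	return &q->info[idx * NUM_SEGS];
}

/* Stand-in for returning a single segment to its pool. */
static void seg_free(struct seg *s)
{
	s->next = NULL;
	s->freed = 1;
}

/* Post side: walk the chain once and stash every segment pointer. */
static void txq_post(struct txq *q, struct seg *pkt)
{
	void **info = info_ptr(q, q->head_idx);
	struct seg *s = pkt;
	uint16_t i;

	assert(pkt->nb_segs <= NUM_SEGS);
	for (i = 0; i < pkt->nb_segs; i++) {
		info[i] = s;
		s = s->next;
	}
	q->head_idx = (q->head_idx + 1) & (NUM_DESCS - 1);
}

/*
 * Flush side: free descriptors from tail_idx up to, but not including,
 * stop_idx. Only the flat array is read; a NULL slot ends a packet early.
 * Returns the number of segments freed.
 */
static unsigned int txq_flush(struct txq *q, uint16_t stop_idx)
{
	unsigned int nfreed = 0;
	uint16_t i;

	while (q->tail_idx != stop_idx) {
		void **info = info_ptr(q, q->tail_idx);

		for (i = 0; i < NUM_SEGS; i++) {
			if (info[i] == NULL)
				break;
			seg_free(info[i]);
			info[i] = NULL;
			nfreed++;
		}
		q->tail_idx = (q->tail_idx + 1) & (NUM_DESCS - 1);
	}
	return nfreed;
}
```

The cost of this layout in the sketch is a larger info array (NUM_SEGS
slots per descriptor instead of one), which is the same kind of trade the
patch makes when it sizes the array by num_descs * num_segs.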