From patchwork Fri Oct 7 17:43:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Boyer, Andrew"
X-Patchwork-Id: 117599
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Andrew Boyer
To:
CC: Andrew Boyer
Subject: [PATCH 20/35] net/ionic: overhaul transmit side for performance
Date: Fri, 7 Oct 2022 10:43:21 -0700
Message-ID: <20221007174336.54354-21-andrew.boyer@amd.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221007174336.54354-1-andrew.boyer@amd.com>
References: <20221007174336.54354-1-andrew.boyer@amd.com>
List-Id: DPDK patches and discussions

Linearize Tx mbuf chains in the info array.
This avoids walking the mbuf chain during flush.
Move a few branches out of the hot path.

Signed-off-by: Andrew Boyer
---
 drivers/net/ionic/ionic_lif.c  |   2 +-
 drivers/net/ionic/ionic_rxtx.c | 143 ++++++++++++++++++++-------------
 2 files changed, 87 insertions(+), 58 deletions(-)

diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index db5d42dda6..ac9b69fc70 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -817,7 +817,7 @@ ionic_tx_qcq_alloc(struct ionic_lif *lif, uint32_t socket_id, uint32_t index,
 		"tx",
 		flags,
 		ntxq_descs,
-		1,
+		num_segs_fw,
 		sizeof(struct ionic_txq_desc),
 		sizeof(struct ionic_txq_comp),
 		sizeof(struct ionic_txq_sg_desc_v1),
diff --git a/drivers/net/ionic/ionic_rxtx.c b/drivers/net/ionic/ionic_rxtx.c
index bb6ca019d9..53b0add228 100644
--- a/drivers/net/ionic/ionic_rxtx.c
+++ b/drivers/net/ionic/ionic_rxtx.c
@@ -64,7 +64,7 @@ ionic_tx_empty(struct ionic_tx_qcq *txq)
 {
 	struct ionic_queue *q = &txq->qcq.q;
 
-	ionic_empty_array(q->info, q->num_descs, 0);
+	ionic_empty_array(q->info, q->num_descs * q->num_segs, 0);
 }
 
 static void __rte_cold
@@ -102,50 +102,49 @@ ionic_tx_flush(struct ionic_tx_qcq *txq)
 {
 	struct ionic_cq *cq = &txq->qcq.cq;
 	struct ionic_queue *q = &txq->qcq.q;
-	struct rte_mbuf *txm, *next;
-	struct ionic_txq_comp *cq_desc_base = cq->base;
-	struct ionic_txq_comp *cq_desc;
+	struct rte_mbuf *txm;
+	struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
 	void **info;
-	u_int32_t comp_index = (u_int32_t)-1;
+	uint32_t i;
 
 	cq_desc = &cq_desc_base[cq->tail_idx];
+
 	while (color_match(cq_desc->color, cq->done_color)) {
 		cq->tail_idx = Q_NEXT_TO_SRVC(cq, 1);
-
-		/* Prefetch the next 4 descriptors (not really useful here) */
-		if ((cq->tail_idx & 0x3) == 0)
-			rte_prefetch0(&cq_desc_base[cq->tail_idx]);
-
 		if (cq->tail_idx == 0)
 			cq->done_color = !cq->done_color;
 
-		comp_index = cq_desc->comp_index;
+		/* Prefetch 4 x 16B comp at cq->tail_idx + 4 */
+		if ((cq->tail_idx & 0x3) == 0)
+			rte_prefetch0(&cq_desc_base[Q_NEXT_TO_SRVC(cq, 4)]);
 
-		cq_desc = &cq_desc_base[cq->tail_idx];
-	}
+		while (q->tail_idx != rte_le_to_cpu_16(cq_desc->comp_index)) {
+			/* Prefetch 8 mbuf ptrs at q->tail_idx + 2 */
+			rte_prefetch0(IONIC_INFO_PTR(q, Q_NEXT_TO_SRVC(q, 2)));
 
-	if (comp_index != (u_int32_t)-1) {
-		while (q->tail_idx != comp_index) {
-			info = IONIC_INFO_PTR(q, q->tail_idx);
+			/* Prefetch next mbuf */
+			void **next_info =
+				IONIC_INFO_PTR(q, Q_NEXT_TO_SRVC(q, 1));
+			if (next_info[0])
+				rte_mbuf_prefetch_part2(next_info[0]);
+			if (next_info[1])
+				rte_mbuf_prefetch_part2(next_info[1]);
 
-			q->tail_idx = Q_NEXT_TO_SRVC(q, 1);
+			info = IONIC_INFO_PTR(q, q->tail_idx);
+			for (i = 0; i < q->num_segs; i++) {
+				txm = info[i];
+				if (!txm)
+					break;
 
-			/* Prefetch the next 4 descriptors */
-			if ((q->tail_idx & 0x3) == 0)
-				/* q desc info */
-				rte_prefetch0(&q->info[q->tail_idx]);
-
-			/*
-			 * Note: you can just use rte_pktmbuf_free,
-			 * but this loop is faster
-			 */
-			txm = info[0];
-			while (txm != NULL) {
-				next = txm->next;
 				rte_pktmbuf_free_seg(txm);
-				txm = next;
+
+				info[i] = NULL;
 			}
+
+			q->tail_idx = Q_NEXT_TO_SRVC(q, 1);
 		}
+
+		cq_desc = &cq_desc_base[cq->tail_idx];
 	}
 }
 
@@ -327,9 +326,12 @@ ionic_tx_tso_post(struct ionic_queue *q, struct ionic_txq_desc *desc,
 		uint16_t vlan_tci, bool has_vlan,
 		bool start, bool done)
 {
+	struct rte_mbuf *txm_seg;
 	void **info;
 	uint64_t cmd;
 	uint8_t flags = 0;
+	int i;
+
 	flags |= has_vlan ? IONIC_TXQ_DESC_FLAG_VLAN : 0;
 	flags |= encap ? IONIC_TXQ_DESC_FLAG_ENCAP : 0;
 	flags |= start ? IONIC_TXQ_DESC_FLAG_TSO_SOT : 0;
@@ -345,7 +347,13 @@ ionic_tx_tso_post(struct ionic_queue *q, struct ionic_txq_desc *desc,
 
 	if (done) {
 		info = IONIC_INFO_PTR(q, q->head_idx);
-		info[0] = txm;
+
+		/* Walk the mbuf chain to stash pointers in the array */
+		txm_seg = txm;
+		for (i = 0; i < txm->nb_segs; i++) {
+			info[i] = txm_seg;
+			txm_seg = txm_seg->next;
+		}
 	}
 
 	q->head_idx = Q_NEXT_TO_POST(q, 1);
@@ -497,8 +505,7 @@ ionic_tx(struct ionic_tx_qcq *txq, struct rte_mbuf *txm)
 	struct ionic_tx_stats *stats = &txq->stats;
 	struct rte_mbuf *txm_seg;
 	void **info;
-	bool encap;
-	bool has_vlan;
+	rte_iova_t data_iova;
 	uint64_t ol_flags = txm->ol_flags;
 	uint64_t addr, cmd;
 	uint8_t opcode = IONIC_TXQ_DESC_OPCODE_CSUM_NONE;
@@ -524,32 +531,44 @@ ionic_tx(struct ionic_tx_qcq *txq, struct rte_mbuf *txm)
 	if (opcode == IONIC_TXQ_DESC_OPCODE_CSUM_NONE)
 		stats->no_csum++;
 
-	has_vlan = (ol_flags & RTE_MBUF_F_TX_VLAN);
-	encap = ((ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) ||
-			(ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM)) &&
-			((ol_flags & RTE_MBUF_F_TX_OUTER_IPV4) ||
-			 (ol_flags & RTE_MBUF_F_TX_OUTER_IPV6));
+	if (((ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) ||
+	     (ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM)) &&
+	    ((ol_flags & RTE_MBUF_F_TX_OUTER_IPV4) ||
+	     (ol_flags & RTE_MBUF_F_TX_OUTER_IPV6))) {
+		flags |= IONIC_TXQ_DESC_FLAG_ENCAP;
+	}
 
-	flags |= has_vlan ? IONIC_TXQ_DESC_FLAG_VLAN : 0;
-	flags |= encap ? IONIC_TXQ_DESC_FLAG_ENCAP : 0;
+	if (ol_flags & RTE_MBUF_F_TX_VLAN) {
+		flags |= IONIC_TXQ_DESC_FLAG_VLAN;
+		desc->vlan_tci = rte_cpu_to_le_16(txm->vlan_tci);
+	}
 
 	addr = rte_cpu_to_le_64(rte_mbuf_data_iova(txm));
 
 	cmd = encode_txq_desc_cmd(opcode, flags, txm->nb_segs - 1, addr);
 	desc->cmd = rte_cpu_to_le_64(cmd);
 	desc->len = rte_cpu_to_le_16(txm->data_len);
-	desc->vlan_tci = rte_cpu_to_le_16(txm->vlan_tci);
 
 	info[0] = txm;
 
-	elem = sg_desc_base[q->head_idx].elems;
+	if (txm->nb_segs > 1) {
+		txm_seg = txm->next;
 
-	txm_seg = txm->next;
-	while (txm_seg != NULL) {
-		elem->len = rte_cpu_to_le_16(txm_seg->data_len);
-		elem->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(txm_seg));
-		elem++;
-		txm_seg = txm_seg->next;
+		elem = sg_desc_base[q->head_idx].elems;
+
+		while (txm_seg != NULL) {
+			/* Stash the mbuf ptr in the array */
+			info++;
+			*info = txm_seg;
+
+			/* Configure the SGE */
+			data_iova = rte_mbuf_data_iova(txm_seg);
+			elem->len = rte_cpu_to_le_16(txm_seg->data_len);
+			elem->addr = rte_cpu_to_le_64(data_iova);
+			elem++;
+
+			txm_seg = txm_seg->next;
+		}
 	}
 
 	q->head_idx = Q_NEXT_TO_POST(q, 1);
@@ -565,11 +584,19 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	struct ionic_queue *q = &txq->qcq.q;
 	struct ionic_tx_stats *stats = &txq->stats;
 	struct rte_mbuf *mbuf;
-	uint32_t next_q_head_idx;
 	uint32_t bytes_tx = 0;
 	uint16_t nb_avail, nb_tx = 0;
 	int err;
 
+	struct ionic_txq_desc *desc_base = q->base;
+	rte_prefetch0(&desc_base[q->head_idx]);
+	rte_prefetch0(IONIC_INFO_PTR(q, q->head_idx));
+
+	if (tx_pkts) {
+		rte_mbuf_prefetch_part1(tx_pkts[0]);
+		rte_mbuf_prefetch_part2(tx_pkts[0]);
+	}
+
 	/* Cleaning old buffers */
 	ionic_tx_flush(txq);
 
@@ -580,11 +607,13 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	}
 
 	while (nb_tx < nb_pkts) {
-		next_q_head_idx = Q_NEXT_TO_POST(q, 1);
-		if ((next_q_head_idx & 0x3) == 0) {
-			struct ionic_txq_desc *desc_base = q->base;
-			rte_prefetch0(&desc_base[next_q_head_idx]);
-			rte_prefetch0(&q->info[next_q_head_idx]);
+		uint16_t next_idx = Q_NEXT_TO_POST(q, 1);
+		rte_prefetch0(&desc_base[next_idx]);
+		rte_prefetch0(IONIC_INFO_PTR(q, next_idx));
+
+		if (nb_tx + 1 < nb_pkts) {
+			rte_mbuf_prefetch_part1(tx_pkts[nb_tx + 1]);
+			rte_mbuf_prefetch_part2(tx_pkts[nb_tx + 1]);
 		}
 
 		mbuf = tx_pkts[nb_tx];
@@ -605,10 +634,10 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	if (nb_tx > 0) {
 		rte_wmb();
 		ionic_q_flush(q);
-	}
 
-	stats->packets += nb_tx;
-	stats->bytes += bytes_tx;
+		stats->packets += nb_tx;
+		stats->bytes += bytes_tx;
+	}
 
 	return nb_tx;
 }
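
Note for readers new to the driver: the idea in the commit message (store every segment pointer of a Tx chain in per-descriptor slots of the info array, so that the flush path frees slots rather than following next pointers) can be sketched outside DPDK. What follows is a minimal standalone C sketch under stated assumptions, not driver code. struct seg, struct ring, NUM_DESCS, NUM_SEGS, make_chain() and the ring_*() helpers are hypothetical stand-ins for rte_mbuf, the ionic queue and IONIC_INFO_PTR, and free() stands in for rte_pktmbuf_free_seg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_DESCS 4	/* hypothetical ring size, must be a power of two */
#define NUM_SEGS  3	/* hypothetical max segments per packet */

/* Toy stand-in for struct rte_mbuf: only the chain link is modeled. */
struct seg {
	struct seg *next;
};

/* Toy stand-in for the Tx queue: NUM_SEGS pointer slots per descriptor. */
struct ring {
	void *info[NUM_DESCS * NUM_SEGS];
	unsigned int head_idx;
	unsigned int tail_idx;
	unsigned int freed;	/* freed-segment counter, for illustration */
};

/* Return the first info slot belonging to descriptor idx. */
static void **ring_info_ptr(struct ring *r, unsigned int idx)
{
	return &r->info[idx * NUM_SEGS];
}

/* Post: walk the chain once and stash each segment pointer in its slot. */
static void ring_post(struct ring *r, struct seg *pkt, unsigned int nb_segs)
{
	void **info = ring_info_ptr(r, r->head_idx);
	struct seg *s = pkt;
	unsigned int i;

	assert(nb_segs <= NUM_SEGS);
	for (i = 0; i < nb_segs; i++) {
		info[i] = s;
		s = s->next;
	}
	r->head_idx = (r->head_idx + 1) & (NUM_DESCS - 1);
}

/*
 * Flush descriptors from tail_idx up to, but not including, stop_idx.
 * Each descriptor's slots are freed in order until the first NULL slot;
 * the loop never follows a segment's next pointer.
 */
static void ring_flush(struct ring *r, unsigned int stop_idx)
{
	unsigned int i;

	while (r->tail_idx != stop_idx) {
		void **info = ring_info_ptr(r, r->tail_idx);

		for (i = 0; i < NUM_SEGS; i++) {
			if (info[i] == NULL)
				break;
			free(info[i]);
			info[i] = NULL;	/* keep the slot clean for reuse */
			r->freed++;
		}
		r->tail_idx = (r->tail_idx + 1) & (NUM_DESCS - 1);
	}
}

/* Build a chain of n heap-allocated segments (illustration only). */
static struct seg *make_chain(unsigned int n)
{
	struct seg *head = NULL;

	while (n-- > 0) {
		struct seg *s = malloc(sizeof(*s));

		assert(s != NULL);
		s->next = head;
		head = s;
	}
	return head;
}
```

On a zero-initialized ring, posting a three-segment chain and then a one-segment chain, followed by ring_flush(&r, r.head_idx), frees four segments and leaves every used slot NULL again. Resetting each slot is what allows the flush loop to stop at the first NULL slot, and it is also why the slot array has to be sized as descriptors times segments, as the ionic_tx_empty() hunk does.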