From patchwork Thu Oct 21 08:56:35 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rongwei Liu
X-Patchwork-Id: 102542
X-Patchwork-Delegate: rasland@nvidia.com
From: Rongwei Liu
To: , , , , Ray Kinsella
CC: , , Jiawei Wang
Date: Thu, 21 Oct 2021 11:56:35 +0300
Message-ID: <20211021085637.3627922-2-rongweil@nvidia.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20211021085637.3627922-1-rongweil@nvidia.com>
References: <20211021085637.3627922-1-rongweil@nvidia.com>
Subject: [dpdk-dev] [PATCH v1 1/2] common/mlx5: support lag context query
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

Add a new API, mlx5_devx_cmd_query_lag(), to query LAG properties from
firmware, including state, affinity, mode, etc.

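For illustration only, the field positions declared in
mlx5_ifc_lag_context_bits (see the mlx5_prm.h hunk below) can be decoded
by hand as sketched here. This is a standalone sketch, not the driver
code: struct lag_info, lag_get_field() and lag_decode() are hypothetical
names, and the sketch assumes the 8-byte context is laid out as two
big-endian dwords, which is how the MLX5_GET() accessors are understood
to read it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_devx_lag_context. */
struct lag_info {
	uint8_t fdb_selection_mode;
	uint8_t port_select_mode;
	uint8_t lag_state;
	uint8_t tx_remap_affinity_1;
	uint8_t tx_remap_affinity_2;
};

/*
 * Extract a field of `width` bits that starts `bit_off` bits from the
 * beginning of a big-endian buffer. The field must not cross a 32-bit
 * boundary.
 */
static uint32_t
lag_get_field(const uint8_t *buf, unsigned int bit_off, unsigned int width)
{
	const uint8_t *dw = buf + (bit_off / 32) * 4;
	uint32_t v = ((uint32_t)dw[0] << 24) | ((uint32_t)dw[1] << 16) |
		     ((uint32_t)dw[2] << 8) | (uint32_t)dw[3];
	unsigned int shift;

	assert(width >= 1 && width < 32 && (bit_off % 32) + width <= 32);
	shift = 32 - width - (bit_off % 32);
	return (v >> shift) & ((1u << width) - 1);
}

/* Decode the 8-byte LAG context using the bit offsets from this patch. */
static void
lag_decode(const uint8_t ctx[8], struct lag_info *info)
{
	info->fdb_selection_mode = lag_get_field(ctx, 0x00, 1);
	info->port_select_mode = lag_get_field(ctx, 0x15, 3);
	info->lag_state = lag_get_field(ctx, 0x1d, 3);
	info->tx_remap_affinity_2 = lag_get_field(ctx, 0x34, 4);
	info->tx_remap_affinity_1 = lag_get_field(ctx, 0x3c, 4);
}
```

In the driver itself the same values come from MLX5_GET(lag_context, ...)
on the output of the QUERY_LAG command, so callers only need
mlx5_devx_cmd_query_lag().
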
Signed-off-by: Jiawei Wang
Signed-off-by: Rongwei Liu
Acked-by: Matan Azrad
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 40 +++++++++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h | 13 ++++++++
 drivers/common/mlx5/mlx5_prm.h       | 45 +++++++++++++++++++++++++++-
 drivers/common/mlx5/version.map      |  1 +
 4 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 6538bce57b..fb7c8e986f 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -2800,3 +2800,43 @@ mlx5_devx_cmd_create_crypto_login_obj(void *ctx,
 	crypto_login_obj->id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
 	return crypto_login_obj;
 }
+
+/**
+ * Query LAG context.
+ *
+ * @param[in] ctx
+ *   Pointer to ibv_context, returned from mlx5dv_open_device.
+ * @param[out] lag_ctx
+ *   Pointer to struct mlx5_devx_lag_context, to be set by the routine.
+ *
+ * @return
+ *   0 on success, a negative value otherwise.
+ */
+int
+mlx5_devx_cmd_query_lag(void *ctx,
+			struct mlx5_devx_lag_context *lag_ctx)
+{
+	uint32_t in[MLX5_ST_SZ_DW(query_lag_in)] = {0};
+	uint32_t out[MLX5_ST_SZ_DW(query_lag_out)] = {0};
+	void *lctx;
+	int rc;
+
+	MLX5_SET(query_lag_in, in, opcode, MLX5_CMD_OP_QUERY_LAG);
+	rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
+	if (rc)
+		goto error;
+	lctx = MLX5_ADDR_OF(query_lag_out, out, context);
+	lag_ctx->fdb_selection_mode = MLX5_GET(lag_context, lctx,
+					       fdb_selection_mode);
+	lag_ctx->port_select_mode = MLX5_GET(lag_context, lctx,
+					     port_select_mode);
+	lag_ctx->lag_state = MLX5_GET(lag_context, lctx, lag_state);
+	lag_ctx->tx_remap_affinity_2 = MLX5_GET(lag_context, lctx,
+						tx_remap_affinity_2);
+	lag_ctx->tx_remap_affinity_1 = MLX5_GET(lag_context, lctx,
+						tx_remap_affinity_1);
+	return 0;
+error:
+	rc = (rc > 0) ? -rc : rc;
+	return rc;
+}
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 6948cadd37..5e4f3b749e 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -197,6 +197,15 @@ struct mlx5_hca_attr {
 	uint32_t umr_indirect_mkey_disabled:1;
 };
 
+/* LAG Context. */
+struct mlx5_devx_lag_context {
+	uint32_t fdb_selection_mode:1;
+	uint32_t port_select_mode:3;
+	uint32_t lag_state:3;
+	uint32_t tx_remap_affinity_1:4;
+	uint32_t tx_remap_affinity_2:4;
+};
+
 struct mlx5_devx_wq_attr {
 	uint32_t wq_type:4;
 	uint32_t wq_signature:1;
@@ -681,4 +690,8 @@ struct mlx5_devx_obj *
 mlx5_devx_cmd_create_crypto_login_obj(void *ctx,
 				      struct mlx5_devx_crypto_login_attr *attr);
 
+__rte_internal
+int
+mlx5_devx_cmd_query_lag(void *ctx,
+			struct mlx5_devx_lag_context *lag_ctx);
 #endif /* RTE_PMD_MLX5_DEVX_CMDS_H_ */
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 54e62aa153..eab80eaead 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -1048,6 +1048,7 @@ enum {
 	MLX5_CMD_OP_DEALLOC_PD = 0x801,
 	MLX5_CMD_OP_ACCESS_REGISTER = 0x805,
 	MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN = 0x816,
+	MLX5_CMD_OP_QUERY_LAG = 0x842,
 	MLX5_CMD_OP_CREATE_TIR = 0x900,
 	MLX5_CMD_OP_MODIFY_TIR = 0x901,
 	MLX5_CMD_OP_CREATE_SQ = 0X904,
@@ -1507,7 +1508,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8 uar_4k[0x1];
 	u8 reserved_at_241[0x9];
 	u8 uar_sz[0x6];
-	u8 reserved_at_250[0x8];
+	u8 port_selection_cap[0x1];
+	u8 reserved_at_251[0x7];
 	u8 log_pg_sz[0x8];
 	u8 bf[0x1];
 	u8 driver_version[0x1];
@@ -1974,6 +1976,14 @@ struct mlx5_ifc_query_nic_vport_context_in_bits {
 	u8 reserved_at_68[0x18];
 };
 
+/*
+ * lag_tx_port_affinity: 0 auto-selection, 1 PF1, 2 PF2, and so on.
+ * Each TIS binds to one PF by setting lag_tx_port_affinity (>0).
+ * Once LAG is enabled, we create multiple TISs and bind each one to
+ * a different PF, then TIS[i] gets affinity i+1 and goes to PF i+1.
+ */
+#define MLX5_IFC_LAG_MAP_TIS_AFFINITY(index, num) ((num) ? \
+	(index) % (num) + 1 : 0)
 struct mlx5_ifc_tisc_bits {
 	u8 strict_lag_tx_port_affinity[0x1];
 	u8 reserved_at_1[0x3];
@@ -2007,6 +2017,39 @@ struct mlx5_ifc_query_tis_in_bits {
 	u8 reserved_at_60[0x20];
 };
 
+/* port_select_mode definition. */
+enum mlx5_lag_mode_type {
+	MLX5_LAG_MODE_TIS = 0,
+	MLX5_LAG_MODE_HASH = 1,
+};
+
+struct mlx5_ifc_lag_context_bits {
+	u8 fdb_selection_mode[0x1];
+	u8 reserved_at_1[0x14];
+	u8 port_select_mode[0x3];
+	u8 reserved_at_18[0x5];
+	u8 lag_state[0x3];
+	u8 reserved_at_20[0x14];
+	u8 tx_remap_affinity_2[0x4];
+	u8 reserved_at_38[0x4];
+	u8 tx_remap_affinity_1[0x4];
+};
+
+struct mlx5_ifc_query_lag_in_bits {
+	u8 opcode[0x10];
+	u8 uid[0x10];
+	u8 reserved_at_20[0x10];
+	u8 op_mod[0x10];
+	u8 reserved_at_40[0x40];
+};
+
+struct mlx5_ifc_query_lag_out_bits {
+	u8 status[0x8];
+	u8 reserved_at_8[0x18];
+	u8 syndrome[0x20];
+	struct mlx5_ifc_lag_context_bits context;
+};
+
 struct mlx5_ifc_alloc_transport_domain_out_bits {
 	u8 status[0x8];
 	u8 reserved_at_8[0x18];
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index 5be01505b5..0fda60bd11 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -53,6 +53,7 @@ INTERNAL {
 	mlx5_devx_cmd_modify_virtq;
 	mlx5_devx_cmd_qp_query_tis_td;
 	mlx5_devx_cmd_query_hca_attr;
+	mlx5_devx_cmd_query_lag;
 	mlx5_devx_cmd_query_parse_samples;
 	mlx5_devx_cmd_query_virtio_q_counters; # WINDOWS_NO_EXPORT
 	mlx5_devx_cmd_query_virtq;

From patchwork Thu Oct 21 08:56:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rongwei Liu
X-Patchwork-Id: 102541
X-Patchwork-Delegate: rasland@nvidia.com
From: Rongwei Liu
Date: Thu, 21 Oct 2021 11:56:36 +0300
Message-ID: <20211021085637.3627922-3-rongweil@nvidia.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20211021085637.3627922-1-rongweil@nvidia.com>
References: <20211021085637.3627922-1-rongweil@nvidia.com>
Subject: [dpdk-dev] [PATCH v1 2/2] net/mlx5: set txq affinity in round-robin
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

Previously, the txq affinity was set to 0 and the firmware performed
round-robin port selection when bonding. Firmware uses a global counter
and assigns each txq to a physical port according to the remainder
after division.

There are three disadvantages:
1. The global counter is shared between the kernel and DPDK.
2. After restarting the PMD or a port, the previous counter value is
   reused, so the new affinity is unpredictable.
3. There is no way to query which affinity was set by firmware.

With this update, the PMD creates several TISs, up to the number of
bonding ports, and binds each TIS to one PF port. Each port starts
picking TISs from its own port index, so an upper-layer application can
quickly calculate each txq's affinity without querying.

At the DPDK layer, when creating txqs with 2 bonding ports, the
affinity is set as follows:
port 0: 1-->2-->1-->2
port 1: 2-->1-->2-->1
port 2: 1-->2-->1-->2

Note: Only applicable to the DevX API. When HW hash based LAG is
enabled, this affinity is subject to the HW hash.

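The affinity table above can be reproduced with a few lines of
standalone C. This is only a sketch of the arithmetic, not the driver
code: lag_tis_affinity() and lag_txq_affinity() are hypothetical helper
names, start_idx stands for priv->lag_affinity_idx, and the sketch
assumes TIS based LAG mode with start_idx equal to the port index.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Affinity of TIS[index] when `num` ports are bonded: 1..num, or 0
 * (automatic selection) when there is no bonding. Same arithmetic as
 * MLX5_IFC_LAG_MAP_TIS_AFFINITY() from patch 1/2.
 */
static uint32_t
lag_tis_affinity(uint32_t index, uint32_t num)
{
	return num ? index % num + 1 : 0;
}

/*
 * Affinity of a Tx queue: the port's starting index plus the queue
 * index selects the TIS, following mlx5_get_txq_tis_num().
 */
static uint32_t
lag_txq_affinity(uint32_t start_idx, uint16_t queue_idx, uint32_t n_port)
{
	uint32_t tis_idx;

	assert(n_port > 0);
	tis_idx = (start_idx + queue_idx) % n_port;
	return lag_tis_affinity(tis_idx, n_port);
}
```

With n_port = 2, a port whose start_idx is 1 gets affinities 2, 1, 2, 1
for queues 0 to 3, which matches the "port 1" row above.
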
Signed-off-by: Rongwei Liu
Acked-by: Matan Azrad
---
 doc/guides/nics/mlx5.rst         |  4 ++
 drivers/net/mlx5/linux/mlx5_os.c |  2 +-
 drivers/net/mlx5/mlx5.c          | 81 ++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5.h          | 10 +++-
 drivers/net/mlx5/mlx5_devx.c     | 37 ++++++++++++++-
 drivers/net/mlx5/mlx5_txpp.c     |  4 +-
 6 files changed, 124 insertions(+), 14 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 7b540504f9..dd059b227d 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -464,6 +464,10 @@ Limitations
   - In order to achieve best insertion rate, application should manage the flows per lcore.
   - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache.
 
+- HW hashed bonding
+
+  - TXQ affinity is subject to HW hash once enabled.
+
 Statistics
 ----------
 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 8a25ec8730..7356c91c92 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -878,7 +878,6 @@ mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
 	return false;
 }
 
-
 /**
  * Spawn an Ethernet device from Verbs information.
  *
@@ -1668,6 +1667,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 */
 	MLX5_ASSERT(spawn->ifindex);
 	priv->if_index = spawn->ifindex;
+	priv->lag_affinity_idx = sh->refcnt - 1;
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index c712fc3465..ae54b18ad5 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1256,6 +1256,68 @@ mlx5_dev_ctx_shared_mempool_subscribe(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * Set up multiple TISs with different affinities according to
+ * number of bonding ports.
+ *
+ * @param sh
+ *   Pointer to shared context.
+ *
+ * @return
+ *   Zero on success, -1 otherwise.
+ */
+static int
+mlx5_setup_tis(struct mlx5_dev_ctx_shared *sh)
+{
+	int i;
+	struct mlx5_devx_lag_context lag_ctx = { 0 };
+	struct mlx5_devx_tis_attr tis_attr = { 0 };
+
+	tis_attr.transport_domain = sh->td->id;
+	if (sh->bond.n_port) {
+		if (!mlx5_devx_cmd_query_lag(sh->ctx, &lag_ctx)) {
+			sh->lag.tx_remap_affinity[0] =
+				lag_ctx.tx_remap_affinity_1;
+			sh->lag.tx_remap_affinity[1] =
+				lag_ctx.tx_remap_affinity_2;
+			sh->lag.affinity_mode = lag_ctx.port_select_mode;
+		} else {
+			DRV_LOG(ERR, "Failed to query lag affinity.");
+			return -1;
+		}
+		if (sh->lag.affinity_mode == MLX5_LAG_MODE_TIS) {
+			for (i = 0; i < sh->bond.n_port; i++) {
+				tis_attr.lag_tx_port_affinity =
+					MLX5_IFC_LAG_MAP_TIS_AFFINITY(i,
+							sh->bond.n_port);
+				sh->tis[i] = mlx5_devx_cmd_create_tis(sh->ctx,
+						&tis_attr);
+				if (!sh->tis[i]) {
+					DRV_LOG(ERR, "Failed to create TIS %d/%d for bonding device"
+						" %s.", i, sh->bond.n_port,
+						sh->ibdev_name);
+					return -1;
+				}
+			}
+			DRV_LOG(DEBUG, "LAG number of ports : %d, affinity_1 & 2 : pf%d & %d.",
+				sh->bond.n_port, lag_ctx.tx_remap_affinity_1,
+				lag_ctx.tx_remap_affinity_2);
+			return 0;
+		}
+		if (sh->lag.affinity_mode == MLX5_LAG_MODE_HASH)
+			DRV_LOG(INFO, "Device %s enabled HW hash based LAG.",
+					sh->ibdev_name);
+	}
+	tis_attr.lag_tx_port_affinity = 0;
+	sh->tis[0] = mlx5_devx_cmd_create_tis(sh->ctx, &tis_attr);
+	if (!sh->tis[0]) {
+		DRV_LOG(ERR, "Failed to create TIS 0 for bonding device"
+			" %s.", sh->ibdev_name);
+		return -1;
+	}
+	return 0;
+}
+
 /**
  * Allocate shared device context. If there is multiport device the
  * master and representors will share this context, if there is single
@@ -1283,7 +1345,6 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 	struct mlx5_dev_ctx_shared *sh;
 	int err = 0;
 	uint32_t i;
-	struct mlx5_devx_tis_attr tis_attr = { 0 };
 
 	MLX5_ASSERT(spawn);
 	/* Secondary process should not create the shared context. */
@@ -1354,9 +1415,7 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 			err = ENOMEM;
 			goto error;
 		}
-		tis_attr.transport_domain = sh->td->id;
-		sh->tis = mlx5_devx_cmd_create_tis(sh->ctx, &tis_attr);
-		if (!sh->tis) {
+		if (mlx5_setup_tis(sh)) {
 			DRV_LOG(ERR, "TIS allocation failure");
 			err = ENOMEM;
 			goto error;
@@ -1420,10 +1479,13 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 	MLX5_ASSERT(sh);
 	if (sh->share_cache.cache.table)
 		mlx5_mr_btree_free(&sh->share_cache.cache);
-	if (sh->tis)
-		claim_zero(mlx5_devx_cmd_destroy(sh->tis));
 	if (sh->td)
 		claim_zero(mlx5_devx_cmd_destroy(sh->td));
+	i = 0;
+	do {
+		if (sh->tis[i])
+			claim_zero(mlx5_devx_cmd_destroy(sh->tis[i]));
+	} while (++i < (uint32_t)sh->bond.n_port);
 	if (sh->devx_rx_uar)
 		mlx5_glue->devx_free_uar(sh->devx_rx_uar);
 	if (sh->tx_uar)
@@ -1449,6 +1511,7 @@ void
 mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh)
 {
 	int ret;
+	int i = 0;
 
 	pthread_mutex_lock(&mlx5_dev_ctx_list_mutex);
 #ifdef RTE_LIBRTE_MLX5_DEBUG
@@ -1510,8 +1573,10 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh)
 	}
 	if (sh->pd)
 		claim_zero(mlx5_os_dealloc_pd(sh->pd));
-	if (sh->tis)
-		claim_zero(mlx5_devx_cmd_destroy(sh->tis));
+	do {
+		if (sh->tis[i])
+			claim_zero(mlx5_devx_cmd_destroy(sh->tis[i]));
+	} while (++i < sh->bond.n_port);
 	if (sh->td)
 		claim_zero(mlx5_devx_cmd_destroy(sh->td));
 	if (sh->devx_rx_uar)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index adab9dc052..dc385a8cbb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1120,6 +1120,12 @@ struct mlx5_aso_ct_pools_mng {
 	struct mlx5_aso_sq aso_sq; /* ASO queue objects. */
 };
 
+/* LAG attr. */
+struct mlx5_lag {
+	uint8_t tx_remap_affinity[16]; /* The PF port number of affinity */
+	uint8_t affinity_mode; /* TIS or hash based affinity */
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -1187,8 +1193,9 @@ struct mlx5_dev_ctx_shared {
 	struct rte_intr_handle intr_handle; /* Interrupt handler for device. */
 	struct rte_intr_handle intr_handle_devx; /* DEVX interrupt handler. */
 	void *devx_comp; /* DEVX async comp obj. */
-	struct mlx5_devx_obj *tis; /* TIS object. */
+	struct mlx5_devx_obj *tis[16]; /* TIS objects. */
 	struct mlx5_devx_obj *td; /* Transport domain. */
+	struct mlx5_lag lag; /* LAG attributes */
 	void *tx_uar; /* Tx/packet pacing shared UAR. */
 	struct mlx5_flex_parser_profiles fp[MLX5_FLEX_PARSER_MAX];
 	/* Flex parser profiles information. */
@@ -1454,6 +1461,7 @@ struct mlx5_priv {
 	uint32_t rss_shared_actions; /* RSS shared actions. */
 	struct mlx5_devx_obj *q_counters; /* DevX queue counter object. */
 	uint32_t counter_set_id; /* Queue counter ID to set in DevX objects. */
+	uint32_t lag_affinity_idx; /* Starting affinity index of queue 0 in LAG mode. */
 };
 
 #define PORT_ID(priv) ((priv)->dev_data->port_id)
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a49602cb95..a24b1b897d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -888,6 +888,37 @@ mlx5_devx_drop_action_destroy(struct rte_eth_dev *dev)
 	rte_errno = ENOTSUP;
 }
 
+/**
+ * Select TXQ TIS number.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param queue_idx
+ *   Queue index in DPDK Tx queue array.
+ *
+ * @return
+ *   TIS number of the TIS object selected for the queue.
+ */
+static uint32_t
+mlx5_get_txq_tis_num(struct rte_eth_dev *dev, uint16_t queue_idx)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	int tis_idx;
+
+	if (priv->sh->bond.n_port && priv->sh->lag.affinity_mode ==
+			MLX5_LAG_MODE_TIS) {
+		tis_idx = (priv->lag_affinity_idx + queue_idx) %
+			priv->sh->bond.n_port;
+		DRV_LOG(INFO, "port %d txq %d gets affinity %d and maps to PF %d.",
+			dev->data->port_id, queue_idx, tis_idx + 1,
+			priv->sh->lag.tx_remap_affinity[tis_idx]);
+	} else {
+		tis_idx = 0;
+	}
+	MLX5_ASSERT(priv->sh->tis[tis_idx]);
+	return priv->sh->tis[tis_idx]->id;
+}
+
 /**
  * Create the Tx hairpin queue object.
  *
@@ -935,7 +966,8 @@ mlx5_txq_obj_hairpin_new(struct rte_eth_dev *dev, uint16_t idx)
 	attr.wq_attr.log_hairpin_num_packets =
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
-	attr.tis_num = priv->sh->tis->id;
+
+	attr.tis_num = mlx5_get_txq_tis_num(dev, idx);
 	tmpl->sq = mlx5_devx_cmd_create_sq(priv->sh->ctx, &attr);
 	if (!tmpl->sq) {
 		DRV_LOG(ERR,
@@ -992,14 +1024,15 @@ mlx5_txq_create_devx_sq_resources(struct rte_eth_dev *dev, uint16_t idx,
 		.allow_swp = !!priv->config.swp,
 		.cqn = txq_obj->cq_obj.cq->id,
 		.tis_lst_sz = 1,
-		.tis_num = priv->sh->tis->id,
 		.wq_attr = (struct mlx5_devx_wq_attr){
 			.pd = priv->sh->pdn,
 			.uar_page =
 				 mlx5_os_get_devx_uar_page_id(priv->sh->tx_uar),
 		},
 		.ts_format = mlx5_ts_format_conv(priv->sh->sq_ts_format),
+		.tis_num = mlx5_get_txq_tis_num(dev, idx),
 	};
+
 	/* Create Send Queue object with DevX. */
 	return mlx5_devx_sq_create(priv->sh->ctx, &txq_obj->sq_obj, log_desc_n,
 				   &sq_attr, priv->sh->numa_node);
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 2be7e71f89..6e874fa090 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -230,7 +230,7 @@ mlx5_txpp_create_rearm_queue(struct mlx5_dev_ctx_shared *sh)
 		.cd_master = 1,
 		.state = MLX5_SQC_STATE_RST,
 		.tis_lst_sz = 1,
-		.tis_num = sh->tis->id,
+		.tis_num = sh->tis[0]->id,
 		.wq_attr = (struct mlx5_devx_wq_attr){
 			.pd = sh->pdn,
 			.uar_page = mlx5_os_get_devx_uar_page_id(sh->tx_uar),
@@ -433,7 +433,7 @@ mlx5_txpp_create_clock_queue(struct mlx5_dev_ctx_shared *sh)
 	/* Create send queue object for Clock Queue. */
 	if (sh->txpp.test) {
 		sq_attr.tis_lst_sz = 1;
-		sq_attr.tis_num = sh->tis->id;
+		sq_attr.tis_num = sh->tis[0]->id;
 		sq_attr.non_wire = 0;
 		sq_attr.static_sq_wq = 1;
 	} else {