Message ID | 20220524152041.737154-1-spiked@nvidia.com (mailing list archive) |
---|---|
Headers |
From: Spike Du <spiked@nvidia.com> To: matan@nvidia.com, viacheslavo@nvidia.com, orika@nvidia.com, thomas@monjalon.net Cc: dev@dpdk.org, rasland@nvidia.com Subject: [PATCH v3 0/7] introduce per-queue limit watermark and host shaper Date: Tue, 24 May 2022 18:20:34 +0300 Message-ID: <20220524152041.737154-1-spiked@nvidia.com> In-Reply-To: <20220522055900.417282-1-spiked@nvidia.com> References: <20220522055900.417282-1-spiked@nvidia.com> List-Id: DPDK patches and discussions <dev.dpdk.org> |
Series | introduce per-queue limit watermark and host shaper |
Message
Spike Du
May 24, 2022, 3:20 p.m. UTC
LWM (limit watermark) is a per-Rx-queue attribute: when the fullness of an Rx queue reaches the LWM limit, the HW sends an event to the DPDK application. The host shaper can configure a shaper rate and an lwm-triggered flag for a host port. The shaper limits the rate of traffic from the host port to the wire port. If lwm-triggered is enabled, a 100 Mbps shaper is enabled automatically when one of the host port's Rx queues receives an LWM event. These two features can be combined to control traffic from the host port to the wire port. The workflow is: configure an LWM on the Rx queue and enable the lwm-triggered flag in the host shaper; after receiving the LWM event, delay a while until the Rx queue is empty, then disable the shaper. This workflow is repeated to reduce Rx queue drops. Add a new ethdev library API to set the LWM, and add the event RTE_ETH_EVENT_RXQ_LIMIT_REACHED to handle LWM events. For the host shaper, because it doesn't align with the existing DPDK framework and is specific to an Nvidia NIC, use a PMD private API. For integration with testpmd, put the private cmdline function and the LWM event handler in the mlx5 PMD directory by adding a new file, mlx5_testpmd.c. Only minimal code is added in testpmd to invoke interfaces from mlx5_testpmd.c.
Spike Du (7):
  net/mlx5: add LWM support for Rxq
  common/mlx5: share interrupt management
  ethdev: introduce Rx queue based limit watermark
  net/mlx5: add LWM event handling support
  net/mlx5: support Rx queue based limit watermark
  net/mlx5: add private API to config host port shaper
  app/testpmd: add LWM and Host Shaper command

 app/test-pmd/cmdline.c                       |  74 +++++
 app/test-pmd/config.c                        |  21 ++
 app/test-pmd/meson.build                     |   4 +
 app/test-pmd/testpmd.c                       |  24 ++
 app/test-pmd/testpmd.h                       |   1 +
 doc/guides/nics/mlx5.rst                     |  84 ++++++
 doc/guides/rel_notes/release_22_07.rst       |   2 +
 drivers/common/mlx5/linux/meson.build        |  13 +
 drivers/common/mlx5/linux/mlx5_common_os.c   | 131 +++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h   |  11 +
 drivers/common/mlx5/mlx5_prm.h               |  26 ++
 drivers/common/mlx5/version.map              |   2 +
 drivers/common/mlx5/windows/mlx5_common_os.h |  24 ++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c      |  71 -----
 drivers/net/mlx5/linux/mlx5_os.c             | 132 ++-------
 drivers/net/mlx5/linux/mlx5_socket.c         |  53 +---
 drivers/net/mlx5/mlx5.c                      |  68 +++++
 drivers/net/mlx5/mlx5.h                      |  12 +-
 drivers/net/mlx5/mlx5_devx.c                 |  60 +++-
 drivers/net/mlx5/mlx5_devx.h                 |   1 +
 drivers/net/mlx5/mlx5_rx.c                   | 292 +++++++++++++++++++
 drivers/net/mlx5/mlx5_rx.h                   |  13 +
 drivers/net/mlx5/mlx5_testpmd.c              | 184 ++++++++++++
 drivers/net/mlx5/mlx5_testpmd.h              |  27 ++
 drivers/net/mlx5/mlx5_txpp.c                 |  28 +-
 drivers/net/mlx5/rte_pmd_mlx5.h              |  30 ++
 drivers/net/mlx5/version.map                 |   2 +
 drivers/net/mlx5/windows/mlx5_ethdev_os.c    |  22 --
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c          |  48 +--
 lib/ethdev/ethdev_driver.h                   |  22 ++
 lib/ethdev/rte_ethdev.c                      |  52 ++++
 lib/ethdev/rte_ethdev.h                      |  71 +++++
 lib/ethdev/version.map                       |   2 +
 33 files changed, 1299 insertions(+), 308 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.c
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.h
Comments
+Cc people involved in previous versions

24/05/2022 17:20, Spike Du:
> LWM(limit watermark) is per RX queue attribute, when RX queue fullness reach the LWM limit, HW sends an event to dpdk application.
> Host shaper can configure shaper rate and lwm-triggered for a host port.
> The shaper limits the rate of traffic from host port to wire port.
> If lwm-triggered is enabled, a 100Mbps shaper is enabled automatically when one of the host port's Rx queues receives LWM event.
>
> These two features can combine to control traffic from host port to wire port.
> The work flow is configure LWM to RX queue and enable lwm-triggered flag in host shaper, after receiving LWM event, delay a while until RX queue is empty, then disable the shaper. We recycle this work flow to reduce RX queue drops.
>
> Add new libethdev API to set LWM, add rte event RTE_ETH_EVENT_RXQ_LIMIT_REACHED to handle LWM event. For host shaper, because it doesn't align to existing DPDK framework and is specific to Nvidia NIC, use PMD private API.
>
> For integration with testpmd, put the private cmdline function and LWM event handler in mlx5 PMD directory by adding a new file mlx5_test.c. Only add minimal code in testpmd to invoke interfaces from mlx5_test.c.
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, 24 May 2022 17.59
>
> +Cc people involved in previous versions
>
> 24/05/2022 17:20, Spike Du:
> > LWM(limit watermark) is per RX queue attribute, when RX queue fullness reach the LWM limit, HW sends an event to dpdk application.
> > Host shaper can configure shaper rate and lwm-triggered for a host port.

Please ignore this comment, it is not important, but I had to get it out of my system: I assume that the "LWM" name is from the NIC datasheet; otherwise I would probably prefer something with "threshold"... LWM is easily confused with "low water mark", which is the opposite of what the LWM does. Names are always open for discussion, so I won't object to it.

> > The shaper limits the rate of traffic from host port to wire port.

From host to wire? It is RX, so you must mean from wire to host.

> > If lwm-triggered is enabled, a 100Mbps shaper is enabled automatically when one of the host port's Rx queues receives LWM event.
> >
> > These two features can combine to control traffic from host port to wire port.

Again, you mean from wire to host?

> > The work flow is configure LWM to RX queue and enable lwm-triggered flag in host shaper, after receiving LWM event, delay a while until RX queue is empty, then disable the shaper. We recycle this work flow to reduce RX queue drops.

You delay while RX queue gets drained by some other threads, I assume. Surely, the excess packets must be dropped somewhere, e.g. by the shaper?

> > Add new libethdev API to set LWM, add rte event RTE_ETH_EVENT_RXQ_LIMIT_REACHED to handle LWM event.

Makes sense to make it public; could be usable for other purposes, similar to interrupt coalescing, as mentioned by Stephen.

> > For host shaper, because it doesn't align to existing DPDK framework and is specific to Nvidia NIC, use PMD private API.

Makes sense to keep it private.

> > For integration with testpmd, put the private cmdline function and LWM event handler in mlx5 PMD directory by adding a new file mlx5_test.c. Only add minimal code in testpmd to invoke interfaces from mlx5_test.c.
> >
> > Spike Du (7):
> >   net/mlx5: add LWM support for Rxq
> >   common/mlx5: share interrupt management
> >   ethdev: introduce Rx queue based limit watermark
> >   net/mlx5: add LWM event handling support
> >   net/mlx5: support Rx queue based limit watermark
> >   net/mlx5: add private API to config host port shaper
> >   app/testpmd: add LWM and Host Shaper command
24/05/2022 21:00, Morten Brørup:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 24/05/2022 17:20, Spike Du:
> > > LWM(limit watermark) is per RX queue attribute, when RX queue fullness reach the LWM limit, HW sends an event to dpdk application.
>
> Please ignore this comment, it is not important, but I had to get it out of my system: I assume that the "LWM" name is from the NIC datasheet; otherwise I would probably prefer something with "threshold"... LWM is easily confused with "low water mark", which is the opposite of what the LWM does. Names are always open for discussion, so I won't object to it.

Yes it is a threshold, and yes it is often called a watermark.
I think we can get more ideas and votes about the naming.
Please let's conclude on a short name which can be inserted easily in function names.
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, May 25, 2022 3:00 AM
> Subject: RE: [PATCH v3 0/7] introduce per-queue limit watermark and host shaper
>
> > > The shaper limits the rate of traffic from host port to wire port.
>
> From host to wire? It is RX, so you must mean from wire to host.

The host shaper is quite private to Nvidia's BlueField 2 NIC. The NIC is inserted in a server, which we call the host-system, and the NIC has an embedded Arm-system which does the forwarding.
The traffic flows from the host-system to the wire like this: the host-system generates traffic and sends it to the Arm-system, and the Arm-system sends it to the physical/wire port.
So the RX happens between the host-system and the Arm-system, and the traffic is host to wire.
The shaper also works in a special way: you configure it on the Arm-system, but it takes effect on the host-system's TX side.

> > > If lwm-triggered is enabled, a 100Mbps shaper is enabled automatically when one of the host port's Rx queues receives LWM event.
> > >
> > > These two features can combine to control traffic from host port to wire port.
>
> Again, you mean from wire to host?

Pls see above.

> > > The work flow is configure LWM to RX queue and enable lwm-triggered flag in host shaper, after receiving LWM event, delay a while until RX queue is empty, then disable the shaper. We recycle this work flow to reduce RX queue drops.
>
> You delay while RX queue gets drained by some other threads, I assume.

The PMD thread drains the Rx queue, and the PMD receives as normal; the PMD implementation uses the rte interrupt thread to handle the LWM event.

> Surely, the excess packets must be dropped somewhere, e.g. by the shaper?
>
> > > Add new libethdev API to set LWM, add rte event RTE_ETH_EVENT_RXQ_LIMIT_REACHED to handle LWM event.
>
> Makes sense to make it public; could be usable for other purposes, similar to interrupt coalescing, as mentioned by Stephen.
>
> > > For host shaper, because it doesn't align to existing DPDK framework and is specific to Nvidia NIC, use PMD private API.
>
> Makes sense to keep it private.
> From: Spike Du [mailto:spiked@nvidia.com]
> Sent: Wednesday, 25 May 2022 15.15
>
> > > > The shaper limits the rate of traffic from host port to wire port.
> >
> > From host to wire? It is RX, so you must mean from wire to host.
>
> The host shaper is quite private to Nvidia's BlueField 2 NIC. The NIC is inserted in a server, which we call the host-system, and the NIC has an embedded Arm-system which does the forwarding.
> The traffic flows from the host-system to the wire like this: the host-system generates traffic and sends it to the Arm-system, and the Arm-system sends it to the physical/wire port.
> So the RX happens between the host-system and the Arm-system, and the traffic is host to wire.
> The shaper also works in a special way: you configure it on the Arm-system, but it takes effect on the host-system's TX side.
>
> > You delay while RX queue gets drained by some other threads, I assume.
>
> The PMD thread drains the Rx queue, and the PMD receives as normal; the PMD implementation uses the rte interrupt thread to handle the LWM event.

Thank you for the explanation, Spike. It really clarifies a lot!

If this patch is intended for DPDK running on the host-system, then the LWM attribute is associated with a TX queue, not an RX queue. The packets are egressing from the host-system, so TX from the host-system's perspective.

Otherwise, if this patch is for DPDK running on the embedded ARM-system, it should be highlighted somewhere.

> > Surely, the excess packets must be dropped somewhere, e.g. by the shaper?

I guess the shaper doesn't have to drop any packets, but the host-system will simply be unable to put more packets into the queue if it runs full.
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, May 25, 2022 9:40 PM
>
> Thank you for the explanation, Spike. It really clarifies a lot!
>
> If this patch is intended for DPDK running on the host-system, then the LWM attribute is associated with a TX queue, not an RX queue. The packets are egressing from the host-system, so TX from the host-system's perspective.
>
> Otherwise, if this patch is for DPDK running on the embedded ARM-system, it should be highlighted somewhere.

The host-shaper patch runs on the ARM-system; I think that patch has some explanation in mlx5.rst.
The LWM patch is common and should work on any Rx queue (right now mlx5 doesn't support it for hairpin Rx queues and shared Rx queues).
On the ARM-system, we can use it to monitor traffic from the host (representor port) or from the wire (physical port).
LWM can also work on the host-system if DPDK is running there; for example, it can monitor traffic from the Arm-system to the host-system.

> > Surely, the excess packets must be dropped somewhere, e.g. by the shaper?
>
> I guess the shaper doesn't have to drop any packets, but the host-system will simply be unable to put more packets into the queue if it runs full.

When the LWM event happens, the host-shaper throttles traffic from the host-system to the Arm-system. Yes, the shaper doesn't drop packets.
Normally the shaper is small, and if the PMD thread on the Arm keeps working, the Rx queue is dropless.
But if the PMD thread doesn't receive fast enough, or even with a small shaper the host-system sends a burst, the Rx queue may still drop on the Arm.
Anyway, even when drops still sometimes happen, the cooperation of the host-shaper and LWM greatly reduces Rx drops on the Arm.
On 5/24/22 22:22, Thomas Monjalon wrote:
> 24/05/2022 21:00, Morten Brørup:
>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>> 24/05/2022 17:20, Spike Du:
>>>> LWM(limit watermark) is per RX queue attribute, when RX queue fullness reach the LWM limit, HW sends an event to dpdk application.
>>
>> Please ignore this comment, it is not important, but I had to get it out of my system: I assume that the "LWM" name is from the NIC datasheet; otherwise I would probably prefer something with "threshold"... LWM is easily confused with "low water mark", which is the opposite of what the LWM does. Names are always open for discussion, so I won't object to it.
>
> Yes it is a threshold, and yes it is often called a watermark.
> I think we can get more ideas and votes about the naming.
> Please let's conclude on a short name which can be inserted
> easily in function names.

As I understand it, this is an Rx queue fill (level) threshold: "fill_thresh", or "flt" if the first one is too long.
> From: Spike Du [mailto:spiked@nvidia.com] > Sent: Wednesday, 25 May 2022 15.59 > > > From: Morten Brørup <mb@smartsharesystems.com> > > Sent: Wednesday, May 25, 2022 9:40 PM > > > > > From: Spike Du [mailto:spiked@nvidia.com] > > > Sent: Wednesday, 25 May 2022 15.15 > > > > > > > From: Morten Brørup <mb@smartsharesystems.com> > > > > Sent: Wednesday, May 25, 2022 3:00 AM > > > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > Sent: Tuesday, 24 May 2022 17.59 > > > > > > > > > > +Cc people involved in previous versions > > > > > > > > > > 24/05/2022 17:20, Spike Du: > > > > > > LWM(limit watermark) is per RX queue attribute, when RX queue > > > > > fullness reach the LWM limit, HW sends an event to dpdk > > > application. > > > > > > Host shaper can configure shaper rate and lwm-triggered for a > > > host > > > > > port. > > > > > > > > > > > > > > The shaper limits the rate of traffic from host port to wire > > > port. > > > > > > > > From host to wire? It is RX, so you must mean from wire to host. > > > > > > The host shaper is quite private to Nvidia's BlueField 2 NIC. The > NIC > > > is inserted In a server which we call it host-system, and the NIC > has > > > an embedded Arm-system Which does the forwarding. > > > The traffic flows from host-system to wire like this: > > > Host-system generates traffic, send it to Arm-system, Arm sends it > to > > > physical/wire port. > > > So the RX happens between host-system and Arm-system, and the > traffic > > > is host to wire. > > > The shaper also works in a special way: you configure it on > > > Arm-system, but it takes effect On host-sysmem's TX side. > > > > > > > > > > > > > If lwm-triggered is enabled, a 100Mbps shaper is enabled > > > > > automatically when one of the host port's Rx queues receives > LWM > > > event. > > > > > > > > > > > > These two features can combine to control traffic from host > port > > > to > > > > > wire port. > > > > > > > > Again, you mean from wire to host? 
>>> Pls see above.
>>>
>>>>>> The work flow is: configure LWM on the Rx queue and enable the
>>>>>> lwm-triggered flag in the host shaper; after receiving an LWM
>>>>>> event, delay a while until the Rx queue is empty, then disable
>>>>>> the shaper. We recycle this work flow to reduce Rx queue drops.
>>>>
>>>> You delay while the RX queue gets drained by some other threads, I
>>>> assume.
>>>
>>> The PMD thread drains the Rx queue, with the PMD receiving as
>>> normal, as the PMD implementation uses the rte interrupt thread to
>>> handle the LWM event.
>>
>> Thank you for the explanation, Spike. It really clarifies a lot!
>>
>> If this patch is intended for DPDK running on the host-system, then
>> the LWM attribute is associated with a TX queue, not an RX queue.
>> The packets are egressing from the host-system, so TX from the
>> host-system's perspective.
>>
>> Otherwise, if this patch is for DPDK running on the embedded
>> Arm-system, it should be highlighted somewhere.
>
> The host-shaper patch runs on the Arm-system; I think that patch has
> some explanation in mlx5.rst.
> The LWM patch is common and should work on any Rx queue (right now
> mlx5 doesn't support it on hairpin Rx queues and shared Rx queues).
> On the Arm-system, we can use it to monitor traffic from the host
> (representor port) or from the wire (physical port).
> LWM can also work on the host-system if there is DPDK running; for
> example, it can monitor traffic from the Arm-system to the
> host-system.

OK. Then I get it! I was reading the patch description wearing my
host-system glasses, and thus got very confused. :-)

>>
>>>> Surely, the excess packets must be dropped somewhere, e.g. by the
>>>> shaper?
>>
>> I guess the shaper doesn't have to drop any packets, but the
>> host-system will simply be unable to put more packets into the queue
>> if it runs full.
>
> When the LWM event happens, the host-shaper throttles traffic from
> host-system to Arm-system. Yes, the shaper doesn't drop packets.
> Normally the shaper is small, and if the PMD thread on Arm keeps
> working, the Rx queue is dropless.
> But if the PMD thread doesn't receive fast enough, or if the
> host-system sends a burst even with a small shaper, the Rx queue may
> still drop on Arm.
> Anyway, even if drops still sometimes happen, the cooperation of the
> host-shaper and LWM greatly reduces the Rx drops on Arm.

Thanks for elaborating. And yes, shapers are excellent for many
scenarios.
On 5/25/22 17:16, Morten Brørup wrote:
>> From: Spike Du [mailto:spiked@nvidia.com]
>> Sent: Wednesday, 25 May 2022 15.59
>>
>>> From: Morten Brørup <mb@smartsharesystems.com>
>>> Sent: Wednesday, May 25, 2022 9:40 PM
>>>
>>>> From: Spike Du [mailto:spiked@nvidia.com]
>>>> Sent: Wednesday, 25 May 2022 15.15
>>>>
>>>>> From: Morten Brørup <mb@smartsharesystems.com>
>>>>> Sent: Wednesday, May 25, 2022 3:00 AM
>>>>>
>>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>>>>> Sent: Tuesday, 24 May 2022 17.59
>>>>>>
>>>>>> +Cc people involved in previous versions
>>>>>>
>>>>>> 24/05/2022 17:20, Spike Du:
>>>>>>> LWM (limit watermark) is a per-Rx-queue attribute. When Rx queue
>>>>>>> fullness reaches the LWM limit, HW sends an event to the DPDK
>>>>>>> application.
>>>>>>> Host shaper can configure shaper rate and lwm-triggered for a
>>>>>>> host port.
>>>>>>>
>>>>>>> The shaper limits the rate of traffic from host port to wire
>>>>>>> port.
>>>>>
>>>>> From host to wire? It is RX, so you must mean from wire to host.
>>>>
>>>> The host shaper is quite private to Nvidia's BlueField-2 NIC. The
>>>> NIC is inserted in a server, which we call the host-system, and
>>>> the NIC has an embedded Arm-system which does the forwarding.
>>>> The traffic flows from host-system to wire like this: the
>>>> host-system generates traffic and sends it to the Arm-system, and
>>>> the Arm-system sends it to the physical/wire port.
>>>> So the RX happens between host-system and Arm-system, and the
>>>> traffic is host to wire.
>>>> The shaper also works in a special way: you configure it on the
>>>> Arm-system, but it takes effect on the host-system's TX side.
>>>>
>>>>>>> If lwm-triggered is enabled, a 100Mbps shaper is enabled
>>>>>>> automatically when one of the host port's Rx queues receives an
>>>>>>> LWM event.
>>>>>>>
>>>>>>> These two features can combine to control traffic from host
>>>>>>> port to wire port.
>>>>>
>>>>> Again, you mean from wire to host?
>>>>
>>>> Pls see above.
>>>>
>>>>>>> The work flow is: configure LWM on the Rx queue and enable the
>>>>>>> lwm-triggered flag in the host shaper; after receiving an LWM
>>>>>>> event, delay a while until the Rx queue is empty, then disable
>>>>>>> the shaper. We recycle this work flow to reduce Rx queue drops.
>>>>>
>>>>> You delay while the RX queue gets drained by some other threads,
>>>>> I assume.
>>>>
>>>> The PMD thread drains the Rx queue, with the PMD receiving as
>>>> normal, as the PMD implementation uses the rte interrupt thread to
>>>> handle the LWM event.
>>>>
>>>
>>> Thank you for the explanation, Spike. It really clarifies a lot!
>>>
>>> If this patch is intended for DPDK running on the host-system, then
>>> the LWM attribute is associated with a TX queue, not an RX queue.
>>> The packets are egressing from the host-system, so TX from the
>>> host-system's perspective.
>>>
>>> Otherwise, if this patch is for DPDK running on the embedded
>>> Arm-system, it should be highlighted somewhere.
>>
>> The host-shaper patch runs on the Arm-system; I think that patch has
>> some explanation in mlx5.rst.
>> The LWM patch is common and should work on any Rx queue (right now
>> mlx5 doesn't support it on hairpin Rx queues and shared Rx queues).
>> On the Arm-system, we can use it to monitor traffic from the host
>> (representor port) or from the wire (physical port).
>> LWM can also work on the host-system if there is DPDK running; for
>> example, it can monitor traffic from the Arm-system to the
>> host-system.
>
> OK. Then I get it! I was reading the patch description wearing my
> host-system glasses, and thus got very confused. :-)

The description in the cover letter was very misleading for me as well.
It is not a problem right now, after the long, detailed explanations.
Hopefully there is no such problem in the suggested ethdev
documentation. I'll reread it carefully before applying when the time
comes.

>
>>
>>>>> Surely, the excess packets must be dropped somewhere, e.g. by the
>>>>> shaper?
>>>
>>> I guess the shaper doesn't have to drop any packets, but the
>>> host-system will simply be unable to put more packets into the
>>> queue if it runs full.
>>>
>>
>> When the LWM event happens, the host-shaper throttles traffic from
>> host-system to Arm-system. Yes, the shaper doesn't drop packets.
>> Normally the shaper is small, and if the PMD thread on Arm keeps
>> working, the Rx queue is dropless.
>> But if the PMD thread doesn't receive fast enough, or if the
>> host-system sends a burst even with a small shaper, the Rx queue may
>> still drop on Arm.
>> Anyway, even if drops still sometimes happen, the cooperation of the
>> host-shaper and LWM greatly reduces the Rx drops on Arm.
>
> Thanks for elaborating. And yes, shapers are excellent for many
> scenarios.
>