From patchwork Tue Feb 7 16:07:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi
X-Patchwork-Id: 123356
X-Patchwork-Delegate: thomas@monjalon.net
From: Srikanth Yalavarthi
To: Thomas Monjalon, Srikanth Yalavarthi
Subject: [PATCH v5 38/39] ml/cnxk: add user guide for marvell cnxk ml driver
Date: Tue, 7 Feb 2023 08:07:18 -0800
Message-ID: <20230207160719.1307-39-syalavarthi@marvell.com>
In-Reply-To: <20230207160719.1307-1-syalavarthi@marvell.com>
References: <20221208200220.20267-1-syalavarthi@marvell.com>
 <20230207160719.1307-1-syalavarthi@marvell.com>
List-Id: DPDK patches and discussions

Added user guide for Marvell cnxk ML driver for Marvell Octeon cnxk
SoC family. Added details about device initialization, debug options
and runtime device args supported by the driver.

Signed-off-by: Srikanth Yalavarthi
Acked-by: Shivah Shankar S
Acked-by: Prince Takkar
---
 MAINTAINERS                 |   1 +
 doc/guides/index.rst        |   1 +
 doc/guides/mldevs/cnxk.rst  | 238 ++++++++++++++++++++++++++++++++++++
 doc/guides/mldevs/index.rst |  14 +++
 4 files changed, 254 insertions(+)
 create mode 100644 doc/guides/mldevs/cnxk.rst
 create mode 100644 doc/guides/mldevs/index.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index 8e9d6dc946..65153948d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1442,6 +1442,7 @@ M: Srikanth Yalavarthi
 F: drivers/common/cnxk/hw/ml.h
 F: drivers/common/cnxk/roc_ml*
 F: drivers/ml/cnxk/
+F: doc/guides/mldevs/cnxk.rst
 
 
 Packet processing
diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 5eb5bd9c9a..0bd729530a 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -26,6 +26,7 @@ DPDK documentation
    eventdevs/index
    rawdevs/index
    mempool/index
+   mldevs/index
    platform/index
    contributing/index
    rel_notes/index
diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst
new file mode 100644
index 0000000000..da40336299
--- /dev/null
+++ b/doc/guides/mldevs/cnxk.rst
@@ -0,0 +1,238 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright (c) 2022 Marvell.
+
+Marvell cnxk Machine Learning Poll Mode Driver
+==============================================
+
+The cnxk ML poll mode driver provides support for offloading Machine
+Learning inference operations to Machine Learning accelerator units
+on the **Marvell OCTEON cnxk** SoC family.
+
+The cnxk ML PMD code is organized into multiple files with all file names
+starting with cn10k, providing support for CN106XX and CN106XXS.
+
+More information about OCTEON cnxk SoCs may be obtained from ``_
+
+Supported OCTEON cnxk SoCs
+--------------------------
+
+- CN106XX
+- CN106XXS
+
+Features
+--------
+
+The OCTEON cnxk ML PMD provides support for the following set of operations:
+
+Slow-path device and ML model handling:
+
+* ``Device probing, configuration and close``
+* ``Device start / stop``
+* ``Model loading and unloading``
+* ``Model start / stop``
+* ``Data quantization and dequantization``
+
+Fast-path Inference:
+
+* ``Inference execution``
+* ``Error handling``
+
+
+Installation
+------------
+
+The OCTEON cnxk ML PMD may be compiled natively on an OCTEON cnxk platform
+or cross-compiled on an x86 platform.
+
+Refer to :doc:`../platform/cnxk` for instructions to build your DPDK
+application.
+
+
+Initialization
+--------------
+
+``CN10K Initialization``
+
+List the ML PF devices available on cn10k platform:
+
+.. code-block:: console
+
+    lspci -d:a092
+
+``a092`` is the ML device PF ID. You should see output similar to:
+
+.. code-block:: console
+
+    0000:00:10.0 System peripheral: Cavium, Inc. Device a092
+
+Bind the ML PF device to the vfio-pci driver:
+
+.. code-block:: console
+
+    cd 
+    ./usertools/dpdk-devbind.py -u 0000:00:10.0
+    ./usertools/dpdk-devbind.py -b vfio-pci 0000:00:10.0
+
+Runtime Config Options
+----------------------
+
+- ``Firmware file path`` (default ``/lib/firmware/mlip-fw.bin``)
+
+  Path to the firmware binary to be loaded during device configuration.
+  The ``fw_path`` ``devargs`` parameter can be used by the user to load
+  ML firmware from a custom path.
+
+  For example::
+
+     -a 0000:00:10.0,fw_path="/home/user/ml_fw.bin"
+
+  With the above configuration, the driver loads the firmware from the path
+  "/home/user/ml_fw.bin".
+
+- ``Enable DPE warnings`` (default ``1``)
+
+  ML firmware can be configured during load to handle the DPE errors reported
+  by ML inference engine. When enabled, firmware would mask the DPE non-fatal
+  hardware errors as warnings. The parameter ``enable_dpe_warnings`` ``devargs``
+  is used for this configuration.
+
+  For example::
+
+     -a 0000:00:10.0,enable_dpe_warnings=0
+
+  With the above configuration, DPE non-fatal errors reported by HW are
+  considered as errors.
+
+
+- ``Model data caching`` (default ``1``)
+
+  Enable caching model data on ML ACC cores. Enabling this option executes a
+  dummy inference request in synchronous mode during model start stage. Caching
+  of model data improves the inferencing throughput / latency for the model.
+  The parameter ``cache_model_data`` ``devargs`` is used to enable data caching.
+
+  For example::
+
+     -a 0000:00:10.0,cache_model_data=0
+
+  With the above configuration, model data caching is disabled.
+
+
+- ``OCM allocation mode`` (default ``lowest``)
+
+  Option to specify the method to be used while allocating OCM memory for a
+  model during model start. Two modes are supported by the driver. The
+  parameter ``ocm_alloc_mode`` ``devargs`` is used to select the OCM
+  allocation mode.
+
+  ``lowest`` - Allocate OCM for the model from first available free slot. Search
+  for the free slot is done starting from the lowest tile ID and lowest page ID.
+  ``largest`` - Allocate OCM for the model from the slot with largest amount of
+  free space.
+
+  For example::
+
+     -a 0000:00:10.0,ocm_alloc_mode=lowest
+
+  With the above configuration, OCM allocation for the model would be done from
+  the first available free slot / from the lowest possible tile ID.
+
+
+- ``Enable hardware queue lock`` (default ``0``)
+
+  Option to select the job request enqueue function to be used to queue the requests
+  to hardware queue. The parameter ``hw_queue_lock`` ``devargs`` is used to select
+  the enqueue function.
+
+  ``0`` - Disable (default), use lock free version of hardware enqueue function
+  for job queuing in enqueue burst operation. To avoid race condition in request
+  queuing to hardware, disabling hw_queue_lock restricts the number of queue-pairs
+  supported by cnxk driver to 1.
+  ``1`` - Enable, use spinlock version of hardware enqueue function for job queuing.
+  Enabling spinlock version would disable restrictions on the number of queue-pairs
+  that can be supported by the driver.
+
+  For example::
+
+     -a 0000:00:10.0,hw_queue_lock=1
+
+  With the above configuration, spinlock version of hardware enqueue function is used
+  in the fast path enqueue burst operation.
+
+
+- ``Polling memory location`` (default ``ddr``)
+
+  ML cnxk driver provides the option to select the memory location to be used
+  for polling to check the inference request completion. Driver supports using
+  either the DDR address space (``ddr``) or ML registers (``register``) as
+  polling locations. The parameter ``poll_mem`` ``devargs`` is used to specify
+  the poll location.
+
+  For example::
+
+     -a 0000:00:10.0,poll_mem="register"
+
+  With the above configuration, ML cnxk driver is configured to use ML registers
+  for polling in fastpath requests.
+
+
+Debugging Options
+-----------------
+
+.. _table_octeon_cnxk_ml_debug_options:
+
+.. table:: OCTEON cnxk ML PMD debug options
+
+   +---+------------+-------------------------------------------------------+
+   | # | Component  | EAL log command                                       |
+   +===+============+=======================================================+
+   | 1 | ML         | --log-level='pmd\.ml\.cnxk,8'                         |
+   +---+------------+-------------------------------------------------------+
+
+
+Extended stats
+--------------
+
+Marvell cnxk ML PMD supports reporting the inference latencies through extended
+stats. The PMD supports the below list of 6 extended stats types for each model.
+Total number of extended stats would be equal to 6 x number of models loaded.
+
+.. _table_octeon_cnxk_ml_xstats_names:
+
+.. table:: OCTEON cnxk ML PMD xstats names
+
+   +---+---------------------+----------------------------------------------+
+   | # | Type                | Description                                  |
+   +===+=====================+==============================================+
+   | 1 | Avg-HW-Latency      | Average hardware latency                     |
+   +---+---------------------+----------------------------------------------+
+   | 2 | Min-HW-Latency      | Minimum hardware latency                     |
+   +---+---------------------+----------------------------------------------+
+   | 3 | Max-HW-Latency      | Maximum hardware latency                     |
+   +---+---------------------+----------------------------------------------+
+   | 4 | Avg-FW-Latency      | Average firmware latency                     |
+   +---+---------------------+----------------------------------------------+
+   | 5 | Min-FW-Latency      | Minimum firmware latency                     |
+   +---+---------------------+----------------------------------------------+
+   | 6 | Max-FW-Latency      | Maximum firmware latency                     |
+   +---+---------------------+----------------------------------------------+
+
+Latency values reported by the PMD through xstats are in units of either
+cycles or nanoseconds. The unit of the latency is determined during DPDK
+initialization and depends on the availability of SCLK. Latencies are
+reported in nanoseconds when the SCLK is available and in cycles otherwise.
+The application needs to initialize at least one RVU for the clock to be available.
+
+xstats names are dynamically generated by the PMD and would have the format
+"Model--Type-".
+
+For example::
+   Model-1-Avg-FW-Latency-ns
+
+The above xstat name would report average firmware latency in nanoseconds for
+model with model ID 1.
+
+The number of xstats made available by the PMD changes dynamically. The number
+increases when a model is loaded and decreases when a model is unloaded.
+The application needs to update the xstats map after a model is either loaded or
+unloaded.
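
Note on the xstats naming in cnxk.rst above: the following Python sketch is not part of the patch or the PMD. It only illustrates how an application-side tool might split a name such as Model-1-Avg-FW-Latency-ns into its parts and compute the "6 x number of models loaded" total. The pattern is inferred solely from that one example, so the numeric model ID, the Avg/Min/Max and HW/FW type strings and the alphabetic unit suffix are assumptions. `ModelXstat`, `parse_xstat_name` and `expected_xstats_count` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred only from the guide's example
# "Model-1-Avg-FW-Latency-ns": model ID, stat type, unit suffix.
XSTAT_NAME = re.compile(
    r"^Model-(?P<model_id>\d+)-"
    r"(?P<stat>(?:Avg|Min|Max)-(?:HW|FW)-Latency)-"
    r"(?P<unit>[A-Za-z]+)$"
)

# The guide lists 6 xstats types per model (Avg/Min/Max for HW and FW latency).
STATS_PER_MODEL = 6


class ModelXstat(NamedTuple):
    model_id: int
    stat: str
    unit: str


def parse_xstat_name(name: str) -> Optional[ModelXstat]:
    """Split an xstat name into its parts; return None if it does not match."""
    match = XSTAT_NAME.match(name)
    if match is None:
        return None
    return ModelXstat(int(match["model_id"]), match["stat"], match["unit"])


def expected_xstats_count(models_loaded: int) -> int:
    """Total xstats the guide describes for a given number of loaded models."""
    return STATS_PER_MODEL * models_loaded
```

Because the set of xstats grows and shrinks as models are loaded and unloaded, a tool built along these lines would re-read the names after every load or unload instead of caching them.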
diff --git a/doc/guides/mldevs/index.rst b/doc/guides/mldevs/index.rst
new file mode 100644
index 0000000000..f201e54175
--- /dev/null
+++ b/doc/guides/mldevs/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright (c) 2022 Marvell.
+
+Machine Learning Device Driver
+==============================
+
+The following is a list of ML device PMDs, which can be used from an
+application through the ML device API.
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   cnxk
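
Note on the "Runtime Config Options" section of cnxk.rst in this patch: the guide shows each devargs option on its own. The Python sketch below is not part of the patch. It shows one way a launch script might validate the six documented options and join them into a single `-a` argument, formatted as in the guide's examples for use on a shell command line. The helper name `ml_device_arg` is hypothetical. Joining several key=value pairs with commas follows the usual DPDK devargs convention and is not shown in the guide itself. Quoting `fw_path` and `poll_mem` simply mirrors the guide's examples.

```python
# Options and accepted values as listed under "Runtime Config Options".
# None means any string is accepted (a file path).
ALLOWED = {
    "fw_path": None,
    "enable_dpe_warnings": {"0", "1"},
    "cache_model_data": {"0", "1"},
    "ocm_alloc_mode": {"lowest", "largest"},
    "hw_queue_lock": {"0", "1"},
    "poll_mem": {"ddr", "register"},
}

# These two values appear in double quotes in the guide's own examples.
QUOTED = {"fw_path", "poll_mem"}


def ml_device_arg(pci_addr: str, **options) -> str:
    """Build an '-a <pci>,key=value,...' string from documented options."""
    parts = [pci_addr]
    for key, value in options.items():
        if key not in ALLOWED:
            raise ValueError(f"unknown devargs option: {key}")
        text = str(value)
        allowed = ALLOWED[key]
        if allowed is not None and text not in allowed:
            raise ValueError(f"{key}={text} is not one of {sorted(allowed)}")
        parts.append(f'{key}="{text}"' if key in QUOTED else f"{key}={text}")
    return "-a " + ",".join(parts)
```

For instance, `ml_device_arg("0000:00:10.0", hw_queue_lock=1)` yields the same string as the guide's spinlock example. Rejecting unknown keys and values in the script is a design choice meant to catch typos before the application starts.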