From patchwork Wed Oct 18 06:47:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi
X-Patchwork-Id: 132843
X-Patchwork-Delegate: jerinj@marvell.com
From: Srikanth Yalavarthi
To: Srikanth Yalavarthi
Subject: [PATCH v5 02/34] ml/cnxk: add generic cnxk device structure
Date: Tue, 17 Oct 2023 23:47:30 -0700
Message-ID: <20231018064806.24145-3-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231018064806.24145-1-syalavarthi@marvell.com>
References: <20230830155927.3566-1-syalavarthi@marvell.com>
 <20231018064806.24145-1-syalavarthi@marvell.com>
List-Id: DPDK patches and discussions

Introduce generic cnxk device structure. This structure is a top level
device structure for the driver, which would encapsulate the target /
platform specific device structure.
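The layering described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the driver's actual definitions: the member names `mldev`, `state` and `cn10k_mldev` and the value `ML_CNXK_DEV_STATE_PROBED` follow the diff below, while the `_sketch` / `_stub` type names, the placeholder enum value and the helper `cnxk_ml_dev_attach()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_ml_dev; the real type lives in DPDK. */
struct ml_dev_stub {
	void *dev_private; /* generic private data area */
};

/* Hypothetical, heavily reduced stand-in for the CN10K-specific device. */
struct cn10k_ml_dev_sketch {
	int hw_queue_lock; /* one of the devargs handled in the diff */
};

/* Only ML_CNXK_DEV_STATE_PROBED is taken from the diff. */
enum cnxk_ml_dev_state_sketch {
	ML_CNXK_DEV_STATE_PLACEHOLDER = 0, /* hypothetical filler value */
	ML_CNXK_DEV_STATE_PROBED,
};

/* Generic top-level device wrapping the platform-specific one. */
struct cnxk_ml_dev_sketch {
	struct ml_dev_stub *mldev;              /* back pointer to the framework device */
	enum cnxk_ml_dev_state_sketch state;    /* state is kept at the generic level */
	struct cn10k_ml_dev_sketch cn10k_mldev; /* embedded platform-specific device */
};

/* Hypothetical helper mirroring the pointer setup done in the probe path. */
static struct cn10k_ml_dev_sketch *
cnxk_ml_dev_attach(struct ml_dev_stub *dev, struct cnxk_ml_dev_sketch *cnxk_mldev)
{
	assert(dev != NULL && cnxk_mldev != NULL);

	dev->dev_private = cnxk_mldev;
	cnxk_mldev->mldev = dev;
	cnxk_mldev->state = ML_CNXK_DEV_STATE_PROBED;

	/* Platform code keeps operating on the embedded structure. */
	return &cnxk_mldev->cn10k_mldev;
}
```

With this shape, the private data size is that of the generic structure, and platform-specific code reaches its own data through `&cnxk_mldev->cn10k_mldev`, which is the pattern the diff applies in the probe, remove and firmware load paths.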
Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_dev.c   | 316 ++++++++++----------
 drivers/ml/cnxk/cn10k_ml_dev.h   |  47 +--
 drivers/ml/cnxk/cn10k_ml_model.c |  15 +-
 drivers/ml/cnxk/cn10k_ml_model.h |   8 +-
 drivers/ml/cnxk/cn10k_ml_ocm.c   |  60 ++--
 drivers/ml/cnxk/cn10k_ml_ops.c   | 495 +++++++++++++++++--------------
 drivers/ml/cnxk/cnxk_ml_dev.c    |  11 +
 drivers/ml/cnxk/cnxk_ml_dev.h    |  58 ++++
 drivers/ml/cnxk/meson.build      |   2 +
 9 files changed, 563 insertions(+), 449 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.c
 create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.h

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index e3c2badcef..3bc61443d8 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -10,13 +10,14 @@
 #include
 #include
 
-#include
-
 #include
 
-#include "cn10k_ml_dev.h"
+#include
+
 #include "cn10k_ml_ops.h"
 
+#include "cnxk_ml_dev.h"
+
 #define CN10K_ML_FW_PATH "fw_path"
 #define CN10K_ML_FW_ENABLE_DPE_WARNINGS "enable_dpe_warnings"
 #define CN10K_ML_FW_REPORT_DPE_WARNINGS "report_dpe_warnings"
@@ -58,9 +59,6 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH,
 /* Supported OCM page sizes: 1KB, 2KB, 4KB, 8KB and 16KB */
 static const int valid_ocm_page_size[] = {1024, 2048, 4096, 8192, 16384};
 
-/* Dummy operations for ML device */
-struct rte_ml_dev_ops ml_dev_dummy_ops = {0};
-
 static int
 parse_string_arg(const char *key __rte_unused, const char *value, void *extra_args)
 {
@@ -90,7 +88,7 @@ parse_integer_arg(const char *key __rte_unused, const char *value, void *extra_a
 }
 
 static int
-cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mldev)
+cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *cn10k_mldev)
 {
 	bool enable_dpe_warnings_set = false;
 	bool report_dpe_warnings_set = false;
@@ -127,7 +125,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist,
CN10K_ML_FW_ENABLE_DPE_WARNINGS) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_FW_ENABLE_DPE_WARNINGS, - &parse_integer_arg, &mldev->fw.enable_dpe_warnings); + &parse_integer_arg, &cn10k_mldev->fw.enable_dpe_warnings); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_FW_ENABLE_DPE_WARNINGS); @@ -139,7 +137,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS, - &parse_integer_arg, &mldev->fw.report_dpe_warnings); + &parse_integer_arg, &cn10k_mldev->fw.report_dpe_warnings); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_FW_REPORT_DPE_WARNINGS); @@ -151,7 +149,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA, &parse_integer_arg, - &mldev->cache_model_data); + &cn10k_mldev->cache_model_data); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_DEV_CACHE_MODEL_DATA); @@ -174,7 +172,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK, &parse_integer_arg, - &mldev->hw_queue_lock); + &cn10k_mldev->hw_queue_lock); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_DEV_HW_QUEUE_LOCK); @@ -186,7 +184,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_OCM_PAGE_SIZE) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_OCM_PAGE_SIZE, &parse_integer_arg, - &mldev->ocm_page_size); + &cn10k_mldev->ocm_page_size); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_OCM_PAGE_SIZE); ret = -EINVAL; @@ -197,49 
+195,53 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde check_args: if (!fw_path_set) - mldev->fw.path = CN10K_ML_FW_PATH_DEFAULT; + cn10k_mldev->fw.path = CN10K_ML_FW_PATH_DEFAULT; else - mldev->fw.path = fw_path; - plt_info("ML: %s = %s", CN10K_ML_FW_PATH, mldev->fw.path); + cn10k_mldev->fw.path = fw_path; + plt_info("ML: %s = %s", CN10K_ML_FW_PATH, cn10k_mldev->fw.path); if (!enable_dpe_warnings_set) { - mldev->fw.enable_dpe_warnings = CN10K_ML_FW_ENABLE_DPE_WARNINGS_DEFAULT; + cn10k_mldev->fw.enable_dpe_warnings = CN10K_ML_FW_ENABLE_DPE_WARNINGS_DEFAULT; } else { - if ((mldev->fw.enable_dpe_warnings < 0) || (mldev->fw.enable_dpe_warnings > 1)) { + if ((cn10k_mldev->fw.enable_dpe_warnings < 0) || + (cn10k_mldev->fw.enable_dpe_warnings > 1)) { plt_err("Invalid argument, %s = %d\n", CN10K_ML_FW_ENABLE_DPE_WARNINGS, - mldev->fw.enable_dpe_warnings); + cn10k_mldev->fw.enable_dpe_warnings); ret = -EINVAL; goto exit; } } - plt_info("ML: %s = %d", CN10K_ML_FW_ENABLE_DPE_WARNINGS, mldev->fw.enable_dpe_warnings); + plt_info("ML: %s = %d", CN10K_ML_FW_ENABLE_DPE_WARNINGS, + cn10k_mldev->fw.enable_dpe_warnings); if (!report_dpe_warnings_set) { - mldev->fw.report_dpe_warnings = CN10K_ML_FW_REPORT_DPE_WARNINGS_DEFAULT; + cn10k_mldev->fw.report_dpe_warnings = CN10K_ML_FW_REPORT_DPE_WARNINGS_DEFAULT; } else { - if ((mldev->fw.report_dpe_warnings < 0) || (mldev->fw.report_dpe_warnings > 1)) { + if ((cn10k_mldev->fw.report_dpe_warnings < 0) || + (cn10k_mldev->fw.report_dpe_warnings > 1)) { plt_err("Invalid argument, %s = %d\n", CN10K_ML_FW_REPORT_DPE_WARNINGS, - mldev->fw.report_dpe_warnings); + cn10k_mldev->fw.report_dpe_warnings); ret = -EINVAL; goto exit; } } - plt_info("ML: %s = %d", CN10K_ML_FW_REPORT_DPE_WARNINGS, mldev->fw.report_dpe_warnings); + plt_info("ML: %s = %d", CN10K_ML_FW_REPORT_DPE_WARNINGS, + cn10k_mldev->fw.report_dpe_warnings); if (!cache_model_data_set) { - mldev->cache_model_data = CN10K_ML_DEV_CACHE_MODEL_DATA_DEFAULT; 
+ cn10k_mldev->cache_model_data = CN10K_ML_DEV_CACHE_MODEL_DATA_DEFAULT; } else { - if ((mldev->cache_model_data < 0) || (mldev->cache_model_data > 1)) { + if ((cn10k_mldev->cache_model_data < 0) || (cn10k_mldev->cache_model_data > 1)) { plt_err("Invalid argument, %s = %d\n", CN10K_ML_DEV_CACHE_MODEL_DATA, - mldev->cache_model_data); + cn10k_mldev->cache_model_data); ret = -EINVAL; goto exit; } } - plt_info("ML: %s = %d", CN10K_ML_DEV_CACHE_MODEL_DATA, mldev->cache_model_data); + plt_info("ML: %s = %d", CN10K_ML_DEV_CACHE_MODEL_DATA, cn10k_mldev->cache_model_data); if (!ocm_alloc_mode_set) { - mldev->ocm.alloc_mode = CN10K_ML_OCM_ALLOC_MODE_DEFAULT; + cn10k_mldev->ocm.alloc_mode = CN10K_ML_OCM_ALLOC_MODE_DEFAULT; } else { if (!((strcmp(ocm_alloc_mode, "lowest") == 0) || (strcmp(ocm_alloc_mode, "largest") == 0))) { @@ -248,47 +250,47 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde ret = -EINVAL; goto exit; } - mldev->ocm.alloc_mode = ocm_alloc_mode; + cn10k_mldev->ocm.alloc_mode = ocm_alloc_mode; } - plt_info("ML: %s = %s", CN10K_ML_OCM_ALLOC_MODE, mldev->ocm.alloc_mode); + plt_info("ML: %s = %s", CN10K_ML_OCM_ALLOC_MODE, cn10k_mldev->ocm.alloc_mode); if (!hw_queue_lock_set) { - mldev->hw_queue_lock = CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT; + cn10k_mldev->hw_queue_lock = CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT; } else { - if ((mldev->hw_queue_lock < 0) || (mldev->hw_queue_lock > 1)) { + if ((cn10k_mldev->hw_queue_lock < 0) || (cn10k_mldev->hw_queue_lock > 1)) { plt_err("Invalid argument, %s = %d\n", CN10K_ML_DEV_HW_QUEUE_LOCK, - mldev->hw_queue_lock); + cn10k_mldev->hw_queue_lock); ret = -EINVAL; goto exit; } } - plt_info("ML: %s = %d", CN10K_ML_DEV_HW_QUEUE_LOCK, mldev->hw_queue_lock); + plt_info("ML: %s = %d", CN10K_ML_DEV_HW_QUEUE_LOCK, cn10k_mldev->hw_queue_lock); if (!ocm_page_size_set) { - mldev->ocm_page_size = CN10K_ML_OCM_PAGE_SIZE_DEFAULT; + cn10k_mldev->ocm_page_size = CN10K_ML_OCM_PAGE_SIZE_DEFAULT; } else { - if 
(mldev->ocm_page_size < 0) { + if (cn10k_mldev->ocm_page_size < 0) { plt_err("Invalid argument, %s = %d\n", CN10K_ML_OCM_PAGE_SIZE, - mldev->ocm_page_size); + cn10k_mldev->ocm_page_size); ret = -EINVAL; goto exit; } found = false; for (i = 0; i < PLT_DIM(valid_ocm_page_size); i++) { - if (mldev->ocm_page_size == valid_ocm_page_size[i]) { + if (cn10k_mldev->ocm_page_size == valid_ocm_page_size[i]) { found = true; break; } } if (!found) { - plt_err("Unsupported ocm_page_size = %d\n", mldev->ocm_page_size); + plt_err("Unsupported ocm_page_size = %d\n", cn10k_mldev->ocm_page_size); ret = -EINVAL; goto exit; } } - plt_info("ML: %s = %d", CN10K_ML_OCM_PAGE_SIZE, mldev->ocm_page_size); + plt_info("ML: %s = %d", CN10K_ML_OCM_PAGE_SIZE, cn10k_mldev->ocm_page_size); exit: rte_kvargs_free(kvlist); @@ -300,7 +302,8 @@ static int cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev) { struct rte_ml_dev_pmd_init_params init_params; - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; char name[RTE_ML_STR_MAX]; struct rte_ml_dev *dev; int ret; @@ -308,7 +311,7 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de PLT_SET_USED(pci_drv); init_params = (struct rte_ml_dev_pmd_init_params){ - .socket_id = rte_socket_id(), .private_data_size = sizeof(struct cn10k_ml_dev)}; + .socket_id = rte_socket_id(), .private_data_size = sizeof(struct cnxk_ml_dev)}; ret = roc_plt_init(); if (ret < 0) { @@ -324,18 +327,20 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de } /* Get private data space allocated */ - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cnxk_mldev->mldev = dev; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; if (rte_eal_process_type() == RTE_PROC_PRIMARY) { - mldev->roc.pci_dev = pci_dev; + cn10k_mldev->roc.pci_dev = pci_dev; - ret = cn10k_mldev_parse_devargs(dev->device->devargs, mldev); + ret = 
cn10k_mldev_parse_devargs(dev->device->devargs, cn10k_mldev); if (ret) { plt_err("Failed to parse devargs ret = %d", ret); goto pmd_destroy; } - ret = roc_ml_dev_init(&mldev->roc); + ret = roc_ml_dev_init(&cn10k_mldev->roc); if (ret) { plt_err("Failed to initialize ML ROC, ret = %d", ret); goto pmd_destroy; @@ -351,7 +356,7 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de dev->dequeue_burst = NULL; dev->op_error_get = NULL; - mldev->state = ML_CN10K_DEV_STATE_PROBED; + cnxk_mldev->state = ML_CNXK_DEV_STATE_PROBED; return 0; @@ -368,7 +373,7 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de static int cn10k_ml_pci_remove(struct rte_pci_device *pci_dev) { - struct cn10k_ml_dev *mldev; + struct cnxk_ml_dev *cnxk_mldev; char name[RTE_ML_STR_MAX]; struct rte_ml_dev *dev; int ret; @@ -383,8 +388,8 @@ cn10k_ml_pci_remove(struct rte_pci_device *pci_dev) return -ENODEV; if (rte_eal_process_type() == RTE_PROC_PRIMARY) { - mldev = dev->data->dev_private; - ret = roc_ml_dev_fini(&mldev->roc); + cnxk_mldev = dev->data->dev_private; + ret = roc_ml_dev_fini(&cnxk_mldev->cn10k_mldev.roc); if (ret) return ret; } @@ -430,45 +435,45 @@ cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw) static int cn10k_ml_fw_load_asim(struct cn10k_ml_fw *fw) { - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; uint64_t timeout_cycle; uint64_t reg_val64; bool timeout; int ret = 0; - mldev = fw->mldev; + cn10k_mldev = fw->cn10k_mldev; /* Reset HEAD and TAIL debug pointer registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 
0, ML_SCRATCH_DBG_BUFFER_HEAD_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C1); /* Set ML_MLR_BASE to base IOVA of the ML region in LLC/DRAM. */ reg_val64 = rte_eal_get_baseaddr(); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_MLR_BASE); - plt_ml_dbg("ML_MLR_BASE = 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_MLR_BASE)); - roc_ml_reg_save(&mldev->roc, ML_MLR_BASE); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_MLR_BASE); + plt_ml_dbg("ML_MLR_BASE = 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_MLR_BASE)); + roc_ml_reg_save(&cn10k_mldev->roc, ML_MLR_BASE); /* Update FW load completion structure */ fw->req->jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->status); fw->req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD; - fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&mldev->roc, &fw->req->result); + fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->result); fw->req->jd.fw_load.flags = cn10k_ml_fw_flags_get(fw); - plt_write64(ML_CN10K_POLL_JOB_START, &fw->req->status); + plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->status); plt_wmb(); /* Enqueue FW load through scratch registers */ timeout = true; - timeout_cycle = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz(); - roc_ml_scratch_enqueue(&mldev->roc, &fw->req->jd); + timeout_cycle = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz(); + roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->jd); plt_rmb(); do { - if (roc_ml_scratch_is_done_bit_set(&mldev->roc) && - (plt_read64(&fw->req->status) == ML_CN10K_POLL_JOB_FINISH)) { + if (roc_ml_scratch_is_done_bit_set(&cn10k_mldev->roc) && + (plt_read64(&fw->req->status) == ML_CNXK_POLL_JOB_FINISH)) { timeout = 
false; break; } @@ -480,11 +485,11 @@ cn10k_ml_fw_load_asim(struct cn10k_ml_fw *fw) } else { /* Set ML to disable new jobs */ reg_val64 = (ROC_ML_CFG_JD_SIZE | ROC_ML_CFG_MLIP_ENA); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); /* Clear scratch registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_WORK_PTR); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_FW_CTRL); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_WORK_PTR); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_FW_CTRL); if (timeout) { plt_err("Firmware load timeout"); @@ -498,14 +503,14 @@ cn10k_ml_fw_load_asim(struct cn10k_ml_fw *fw) } /* Reset scratch registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_FW_CTRL); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_WORK_PTR); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_FW_CTRL); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_WORK_PTR); /* Disable job execution, to be enabled in start */ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 &= ~ROC_ML_CFG_ENA; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); return ret; } @@ -515,7 +520,7 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) { union ml_a35_0_rst_vector_base_s a35_0_rst_vector_base; union ml_a35_0_rst_vector_base_s a35_1_rst_vector_base; - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; uint64_t timeout_cycle; uint64_t reg_val64; uint32_t reg_val32; @@ -524,24 +529,24 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) int ret = 0; uint8_t i; - mldev = fw->mldev; + cn10k_mldev = fw->cn10k_mldev; /* Reset HEAD and TAIL debug 
pointer registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C0); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C0); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_EXCEPTION_SP_C1); /* (1) Write firmware images for ACC's two A35 cores to the ML region in LLC / DRAM. */ rte_memcpy(PLT_PTR_ADD(fw->data, FW_LINKER_OFFSET), buffer, size); /* (2) Set ML(0)_MLR_BASE = Base IOVA of the ML region in LLC/DRAM. */ reg_val64 = PLT_PTR_SUB_U64_CAST(fw->data, rte_eal_get_baseaddr()); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_MLR_BASE); - plt_ml_dbg("ML_MLR_BASE => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_MLR_BASE)); - roc_ml_reg_save(&mldev->roc, ML_MLR_BASE); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_MLR_BASE); + plt_ml_dbg("ML_MLR_BASE => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_MLR_BASE)); + roc_ml_reg_save(&cn10k_mldev->roc, ML_MLR_BASE); /* (3) Set ML(0)_AXI_BRIDGE_CTRL(1) = 0x184003 to remove back-pressure check on DMA AXI * bridge. 
@@ -549,9 +554,9 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) reg_val64 = (ROC_ML_AXI_BRIDGE_CTRL_AXI_RESP_CTRL | ROC_ML_AXI_BRIDGE_CTRL_BRIDGE_CTRL_MODE | ROC_ML_AXI_BRIDGE_CTRL_NCB_WR_BLK | ROC_ML_AXI_BRIDGE_CTRL_FORCE_WRESP_OK | ROC_ML_AXI_BRIDGE_CTRL_FORCE_RRESP_OK); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_AXI_BRIDGE_CTRL(1)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_AXI_BRIDGE_CTRL(1)); plt_ml_dbg("ML_AXI_BRIDGE_CTRL(1) => 0x%016lx", - roc_ml_reg_read64(&mldev->roc, ML_AXI_BRIDGE_CTRL(1))); + roc_ml_reg_read64(&cn10k_mldev->roc, ML_AXI_BRIDGE_CTRL(1))); /* (4) Set ML(0)_ANB(0..2)_BACKP_DISABLE = 0x3 to remove back-pressure on the AXI to NCB * bridges. @@ -559,9 +564,9 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) for (i = 0; i < ML_ANBX_NR; i++) { reg_val64 = (ROC_ML_ANBX_BACKP_DISABLE_EXTMSTR_B_BACKP_DISABLE | ROC_ML_ANBX_BACKP_DISABLE_EXTMSTR_R_BACKP_DISABLE); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_ANBX_BACKP_DISABLE(i)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_ANBX_BACKP_DISABLE(i)); plt_ml_dbg("ML_ANBX_BACKP_DISABLE(%u) => 0x%016lx", i, - roc_ml_reg_read64(&mldev->roc, ML_ANBX_BACKP_DISABLE(i))); + roc_ml_reg_read64(&cn10k_mldev->roc, ML_ANBX_BACKP_DISABLE(i))); } /* (5) Set ML(0)_ANB(0..2)_NCBI_P_OVR = 0x3000 and ML(0)_ANB(0..2)_NCBI_NP_OVR = 0x3000 to @@ -570,39 +575,40 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) for (i = 0; i < ML_ANBX_NR; i++) { reg_val64 = (ML_ANBX_NCBI_P_OVR_ANB_NCBI_P_NS_OVR | ML_ANBX_NCBI_P_OVR_ANB_NCBI_P_NS_OVR_VLD); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_ANBX_NCBI_P_OVR(i)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_ANBX_NCBI_P_OVR(i)); plt_ml_dbg("ML_ANBX_NCBI_P_OVR(%u) => 0x%016lx", i, - roc_ml_reg_read64(&mldev->roc, ML_ANBX_NCBI_P_OVR(i))); + roc_ml_reg_read64(&cn10k_mldev->roc, ML_ANBX_NCBI_P_OVR(i))); reg_val64 |= (ML_ANBX_NCBI_NP_OVR_ANB_NCBI_NP_NS_OVR | 
ML_ANBX_NCBI_NP_OVR_ANB_NCBI_NP_NS_OVR_VLD); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_ANBX_NCBI_NP_OVR(i)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_ANBX_NCBI_NP_OVR(i)); plt_ml_dbg("ML_ANBX_NCBI_NP_OVR(%u) => 0x%016lx", i, - roc_ml_reg_read64(&mldev->roc, ML_ANBX_NCBI_NP_OVR(i))); + roc_ml_reg_read64(&cn10k_mldev->roc, ML_ANBX_NCBI_NP_OVR(i))); } /* (6) Set ML(0)_CFG[MLIP_CLK_FORCE] = 1, to force turning on the MLIP clock. */ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 |= ROC_ML_CFG_MLIP_CLK_FORCE; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); /* (7) Set ML(0)_JOB_MGR_CTRL[STALL_ON_IDLE] = 0, to make sure the boot request is accepted * when there is no job in the command queue. */ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_JOB_MGR_CTRL); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_JOB_MGR_CTRL); reg_val64 &= ~ROC_ML_JOB_MGR_CTRL_STALL_ON_IDLE; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_JOB_MGR_CTRL); - plt_ml_dbg("ML_JOB_MGR_CTRL => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_JOB_MGR_CTRL)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_JOB_MGR_CTRL); + plt_ml_dbg("ML_JOB_MGR_CTRL => 0x%016lx", + roc_ml_reg_read64(&cn10k_mldev->roc, ML_JOB_MGR_CTRL)); /* (8) Set ML(0)_CFG[ENA] = 0 and ML(0)_CFG[MLIP_ENA] = 1 to bring MLIP out of reset while * keeping the job manager disabled. 
*/ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 |= ROC_ML_CFG_MLIP_ENA; reg_val64 &= ~ROC_ML_CFG_ENA; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); /* (9) Wait at least 70 coprocessor clock cycles. */ plt_delay_us(FW_WAIT_CYCLES); @@ -613,53 +619,57 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) * AXI outbound address divided by 4. Read after write. */ offset = PLT_PTR_ADD_U64_CAST( - fw->data, FW_LINKER_OFFSET - roc_ml_reg_read64(&mldev->roc, ML_MLR_BASE)); + fw->data, FW_LINKER_OFFSET - roc_ml_reg_read64(&cn10k_mldev->roc, ML_MLR_BASE)); a35_0_rst_vector_base.s.addr = (offset + ML_AXI_START_ADDR) / 4; a35_1_rst_vector_base.s.addr = (offset + ML_AXI_START_ADDR) / 4; - roc_ml_reg_write32(&mldev->roc, a35_0_rst_vector_base.w.w0, ML_A35_0_RST_VECTOR_BASE_W(0)); - reg_val32 = roc_ml_reg_read32(&mldev->roc, ML_A35_0_RST_VECTOR_BASE_W(0)); + roc_ml_reg_write32(&cn10k_mldev->roc, a35_0_rst_vector_base.w.w0, + ML_A35_0_RST_VECTOR_BASE_W(0)); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_A35_0_RST_VECTOR_BASE_W(0)); plt_ml_dbg("ML_A35_0_RST_VECTOR_BASE_W(0) => 0x%08x", reg_val32); - roc_ml_reg_write32(&mldev->roc, a35_0_rst_vector_base.w.w1, ML_A35_0_RST_VECTOR_BASE_W(1)); - reg_val32 = roc_ml_reg_read32(&mldev->roc, ML_A35_0_RST_VECTOR_BASE_W(1)); + roc_ml_reg_write32(&cn10k_mldev->roc, a35_0_rst_vector_base.w.w1, + ML_A35_0_RST_VECTOR_BASE_W(1)); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_A35_0_RST_VECTOR_BASE_W(1)); plt_ml_dbg("ML_A35_0_RST_VECTOR_BASE_W(1) => 0x%08x", reg_val32); - roc_ml_reg_write32(&mldev->roc, a35_1_rst_vector_base.w.w0, ML_A35_1_RST_VECTOR_BASE_W(0)); - reg_val32 = roc_ml_reg_read32(&mldev->roc, 
ML_A35_1_RST_VECTOR_BASE_W(0)); + roc_ml_reg_write32(&cn10k_mldev->roc, a35_1_rst_vector_base.w.w0, + ML_A35_1_RST_VECTOR_BASE_W(0)); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_A35_1_RST_VECTOR_BASE_W(0)); plt_ml_dbg("ML_A35_1_RST_VECTOR_BASE_W(0) => 0x%08x", reg_val32); - roc_ml_reg_write32(&mldev->roc, a35_1_rst_vector_base.w.w1, ML_A35_1_RST_VECTOR_BASE_W(1)); - reg_val32 = roc_ml_reg_read32(&mldev->roc, ML_A35_1_RST_VECTOR_BASE_W(1)); + roc_ml_reg_write32(&cn10k_mldev->roc, a35_1_rst_vector_base.w.w1, + ML_A35_1_RST_VECTOR_BASE_W(1)); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_A35_1_RST_VECTOR_BASE_W(1)); plt_ml_dbg("ML_A35_1_RST_VECTOR_BASE_W(1) => 0x%08x", reg_val32); /* (11) Clear MLIP's ML(0)_SW_RST_CTRL[ACC_RST]. This will bring the ACC cores and other * MLIP components out of reset. The cores will execute firmware from the ML region as * written in step 1. */ - reg_val32 = roc_ml_reg_read32(&mldev->roc, ML_SW_RST_CTRL); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_SW_RST_CTRL); reg_val32 &= ~ROC_ML_SW_RST_CTRL_ACC_RST; - roc_ml_reg_write32(&mldev->roc, reg_val32, ML_SW_RST_CTRL); - reg_val32 = roc_ml_reg_read32(&mldev->roc, ML_SW_RST_CTRL); + roc_ml_reg_write32(&cn10k_mldev->roc, reg_val32, ML_SW_RST_CTRL); + reg_val32 = roc_ml_reg_read32(&cn10k_mldev->roc, ML_SW_RST_CTRL); plt_ml_dbg("ML_SW_RST_CTRL => 0x%08x", reg_val32); /* (12) Wait for notification from firmware that ML is ready for job execution. 
*/ fw->req->jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->status); fw->req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD; - fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&mldev->roc, &fw->req->result); + fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->result); fw->req->jd.fw_load.flags = cn10k_ml_fw_flags_get(fw); - plt_write64(ML_CN10K_POLL_JOB_START, &fw->req->status); + plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->status); plt_wmb(); /* Enqueue FW load through scratch registers */ timeout = true; - timeout_cycle = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz(); - roc_ml_scratch_enqueue(&mldev->roc, &fw->req->jd); + timeout_cycle = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz(); + roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->jd); plt_rmb(); do { - if (roc_ml_scratch_is_done_bit_set(&mldev->roc) && - (plt_read64(&fw->req->status) == ML_CN10K_POLL_JOB_FINISH)) { + if (roc_ml_scratch_is_done_bit_set(&cn10k_mldev->roc) && + (plt_read64(&fw->req->status) == ML_CNXK_POLL_JOB_FINISH)) { timeout = false; break; } @@ -671,11 +681,11 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) } else { /* Set ML to disable new jobs */ reg_val64 = (ROC_ML_CFG_JD_SIZE | ROC_ML_CFG_MLIP_ENA); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); /* Clear scratch registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_WORK_PTR); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_FW_CTRL); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_WORK_PTR); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_FW_CTRL); if (timeout) { plt_err("Firmware load timeout"); @@ -691,49 +701,51 @@ cn10k_ml_fw_load_cn10ka(struct cn10k_ml_fw *fw, void *buffer, uint64_t size) /* (13) Set ML(0)_JOB_MGR_CTRL[STALL_ON_IDLE] = 0x1; this is needed to shut down the MLIP * clock when there are no more jobs to process. 
*/ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_JOB_MGR_CTRL); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_JOB_MGR_CTRL); reg_val64 |= ROC_ML_JOB_MGR_CTRL_STALL_ON_IDLE; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_JOB_MGR_CTRL); - plt_ml_dbg("ML_JOB_MGR_CTRL => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_JOB_MGR_CTRL)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_JOB_MGR_CTRL); + plt_ml_dbg("ML_JOB_MGR_CTRL => 0x%016lx", + roc_ml_reg_read64(&cn10k_mldev->roc, ML_JOB_MGR_CTRL)); /* (14) Set ML(0)_CFG[MLIP_CLK_FORCE] = 0; the MLIP clock will be turned on/off based on job * activities. */ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 &= ~ROC_ML_CFG_MLIP_CLK_FORCE; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); /* (15) Set ML(0)_CFG[ENA] to enable ML job execution. 
*/ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 |= ROC_ML_CFG_ENA; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); /* Reset scratch registers */ - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_FW_CTRL); - roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_WORK_PTR); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_FW_CTRL); + roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_WORK_PTR); /* Disable job execution, to be enabled in start */ - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val64 &= ~ROC_ML_CFG_ENA; - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG); - plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG); + plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG)); /* Additional fixes: Set RO bit to fix O2D DMA bandwidth issue on cn10ka */ for (i = 0; i < ML_ANBX_NR; i++) { - reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_ANBX_NCBI_P_OVR(i)); + reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_ANBX_NCBI_P_OVR(i)); reg_val64 |= (ML_ANBX_NCBI_P_OVR_ANB_NCBI_P_RO_OVR | ML_ANBX_NCBI_P_OVR_ANB_NCBI_P_RO_OVR_VLD); - roc_ml_reg_write64(&mldev->roc, reg_val64, ML_ANBX_NCBI_P_OVR(i)); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_ANBX_NCBI_P_OVR(i)); } return ret; } int -cn10k_ml_fw_load(struct cn10k_ml_dev *mldev) +cn10k_ml_fw_load(struct cnxk_ml_dev *cnxk_mldev) { + struct cn10k_ml_dev *cn10k_mldev; const struct plt_memzone *mz; struct cn10k_ml_fw *fw; void *fw_buffer = NULL; @@ -741,8 +753,9 @@ cn10k_ml_fw_load(struct cn10k_ml_dev *mldev) uint64_t fw_size = 0; int ret = 0; - fw = 
&mldev->fw; - fw->mldev = mldev; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + fw = &cn10k_mldev->fw; + fw->cn10k_mldev = cn10k_mldev; if (roc_env_is_emulator() || roc_env_is_hw()) { /* Read firmware image to a buffer */ @@ -773,8 +786,8 @@ cn10k_ml_fw_load(struct cn10k_ml_dev *mldev) memset(&fw->req->jd.fw_load.version[0], '\0', MLDEV_FIRMWARE_VERSION_LENGTH); /* Reset device, if in active state */ - if (roc_ml_mlip_is_enabled(&mldev->roc)) - roc_ml_mlip_reset(&mldev->roc, true); + if (roc_ml_mlip_is_enabled(&cn10k_mldev->roc)) + roc_ml_mlip_reset(&cn10k_mldev->roc, true); /* Load firmware */ if (roc_env_is_emulator() || roc_env_is_hw()) { @@ -787,22 +800,25 @@ cn10k_ml_fw_load(struct cn10k_ml_dev *mldev) } if (ret < 0) - cn10k_ml_fw_unload(mldev); + cn10k_ml_fw_unload(cnxk_mldev); return ret; } void -cn10k_ml_fw_unload(struct cn10k_ml_dev *mldev) +cn10k_ml_fw_unload(struct cnxk_ml_dev *cnxk_mldev) { + struct cn10k_ml_dev *cn10k_mldev; const struct plt_memzone *mz; uint64_t reg_val; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + /* Disable and reset device */ - reg_val = roc_ml_reg_read64(&mldev->roc, ML_CFG); + reg_val = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG); reg_val &= ~ROC_ML_CFG_MLIP_ENA; - roc_ml_reg_write64(&mldev->roc, reg_val, ML_CFG); - roc_ml_mlip_reset(&mldev->roc, true); + roc_ml_reg_write64(&cn10k_mldev->roc, reg_val, ML_CFG); + roc_ml_mlip_reset(&cn10k_mldev->roc, true); mz = plt_memzone_lookup(FW_MEMZONE_NAME); if (mz != NULL) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index 4aaeecff03..f9da1548c4 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -9,6 +9,9 @@ #include "cn10k_ml_ocm.h" +/* Dummy Device ops */ +extern struct rte_ml_dev_ops ml_dev_dummy_ops; + /* Marvell OCTEON CN10K ML PMD device name */ #define MLDEV_NAME_CN10K_PMD ml_cn10k @@ -36,17 +39,10 @@ /* Maximum number of segments for IO data */ #define ML_CN10K_MAX_SEGMENTS 1 -/* ML command timeout in seconds */ -#define 
ML_CN10K_CMD_TIMEOUT 5 - /* ML slow-path job flags */ #define ML_CN10K_SP_FLAGS_OCM_NONRELOCATABLE BIT(0) #define ML_CN10K_SP_FLAGS_EXTENDED_LOAD_JD BIT(1) -/* Poll mode job state */ -#define ML_CN10K_POLL_JOB_START 0 -#define ML_CN10K_POLL_JOB_FINISH 1 - /* Memory barrier macros */ #if defined(RTE_ARCH_ARM) #define dmb_st ({ asm volatile("dmb st" : : : "memory"); }) @@ -56,6 +52,7 @@ #define dsb_st #endif +struct cnxk_ml_dev; struct cn10k_ml_req; struct cn10k_ml_qp; @@ -68,21 +65,6 @@ enum cn10k_ml_job_type { ML_CN10K_JOB_TYPE_FIRMWARE_SELFTEST, }; -/* Device configuration state enum */ -enum cn10k_ml_dev_state { - /* Probed and not configured */ - ML_CN10K_DEV_STATE_PROBED = 0, - - /* Configured */ - ML_CN10K_DEV_STATE_CONFIGURED, - - /* Started */ - ML_CN10K_DEV_STATE_STARTED, - - /* Closed */ - ML_CN10K_DEV_STATE_CLOSED -}; - /* Error types enumeration */ enum cn10k_ml_error_etype { /* 0x0 */ ML_ETYPE_NO_ERROR = 0, /* No error */ @@ -379,7 +361,7 @@ struct cn10k_ml_jd { /* ML firmware structure */ struct cn10k_ml_fw { /* Device reference */ - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; /* Firmware file path */ const char *path; @@ -485,27 +467,12 @@ struct cn10k_ml_dev { /* Device ROC */ struct roc_ml roc; - /* Configuration state */ - enum cn10k_ml_dev_state state; - /* Firmware */ struct cn10k_ml_fw fw; /* OCM info */ struct cn10k_ml_ocm ocm; - /* Number of models loaded */ - uint16_t nb_models_loaded; - - /* Number of models unloaded */ - uint16_t nb_models_unloaded; - - /* Number of models started */ - uint16_t nb_models_started; - - /* Number of models stopped */ - uint16_t nb_models_stopped; - /* Extended stats data */ struct cn10k_ml_xstats xstats; @@ -528,7 +495,7 @@ struct cn10k_ml_dev { }; uint64_t cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw); -int cn10k_ml_fw_load(struct cn10k_ml_dev *mldev); -void cn10k_ml_fw_unload(struct cn10k_ml_dev *mldev); +int cn10k_ml_fw_load(struct cnxk_ml_dev *cnxk_mldev); +void 
cn10k_ml_fw_unload(struct cnxk_ml_dev *cnxk_mldev); #endif /* _CN10K_ML_DEV_H_ */ diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c index e0b750cd8e..cc46ca2efd 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.c +++ b/drivers/ml/cnxk/cn10k_ml_model.c @@ -6,10 +6,11 @@ #include -#include "cn10k_ml_dev.h" #include "cn10k_ml_model.h" #include "cn10k_ml_ocm.h" +#include "cnxk_ml_dev.h" + static enum rte_ml_io_type cn10k_ml_io_type_map(uint8_t type) { @@ -461,7 +462,7 @@ cn10k_ml_model_addr_update(struct cn10k_ml_model *model, uint8_t *buffer, uint8_ } int -cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *mldev, uint16_t model_id, uint8_t *buffer, +cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *cn10k_mldev, uint16_t model_id, uint8_t *buffer, uint16_t *wb_pages, uint16_t *scratch_pages) { struct cn10k_ml_model_metadata *metadata; @@ -470,7 +471,7 @@ cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *mldev, uint16_t model_id, ui uint64_t wb_size; metadata = (struct cn10k_ml_model_metadata *)buffer; - ocm = &mldev->ocm; + ocm = &cn10k_mldev->ocm; /* Assume wb_size is zero for non-relocatable models */ if (metadata->model.ocm_relocatable) @@ -494,11 +495,11 @@ cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *mldev, uint16_t model_id, ui scratch_size, *scratch_pages); /* Check if the model can be loaded on OCM */ - if ((*wb_pages + *scratch_pages) > mldev->ocm.num_pages) { + if ((*wb_pages + *scratch_pages) > cn10k_mldev->ocm.num_pages) { plt_err("Cannot create the model, OCM relocatable = %u", metadata->model.ocm_relocatable); plt_err("wb_pages (%u) + scratch_pages (%u) > %u", *wb_pages, *scratch_pages, - mldev->ocm.num_pages); + cn10k_mldev->ocm.num_pages); return -ENOMEM; } @@ -506,8 +507,8 @@ cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *mldev, uint16_t model_id, ui * prevent the library from allocating the remaining space on the tile to other models. 
*/ if (!metadata->model.ocm_relocatable) - *scratch_pages = - PLT_MAX(PLT_U64_CAST(*scratch_pages), PLT_U64_CAST(mldev->ocm.num_pages)); + *scratch_pages = PLT_MAX(PLT_U64_CAST(*scratch_pages), + PLT_U64_CAST(cn10k_mldev->ocm.num_pages)); return 0; } diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h index 4cc0744891..3128b28db7 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.h +++ b/drivers/ml/cnxk/cn10k_ml_model.h @@ -13,6 +13,8 @@ #include "cn10k_ml_ocm.h" #include "cn10k_ml_ops.h" +struct cnxk_ml_dev; + /* Model state */ enum cn10k_ml_model_state { ML_CN10K_MODEL_STATE_LOADED, @@ -489,7 +491,7 @@ struct cn10k_ml_model_stats { /* Model Object */ struct cn10k_ml_model { /* Device reference */ - struct cn10k_ml_dev *mldev; + struct cnxk_ml_dev *mldev; /* Name */ char name[RTE_ML_STR_MAX]; @@ -537,8 +539,8 @@ int cn10k_ml_model_metadata_check(uint8_t *buffer, uint64_t size); void cn10k_ml_model_metadata_update(struct cn10k_ml_model_metadata *metadata); void cn10k_ml_model_addr_update(struct cn10k_ml_model *model, uint8_t *buffer, uint8_t *base_dma_addr); -int cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *mldev, uint16_t model_id, uint8_t *buffer, - uint16_t *wb_pages, uint16_t *scratch_pages); +int cn10k_ml_model_ocm_pages_count(struct cn10k_ml_dev *cn10k_mldev, uint16_t model_id, + uint8_t *buffer, uint16_t *wb_pages, uint16_t *scratch_pages); void cn10k_ml_model_info_set(struct rte_ml_dev *dev, struct cn10k_ml_model *model); #endif /* _CN10K_ML_MODEL_H_ */ diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c index 6fb0bb620e..8094a0fab1 100644 --- a/drivers/ml/cnxk/cn10k_ml_ocm.c +++ b/drivers/ml/cnxk/cn10k_ml_ocm.c @@ -4,11 +4,12 @@ #include -#include "cn10k_ml_dev.h" +#include + #include "cn10k_ml_model.h" #include "cn10k_ml_ocm.h" -#include "roc_api.h" +#include "cnxk_ml_dev.h" /* OCM macros */ #define BYTE_LEN 8 @@ -217,7 +218,8 @@ int cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, 
uint16_t wb_pages, uint16_t scratch_pages, uint64_t *tilemask) { - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_ocm *ocm; uint16_t used_scratch_pages_max; @@ -236,8 +238,9 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w int max_slot_sz; int page_id; - mldev = dev->data->dev_private; - ocm = &mldev->ocm; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + ocm = &cn10k_mldev->ocm; if (num_tiles > ML_CN10K_OCM_NUMTILES) { plt_err("Invalid num_tiles = %u (> %u)", num_tiles, ML_CN10K_OCM_NUMTILES); @@ -254,8 +257,8 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w tile_start = 0; search_end_tile = ocm->num_tiles - num_tiles; - /* allocate for local ocm mask */ - local_ocm_mask = rte_zmalloc("local_ocm_mask", mldev->ocm.mask_words, RTE_CACHE_LINE_SIZE); + /* Allocate for local ocm mask */ + local_ocm_mask = rte_zmalloc("local_ocm_mask", ocm->mask_words, RTE_CACHE_LINE_SIZE); if (local_ocm_mask == NULL) { plt_err("Unable to allocate memory for local_ocm_mask"); return -1; @@ -271,7 +274,7 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w PLT_MAX(ocm->tile_ocm_info[tile_id].last_wb_page, used_last_wb_page_max); } - memset(local_ocm_mask, 0, mldev->ocm.mask_words); + memset(local_ocm_mask, 0, ocm->mask_words); for (tile_id = tile_start; tile_id < tile_start + num_tiles; tile_id++) { for (word_id = 0; word_id < ocm->mask_words; word_id++) local_ocm_mask[word_id] |= ocm->tile_ocm_info[tile_id].ocm_mask[word_id]; @@ -333,8 +336,9 @@ void cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint64_t tilemask, int wb_page_start, uint16_t wb_pages, uint16_t scratch_pages) { + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; struct cn10k_ml_ocm *ocm; int scratch_page_start; @@ -345,8 
+349,9 @@ cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint64_t t int tile_id; int page_id; - mldev = dev->data->dev_private; - ocm = &mldev->ocm; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + ocm = &cn10k_mldev->ocm; model = dev->data->models[model_id]; /* Get first set bit, tile_start */ @@ -391,8 +396,9 @@ void cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id) { struct cn10k_ml_model *local_model; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; struct cn10k_ml_ocm *ocm; int scratch_resize_pages; @@ -404,8 +410,9 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id) int page_id; uint16_t i; - mldev = dev->data->dev_private; - ocm = &mldev->ocm; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + ocm = &cn10k_mldev->ocm; model = dev->data->models[model_id]; /* Update OCM info for WB memory */ @@ -453,35 +460,37 @@ cn10k_ml_ocm_pagemask_to_str(struct cn10k_ml_ocm_tile_info *tile_info, uint16_t char *p = str; int word; - /* add prefix 0x */ + /* Add prefix 0x */ *p++ = '0'; *p++ = 'x'; - /* build one word at a time */ + /* Build hex string */ for (word = nwords - 1; word >= 0; word--) { sprintf(p, "%02X", tile_info->ocm_mask[word]); p += 2; } - /* terminate */ + /* Terminate */ *p++ = 0; } void cn10k_ml_ocm_print(struct rte_ml_dev *dev, FILE *fp) { - char *str; - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_ocm *ocm; uint8_t tile_id; uint8_t word_id; int wb_pages; + char *str; - mldev = dev->data->dev_private; - ocm = &mldev->ocm; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + ocm = &cn10k_mldev->ocm; - /* nibbles + prefix '0x' */ - str = rte_zmalloc("ocm_mask_str", mldev->ocm.num_pages / 4 + 2, RTE_CACHE_LINE_SIZE); + /* Nibbles + prefix '0x' */ + str = 
rte_zmalloc("ocm_mask_str", ocm->num_pages / 4 + 2, RTE_CACHE_LINE_SIZE); if (str == NULL) { plt_err("Unable to allocate memory for ocm_mask_str"); return; @@ -492,9 +501,8 @@ cn10k_ml_ocm_print(struct rte_ml_dev *dev, FILE *fp) cn10k_ml_ocm_pagemask_to_str(&ocm->tile_ocm_info[tile_id], ocm->mask_words, str); wb_pages = 0 - ocm->tile_ocm_info[tile_id].scratch_pages; - for (word_id = 0; word_id < mldev->ocm.mask_words; word_id++) - wb_pages += - rte_popcount32(ocm->tile_ocm_info[tile_id].ocm_mask[word_id]); + for (word_id = 0; word_id < ocm->mask_words; word_id++) + wb_pages += rte_popcount32(ocm->tile_ocm_info[tile_id].ocm_mask[word_id]); fprintf(fp, "tile = %2u, scratch_pages = %4u," diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 11531afd8c..def6d4c756 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -7,10 +7,11 @@ #include -#include "cn10k_ml_dev.h" #include "cn10k_ml_model.h" #include "cn10k_ml_ops.h" +#include "cnxk_ml_dev.h" + /* ML model macros */ #define CN10K_ML_MODEL_MEMZONE_NAME "ml_cn10k_model_mz" @@ -85,7 +86,7 @@ cn10k_ml_set_poll_addr(struct cn10k_ml_req *req) static inline void cn10k_ml_set_poll_ptr(struct cn10k_ml_req *req) { - plt_write64(ML_CN10K_POLL_JOB_START, req->compl_W1); + plt_write64(ML_CNXK_POLL_JOB_START, req->compl_W1); } static inline uint64_t @@ -175,7 +176,7 @@ cn10k_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_des qp->queue.reqs = (struct cn10k_ml_req *)va; qp->queue.head = 0; qp->queue.tail = 0; - qp->queue.wait_cycles = ML_CN10K_CMD_TIMEOUT * plt_tsc_hz(); + qp->queue.wait_cycles = ML_CNXK_CMD_TIMEOUT * plt_tsc_hz(); qp->nb_desc = nb_desc; qp->stats.enqueued_count = 0; qp->stats.dequeued_count = 0; @@ -199,16 +200,17 @@ cn10k_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_des static void cn10k_ml_model_print(struct rte_ml_dev *dev, uint16_t model_id, FILE *fp) { - + struct cn10k_ml_dev *cn10k_mldev; + struct 
cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; struct cn10k_ml_ocm *ocm; char str[STR_LEN]; uint8_t i; uint8_t j; - mldev = dev->data->dev_private; - ocm = &mldev->ocm; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + ocm = &cn10k_mldev->ocm; model = dev->data->models[model_id]; /* Print debug info */ @@ -249,7 +251,7 @@ cn10k_ml_model_print(struct rte_ml_dev *dev, uint16_t model_id, FILE *fp) fprintf(fp, "%*s : 0x%0*" PRIx64 "\n", FIELD_LEN, "tilemask", ML_CN10K_OCM_NUMTILES / 4, model->model_mem_map.tilemask); fprintf(fp, "%*s : 0x%" PRIx64 "\n", FIELD_LEN, "ocm_wb_start", - model->model_mem_map.wb_page_start * mldev->ocm.page_size); + model->model_mem_map.wb_page_start * cn10k_mldev->ocm.page_size); } fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", model->metadata.model.num_input); @@ -325,7 +327,7 @@ cn10k_ml_model_print(struct rte_ml_dev *dev, uint16_t model_id, FILE *fp) } static void -cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_model *model, +cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *cn10k_mldev, struct cn10k_ml_model *model, struct cn10k_ml_req *req, enum cn10k_ml_job_type job_type) { struct cn10k_ml_model_metadata *metadata; @@ -340,7 +342,7 @@ cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_mode req->jd.hdr.model_id = model->model_id; req->jd.hdr.job_type = job_type; req->jd.hdr.fp_flags = 0x0; - req->jd.hdr.result = roc_ml_addr_ap2mlip(&mldev->roc, &req->result); + req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->result); if (job_type == ML_CN10K_JOB_TYPE_MODEL_START) { if (!model->metadata.model.ocm_relocatable) @@ -350,9 +352,9 @@ cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_mode req->jd.hdr.sp_flags |= ML_CN10K_SP_FLAGS_EXTENDED_LOAD_JD; req->jd.model_start.extended_args = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&mldev->roc, &req->extended_args)); + 
PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->extended_args)); req->jd.model_start.model_dst_ddr_addr = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&mldev->roc, addr->init_run_addr)); + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, addr->init_run_addr)); req->jd.model_start.model_init_offset = 0x0; req->jd.model_start.model_main_offset = metadata->init_model.file_size; req->jd.model_start.model_finish_offset = @@ -372,7 +374,7 @@ cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_mode req->jd.model_start.ocm_wb_range_start = metadata->model.ocm_wb_range_start; req->jd.model_start.ocm_wb_range_end = metadata->model.ocm_wb_range_end; req->jd.model_start.ddr_wb_base_address = PLT_U64_CAST(roc_ml_addr_ap2mlip( - &mldev->roc, + &cn10k_mldev->roc, PLT_PTR_ADD(addr->finish_load_addr, metadata->finish_model.file_size))); req->jd.model_start.ddr_wb_range_start = metadata->model.ddr_wb_range_start; req->jd.model_start.ddr_wb_range_end = metadata->model.ddr_wb_range_end; @@ -383,7 +385,7 @@ cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_mode req->jd.model_start.output.s.ddr_range_end = metadata->model.ddr_output_range_end; req->extended_args.start.ddr_scratch_base_address = PLT_U64_CAST( - roc_ml_addr_ap2mlip(&mldev->roc, model->addr.scratch_base_addr)); + roc_ml_addr_ap2mlip(&cn10k_mldev->roc, model->addr.scratch_base_addr)); req->extended_args.start.ddr_scratch_range_start = metadata->model.ddr_scratch_range_start; req->extended_args.start.ddr_scratch_range_end = @@ -392,24 +394,20 @@ cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_mode } static __rte_always_inline void -cn10k_ml_prep_fp_job_descriptor(struct rte_ml_dev *dev, struct cn10k_ml_req *req, +cn10k_ml_prep_fp_job_descriptor(struct cn10k_ml_dev *cn10k_mldev, struct cn10k_ml_req *req, struct rte_ml_op *op) { - struct cn10k_ml_dev *mldev; - - mldev = dev->data->dev_private; - req->jd.hdr.jce.w0.u64 = 0; req->jd.hdr.jce.w1.u64 = 
req->compl_W1; req->jd.hdr.model_id = op->model_id; req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_MODEL_RUN; req->jd.hdr.fp_flags = ML_FLAGS_POLL_COMPL; req->jd.hdr.sp_flags = 0x0; - req->jd.hdr.result = roc_ml_addr_ap2mlip(&mldev->roc, &req->result); + req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->result); req->jd.model_run.input_ddr_addr = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&mldev->roc, op->input[0]->addr)); + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, op->input[0]->addr)); req->jd.model_run.output_ddr_addr = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&mldev->roc, op->output[0]->addr)); + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, op->output[0]->addr)); req->jd.model_run.num_batches = op->nb_batches; } @@ -436,66 +434,69 @@ static const struct xstat_info model_stats[] = { static int cn10k_ml_xstats_init(struct rte_ml_dev *dev) { - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; uint16_t nb_stats; uint16_t stat_id; uint16_t model; uint16_t i; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; /* Allocate memory for xstats entries. 
Don't allocate during reconfigure */ nb_stats = RTE_DIM(device_stats) + ML_CN10K_MAX_MODELS * RTE_DIM(model_stats); - if (mldev->xstats.entries == NULL) - mldev->xstats.entries = rte_zmalloc("cn10k_ml_xstats", - sizeof(struct cn10k_ml_xstats_entry) * nb_stats, - PLT_CACHE_LINE_SIZE); + if (cn10k_mldev->xstats.entries == NULL) + cn10k_mldev->xstats.entries = rte_zmalloc( + "cn10k_ml_xstats", sizeof(struct cn10k_ml_xstats_entry) * nb_stats, + PLT_CACHE_LINE_SIZE); - if (mldev->xstats.entries == NULL) + if (cn10k_mldev->xstats.entries == NULL) return -ENOMEM; /* Initialize device xstats */ stat_id = 0; for (i = 0; i < RTE_DIM(device_stats); i++) { - mldev->xstats.entries[stat_id].map.id = stat_id; - snprintf(mldev->xstats.entries[stat_id].map.name, - sizeof(mldev->xstats.entries[stat_id].map.name), "%s", + cn10k_mldev->xstats.entries[stat_id].map.id = stat_id; + snprintf(cn10k_mldev->xstats.entries[stat_id].map.name, + sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), "%s", device_stats[i].name); - mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_DEVICE; - mldev->xstats.entries[stat_id].type = device_stats[i].type; - mldev->xstats.entries[stat_id].fn_id = CN10K_ML_XSTATS_FN_DEVICE; - mldev->xstats.entries[stat_id].obj_idx = 0; - mldev->xstats.entries[stat_id].reset_allowed = device_stats[i].reset_allowed; + cn10k_mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_DEVICE; + cn10k_mldev->xstats.entries[stat_id].type = device_stats[i].type; + cn10k_mldev->xstats.entries[stat_id].fn_id = CN10K_ML_XSTATS_FN_DEVICE; + cn10k_mldev->xstats.entries[stat_id].obj_idx = 0; + cn10k_mldev->xstats.entries[stat_id].reset_allowed = device_stats[i].reset_allowed; stat_id++; } - mldev->xstats.count_mode_device = stat_id; + cn10k_mldev->xstats.count_mode_device = stat_id; /* Initialize model xstats */ for (model = 0; model < ML_CN10K_MAX_MODELS; model++) { - mldev->xstats.offset_for_model[model] = stat_id; + cn10k_mldev->xstats.offset_for_model[model] = stat_id; for (i = 
0; i < RTE_DIM(model_stats); i++) { - mldev->xstats.entries[stat_id].map.id = stat_id; - mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_MODEL; - mldev->xstats.entries[stat_id].type = model_stats[i].type; - mldev->xstats.entries[stat_id].fn_id = CN10K_ML_XSTATS_FN_MODEL; - mldev->xstats.entries[stat_id].obj_idx = model; - mldev->xstats.entries[stat_id].reset_allowed = model_stats[i].reset_allowed; + cn10k_mldev->xstats.entries[stat_id].map.id = stat_id; + cn10k_mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_MODEL; + cn10k_mldev->xstats.entries[stat_id].type = model_stats[i].type; + cn10k_mldev->xstats.entries[stat_id].fn_id = CN10K_ML_XSTATS_FN_MODEL; + cn10k_mldev->xstats.entries[stat_id].obj_idx = model; + cn10k_mldev->xstats.entries[stat_id].reset_allowed = + model_stats[i].reset_allowed; /* Name of xstat is updated during model load */ - snprintf(mldev->xstats.entries[stat_id].map.name, - sizeof(mldev->xstats.entries[stat_id].map.name), "Model-%u-%s", - model, model_stats[i].name); + snprintf(cn10k_mldev->xstats.entries[stat_id].map.name, + sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), + "Model-%u-%s", model, model_stats[i].name); stat_id++; } - mldev->xstats.count_per_model[model] = RTE_DIM(model_stats); + cn10k_mldev->xstats.count_per_model[model] = RTE_DIM(model_stats); } - mldev->xstats.count_mode_model = stat_id - mldev->xstats.count_mode_device; - mldev->xstats.count = stat_id; + cn10k_mldev->xstats.count_mode_model = stat_id - cn10k_mldev->xstats.count_mode_device; + cn10k_mldev->xstats.count = stat_id; return 0; } @@ -503,28 +504,32 @@ cn10k_ml_xstats_init(struct rte_ml_dev *dev) static void cn10k_ml_xstats_uninit(struct rte_ml_dev *dev) { - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; - rte_free(mldev->xstats.entries); - mldev->xstats.entries = NULL; + 
rte_free(cn10k_mldev->xstats.entries); + cn10k_mldev->xstats.entries = NULL; - mldev->xstats.count = 0; + cn10k_mldev->xstats.count = 0; } static void cn10k_ml_xstats_model_name_update(struct rte_ml_dev *dev, uint16_t model_id) { + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; uint16_t rclk_freq; uint16_t sclk_freq; uint16_t stat_id; char suffix[8]; uint16_t i; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; model = dev->data->models[model_id]; stat_id = RTE_DIM(device_stats) + model_id * RTE_DIM(model_stats); @@ -536,8 +541,8 @@ cn10k_ml_xstats_model_name_update(struct rte_ml_dev *dev, uint16_t model_id) /* Update xstat name based on model name and sclk availability */ for (i = 0; i < RTE_DIM(model_stats); i++) { - snprintf(mldev->xstats.entries[stat_id].map.name, - sizeof(mldev->xstats.entries[stat_id].map.name), "%s-%s-%s", + snprintf(cn10k_mldev->xstats.entries[stat_id].map.name, + sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), "%s-%s-%s", model->metadata.model.name, model_stats[i].name, suffix); stat_id++; } @@ -547,19 +552,19 @@ static uint64_t cn10k_ml_dev_xstat_get(struct rte_ml_dev *dev, uint16_t obj_idx __rte_unused, enum cn10k_ml_xstats_type type) { - struct cn10k_ml_dev *mldev; + struct cnxk_ml_dev *cnxk_mldev; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; switch (type) { case nb_models_loaded: - return mldev->nb_models_loaded; + return cnxk_mldev->nb_models_loaded; case nb_models_unloaded: - return mldev->nb_models_unloaded; + return cnxk_mldev->nb_models_unloaded; case nb_models_started: - return mldev->nb_models_started; + return cnxk_mldev->nb_models_started; case nb_models_stopped: - return mldev->nb_models_stopped; + return cnxk_mldev->nb_models_stopped; default: return -1; } @@ -651,15 +656,17 @@ static int cn10k_ml_device_xstats_reset(struct rte_ml_dev *dev, const 
uint16_t stat_ids[], uint16_t nb_ids) { struct cn10k_ml_xstats_entry *xs; - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; uint16_t nb_stats; uint16_t stat_id; uint32_t i; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; if (stat_ids == NULL) - nb_stats = mldev->xstats.count_mode_device; + nb_stats = cn10k_mldev->xstats.count_mode_device; else nb_stats = nb_ids; @@ -669,10 +676,10 @@ cn10k_ml_device_xstats_reset(struct rte_ml_dev *dev, const uint16_t stat_ids[], else stat_id = stat_ids[i]; - if (stat_id >= mldev->xstats.count_mode_device) + if (stat_id >= cn10k_mldev->xstats.count_mode_device) return -EINVAL; - xs = &mldev->xstats.entries[stat_id]; + xs = &cn10k_mldev->xstats.entries[stat_id]; if (!xs->reset_allowed) continue; @@ -740,15 +747,17 @@ cn10k_ml_model_xstats_reset(struct rte_ml_dev *dev, int32_t model_id, const uint uint16_t nb_ids) { struct cn10k_ml_xstats_entry *xs; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; int32_t lcl_model_id = 0; uint16_t start_id; uint16_t end_id; int32_t i; int32_t j; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; for (i = 0; i < ML_CN10K_MAX_MODELS; i++) { if (model_id == -1) { model = dev->data->models[i]; @@ -765,12 +774,13 @@ cn10k_ml_model_xstats_reset(struct rte_ml_dev *dev, int32_t model_id, const uint } } - start_id = mldev->xstats.offset_for_model[i]; - end_id = mldev->xstats.offset_for_model[i] + mldev->xstats.count_per_model[i] - 1; + start_id = cn10k_mldev->xstats.offset_for_model[i]; + end_id = cn10k_mldev->xstats.offset_for_model[i] + + cn10k_mldev->xstats.count_per_model[i] - 1; if (stat_ids == NULL) { for (j = start_id; j <= end_id; j++) { - xs = &mldev->xstats.entries[j]; + xs = &cn10k_mldev->xstats.entries[j]; cn10k_ml_reset_model_stat(dev, i, 
xs->type); } } else { @@ -780,7 +790,7 @@ cn10k_ml_model_xstats_reset(struct rte_ml_dev *dev, int32_t model_id, const uint stat_ids[j], lcl_model_id); return -EINVAL; } - xs = &mldev->xstats.entries[stat_ids[j]]; + xs = &cn10k_mldev->xstats.entries[stat_ids[j]]; cn10k_ml_reset_model_stat(dev, i, xs->type); } } @@ -854,17 +864,19 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, uint16_t model_id) static int cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) { - struct cn10k_ml_dev *mldev; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; if (dev_info == NULL) return -EINVAL; - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; memset(dev_info, 0, sizeof(struct rte_ml_dev_info)); dev_info->driver_name = dev->device->driver->name; dev_info->max_models = ML_CN10K_MAX_MODELS; - if (mldev->hw_queue_lock) + if (cn10k_mldev->hw_queue_lock) dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_SL; else dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_LF; @@ -881,8 +893,9 @@ static int cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf) { struct rte_ml_dev_info dev_info; + struct cn10k_ml_dev *cn10k_mldev; + struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_model *model; - struct cn10k_ml_dev *mldev; struct cn10k_ml_ocm *ocm; struct cn10k_ml_qp *qp; uint16_t model_id; @@ -895,7 +908,8 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c return -EINVAL; /* Get CN10K device handle */ - mldev = dev->data->dev_private; + cnxk_mldev = dev->data->dev_private; + cn10k_mldev = &cnxk_mldev->cn10k_mldev; cn10k_ml_dev_info_get(dev, &dev_info); if (conf->nb_models > dev_info.max_models) { @@ -908,21 +922,21 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c return -EINVAL; } - if (mldev->state == ML_CN10K_DEV_STATE_PROBED) { + if (cnxk_mldev->state == 
ML_CNXK_DEV_STATE_PROBED) {
 		plt_ml_dbg("Configuring ML device, nb_queue_pairs = %u, nb_models = %u",
 			   conf->nb_queue_pairs, conf->nb_models);

 		/* Load firmware */
-		ret = cn10k_ml_fw_load(mldev);
+		ret = cn10k_ml_fw_load(cnxk_mldev);
 		if (ret != 0)
 			return ret;
-	} else if (mldev->state == ML_CN10K_DEV_STATE_CONFIGURED) {
+	} else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CONFIGURED) {
 		plt_ml_dbg("Re-configuring ML device, nb_queue_pairs = %u, nb_models = %u",
 			   conf->nb_queue_pairs, conf->nb_models);
-	} else if (mldev->state == ML_CN10K_DEV_STATE_STARTED) {
+	} else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_STARTED) {
 		plt_err("Device can't be reconfigured in started state\n");
 		return -ENOTSUP;
-	} else if (mldev->state == ML_CN10K_DEV_STATE_CLOSED) {
+	} else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CLOSED) {
 		plt_err("Device can't be reconfigured after close\n");
 		return -ENOTSUP;
 	}
@@ -1013,10 +1027,10 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c
 	}
 	dev->data->nb_models = conf->nb_models;

-	ocm = &mldev->ocm;
+	ocm = &cn10k_mldev->ocm;
 	ocm->num_tiles = ML_CN10K_OCM_NUMTILES;
 	ocm->size_per_tile = ML_CN10K_OCM_TILESIZE;
-	ocm->page_size = mldev->ocm_page_size;
+	ocm->page_size = cn10k_mldev->ocm_page_size;
 	ocm->num_pages = ocm->size_per_tile / ocm->page_size;
 	ocm->mask_words = ocm->num_pages / (8 * sizeof(uint8_t));

@@ -1044,25 +1058,25 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c
 	}

 	/* Set JCMDQ enqueue function */
-	if (mldev->hw_queue_lock == 1)
-		mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_sl;
+	if (cn10k_mldev->hw_queue_lock == 1)
+		cn10k_mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_sl;
 	else
-		mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_lf;
+		cn10k_mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_lf;

 	/* Set polling function pointers */
-	mldev->set_poll_addr = cn10k_ml_set_poll_addr;
-	mldev->set_poll_ptr = cn10k_ml_set_poll_ptr;
-	mldev->get_poll_ptr = cn10k_ml_get_poll_ptr;
+	cn10k_mldev->set_poll_addr = cn10k_ml_set_poll_addr;
+	cn10k_mldev->set_poll_ptr = cn10k_ml_set_poll_ptr;
+	cn10k_mldev->get_poll_ptr = cn10k_ml_get_poll_ptr;

 	dev->enqueue_burst = cn10k_ml_enqueue_burst;
 	dev->dequeue_burst = cn10k_ml_dequeue_burst;
 	dev->op_error_get = cn10k_ml_op_error_get;

-	mldev->nb_models_loaded = 0;
-	mldev->nb_models_started = 0;
-	mldev->nb_models_stopped = 0;
-	mldev->nb_models_unloaded = 0;
-	mldev->state = ML_CN10K_DEV_STATE_CONFIGURED;
+	cnxk_mldev->nb_models_loaded = 0;
+	cnxk_mldev->nb_models_started = 0;
+	cnxk_mldev->nb_models_stopped = 0;
+	cnxk_mldev->nb_models_unloaded = 0;
+	cnxk_mldev->state = ML_CNXK_DEV_STATE_CONFIGURED;

 	return 0;

@@ -1077,8 +1091,9 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c
 static int
 cn10k_ml_dev_close(struct rte_ml_dev *dev)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_qp *qp;
 	uint16_t model_id;
 	uint16_t qp_id;
@@ -1086,10 +1101,11 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
 	if (dev == NULL)
 		return -EINVAL;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;

 	/* Release ocm_mask memory */
-	rte_free(mldev->ocm.ocm_mask);
+	rte_free(cn10k_mldev->ocm.ocm_mask);

 	/* Stop and unload all models */
 	for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
@@ -1125,21 +1141,21 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
 	cn10k_ml_xstats_uninit(dev);

 	/* Unload firmware */
-	cn10k_ml_fw_unload(mldev);
+	cn10k_ml_fw_unload(cnxk_mldev);

 	/* Clear scratch registers */
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_WORK_PTR);
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_FW_CTRL);
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C0);
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0);
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1);
-	roc_ml_reg_write64(&mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_WORK_PTR);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_FW_CTRL);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C0);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C0);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_HEAD_C1);
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_SCRATCH_DBG_BUFFER_TAIL_C1);

 	/* Reset ML_MLR_BASE */
-	roc_ml_reg_write64(&mldev->roc, 0, ML_MLR_BASE);
-	plt_ml_dbg("ML_MLR_BASE = 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_MLR_BASE));
+	roc_ml_reg_write64(&cn10k_mldev->roc, 0, ML_MLR_BASE);
+	plt_ml_dbg("ML_MLR_BASE = 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_MLR_BASE));

-	mldev->state = ML_CN10K_DEV_STATE_CLOSED;
+	cnxk_mldev->state = ML_CNXK_DEV_STATE_CLOSED;

 	/* Remove PCI device */
 	return rte_dev_remove(dev->device);
@@ -1148,17 +1164,19 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
 static int
 cn10k_ml_dev_start(struct rte_ml_dev *dev)
 {
-	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	uint64_t reg_val64;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;

-	reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG);
+	reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG);
 	reg_val64 |= ROC_ML_CFG_ENA;
-	roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG);
-	plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG));
+	roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG);
+	plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG));

-	mldev->state = ML_CN10K_DEV_STATE_STARTED;
+	cnxk_mldev->state = ML_CNXK_DEV_STATE_STARTED;

 	return 0;
 }
@@ -1166,17 +1184,19 @@ cn10k_ml_dev_start(struct rte_ml_dev *dev)
 static int
 cn10k_ml_dev_stop(struct rte_ml_dev *dev)
 {
-	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	uint64_t reg_val64;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;

-	reg_val64 = roc_ml_reg_read64(&mldev->roc, ML_CFG);
+	reg_val64 = roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG);
 	reg_val64 &= ~ROC_ML_CFG_ENA;
-	roc_ml_reg_write64(&mldev->roc, reg_val64, ML_CFG);
-	plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&mldev->roc, ML_CFG));
+	roc_ml_reg_write64(&cn10k_mldev->roc, reg_val64, ML_CFG);
+	plt_ml_dbg("ML_CFG => 0x%016lx", roc_ml_reg_read64(&cn10k_mldev->roc, ML_CFG));

-	mldev->state = ML_CN10K_DEV_STATE_CONFIGURED;
+	cnxk_mldev->state = ML_CNXK_DEV_STATE_CONFIGURED;

 	return 0;
 }
@@ -1259,22 +1279,24 @@ cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mod
 			      int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map,
 			      uint32_t size)
 {
-	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	uint32_t xstats_mode_count;
 	uint32_t idx = 0;
 	uint32_t i;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;

 	xstats_mode_count = 0;
 	switch (mode) {
 	case RTE_ML_DEV_XSTATS_DEVICE:
-		xstats_mode_count = mldev->xstats.count_mode_device;
+		xstats_mode_count = cn10k_mldev->xstats.count_mode_device;
 		break;
 	case RTE_ML_DEV_XSTATS_MODEL:
 		if (model_id >= ML_CN10K_MAX_MODELS)
 			break;
-		xstats_mode_count = mldev->xstats.count_per_model[model_id];
+		xstats_mode_count = cn10k_mldev->xstats.count_per_model[model_id];
 		break;
 	default:
 		return -EINVAL;
@@ -1283,16 +1305,17 @@ cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mod
 	if (xstats_mode_count > size || xstats_map == NULL)
 		return xstats_mode_count;

-	for (i = 0; i < mldev->xstats.count && idx < size; i++) {
-		if (mldev->xstats.entries[i].mode != mode)
+	for (i = 0; i < cn10k_mldev->xstats.count && idx < size; i++) {
+		if (cn10k_mldev->xstats.entries[i].mode != mode)
 			continue;

 		if (mode != RTE_ML_DEV_XSTATS_DEVICE &&
-		    model_id != mldev->xstats.entries[i].obj_idx)
+		    model_id != cn10k_mldev->xstats.entries[i].obj_idx)
 			continue;

-		strncpy(xstats_map[idx].name, mldev->xstats.entries[i].map.name, RTE_ML_STR_MAX);
-		xstats_map[idx].id = mldev->xstats.entries[i].map.id;
+		strncpy(xstats_map[idx].name, cn10k_mldev->xstats.entries[i].map.name,
+			RTE_ML_STR_MAX);
+		xstats_map[idx].id = cn10k_mldev->xstats.entries[i].map.id;
 		idx++;
 	}

@@ -1304,13 +1327,15 @@ cn10k_ml_dev_xstats_by_name_get(struct rte_ml_dev *dev, const char *name, uint16
 				uint64_t *value)
 {
 	struct cn10k_ml_xstats_entry *xs;
-	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	cn10k_ml_xstats_fn fn;
 	uint32_t i;

-	mldev = dev->data->dev_private;
-	for (i = 0; i < mldev->xstats.count; i++) {
-		xs = &mldev->xstats.entries[i];
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
+	for (i = 0; i < cn10k_mldev->xstats.count; i++) {
+		xs = &cn10k_mldev->xstats.entries[i];
 		if (strncmp(xs->map.name, name, RTE_ML_STR_MAX) == 0) {
 			if (stat_id != NULL)
 				*stat_id = xs->map.id;
@@ -1344,24 +1369,26 @@ cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode
 			const uint16_t stat_ids[], uint64_t values[], uint16_t nb_ids)
 {
 	struct cn10k_ml_xstats_entry *xs;
-	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	uint32_t xstats_mode_count;
 	cn10k_ml_xstats_fn fn;
 	uint64_t val;
 	uint32_t idx;
 	uint32_t i;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 	xstats_mode_count = 0;

 	switch (mode) {
 	case RTE_ML_DEV_XSTATS_DEVICE:
-		xstats_mode_count = mldev->xstats.count_mode_device;
+		xstats_mode_count = cn10k_mldev->xstats.count_mode_device;
 		break;
 	case RTE_ML_DEV_XSTATS_MODEL:
 		if (model_id >= ML_CN10K_MAX_MODELS)
 			return -EINVAL;
-		xstats_mode_count = mldev->xstats.count_per_model[model_id];
+		xstats_mode_count = cn10k_mldev->xstats.count_per_model[model_id];
 		break;
 	default:
 		return -EINVAL;
@@ -1369,8 +1396,8 @@ cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode

 	idx = 0;
 	for (i = 0; i < nb_ids && idx < xstats_mode_count; i++) {
-		xs = &mldev->xstats.entries[stat_ids[i]];
-		if (stat_ids[i] > mldev->xstats.count || xs->mode != mode)
+		xs = &cn10k_mldev->xstats.entries[stat_ids[i]];
+		if (stat_ids[i] > cn10k_mldev->xstats.count || xs->mode != mode)
 			continue;

 		if (mode == RTE_ML_DEV_XSTATS_MODEL && model_id != xs->obj_idx) {
@@ -1418,8 +1445,9 @@ cn10k_ml_dev_xstats_reset(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mo
 static int
 cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_fw *fw;

 	uint32_t head_loc;
@@ -1432,8 +1460,9 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 	if (roc_env_is_asim())
 		return 0;

-	mldev = dev->data->dev_private;
-	fw = &mldev->fw;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
+	fw = &cn10k_mldev->fw;

 	/* Dump model info */
 	for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
@@ -1451,15 +1480,19 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 	for (core_id = 0; core_id <= 1; core_id++) {
 		bufsize = fw->req->jd.fw_load.debug.debug_buffer_size;
 		if (core_id == 0) {
-			head_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C0);
-			tail_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C0);
+			head_loc =
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C0);
+			tail_loc =
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C0);
 			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core0_debug_ptr);
-			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+			head_ptr = roc_ml_addr_mlip2ap(&cn10k_mldev->roc, head_ptr);
 		} else {
-			head_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C1);
-			tail_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C1);
+			head_loc =
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C1);
+			tail_loc =
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C1);
 			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core1_debug_ptr);
-			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+			head_ptr = roc_ml_addr_mlip2ap(&cn10k_mldev->roc, head_ptr);
 		}
 		if (head_loc < tail_loc) {
 			fprintf(fp, "%.*s\n", tail_loc - head_loc, &head_ptr[head_loc]);
@@ -1473,18 +1506,18 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 	for (core_id = 0; core_id <= 1; core_id++) {
 		bufsize = fw->req->jd.fw_load.debug.exception_state_size;
 		if ((core_id == 0) &&
-		    (roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0) != 0)) {
+		    (roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0) != 0)) {
 			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core0_exception_buffer);
 			fprintf(fp, "ML_SCRATCH_EXCEPTION_SP_C0 = 0x%016lx",
-				roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0));
-			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0));
+			head_ptr = roc_ml_addr_mlip2ap(&cn10k_mldev->roc, head_ptr);
 			fprintf(fp, "%.*s", bufsize, head_ptr);
-		} else if ((core_id == 1) &&
-			   (roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1) != 0)) {
+		} else if ((core_id == 1) && (roc_ml_reg_read64(&cn10k_mldev->roc,
+								 ML_SCRATCH_EXCEPTION_SP_C1) != 0)) {
 			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core1_exception_buffer);
 			fprintf(fp, "ML_SCRATCH_EXCEPTION_SP_C1 = 0x%016lx",
-				roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1));
-			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+				roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1));
+			head_ptr = roc_ml_addr_mlip2ap(&cn10k_mldev->roc, head_ptr);
 			fprintf(fp, "%.*s", bufsize, head_ptr);
 		}
 	}
@@ -1495,14 +1528,16 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 static int
 cn10k_ml_dev_selftest(struct rte_ml_dev *dev)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	const struct plt_memzone *mz;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_req *req;
 	uint64_t timeout_cycle;
 	bool timeout;
 	int ret;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 	mz = plt_memzone_reserve_aligned("dev_selftest", sizeof(struct cn10k_ml_req), 0,
 					 ML_CN10K_ALIGN_SIZE);
 	if (mz == NULL) {
@@ -1515,20 +1550,20 @@ cn10k_ml_dev_selftest(struct rte_ml_dev *dev)
 	memset(&req->jd, 0, sizeof(struct cn10k_ml_jd));
 	req->jd.hdr.jce.w1.u64 = PLT_U64_CAST(&req->status);
 	req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_SELFTEST;
-	req->jd.hdr.result = roc_ml_addr_ap2mlip(&mldev->roc, &req->result);
-	req->jd.fw_load.flags = cn10k_ml_fw_flags_get(&mldev->fw);
-	plt_write64(ML_CN10K_POLL_JOB_START, &req->status);
+	req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->result);
+	req->jd.fw_load.flags = cn10k_ml_fw_flags_get(&cn10k_mldev->fw);
+	plt_write64(ML_CNXK_POLL_JOB_START, &req->status);
 	plt_wmb();

 	/* Enqueue firmware selftest request through scratch registers */
 	timeout = true;
-	timeout_cycle = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
-	roc_ml_scratch_enqueue(&mldev->roc, &req->jd);
+	timeout_cycle = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
+	roc_ml_scratch_enqueue(&cn10k_mldev->roc, &req->jd);

 	plt_rmb();
 	do {
-		if (roc_ml_scratch_is_done_bit_set(&mldev->roc) &&
-		    (plt_read64(&req->status) == ML_CN10K_POLL_JOB_FINISH)) {
+		if (roc_ml_scratch_is_done_bit_set(&cn10k_mldev->roc) &&
+		    (plt_read64(&req->status) == ML_CNXK_POLL_JOB_FINISH)) {
 			timeout = false;
 			break;
 		}
@@ -1552,8 +1587,8 @@ int
 cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, uint16_t *model_id)
 {
 	struct cn10k_ml_model_metadata *metadata;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;

 	char str[RTE_MEMZONE_NAMESIZE];
 	const struct plt_memzone *mz;
@@ -1574,7 +1609,7 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 	if (ret != 0)
 		return ret;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;

 	/* Find model ID */
 	found = false;
@@ -1591,7 +1626,8 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 	}

 	/* Get WB and scratch pages, check if model can be loaded. */
-	ret = cn10k_ml_model_ocm_pages_count(mldev, idx, params->addr, &wb_pages, &scratch_pages);
+	ret = cn10k_ml_model_ocm_pages_count(&cnxk_mldev->cn10k_mldev, idx, params->addr, &wb_pages,
+					     &scratch_pages);
 	if (ret < 0)
 		return ret;

@@ -1623,7 +1659,7 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 	}

 	model = mz->addr;
-	model->mldev = mldev;
+	model->mldev = cnxk_mldev;
 	model->model_id = idx;

 	rte_memcpy(&model->metadata, params->addr, sizeof(struct cn10k_ml_model_metadata));
@@ -1680,7 +1716,7 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 	plt_spinlock_init(&model->lock);
 	model->state = ML_CN10K_MODEL_STATE_LOADED;
 	dev->data->models[idx] = model;
-	mldev->nb_models_loaded++;
+	cnxk_mldev->nb_models_loaded++;

 	/* Update xstats names */
 	cn10k_ml_xstats_model_name_update(dev, idx);
@@ -1695,9 +1731,9 @@ cn10k_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id)
 {
 	char str[RTE_MEMZONE_NAMESIZE];
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
+	struct cnxk_ml_dev *cnxk_mldev;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
 	model = dev->data->models[model_id];

 	if (model == NULL) {
@@ -1711,7 +1747,7 @@ cn10k_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id)
 	}

 	dev->data->models[model_id] = NULL;
-	mldev->nb_models_unloaded++;
+	cnxk_mldev->nb_models_unloaded++;

 	snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", CN10K_ML_MODEL_MEMZONE_NAME, model_id);
 	return plt_memzone_free(plt_memzone_lookup(str));
@@ -1720,8 +1756,9 @@ cn10k_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id)
 int
 cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_ocm *ocm;
 	struct cn10k_ml_req *req;

@@ -1735,8 +1772,9 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 	bool locked;
 	int ret = 0;

-	mldev = dev->data->dev_private;
-	ocm = &mldev->ocm;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
+	ocm = &cn10k_mldev->ocm;
 	model = dev->data->models[model_id];

 	if (model == NULL) {
@@ -1746,11 +1784,11 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)

 	/* Prepare JD */
 	req = model->req;
-	cn10k_ml_prep_sp_job_descriptor(mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_START);
+	cn10k_ml_prep_sp_job_descriptor(cn10k_mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_START);
 	req->result.error_code.u64 = 0x0;
 	req->result.user_ptr = NULL;

-	plt_write64(ML_CN10K_POLL_JOB_START, &req->status);
+	plt_write64(ML_CNXK_POLL_JOB_START, &req->status);
 	plt_wmb();

 	num_tiles = model->metadata.model.tile_end - model->metadata.model.tile_start + 1;
@@ -1815,26 +1853,26 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 	job_dequeued = false;
 	do {
 		if (!job_enqueued) {
-			req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
-			job_enqueued = roc_ml_scratch_enqueue(&mldev->roc, &req->jd);
+			req->timeout = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
+			job_enqueued = roc_ml_scratch_enqueue(&cn10k_mldev->roc, &req->jd);
 		}

 		if (job_enqueued && !job_dequeued)
-			job_dequeued = roc_ml_scratch_dequeue(&mldev->roc, &req->jd);
+			job_dequeued = roc_ml_scratch_dequeue(&cn10k_mldev->roc, &req->jd);

 		if (job_dequeued)
 			break;
 	} while (plt_tsc_cycles() < req->timeout);

 	if (job_dequeued) {
-		if (plt_read64(&req->status) == ML_CN10K_POLL_JOB_FINISH) {
+		if (plt_read64(&req->status) == ML_CNXK_POLL_JOB_FINISH) {
 			if (req->result.error_code.u64 == 0)
 				ret = 0;
 			else
 				ret = -1;
 		}
 	} else { /* Reset scratch registers */
-		roc_ml_scratch_queue_reset(&mldev->roc);
+		roc_ml_scratch_queue_reset(&cn10k_mldev->roc);
 		ret = -ETIME;
 	}

@@ -1843,7 +1881,7 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 		if (plt_spinlock_trylock(&model->lock) != 0) {
 			if (ret == 0) {
 				model->state = ML_CN10K_MODEL_STATE_STARTED;
-				mldev->nb_models_started++;
+				cnxk_mldev->nb_models_started++;
 			} else {
 				model->state = ML_CN10K_MODEL_STATE_UNKNOWN;
 			}
@@ -1867,7 +1905,7 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 	if (ret < 0) { /* Call unload to update model and FW state, ignore error */
 		rte_ml_model_stop(dev->data->dev_id, model_id);
 	} else {
-		if (mldev->cache_model_data && roc_model_is_cn10ka())
+		if (cn10k_mldev->cache_model_data && roc_model_is_cn10ka())
 			ret = cn10k_ml_cache_model_data(dev, model_id);
 	}

@@ -1877,8 +1915,9 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 int
 cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_ocm *ocm;
 	struct cn10k_ml_req *req;

@@ -1887,8 +1926,9 @@ cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)
 	bool locked;
 	int ret = 0;

-	mldev = dev->data->dev_private;
-	ocm = &mldev->ocm;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
+	ocm = &cn10k_mldev->ocm;
 	model = dev->data->models[model_id];

 	if (model == NULL) {
@@ -1898,11 +1938,11 @@ cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)

 	/* Prepare JD */
 	req = model->req;
-	cn10k_ml_prep_sp_job_descriptor(mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_STOP);
+	cn10k_ml_prep_sp_job_descriptor(cn10k_mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_STOP);
 	req->result.error_code.u64 = 0x0;
 	req->result.user_ptr = NULL;

-	plt_write64(ML_CN10K_POLL_JOB_START, &req->status);
+	plt_write64(ML_CNXK_POLL_JOB_START, &req->status);
 	plt_wmb();

 	locked = false;
@@ -1941,33 +1981,33 @@ cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)
 	job_dequeued = false;
 	do {
 		if (!job_enqueued) {
-			req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
-			job_enqueued = roc_ml_scratch_enqueue(&mldev->roc, &req->jd);
+			req->timeout = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
+			job_enqueued = roc_ml_scratch_enqueue(&cn10k_mldev->roc, &req->jd);
 		}

 		if (job_enqueued && !job_dequeued)
-			job_dequeued = roc_ml_scratch_dequeue(&mldev->roc, &req->jd);
+			job_dequeued = roc_ml_scratch_dequeue(&cn10k_mldev->roc, &req->jd);

 		if (job_dequeued)
 			break;
 	} while (plt_tsc_cycles() < req->timeout);

 	if (job_dequeued) {
-		if (plt_read64(&req->status) == ML_CN10K_POLL_JOB_FINISH) {
+		if (plt_read64(&req->status) == ML_CNXK_POLL_JOB_FINISH) {
 			if (req->result.error_code.u64 == 0x0)
 				ret = 0;
 			else
 				ret = -1;
 		}
 	} else {
-		roc_ml_scratch_queue_reset(&mldev->roc);
+		roc_ml_scratch_queue_reset(&cn10k_mldev->roc);
 		ret = -ETIME;
 	}

 	locked = false;
 	while (!locked) {
 		if (plt_spinlock_trylock(&model->lock) != 0) {
-			mldev->nb_models_stopped++;
+			cnxk_mldev->nb_models_stopped++;
 			model->state = ML_CN10K_MODEL_STATE_LOADED;
 			plt_spinlock_unlock(&model->lock);
 			locked = true;
@@ -2211,8 +2251,9 @@ cn10k_ml_result_update(struct rte_ml_dev *dev, int qp_id, struct cn10k_ml_result
 		       struct rte_ml_op *op)
 {
 	struct cn10k_ml_model_stats *stats;
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_qp *qp;
 	uint64_t hw_latency;
 	uint64_t fw_latency;
@@ -2258,14 +2299,16 @@ cn10k_ml_result_update(struct rte_ml_dev *dev, int qp_id, struct cn10k_ml_result

 		/* Handle driver error */
 		if (result->error_code.s.etype == ML_ETYPE_DRIVER) {
-			mldev = dev->data->dev_private;
+			cnxk_mldev = dev->data->dev_private;
+			cn10k_mldev = &cnxk_mldev->cn10k_mldev;

 			/* Check for exception */
-			if ((roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0) != 0) ||
-			    (roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1) != 0))
+			if ((roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0) !=
+			     0) ||
+			    (roc_ml_reg_read64(&cn10k_mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1) != 0))
 				result->error_code.s.stype = ML_DRIVER_ERR_EXCEPTION;
-			else if ((roc_ml_reg_read64(&mldev->roc, ML_CORE_INT_LO) != 0) ||
-				 (roc_ml_reg_read64(&mldev->roc, ML_CORE_INT_HI) != 0))
+			else if ((roc_ml_reg_read64(&cn10k_mldev->roc, ML_CORE_INT_LO) != 0) ||
+				 (roc_ml_reg_read64(&cn10k_mldev->roc, ML_CORE_INT_HI) != 0))
 				result->error_code.s.stype = ML_DRIVER_ERR_FW_ERROR;
 			else
 				result->error_code.s.stype = ML_DRIVER_ERR_UNKNOWN;
@@ -2282,8 +2325,9 @@ __rte_hot uint16_t
 cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
 		       uint16_t nb_ops)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_queue *queue;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_req *req;
 	struct cn10k_ml_qp *qp;
 	struct rte_ml_op *op;
@@ -2292,7 +2336,8 @@ cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op
 	uint64_t head;
 	bool enqueued;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 	qp = dev->data->queue_pairs[qp_id];
 	queue = &qp->queue;

@@ -2307,15 +2352,15 @@ cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op
 	op = ops[count];
 	req = &queue->reqs[head];

-	mldev->set_poll_addr(req);
-	cn10k_ml_prep_fp_job_descriptor(dev, req, op);
+	cn10k_mldev->set_poll_addr(req);
+	cn10k_ml_prep_fp_job_descriptor(cn10k_mldev, req, op);

 	memset(&req->result, 0, sizeof(struct cn10k_ml_result));
 	req->result.error_code.s.etype = ML_ETYPE_UNKNOWN;
 	req->result.user_ptr = op->user_ptr;

-	mldev->set_poll_ptr(req);
-	enqueued = mldev->ml_jcmdq_enqueue(&mldev->roc, &req->jcmd);
+	cn10k_mldev->set_poll_ptr(req);
+	enqueued = cn10k_mldev->ml_jcmdq_enqueue(&cn10k_mldev->roc, &req->jcmd);
 	if (unlikely(!enqueued))
 		goto jcmdq_full;

@@ -2339,8 +2384,9 @@ __rte_hot uint16_t
 cn10k_ml_dequeue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
 		       uint16_t nb_ops)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_queue *queue;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_req *req;
 	struct cn10k_ml_qp *qp;

@@ -2348,7 +2394,8 @@ cn10k_ml_dequeue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op
 	uint16_t count;
 	uint64_t tail;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 	qp = dev->data->queue_pairs[qp_id];
 	queue = &qp->queue;

@@ -2361,8 +2408,8 @@ cn10k_ml_dequeue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op

 dequeue_req:
 	req = &queue->reqs[tail];
-	status = mldev->get_poll_ptr(req);
-	if (unlikely(status != ML_CN10K_POLL_JOB_FINISH)) {
+	status = cn10k_mldev->get_poll_ptr(req);
+	if (unlikely(status != ML_CNXK_POLL_JOB_FINISH)) {
 		if (plt_tsc_cycles() < req->timeout)
 			goto empty_or_active;
 		else /* Timeout, set indication of driver error */
@@ -2420,30 +2467,32 @@ cn10k_ml_op_error_get(struct rte_ml_dev *dev, struct rte_ml_op *op, struct rte_m
 __rte_hot int
 cn10k_ml_inference_sync(struct rte_ml_dev *dev, struct rte_ml_op *op)
 {
+	struct cn10k_ml_dev *cn10k_mldev;
+	struct cnxk_ml_dev *cnxk_mldev;
 	struct cn10k_ml_model *model;
-	struct cn10k_ml_dev *mldev;
 	struct cn10k_ml_req *req;
 	bool timeout;
 	int ret = 0;

-	mldev = dev->data->dev_private;
+	cnxk_mldev = dev->data->dev_private;
+	cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 	model = dev->data->models[op->model_id];
 	req = model->req;

 	cn10k_ml_set_poll_addr(req);
-	cn10k_ml_prep_fp_job_descriptor(dev, req, op);
+	cn10k_ml_prep_fp_job_descriptor(cn10k_mldev, req, op);

 	memset(&req->result, 0, sizeof(struct cn10k_ml_result));
 	req->result.error_code.s.etype = ML_ETYPE_UNKNOWN;
 	req->result.user_ptr = op->user_ptr;

-	mldev->set_poll_ptr(req);
+	cn10k_mldev->set_poll_ptr(req);
 	req->jcmd.w1.s.jobptr = PLT_U64_CAST(&req->jd);

 	timeout = true;
-	req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
+	req->timeout = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
 	do {
-		if (mldev->ml_jcmdq_enqueue(&mldev->roc, &req->jcmd)) {
+		if (cn10k_mldev->ml_jcmdq_enqueue(&cn10k_mldev->roc, &req->jcmd)) {
 			req->op = op;
 			timeout = false;
 			break;
@@ -2457,7 +2506,7 @@ cn10k_ml_inference_sync(struct rte_ml_dev *dev, struct rte_ml_op *op)

 	timeout = true;
 	do {
-		if (mldev->get_poll_ptr(req) == ML_CN10K_POLL_JOB_FINISH) {
+		if (cn10k_mldev->get_poll_ptr(req) == ML_CNXK_POLL_JOB_FINISH) {
 			timeout = false;
 			break;
 		}
diff --git a/drivers/ml/cnxk/cnxk_ml_dev.c b/drivers/ml/cnxk/cnxk_ml_dev.c
new file mode 100644
index 0000000000..2a5c17c973
--- /dev/null
+++ b/drivers/ml/cnxk/cnxk_ml_dev.c
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2023 Marvell.
+ */
+
+#include
+#include
+
+#include "cnxk_ml_dev.h"
+
+/* Dummy operations for ML device */
+struct rte_ml_dev_ops ml_dev_dummy_ops = {0};
diff --git a/drivers/ml/cnxk/cnxk_ml_dev.h b/drivers/ml/cnxk/cnxk_ml_dev.h
new file mode 100644
index 0000000000..51315de622
--- /dev/null
+++ b/drivers/ml/cnxk/cnxk_ml_dev.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2023 Marvell.
+ */
+
+#ifndef _CNXK_ML_DEV_H_
+#define _CNXK_ML_DEV_H_
+
+#include
+
+#include "cn10k_ml_dev.h"
+
+/* ML command timeout in seconds */
+#define ML_CNXK_CMD_TIMEOUT 5
+
+/* Poll mode job state */
+#define ML_CNXK_POLL_JOB_START 0
+#define ML_CNXK_POLL_JOB_FINISH 1
+
+/* Device configuration state enum */
+enum cnxk_ml_dev_state {
+	/* Probed and not configured */
+	ML_CNXK_DEV_STATE_PROBED = 0,
+
+	/* Configured */
+	ML_CNXK_DEV_STATE_CONFIGURED,
+
+	/* Started */
+	ML_CNXK_DEV_STATE_STARTED,
+
+	/* Closed */
+	ML_CNXK_DEV_STATE_CLOSED
+};
+
+/* Device private data */
+struct cnxk_ml_dev {
+	/* RTE device */
+	struct rte_ml_dev *mldev;
+
+	/* Configuration state */
+	enum cnxk_ml_dev_state state;
+
+	/* Number of models loaded */
+	uint16_t nb_models_loaded;
+
+	/* Number of models unloaded */
+	uint16_t nb_models_unloaded;
+
+	/* Number of models started */
+	uint16_t nb_models_started;
+
+	/* Number of models stopped */
+	uint16_t nb_models_stopped;
+
+	/* CN10K device structure */
+	struct cn10k_ml_dev cn10k_mldev;
+};
+
+#endif /* _CNXK_ML_DEV_H_ */
diff --git a/drivers/ml/cnxk/meson.build b/drivers/ml/cnxk/meson.build
index 94fa4283b1..03a2d4ecf2 100644
--- a/drivers/ml/cnxk/meson.build
+++ b/drivers/ml/cnxk/meson.build
@@ -12,6 +12,7 @@ driver_sdk_headers = files(
         'cn10k_ml_ops.h',
         'cn10k_ml_model.h',
         'cn10k_ml_ocm.h',
+        'cnxk_ml_dev.h',
 )

 sources = files(
@@ -19,6 +20,7 @@ sources = files(
         'cn10k_ml_ops.c',
         'cn10k_ml_model.c',
         'cn10k_ml_ocm.c',
+        'cnxk_ml_dev.c',
 )

 deps += ['mldev', 'common_cnxk', 'kvargs', 'hash']
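
Note for reviewers: the recurring change in this patch is that dev_private now points to the common struct cnxk_ml_dev, and the CN10K-specific struct is reached through its embedded cn10k_mldev member. A minimal standalone sketch of that access pattern is below. Every demo_* name, the cfg_reg field and DEMO_CFG_ENA are hypothetical stand-ins made up for illustration; they are not driver or ROC APIs, and the real structs carry many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the common device states added by the patch. */
enum demo_dev_state {
	DEMO_DEV_STATE_PROBED = 0,
	DEMO_DEV_STATE_CONFIGURED,
	DEMO_DEV_STATE_STARTED,
	DEMO_DEV_STATE_CLOSED
};

/* Hypothetical stand-in for the hardware-specific struct. */
struct demo_cn10k_dev {
	uint64_t cfg_reg; /* plays the role of the ML_CFG register */
};

/* Hypothetical stand-in for the common wrapper struct. */
struct demo_cnxk_dev {
	enum demo_dev_state state;         /* common state lives in the wrapper */
	uint16_t nb_models_loaded;         /* common counters live here too */
	struct demo_cn10k_dev cn10k_mldev; /* embedded by value, not a pointer */
};

#define DEMO_CFG_ENA 0x1ULL

/* Derive the hardware-specific pointer from the private data pointer. */
static struct demo_cn10k_dev *
demo_cn10k_dev_get(void *dev_private)
{
	struct demo_cnxk_dev *cnxk_mldev = dev_private;

	assert(cnxk_mldev != NULL);
	return &cnxk_mldev->cn10k_mldev;
}

/* Register access goes through the embedded struct, state through the wrapper. */
static int
demo_dev_start(void *dev_private)
{
	struct demo_cnxk_dev *cnxk_mldev = dev_private;
	struct demo_cn10k_dev *cn10k_mldev = demo_cn10k_dev_get(dev_private);

	cn10k_mldev->cfg_reg |= DEMO_CFG_ENA;
	cnxk_mldev->state = DEMO_DEV_STATE_STARTED;
	return 0;
}

/* Stop clears the enable bit and falls back to the configured state. */
static int
demo_dev_stop(void *dev_private)
{
	struct demo_cnxk_dev *cnxk_mldev = dev_private;
	struct demo_cn10k_dev *cn10k_mldev = demo_cn10k_dev_get(dev_private);

	cn10k_mldev->cfg_reg &= ~DEMO_CFG_ENA;
	cnxk_mldev->state = DEMO_DEV_STATE_CONFIGURED;
	return 0;
}
```

Because the CN10K struct is embedded by value, a single allocation of the wrapper covers both layers, and functions that only touch common counters or state (as model load and unload do in this patch) need only the wrapper pointer.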