From patchwork Tue Feb  7 16:06:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi <syalavarthi@marvell.com>
X-Patchwork-Id: 123330
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path: <dev-bounces@dpdk.org>
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id BAEDE41C30;
	Tue,  7 Feb 2023 17:09:01 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id F38EB42D46;
	Tue,  7 Feb 2023 17:07:41 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com
 [67.231.148.174])
 by mails.dpdk.org (Postfix) with ESMTP id 9B07B42D10
 for <dev@dpdk.org>; Tue,  7 Feb 2023 17:07:29 +0100 (CET)
Received: from pps.filterd (m0045849.ppops.net [127.0.0.1])
 by mx0a-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id
 317EgVTm017206 for <dev@dpdk.org>; Tue, 7 Feb 2023 08:07:28 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com;
 h=from : to : cc :
 subject : date : message-id : in-reply-to : references : mime-version :
 content-type; s=pfpt0220; bh=v33YgRIAk6wnQyoc86Zqjn89o2cq1Sw6cDn4LFZfYrE=;
 b=D5Koa/ml31MOeegcJBhGfY+T2JMCXcx/PQ51yPZx0b+8LZzcYsn5bjoJsQAGELRmspvx
 d0yUueQxCrfK8UnjUEghN1IvFWUTLoTGSoc2bCI1Eh690OYeY10mEkvDrVS2cXPALiJK
 zh/68EC1cRWWMfcGH6Fx01TooCLVOoXfw+A20Mmp4vtqLd75qfmVbjceYNAO661lwQk4
 /b4IxYBkAo4SAopeMxw6HBJmvxNxDyhS6ifGTZXLkfwCe388kFUc/M31jmasIP1M/Scm
 ANhpczqotgjdOM8edhi3hZg8snKf07d8H2NTBqQ6+lFXq7v3Y6GfcsB3pJmXgbb1xUAV Ag==
Received: from dc5-exch02.marvell.com ([199.233.59.182])
 by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3nkdyrssx7-9
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT)
 for <dev@dpdk.org>; Tue, 07 Feb 2023 08:07:28 -0800
Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.42;
 Tue, 7 Feb 2023 08:07:26 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.42 via Frontend
 Transport; Tue, 7 Feb 2023 08:07:26 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id 4E64D3F7043;
 Tue,  7 Feb 2023 08:07:26 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
CC: <dev@dpdk.org>, <sshankarnara@marvell.com>, <jerinj@marvell.com>,
 <aprabhu@marvell.com>, <ptakkar@marvell.com>, <pshukla@marvell.com>
Subject: [PATCH v5 15/39] ml/cnxk: add structures for slow and fast path JDs
Date: Tue, 7 Feb 2023 08:06:55 -0800
Message-ID: <20230207160719.1307-16-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207160719.1307-1-syalavarthi@marvell.com>
References: <20221208200220.20267-1-syalavarthi@marvell.com>
 <20230207160719.1307-1-syalavarthi@marvell.com>
MIME-Version: 1.0
X-Proofpoint-ORIG-GUID: jY-_e41J5pYv9A-2mLQK_VihT-IoXofK
X-Proofpoint-GUID: jY-_e41J5pYv9A-2mLQK_VihT-IoXofK
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1
 definitions=2023-02-07_07,2023-02-06_03,2022-06-22_01
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Added JD structures for load, unload and run jobs. Initialize
job command and allocate memory for request structures for slow
path jobs.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 drivers/ml/cnxk/cn10k_ml_dev.h   | 99 ++++++++++++++++++++++++++++++++
 drivers/ml/cnxk/cn10k_ml_model.h |  4 ++
 drivers/ml/cnxk/cn10k_ml_ops.c   | 19 +++++-
 drivers/ml/cnxk/cn10k_ml_ops.h   |  4 ++
 4 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index 02a4496c97..68fcc957fa 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -188,6 +188,105 @@ struct cn10k_ml_jd {
 
 			uint8_t rsvd[8];
 		} fw_load;
+
+		struct cn10k_ml_jd_section_model_start {
+			/* Source model start address in DDR relative to ML_MLR_BASE */
+			uint64_t model_src_ddr_addr;
+
+			/* Destination model start address in DDR relative to ML_MLR_BASE */
+			uint64_t model_dst_ddr_addr;
+
+			/* Offset to model init section in the model */
+			uint64_t model_init_offset : 32;
+
+			/* Size of init section in the model */
+			uint64_t model_init_size : 32;
+
+			/* Offset to model main section in the model */
+			uint64_t model_main_offset : 32;
+
+			/* Size of main section in the model */
+			uint64_t model_main_size : 32;
+
+			/* Offset to model finish section in the model */
+			uint64_t model_finish_offset : 32;
+
+			/* Size of finish section in the model */
+			uint64_t model_finish_size : 32;
+
+			/* Offset to WB in model bin */
+			uint64_t model_wb_offset : 32;
+
+			/* Number of model layers */
+			uint64_t num_layers : 8;
+
+			/* Number of gather entries, 0 means linear input mode (= no gather) */
+			uint64_t num_gather_entries : 8;
+
+			/* Number of scatter entries 0 means linear input mode (= no scatter) */
+			uint64_t num_scatter_entries : 8;
+
+			/* Tile mask to load model */
+			uint64_t tilemask : 8;
+
+			/* Batch size of model  */
+			uint64_t batch_size : 32;
+
+			/* OCM WB base address */
+			uint64_t ocm_wb_base_address : 32;
+
+			/* OCM WB range start */
+			uint64_t ocm_wb_range_start : 32;
+
+			/* OCM WB range End */
+			uint64_t ocm_wb_range_end : 32;
+
+			/* DDR WB address */
+			uint64_t ddr_wb_base_address;
+
+			/* DDR WB range start */
+			uint64_t ddr_wb_range_start : 32;
+
+			/* DDR WB range end */
+			uint64_t ddr_wb_range_end : 32;
+
+			union {
+				/* Points to gather list if num_gather_entries > 0 */
+				void *gather_list;
+				struct {
+					/* Linear input mode */
+					uint64_t ddr_range_start : 32;
+					uint64_t ddr_range_end : 32;
+				} s;
+			} input;
+
+			union {
+				/* Points to scatter list if num_scatter_entries > 0 */
+				void *scatter_list;
+				struct {
+					/* Linear output mode */
+					uint64_t ddr_range_start : 32;
+					uint64_t ddr_range_end : 32;
+				} s;
+			} output;
+		} model_start;
+
+		struct cn10k_ml_jd_section_model_stop {
+			uint8_t rsvd[96];
+		} model_stop;
+
+		struct cn10k_ml_jd_section_model_run {
+			/* Address of the input for the run relative to ML_MLR_BASE */
+			uint64_t input_ddr_addr;
+
+			/* Address of the output for the run relative to ML_MLR_BASE */
+			uint64_t output_ddr_addr;
+
+			/* Number of batches to run in variable batch processing */
+			uint16_t num_batches;
+
+			uint8_t rsvd[78];
+		} model_run;
 	};
 };
 
diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h
index 7893635787..355915deeb 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.h
+++ b/drivers/ml/cnxk/cn10k_ml_model.h
@@ -11,6 +11,7 @@
 
 #include "cn10k_ml_dev.h"
 #include "cn10k_ml_ocm.h"
+#include "cn10k_ml_ops.h"
 
 /* Model state */
 enum cn10k_ml_model_state {
@@ -426,6 +427,9 @@ struct cn10k_ml_model {
 
 	/* State */
 	enum cn10k_ml_model_state state;
+
+	/* Slow-path operations request pointer */
+	struct cn10k_ml_req *req;
 };
 
 int cn10k_ml_model_metadata_check(uint8_t *buffer, uint64_t size);
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 302ce8a452..56adce12ea 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -12,6 +12,10 @@
 /* ML model macros */
 #define CN10K_ML_MODEL_MEMZONE_NAME "ml_cn10k_model_mz"
 
+/* ML Job descriptor flags */
+#define ML_FLAGS_POLL_COMPL BIT(0)
+#define ML_FLAGS_SSO_COMPL  BIT(1)
+
 static void
 qp_memzone_name_get(char *name, int size, int dev_id, int qp_id)
 {
@@ -65,6 +69,7 @@ cn10k_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_des
 	struct cn10k_ml_qp *qp;
 	uint32_t len;
 	uint8_t *va;
+	uint64_t i;
 
 	/* Allocate queue pair */
 	qp = rte_zmalloc_socket("cn10k_ml_pmd_queue_pair", sizeof(struct cn10k_ml_qp), ROC_ALIGN,
@@ -95,6 +100,12 @@ cn10k_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_des
 	qp->queue.wait_cycles = ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
 	qp->nb_desc = nb_desc;
 
+	/* Initialize job command */
+	for (i = 0; i < qp->nb_desc; i++) {
+		memset(&qp->queue.reqs[i].jd, 0, sizeof(struct cn10k_ml_jd));
+		qp->queue.reqs[i].jcmd.w1.s.jobptr = PLT_U64_CAST(&qp->queue.reqs[i].jd);
+	}
+
 	return qp;
 
 qp_free:
@@ -468,7 +479,8 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 			  metadata->finish_model.file_size + metadata->weights_bias.file_size;
 	model_data_size = PLT_ALIGN_CEIL(model_data_size, ML_CN10K_ALIGN_SIZE);
 	mz_size = PLT_ALIGN_CEIL(sizeof(struct cn10k_ml_model), ML_CN10K_ALIGN_SIZE) +
-		  2 * model_data_size;
+		  2 * model_data_size +
+		  PLT_ALIGN_CEIL(sizeof(struct cn10k_ml_req), ML_CN10K_ALIGN_SIZE);
 
 	/* Allocate memzone for model object and model data */
 	snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", CN10K_ML_MODEL_MEMZONE_NAME, idx);
@@ -507,6 +519,11 @@ cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
 	model->model_mem_map.wb_pages = wb_pages;
 	model->model_mem_map.scratch_pages = scratch_pages;
 
+	/* Set slow-path request address and state */
+	model->req = PLT_PTR_ADD(
+		mz->addr, PLT_ALIGN_CEIL(sizeof(struct cn10k_ml_model), ML_CN10K_ALIGN_SIZE) +
+				  2 * model_data_size);
+
 	plt_spinlock_init(&model->lock);
 	model->state = ML_CN10K_MODEL_STATE_LOADED;
 	dev->data->models[idx] = model;
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index d7842ecd73..c86ce66f19 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -6,6 +6,7 @@
 #define _CN10K_ML_OPS_H_
 
 #include <rte_mldev.h>
+#include <rte_mldev_pmd.h>
 
 #include <roc_api.h>
 
@@ -21,6 +22,9 @@ struct cn10k_ml_req {
 
 	/* Status field for poll mode requests */
 	volatile uint64_t status;
+
+	/* Job command */
+	struct ml_job_cmd_s jcmd;
 } __rte_aligned(ROC_ALIGN);
 
 /* Request queue */