From patchwork Wed Oct 18 06:47:40 2023
X-Patchwork-Submitter: Srikanth Yalavarthi
X-Patchwork-Id: 132856
X-Patchwork-Delegate: jerinj@marvell.com
From: Srikanth Yalavarthi
Subject: [PATCH v5 12/34] ml/cnxk: update data quantization functions
Date: Tue, 17 Oct 2023 23:47:40 -0700
Message-ID: <20231018064806.24145-13-syalavarthi@marvell.com>
In-Reply-To: <20231018064806.24145-1-syalavarthi@marvell.com>
References: <20230830155927.3566-1-syalavarthi@marvell.com>
 <20231018064806.24145-1-syalavarthi@marvell.com>

Added cnxk wrapper functions to quantize input data and dequantize
output data.
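The wrappers walk each I/O tensor and, when the user-visible type differs
from the model's quantized type, dispatch to a per-type conversion helper
such as rte_ml_io_float32_to_int8() together with the tensor's scale. As a
rough, self-contained illustration of what that float32-to-int8 step amounts
to, the sketch below scales, rounds and saturates each element;
quantize_f32_to_i8() is a hypothetical name used only for illustration and
is not the DPDK helper's implementation. The dequantize direction applies
the inverse mapping using the tensor's dscale.

/*
 * Minimal sketch of a scale-and-saturate float32 -> int8 quantization step.
 * quantize_f32_to_i8() is a hypothetical stand-in for illustration; it is
 * not the rte_ml_io_float32_to_int8() implementation from mldev_utils.
 */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

static void
quantize_f32_to_i8(float scale, uint32_t nb_elements, const float *in, int8_t *out)
{
	uint32_t i;

	for (i = 0; i < nb_elements; i++) {
		/* Scale, round to nearest, then clamp to the int8 range. */
		float v = roundf(in[i] * scale);

		if (v < (float)INT8_MIN)
			v = (float)INT8_MIN;
		else if (v > (float)INT8_MAX)
			v = (float)INT8_MAX;
		out[i] = (int8_t)v;
	}
}

int
main(void)
{
	const float in[4] = {-1.0f, -0.5f, 0.25f, 1.0f};
	int8_t out[4];
	uint32_t i;

	quantize_f32_to_i8(127.0f, 4, in, out);
	for (i = 0; i < 4; i++)
		printf("%+.2f -> %d\n", in[i], out[i]);

	return 0;
}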
Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 164 --------------------------------- drivers/ml/cnxk/cn10k_ml_ops.h | 7 -- drivers/ml/cnxk/cnxk_ml_io.c | 95 +++++++++++++++++++ drivers/ml/cnxk/cnxk_ml_io.h | 3 + drivers/ml/cnxk/cnxk_ml_ops.c | 78 +++++++++++++++- drivers/ml/cnxk/meson.build | 1 + 6 files changed, 175 insertions(+), 173 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_io.c diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index c0d6216485..ff190b7f86 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -1856,170 +1856,6 @@ cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_mode return 0; } -int -cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **dbuffer, - struct rte_ml_buff_seg **qbuffer) -{ - struct cnxk_ml_model *model; - uint8_t model_input_type; - uint8_t *lcl_dbuffer; - uint8_t *lcl_qbuffer; - uint8_t input_type; - float qscale; - uint32_t i; - uint32_t j; - int ret; - - model = dev->data->models[model_id]; - - if (model == NULL) { - plt_err("Invalid model_id = %u", model_id); - return -EINVAL; - } - - lcl_dbuffer = dbuffer[0]->addr; - lcl_qbuffer = qbuffer[0]->addr; - - for (i = 0; i < model->layer[0].glow.metadata.model.num_input; i++) { - if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) { - input_type = model->layer[0].glow.metadata.input1[i].input_type; - model_input_type = model->layer[0].glow.metadata.input1[i].model_input_type; - qscale = model->layer[0].glow.metadata.input1[i].qscale; - } else { - j = i - MRVL_ML_NUM_INPUT_OUTPUT_1; - input_type = model->layer[0].glow.metadata.input2[j].input_type; - model_input_type = model->layer[0].glow.metadata.input2[j].model_input_type; - qscale = model->layer[0].glow.metadata.input2[j].qscale; - } - - if (input_type == model_input_type) { - rte_memcpy(lcl_qbuffer, lcl_dbuffer, model->layer[0].info.input[i].sz_d); - } else { - switch (model->layer[0].glow.metadata.input1[i].model_input_type) { - case RTE_ML_IO_TYPE_INT8: - ret = rte_ml_io_float32_to_int8( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_UINT8: - ret = rte_ml_io_float32_to_uint8( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_INT16: - ret = rte_ml_io_float32_to_int16( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_UINT16: - ret = rte_ml_io_float32_to_uint16( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_FP16: - ret = rte_ml_io_float32_to_float16( - model->layer[0].info.input[i].nb_elements, lcl_dbuffer, - lcl_qbuffer); - break; - default: - plt_err("Unsupported model_input_type[%u] : %u", i, - model->layer[0].glow.metadata.input1[i].model_input_type); - ret = -ENOTSUP; - } - if (ret < 0) - return ret; - } - - lcl_dbuffer += model->layer[0].info.input[i].sz_d; - lcl_qbuffer += model->layer[0].info.input[i].sz_q; - } - - return 0; -} - -int -cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **qbuffer, - struct rte_ml_buff_seg **dbuffer) -{ - struct cnxk_ml_model *model; - uint8_t model_output_type; - uint8_t *lcl_qbuffer; - uint8_t *lcl_dbuffer; - uint8_t output_type; - float dscale; - uint32_t i; - uint32_t j; - int ret; - - model = dev->data->models[model_id]; - - if (model == NULL) { - plt_err("Invalid model_id = %u", 
model_id); - return -EINVAL; - } - - lcl_dbuffer = dbuffer[0]->addr; - lcl_qbuffer = qbuffer[0]->addr; - - for (i = 0; i < model->layer[0].glow.metadata.model.num_output; i++) { - if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) { - output_type = model->layer[0].glow.metadata.output1[i].output_type; - model_output_type = - model->layer[0].glow.metadata.output1[i].model_output_type; - dscale = model->layer[0].glow.metadata.output1[i].dscale; - } else { - j = i - MRVL_ML_NUM_INPUT_OUTPUT_1; - output_type = model->layer[0].glow.metadata.output2[j].output_type; - model_output_type = - model->layer[0].glow.metadata.output2[j].model_output_type; - dscale = model->layer[0].glow.metadata.output2[j].dscale; - } - - if (output_type == model_output_type) { - rte_memcpy(lcl_dbuffer, lcl_qbuffer, model->layer[0].info.output[i].sz_q); - } else { - switch (model->layer[0].glow.metadata.output1[i].model_output_type) { - case RTE_ML_IO_TYPE_INT8: - ret = rte_ml_io_int8_to_float32( - dscale, model->layer[0].info.output[i].nb_elements, - lcl_qbuffer, lcl_dbuffer); - break; - case RTE_ML_IO_TYPE_UINT8: - ret = rte_ml_io_uint8_to_float32( - dscale, model->layer[0].info.output[i].nb_elements, - lcl_qbuffer, lcl_dbuffer); - break; - case RTE_ML_IO_TYPE_INT16: - ret = rte_ml_io_int16_to_float32( - dscale, model->layer[0].info.output[i].nb_elements, - lcl_qbuffer, lcl_dbuffer); - break; - case RTE_ML_IO_TYPE_UINT16: - ret = rte_ml_io_uint16_to_float32( - dscale, model->layer[0].info.output[i].nb_elements, - lcl_qbuffer, lcl_dbuffer); - break; - case RTE_ML_IO_TYPE_FP16: - ret = rte_ml_io_float16_to_float32( - model->layer[0].info.output[i].nb_elements, lcl_qbuffer, - lcl_dbuffer); - break; - default: - plt_err("Unsupported model_output_type[%u] : %u", i, - model->layer[0].glow.metadata.output1[i].model_output_type); - ret = -ENOTSUP; - } - if (ret < 0) - return ret; - } - - lcl_qbuffer += model->layer[0].info.output[i].sz_q; - lcl_dbuffer += model->layer[0].info.output[i].sz_d; - } - - return 0; -} - static __rte_always_inline void queue_index_advance(uint64_t *index, uint64_t nb_desc) { diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h index ef12069f0d..780e2a9f9c 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.h +++ b/drivers/ml/cnxk/cn10k_ml_ops.h @@ -320,13 +320,6 @@ int cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *mo int cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, void *buffer); -/* I/O ops */ -int cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, - struct rte_ml_buff_seg **dbuffer, struct rte_ml_buff_seg **qbuffer); - -int cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, - struct rte_ml_buff_seg **qbuffer, struct rte_ml_buff_seg **dbuffer); - /* Fast-path ops */ __rte_hot uint16_t cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops); diff --git a/drivers/ml/cnxk/cnxk_ml_io.c b/drivers/ml/cnxk/cnxk_ml_io.c new file mode 100644 index 0000000000..c78009ab0c --- /dev/null +++ b/drivers/ml/cnxk/cnxk_ml_io.c @@ -0,0 +1,95 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2023 Marvell. 
+ */ + +#include + +#include + +#include + +#include "cnxk_ml_io.h" + +inline int +cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *qbuffer) +{ + enum rte_ml_io_type qtype; + enum rte_ml_io_type dtype; + uint32_t nb_elements; + float qscale; + int ret = 0; + + dtype = input->dtype; + qtype = input->qtype; + qscale = input->scale; + nb_elements = input->nb_elements; + + if (dtype == qtype) { + rte_memcpy(qbuffer, dbuffer, input->sz_d); + } else { + switch (qtype) { + case RTE_ML_IO_TYPE_INT8: + ret = rte_ml_io_float32_to_int8(qscale, nb_elements, dbuffer, qbuffer); + break; + case RTE_ML_IO_TYPE_UINT8: + ret = rte_ml_io_float32_to_uint8(qscale, nb_elements, dbuffer, qbuffer); + break; + case RTE_ML_IO_TYPE_INT16: + ret = rte_ml_io_float32_to_int16(qscale, nb_elements, dbuffer, qbuffer); + break; + case RTE_ML_IO_TYPE_UINT16: + ret = rte_ml_io_float32_to_uint16(qscale, nb_elements, dbuffer, qbuffer); + break; + case RTE_ML_IO_TYPE_FP16: + ret = rte_ml_io_float32_to_float16(nb_elements, dbuffer, qbuffer); + break; + default: + plt_err("Unsupported qtype : %u", qtype); + ret = -ENOTSUP; + } + } + + return ret; +} + +inline int +cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_t *dbuffer) +{ + enum rte_ml_io_type qtype; + enum rte_ml_io_type dtype; + uint32_t nb_elements; + float dscale; + int ret = 0; + + dtype = output->dtype; + qtype = output->qtype; + dscale = output->scale; + nb_elements = output->nb_elements; + + if (dtype == qtype) { + rte_memcpy(dbuffer, qbuffer, output->sz_q); + } else { + switch (qtype) { + case RTE_ML_IO_TYPE_INT8: + ret = rte_ml_io_int8_to_float32(dscale, nb_elements, qbuffer, dbuffer); + break; + case RTE_ML_IO_TYPE_UINT8: + ret = rte_ml_io_uint8_to_float32(dscale, nb_elements, qbuffer, dbuffer); + break; + case RTE_ML_IO_TYPE_INT16: + ret = rte_ml_io_int16_to_float32(dscale, nb_elements, qbuffer, dbuffer); + break; + case RTE_ML_IO_TYPE_UINT16: + ret = rte_ml_io_uint16_to_float32(dscale, nb_elements, qbuffer, dbuffer); + break; + case RTE_ML_IO_TYPE_FP16: + ret = rte_ml_io_float16_to_float32(nb_elements, qbuffer, dbuffer); + break; + default: + plt_err("Unsupported qtype: %u", qtype); + ret = -ENOTSUP; + } + } + + return ret; +} diff --git a/drivers/ml/cnxk/cnxk_ml_io.h b/drivers/ml/cnxk/cnxk_ml_io.h index 29ec7ec511..5de166c252 100644 --- a/drivers/ml/cnxk/cnxk_ml_io.h +++ b/drivers/ml/cnxk/cnxk_ml_io.h @@ -76,4 +76,7 @@ struct cnxk_ml_io_info { uint32_t total_output_sz_d; }; +int cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *qbuffer); +int cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_t *dbuffer); + #endif /* _CNXK_ML_IO_H_ */ diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index 9ce37fcfd1..63842025fc 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -5,6 +5,8 @@ #include #include +#include + #include "cnxk_ml_dev.h" #include "cnxk_ml_io.h" #include "cnxk_ml_model.h" @@ -648,6 +650,78 @@ cnxk_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buf return cn10k_ml_model_params_update(cnxk_mldev, model, buffer); } +static int +cnxk_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **dbuffer, + struct rte_ml_buff_seg **qbuffer) +{ + struct cnxk_ml_io_info *info = NULL; + struct cnxk_ml_model *model; + uint8_t *lcl_dbuffer; + uint8_t *lcl_qbuffer; + uint32_t i; + int ret; + + if ((dev == NULL) || (dbuffer == NULL) || (qbuffer == 
NULL)) + return -EINVAL; + + model = dev->data->models[model_id]; + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } + + info = &model->layer[0].info; + + lcl_dbuffer = dbuffer[0]->addr; + lcl_qbuffer = qbuffer[0]->addr; + for (i = 0; i < info->nb_inputs; i++) { + ret = cnxk_ml_io_quantize_single(&info->input[i], lcl_dbuffer, lcl_qbuffer); + if (ret < 0) + return ret; + + lcl_dbuffer += info->input[i].sz_d; + lcl_qbuffer += info->input[i].sz_q; + } + + return 0; +} + +static int +cnxk_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **qbuffer, + struct rte_ml_buff_seg **dbuffer) +{ + struct cnxk_ml_io_info *info = NULL; + struct cnxk_ml_model *model; + uint8_t *lcl_qbuffer; + uint8_t *lcl_dbuffer; + uint32_t i; + int ret; + + if ((dev == NULL) || (qbuffer == NULL) || (dbuffer == NULL)) + return -EINVAL; + + model = dev->data->models[model_id]; + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } + + info = &model->layer[model->nb_layers - 1].info; + + lcl_qbuffer = qbuffer[0]->addr; + lcl_dbuffer = dbuffer[0]->addr; + for (i = 0; i < info->nb_outputs; i++) { + ret = cnxk_ml_io_dequantize_single(&info->output[i], lcl_qbuffer, lcl_dbuffer); + if (ret < 0) + return ret; + + lcl_qbuffer += info->output[i].sz_q; + lcl_dbuffer += info->output[i].sz_d; + } + + return 0; +} + struct rte_ml_dev_ops cnxk_ml_ops = { /* Device control ops */ .dev_info_get = cnxk_ml_dev_info_get, @@ -679,6 +753,6 @@ struct rte_ml_dev_ops cnxk_ml_ops = { .model_params_update = cnxk_ml_model_params_update, /* I/O ops */ - .io_quantize = cn10k_ml_io_quantize, - .io_dequantize = cn10k_ml_io_dequantize, + .io_quantize = cnxk_ml_io_quantize, + .io_dequantize = cnxk_ml_io_dequantize, }; diff --git a/drivers/ml/cnxk/meson.build b/drivers/ml/cnxk/meson.build index 6385ac4548..9cc4ddec70 100644 --- a/drivers/ml/cnxk/meson.build +++ b/drivers/ml/cnxk/meson.build @@ -25,6 +25,7 @@ sources = files( 'cn10k_ml_model.c', 'cn10k_ml_ocm.c', 'cnxk_ml_dev.c', + 'cnxk_ml_io.c', 'cnxk_ml_model.c', 'cnxk_ml_ops.c', )
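
For context on the control flow the new cnxk_ml_io_quantize() wrapper
introduces (iterate over the model's input tensors, convert each one, then
advance the dequantized and quantized cursors by sz_d and sz_q), a
simplified standalone model of that walk is sketched below; the dequantize
wrapper does the same over the last layer's outputs in the opposite
direction. struct tensor_desc and the helper names are illustrative
assumptions, not the driver's cnxk_ml_io structures or functions.

/*
 * Simplified model of the per-tensor walk performed by the quantize wrapper:
 * convert each input, then advance the dequantized/quantized cursors by the
 * per-tensor sizes. struct tensor_desc and quantize_one() are illustrative
 * mock-ups only, not the driver's cnxk_ml_io or cnxk_ml_io_quantize_single().
 */
#include <stdint.h>
#include <string.h>

struct tensor_desc {
	uint32_t sz_d; /* dequantized (user format) size in bytes */
	uint32_t sz_q; /* quantized (model format) size in bytes */
};

/* Placeholder conversion: copy-through, as in the matching-type case where
 * sz_d == sz_q; a real implementation dispatches on the tensor type.
 */
static int
quantize_one(const struct tensor_desc *t, const uint8_t *dbuf, uint8_t *qbuf)
{
	memcpy(qbuf, dbuf, t->sz_q);
	return 0;
}

static int
quantize_all(const struct tensor_desc *t, uint32_t nb_inputs,
	     const uint8_t *dbuf, uint8_t *qbuf)
{
	uint32_t i;
	int ret;

	for (i = 0; i < nb_inputs; i++) {
		ret = quantize_one(&t[i], dbuf, qbuf);
		if (ret < 0)
			return ret;

		/* Dense, packed layout: advance both cursors per tensor. */
		dbuf += t[i].sz_d;
		qbuf += t[i].sz_q;
	}

	return 0;
}

int
main(void)
{
	const struct tensor_desc desc[2] = {{.sz_d = 8, .sz_q = 8}, {.sz_d = 4, .sz_q = 4}};
	uint8_t dbuf[12] = {0};
	uint8_t qbuf[12];

	return quantize_all(desc, 2, dbuf, qbuf);
}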