From patchwork Sun Jan  7 15:28:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi <syalavarthi@marvell.com>
X-Patchwork-Id: 135778
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path: <dev-bounces@dpdk.org>
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id C5ED243857;
	Sun,  7 Jan 2024 16:28:40 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 9D0E84067D;
	Sun,  7 Jan 2024 16:28:28 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com
 [67.231.156.173])
 by mails.dpdk.org (Postfix) with ESMTP id F176C40649
 for <dev@dpdk.org>; Sun,  7 Jan 2024 16:28:25 +0100 (CET)
Received: from pps.filterd (m0045851.ppops.net [127.0.0.1])
 by mx0b-0016f401.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id
 407Eu6FW026032; Sun, 7 Jan 2024 07:28:22 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding:content-type; s=
 pfpt0220; bh=1Mj7RvzDcJsy9ZmUwcE+hQ8BTqoqQX/boUYiCQc347k=; b=KoK
 GZaPyOumGOH7mKr0dqCCgWyUBraRajzpnlPv/l+wiSFL3Mk8BDUlW2sAjn9x9Ows
 fPE/zLiTzEBLVuqE6FtJoiFIFw/iARizKf14TX+4NxlbHV9NfR0yO51Z81Va0FX9
 2kEUbtwvjgsFZzSR31IX00leRb979/r7RX4UtHn9Hw3RcLQKX7XvzF5xRh4wYNao
 QE1G1Q77sOwd4gLGgJSsQ+rVZmdXtj7b/ckDRiERV8ueQpCdAN5saCzHoHKYs3Ds
 byX5GFeatuaq1uJjO59NZNSROZUN1ZV6sVo/eI8Q3MUnjT4HAF6R+ihZmpNZDYbf
 zppRsdvLrQl995mB+1g==
Received: from dc5-exch02.marvell.com ([199.233.59.182])
 by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3vf78n29ns-2
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT);
 Sun, 07 Jan 2024 07:28:22 -0800 (PST)
Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.48;
 Sun, 7 Jan 2024 07:28:20 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com
 (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.48 via Frontend
 Transport; Sun, 7 Jan 2024 07:28:20 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id F35B13F7093;
 Sun,  7 Jan 2024 07:28:19 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>, Ruifeng Wang
 <ruifeng.wang@arm.com>
CC: <dev@dpdk.org>, <aprabhu@marvell.com>, <sshankarnara@marvell.com>,
 <ptakkar@marvell.com>
Subject: [PATCH 1/3] mldev: add conversion routines for 32-bit integers
Date: Sun, 7 Jan 2024 07:28:10 -0800
Message-ID: <20240107152813.2668-2-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20240107152813.2668-1-syalavarthi@marvell.com>
References: <20240107152813.2668-1-syalavarthi@marvell.com>
MIME-Version: 1.0
X-Proofpoint-ORIG-GUID: MySrgWmDX__d2ww17HJb5TmOngW61R24
X-Proofpoint-GUID: MySrgWmDX__d2ww17HJb5TmOngW61R24
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26
 definitions=2023-12-09_02,2023-12-07_01,2023-05-22_02
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Added routines to convert data from 32-bit integer type to
float32_t and vice-versa.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 lib/mldev/mldev_utils.h        |  92 +++++++++++++
 lib/mldev/mldev_utils_neon.c   | 242 +++++++++++++++++++++++++++++++++
 lib/mldev/mldev_utils_scalar.c |  98 +++++++++++++
 lib/mldev/version.map          |   4 +
 4 files changed, 436 insertions(+)

diff --git a/lib/mldev/mldev_utils.h b/lib/mldev/mldev_utils.h
index 220afb42f0d..1d041531b43 100644
--- a/lib/mldev/mldev_utils.h
+++ b/lib/mldev/mldev_utils.h
@@ -236,6 +236,98 @@ __rte_internal
 int
 rte_ml_io_uint16_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
 
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to signed
+ * 32-bit integer format (INT32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store INT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_int32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in signed 32-bit integer format (INT32) to single precision
+ * floating format (float32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing INT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_int32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to unsigned
+ * 32-bit integer format (UINT32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store UINT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_uint32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in unsigned 32-bit integer format (UINT32) to single
+ * precision floating format (float32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing UINT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
 /**
  * @internal
  *
diff --git a/lib/mldev/mldev_utils_neon.c b/lib/mldev/mldev_utils_neon.c
index c7baec012b8..250fa43fa73 100644
--- a/lib/mldev/mldev_utils_neon.c
+++ b/lib/mldev/mldev_utils_neon.c
@@ -600,6 +600,248 @@ rte_ml_io_uint16_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+static inline void
+__float32_to_int32_neon_s32x4(float scale, float *input, int32_t *output)
+{
+	float32x4_t f32x4;
+	int32x4_t s32x4;
+
+	/* load 4 x float elements */
+	f32x4 = vld1q_f32(input);
+
+	/* scale */
+	f32x4 = vmulq_n_f32(f32x4, scale);
+
+	/* convert to int32x4_t using round to nearest with ties away rounding mode */
+	s32x4 = vcvtaq_s32_f32(f32x4);
+
+	/* store 4 elements */
+	vst1q_s32(output, s32x4);
+}
+
+static inline void
+__float32_to_int32_neon_s32x1(float scale, float *input, int32_t *output)
+{
+	/* scale and convert, round to nearest with ties away rounding mode */
+	*output = vcvtas_s32_f32(scale * (*input));
+}
+
+int
+rte_ml_io_float32_to_int32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	int32_t *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (int32_t *)output;
+	vlen = 2 * sizeof(float) / sizeof(int32_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__float32_to_int32_neon_s32x4(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__float32_to_int32_neon_s32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__int32_to_float32_neon_f32x4(float scale, int32_t *input, float *output)
+{
+	float32x4_t f32x4;
+	int32x4_t s32x4;
+
+	/* load 4 x int32_t elements */
+	s32x4 = vld1q_s32(input);
+
+	/* convert int32_t to float */
+	f32x4 = vcvtq_f32_s32(s32x4);
+
+	/* scale */
+	f32x4 = vmulq_n_f32(f32x4, scale);
+
+	/* store float32x4_t */
+	vst1q_f32(output, f32x4);
+}
+
+static inline void
+__int32_to_float32_neon_f32x1(float scale, int32_t *input, float *output)
+{
+	*output = scale * vcvts_f32_s32(*input);
+}
+
+int
+rte_ml_io_int32_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	int32_t *input_buffer;
+	float *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (int32_t *)input;
+	output_buffer = (float *)output;
+	vlen = 2 * sizeof(float) / sizeof(int32_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__int32_to_float32_neon_f32x4(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__int32_to_float32_neon_f32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__float32_to_uint32_neon_u32x4(float scale, float *input, uint32_t *output)
+{
+	float32x4_t f32x4;
+	uint32x4_t u32x4;
+
+	/* load 4 float elements */
+	f32x4 = vld1q_f32(input);
+
+	/* scale */
+	f32x4 = vmulq_n_f32(f32x4, scale);
+
+	/* convert using round to nearest with ties to away rounding mode */
+	u32x4 = vcvtaq_u32_f32(f32x4);
+
+	/* store 4 elements */
+	vst1q_u32(output, u32x4);
+}
+
+static inline void
+__float32_to_uint32_neon_u32x1(float scale, float *input, uint32_t *output)
+{
+	/* scale and convert, round to nearest with ties away rounding mode */
+	*output = vcvtas_u32_f32(scale * (*input));
+}
+
+int
+rte_ml_io_float32_to_uint32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	uint32_t *output_buffer;
+	uint64_t nb_iterations;
+	uint64_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (uint32_t *)output;
+	vlen = 2 * sizeof(float) / sizeof(uint32_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__float32_to_uint32_neon_u32x4(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__float32_to_uint32_neon_u32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__uint32_to_float32_neon_f32x4(float scale, uint32_t *input, float *output)
+{
+	float32x4_t f32x4;
+	uint32x4_t u32x4;
+
+	/* load 4 x uint32_t elements */
+	u32x4 = vld1q_u32(input);
+
+	/* convert uint32_t to float */
+	f32x4 = vcvtq_f32_u32(u32x4);
+
+	/* scale */
+	f32x4 = vmulq_n_f32(f32x4, scale);
+
+	/* store float32x4_t */
+	vst1q_f32(output, f32x4);
+}
+
+static inline void
+__uint32_to_float32_neon_f32x1(float scale, uint32_t *input, float *output)
+{
+	*output = scale * vcvts_f32_u32(*input);
+}
+
+int
+rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	uint32_t *input_buffer;
+	float *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (uint32_t *)input;
+	output_buffer = (float *)output;
+	vlen = 2 * sizeof(float) / sizeof(uint32_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__uint32_to_float32_neon_f32x4(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__uint32_to_float32_neon_f32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
 static inline void
 __float32_to_float16_neon_f16x4(float32_t *input, float16_t *output)
 {
diff --git a/lib/mldev/mldev_utils_scalar.c b/lib/mldev/mldev_utils_scalar.c
index 4d6cb880240..af1a3a103b2 100644
--- a/lib/mldev/mldev_utils_scalar.c
+++ b/lib/mldev/mldev_utils_scalar.c
@@ -229,6 +229,104 @@ rte_ml_io_uint16_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+int
+rte_ml_io_float32_to_int32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	int32_t *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (int32_t *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = (int32_t)round((*input_buffer) * scale);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_int32_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	int32_t *input_buffer;
+	float *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (int32_t *)input;
+	output_buffer = (float *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = scale * (float)(*input_buffer);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_float32_to_uint32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	uint32_t *output_buffer;
+	int32_t i32;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (uint32_t *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		i32 = (int32_t)round((*input_buffer) * scale);
+
+		if (i32 < 0)
+			i32 = 0;
+
+		*output_buffer = (uint32_t)i32;
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	uint32_t *input_buffer;
+	float *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (uint32_t *)input;
+	output_buffer = (float *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = scale * (float)(*input_buffer);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
 /* Convert a single precision floating point number (float32) into a half precision
  * floating point number (float16) using round to nearest rounding mode.
  */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 99841db6aa9..2e8f1555225 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -57,6 +57,10 @@ INTERNAL {
 	rte_ml_io_int16_to_float32;
 	rte_ml_io_float32_to_uint16;
 	rte_ml_io_uint16_to_float32;
+	rte_ml_io_float32_to_int32;
+	rte_ml_io_int32_to_float32;
+	rte_ml_io_float32_to_uint32;
+	rte_ml_io_uint32_to_float32;
 	rte_ml_io_float32_to_float16;
 	rte_ml_io_float16_to_float32;
 	rte_ml_io_float32_to_bfloat16;

From patchwork Sun Jan  7 15:28:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi <syalavarthi@marvell.com>
X-Patchwork-Id: 135777
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path: <dev-bounces@dpdk.org>
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id BB01043857;
	Sun,  7 Jan 2024 16:28:34 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 6258E402F1;
	Sun,  7 Jan 2024 16:28:27 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com
 [67.231.156.173])
 by mails.dpdk.org (Postfix) with ESMTP id 9015240608
 for <dev@dpdk.org>; Sun,  7 Jan 2024 16:28:25 +0100 (CET)
Received: from pps.filterd (m0045851.ppops.net [127.0.0.1])
 by mx0b-0016f401.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id
 407Et9Wf023658; Sun, 7 Jan 2024 07:28:22 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding:content-type; s=
 pfpt0220; bh=zOEwlv6b+T8O6bWdzno1nkOK2Bt4Zms8Xc1QYDR0TVs=; b=b4q
 KS0MnSiCTwAUtYmPLeYZGsy6OfgMPzzV08C/DnfoZbxCAQN+5e0S5ZvBnN+Urm0W
 efDQw1az9Cz6CIlPEND7Gx4U53xJF3cjOE2yLrvSy8y1ovwVMdYX+r0ZPMTt6KAL
 8xDuR6Yft+VzE2YRJa9G6yjLq77StfSR1QYJ+QKSn/fe3hCPIO5ANAuivxoI50nD
 kRSsHU5xiSOh/HfNf529Tpjkycr08XFhO3uf80wiSfcl6XlI0wrpwtnCoEZSzyql
 Fbvx5FC3+aSksGoObvM4kDDJB0EbYPvIIcnYK1wZ1BoMwvNxuMbhvZHGQiiEfZtT
 XDrollI7C5Mah98xlaA==
Received: from dc5-exch01.marvell.com ([199.233.59.181])
 by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3vf78n29nt-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT);
 Sun, 07 Jan 2024 07:28:22 -0800 (PST)
Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH01.marvell.com
 (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.48;
 Sun, 7 Jan 2024 07:28:20 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.48 via Frontend
 Transport; Sun, 7 Jan 2024 07:28:20 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id 54D8D3F709E;
 Sun,  7 Jan 2024 07:28:20 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>, Ruifeng Wang
 <ruifeng.wang@arm.com>
CC: <dev@dpdk.org>, <aprabhu@marvell.com>, <sshankarnara@marvell.com>,
 <ptakkar@marvell.com>
Subject: [PATCH 2/3] mldev: add support for 64-integer data type
Date: Sun, 7 Jan 2024 07:28:11 -0800
Message-ID: <20240107152813.2668-3-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20240107152813.2668-1-syalavarthi@marvell.com>
References: <20240107152813.2668-1-syalavarthi@marvell.com>
MIME-Version: 1.0
X-Proofpoint-ORIG-GUID: PJz01FH56BpV9nNx_rbpR3vFulNzzail
X-Proofpoint-GUID: PJz01FH56BpV9nNx_rbpR3vFulNzzail
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26
 definitions=2023-12-09_02,2023-12-07_01,2023-05-22_02
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Added support in mldev spec for 64-bit integer types. Added
routines to convert data from 64-bit integer type to float32_t
and vice-versa.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 lib/mldev/mldev_utils.c        |   4 +
 lib/mldev/mldev_utils.h        |  92 ++++++++++
 lib/mldev/mldev_utils_neon.c   | 324 +++++++++++++++++++++++++++++++++
 lib/mldev/mldev_utils_scalar.c |  98 ++++++++++
 lib/mldev/rte_mldev.h          |   4 +
 lib/mldev/version.map          |   4 +
 6 files changed, 526 insertions(+)

diff --git a/lib/mldev/mldev_utils.c b/lib/mldev/mldev_utils.c
index ccd2c39ca89..13ac615e9fc 100644
--- a/lib/mldev/mldev_utils.c
+++ b/lib/mldev/mldev_utils.c
@@ -32,6 +32,10 @@ rte_ml_io_type_size_get(enum rte_ml_io_type type)
 		return sizeof(int32_t);
 	case RTE_ML_IO_TYPE_UINT32:
 		return sizeof(uint32_t);
+	case RTE_ML_IO_TYPE_INT64:
+		return sizeof(int64_t);
+	case RTE_ML_IO_TYPE_UINT64:
+		return sizeof(uint64_t);
 	case RTE_ML_IO_TYPE_FP8:
 		return sizeof(uint8_t);
 	case RTE_ML_IO_TYPE_FP16:
diff --git a/lib/mldev/mldev_utils.h b/lib/mldev/mldev_utils.h
index 1d041531b43..6daae6d0a1c 100644
--- a/lib/mldev/mldev_utils.h
+++ b/lib/mldev/mldev_utils.h
@@ -328,6 +328,98 @@ __rte_internal
 int
 rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
 
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to signed
+ * 64-bit integer format (INT64).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store INT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_int64(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in signed 64-bit integer format (INT64) to single precision
+ * floating format (float32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing INT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_int64_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to unsigned
+ * 64-bit integer format (UINT64).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store UINT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_uint64(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in unsigned 64-bit integer format (UINT64) to single
+ * precision floating format (float32).
+ *
+ * @param[in] scale
+ *      Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing UINT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_uint64_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
 /**
  * @internal
  *
diff --git a/lib/mldev/mldev_utils_neon.c b/lib/mldev/mldev_utils_neon.c
index 250fa43fa73..4cde2ebabd3 100644
--- a/lib/mldev/mldev_utils_neon.c
+++ b/lib/mldev/mldev_utils_neon.c
@@ -842,6 +842,330 @@ rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+static inline void
+__float32_to_int64_neon_s64x2(float scale, float *input, int64_t *output)
+{
+	float32x2_t f32x2;
+	float64x2_t f64x2;
+	int64x2_t s64x2;
+
+	/* load 2 x float elements */
+	f32x2 = vld1_f32(input);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* convert to float64x2_t */
+	f64x2 = vcvt_f64_f32(f32x2);
+
+	/* convert to int64x2_t */
+	s64x2 = vcvtaq_s64_f64(f64x2);
+
+	/* store 2 elements */
+	vst1q_s64(output, s64x2);
+}
+
+static inline void
+__float32_to_int64_neon_s64x1(float scale, float *input, int64_t *output)
+{
+	float32x2_t f32x2;
+	float64x2_t f64x2;
+	int64x2_t s64x2;
+
+	/* load 1 x float element */
+	f32x2 = vdup_n_f32(*input);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* convert to float64x2_t */
+	f64x2 = vcvt_f64_f32(f32x2);
+
+	/* convert to int64x2_t */
+	s64x2 = vcvtaq_s64_f64(f64x2);
+
+	/* store lane 0 of int64x2_t */
+	vst1q_lane_s64(output, s64x2, 0);
+}
+
+int
+rte_ml_io_float32_to_int64(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	int64_t *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (int64_t *)output;
+	vlen = 4 * sizeof(float) / sizeof(int64_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__float32_to_int64_neon_s64x2(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__float32_to_int64_neon_s64x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__int64_to_float32_neon_f32x2(float scale, int64_t *input, float *output)
+{
+	int64x2_t s64x2;
+	float64x2_t f64x2;
+	float32x2_t f32x2;
+
+	/* load 2 x int64_t elements */
+	s64x2 = vld1q_s64(input);
+
+	/* convert int64x2_t to float64x2_t */
+	f64x2 = vcvtq_f64_s64(s64x2);
+
+	/* convert float64x2_t to float32x2_t */
+	f32x2 = vcvt_f32_f64(f64x2);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* store float32x2_t */
+	vst1_f32(output, f32x2);
+}
+
+static inline void
+__int64_to_float32_neon_f32x1(float scale, int64_t *input, float *output)
+{
+	int64x2_t s64x2;
+	float64x2_t f64x2;
+	float32x2_t f32x2;
+
+	/* load 2 x int64_t elements */
+	s64x2 = vld1q_lane_s64(input, vdupq_n_s64(0), 0);
+
+	/* convert int64x2_t to float64x2_t */
+	f64x2 = vcvtq_f64_s64(s64x2);
+
+	/* convert float64x2_t to float32x2_t */
+	f32x2 = vcvt_f32_f64(f64x2);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* store float32x2_t */
+	vst1_lane_f32(output, f32x2, 0);
+}
+
+int
+rte_ml_io_int64_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	int64_t *input_buffer;
+	float *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (int64_t *)input;
+	output_buffer = (float *)output;
+	vlen = 4 * sizeof(float) / sizeof(int64_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__int64_to_float32_neon_f32x2(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__int64_to_float32_neon_f32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__float32_to_uint64_neon_u64x2(float scale, float *input, uint64_t *output)
+{
+	float32x2_t f32x2;
+	float64x2_t f64x2;
+	uint64x2_t u64x2;
+
+	/* load 2 x float elements */
+	f32x2 = vld1_f32(input);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* convert to float64x2_t */
+	f64x2 = vcvt_f64_f32(f32x2);
+
+	/* convert to int64x2_t */
+	u64x2 = vcvtaq_u64_f64(f64x2);
+
+	/* store 2 elements */
+	vst1q_u64(output, u64x2);
+}
+
+static inline void
+__float32_to_uint64_neon_u64x1(float scale, float *input, uint64_t *output)
+{
+	float32x2_t f32x2;
+	float64x2_t f64x2;
+	uint64x2_t u64x2;
+
+	/* load 1 x float element */
+	f32x2 = vld1_lane_f32(input, vdup_n_f32(0), 0);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* convert to float64x2_t */
+	f64x2 = vcvt_f64_f32(f32x2);
+
+	/* convert to int64x2_t */
+	u64x2 = vcvtaq_u64_f64(f64x2);
+
+	/* store 2 elements */
+	vst1q_lane_u64(output, u64x2, 0);
+}
+
+int
+rte_ml_io_float32_to_uint64(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	uint64_t *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (uint64_t *)output;
+	vlen = 4 * sizeof(float) / sizeof(uint64_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__float32_to_uint64_neon_u64x2(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__float32_to_uint64_neon_u64x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+static inline void
+__uint64_to_float32_neon_f32x2(float scale, uint64_t *input, float *output)
+{
+	uint64x2_t u64x2;
+	float64x2_t f64x2;
+	float32x2_t f32x2;
+
+	/* load 2 x int64_t elements */
+	u64x2 = vld1q_u64(input);
+
+	/* convert int64x2_t to float64x2_t */
+	f64x2 = vcvtq_f64_u64(u64x2);
+
+	/* convert float64x2_t to float32x2_t */
+	f32x2 = vcvt_f32_f64(f64x2);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* store float32x2_t */
+	vst1_f32(output, f32x2);
+}
+
+static inline void
+__uint64_to_float32_neon_f32x1(float scale, uint64_t *input, float *output)
+{
+	uint64x2_t u64x2;
+	float64x2_t f64x2;
+	float32x2_t f32x2;
+
+	/* load 2 x int64_t elements */
+	u64x2 = vld1q_lane_u64(input, vdupq_n_u64(0), 0);
+
+	/* convert int64x2_t to float64x2_t */
+	f64x2 = vcvtq_f64_u64(u64x2);
+
+	/* convert float64x2_t to float32x2_t */
+	f32x2 = vcvt_f32_f64(f64x2);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* store float32x2_t */
+	vst1_lane_f32(output, f32x2, 0);
+}
+
+int
+rte_ml_io_uint64_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	uint64_t *input_buffer;
+	float *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (uint64_t *)input;
+	output_buffer = (float *)output;
+	vlen = 4 * sizeof(float) / sizeof(uint64_t);
+	nb_iterations = nb_elements / vlen;
+
+	/* convert vlen elements in each iteration */
+	for (i = 0; i < nb_iterations; i++) {
+		__uint64_to_float32_neon_f32x2(scale, input_buffer, output_buffer);
+		input_buffer += vlen;
+		output_buffer += vlen;
+	}
+
+	/* convert leftover elements */
+	i = i * vlen;
+	for (; i < nb_elements; i++) {
+		__uint64_to_float32_neon_f32x1(scale, input_buffer, output_buffer);
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
 static inline void
 __float32_to_float16_neon_f16x4(float32_t *input, float16_t *output)
 {
diff --git a/lib/mldev/mldev_utils_scalar.c b/lib/mldev/mldev_utils_scalar.c
index af1a3a103b2..63a9900cc8c 100644
--- a/lib/mldev/mldev_utils_scalar.c
+++ b/lib/mldev/mldev_utils_scalar.c
@@ -327,6 +327,104 @@ rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+int
+rte_ml_io_float32_to_int64(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	int64_t *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (int64_t *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = (int64_t)round((*input_buffer) * scale);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_int64_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	int64_t *input_buffer;
+	float *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (int64_t *)input;
+	output_buffer = (float *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = scale * (float)(*input_buffer);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_float32_to_uint64(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	uint64_t *output_buffer;
+	int64_t i64;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (float *)input;
+	output_buffer = (uint64_t *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		i64 = (int64_t)round((*input_buffer) * scale);
+
+		if (i64 < 0)
+			i64 = 0;
+
+		*output_buffer = (uint64_t)i64;
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
+int
+rte_ml_io_uint64_to_float32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	uint64_t *input_buffer;
+	float *output_buffer;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		return -EINVAL;
+
+	input_buffer = (uint64_t *)input;
+	output_buffer = (float *)output;
+
+	for (i = 0; i < nb_elements; i++) {
+		*output_buffer = scale * (float)(*input_buffer);
+
+		input_buffer++;
+		output_buffer++;
+	}
+
+	return 0;
+}
+
 /* Convert a single precision floating point number (float32) into a half precision
  * floating point number (float16) using round to nearest rounding mode.
  */
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
index 5cf6f0566f1..27e372fbcf1 100644
--- a/lib/mldev/rte_mldev.h
+++ b/lib/mldev/rte_mldev.h
@@ -874,6 +874,10 @@ enum rte_ml_io_type {
 	/**< 32-bit integer */
 	RTE_ML_IO_TYPE_UINT32,
 	/**< 32-bit unsigned integer */
+	RTE_ML_IO_TYPE_INT64,
+	/**< 32-bit integer */
+	RTE_ML_IO_TYPE_UINT64,
+	/**< 32-bit unsigned integer */
 	RTE_ML_IO_TYPE_FP8,
 	/**< 8-bit floating point number */
 	RTE_ML_IO_TYPE_FP16,
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 2e8f1555225..1978695314e 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -61,6 +61,10 @@ INTERNAL {
 	rte_ml_io_int32_to_float32;
 	rte_ml_io_float32_to_uint32;
 	rte_ml_io_uint32_to_float32;
+	rte_ml_io_float32_to_int64;
+	rte_ml_io_int64_to_float32;
+	rte_ml_io_float32_to_uint64;
+	rte_ml_io_uint64_to_float32;
 	rte_ml_io_float32_to_float16;
 	rte_ml_io_float16_to_float32;
 	rte_ml_io_float32_to_bfloat16;

From patchwork Sun Jan  7 15:28:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srikanth Yalavarthi <syalavarthi@marvell.com>
X-Patchwork-Id: 135776
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path: <dev-bounces@dpdk.org>
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 44B4243857;
	Sun,  7 Jan 2024 16:28:29 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 44F91402F2;
	Sun,  7 Jan 2024 16:28:25 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com
 [67.231.156.173])
 by mails.dpdk.org (Postfix) with ESMTP id 0834540263
 for <dev@dpdk.org>; Sun,  7 Jan 2024 16:28:23 +0100 (CET)
Received: from pps.filterd (m0045851.ppops.net [127.0.0.1])
 by mx0b-0016f401.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id
 407Eu6FX026032 for <dev@dpdk.org>; Sun, 7 Jan 2024 07:28:23 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding:content-type; s=
 pfpt0220; bh=LDF2mYxDWtGmb32hjuxaH/5IWDyinavc2kVKiHOro/M=; b=J/5
 id1yCczaDNPyDuLX4ctPgWrhz9iCS1M2bh8vMtbRw4HwFpVSclVJiD2dN3WJut5A
 /QMCqDeQxEKnLAtVFzb/UOdEhXz1TEQ++28pFOnhX0eDCMFb/3/rVaw8eYBbxnMg
 k1r/97pxt0y4pBk4nTaK2FdsTZePwvYRXAfgaxA/6LISDYEcuIXs+SeaRdp/vCd0
 e8AXbM+xf0v1f4ks920X/mUp/tuWu2htcbMkEFaLTVLpGpLKfdEmRtnT0SZ3x3/m
 gY/MmTlHhcbaCbjGmKxeEZax6RsVKCMbyU6P/A4ES9U+u8lqVXMf9Cuw5up5qBMT
 U17pXN2obinoiWBOi+Q==
Received: from dc5-exch02.marvell.com ([199.233.59.182])
 by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3vf78n29ns-3
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT)
 for <dev@dpdk.org>; Sun, 07 Jan 2024 07:28:23 -0800 (PST)
Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.48;
 Sun, 7 Jan 2024 07:28:20 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.48 via Frontend
 Transport; Sun, 7 Jan 2024 07:28:20 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id A3D1A3F70A1;
 Sun,  7 Jan 2024 07:28:20 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
CC: <dev@dpdk.org>, <aprabhu@marvell.com>, <sshankarnara@marvell.com>,
 <ptakkar@marvell.com>
Subject: [PATCH 3/3] ml/cnxk: add support for additional integer types
Date: Sun, 7 Jan 2024 07:28:12 -0800
Message-ID: <20240107152813.2668-4-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20240107152813.2668-1-syalavarthi@marvell.com>
References: <20240107152813.2668-1-syalavarthi@marvell.com>
MIME-Version: 1.0
X-Proofpoint-ORIG-GUID: qgIbeJMzvA-wm8ta814XaySNDr5U_MGl
X-Proofpoint-GUID: qgIbeJMzvA-wm8ta814XaySNDr5U_MGl
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26
 definitions=2023-12-09_02,2023-12-07_01,2023-05-22_02
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Added support quantization and dequantization of 32-bit
and 64-bit integer types.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 drivers/ml/cnxk/cnxk_ml_io.c     | 24 ++++++++++++++++++++++++
 drivers/ml/cnxk/mvtvm_ml_model.c |  4 ++++
 2 files changed, 28 insertions(+)

diff --git a/drivers/ml/cnxk/cnxk_ml_io.c b/drivers/ml/cnxk/cnxk_ml_io.c
index c78009ab0cd..4b0adc2ae47 100644
--- a/drivers/ml/cnxk/cnxk_ml_io.c
+++ b/drivers/ml/cnxk/cnxk_ml_io.c
@@ -40,6 +40,18 @@ cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *
 		case RTE_ML_IO_TYPE_UINT16:
 			ret = rte_ml_io_float32_to_uint16(qscale, nb_elements, dbuffer, qbuffer);
 			break;
+		case RTE_ML_IO_TYPE_INT32:
+			ret = rte_ml_io_float32_to_int32(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT32:
+			ret = rte_ml_io_float32_to_uint32(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_INT64:
+			ret = rte_ml_io_float32_to_int64(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT64:
+			ret = rte_ml_io_float32_to_uint64(qscale, nb_elements, dbuffer, qbuffer);
+			break;
 		case RTE_ML_IO_TYPE_FP16:
 			ret = rte_ml_io_float32_to_float16(nb_elements, dbuffer, qbuffer);
 			break;
@@ -82,6 +94,18 @@ cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_
 		case RTE_ML_IO_TYPE_UINT16:
 			ret = rte_ml_io_uint16_to_float32(dscale, nb_elements, qbuffer, dbuffer);
 			break;
+		case RTE_ML_IO_TYPE_INT32:
+			ret = rte_ml_io_int32_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT32:
+			ret = rte_ml_io_uint32_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_INT64:
+			ret = rte_ml_io_int64_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT64:
+			ret = rte_ml_io_uint64_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
 		case RTE_ML_IO_TYPE_FP16:
 			ret = rte_ml_io_float16_to_float32(nb_elements, qbuffer, dbuffer);
 			break;
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index 0dbe08e9889..e3234ae4422 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -150,6 +150,8 @@ mvtvm_ml_io_type_map(DLDataType dltype)
 			return RTE_ML_IO_TYPE_INT16;
 		else if (dltype.bits == 32)
 			return RTE_ML_IO_TYPE_INT32;
+		else if (dltype.bits == 64)
+			return RTE_ML_IO_TYPE_INT64;
 		break;
 	case kDLUInt:
 		if (dltype.bits == 8)
@@ -158,6 +160,8 @@ mvtvm_ml_io_type_map(DLDataType dltype)
 			return RTE_ML_IO_TYPE_UINT16;
 		else if (dltype.bits == 32)
 			return RTE_ML_IO_TYPE_UINT32;
+		else if (dltype.bits == 64)
+			return RTE_ML_IO_TYPE_UINT64;
 		break;
 	case kDLFloat:
 		if (dltype.bits == 8)