[dpdk-dev,v2] baseband/turbo_sw: update Turbo Software driver

Message ID: 20180417144359.29232-1-kamilx.chalupnik@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Pablo de Lara Guarch

Checks

Context                 Check     Description
ci/checkpatch           success   coding style OK
ci/Intel-compilation    fail      apply patch file failure

Commit Message

Kamil Chalupnik April 17, 2018, 2:43 p.m. UTC
Update Turbo Software driver for Wireless Baseband Device:
- support for optional CRC overlap in decode processing implemented
- function scaling input LLR values to specific range [-16, 16] added
- sizes of the internal buffers used by decoding were increased due to
  problem with memory for large test vectors
- new test vectors to check device capabilities added

Signed-off-by: KamilX Chalupnik <kamilx.chalupnik@intel.com>

v2:
- logging macros fixed

---
 app/test-bbdev/Makefile                            |  2 +
 app/test-bbdev/test_bbdev_perf.c                   | 44 ++++++++++++++-
 app/test-bbdev/test_bbdev_vector.c                 |  2 +
 .../test_vectors/turbo_enc_c1_k40_r0_e1190_rm.data | 36 ++++++++++++
 .../test_vectors/turbo_enc_c1_k40_r0_e1194_rm.data | 36 ++++++++++++
 .../test_vectors/turbo_enc_c1_k40_r0_e1196_rm.data | 36 ++++++++++++
 .../test_vectors/turbo_enc_c1_k40_r0_e272_rm.data  | 33 +++++++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 64 +++++++++++++++-------
 lib/librte_bbdev/rte_bbdev_op.h                    | 10 +++-
 9 files changed, 241 insertions(+), 22 deletions(-)
 create mode 100644 app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1190_rm.data
 create mode 100644 app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1194_rm.data
 create mode 100644 app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1196_rm.data
 create mode 100644 app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e272_rm.data
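
As a rough illustration of the LLR scaling item in the commit message, the sketch below rescales int8_t LLRs from the full [-INT8_MAX, INT8_MAX] range into [-max_llr_modulus, max_llr_modulus]. The helper name scale_llr_range is hypothetical. The rounding here is done in integer arithmetic rather than with the round() call used in the patch, so the sketch builds without libm. It is a sketch under those assumptions, not the code the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): scale each LLR so that
 * INT8_MAX maps to max_llr_modulus, rounding to the nearest integer.
 */
static void
scale_llr_range(int8_t *llr, size_t len, int8_t max_llr_modulus)
{
	size_t i;

	assert(max_llr_modulus > 0);

	for (i = 0; i < len; ++i) {
		int num = (int)max_llr_modulus * llr[i];

		/* Round to nearest; an exact half cannot occur because
		 * INT8_MAX (127) is odd.
		 */
		if (num >= 0)
			llr[i] = (int8_t)((num + INT8_MAX / 2) / INT8_MAX);
		else
			llr[i] = (int8_t)(-((-num + INT8_MAX / 2) / INT8_MAX));
	}
}
```

With max_llr_modulus set to 16, as the driver advertises in this patch, an input of 127 maps to 16 and an input of 64 maps to 8.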

Comments

Mokhtar, Amr April 17, 2018, 4:56 p.m. UTC | #1
> -----Original Message-----
> From: Chalupnik, KamilX
> Sent: Tuesday 17 April 2018 15:44
> To: dev@dpdk.org
> Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> <kamilx.chalupnik@intel.com>
> Subject: [PATCH v2] baseband/turbo_sw: update Turbo Software driver
> 
> Update Turbo Software driver for Wireless Baseband Device:
> - support for optional CRC overlap in decode processing implemented
> - function scaling input LLR values to specific range [-16, 16] added
> - sizes of the internal buffers used by decoding were increased due to
>   problem with memory for large test vectors
> - new test vectors to check device capabilities added
> 
> Signed-off-by: KamilX Chalupnik <kamilx.chalupnik@intel.com>
> 
> v2:
> - logging macros fixed
> 
> ---

There is a dependency between the baseband/turbo_sw patches. They should be applied in this order:

1. baseband/turbo_sw: splitting Queue Groups
2. baseband/turbo_sw: offload cost measurement test
3. baseband/turbo_sw: optimization of turbo software driver
4. baseband/turbo_sw: update Turbo Software driver

Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>

De Lara Guarch, Pablo April 19, 2018, 2:31 p.m. UTC | #2
Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Mokhtar, Amr
> Sent: Tuesday, April 17, 2018 5:56 PM
> To: Chalupnik, KamilX <kamilx.chalupnik@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo Software
> driver
> 
> 
> > -----Original Message-----
> > From: Chalupnik, KamilX
> > Sent: Tuesday 17 April 2018 15:44
> > To: dev@dpdk.org
> > Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> > <kamilx.chalupnik@intel.com>
> > Subject: [PATCH v2] baseband/turbo_sw: update Turbo Software driver
> >
> > Update Turbo Software driver for Wireless Baseband Device:
> > - support for optional CRC overlap in decode processing implemented
> > - function scaling input LLR values to specific range [-16, 16] added
> > - sizes of the internal buffers used by decoding were increased due to
> >   problem with memory for large test vectors
> > - new test vectors to check device capabilities added
> >
> > Signed-off-by: KamilX Chalupnik <kamilx.chalupnik@intel.com>
> >
> > v2:
> > - logging macros fixed
> >
> > ---
> 
> There is a dependency between the baseband/turbo_sw patches. They should be
> applied in this order:
> 
> 1. baseband/turbo_sw: splitting Queue Groups
> 2. baseband/turbo_sw: offload cost measurement test
> 3. baseband/turbo_sw: optimization of turbo software driver
> 4. baseband/turbo_sw: update Turbo Software driver
> 
> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>

Next time, send a patchset so that the order is clear.

Thanks,
Pablo

De Lara Guarch, Pablo April 24, 2018, 5:55 p.m. UTC | #3
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of KamilX Chalupnik
> Sent: Tuesday, April 17, 2018 3:44 PM
> To: dev@dpdk.org
> Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> <kamilx.chalupnik@intel.com>
> Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo Software
> driver
> 
> Update Turbo Software driver for Wireless Baseband Device:
> - support for optional CRC overlap in decode processing implemented
> - function scaling input LLR values to specific range [-16, 16] added
> - sizes of the internal buffers used by decoding were increased due to
>   problem with memory for large test vectors
> - new test vectors to check device capabilities added
> 

Split this patch into multiple patches, each one covering a single item from the list above.
Again, make sure that the tree compiles and remains functional after each patch in the series.

Thanks,
Pablo

Mokhtar, Amr April 24, 2018, 6:53 p.m. UTC | #4
> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Tuesday 24 April 2018 18:56
> To: Chalupnik, KamilX <kamilx.chalupnik@intel.com>; dev@dpdk.org
> Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> <kamilx.chalupnik@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> Software driver
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of KamilX Chalupnik
> > Sent: Tuesday, April 17, 2018 3:44 PM
> > To: dev@dpdk.org
> > Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> > <kamilx.chalupnik@intel.com>
> > Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> Software
> > driver
> >
> > Update Turbo Software driver for Wireless Baseband Device:
> > - support for optional CRC overlap in decode processing implemented
> > - function scaling input LLR values to specific range [-16, 16] added
> > - sizes of the internal buffers used by decoding were increased due to
> >   problem with memory for large test vectors
> > - new test vectors to check device capabilities added
> >
> 
> Split this patch into multiple patches, each one covering a single item from
> the list above.
> Again, make sure that the tree compiles and remains functional after each
> patch in the series.
> 

Too many splits is a bit of an overkill.
All the above changes are enhancements of Turbo coding operations.
They all fall under one common topic and appear to be fine to keep
combined in one patch.
The new test vectors are related to the added enhancements.

> Thanks,
> Pablo

De Lara Guarch, Pablo April 25, 2018, 7:37 a.m. UTC | #5
Hi Amr,

> -----Original Message-----
> From: Mokhtar, Amr
> Sent: Tuesday, April 24, 2018 7:53 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Chalupnik, KamilX
> <kamilx.chalupnik@intel.com>; dev@dpdk.org
> Cc: Chalupnik, KamilX <kamilx.chalupnik@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo Software
> driver
> 
> 
> > -----Original Message-----
> > From: De Lara Guarch, Pablo
> > Sent: Tuesday 24 April 2018 18:56
> > To: Chalupnik, KamilX <kamilx.chalupnik@intel.com>; dev@dpdk.org
> > Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> > <kamilx.chalupnik@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> > Software driver
> >
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of KamilX
> > > Chalupnik
> > > Sent: Tuesday, April 17, 2018 3:44 PM
> > > To: dev@dpdk.org
> > > Cc: Mokhtar, Amr <amr.mokhtar@intel.com>; Chalupnik, KamilX
> > > <kamilx.chalupnik@intel.com>
> > > Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> > Software
> > > driver
> > >
> > > Update Turbo Software driver for Wireless Baseband Device:
> > > - support for optional CRC overlap in decode processing implemented
> > > - function scaling input LLR values to specific range [-16, 16]
> > > added
> > > - sizes of the internal buffers used by decoding were increased due to
> > >   problem with memory for large test vectors
> > > - new test vectors to check device capabilities added
> > >
> >
> > Split this patch into multiple patches, each one covering a single item
> > from the list above.
> > Again, make sure that the tree compiles and remains functional after
> > each patch in the series.
> >
> 
> Too many splits is a bit of an overkill.
> All the above changes are enhancements of Turbo coding operations.
> They all fall under one common topic and appear to be fine to keep
> combined in one patch.
> The new test vectors are related to the added enhancements.

I understand that they fall under the same topic, that's why you should send them
in the same patchset. In DPDK, we aim for shorter patches (when possible),
which are easier to review. We tend to avoid patches that make multiple changes when they
can be split (generally, when you have a list of changes in your commit message,
that means they should go into separate patches).

Thanks,
Pablo

Patch

diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile
index 9aedd77..6da0c8e 100644
--- a/app/test-bbdev/Makefile
+++ b/app/test-bbdev/Makefile
@@ -20,4 +20,6 @@  SRCS-$(CONFIG_RTE_TEST_BBDEV) += test_bbdev.c
 SRCS-$(CONFIG_RTE_TEST_BBDEV) += test_bbdev_perf.c
 SRCS-$(CONFIG_RTE_TEST_BBDEV) += test_bbdev_vector.c
 
+LDLIBS += -lm
+
 include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index be2e20c..812787c 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4,6 +4,7 @@ 
 
 #include <stdio.h>
 #include <inttypes.h>
+#include <math.h>
 
 #include <rte_eal.h>
 #include <rte_common.h>
@@ -631,10 +632,32 @@  allocate_buffers_on_socket(struct rte_bbdev_op_data **buffers, const int len,
 	return (*buffers == NULL) ? TEST_FAILED : TEST_SUCCESS;
 }
 
+static void
+limit_input_llr_val_range(struct rte_bbdev_op_data *input_ops,
+		const uint16_t n, const int8_t max_llr_modulus)
+{
+	uint16_t i, byte_idx;
+
+	for (i = 0; i < n; ++i) {
+		struct rte_mbuf *m = input_ops[i].data;
+		while (m != NULL) {
+			int8_t *llr = rte_pktmbuf_mtod_offset(m, int8_t *,
+					input_ops[i].offset);
+			for (byte_idx = 0; byte_idx < input_ops[i].length;
+					++byte_idx)
+				llr[byte_idx] = round((double)max_llr_modulus *
+						llr[byte_idx] / INT8_MAX);
+
+			m = m->next;
+		}
+	}
+}
+
 static int
 fill_queue_buffers(struct test_op_params *op_params,
 		struct rte_mempool *in_mp, struct rte_mempool *hard_out_mp,
 		struct rte_mempool *soft_out_mp, uint16_t queue_id,
+		const struct rte_bbdev_op_cap *capabilities,
 		uint16_t min_alignment, const int socket_id)
 {
 	int ret;
@@ -671,6 +694,10 @@  fill_queue_buffers(struct test_op_params *op_params,
 				"Couldn't init rte_bbdev_op_data structs");
 	}
 
+	if (test_vector.op_type == RTE_BBDEV_OP_TURBO_DEC)
+		limit_input_llr_val_range(*queue_ops[DATA_INPUT], n,
+			capabilities->cap.turbo_dec.max_llr_modulus);
+
 	return 0;
 }
 
@@ -1017,6 +1044,7 @@  run_test_case_on_device(test_case_function *test_case_func, uint8_t dev_id,
 	struct active_device *ad;
 	unsigned int burst_sz = get_burst_sz();
 	enum rte_bbdev_op_type op_type = test_vector.op_type;
+	const struct rte_bbdev_op_cap *capabilities = NULL;
 
 	ad = &active_devs[dev_id];
 
@@ -1049,9 +1077,21 @@  run_test_case_on_device(test_case_function *test_case_func, uint8_t dev_id,
 		goto fail;
 	}
 
-	if (test_vector.op_type == RTE_BBDEV_OP_TURBO_DEC)
+	if (test_vector.op_type == RTE_BBDEV_OP_TURBO_DEC) {
+		/* Find Decoder capabilities */
+		const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
+		while (cap->type != RTE_BBDEV_OP_NONE) {
+			if (cap->type == RTE_BBDEV_OP_TURBO_DEC) {
+				capabilities = cap;
+				break;
+			}
+			cap++;
+		}
+		TEST_ASSERT_NOT_NULL(capabilities,
+				"Couldn't find Decoder capabilities");
+
 		create_reference_dec_op(op_params->ref_dec_op);
-	else if (test_vector.op_type == RTE_BBDEV_OP_TURBO_ENC)
+	} else if (test_vector.op_type == RTE_BBDEV_OP_TURBO_ENC)
 		create_reference_enc_op(op_params->ref_enc_op);
 
 	for (i = 0; i < ad->nb_queues; ++i) {
@@ -1060,6 +1099,7 @@  run_test_case_on_device(test_case_function *test_case_func, uint8_t dev_id,
 				ad->hard_out_mbuf_pool,
 				ad->soft_out_mbuf_pool,
 				ad->queue_ids[i],
+				capabilities,
 				info.drv.min_alignment,
 				socket_id);
 		if (f_ret != TEST_SUCCESS) {
diff --git a/app/test-bbdev/test_bbdev_vector.c b/app/test-bbdev/test_bbdev_vector.c
index addef05..a37e35f 100644
--- a/app/test-bbdev/test_bbdev_vector.c
+++ b/app/test-bbdev/test_bbdev_vector.c
@@ -144,6 +144,8 @@  op_decoder_flag_strtoul(char *token, uint32_t *op_flag_value)
 		*op_flag_value = RTE_BBDEV_TURBO_MAP_DEC;
 	else if (!strcmp(token, "RTE_BBDEV_TURBO_DEC_SCATTER_GATHER"))
 		*op_flag_value = RTE_BBDEV_TURBO_DEC_SCATTER_GATHER;
+	else if (!strcmp(token, "RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP"))
+		*op_flag_value = RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP;
 	else {
 		printf("The given value is not a turbo decoder flag\n");
 		return -1;
diff --git a/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1190_rm.data b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1190_rm.data
new file mode 100644
index 0000000..6221756
--- /dev/null
+++ b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1190_rm.data
@@ -0,0 +1,36 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+
+op_type =
+RTE_BBDEV_OP_TURBO_ENC
+
+input0 =
+0x11D2BCAC, 0x4D
+
+output0 =
+0xD2399179, 0x640EB999, 0x2CBAF577, 0xAF224AE2, 0x9D139927, 0xE6909B29, 0xA25B7F47, 0x2AA224CE,
+0x399179F2, 0x0EB999D2, 0xBAF57764, 0x224AE22C, 0x139927AF, 0x909B299D, 0x5B7F47E6, 0xA224CEA2,
+0x9179F22A, 0xB999D239, 0xF577640E, 0x4AE22CBA, 0x9927AF22, 0x9B299D13, 0x7F47E690, 0x24CEA25B,
+0x79F22AA2, 0x99D23991, 0x77640EB9, 0xE22CBAF5, 0x27AF224A, 0x299D1399, 0x47E6909B, 0xCEA25B7F,
+0xF22AA224, 0xD2399179, 0x640EB999, 0x2CBAF577, 0xAF224AE2, 0x24
+
+e =
+1190
+
+k =
+40
+
+ncb =
+192
+
+rv_index =
+0
+
+code_block_mode =
+1
+
+op_flags =
+RTE_BBDEV_TURBO_RATE_MATCH
+
+expected_status =
+OK
diff --git a/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1194_rm.data b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1194_rm.data
new file mode 100644
index 0000000..c569abd
--- /dev/null
+++ b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1194_rm.data
@@ -0,0 +1,36 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+
+op_type =
+RTE_BBDEV_OP_TURBO_ENC
+
+input0 =
+0x11D2BCAC, 0x4D
+
+output0 =
+0xB3E8D6DF, 0xBC8A2889, 0x744E649E, 0x99436EA6, 0x8B6EFD1D, 0xAB889238, 0xE744E6C9, 0x39E4664A,
+0xE8D6DF91, 0x8A2889B3, 0x4E649EBC, 0x436EA674, 0x6EFD1D99, 0x8892388B, 0x44E6C9AB, 0xE4664AE7,
+0xD6DF9139, 0x2889B3E8, 0x649EBC8A, 0x6EA6744E, 0xFD1D9943, 0x92388B6E, 0xE6C9AB88, 0x664AE744,
+0xDF9139E4, 0x89B3E8D6, 0x9EBC8A28, 0xA6744E64, 0x1D99436E, 0x388B6EFD, 0xC9AB8892, 0x4AE744E6,
+0x9139E466, 0xB3E8D6DF, 0xBC8A2889, 0x744E649E, 0x99436EA6, 0xC01D
+
+e =
+1194
+
+k =
+40
+
+ncb =
+192
+
+rv_index =
+2
+
+code_block_mode =
+1
+
+op_flags =
+RTE_BBDEV_TURBO_RATE_MATCH
+
+expected_status =
+OK
diff --git a/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1196_rm.data b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1196_rm.data
new file mode 100644
index 0000000..72be6f5
--- /dev/null
+++ b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e1196_rm.data
@@ -0,0 +1,36 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+
+op_type =
+RTE_BBDEV_OP_TURBO_ENC
+
+input0 =
+0x11D2BCAC, 0x4D
+
+output0 =
+0xBC8A2889, 0x744E649E, 0x99436EA6, 0x8B6EFD1D, 0xAB889238, 0xE744E6C9, 0x39E4664A, 0xE8D6DF91,
+0x8A2889B3, 0x4E649EBC, 0x436EA674, 0x6EFD1D99, 0x8892388B, 0x44E6C9AB, 0xE4664AE7, 0xD6DF9139,
+0x2889B3E8, 0x649EBC8A, 0x6EA6744E, 0xFD1D9943, 0x92388B6E, 0xE6C9AB88, 0x664AE744, 0xDF9139E4,
+0x89B3E8D6, 0x9EBC8A28, 0xA6744E64, 0x1D99436E, 0x388B6EFD, 0xC9AB8892, 0x4AE744E6, 0x9139E466,
+0xB3E8D6DF, 0xBC8A2889, 0x744E649E, 0x99436EA6, 0x8B6EFD1D, 0x9038
+
+e =
+1196
+
+k =
+40
+
+ncb =
+192
+
+rv_index =
+3
+
+code_block_mode =
+1
+
+op_flags =
+RTE_BBDEV_TURBO_RATE_MATCH
+
+expected_status =
+OK
diff --git a/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e272_rm.data b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e272_rm.data
new file mode 100644
index 0000000..883a76c
--- /dev/null
+++ b/app/test-bbdev/test_vectors/turbo_enc_c1_k40_r0_e272_rm.data
@@ -0,0 +1,33 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+
+op_type =
+RTE_BBDEV_OP_TURBO_ENC
+
+input0 =
+0x11d2bcac, 0x4d
+
+output0 =
+0xd2399179, 0x640eb999, 0x2cbaf577, 0xaf224ae2, 0x9d139927, 0xe6909b29, 0xa25b7f47, 0x2aa224ce,
+0x79f2
+
+e =
+272
+
+k =
+40
+
+ncb =
+192
+
+rv_index =
+0
+
+code_block_mode =
+1
+
+op_flags =
+RTE_BBDEV_TURBO_RATE_MATCH
+
+expected_status =
+OK
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index de78e23..697f9d9 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -33,6 +33,10 @@  static int bbdev_turbo_sw_logtype;
 	rte_bbdev_log(DEBUG, RTE_STR(__LINE__) ":%s() " fmt, __func__, \
 		##__VA_ARGS__)
 
+#define DEINT_INPUT_BUF_SIZE (((RTE_BBDEV_MAX_CB_SIZE >> 3) + 1) * 48)
+#define DEINT_OUTPUT_BUF_SIZE (DEINT_INPUT_BUF_SIZE * 6)
+#define ADAPTER_OUTPUT_BUF_SIZE ((RTE_BBDEV_MAX_CB_SIZE + 4) * 48)
+
 /* private data structure */
 struct bbdev_private {
 	unsigned int max_nb_queues;  /**< Max number of queues */
@@ -135,7 +139,9 @@  info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 					RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION,
+				.max_llr_modulus = 16,
 				.num_buffers_src = RTE_BBDEV_MAX_CODE_BLOCKS,
 				.num_buffers_hard_out =
 						RTE_BBDEV_MAX_CODE_BLOCKS,
@@ -286,7 +292,7 @@  q_setup(struct rte_bbdev *dev, uint16_t q_id,
 		return -ENAMETOOLONG;
 	}
 	q->code_block = rte_zmalloc_socket(name,
-			(6144 >> 3) * sizeof(*q->code_block),
+			RTE_BBDEV_MAX_CB_SIZE * sizeof(*q->code_block),
 			RTE_CACHE_LINE_SIZE, queue_conf->socket);
 	if (q->code_block == NULL) {
 		rte_bbdev_log(ERR,
@@ -305,7 +311,7 @@  q_setup(struct rte_bbdev *dev, uint16_t q_id,
 		return -ENAMETOOLONG;
 	}
 	q->deint_input = rte_zmalloc_socket(name,
-			RTE_BBDEV_MAX_KW * sizeof(*q->deint_input),
+			DEINT_INPUT_BUF_SIZE * sizeof(*q->deint_input),
 			RTE_CACHE_LINE_SIZE, queue_conf->socket);
 	if (q->deint_input == NULL) {
 		rte_bbdev_log(ERR,
@@ -324,7 +330,7 @@  q_setup(struct rte_bbdev *dev, uint16_t q_id,
 		return -ENAMETOOLONG;
 	}
 	q->deint_output = rte_zmalloc_socket(NULL,
-			RTE_BBDEV_MAX_KW * sizeof(*q->deint_output),
+			DEINT_OUTPUT_BUF_SIZE * sizeof(*q->deint_output),
 			RTE_CACHE_LINE_SIZE, queue_conf->socket);
 	if (q->deint_output == NULL) {
 		rte_bbdev_log(ERR,
@@ -343,7 +349,7 @@  q_setup(struct rte_bbdev *dev, uint16_t q_id,
 		return -ENAMETOOLONG;
 	}
 	q->adapter_output = rte_zmalloc_socket(NULL,
-			RTE_BBDEV_MAX_CB_SIZE * 6 * sizeof(*q->adapter_output),
+			ADAPTER_OUTPUT_BUF_SIZE * sizeof(*q->adapter_output),
 			RTE_CACHE_LINE_SIZE, queue_conf->socket);
 	if (q->adapter_output == NULL) {
 		rte_bbdev_log(ERR,
@@ -624,8 +630,16 @@  process_enc_cb(struct turbo_sw_queue *q, struct rte_bbdev_enc_op *op,
 
 	/* Rate-matching */
 	if (enc->op_flags & RTE_BBDEV_TURBO_RATE_MATCH) {
+		uint8_t mask_id;
+		/* Integer round up division by 8 */
+		uint16_t out_len = (e + 7) >> 3;
+		/* The mask array is indexed using E%8. E is an even number so
+		 * there are only 4 possible values.
+		 */
+		const uint8_t mask_out[] = {0xFF, 0xC0, 0xF0, 0xFC};
+
 		/* get output data starting address */
-		rm_out = (uint8_t *)rte_pktmbuf_append(m_out, (e >> 3));
+		rm_out = (uint8_t *)rte_pktmbuf_append(m_out, out_len);
 		if (rm_out == NULL) {
 			op->status |= 1 << RTE_BBDEV_DATA_ERROR;
 			rte_bbdev_log(ERR,
@@ -665,7 +679,7 @@  process_enc_cb(struct turbo_sw_queue *q, struct rte_bbdev_enc_op *op,
 		rm_req.tin1 = out1;
 		rm_req.tin2 = out2;
 		rm_resp.output = rm_out;
-		rm_resp.OutputLen = (e >> 3);
+		rm_resp.OutputLen = out_len;
 		if (enc->op_flags & RTE_BBDEV_TURBO_RV_INDEX_BYPASS)
 			rm_req.bypass_rvidx = 1;
 		else
@@ -681,6 +695,12 @@  process_enc_cb(struct turbo_sw_queue *q, struct rte_bbdev_enc_op *op,
 			return;
 		}
 
+		/* SW fills an entire last byte even if E%8 != 0. Clear the
+		 * superfluous data bits for consistency with HW device.
+		 */
+		mask_id = (e & 7) >> 1;
+		rm_out[out_len - 1] &= mask_out[mask_id];
+
 #ifdef RTE_TEST_BBDEV
 		q_stats->turbo_perf_time += rte_rdtsc_precise() - start_time;
 #endif
@@ -866,7 +886,7 @@  remove_nulls_from_circular_buf(const uint8_t *in, uint8_t *out, uint16_t k,
 	}
 
 	/* Last interlaced row is different - its last byte is the only padding
-	 * byte. We can have from 2 up to 26 padding bytes (Nd) per sub-block.
+	 * byte. We can have from 4 up to 28 padding bytes (Nd) per sub-block.
 	 * After interlacing the 1st and 2nd parity sub-blocks we can have 0, 1
 	 * or 2 padding bytes each time we make a step of 2 * R_SUBBLOCK bytes
 	 * (moving to another column). 2nd parity sub-block uses the same
@@ -877,10 +897,10 @@  remove_nulls_from_circular_buf(const uint8_t *in, uint8_t *out, uint16_t k,
 	 * 32nd (31+1) byte, then 64th etc. (step is C_SUBBLOCK == 32) and the
 	 * last byte will be the first byte from the sub-block:
 	 * (32 + 32 * (R_SUBBLOCK-1)) % Kw == Kw % Kw == 0. Nd can't  be smaller
-	 * than 2 so we know that bytes with ids 0 and 1 must be the padding
-	 * bytes. The bytes from the 1st parity sub-block are the bytes from the
-	 * 31st column - Nd can't be greater than 26 so we are sure that there
-	 * are no padding bytes in 31st column.
+	 * than 4 so we know that bytes with ids 0, 1, 2 and 3 must be the
+	 * padding bytes. The bytes from the 1st parity sub-block are the bytes
+	 * from the 31st column - Nd can't be greater than 28 so we are sure
+	 * that there are no padding bytes in 31st column.
 	 */
 	rte_memcpy(&out[out_idx], &in[in_idx], 2 * r_subblock - 1);
 }
@@ -895,14 +915,14 @@  move_padding_bytes(const uint8_t *in, uint8_t *out, uint16_t k,
 
 	rte_memcpy(&out[nd], in, d);
 	rte_memcpy(&out[nd + kpi + 64], &in[kpi], d);
-	rte_memcpy(&out[nd + 2 * (kpi + 64)], &in[2 * kpi], d);
+	rte_memcpy(&out[(nd - 1) + 2 * (kpi + 64)], &in[2 * kpi], d);
 }
 
 static inline void
 process_dec_cb(struct turbo_sw_queue *q, struct rte_bbdev_dec_op *op,
 		uint8_t c, uint16_t k, uint16_t kw, struct rte_mbuf *m_in,
 		struct rte_mbuf *m_out, uint16_t in_offset, uint16_t out_offset,
-		bool check_crc_24b, uint16_t total_left)
+		bool check_crc_24b, uint16_t crc24_overlap, uint16_t total_left)
 {
 #ifdef RTE_LIBRTE_BBDEV_DEBUG
 	int ret;
@@ -966,7 +986,7 @@  process_dec_cb(struct turbo_sw_queue *q, struct rte_bbdev_dec_op *op,
 	adapter_resp.pharqout = q->adapter_output;
 	bblib_turbo_adapter_ul(&adapter_req, &adapter_resp);
 
-	out = (uint8_t *)rte_pktmbuf_append(m_out, (k >> 3));
+	out = (uint8_t *)rte_pktmbuf_append(m_out, ((k - crc24_overlap) >> 3));
 	if (out == NULL) {
 		op->status |= 1 << RTE_BBDEV_DATA_ERROR;
 		rte_bbdev_log(ERR, "Too little space in output mbuf");
@@ -1008,6 +1028,7 @@  enqueue_dec_one_op(struct turbo_sw_queue *q, struct rte_bbdev_dec_op *op)
 {
 	uint8_t c, r = 0;
 	uint16_t kw, k = 0;
+	uint16_t crc24_overlap = 0;
 	struct rte_bbdev_op_turbo_dec *dec = &op->turbo_dec;
 	struct rte_mbuf *m_in = dec->input.data;
 	struct rte_mbuf *m_out = dec->hard_output.data;
@@ -1031,6 +1052,10 @@  enqueue_dec_one_op(struct turbo_sw_queue *q, struct rte_bbdev_dec_op *op)
 		c = 1;
 	}
 
+	if ((c > 1) && !check_bit(dec->op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
 	while (total_left > 0) {
 		if (dec->code_block_mode == 0)
 			k = (r < dec->tb_params.c_neg) ?
@@ -1050,17 +1075,18 @@  enqueue_dec_one_op(struct turbo_sw_queue *q, struct rte_bbdev_dec_op *op)
 
 		process_dec_cb(q, op, c, k, kw, m_in, m_out, in_offset,
 				out_offset, check_bit(dec->op_flags,
-				RTE_BBDEV_TURBO_CRC_TYPE_24B), total_left);
-		/* As a result of decoding we get Code Block with included
-		 * decoded CRC24 at the end of Code Block. Type of CRC24 is
-		 * specified by flag.
+				RTE_BBDEV_TURBO_CRC_TYPE_24B), crc24_overlap,
+				total_left);
+		/* To keep CRC24 attached to end of Code block, use
+		 * RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP flag as it is
+		 * removed by default once verified.
 		 */
 
 		/* Update total_left */
 		total_left -= kw;
 		/* Update offsets for next CBs (if exist) */
 		in_offset += kw;
-		out_offset += (k >> 3);
+		out_offset += ((k - crc24_overlap) >> 3);
 		r++;
 	}
 	if (total_left != 0) {
diff --git a/lib/librte_bbdev/rte_bbdev_op.h b/lib/librte_bbdev/rte_bbdev_op.h
index 1a80588..83f62c2 100644
--- a/lib/librte_bbdev/rte_bbdev_op.h
+++ b/lib/librte_bbdev/rte_bbdev_op.h
@@ -102,7 +102,11 @@  enum rte_bbdev_op_td_flag_bitmasks {
 	 */
 	RTE_BBDEV_TURBO_MAP_DEC = (1ULL << 14),
 	/**< Set if a device supports scatter-gather functionality */
-	RTE_BBDEV_TURBO_DEC_SCATTER_GATHER = (1ULL << 15)
+	RTE_BBDEV_TURBO_DEC_SCATTER_GATHER = (1ULL << 15),
+	/**< Set to keep CRC24B bits appended while decoding. Only usable when
+	 * decoding Transport Blocks (code_block_mode = 0).
+	 */
+	RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP = (1ULL << 16)
 };
 
 /** Flags for turbo encoder operation and capability structure */
@@ -379,6 +383,10 @@  struct rte_bbdev_op_turbo_enc {
 struct rte_bbdev_op_cap_turbo_dec {
 	/**< Flags from rte_bbdev_op_td_flag_bitmasks */
 	uint32_t capability_flags;
+	/** Maximal LLR absolute value. Acceptable LLR values lie in range
+	 * [-max_llr_modulus, max_llr_modulus].
+	 */
+	int8_t max_llr_modulus;
 	uint8_t num_buffers_src;  /**< Num input code block buffers */
 	/**< Num hard output code block buffers */
 	uint8_t num_buffers_hard_out;
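
The rate-matching hunk in bbdev_turbo_software.c sizes the output as E/8 bytes rounded up and then clears the unused bits of the last byte. A standalone sketch of that arithmetic follows, assuming E is even and that valid bits occupy each byte from the most significant end. The helper names rm_output_len and clear_tail_bits are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed to hold e rate-matched bits
 * (integer round-up division by 8).
 */
static uint16_t
rm_output_len(uint32_t e)
{
	return (uint16_t)((e + 7) >> 3);
}

/* Hypothetical helper: zero the bits of the last output byte that lie
 * beyond the e-th bit. Since e is even, e % 8 is 0, 2, 4 or 6, which
 * indexes the four masks below.
 */
static void
clear_tail_bits(uint8_t *rm_out, uint32_t e)
{
	static const uint8_t mask_out[] = {0xFF, 0xC0, 0xF0, 0xFC};

	assert(e > 0 && (e & 1) == 0);
	rm_out[rm_output_len(e) - 1] &= mask_out[(e & 7) >> 1];
}
```

For the new test vectors this gives 34 bytes for e = 272 and 149 bytes for e = 1190, which matches the number of output bytes listed in the corresponding .data files.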