From patchwork Fri Sep 24 04:37:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nipun Gupta X-Patchwork-Id: 99530 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 688B6A0C43; Fri, 24 Sep 2021 06:39:14 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 81324412E0; Fri, 24 Sep 2021 06:38:47 +0200 (CEST) Received: from inva021.nxp.com (inva021.nxp.com [92.121.34.21]) by mails.dpdk.org (Postfix) with ESMTP id CD00A412D7 for ; Fri, 24 Sep 2021 06:38:42 +0200 (CEST) Received: from inva021.nxp.com (localhost [127.0.0.1]) by inva021.eu-rdc02.nxp.com (Postfix) with ESMTP id 545D62009A0; Fri, 24 Sep 2021 06:38:41 +0200 (CEST) Received: from aprdc01srsp001v.ap-rdc01.nxp.com (aprdc01srsp001v.ap-rdc01.nxp.com [165.114.16.16]) by inva021.eu-rdc02.nxp.com (Postfix) with ESMTP id B3C45202175; Fri, 24 Sep 2021 06:38:33 +0200 (CEST) Received: from lsv03274.swis.in-blr01.nxp.com (lsv03274.swis.in-blr01.nxp.com [92.120.147.114]) by aprdc01srsp001v.ap-rdc01.nxp.com (Postfix) with ESMTP id CE74A183AD61; Fri, 24 Sep 2021 12:37:25 +0800 (+08) From: nipun.gupta@nxp.com To: dev@dpdk.org, gakhil@marvell.com, nicolas.chautru@intel.com Cc: david.marchand@redhat.com, hemant.agrawal@nxp.com, Nipun Gupta Date: Fri, 24 Sep 2021 10:07:19 +0530 Message-Id: <20210924043722.27980-7-nipun.gupta@nxp.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210924043722.27980-1-nipun.gupta@nxp.com> References: <20210318063421.14895-1-hemant.agrawal@nxp.com> <20210924043722.27980-1-nipun.gupta@nxp.com> X-Virus-Scanned: ClamAV using ClamSMTP Subject: [dpdk-dev] [PATCH v6 6/9] baseband/la12xx: add enqueue and dequeue support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Hemant Agrawal Add support for enqueue and dequeue the LDPC enc/dec from the modem device. Signed-off-by: Nipun Gupta Signed-off-by: Hemant Agrawal --- doc/guides/bbdevs/features/la12xx.ini | 13 + doc/guides/bbdevs/la12xx.rst | 47 ++- drivers/baseband/la12xx/bbdev_la12xx.c | 328 ++++++++++++++++++++- drivers/baseband/la12xx/bbdev_la12xx_ipc.h | 37 +++ 4 files changed, 420 insertions(+), 5 deletions(-) create mode 100644 doc/guides/bbdevs/features/la12xx.ini diff --git a/doc/guides/bbdevs/features/la12xx.ini b/doc/guides/bbdevs/features/la12xx.ini new file mode 100644 index 0000000000..0aec5eecb6 --- /dev/null +++ b/doc/guides/bbdevs/features/la12xx.ini @@ -0,0 +1,13 @@ +; +; Supported features of the 'la12xx' bbdev driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Turbo Decoder (4G) = N +Turbo Encoder (4G) = N +LDPC Decoder (5G) = Y +LDPC Encoder (5G) = Y +LLR/HARQ Compression = N +HW Accelerated = Y +BBDEV API = Y diff --git a/doc/guides/bbdevs/la12xx.rst b/doc/guides/bbdevs/la12xx.rst index 3c9ac5c047..b111ec0dd6 100644 --- a/doc/guides/bbdevs/la12xx.rst +++ b/doc/guides/bbdevs/la12xx.rst @@ -16,10 +16,11 @@ Features LA12xx PMD supports the following features: +- LDPC Encode in the DL +- LDPC Decode in the UL - Maximum of 8 UL queues - Maximum of 8 DL queues - PCIe Gen-3 x8 Interface -- MSI-X Installation ------------ @@ -79,3 +80,47 @@ For enabling logs, use the following EAL parameter: Using ``bb.la12xx`` as log matching criteria, all Baseband PMD logs can be enabled which are lower than logging ``level``. + +Test Application +---------------- + +BBDEV provides a test application, ``test-bbdev.py`` and range of test data for testing +the functionality of LA12xx for FEC encode and decode, depending on the device +capabilities. The test application is located under app->test-bbdev folder and has the +following options: + +.. code-block:: console + + "-p", "--testapp-path": specifies path to the bbdev test app. + "-e", "--eal-params" : EAL arguments which are passed to the test app. + "-t", "--timeout" : Timeout in seconds (default=300). + "-c", "--test-cases" : Defines test cases to run. Run all if not specified. + "-v", "--test-vector" : Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data). + "-n", "--num-ops" : Number of operations to process on device (default=32). + "-b", "--burst-size" : Operations enqueue/dequeue burst size (default=32). + "-s", "--snr" : SNR in dB used when generating LLRs for bler tests. + "-s", "--iter_max" : Number of iterations for LDPC decoder. + "-l", "--num-lcores" : Number of lcores to run (default=16). + "-i", "--init-device" : Initialise PF device with default values. + + +To execute the test application tool using simple decode or encode data, +type one of the following: + +.. code-block:: console + + ./test-bbdev.py -e="--vdev=baseband_la12xx,socket_id=0,max_nb_queues=8" -c validation -n 64 -b 1 -v ./ldpc_dec_default.data + ./test-bbdev.py -e="--vdev=baseband_la12xx,socket_id=0,max_nb_queues=8" -c validation -n 64 -b 1 -v ./ldpc_enc_default.data + +The test application ``test-bbdev.py``, supports the ability to configure the PF device with +a default set of values, if the "-i" or "- -init-device" option is included. The default values +are defined in test_bbdev_perf.c. + + +Test Vectors +~~~~~~~~~~~~ + +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides +a range of additional tests under the test_vectors folder, which may be useful. The results +of these tests will depend on the LA12xx FEC capabilities which may cause some +testcases to be skipped, but no failure should be reported. diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c index c758fe3833..fa16660a77 100644 --- a/drivers/baseband/la12xx/bbdev_la12xx.c +++ b/drivers/baseband/la12xx/bbdev_la12xx.c @@ -122,6 +122,10 @@ la12xx_queue_release(struct rte_bbdev *dev, uint16_t q_id) ((uint64_t) ((unsigned long) (A) \ - ((uint64_t)ipc_priv->hugepg_start.host_vaddr))) +#define MODEM_P2V(A) \ + ((uint64_t) ((unsigned long) (A) \ + + (unsigned long)(ipc_priv->peb_start.host_vaddr))) + static int ipc_queue_configure(uint32_t channel_id, ipc_t instance, struct bbdev_la12xx_q_priv *q_priv) { @@ -336,6 +340,318 @@ static const struct rte_bbdev_ops pmd_ops = { .queue_release = la12xx_queue_release, .start = la12xx_start }; + +static inline int +is_bd_ring_full(uint32_t ci, uint32_t ci_flag, + uint32_t pi, uint32_t pi_flag) +{ + if (pi == ci) { + if (pi_flag != ci_flag) + return 1; /* Ring is Full */ + } + return 0; +} + +static inline int +prepare_ldpc_enc_op(struct rte_bbdev_enc_op *bbdev_enc_op, + struct bbdev_la12xx_q_priv *q_priv __rte_unused, + struct rte_bbdev_op_data *in_op_data __rte_unused, + struct rte_bbdev_op_data *out_op_data) +{ + struct rte_bbdev_op_ldpc_enc *ldpc_enc = &bbdev_enc_op->ldpc_enc; + uint32_t total_out_bits; + + total_out_bits = (ldpc_enc->tb_params.cab * + ldpc_enc->tb_params.ea) + (ldpc_enc->tb_params.c - + ldpc_enc->tb_params.cab) * ldpc_enc->tb_params.eb; + + ldpc_enc->output.length = (total_out_bits + 7)/8; + + rte_pktmbuf_append(out_op_data->data, ldpc_enc->output.length); + + return 0; +} + +static inline int +prepare_ldpc_dec_op(struct rte_bbdev_dec_op *bbdev_dec_op, + struct bbdev_ipc_dequeue_op *bbdev_ipc_op, + struct bbdev_la12xx_q_priv *q_priv __rte_unused, + struct rte_bbdev_op_data *out_op_data __rte_unused) +{ + struct rte_bbdev_op_ldpc_dec *ldpc_dec = &bbdev_dec_op->ldpc_dec; + uint32_t total_out_bits; + uint32_t num_code_blocks = 0; + uint16_t sys_cols; + + sys_cols = (ldpc_dec->basegraph == 1) ? 22 : 10; + if (ldpc_dec->tb_params.c == 1) { + total_out_bits = ((sys_cols * ldpc_dec->z_c) - + ldpc_dec->n_filler); + /* 5G-NR protocol uses 16 bit CRC when output packet + * size <= 3824 (bits). Otherwise 24 bit CRC is used. + * Adjust the output bits accordingly + */ + if (total_out_bits - 16 <= 3824) + total_out_bits -= 16; + else + total_out_bits -= 24; + ldpc_dec->hard_output.length = (total_out_bits / 8); + } else { + total_out_bits = (((sys_cols * ldpc_dec->z_c) - + ldpc_dec->n_filler - 24) * + ldpc_dec->tb_params.c); + ldpc_dec->hard_output.length = (total_out_bits / 8) - 3; + } + + num_code_blocks = ldpc_dec->tb_params.c; + + bbdev_ipc_op->num_code_blocks = rte_cpu_to_be_32(num_code_blocks); + + return 0; +} + +static int +enqueue_single_op(struct bbdev_la12xx_q_priv *q_priv, void *bbdev_op) +{ + struct bbdev_la12xx_private *priv = q_priv->bbdev_priv; + ipc_userspace_t *ipc_priv = priv->ipc_priv; + ipc_instance_t *ipc_instance = ipc_priv->instance; + struct bbdev_ipc_dequeue_op *bbdev_ipc_op; + struct rte_bbdev_op_ldpc_enc *ldpc_enc; + struct rte_bbdev_op_ldpc_dec *ldpc_dec; + uint32_t q_id = q_priv->q_id; + uint32_t ci, ci_flag, pi, pi_flag; + ipc_ch_t *ch = &(ipc_instance->ch_list[q_id]); + ipc_br_md_t *md = &(ch->md); + size_t virt; + char *huge_start_addr = + (char *)q_priv->bbdev_priv->ipc_priv->hugepg_start.host_vaddr; + struct rte_bbdev_op_data *in_op_data, *out_op_data; + char *data_ptr; + uint32_t l1_pcie_addr; + int ret; + + ci = IPC_GET_CI_INDEX(q_priv->host_ci); + ci_flag = IPC_GET_CI_FLAG(q_priv->host_ci); + + pi = IPC_GET_PI_INDEX(q_priv->host_pi); + pi_flag = IPC_GET_PI_FLAG(q_priv->host_pi); + + rte_bbdev_dp_log(DEBUG, "before bd_ring_full: pi: %u, ci: %u," + "pi_flag: %u, ci_flag: %u, ring size: %u", + pi, ci, pi_flag, ci_flag, q_priv->queue_size); + + if (is_bd_ring_full(ci, ci_flag, pi, pi_flag)) { + rte_bbdev_dp_log(DEBUG, "bd ring full for queue id: %d", q_id); + return IPC_CH_FULL; + } + + virt = MODEM_P2V(q_priv->host_params->bd_m_modem_ptr[pi]); + bbdev_ipc_op = (struct bbdev_ipc_dequeue_op *)virt; + q_priv->bbdev_op[pi] = bbdev_op; + + switch (q_priv->op_type) { + case RTE_BBDEV_OP_LDPC_ENC: + ldpc_enc = &(((struct rte_bbdev_enc_op *)bbdev_op)->ldpc_enc); + in_op_data = &ldpc_enc->input; + out_op_data = &ldpc_enc->output; + + ret = prepare_ldpc_enc_op(bbdev_op, q_priv, + in_op_data, out_op_data); + if (ret) { + rte_bbdev_log(ERR, "process_ldpc_enc_op fail, ret: %d", + ret); + return ret; + } + break; + + case RTE_BBDEV_OP_LDPC_DEC: + ldpc_dec = &(((struct rte_bbdev_dec_op *)bbdev_op)->ldpc_dec); + in_op_data = &ldpc_dec->input; + + out_op_data = &ldpc_dec->hard_output; + + ret = prepare_ldpc_dec_op(bbdev_op, bbdev_ipc_op, + q_priv, out_op_data); + if (ret) { + rte_bbdev_log(ERR, "process_ldpc_dec_op fail, ret: %d", + ret); + return ret; + } + break; + + default: + rte_bbdev_log(ERR, "unsupported bbdev_ipc op type"); + return -1; + } + + if (in_op_data->data) { + data_ptr = rte_pktmbuf_mtod(in_op_data->data, char *); + l1_pcie_addr = (uint32_t)GUL_USER_HUGE_PAGE_ADDR + + data_ptr - huge_start_addr; + bbdev_ipc_op->in_addr = l1_pcie_addr; + bbdev_ipc_op->in_len = in_op_data->length; + } + + if (out_op_data->data) { + data_ptr = rte_pktmbuf_mtod(out_op_data->data, char *); + l1_pcie_addr = (uint32_t)GUL_USER_HUGE_PAGE_ADDR + + data_ptr - huge_start_addr; + bbdev_ipc_op->out_addr = rte_cpu_to_be_32(l1_pcie_addr); + bbdev_ipc_op->out_len = rte_cpu_to_be_32(out_op_data->length); + } + + /* Move Producer Index forward */ + pi++; + /* Flip the PI flag, if wrapping */ + if (unlikely(q_priv->queue_size == pi)) { + pi = 0; + pi_flag = pi_flag ? 0 : 1; + } + + if (pi_flag) + IPC_SET_PI_FLAG(pi); + else + IPC_RESET_PI_FLAG(pi); + q_priv->host_pi = pi; + + /* Wait for Data Copy & pi_flag update to complete before updating pi */ + rte_mb(); + /* now update pi */ + md->pi = rte_cpu_to_be_32(pi); + + rte_bbdev_dp_log(DEBUG, "enter: pi: %u, ci: %u," + "pi_flag: %u, ci_flag: %u, ring size: %u", + pi, ci, pi_flag, ci_flag, q_priv->queue_size); + + return 0; +} + +/* Enqueue decode burst */ +static uint16_t +enqueue_dec_ops(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t nb_ops) +{ + struct bbdev_la12xx_q_priv *q_priv = q_data->queue_private; + int nb_enqueued, ret; + + for (nb_enqueued = 0; nb_enqueued < nb_ops; nb_enqueued++) { + ret = enqueue_single_op(q_priv, ops[nb_enqueued]); + if (ret) + break; + } + + q_data->queue_stats.enqueue_err_count += nb_ops - nb_enqueued; + q_data->queue_stats.enqueued_count += nb_enqueued; + + return nb_enqueued; +} + +/* Enqueue encode burst */ +static uint16_t +enqueue_enc_ops(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t nb_ops) +{ + struct bbdev_la12xx_q_priv *q_priv = q_data->queue_private; + int nb_enqueued, ret; + + for (nb_enqueued = 0; nb_enqueued < nb_ops; nb_enqueued++) { + ret = enqueue_single_op(q_priv, ops[nb_enqueued]); + if (ret) + break; + } + + q_data->queue_stats.enqueue_err_count += nb_ops - nb_enqueued; + q_data->queue_stats.enqueued_count += nb_enqueued; + + return nb_enqueued; +} + +/* Dequeue encode burst */ +static void * +dequeue_single_op(struct bbdev_la12xx_q_priv *q_priv, void *dst) +{ + void *op; + uint32_t ci, ci_flag; + uint32_t temp_ci; + + temp_ci = q_priv->host_params->ci; + if (temp_ci == q_priv->host_ci) + return NULL; + + ci = IPC_GET_CI_INDEX(q_priv->host_ci); + ci_flag = IPC_GET_CI_FLAG(q_priv->host_ci); + + rte_bbdev_dp_log(DEBUG, + "ci: %u, ci_flag: %u, ring size: %u", + ci, ci_flag, q_priv->queue_size); + + op = q_priv->bbdev_op[ci]; + + rte_memcpy(dst, q_priv->msg_ch_vaddr[ci], + sizeof(struct bbdev_ipc_enqueue_op)); + + /* Move Consumer Index forward */ + ci++; + /* Flip the CI flag, if wrapping */ + if (q_priv->queue_size == ci) { + ci = 0; + ci_flag = ci_flag ? 0 : 1; + } + if (ci_flag) + IPC_SET_CI_FLAG(ci); + else + IPC_RESET_CI_FLAG(ci); + + q_priv->host_ci = ci; + + rte_bbdev_dp_log(DEBUG, + "exit: ci: %u, ci_flag: %u, ring size: %u", + ci, ci_flag, q_priv->queue_size); + + return op; +} + +/* Dequeue decode burst */ +static uint16_t +dequeue_dec_ops(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t nb_ops) +{ + struct bbdev_la12xx_q_priv *q_priv = q_data->queue_private; + struct bbdev_ipc_enqueue_op bbdev_ipc_op; + int nb_dequeued; + + for (nb_dequeued = 0; nb_dequeued < nb_ops; nb_dequeued++) { + ops[nb_dequeued] = dequeue_single_op(q_priv, &bbdev_ipc_op); + if (!ops[nb_dequeued]) + break; + ops[nb_dequeued]->status = bbdev_ipc_op.status; + } + q_data->queue_stats.dequeued_count += nb_dequeued; + + return nb_dequeued; +} + +/* Dequeue encode burst */ +static uint16_t +dequeue_enc_ops(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t nb_ops) +{ + struct bbdev_la12xx_q_priv *q_priv = q_data->queue_private; + struct bbdev_ipc_enqueue_op bbdev_ipc_op; + int nb_enqueued; + + for (nb_enqueued = 0; nb_enqueued < nb_ops; nb_enqueued++) { + ops[nb_enqueued] = dequeue_single_op(q_priv, &bbdev_ipc_op); + if (!ops[nb_enqueued]) + break; + ops[nb_enqueued]->status = bbdev_ipc_op.status; + } + q_data->queue_stats.enqueued_count += nb_enqueued; + + return nb_enqueued; +} + static struct hugepage_info * get_hugepage_info(void) { @@ -709,10 +1025,14 @@ la12xx_bbdev_create(struct rte_vdev_device *vdev, bbdev->intr_handle = NULL; /* register rx/tx burst functions for data path */ - bbdev->dequeue_enc_ops = NULL; - bbdev->dequeue_dec_ops = NULL; - bbdev->enqueue_enc_ops = NULL; - bbdev->enqueue_dec_ops = NULL; + bbdev->dequeue_enc_ops = dequeue_enc_ops; + bbdev->dequeue_dec_ops = dequeue_dec_ops; + bbdev->enqueue_enc_ops = enqueue_enc_ops; + bbdev->enqueue_dec_ops = enqueue_dec_ops; + bbdev->dequeue_ldpc_enc_ops = dequeue_enc_ops; + bbdev->dequeue_ldpc_dec_ops = dequeue_dec_ops; + bbdev->enqueue_ldpc_enc_ops = enqueue_enc_ops; + bbdev->enqueue_ldpc_dec_ops = enqueue_dec_ops; return 0; } diff --git a/drivers/baseband/la12xx/bbdev_la12xx_ipc.h b/drivers/baseband/la12xx/bbdev_la12xx_ipc.h index 5f613fb087..b6a7f677d0 100644 --- a/drivers/baseband/la12xx/bbdev_la12xx_ipc.h +++ b/drivers/baseband/la12xx/bbdev_la12xx_ipc.h @@ -73,6 +73,25 @@ typedef struct { _IOWR(GUL_IPC_MAGIC, 5, struct ipc_msg *) #define IOCTL_GUL_IPC_CHANNEL_RAISE_INTERRUPT _IOW(GUL_IPC_MAGIC, 6, int *) +#define GUL_USER_HUGE_PAGE_OFFSET (0) +#define GUL_PCI1_ADDR_BASE (0x00000000ULL) + +#define GUL_USER_HUGE_PAGE_ADDR (GUL_PCI1_ADDR_BASE + GUL_USER_HUGE_PAGE_OFFSET) + +/* IPC PI/CI index & flag manipulation helpers */ +#define IPC_PI_CI_FLAG_MASK 0x80000000 /* (1<<31) */ +#define IPC_PI_CI_INDEX_MASK 0x7FFFFFFF /* ~(1<<31) */ + +#define IPC_SET_PI_FLAG(x) (x |= IPC_PI_CI_FLAG_MASK) +#define IPC_RESET_PI_FLAG(x) (x &= IPC_PI_CI_INDEX_MASK) +#define IPC_GET_PI_FLAG(x) (x >> 31) +#define IPC_GET_PI_INDEX(x) (x & IPC_PI_CI_INDEX_MASK) + +#define IPC_SET_CI_FLAG(x) (x |= IPC_PI_CI_FLAG_MASK) +#define IPC_RESET_CI_FLAG(x) (x &= IPC_PI_CI_INDEX_MASK) +#define IPC_GET_CI_FLAG(x) (x >> 31) +#define IPC_GET_CI_INDEX(x) (x & IPC_PI_CI_INDEX_MASK) + /** buffer ring common metadata */ typedef struct ipc_bd_ring_md { volatile uint32_t pi; /**< Producer index and flag (MSB) @@ -180,6 +199,24 @@ struct bbdev_ipc_enqueue_op { uint32_t rsvd; }; +/** Structure specifying dequeue operation (dequeue at LA1224) */ +struct bbdev_ipc_dequeue_op { + /** Input buffer memory address */ + uint32_t in_addr; + /** Input buffer memory length */ + uint32_t in_len; + /** Output buffer memory address */ + uint32_t out_addr; + /** Output buffer memory length */ + uint32_t out_len; + /* Number of code blocks. Only set when HARQ is used */ + uint32_t num_code_blocks; + /** Dequeue Operation flags */ + uint32_t op_flags; + /** Shared metadata between L1 and L2 */ + uint32_t shared_metadata; +}; + /* This shared memory would be on the host side which have copy of some * of the parameters which are also part of Shared BD ring. Read access * of these parameters from the host side would not be over PCI.