From patchwork Fri Sep 4 17:53:57 2020
X-Patchwork-Submitter: Nicolas Chautru <nicolas.chautru@intel.com>
X-Patchwork-Id: 76585
X-Patchwork-Delegate: gakhil@marvell.com
From: Nicolas Chautru <nicolas.chautru@intel.com>
To: dev@dpdk.org, akhil.goyal@nxp.com
Cc: bruce.richardson@intel.com, rosen.xu@intel.com,
 dave.burley@accelercomm.com, aidan.goddard@accelercomm.com,
 ferruh.yigit@intel.com, tianjiao.liu@intel.com,
 Nicolas Chautru <nicolas.chautru@intel.com>
Date: Fri, 4 Sep 2020 10:53:57 -0700
Message-Id: <1599242047-58232-2-git-send-email-nicolas.chautru@intel.com>
In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com>
 <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>
---
 config/common_base                                 |   4 +
 doc/guides/bbdevs/acc100.rst                       | 233 ++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/Makefile                          |   2 +
 drivers/baseband/acc100/Makefile                   |  25 +++
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 mk/rte.app.mk                                      |   1 +
 12 files changed, 494 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
diff --git a/config/common_base b/config/common_base
index fbf0ee7..218ab16 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
 #
 CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
 
+# Compile PMD for ACC100 bbdev device
+#
+CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
+
 #
 # Compile PMD for Intel FPGA LTE FEC bbdev device
 #
diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a vRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` : if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` : check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` : disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` : drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` : provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` : provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` : HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` : HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` : loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` : HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` : supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` : supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` : if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` : set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` : set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` : check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` : perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` : set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` : set if negative LLR encoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` : set if positive LLR encoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` : keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` : set early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` : set half iteration granularity
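+
+A minimal sketch of how an application can check at runtime which of the
+capability flags listed above a probed device exposes, assuming only the
+public bbdev API (``rte_bbdev_info_get()`` and the ``capabilities`` array it
+returns), not anything added by this patch:
+
+.. code-block:: c
+
+   #include <stdbool.h>
+   #include <rte_bbdev.h>
+   #include <rte_bbdev_op.h>
+
+   /* Return true when dev_id advertises all requested LDPC decode flags */
+   static bool
+   ldpc_dec_supports(uint16_t dev_id, uint32_t flags)
+   {
+       struct rte_bbdev_info info;
+       const struct rte_bbdev_op_cap *cap;
+
+       rte_bbdev_info_get(dev_id, &info);
+       /* The capability list is terminated by RTE_BBDEV_OP_NONE */
+       for (cap = info.drv.capabilities; cap->type != RTE_BBDEV_OP_NONE; cap++)
+           if (cap->type == RTE_BBDEV_OP_LDPC_DEC)
+               return (cap->cap.ldpc_dec.capability_flags & flags) == flags;
+       return false;
+   }
+
+For example, ``ldpc_dec_supports(0, RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK)`` would
+confirm CRC24B checking is available before requesting it in an operation.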
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used, the ACC100 5G/4G
+FEC device must first be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but the vfio driver does
+not support SR-IOV configuration right out of the box, so it will need to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions intended to be
+enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to appropriate UIO drivers as required, in the
+same way as was done for the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file name
+is different:
+
+.. code-block:: console
+
+  echo <num-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in the ``acc100_conf`` structure.
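+
+As a rough sketch of that call (``acc100_configure()`` and ``acc100_conf`` are
+introduced by later patches in this series, so the prototype assumed here is
+based only on the description above):
+
+.. code-block:: c
+
+   #include <string.h>
+
+   /* Configure the PF once after boot or PCI FLR, before the VFs are used */
+   static int
+   configure_acc100_pf(const char *dev_name)
+   {
+       struct acc100_conf conf;
+
+       memset(&conf, 0, sizeof(conf));
+       /* populate queue topology, VF bundles, priorities, bandwidth ... */
+       return acc100_configure(dev_name, &conf);
+   }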
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"  : EAL arguments which are passed to the test app.
+  "-t", "--timeout"     : Timeout in seconds (default=300).
+  "-c", "--test-cases"  : Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector" : Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"     : Number of operations to process on device (default=32).
+  "-b", "--burst-size"  : Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"         : SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"    : Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"  : Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values, if the ``-i`` or ``--init-device`` option is included. The
+default values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The
+results of these tests will depend on the ACC100 5G/4G FEC capabilities, which may
+cause some test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..b3ab614 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce. See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
index dcc0969..b640294 100644
--- a/drivers/baseband/Makefile
+++ b/drivers/baseband/Makefile
@@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
 DEPDIRS-null = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
 DEPDIRS-turbo_sw = $(core-libs)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
+DEPDIRS-acc100 = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
 DEPDIRS-fpga_lte_fec = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocation of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
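
A minimal sketch, assuming only the public bbdev API, of what the stub above
already enables: once ``rte_eal_init()`` has probed a bound PF or VF through
the PCI drivers registered at the end of rte_acc100_pmd.c, an application can
enumerate the resulting bbdev devices.

.. code-block:: c

   #include <stdio.h>
   #include <rte_eal.h>
   #include <rte_bbdev.h>

   int
   main(int argc, char **argv)
   {
       /* EAL init triggers acc100_pci_probe() for each bound device */
       if (rte_eal_init(argc, argv) < 0)
           return -1;

       uint16_t nb_devs = rte_bbdev_count();
       for (uint16_t dev_id = 0; dev_id < nb_devs; dev_id++) {
           struct rte_bbdev_info info;

           rte_bbdev_info_get(dev_id, &info);
           printf("bbdev %u: %s (socket %d)\n",
                  dev_id, info.dev_name, info.socket_id);
       }
       return 0;
   }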
%#"PRIx64"", + drv->driver.name, dev->data->name, + (void *)pci_dev->mem_resource[0].addr, + pci_dev->mem_resource[0].phys_addr); +} + +static int acc100_pci_probe(struct rte_pci_driver *pci_drv, + struct rte_pci_device *pci_dev) +{ + struct rte_bbdev *bbdev = NULL; + char dev_name[RTE_BBDEV_NAME_MAX_LEN]; + + if (pci_dev == NULL) { + rte_bbdev_log(ERR, "NULL PCI device"); + return -EINVAL; + } + + rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name)); + + /* Allocate memory to be used privately by drivers */ + bbdev = rte_bbdev_allocate(pci_dev->device.name); + if (bbdev == NULL) + return -ENODEV; + + /* allocate device private memory */ + bbdev->data->dev_private = rte_zmalloc_socket(dev_name, + sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE, + pci_dev->device.numa_node); + + if (bbdev->data->dev_private == NULL) { + rte_bbdev_log(CRIT, + "Allocate of %zu bytes for device \"%s\" failed", + sizeof(struct acc100_device), dev_name); + rte_bbdev_release(bbdev); + return -ENOMEM; + } + + /* Fill HW specific part of device structure */ + bbdev->device = &pci_dev->device; + bbdev->intr_handle = &pci_dev->intr_handle; + bbdev->data->socket_id = pci_dev->device.numa_node; + + /* Invoke ACC100 device initialization function */ + acc100_bbdev_init(bbdev, pci_drv); + + rte_bbdev_log_debug("Initialised bbdev %s (id = %u)", + dev_name, bbdev->data->dev_id); + return 0; +} + +static int acc100_pci_remove(struct rte_pci_device *pci_dev) +{ + struct rte_bbdev *bbdev; + int ret; + uint8_t dev_id; + + if (pci_dev == NULL) + return -EINVAL; + + /* Find device */ + bbdev = rte_bbdev_get_named_dev(pci_dev->device.name); + if (bbdev == NULL) { + rte_bbdev_log(CRIT, + "Couldn't find HW dev \"%s\" to uninitialise it", + pci_dev->device.name); + return -ENODEV; + } + dev_id = bbdev->data->dev_id; + + /* free device private memory before close */ + rte_free(bbdev->data->dev_private); + + /* Close device */ + ret = rte_bbdev_close(dev_id); + if (ret < 0) + rte_bbdev_log(ERR, + "Device %i failed to close during uninit: %i", + dev_id, ret); + + /* release bbdev from library */ + rte_bbdev_release(bbdev); + + rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id); + + return 0; +} + +static struct rte_pci_driver acc100_pci_pf_driver = { + .probe = acc100_pci_probe, + .remove = acc100_pci_remove, + .id_table = pci_id_acc100_pf_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING +}; + +static struct rte_pci_driver acc100_pci_vf_driver = { + .probe = acc100_pci_probe, + .remove = acc100_pci_remove, + .id_table = pci_id_acc100_vf_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING +}; + +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver); +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map); +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver); +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map); + diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h new file mode 100644 index 0000000..6f46df0 --- /dev/null +++ b/drivers/baseband/acc100/rte_acc100_pmd.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_ACC100_PMD_H_ +#define _RTE_ACC100_PMD_H_ + +/* Helper macro for logging */ +#define rte_bbdev_log(level, fmt, ...) \ + rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \ + ##__VA_ARGS__) + +#ifdef RTE_LIBRTE_BBDEV_DEBUG +#define rte_bbdev_log_debug(fmt, ...) 
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a544259..a77f538 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD) += -lrte_pmd_netvsc
 
 ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)           += -lrte_pmd_bbdev_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)         += -lrte_pmd_bbdev_acc100
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC)   += -lrte_pmd_bbdev_fpga_lte_fec
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC)  += -lrte_pmd_bbdev_fpga_5gnr_fec

From patchwork Fri Sep 4 17:53:58 2020
X-Patchwork-Submitter: Nicolas Chautru <nicolas.chautru@intel.com>
X-Patchwork-Id: 76578
X-Patchwork-Delegate: gakhil@marvell.com
From: Nicolas Chautru <nicolas.chautru@intel.com>
To: dev@dpdk.org, akhil.goyal@nxp.com
Cc: bruce.richardson@intel.com, rosen.xu@intel.com,
 dave.burley@accelercomm.com, aidan.goddard@accelercomm.com,
 ferruh.yigit@intel.com, tianjiao.liu@intel.com,
 Nicolas Chautru <nicolas.chautru@intel.com>
Date: Fri, 4 Sep 2020 10:53:58 -0700
Message-Id: <1599242047-58232-3-git-send-email-nicolas.chautru@intel.com>
In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com>
 <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file

Add the list of registers for the device and the related HW
specification definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is + */ +enum { + HWPfQmgrEgressQueuesTemplate = 0x0007FE00, + HWPfQmgrIngressAq = 0x00080000, + HWPfQmgrArbQAvail = 0x00A00010, + HWPfQmgrArbQBlock = 0x00A00014, + HWPfQmgrAqueueDropNotifEn = 0x00A00024, + HWPfQmgrAqueueDisableNotifEn = 0x00A00028, + HWPfQmgrSoftReset = 0x00A00038, + HWPfQmgrInitStatus = 0x00A0003C, + HWPfQmgrAramWatchdogCount = 0x00A00040, + HWPfQmgrAramWatchdogCounterEn = 0x00A00044, + HWPfQmgrAxiWatchdogCount = 0x00A00048, + HWPfQmgrAxiWatchdogCounterEn = 0x00A0004C, + HWPfQmgrProcessWatchdogCount = 0x00A00050, + HWPfQmgrProcessWatchdogCounterEn = 0x00A00054, + HWPfQmgrProcessUl4GWatchdogCounter = 0x00A00058, + HWPfQmgrProcessDl4GWatchdogCounter = 0x00A0005C, + HWPfQmgrProcessUl5GWatchdogCounter = 0x00A00060, + HWPfQmgrProcessDl5GWatchdogCounter = 0x00A00064, + HWPfQmgrProcessMldWatchdogCounter = 0x00A00068, + HWPfQmgrMsiOverflowUpperVf = 0x00A00070, + HWPfQmgrMsiOverflowLowerVf = 0x00A00074, + HWPfQmgrMsiWatchdogOverflow = 0x00A00078, + HWPfQmgrMsiOverflowEnable = 0x00A0007C, + HWPfQmgrDebugAqPointerMemGrp = 0x00A00100, + HWPfQmgrDebugOutputArbQFifoGrp = 0x00A00140, + HWPfQmgrDebugMsiFifoGrp = 0x00A00180, + HWPfQmgrDebugAxiWdTimeoutMsiFifo = 0x00A001C0, + HWPfQmgrDebugProcessWdTimeoutMsiFifo = 0x00A001C4, + HWPfQmgrDepthLog2Grp = 0x00A00200, + HWPfQmgrTholdGrp = 0x00A00300, + HWPfQmgrGrpTmplateReg0Indx = 0x00A00600, + HWPfQmgrGrpTmplateReg1Indx = 0x00A00680, + HWPfQmgrGrpTmplateReg2indx = 0x00A00700, + HWPfQmgrGrpTmplateReg3Indx = 0x00A00780, + HWPfQmgrGrpTmplateReg4Indx = 0x00A00800, + HWPfQmgrVfBaseAddr = 0x00A01000, + HWPfQmgrUl4GWeightRrVf = 0x00A02000, + HWPfQmgrDl4GWeightRrVf = 0x00A02100, + HWPfQmgrUl5GWeightRrVf = 0x00A02200, + HWPfQmgrDl5GWeightRrVf = 0x00A02300, + HWPfQmgrMldWeightRrVf = 0x00A02400, + HWPfQmgrArbQDepthGrp = 0x00A02F00, + HWPfQmgrGrpFunction0 = 0x00A02F40, + HWPfQmgrGrpFunction1 = 0x00A02F44, + HWPfQmgrGrpPriority = 0x00A02F48, + HWPfQmgrWeightSync = 0x00A03000, + HWPfQmgrAqEnableVf = 0x00A10000, + HWPfQmgrAqResetVf = 0x00A20000, + HWPfQmgrRingSizeVf = 0x00A20004, + HWPfQmgrGrpDepthLog20Vf = 0x00A20008, + HWPfQmgrGrpDepthLog21Vf = 0x00A2000C, + HWPfQmgrGrpFunction0Vf = 0x00A20010, + HWPfQmgrGrpFunction1Vf = 0x00A20014, + HWPfDmaConfig0Reg = 0x00B80000, + HWPfDmaConfig1Reg = 0x00B80004, + HWPfDmaQmgrAddrReg = 0x00B80008, + HWPfDmaSoftResetReg = 0x00B8000C, + HWPfDmaAxcacheReg = 0x00B80010, + HWPfDmaVersionReg = 0x00B80014, + HWPfDmaFrameThreshold = 0x00B80018, + HWPfDmaTimestampLo = 0x00B8001C, + HWPfDmaTimestampHi = 0x00B80020, + HWPfDmaAxiStatus = 0x00B80028, + HWPfDmaAxiControl = 0x00B8002C, + HWPfDmaNoQmgr = 0x00B80030, + HWPfDmaQosScale = 0x00B80034, + HWPfDmaQmanen = 0x00B80040, + HWPfDmaQmgrQosBase = 0x00B80060, + HWPfDmaFecClkGatingEnable = 0x00B80080, + HWPfDmaPmEnable = 0x00B80084, + HWPfDmaQosEnable = 0x00B80088, + HWPfDmaHarqWeightedRrFrameThreshold = 0x00B800B0, + HWPfDmaDataSmallWeightedRrFrameThresh = 0x00B800B4, + HWPfDmaDataLargeWeightedRrFrameThresh = 0x00B800B8, + HWPfDmaInboundCbMaxSize = 0x00B800BC, + HWPfDmaInboundDrainDataSize = 0x00B800C0, + HWPfDmaVfDdrBaseRw = 0x00B80400, + HWPfDmaCmplTmOutCnt = 0x00B80800, + HWPfDmaProcTmOutCnt = 0x00B80804, + HWPfDmaStatusRrespBresp = 0x00B80810, + HWPfDmaCfgRrespBresp = 0x00B80814, + HWPfDmaStatusMemParErr = 0x00B80818, + HWPfDmaCfgMemParErrEn = 0x00B8081C, + HWPfDmaStatusDmaHwErr = 0x00B80820, + HWPfDmaCfgDmaHwErrEn = 0x00B80824, + HWPfDmaStatusFecCoreErr = 0x00B80828, + HWPfDmaCfgFecCoreErrEn = 0x00B8082C, + HWPfDmaStatusFcwDescrErr = 0x00B80830, + 
HWPfDmaCfgFcwDescrErrEn = 0x00B80834, + HWPfDmaStatusBlockTransmit = 0x00B80838, + HWPfDmaBlockOnErrEn = 0x00B8083C, + HWPfDmaStatusFlushDma = 0x00B80840, + HWPfDmaFlushDmaOnErrEn = 0x00B80844, + HWPfDmaStatusSdoneFifoFull = 0x00B80848, + HWPfDmaStatusDescriptorErrLoVf = 0x00B8084C, + HWPfDmaStatusDescriptorErrHiVf = 0x00B80850, + HWPfDmaStatusFcwErrLoVf = 0x00B80854, + HWPfDmaStatusFcwErrHiVf = 0x00B80858, + HWPfDmaStatusDataErrLoVf = 0x00B8085C, + HWPfDmaStatusDataErrHiVf = 0x00B80860, + HWPfDmaCfgMsiEnSoftwareErr = 0x00B80864, + HWPfDmaDescriptorSignatuture = 0x00B80868, + HWPfDmaFcwSignature = 0x00B8086C, + HWPfDmaErrorDetectionEn = 0x00B80870, + HWPfDmaErrCntrlFifoDebug = 0x00B8087C, + HWPfDmaStatusToutData = 0x00B80880, + HWPfDmaStatusToutDesc = 0x00B80884, + HWPfDmaStatusToutUnexpData = 0x00B80888, + HWPfDmaStatusToutUnexpDesc = 0x00B8088C, + HWPfDmaStatusToutProcess = 0x00B80890, + HWPfDmaConfigCtoutOutDataEn = 0x00B808A0, + HWPfDmaConfigCtoutOutDescrEn = 0x00B808A4, + HWPfDmaConfigUnexpComplDataEn = 0x00B808A8, + HWPfDmaConfigUnexpComplDescrEn = 0x00B808AC, + HWPfDmaConfigPtoutOutEn = 0x00B808B0, + HWPfDmaFec5GulDescBaseLoRegVf = 0x00B88020, + HWPfDmaFec5GulDescBaseHiRegVf = 0x00B88024, + HWPfDmaFec5GulRespPtrLoRegVf = 0x00B88028, + HWPfDmaFec5GulRespPtrHiRegVf = 0x00B8802C, + HWPfDmaFec5GdlDescBaseLoRegVf = 0x00B88040, + HWPfDmaFec5GdlDescBaseHiRegVf = 0x00B88044, + HWPfDmaFec5GdlRespPtrLoRegVf = 0x00B88048, + HWPfDmaFec5GdlRespPtrHiRegVf = 0x00B8804C, + HWPfDmaFec4GulDescBaseLoRegVf = 0x00B88060, + HWPfDmaFec4GulDescBaseHiRegVf = 0x00B88064, + HWPfDmaFec4GulRespPtrLoRegVf = 0x00B88068, + HWPfDmaFec4GulRespPtrHiRegVf = 0x00B8806C, + HWPfDmaFec4GdlDescBaseLoRegVf = 0x00B88080, + HWPfDmaFec4GdlDescBaseHiRegVf = 0x00B88084, + HWPfDmaFec4GdlRespPtrLoRegVf = 0x00B88088, + HWPfDmaFec4GdlRespPtrHiRegVf = 0x00B8808C, + HWPfDmaVfDdrBaseRangeRo = 0x00B880A0, + HWPfQosmonACntrlReg = 0x00B90000, + HWPfQosmonAEvalOverflow0 = 0x00B90008, + HWPfQosmonAEvalOverflow1 = 0x00B9000C, + HWPfQosmonADivTerm = 0x00B90010, + HWPfQosmonATickTerm = 0x00B90014, + HWPfQosmonAEvalTerm = 0x00B90018, + HWPfQosmonAAveTerm = 0x00B9001C, + HWPfQosmonAForceEccErr = 0x00B90020, + HWPfQosmonAEccErrDetect = 0x00B90024, + HWPfQosmonAIterationConfig0Low = 0x00B90060, + HWPfQosmonAIterationConfig0High = 0x00B90064, + HWPfQosmonAIterationConfig1Low = 0x00B90068, + HWPfQosmonAIterationConfig1High = 0x00B9006C, + HWPfQosmonAIterationConfig2Low = 0x00B90070, + HWPfQosmonAIterationConfig2High = 0x00B90074, + HWPfQosmonAIterationConfig3Low = 0x00B90078, + HWPfQosmonAIterationConfig3High = 0x00B9007C, + HWPfQosmonAEvalMemAddr = 0x00B90080, + HWPfQosmonAEvalMemData = 0x00B90084, + HWPfQosmonAXaction = 0x00B900C0, + HWPfQosmonARemThres1Vf = 0x00B90400, + HWPfQosmonAThres2Vf = 0x00B90404, + HWPfQosmonAWeiFracVf = 0x00B90408, + HWPfQosmonARrWeiVf = 0x00B9040C, + HWPfPermonACntrlRegVf = 0x00B98000, + HWPfPermonACountVf = 0x00B98008, + HWPfPermonAKCntLoVf = 0x00B98010, + HWPfPermonAKCntHiVf = 0x00B98014, + HWPfPermonADeltaCntLoVf = 0x00B98020, + HWPfPermonADeltaCntHiVf = 0x00B98024, + HWPfPermonAVersionReg = 0x00B9C000, + HWPfPermonACbControlFec = 0x00B9C0F0, + HWPfPermonADltTimerLoFec = 0x00B9C0F4, + HWPfPermonADltTimerHiFec = 0x00B9C0F8, + HWPfPermonACbCountFec = 0x00B9C100, + HWPfPermonAAccExecTimerLoFec = 0x00B9C104, + HWPfPermonAAccExecTimerHiFec = 0x00B9C108, + HWPfPermonAExecTimerMinFec = 0x00B9C200, + HWPfPermonAExecTimerMaxFec = 0x00B9C204, + HWPfPermonAControlBusMon = 0x00B9C400, + HWPfPermonAConfigBusMon = 0x00B9C404, + 
HWPfPermonASkipCountBusMon = 0x00B9C408, + HWPfPermonAMinLatBusMon = 0x00B9C40C, + HWPfPermonAMaxLatBusMon = 0x00B9C500, + HWPfPermonATotalLatLowBusMon = 0x00B9C504, + HWPfPermonATotalLatUpperBusMon = 0x00B9C508, + HWPfPermonATotalReqCntBusMon = 0x00B9C50C, + HWPfQosmonBCntrlReg = 0x00BA0000, + HWPfQosmonBEvalOverflow0 = 0x00BA0008, + HWPfQosmonBEvalOverflow1 = 0x00BA000C, + HWPfQosmonBDivTerm = 0x00BA0010, + HWPfQosmonBTickTerm = 0x00BA0014, + HWPfQosmonBEvalTerm = 0x00BA0018, + HWPfQosmonBAveTerm = 0x00BA001C, + HWPfQosmonBForceEccErr = 0x00BA0020, + HWPfQosmonBEccErrDetect = 0x00BA0024, + HWPfQosmonBIterationConfig0Low = 0x00BA0060, + HWPfQosmonBIterationConfig0High = 0x00BA0064, + HWPfQosmonBIterationConfig1Low = 0x00BA0068, + HWPfQosmonBIterationConfig1High = 0x00BA006C, + HWPfQosmonBIterationConfig2Low = 0x00BA0070, + HWPfQosmonBIterationConfig2High = 0x00BA0074, + HWPfQosmonBIterationConfig3Low = 0x00BA0078, + HWPfQosmonBIterationConfig3High = 0x00BA007C, + HWPfQosmonBEvalMemAddr = 0x00BA0080, + HWPfQosmonBEvalMemData = 0x00BA0084, + HWPfQosmonBXaction = 0x00BA00C0, + HWPfQosmonBRemThres1Vf = 0x00BA0400, + HWPfQosmonBThres2Vf = 0x00BA0404, + HWPfQosmonBWeiFracVf = 0x00BA0408, + HWPfQosmonBRrWeiVf = 0x00BA040C, + HWPfPermonBCntrlRegVf = 0x00BA8000, + HWPfPermonBCountVf = 0x00BA8008, + HWPfPermonBKCntLoVf = 0x00BA8010, + HWPfPermonBKCntHiVf = 0x00BA8014, + HWPfPermonBDeltaCntLoVf = 0x00BA8020, + HWPfPermonBDeltaCntHiVf = 0x00BA8024, + HWPfPermonBVersionReg = 0x00BAC000, + HWPfPermonBCbControlFec = 0x00BAC0F0, + HWPfPermonBDltTimerLoFec = 0x00BAC0F4, + HWPfPermonBDltTimerHiFec = 0x00BAC0F8, + HWPfPermonBCbCountFec = 0x00BAC100, + HWPfPermonBAccExecTimerLoFec = 0x00BAC104, + HWPfPermonBAccExecTimerHiFec = 0x00BAC108, + HWPfPermonBExecTimerMinFec = 0x00BAC200, + HWPfPermonBExecTimerMaxFec = 0x00BAC204, + HWPfPermonBControlBusMon = 0x00BAC400, + HWPfPermonBConfigBusMon = 0x00BAC404, + HWPfPermonBSkipCountBusMon = 0x00BAC408, + HWPfPermonBMinLatBusMon = 0x00BAC40C, + HWPfPermonBMaxLatBusMon = 0x00BAC500, + HWPfPermonBTotalLatLowBusMon = 0x00BAC504, + HWPfPermonBTotalLatUpperBusMon = 0x00BAC508, + HWPfPermonBTotalReqCntBusMon = 0x00BAC50C, + HWPfFecUl5gCntrlReg = 0x00BC0000, + HWPfFecUl5gI2MThreshReg = 0x00BC0004, + HWPfFecUl5gVersionReg = 0x00BC0100, + HWPfFecUl5gFcwStatusReg = 0x00BC0104, + HWPfFecUl5gWarnReg = 0x00BC0108, + HwPfFecUl5gIbDebugReg = 0x00BC0200, + HwPfFecUl5gObLlrDebugReg = 0x00BC0204, + HwPfFecUl5gObHarqDebugReg = 0x00BC0208, + HwPfFecUl5g1CntrlReg = 0x00BC1000, + HwPfFecUl5g1I2MThreshReg = 0x00BC1004, + HwPfFecUl5g1VersionReg = 0x00BC1100, + HwPfFecUl5g1FcwStatusReg = 0x00BC1104, + HwPfFecUl5g1WarnReg = 0x00BC1108, + HwPfFecUl5g1IbDebugReg = 0x00BC1200, + HwPfFecUl5g1ObLlrDebugReg = 0x00BC1204, + HwPfFecUl5g1ObHarqDebugReg = 0x00BC1208, + HwPfFecUl5g2CntrlReg = 0x00BC2000, + HwPfFecUl5g2I2MThreshReg = 0x00BC2004, + HwPfFecUl5g2VersionReg = 0x00BC2100, + HwPfFecUl5g2FcwStatusReg = 0x00BC2104, + HwPfFecUl5g2WarnReg = 0x00BC2108, + HwPfFecUl5g2IbDebugReg = 0x00BC2200, + HwPfFecUl5g2ObLlrDebugReg = 0x00BC2204, + HwPfFecUl5g2ObHarqDebugReg = 0x00BC2208, + HwPfFecUl5g3CntrlReg = 0x00BC3000, + HwPfFecUl5g3I2MThreshReg = 0x00BC3004, + HwPfFecUl5g3VersionReg = 0x00BC3100, + HwPfFecUl5g3FcwStatusReg = 0x00BC3104, + HwPfFecUl5g3WarnReg = 0x00BC3108, + HwPfFecUl5g3IbDebugReg = 0x00BC3200, + HwPfFecUl5g3ObLlrDebugReg = 0x00BC3204, + HwPfFecUl5g3ObHarqDebugReg = 0x00BC3208, + HwPfFecUl5g4CntrlReg = 0x00BC4000, + HwPfFecUl5g4I2MThreshReg = 0x00BC4004, + HwPfFecUl5g4VersionReg = 
0x00BC4100, + HwPfFecUl5g4FcwStatusReg = 0x00BC4104, + HwPfFecUl5g4WarnReg = 0x00BC4108, + HwPfFecUl5g4IbDebugReg = 0x00BC4200, + HwPfFecUl5g4ObLlrDebugReg = 0x00BC4204, + HwPfFecUl5g4ObHarqDebugReg = 0x00BC4208, + HwPfFecUl5g5CntrlReg = 0x00BC5000, + HwPfFecUl5g5I2MThreshReg = 0x00BC5004, + HwPfFecUl5g5VersionReg = 0x00BC5100, + HwPfFecUl5g5FcwStatusReg = 0x00BC5104, + HwPfFecUl5g5WarnReg = 0x00BC5108, + HwPfFecUl5g5IbDebugReg = 0x00BC5200, + HwPfFecUl5g5ObLlrDebugReg = 0x00BC5204, + HwPfFecUl5g5ObHarqDebugReg = 0x00BC5208, + HwPfFecUl5g6CntrlReg = 0x00BC6000, + HwPfFecUl5g6I2MThreshReg = 0x00BC6004, + HwPfFecUl5g6VersionReg = 0x00BC6100, + HwPfFecUl5g6FcwStatusReg = 0x00BC6104, + HwPfFecUl5g6WarnReg = 0x00BC6108, + HwPfFecUl5g6IbDebugReg = 0x00BC6200, + HwPfFecUl5g6ObLlrDebugReg = 0x00BC6204, + HwPfFecUl5g6ObHarqDebugReg = 0x00BC6208, + HwPfFecUl5g7CntrlReg = 0x00BC7000, + HwPfFecUl5g7I2MThreshReg = 0x00BC7004, + HwPfFecUl5g7VersionReg = 0x00BC7100, + HwPfFecUl5g7FcwStatusReg = 0x00BC7104, + HwPfFecUl5g7WarnReg = 0x00BC7108, + HwPfFecUl5g7IbDebugReg = 0x00BC7200, + HwPfFecUl5g7ObLlrDebugReg = 0x00BC7204, + HwPfFecUl5g7ObHarqDebugReg = 0x00BC7208, + HwPfFecUl5g8CntrlReg = 0x00BC8000, + HwPfFecUl5g8I2MThreshReg = 0x00BC8004, + HwPfFecUl5g8VersionReg = 0x00BC8100, + HwPfFecUl5g8FcwStatusReg = 0x00BC8104, + HwPfFecUl5g8WarnReg = 0x00BC8108, + HwPfFecUl5g8IbDebugReg = 0x00BC8200, + HwPfFecUl5g8ObLlrDebugReg = 0x00BC8204, + HwPfFecUl5g8ObHarqDebugReg = 0x00BC8208, + HWPfFecDl5gCntrlReg = 0x00BCF000, + HWPfFecDl5gI2MThreshReg = 0x00BCF004, + HWPfFecDl5gVersionReg = 0x00BCF100, + HWPfFecDl5gFcwStatusReg = 0x00BCF104, + HWPfFecDl5gWarnReg = 0x00BCF108, + HWPfFecUlVersionReg = 0x00BD0000, + HWPfFecUlControlReg = 0x00BD0004, + HWPfFecUlStatusReg = 0x00BD0008, + HWPfFecDlVersionReg = 0x00BDF000, + HWPfFecDlClusterConfigReg = 0x00BDF004, + HWPfFecDlBurstThres = 0x00BDF00C, + HWPfFecDlClusterStatusReg0 = 0x00BDF040, + HWPfFecDlClusterStatusReg1 = 0x00BDF044, + HWPfFecDlClusterStatusReg2 = 0x00BDF048, + HWPfFecDlClusterStatusReg3 = 0x00BDF04C, + HWPfFecDlClusterStatusReg4 = 0x00BDF050, + HWPfFecDlClusterStatusReg5 = 0x00BDF054, + HWPfChaFabPllPllrst = 0x00C40000, + HWPfChaFabPllClk0 = 0x00C40004, + HWPfChaFabPllClk1 = 0x00C40008, + HWPfChaFabPllBwadj = 0x00C4000C, + HWPfChaFabPllLbw = 0x00C40010, + HWPfChaFabPllResetq = 0x00C40014, + HWPfChaFabPllPhshft0 = 0x00C40018, + HWPfChaFabPllPhshft1 = 0x00C4001C, + HWPfChaFabPllDivq0 = 0x00C40020, + HWPfChaFabPllDivq1 = 0x00C40024, + HWPfChaFabPllDivq2 = 0x00C40028, + HWPfChaFabPllDivq3 = 0x00C4002C, + HWPfChaFabPllDivq4 = 0x00C40030, + HWPfChaFabPllDivq5 = 0x00C40034, + HWPfChaFabPllDivq6 = 0x00C40038, + HWPfChaFabPllDivq7 = 0x00C4003C, + HWPfChaDl5gPllPllrst = 0x00C40080, + HWPfChaDl5gPllClk0 = 0x00C40084, + HWPfChaDl5gPllClk1 = 0x00C40088, + HWPfChaDl5gPllBwadj = 0x00C4008C, + HWPfChaDl5gPllLbw = 0x00C40090, + HWPfChaDl5gPllResetq = 0x00C40094, + HWPfChaDl5gPllPhshft0 = 0x00C40098, + HWPfChaDl5gPllPhshft1 = 0x00C4009C, + HWPfChaDl5gPllDivq0 = 0x00C400A0, + HWPfChaDl5gPllDivq1 = 0x00C400A4, + HWPfChaDl5gPllDivq2 = 0x00C400A8, + HWPfChaDl5gPllDivq3 = 0x00C400AC, + HWPfChaDl5gPllDivq4 = 0x00C400B0, + HWPfChaDl5gPllDivq5 = 0x00C400B4, + HWPfChaDl5gPllDivq6 = 0x00C400B8, + HWPfChaDl5gPllDivq7 = 0x00C400BC, + HWPfChaDl4gPllPllrst = 0x00C40100, + HWPfChaDl4gPllClk0 = 0x00C40104, + HWPfChaDl4gPllClk1 = 0x00C40108, + HWPfChaDl4gPllBwadj = 0x00C4010C, + HWPfChaDl4gPllLbw = 0x00C40110, + HWPfChaDl4gPllResetq = 0x00C40114, + HWPfChaDl4gPllPhshft0 = 0x00C40118, + 
HWPfChaDl4gPllPhshft1 = 0x00C4011C, + HWPfChaDl4gPllDivq0 = 0x00C40120, + HWPfChaDl4gPllDivq1 = 0x00C40124, + HWPfChaDl4gPllDivq2 = 0x00C40128, + HWPfChaDl4gPllDivq3 = 0x00C4012C, + HWPfChaDl4gPllDivq4 = 0x00C40130, + HWPfChaDl4gPllDivq5 = 0x00C40134, + HWPfChaDl4gPllDivq6 = 0x00C40138, + HWPfChaDl4gPllDivq7 = 0x00C4013C, + HWPfChaUl5gPllPllrst = 0x00C40180, + HWPfChaUl5gPllClk0 = 0x00C40184, + HWPfChaUl5gPllClk1 = 0x00C40188, + HWPfChaUl5gPllBwadj = 0x00C4018C, + HWPfChaUl5gPllLbw = 0x00C40190, + HWPfChaUl5gPllResetq = 0x00C40194, + HWPfChaUl5gPllPhshft0 = 0x00C40198, + HWPfChaUl5gPllPhshft1 = 0x00C4019C, + HWPfChaUl5gPllDivq0 = 0x00C401A0, + HWPfChaUl5gPllDivq1 = 0x00C401A4, + HWPfChaUl5gPllDivq2 = 0x00C401A8, + HWPfChaUl5gPllDivq3 = 0x00C401AC, + HWPfChaUl5gPllDivq4 = 0x00C401B0, + HWPfChaUl5gPllDivq5 = 0x00C401B4, + HWPfChaUl5gPllDivq6 = 0x00C401B8, + HWPfChaUl5gPllDivq7 = 0x00C401BC, + HWPfChaUl4gPllPllrst = 0x00C40200, + HWPfChaUl4gPllClk0 = 0x00C40204, + HWPfChaUl4gPllClk1 = 0x00C40208, + HWPfChaUl4gPllBwadj = 0x00C4020C, + HWPfChaUl4gPllLbw = 0x00C40210, + HWPfChaUl4gPllResetq = 0x00C40214, + HWPfChaUl4gPllPhshft0 = 0x00C40218, + HWPfChaUl4gPllPhshft1 = 0x00C4021C, + HWPfChaUl4gPllDivq0 = 0x00C40220, + HWPfChaUl4gPllDivq1 = 0x00C40224, + HWPfChaUl4gPllDivq2 = 0x00C40228, + HWPfChaUl4gPllDivq3 = 0x00C4022C, + HWPfChaUl4gPllDivq4 = 0x00C40230, + HWPfChaUl4gPllDivq5 = 0x00C40234, + HWPfChaUl4gPllDivq6 = 0x00C40238, + HWPfChaUl4gPllDivq7 = 0x00C4023C, + HWPfChaDdrPllPllrst = 0x00C40280, + HWPfChaDdrPllClk0 = 0x00C40284, + HWPfChaDdrPllClk1 = 0x00C40288, + HWPfChaDdrPllBwadj = 0x00C4028C, + HWPfChaDdrPllLbw = 0x00C40290, + HWPfChaDdrPllResetq = 0x00C40294, + HWPfChaDdrPllPhshft0 = 0x00C40298, + HWPfChaDdrPllPhshft1 = 0x00C4029C, + HWPfChaDdrPllDivq0 = 0x00C402A0, + HWPfChaDdrPllDivq1 = 0x00C402A4, + HWPfChaDdrPllDivq2 = 0x00C402A8, + HWPfChaDdrPllDivq3 = 0x00C402AC, + HWPfChaDdrPllDivq4 = 0x00C402B0, + HWPfChaDdrPllDivq5 = 0x00C402B4, + HWPfChaDdrPllDivq6 = 0x00C402B8, + HWPfChaDdrPllDivq7 = 0x00C402BC, + HWPfChaErrStatus = 0x00C40400, + HWPfChaErrMask = 0x00C40404, + HWPfChaDebugPcieMsiFifo = 0x00C40410, + HWPfChaDebugDdrMsiFifo = 0x00C40414, + HWPfChaDebugMiscMsiFifo = 0x00C40418, + HWPfChaPwmSet = 0x00C40420, + HWPfChaDdrRstStatus = 0x00C40430, + HWPfChaDdrStDoneStatus = 0x00C40434, + HWPfChaDdrWbRstCfg = 0x00C40438, + HWPfChaDdrApbRstCfg = 0x00C4043C, + HWPfChaDdrPhyRstCfg = 0x00C40440, + HWPfChaDdrCpuRstCfg = 0x00C40444, + HWPfChaDdrSifRstCfg = 0x00C40448, + HWPfChaPadcfgPcomp0 = 0x00C41000, + HWPfChaPadcfgNcomp0 = 0x00C41004, + HWPfChaPadcfgOdt0 = 0x00C41008, + HWPfChaPadcfgProtect0 = 0x00C4100C, + HWPfChaPreemphasisProtect0 = 0x00C41010, + HWPfChaPreemphasisCompen0 = 0x00C41040, + HWPfChaPreemphasisOdten0 = 0x00C41044, + HWPfChaPadcfgPcomp1 = 0x00C41100, + HWPfChaPadcfgNcomp1 = 0x00C41104, + HWPfChaPadcfgOdt1 = 0x00C41108, + HWPfChaPadcfgProtect1 = 0x00C4110C, + HWPfChaPreemphasisProtect1 = 0x00C41110, + HWPfChaPreemphasisCompen1 = 0x00C41140, + HWPfChaPreemphasisOdten1 = 0x00C41144, + HWPfChaPadcfgPcomp2 = 0x00C41200, + HWPfChaPadcfgNcomp2 = 0x00C41204, + HWPfChaPadcfgOdt2 = 0x00C41208, + HWPfChaPadcfgProtect2 = 0x00C4120C, + HWPfChaPreemphasisProtect2 = 0x00C41210, + HWPfChaPreemphasisCompen2 = 0x00C41240, + HWPfChaPreemphasisOdten4 = 0x00C41444, + HWPfChaPreemphasisOdten2 = 0x00C41244, + HWPfChaPadcfgPcomp3 = 0x00C41300, + HWPfChaPadcfgNcomp3 = 0x00C41304, + HWPfChaPadcfgOdt3 = 0x00C41308, + HWPfChaPadcfgProtect3 = 0x00C4130C, + HWPfChaPreemphasisProtect3 = 0x00C41310, + 
HWPfChaPreemphasisCompen3 = 0x00C41340, + HWPfChaPreemphasisOdten3 = 0x00C41344, + HWPfChaPadcfgPcomp4 = 0x00C41400, + HWPfChaPadcfgNcomp4 = 0x00C41404, + HWPfChaPadcfgOdt4 = 0x00C41408, + HWPfChaPadcfgProtect4 = 0x00C4140C, + HWPfChaPreemphasisProtect4 = 0x00C41410, + HWPfChaPreemphasisCompen4 = 0x00C41440, + HWPfHiVfToPfDbellVf = 0x00C80000, + HWPfHiPfToVfDbellVf = 0x00C80008, + HWPfHiInfoRingBaseLoVf = 0x00C80010, + HWPfHiInfoRingBaseHiVf = 0x00C80014, + HWPfHiInfoRingPointerVf = 0x00C80018, + HWPfHiInfoRingIntWrEnVf = 0x00C80020, + HWPfHiInfoRingPf2VfWrEnVf = 0x00C80024, + HWPfHiMsixVectorMapperVf = 0x00C80060, + HWPfHiModuleVersionReg = 0x00C84000, + HWPfHiIosf2axiErrLogReg = 0x00C84004, + HWPfHiHardResetReg = 0x00C84008, + HWPfHi5GHardResetReg = 0x00C8400C, + HWPfHiInfoRingBaseLoRegPf = 0x00C84010, + HWPfHiInfoRingBaseHiRegPf = 0x00C84014, + HWPfHiInfoRingPointerRegPf = 0x00C84018, + HWPfHiInfoRingIntWrEnRegPf = 0x00C84020, + HWPfHiInfoRingVf2pfLoWrEnReg = 0x00C84024, + HWPfHiInfoRingVf2pfHiWrEnReg = 0x00C84028, + HWPfHiLogParityErrStatusReg = 0x00C8402C, + HWPfHiLogDataParityErrorVfStatusLo = 0x00C84030, + HWPfHiLogDataParityErrorVfStatusHi = 0x00C84034, + HWPfHiBlockTransmitOnErrorEn = 0x00C84038, + HWPfHiCfgMsiIntWrEnRegPf = 0x00C84040, + HWPfHiCfgMsiVf2pfLoWrEnReg = 0x00C84044, + HWPfHiCfgMsiVf2pfHighWrEnReg = 0x00C84048, + HWPfHiMsixVectorMapperPf = 0x00C84060, + HWPfHiApbWrWaitTime = 0x00C84100, + HWPfHiXCounterMaxValue = 0x00C84104, + HWPfHiPfMode = 0x00C84108, + HWPfHiClkGateHystReg = 0x00C8410C, + HWPfHiSnoopBitsReg = 0x00C84110, + HWPfHiMsiDropEnableReg = 0x00C84114, + HWPfHiMsiStatReg = 0x00C84120, + HWPfHiFifoOflStatReg = 0x00C84124, + HWPfHiHiDebugReg = 0x00C841F4, + HWPfHiDebugMemSnoopMsiFifo = 0x00C841F8, + HWPfHiDebugMemSnoopInputFifo = 0x00C841FC, + HWPfHiMsixMappingConfig = 0x00C84200, + HWPfHiJunkReg = 0x00C8FF00, + HWPfDdrUmmcVer = 0x00D00000, + HWPfDdrUmmcCap = 0x00D00010, + HWPfDdrUmmcCtrl = 0x00D00020, + HWPfDdrMpcPe = 0x00D00080, + HWPfDdrMpcPpri3 = 0x00D00090, + HWPfDdrMpcPpri2 = 0x00D000A0, + HWPfDdrMpcPpri1 = 0x00D000B0, + HWPfDdrMpcPpri0 = 0x00D000C0, + HWPfDdrMpcPrwgrpCtrl = 0x00D000D0, + HWPfDdrMpcPbw7 = 0x00D000E0, + HWPfDdrMpcPbw6 = 0x00D000F0, + HWPfDdrMpcPbw5 = 0x00D00100, + HWPfDdrMpcPbw4 = 0x00D00110, + HWPfDdrMpcPbw3 = 0x00D00120, + HWPfDdrMpcPbw2 = 0x00D00130, + HWPfDdrMpcPbw1 = 0x00D00140, + HWPfDdrMpcPbw0 = 0x00D00150, + HWPfDdrMemoryInit = 0x00D00200, + HWPfDdrMemoryInitDone = 0x00D00210, + HWPfDdrMemInitPhyTrng0 = 0x00D00240, + HWPfDdrMemInitPhyTrng1 = 0x00D00250, + HWPfDdrMemInitPhyTrng2 = 0x00D00260, + HWPfDdrMemInitPhyTrng3 = 0x00D00270, + HWPfDdrBcDram = 0x00D003C0, + HWPfDdrBcAddrMap = 0x00D003D0, + HWPfDdrBcRef = 0x00D003E0, + HWPfDdrBcTim0 = 0x00D00400, + HWPfDdrBcTim1 = 0x00D00410, + HWPfDdrBcTim2 = 0x00D00420, + HWPfDdrBcTim3 = 0x00D00430, + HWPfDdrBcTim4 = 0x00D00440, + HWPfDdrBcTim5 = 0x00D00450, + HWPfDdrBcTim6 = 0x00D00460, + HWPfDdrBcTim7 = 0x00D00470, + HWPfDdrBcTim8 = 0x00D00480, + HWPfDdrBcTim9 = 0x00D00490, + HWPfDdrBcTim10 = 0x00D004A0, + HWPfDdrBcTim12 = 0x00D004C0, + HWPfDdrDfiInit = 0x00D004D0, + HWPfDdrDfiInitComplete = 0x00D004E0, + HWPfDdrDfiTim0 = 0x00D004F0, + HWPfDdrDfiTim1 = 0x00D00500, + HWPfDdrDfiPhyUpdEn = 0x00D00530, + HWPfDdrMemStatus = 0x00D00540, + HWPfDdrUmmcErrStatus = 0x00D00550, + HWPfDdrUmmcIntStatus = 0x00D00560, + HWPfDdrUmmcIntEn = 0x00D00570, + HWPfDdrPhyRdLatency = 0x00D48400, + HWPfDdrPhyRdLatencyDbi = 0x00D48410, + HWPfDdrPhyWrLatency = 0x00D48420, + HWPfDdrPhyTrngType = 0x00D48430, + 
HWPfDdrPhyMrsTiming2 = 0x00D48440, + HWPfDdrPhyMrsTiming0 = 0x00D48450, + HWPfDdrPhyMrsTiming1 = 0x00D48460, + HWPfDdrPhyDramTmrd = 0x00D48470, + HWPfDdrPhyDramTmod = 0x00D48480, + HWPfDdrPhyDramTwpre = 0x00D48490, + HWPfDdrPhyDramTrfc = 0x00D484A0, + HWPfDdrPhyDramTrwtp = 0x00D484B0, + HWPfDdrPhyMr01Dimm = 0x00D484C0, + HWPfDdrPhyMr01DimmDbi = 0x00D484D0, + HWPfDdrPhyMr23Dimm = 0x00D484E0, + HWPfDdrPhyMr45Dimm = 0x00D484F0, + HWPfDdrPhyMr67Dimm = 0x00D48500, + HWPfDdrPhyWrlvlWwRdlvlRr = 0x00D48510, + HWPfDdrPhyOdtEn = 0x00D48520, + HWPfDdrPhyFastTrng = 0x00D48530, + HWPfDdrPhyDynTrngGap = 0x00D48540, + HWPfDdrPhyDynRcalGap = 0x00D48550, + HWPfDdrPhyIdletimeout = 0x00D48560, + HWPfDdrPhyRstCkeGap = 0x00D48570, + HWPfDdrPhyCkeMrsGap = 0x00D48580, + HWPfDdrPhyMemVrefMidVal = 0x00D48590, + HWPfDdrPhyVrefStep = 0x00D485A0, + HWPfDdrPhyVrefThreshold = 0x00D485B0, + HWPfDdrPhyPhyVrefMidVal = 0x00D485C0, + HWPfDdrPhyDqsCountMax = 0x00D485D0, + HWPfDdrPhyDqsCountNum = 0x00D485E0, + HWPfDdrPhyDramRow = 0x00D485F0, + HWPfDdrPhyDramCol = 0x00D48600, + HWPfDdrPhyDramBgBa = 0x00D48610, + HWPfDdrPhyDynamicUpdreqrel = 0x00D48620, + HWPfDdrPhyVrefLimits = 0x00D48630, + HWPfDdrPhyIdtmTcStatus = 0x00D6C020, + HWPfDdrPhyIdtmFwVersion = 0x00D6C410, + HWPfDdrPhyRdlvlGateInitDelay = 0x00D70000, + HWPfDdrPhyRdenSmplabc = 0x00D70008, + HWPfDdrPhyVrefNibble0 = 0x00D7000C, + HWPfDdrPhyVrefNibble1 = 0x00D70010, + HWPfDdrPhyRdlvlGateDqsSmpl0 = 0x00D70014, + HWPfDdrPhyRdlvlGateDqsSmpl1 = 0x00D70018, + HWPfDdrPhyRdlvlGateDqsSmpl2 = 0x00D7001C, + HWPfDdrPhyDqsCount = 0x00D70020, + HWPfDdrPhyWrlvlRdlvlGateStatus = 0x00D70024, + HWPfDdrPhyErrorFlags = 0x00D70028, + HWPfDdrPhyPowerDown = 0x00D70030, + HWPfDdrPhyPrbsSeedByte0 = 0x00D70034, + HWPfDdrPhyPrbsSeedByte1 = 0x00D70038, + HWPfDdrPhyPcompDq = 0x00D70040, + HWPfDdrPhyNcompDq = 0x00D70044, + HWPfDdrPhyPcompDqs = 0x00D70048, + HWPfDdrPhyNcompDqs = 0x00D7004C, + HWPfDdrPhyPcompCmd = 0x00D70050, + HWPfDdrPhyNcompCmd = 0x00D70054, + HWPfDdrPhyPcompCk = 0x00D70058, + HWPfDdrPhyNcompCk = 0x00D7005C, + HWPfDdrPhyRcalOdtDq = 0x00D70060, + HWPfDdrPhyRcalOdtDqs = 0x00D70064, + HWPfDdrPhyRcalMask1 = 0x00D70068, + HWPfDdrPhyRcalMask2 = 0x00D7006C, + HWPfDdrPhyRcalCtrl = 0x00D70070, + HWPfDdrPhyRcalCnt = 0x00D70074, + HWPfDdrPhyRcalOverride = 0x00D70078, + HWPfDdrPhyRcalGateen = 0x00D7007C, + HWPfDdrPhyCtrl = 0x00D70080, + HWPfDdrPhyWrlvlAlg = 0x00D70084, + HWPfDdrPhyRcalVreftTxcmdOdt = 0x00D70088, + HWPfDdrPhyRdlvlGateParam = 0x00D7008C, + HWPfDdrPhyRdlvlGateParam2 = 0x00D70090, + HWPfDdrPhyRcalVreftTxdata = 0x00D70094, + HWPfDdrPhyCmdIntDelay = 0x00D700A4, + HWPfDdrPhyAlertN = 0x00D700A8, + HWPfDdrPhyTrngReqWpre2tck = 0x00D700AC, + HWPfDdrPhyCmdPhaseSel = 0x00D700B4, + HWPfDdrPhyCmdDcdl = 0x00D700B8, + HWPfDdrPhyCkDcdl = 0x00D700BC, + HWPfDdrPhySwTrngCtrl1 = 0x00D700C0, + HWPfDdrPhySwTrngCtrl2 = 0x00D700C4, + HWPfDdrPhyRcalPcompRden = 0x00D700C8, + HWPfDdrPhyRcalNcompRden = 0x00D700CC, + HWPfDdrPhyRcalCompen = 0x00D700D0, + HWPfDdrPhySwTrngRdqs = 0x00D700D4, + HWPfDdrPhySwTrngWdqs = 0x00D700D8, + HWPfDdrPhySwTrngRdena = 0x00D700DC, + HWPfDdrPhySwTrngRdenb = 0x00D700E0, + HWPfDdrPhySwTrngRdenc = 0x00D700E4, + HWPfDdrPhySwTrngWdq = 0x00D700E8, + HWPfDdrPhySwTrngRdq = 0x00D700EC, + HWPfDdrPhyPcfgHmValue = 0x00D700F0, + HWPfDdrPhyPcfgTimerValue = 0x00D700F4, + HWPfDdrPhyPcfgSoftwareTraining = 0x00D700F8, + HWPfDdrPhyPcfgMcStatus = 0x00D700FC, + HWPfDdrPhyWrlvlPhRank0 = 0x00D70100, + HWPfDdrPhyRdenPhRank0 = 0x00D70104, + HWPfDdrPhyRdenIntRank0 = 0x00D70108, + HWPfDdrPhyRdqsDcdlRank0 
= 0x00D7010C, + HWPfDdrPhyRdqsShadowDcdlRank0 = 0x00D70110, + HWPfDdrPhyWdqsDcdlRank0 = 0x00D70114, + HWPfDdrPhyWdmDcdlShadowRank0 = 0x00D70118, + HWPfDdrPhyWdmDcdlRank0 = 0x00D7011C, + HWPfDdrPhyDbiDcdlRank0 = 0x00D70120, + HWPfDdrPhyRdenDcdlaRank0 = 0x00D70124, + HWPfDdrPhyDbiDcdlShadowRank0 = 0x00D70128, + HWPfDdrPhyRdenDcdlbRank0 = 0x00D7012C, + HWPfDdrPhyWdqsShadowDcdlRank0 = 0x00D70130, + HWPfDdrPhyRdenDcdlcRank0 = 0x00D70134, + HWPfDdrPhyRdenShadowDcdlaRank0 = 0x00D70138, + HWPfDdrPhyWrlvlIntRank0 = 0x00D7013C, + HWPfDdrPhyRdqDcdlBit0Rank0 = 0x00D70200, + HWPfDdrPhyRdqDcdlShadowBit0Rank0 = 0x00D70204, + HWPfDdrPhyWdqDcdlBit0Rank0 = 0x00D70208, + HWPfDdrPhyWdqDcdlShadowBit0Rank0 = 0x00D7020C, + HWPfDdrPhyRdqDcdlBit1Rank0 = 0x00D70240, + HWPfDdrPhyRdqDcdlShadowBit1Rank0 = 0x00D70244, + HWPfDdrPhyWdqDcdlBit1Rank0 = 0x00D70248, + HWPfDdrPhyWdqDcdlShadowBit1Rank0 = 0x00D7024C, + HWPfDdrPhyRdqDcdlBit2Rank0 = 0x00D70280, + HWPfDdrPhyRdqDcdlShadowBit2Rank0 = 0x00D70284, + HWPfDdrPhyWdqDcdlBit2Rank0 = 0x00D70288, + HWPfDdrPhyWdqDcdlShadowBit2Rank0 = 0x00D7028C, + HWPfDdrPhyRdqDcdlBit3Rank0 = 0x00D702C0, + HWPfDdrPhyRdqDcdlShadowBit3Rank0 = 0x00D702C4, + HWPfDdrPhyWdqDcdlBit3Rank0 = 0x00D702C8, + HWPfDdrPhyWdqDcdlShadowBit3Rank0 = 0x00D702CC, + HWPfDdrPhyRdqDcdlBit4Rank0 = 0x00D70300, + HWPfDdrPhyRdqDcdlShadowBit4Rank0 = 0x00D70304, + HWPfDdrPhyWdqDcdlBit4Rank0 = 0x00D70308, + HWPfDdrPhyWdqDcdlShadowBit4Rank0 = 0x00D7030C, + HWPfDdrPhyRdqDcdlBit5Rank0 = 0x00D70340, + HWPfDdrPhyRdqDcdlShadowBit5Rank0 = 0x00D70344, + HWPfDdrPhyWdqDcdlBit5Rank0 = 0x00D70348, + HWPfDdrPhyWdqDcdlShadowBit5Rank0 = 0x00D7034C, + HWPfDdrPhyRdqDcdlBit6Rank0 = 0x00D70380, + HWPfDdrPhyRdqDcdlShadowBit6Rank0 = 0x00D70384, + HWPfDdrPhyWdqDcdlBit6Rank0 = 0x00D70388, + HWPfDdrPhyWdqDcdlShadowBit6Rank0 = 0x00D7038C, + HWPfDdrPhyRdqDcdlBit7Rank0 = 0x00D703C0, + HWPfDdrPhyRdqDcdlShadowBit7Rank0 = 0x00D703C4, + HWPfDdrPhyWdqDcdlBit7Rank0 = 0x00D703C8, + HWPfDdrPhyWdqDcdlShadowBit7Rank0 = 0x00D703CC, + HWPfDdrPhyIdtmStatus = 0x00D740D0, + HWPfDdrPhyIdtmError = 0x00D74110, + HWPfDdrPhyIdtmDebug = 0x00D74120, + HWPfDdrPhyIdtmDebugInt = 0x00D74130, + HwPfPcieLnAsicCfgovr = 0x00D80000, + HwPfPcieLnAclkmixer = 0x00D80004, + HwPfPcieLnTxrampfreq = 0x00D80008, + HwPfPcieLnLanetest = 0x00D8000C, + HwPfPcieLnDcctrl = 0x00D80010, + HwPfPcieLnDccmeas = 0x00D80014, + HwPfPcieLnDccovrAclk = 0x00D80018, + HwPfPcieLnDccovrTxa = 0x00D8001C, + HwPfPcieLnDccovrTxk = 0x00D80020, + HwPfPcieLnDccovrDclk = 0x00D80024, + HwPfPcieLnDccovrEclk = 0x00D80028, + HwPfPcieLnDcctrimAclk = 0x00D8002C, + HwPfPcieLnDcctrimTx = 0x00D80030, + HwPfPcieLnDcctrimDclk = 0x00D80034, + HwPfPcieLnDcctrimEclk = 0x00D80038, + HwPfPcieLnQuadCtrl = 0x00D8003C, + HwPfPcieLnQuadCorrIndex = 0x00D80040, + HwPfPcieLnQuadCorrStatus = 0x00D80044, + HwPfPcieLnAsicRxovr1 = 0x00D80048, + HwPfPcieLnAsicRxovr2 = 0x00D8004C, + HwPfPcieLnAsicEqinfovr = 0x00D80050, + HwPfPcieLnRxcsr = 0x00D80054, + HwPfPcieLnRxfectrl = 0x00D80058, + HwPfPcieLnRxtest = 0x00D8005C, + HwPfPcieLnEscount = 0x00D80060, + HwPfPcieLnCdrctrl = 0x00D80064, + HwPfPcieLnCdrctrl2 = 0x00D80068, + HwPfPcieLnCdrcfg0Ctrl0 = 0x00D8006C, + HwPfPcieLnCdrcfg0Ctrl1 = 0x00D80070, + HwPfPcieLnCdrcfg0Ctrl2 = 0x00D80074, + HwPfPcieLnCdrcfg1Ctrl0 = 0x00D80078, + HwPfPcieLnCdrcfg1Ctrl1 = 0x00D8007C, + HwPfPcieLnCdrcfg1Ctrl2 = 0x00D80080, + HwPfPcieLnCdrcfg2Ctrl0 = 0x00D80084, + HwPfPcieLnCdrcfg2Ctrl1 = 0x00D80088, + HwPfPcieLnCdrcfg2Ctrl2 = 0x00D8008C, + HwPfPcieLnCdrcfg3Ctrl0 = 0x00D80090, + HwPfPcieLnCdrcfg3Ctrl1 = 0x00D80094, + 
HwPfPcieLnCdrcfg3Ctrl2 = 0x00D80098, + HwPfPcieLnCdrphase = 0x00D8009C, + HwPfPcieLnCdrfreq = 0x00D800A0, + HwPfPcieLnCdrstatusPhase = 0x00D800A4, + HwPfPcieLnCdrstatusFreq = 0x00D800A8, + HwPfPcieLnCdroffset = 0x00D800AC, + HwPfPcieLnRxvosctl = 0x00D800B0, + HwPfPcieLnRxvosctl2 = 0x00D800B4, + HwPfPcieLnRxlosctl = 0x00D800B8, + HwPfPcieLnRxlos = 0x00D800BC, + HwPfPcieLnRxlosvval = 0x00D800C0, + HwPfPcieLnRxvosd0 = 0x00D800C4, + HwPfPcieLnRxvosd1 = 0x00D800C8, + HwPfPcieLnRxvosep0 = 0x00D800CC, + HwPfPcieLnRxvosep1 = 0x00D800D0, + HwPfPcieLnRxvosen0 = 0x00D800D4, + HwPfPcieLnRxvosen1 = 0x00D800D8, + HwPfPcieLnRxvosafe = 0x00D800DC, + HwPfPcieLnRxvosa0 = 0x00D800E0, + HwPfPcieLnRxvosa0Out = 0x00D800E4, + HwPfPcieLnRxvosa1 = 0x00D800E8, + HwPfPcieLnRxvosa1Out = 0x00D800EC, + HwPfPcieLnRxmisc = 0x00D800F0, + HwPfPcieLnRxbeacon = 0x00D800F4, + HwPfPcieLnRxdssout = 0x00D800F8, + HwPfPcieLnRxdssout2 = 0x00D800FC, + HwPfPcieLnAlphapctrl = 0x00D80100, + HwPfPcieLnAlphanctrl = 0x00D80104, + HwPfPcieLnAdaptctrl = 0x00D80108, + HwPfPcieLnAdaptctrl1 = 0x00D8010C, + HwPfPcieLnAdaptstatus = 0x00D80110, + HwPfPcieLnAdaptvga1 = 0x00D80114, + HwPfPcieLnAdaptvga2 = 0x00D80118, + HwPfPcieLnAdaptvga3 = 0x00D8011C, + HwPfPcieLnAdaptvga4 = 0x00D80120, + HwPfPcieLnAdaptboost1 = 0x00D80124, + HwPfPcieLnAdaptboost2 = 0x00D80128, + HwPfPcieLnAdaptboost3 = 0x00D8012C, + HwPfPcieLnAdaptboost4 = 0x00D80130, + HwPfPcieLnAdaptsslms1 = 0x00D80134, + HwPfPcieLnAdaptsslms2 = 0x00D80138, + HwPfPcieLnAdaptvgaStatus = 0x00D8013C, + HwPfPcieLnAdaptboostStatus = 0x00D80140, + HwPfPcieLnAdaptsslmsStatus1 = 0x00D80144, + HwPfPcieLnAdaptsslmsStatus2 = 0x00D80148, + HwPfPcieLnAfectrl1 = 0x00D8014C, + HwPfPcieLnAfectrl2 = 0x00D80150, + HwPfPcieLnAfectrl3 = 0x00D80154, + HwPfPcieLnAfedefault1 = 0x00D80158, + HwPfPcieLnAfedefault2 = 0x00D8015C, + HwPfPcieLnDfectrl1 = 0x00D80160, + HwPfPcieLnDfectrl2 = 0x00D80164, + HwPfPcieLnDfectrl3 = 0x00D80168, + HwPfPcieLnDfectrl4 = 0x00D8016C, + HwPfPcieLnDfectrl5 = 0x00D80170, + HwPfPcieLnDfectrl6 = 0x00D80174, + HwPfPcieLnAfestatus1 = 0x00D80178, + HwPfPcieLnAfestatus2 = 0x00D8017C, + HwPfPcieLnDfestatus1 = 0x00D80180, + HwPfPcieLnDfestatus2 = 0x00D80184, + HwPfPcieLnDfestatus3 = 0x00D80188, + HwPfPcieLnDfestatus4 = 0x00D8018C, + HwPfPcieLnDfestatus5 = 0x00D80190, + HwPfPcieLnAlphastatus = 0x00D80194, + HwPfPcieLnFomctrl1 = 0x00D80198, + HwPfPcieLnFomctrl2 = 0x00D8019C, + HwPfPcieLnFomctrl3 = 0x00D801A0, + HwPfPcieLnAclkcalStatus = 0x00D801A4, + HwPfPcieLnOffscorrStatus = 0x00D801A8, + HwPfPcieLnEyewidthStatus = 0x00D801AC, + HwPfPcieLnEyeheightStatus = 0x00D801B0, + HwPfPcieLnAsicTxovr1 = 0x00D801B4, + HwPfPcieLnAsicTxovr2 = 0x00D801B8, + HwPfPcieLnAsicTxovr3 = 0x00D801BC, + HwPfPcieLnTxbiasadjOvr = 0x00D801C0, + HwPfPcieLnTxcsr = 0x00D801C4, + HwPfPcieLnTxtest = 0x00D801C8, + HwPfPcieLnTxtestword = 0x00D801CC, + HwPfPcieLnTxtestwordHigh = 0x00D801D0, + HwPfPcieLnTxdrive = 0x00D801D4, + HwPfPcieLnMtcsLn = 0x00D801D8, + HwPfPcieLnStatsumLn = 0x00D801DC, + HwPfPcieLnRcbusScratch = 0x00D801E0, + HwPfPcieLnRcbusMinorrev = 0x00D801F0, + HwPfPcieLnRcbusMajorrev = 0x00D801F4, + HwPfPcieLnRcbusBlocktype = 0x00D801F8, + HwPfPcieSupPllcsr = 0x00D80800, + HwPfPcieSupPlldiv = 0x00D80804, + HwPfPcieSupPllcal = 0x00D80808, + HwPfPcieSupPllcalsts = 0x00D8080C, + HwPfPcieSupPllmeas = 0x00D80810, + HwPfPcieSupPlldactrim = 0x00D80814, + HwPfPcieSupPllbiastrim = 0x00D80818, + HwPfPcieSupPllbwtrim = 0x00D8081C, + HwPfPcieSupPllcaldly = 0x00D80820, + HwPfPcieSupRefclkonpclkctrl = 0x00D80824, + HwPfPcieSupPclkdelay 
= 0x00D80828, + HwPfPcieSupPhyconfig = 0x00D8082C, + HwPfPcieSupRcalIntf = 0x00D80830, + HwPfPcieSupAuxcsr = 0x00D80834, + HwPfPcieSupVref = 0x00D80838, + HwPfPcieSupLinkmode = 0x00D8083C, + HwPfPcieSupRrefcalctl = 0x00D80840, + HwPfPcieSupRrefcal = 0x00D80844, + HwPfPcieSupRrefcaldly = 0x00D80848, + HwPfPcieSupTximpcalctl = 0x00D8084C, + HwPfPcieSupTximpcal = 0x00D80850, + HwPfPcieSupTximpoffset = 0x00D80854, + HwPfPcieSupTximpcaldly = 0x00D80858, + HwPfPcieSupRximpcalctl = 0x00D8085C, + HwPfPcieSupRximpcal = 0x00D80860, + HwPfPcieSupRximpoffset = 0x00D80864, + HwPfPcieSupRximpcaldly = 0x00D80868, + HwPfPcieSupFence = 0x00D8086C, + HwPfPcieSupMtcs = 0x00D80870, + HwPfPcieSupStatsum = 0x00D809B8, + HwPfPciePcsDpStatus0 = 0x00D81000, + HwPfPciePcsDpControl0 = 0x00D81004, + HwPfPciePcsPmaStatusLane0 = 0x00D81008, + HwPfPciePcsPipeStatusLane0 = 0x00D8100C, + HwPfPciePcsTxdeemph0Lane0 = 0x00D81010, + HwPfPciePcsTxdeemph1Lane0 = 0x00D81014, + HwPfPciePcsInternalStatusLane0 = 0x00D81018, + HwPfPciePcsDpStatus1 = 0x00D8101C, + HwPfPciePcsDpControl1 = 0x00D81020, + HwPfPciePcsPmaStatusLane1 = 0x00D81024, + HwPfPciePcsPipeStatusLane1 = 0x00D81028, + HwPfPciePcsTxdeemph0Lane1 = 0x00D8102C, + HwPfPciePcsTxdeemph1Lane1 = 0x00D81030, + HwPfPciePcsInternalStatusLane1 = 0x00D81034, + HwPfPciePcsDpStatus2 = 0x00D81038, + HwPfPciePcsDpControl2 = 0x00D8103C, + HwPfPciePcsPmaStatusLane2 = 0x00D81040, + HwPfPciePcsPipeStatusLane2 = 0x00D81044, + HwPfPciePcsTxdeemph0Lane2 = 0x00D81048, + HwPfPciePcsTxdeemph1Lane2 = 0x00D8104C, + HwPfPciePcsInternalStatusLane2 = 0x00D81050, + HwPfPciePcsDpStatus3 = 0x00D81054, + HwPfPciePcsDpControl3 = 0x00D81058, + HwPfPciePcsPmaStatusLane3 = 0x00D8105C, + HwPfPciePcsPipeStatusLane3 = 0x00D81060, + HwPfPciePcsTxdeemph0Lane3 = 0x00D81064, + HwPfPciePcsTxdeemph1Lane3 = 0x00D81068, + HwPfPciePcsInternalStatusLane3 = 0x00D8106C, + HwPfPciePcsEbStatus0 = 0x00D81070, + HwPfPciePcsEbStatus1 = 0x00D81074, + HwPfPciePcsEbStatus2 = 0x00D81078, + HwPfPciePcsEbStatus3 = 0x00D8107C, + HwPfPciePcsPllSettingPcieG1 = 0x00D81088, + HwPfPciePcsPllSettingPcieG2 = 0x00D8108C, + HwPfPciePcsPllSettingPcieG3 = 0x00D81090, + HwPfPciePcsControl = 0x00D81094, + HwPfPciePcsEqControl = 0x00D81098, + HwPfPciePcsEqTimer = 0x00D8109C, + HwPfPciePcsEqErrStatus = 0x00D810A0, + HwPfPciePcsEqErrCount = 0x00D810A4, + HwPfPciePcsStatus = 0x00D810A8, + HwPfPciePcsMiscRegister = 0x00D810AC, + HwPfPciePcsObsControl = 0x00D810B0, + HwPfPciePcsPrbsCount0 = 0x00D81200, + HwPfPciePcsBistControl0 = 0x00D81204, + HwPfPciePcsBistStaticWord00 = 0x00D81208, + HwPfPciePcsBistStaticWord10 = 0x00D8120C, + HwPfPciePcsBistStaticWord20 = 0x00D81210, + HwPfPciePcsBistStaticWord30 = 0x00D81214, + HwPfPciePcsPrbsCount1 = 0x00D81220, + HwPfPciePcsBistControl1 = 0x00D81224, + HwPfPciePcsBistStaticWord01 = 0x00D81228, + HwPfPciePcsBistStaticWord11 = 0x00D8122C, + HwPfPciePcsBistStaticWord21 = 0x00D81230, + HwPfPciePcsBistStaticWord31 = 0x00D81234, + HwPfPciePcsPrbsCount2 = 0x00D81240, + HwPfPciePcsBistControl2 = 0x00D81244, + HwPfPciePcsBistStaticWord02 = 0x00D81248, + HwPfPciePcsBistStaticWord12 = 0x00D8124C, + HwPfPciePcsBistStaticWord22 = 0x00D81250, + HwPfPciePcsBistStaticWord32 = 0x00D81254, + HwPfPciePcsPrbsCount3 = 0x00D81260, + HwPfPciePcsBistControl3 = 0x00D81264, + HwPfPciePcsBistStaticWord03 = 0x00D81268, + HwPfPciePcsBistStaticWord13 = 0x00D8126C, + HwPfPciePcsBistStaticWord23 = 0x00D81270, + HwPfPciePcsBistStaticWord33 = 0x00D81274, + HwPfPcieGpexLtssmStateCntrl = 0x00D90400, + HwPfPcieGpexLtssmStateStatus = 0x00D90404, + 
HwPfPcieGpexSkipFreqTimer = 0x00D90408, + HwPfPcieGpexLaneSelect = 0x00D9040C, + HwPfPcieGpexLaneDeskew = 0x00D90410, + HwPfPcieGpexRxErrorStatus = 0x00D90414, + HwPfPcieGpexLaneNumControl = 0x00D90418, + HwPfPcieGpexNFstControl = 0x00D9041C, + HwPfPcieGpexLinkStatus = 0x00D90420, + HwPfPcieGpexAckReplayTimeout = 0x00D90438, + HwPfPcieGpexSeqNumberStatus = 0x00D9043C, + HwPfPcieGpexCoreClkRatio = 0x00D90440, + HwPfPcieGpexDllTholdControl = 0x00D90448, + HwPfPcieGpexPmTimer = 0x00D90450, + HwPfPcieGpexPmeTimeout = 0x00D90454, + HwPfPcieGpexAspmL1Timer = 0x00D90458, + HwPfPcieGpexAspmReqTimer = 0x00D9045C, + HwPfPcieGpexAspmL1Dis = 0x00D90460, + HwPfPcieGpexAdvisoryErrorControl = 0x00D90468, + HwPfPcieGpexId = 0x00D90470, + HwPfPcieGpexClasscode = 0x00D90474, + HwPfPcieGpexSubsystemId = 0x00D90478, + HwPfPcieGpexDeviceCapabilities = 0x00D9047C, + HwPfPcieGpexLinkCapabilities = 0x00D90480, + HwPfPcieGpexFunctionNumber = 0x00D90484, + HwPfPcieGpexPmCapabilities = 0x00D90488, + HwPfPcieGpexFunctionSelect = 0x00D9048C, + HwPfPcieGpexErrorCounter = 0x00D904AC, + HwPfPcieGpexConfigReady = 0x00D904B0, + HwPfPcieGpexFcUpdateTimeout = 0x00D904B8, + HwPfPcieGpexFcUpdateTimer = 0x00D904BC, + HwPfPcieGpexVcBufferLoad = 0x00D904C8, + HwPfPcieGpexVcBufferSizeThold = 0x00D904CC, + HwPfPcieGpexVcBufferSelect = 0x00D904D0, + HwPfPcieGpexBarEnable = 0x00D904D4, + HwPfPcieGpexBarDwordLower = 0x00D904D8, + HwPfPcieGpexBarDwordUpper = 0x00D904DC, + HwPfPcieGpexBarSelect = 0x00D904E0, + HwPfPcieGpexCreditCounterSelect = 0x00D904E4, + HwPfPcieGpexCreditCounterStatus = 0x00D904E8, + HwPfPcieGpexTlpHeaderSelect = 0x00D904EC, + HwPfPcieGpexTlpHeaderDword0 = 0x00D904F0, + HwPfPcieGpexTlpHeaderDword1 = 0x00D904F4, + HwPfPcieGpexTlpHeaderDword2 = 0x00D904F8, + HwPfPcieGpexTlpHeaderDword3 = 0x00D904FC, + HwPfPcieGpexRelaxOrderControl = 0x00D90500, + HwPfPcieGpexBarPrefetch = 0x00D90504, + HwPfPcieGpexFcCheckControl = 0x00D90508, + HwPfPcieGpexFcUpdateTimerTraffic = 0x00D90518, + HwPfPcieGpexPhyControl0 = 0x00D9053C, + HwPfPcieGpexPhyControl1 = 0x00D90544, + HwPfPcieGpexPhyControl2 = 0x00D9054C, + HwPfPcieGpexUserControl0 = 0x00D9055C, + HwPfPcieGpexUncorrErrorStatus = 0x00D905F0, + HwPfPcieGpexRxCplError = 0x00D90620, + HwPfPcieGpexRxCplErrorDword0 = 0x00D90624, + HwPfPcieGpexRxCplErrorDword1 = 0x00D90628, + HwPfPcieGpexRxCplErrorDword2 = 0x00D9062C, + HwPfPcieGpexPabSwResetEn = 0x00D90630, + HwPfPcieGpexGen3Control0 = 0x00D90634, + HwPfPcieGpexGen3Control1 = 0x00D90638, + HwPfPcieGpexGen3Control2 = 0x00D9063C, + HwPfPcieGpexGen2ControlCsr = 0x00D90640, + HwPfPcieGpexTotalVfInitialVf0 = 0x00D90644, + HwPfPcieGpexTotalVfInitialVf1 = 0x00D90648, + HwPfPcieGpexSriovLinkDevId0 = 0x00D90684, + HwPfPcieGpexSriovLinkDevId1 = 0x00D90688, + HwPfPcieGpexSriovPageSize0 = 0x00D906C4, + HwPfPcieGpexSriovPageSize1 = 0x00D906C8, + HwPfPcieGpexIdVersion = 0x00D906FC, + HwPfPcieGpexSriovVfOffsetStride0 = 0x00D90704, + HwPfPcieGpexSriovVfOffsetStride1 = 0x00D90708, + HwPfPcieGpexGen3DeskewControl = 0x00D907B4, + HwPfPcieGpexGen3EqControl = 0x00D907B8, + HwPfPcieGpexBridgeVersion = 0x00D90800, + HwPfPcieGpexBridgeCapability = 0x00D90804, + HwPfPcieGpexBridgeControl = 0x00D90808, + HwPfPcieGpexBridgeStatus = 0x00D9080C, + HwPfPcieGpexEngineActivityStatus = 0x00D9081C, + HwPfPcieGpexEngineResetControl = 0x00D90820, + HwPfPcieGpexAxiPioControl = 0x00D90840, + HwPfPcieGpexAxiPioStatus = 0x00D90844, + HwPfPcieGpexAmbaSlaveCmdStatus = 0x00D90848, + HwPfPcieGpexPexPioControl = 0x00D908C0, + HwPfPcieGpexPexPioStatus = 0x00D908C4, + 
HwPfPcieGpexAmbaMasterStatus = 0x00D908C8, + HwPfPcieGpexCsrSlaveCmdStatus = 0x00D90920, + HwPfPcieGpexMailboxAxiControl = 0x00D90A50, + HwPfPcieGpexMailboxAxiData = 0x00D90A54, + HwPfPcieGpexMailboxPexControl = 0x00D90A90, + HwPfPcieGpexMailboxPexData = 0x00D90A94, + HwPfPcieGpexPexInterruptEnable = 0x00D90AD0, + HwPfPcieGpexPexInterruptStatus = 0x00D90AD4, + HwPfPcieGpexPexInterruptAxiPioVector = 0x00D90AD8, + HwPfPcieGpexPexInterruptPexPioVector = 0x00D90AE0, + HwPfPcieGpexPexInterruptMiscVector = 0x00D90AF8, + HwPfPcieGpexAmbaInterruptPioEnable = 0x00D90B00, + HwPfPcieGpexAmbaInterruptMiscEnable = 0x00D90B0C, + HwPfPcieGpexAmbaInterruptPioStatus = 0x00D90B10, + HwPfPcieGpexAmbaInterruptMiscStatus = 0x00D90B1C, + HwPfPcieGpexPexPmControl = 0x00D90B80, + HwPfPcieGpexSlotMisc = 0x00D90B88, + HwPfPcieGpexAxiAddrMappingControl = 0x00D90BA0, + HwPfPcieGpexAxiAddrMappingWindowAxiBase = 0x00D90BA4, + HwPfPcieGpexAxiAddrMappingWindowPexBaseLow = 0x00D90BA8, + HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh = 0x00D90BAC, + HwPfPcieGpexPexBarAddrFunc0Bar0 = 0x00D91BA0, + HwPfPcieGpexPexBarAddrFunc0Bar1 = 0x00D91BA4, + HwPfPcieGpexAxiAddrMappingPcieHdrParam = 0x00D95BA0, + HwPfPcieGpexExtAxiAddrMappingAxiBase = 0x00D980A0, + HwPfPcieGpexPexExtBarAddrFunc0Bar0 = 0x00D984A0, + HwPfPcieGpexPexExtBarAddrFunc0Bar1 = 0x00D984A4, + HwPfPcieGpexAmbaInterruptFlrEnable = 0x00D9B960, + HwPfPcieGpexAmbaInterruptFlrStatus = 0x00D9B9A0, + HwPfPcieGpexExtAxiAddrMappingSize = 0x00D9BAF0, + HwPfPcieGpexPexPioAwcacheControl = 0x00D9C300, + HwPfPcieGpexPexPioArcacheControl = 0x00D9C304, + HwPfPcieGpexPabObSizeControlVc0 = 0x00D9C310 +}; + +/* TIP PF Interrupt numbers */ +enum { + ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0, + ACC100_PF_INT_DOORBELL_VF_2_PF = 1, + ACC100_PF_INT_DMA_DL_DESC_IRQ = 2, + ACC100_PF_INT_DMA_UL_DESC_IRQ = 3, + ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4, + ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5, + ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6, + ACC100_PF_INT_ILLEGAL_FORMAT = 7, + ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8, + ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9, + ACC100_PF_INT_ARAM_ACCESS_ERR = 10, + ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11, + ACC100_PF_INT_PARITY_ERR = 12, + ACC100_PF_INT_QMGR_ERR = 13, + ACC100_PF_INT_INT_REQ_OVERFLOW = 14, + ACC100_PF_INT_APB_TIMEOUT = 15, +}; + +#endif /* ACC100_PF_ENUM_H */ diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h new file mode 100644 index 0000000..b512af3 --- /dev/null +++ b/drivers/baseband/acc100/acc100_vf_enum.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2017 Intel Corporation + */ + +#ifndef ACC100_VF_ENUM_H +#define ACC100_VF_ENUM_H + +/* + * ACC100 Register mapping on VF BAR0 + * This is automatically generated from RDL, format may change with new RDL + */ +enum { + HWVfQmgrIngressAq = 0x00000000, + HWVfHiVfToPfDbellVf = 0x00000800, + HWVfHiPfToVfDbellVf = 0x00000808, + HWVfHiInfoRingBaseLoVf = 0x00000810, + HWVfHiInfoRingBaseHiVf = 0x00000814, + HWVfHiInfoRingPointerVf = 0x00000818, + HWVfHiInfoRingIntWrEnVf = 0x00000820, + HWVfHiInfoRingPf2VfWrEnVf = 0x00000824, + HWVfHiMsixVectorMapperVf = 0x00000860, + HWVfDmaFec5GulDescBaseLoRegVf = 0x00000920, + HWVfDmaFec5GulDescBaseHiRegVf = 0x00000924, + HWVfDmaFec5GulRespPtrLoRegVf = 0x00000928, + HWVfDmaFec5GulRespPtrHiRegVf = 0x0000092C, + HWVfDmaFec5GdlDescBaseLoRegVf = 0x00000940, + HWVfDmaFec5GdlDescBaseHiRegVf = 0x00000944, + HWVfDmaFec5GdlRespPtrLoRegVf = 0x00000948, + HWVfDmaFec5GdlRespPtrHiRegVf = 0x0000094C, + HWVfDmaFec4GulDescBaseLoRegVf = 
0x00000960, + HWVfDmaFec4GulDescBaseHiRegVf = 0x00000964, + HWVfDmaFec4GulRespPtrLoRegVf = 0x00000968, + HWVfDmaFec4GulRespPtrHiRegVf = 0x0000096C, + HWVfDmaFec4GdlDescBaseLoRegVf = 0x00000980, + HWVfDmaFec4GdlDescBaseHiRegVf = 0x00000984, + HWVfDmaFec4GdlRespPtrLoRegVf = 0x00000988, + HWVfDmaFec4GdlRespPtrHiRegVf = 0x0000098C, + HWVfDmaDdrBaseRangeRoVf = 0x000009A0, + HWVfQmgrAqResetVf = 0x00000E00, + HWVfQmgrRingSizeVf = 0x00000E04, + HWVfQmgrGrpDepthLog20Vf = 0x00000E08, + HWVfQmgrGrpDepthLog21Vf = 0x00000E0C, + HWVfQmgrGrpFunction0Vf = 0x00000E10, + HWVfQmgrGrpFunction1Vf = 0x00000E14, + HWVfPmACntrlRegVf = 0x00000F40, + HWVfPmACountVf = 0x00000F48, + HWVfPmAKCntLoVf = 0x00000F50, + HWVfPmAKCntHiVf = 0x00000F54, + HWVfPmADeltaCntLoVf = 0x00000F60, + HWVfPmADeltaCntHiVf = 0x00000F64, + HWVfPmBCntrlRegVf = 0x00000F80, + HWVfPmBCountVf = 0x00000F88, + HWVfPmBKCntLoVf = 0x00000F90, + HWVfPmBKCntHiVf = 0x00000F94, + HWVfPmBDeltaCntLoVf = 0x00000FA0, + HWVfPmBDeltaCntHiVf = 0x00000FA4 +}; + +/* TIP VF Interrupt numbers */ +enum { + ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0, + ACC100_VF_INT_DOORBELL_VF_2_PF = 1, + ACC100_VF_INT_DMA_DL_DESC_IRQ = 2, + ACC100_VF_INT_DMA_UL_DESC_IRQ = 3, + ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4, + ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5, + ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6, + ACC100_VF_INT_ILLEGAL_FORMAT = 7, + ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8, + ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9, +}; + +#endif /* ACC100_VF_ENUM_H */ diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h index 6f46df0..cd77570 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.h +++ b/drivers/baseband/acc100/rte_acc100_pmd.h @@ -5,6 +5,9 @@ #ifndef _RTE_ACC100_PMD_H_ #define _RTE_ACC100_PMD_H_ +#include "acc100_pf_enum.h" +#include "acc100_vf_enum.h" + /* Helper macro for logging */ #define rte_bbdev_log(level, fmt, ...) 
\ rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \ @@ -27,6 +30,493 @@ #define RTE_ACC100_PF_DEVICE_ID (0x0d5c) #define RTE_ACC100_VF_DEVICE_ID (0x0d5d) +/* Define as 1 to use only a single FEC engine */ +#ifndef RTE_ACC100_SINGLE_FEC +#define RTE_ACC100_SINGLE_FEC 0 +#endif + +/* Values used in filling in descriptors */ +#define ACC100_DMA_DESC_TYPE 2 +#define ACC100_DMA_CODE_BLK_MODE 0 +#define ACC100_DMA_BLKID_FCW 1 +#define ACC100_DMA_BLKID_IN 2 +#define ACC100_DMA_BLKID_OUT_ENC 1 +#define ACC100_DMA_BLKID_OUT_HARD 1 +#define ACC100_DMA_BLKID_OUT_SOFT 2 +#define ACC100_DMA_BLKID_OUT_HARQ 3 +#define ACC100_DMA_BLKID_IN_HARQ 3 + +/* Values used in filling in decode FCWs */ +#define ACC100_FCW_TD_VER 1 +#define ACC100_FCW_TD_EXT_COLD_REG_EN 1 +#define ACC100_FCW_TD_AUTOMAP 0x0f +#define ACC100_FCW_TD_RVIDX_0 2 +#define ACC100_FCW_TD_RVIDX_1 26 +#define ACC100_FCW_TD_RVIDX_2 50 +#define ACC100_FCW_TD_RVIDX_3 74 + +/* Values used in writing to the registers */ +#define ACC100_REG_IRQ_EN_ALL 0x1FF83FF /* Enable all interrupts */ + +/* ACC100 Specific Dimensioning */ +#define ACC100_SIZE_64MBYTE (64*1024*1024) +/* Number of elements in an Info Ring */ +#define ACC100_INFO_RING_NUM_ENTRIES 1024 +/* Number of elements in HARQ layout memory */ +#define ACC100_HARQ_LAYOUT (64*1024*1024) +/* Assume offset for HARQ in memory */ +#define ACC100_HARQ_OFFSET (32*1024) +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */ +#define ACC100_INFO_RING_MASK (ACC100_INFO_RING_NUM_ENTRIES-1) +/* Number of Virtual Functions ACC100 supports */ +#define ACC100_NUM_VFS 16 +#define ACC100_NUM_QGRPS 8 +#define ACC100_NUM_QGRPS_PER_WORD 8 +#define ACC100_NUM_AQS 16 +#define MAX_ENQ_BATCH_SIZE 255 +/* All ACC100 register alignments are 32 bits = 4B */ +#define BYTES_IN_WORD 4 +#define MAX_E_MBUF 64000 + +#define GRP_ID_SHIFT 10 /* Queue Index Hierarchy */ +#define VF_ID_SHIFT 4 /* Queue Index Hierarchy */ +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */ +#define TMPL_PRI_0 0x03020100 +#define TMPL_PRI_1 0x07060504 +#define TMPL_PRI_2 0x0b0a0908 +#define TMPL_PRI_3 0x0f0e0d0c +#define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */ +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4) + +#define ACC100_NUM_TMPL 32 +/* Mapping of signals for the available engines */ +#define SIG_UL_5G 0 +#define SIG_UL_5G_LAST 7 +#define SIG_DL_5G 13 +#define SIG_DL_5G_LAST 15 +#define SIG_UL_4G 16 +#define SIG_UL_4G_LAST 21 +#define SIG_DL_4G 27 +#define SIG_DL_4G_LAST 31 + +/* max number of attempts to allocate memory block for all rings */ +#define SW_RING_MEM_ALLOC_ATTEMPTS 5 +#define MAX_QUEUE_DEPTH 1024 +#define ACC100_DMA_MAX_NUM_POINTERS 14 +#define ACC100_DMA_DESC_PADDING 8 +#define ACC100_FCW_PADDING 12 +#define ACC100_DESC_FCW_OFFSET 192 +#define ACC100_DESC_SIZE 256 +#define ACC100_DESC_OFFSET (ACC100_DESC_SIZE / 64) +#define ACC100_FCW_TE_BLEN 32 +#define ACC100_FCW_TD_BLEN 24 +#define ACC100_FCW_LE_BLEN 32 +#define ACC100_FCW_LD_BLEN 36 + +#define ACC100_FCW_VER 2 +#define MUX_5GDL_DESC 6 +#define CMP_ENC_SIZE 20 +#define CMP_DEC_SIZE 24 +#define ENC_OFFSET (32) +#define DEC_OFFSET (80) +#define ACC100_EXT_MEM +#define ACC100_HARQ_OFFSET_THRESHOLD 1024 + +/* Constants for K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */ +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */ +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */ +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */ +#define
K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */ +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */ +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */ +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */ +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */ + +/* ACC100 Configuration */ +#define ACC100_DDR_ECC_ENABLE +#define ACC100_CFG_DMA_ERROR 0x3D7 +#define ACC100_CFG_AXI_CACHE 0x11 +#define ACC100_CFG_QMGR_HI_P 0x0F0F +#define ACC100_CFG_PCI_AXI 0xC003 +#define ACC100_CFG_PCI_BRIDGE 0x40006033 +#define ACC100_ENGINE_OFFSET 0x1000 +#define ACC100_RESET_HI 0x20100 +#define ACC100_RESET_LO 0x20000 +#define ACC100_RESET_HARD 0x1FF +#define ACC100_ENGINES_MAX 9 +#define LONG_WAIT 1000 + +/* ACC100 DMA Descriptor triplet */ +struct acc100_dma_triplet { + uint64_t address; + uint32_t blen:20, + res0:4, + last:1, + dma_ext:1, + res1:2, + blkid:4; +} __rte_packed; + + + +/* ACC100 DMA Response Descriptor */ +union acc100_dma_rsp_desc { + uint32_t val; + struct { + uint32_t crc_status:1, + synd_ok:1, + dma_err:1, + neg_stop:1, + fcw_err:1, + output_err:1, + input_err:1, + timestampEn:1, + iterCountFrac:8, + iter_cnt:8, + rsrvd3:6, + sdone:1, + fdone:1; + uint32_t add_info_0; + uint32_t add_info_1; + }; +}; + + +/* ACC100 Queue Manager Enqueue PCI Register */ +union acc100_enqueue_reg_fmt { + uint32_t val; + struct { + uint32_t num_elem:8, + addr_offset:3, + rsrvd:1, + req_elem_addr:20; + }; +}; + +/* FEC 4G Uplink Frame Control Word */ +struct __rte_packed acc100_fcw_td { + uint8_t fcw_ver:4, + num_maps:4; /* Unused */ + uint8_t filler:6, /* Unused */ + rsrvd0:1, + bypass_sb_deint:1; + uint16_t k_pos; + uint16_t k_neg; /* Unused */ + uint8_t c_neg; /* Unused */ + uint8_t c; /* Unused */ + uint32_t ea; /* Unused */ + uint32_t eb; /* Unused */ + uint8_t cab; /* Unused */ + uint8_t k0_start_col; /* Unused */ + uint8_t rsrvd1; + uint8_t code_block_mode:1, /* Unused */ + turbo_crc_type:1, + rsrvd2:3, + bypass_teq:1, /* Unused */ + soft_output_en:1, /* Unused */ + ext_td_cold_reg_en:1; + union { /* External Cold register */ + uint32_t ext_td_cold_reg; + struct { + uint32_t min_iter:4, /* Unused */ + max_iter:4, + ext_scale:5, /* Unused */ + rsrvd3:3, + early_stop_en:1, /* Unused */ + sw_soft_out_dis:1, /* Unused */ + sw_et_cont:1, /* Unused */ + sw_soft_out_saturation:1, /* Unused */ + half_iter_on:1, /* Unused */ + raw_decoder_input_on:1, /* Unused */ + rsrvd4:10; + }; + }; +}; + +/* FEC 5GNR Uplink Frame Control Word */ +struct __rte_packed acc100_fcw_ld { + uint32_t FCWversion:4, + qm:4, + nfiller:11, + BG:1, + Zc:9, + res0:1, + synd_precoder:1, + synd_post:1; + uint32_t ncb:16, + k0:16; + uint32_t rm_e:24, + hcin_en:1, + hcout_en:1, + crc_select:1, + bypass_dec:1, + bypass_intlv:1, + so_en:1, + so_bypass_rm:1, + so_bypass_intlv:1; + uint32_t hcin_offset:16, + hcin_size0:16; + uint32_t hcin_size1:16, + hcin_decomp_mode:3, + llr_pack_mode:1, + hcout_comp_mode:3, + res2:1, + dec_convllr:4, + hcout_convllr:4; + uint32_t itmax:7, + itstop:1, + so_it:7, + res3:1, + hcout_offset:16; + uint32_t hcout_size0:16, + hcout_size1:16; + uint32_t gain_i:8, + gain_h:8, + negstop_th:16; + uint32_t negstop_it:7, + negstop_en:1, + res4:24; +}; + +/* FEC 4G Downlink Frame Control Word */ +struct __rte_packed acc100_fcw_te { + uint16_t k_neg; + uint16_t k_pos; + uint8_t c_neg; + uint8_t c; + uint8_t filler; + uint8_t cab; + uint32_t ea:17, + rsrvd0:15; + uint32_t eb:17, + rsrvd1:15; + uint16_t ncb_neg; + uint16_t ncb_pos; + uint8_t rv_idx0:2, + rsrvd2:2, + 
rv_idx1:2, + rsrvd3:2; + uint8_t bypass_rv_idx0:1, + bypass_rv_idx1:1, + bypass_rm:1, + rsrvd4:5; + uint8_t rsrvd5:1, + rsrvd6:3, + code_block_crc:1, + rsrvd7:3; + uint8_t code_block_mode:1, + rsrvd8:7; + uint64_t rsrvd9; +}; + +/* FEC 5GNR Downlink Frame Control Word */ +struct __rte_packed acc100_fcw_le { + uint32_t FCWversion:4, + qm:4, + nfiller:11, + BG:1, + Zc:9, + res0:3; + uint32_t ncb:16, + k0:16; + uint32_t rm_e:24, + res1:2, + crc_select:1, + res2:1, + bypass_intlv:1, + res3:3; + uint32_t res4_a:12, + mcb_count:3, + res4_b:17; + uint32_t res5; + uint32_t res6; + uint32_t res7; + uint32_t res8; +}; + +/* ACC100 DMA Request Descriptor */ +struct __rte_packed acc100_dma_req_desc { + union { + struct{ + uint32_t type:4, + rsrvd0:26, + sdone:1, + fdone:1; + uint32_t rsrvd1; + uint32_t rsrvd2; + uint32_t pass_param:8, + sdone_enable:1, + irq_enable:1, + timeStampEn:1, + res0:5, + numCBs:4, + res1:4, + m2dlen:4, + d2mlen:4; + }; + struct{ + uint32_t word0; + uint32_t word1; + uint32_t word2; + uint32_t word3; + }; + }; + struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS]; + + /* Virtual addresses used to retrieve SW context info */ + union { + void *op_addr; + uint64_t pad1; /* pad to 64 bits */ + }; + /* + * Stores additional information needed for driver processing: + * - last_desc_in_batch - flag used to mark last descriptor (CB) + * in batch + * - cbs_in_tb - stores information about total number of Code Blocks + * in currently processed Transport Block + */ + union { + struct { + union { + struct acc100_fcw_ld fcw_ld; + struct acc100_fcw_td fcw_td; + struct acc100_fcw_le fcw_le; + struct acc100_fcw_te fcw_te; + uint32_t pad2[ACC100_FCW_PADDING]; + }; + uint32_t last_desc_in_batch :8, + cbs_in_tb:8, + pad4 : 16; + }; + uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */ + }; +}; + +/* ACC100 DMA Descriptor */ +union acc100_dma_desc { + struct acc100_dma_req_desc req; + union acc100_dma_rsp_desc rsp; +}; + + +/* Union describing Info Ring entry */ +union acc100_harq_layout_data { + uint32_t val; + struct { + uint16_t offset; + uint16_t size0; + }; +} __rte_packed; + + +/* Union describing Info Ring entry */ +union acc100_info_ring_data { + uint32_t val; + struct { + union { + uint16_t detailed_info; + struct { + uint16_t aq_id: 4; + uint16_t qg_id: 4; + uint16_t vf_id: 6; + uint16_t reserved: 2; + }; + }; + uint16_t int_nb: 7; + uint16_t msi_0: 1; + uint16_t vf2pf: 6; + uint16_t loop: 1; + uint16_t valid: 1; + }; +} __rte_packed; + +struct acc100_registry_addr { + unsigned int dma_ring_dl5g_hi; + unsigned int dma_ring_dl5g_lo; + unsigned int dma_ring_ul5g_hi; + unsigned int dma_ring_ul5g_lo; + unsigned int dma_ring_dl4g_hi; + unsigned int dma_ring_dl4g_lo; + unsigned int dma_ring_ul4g_hi; + unsigned int dma_ring_ul4g_lo; + unsigned int ring_size; + unsigned int info_ring_hi; + unsigned int info_ring_lo; + unsigned int info_ring_en; + unsigned int info_ring_ptr; + unsigned int tail_ptrs_dl5g_hi; + unsigned int tail_ptrs_dl5g_lo; + unsigned int tail_ptrs_ul5g_hi; + unsigned int tail_ptrs_ul5g_lo; + unsigned int tail_ptrs_dl4g_hi; + unsigned int tail_ptrs_dl4g_lo; + unsigned int tail_ptrs_ul4g_hi; + unsigned int tail_ptrs_ul4g_lo; + unsigned int depth_log0_offset; + unsigned int depth_log1_offset; + unsigned int qman_group_func; + unsigned int ddr_range; +}; + +/* Structure holding registry addresses for PF */ +static const struct acc100_registry_addr pf_reg_addr = { + .dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf, + .dma_ring_dl5g_lo = 
HWPfDmaFec5GdlDescBaseLoRegVf, + .dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf, + .dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf, + .dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf, + .dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf, + .dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf, + .dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf, + .ring_size = HWPfQmgrRingSizeVf, + .info_ring_hi = HWPfHiInfoRingBaseHiRegPf, + .info_ring_lo = HWPfHiInfoRingBaseLoRegPf, + .info_ring_en = HWPfHiInfoRingIntWrEnRegPf, + .info_ring_ptr = HWPfHiInfoRingPointerRegPf, + .tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf, + .tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf, + .tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf, + .tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf, + .tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf, + .tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf, + .tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf, + .tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf, + .depth_log0_offset = HWPfQmgrGrpDepthLog20Vf, + .depth_log1_offset = HWPfQmgrGrpDepthLog21Vf, + .qman_group_func = HWPfQmgrGrpFunction0, + .ddr_range = HWPfDmaVfDdrBaseRw, +}; + +/* Structure holding registry addresses for VF */ +static const struct acc100_registry_addr vf_reg_addr = { + .dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf, + .dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf, + .dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf, + .dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf, + .dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf, + .dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf, + .dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf, + .dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf, + .ring_size = HWVfQmgrRingSizeVf, + .info_ring_hi = HWVfHiInfoRingBaseHiVf, + .info_ring_lo = HWVfHiInfoRingBaseLoVf, + .info_ring_en = HWVfHiInfoRingIntWrEnVf, + .info_ring_ptr = HWVfHiInfoRingPointerVf, + .tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf, + .tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf, + .tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf, + .tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf, + .tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf, + .tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf, + .tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf, + .tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf, + .depth_log0_offset = HWVfQmgrGrpDepthLog20Vf, + .depth_log1_offset = HWVfQmgrGrpDepthLog21Vf, + .qman_group_func = HWVfQmgrGrpFunction0Vf, + .ddr_range = HWVfDmaDdrBaseRangeRoVf, +}; + /* Private data structure for each ACC100 device */ struct acc100_device { void *mmio_base; /**< Base address of MMIO registers (BAR0) */ From patchwork Fri Sep 4 17:53:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76576 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C6700A04C5; Fri, 4 Sep 2020 19:58:29 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id A13D61C0D2; Fri, 4 Sep 2020 19:58:13 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id A07D01C0B4 for ; Fri, 4 Sep 2020 19:58:09 +0200 (CEST) IronPort-SDR: ENmEMggeXNxUdbIgyXSGZORSucoVC7dKv1Vsf8xVXI6vLyrJ91W81NpWA8+mb+g2Ni1X0ZUtxg SFo+zfa6spAQ== X-IronPort-AV: 
E=McAfee;i="6000,8403,9734"; a="242614834" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614834" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: AbEi3BNRZQgKJ1NCyK1ns3Nr5S9X17JdOBTUGbcxoZFmHn1LO+jp627JgjJ0z/XussT8hfnrxq JKEkNnDvPiJA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765383" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:53:59 -0700 Message-Id: <1599242047-58232-4-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add the "info_get" function to the driver, to allow us to query the device. No processing capabilities are available yet. Link bbdev-test to support the PMD with a null capability list. Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- app/test-bbdev/Makefile | 3 + app/test-bbdev/meson.build | 3 + drivers/baseband/acc100/rte_acc100_cfg.h | 96 +++++++++++++ drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++ drivers/baseband/acc100/rte_acc100_pmd.h | 3 + 5 files changed, 330 insertions(+) create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile index dc29557..dbc3437 100644 --- a/app/test-bbdev/Makefile +++ b/app/test-bbdev/Makefile @@ -26,5 +26,8 @@ endif ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC),y) LDLIBS += -lrte_pmd_bbdev_fpga_5gnr_fec endif +ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100),y) +LDLIBS += -lrte_pmd_bbdev_acc100 +endif include $(RTE_SDK)/mk/rte.app.mk diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build index 18ab6a8..fbd8ae3 100644 --- a/app/test-bbdev/meson.build +++ b/app/test-bbdev/meson.build @@ -12,3 +12,6 @@ endif if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC') deps += ['pmd_bbdev_fpga_5gnr_fec'] endif +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100') + deps += ['pmd_bbdev_acc100'] +endif \ No newline at end of file diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h new file mode 100644 index 0000000..73bbe36 --- /dev/null +++ b/drivers/baseband/acc100/rte_acc100_cfg.h @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_ACC100_CFG_H_ +#define _RTE_ACC100_CFG_H_ + +/** + * @file rte_acc100_cfg.h + * + * Functions for configuring ACC100 HW, exposed directly to applications. + * Configuration related to encoding/decoding is done through the + * librte_bbdev library.
+ * + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + */ + +#include <stdint.h> +#include <stdbool.h> + +#ifdef __cplusplus extern "C" { +#endif +/**< Number of Virtual Functions ACC100 supports */ +#define RTE_ACC100_NUM_VFS 16 + +/** + * Definition of Queue Topology for ACC100 Configuration + * Some level of detail is abstracted out to expose a clean interface + * given that comprehensive flexibility is not required + */ +struct rte_q_topology_t { + /** Number of QGroups in incremental order of priority */ + uint16_t num_qgroups; + /** + * All QGroups have the same number of AQs here. + * Note : Could be made a 16-array if more flexibility is really + * required + */ + uint16_t num_aqs_per_groups; + /** + * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N + * Note : Could be made a 16-array if more flexibility is really + * required + */ + uint16_t aq_depth_log2; + /** + * Index of the first Queue Group - assuming contiguity + * Initialized as -1 + */ + int8_t first_qgroup_index; +}; + +/** + * Definition of Arbitration related parameters for ACC100 Configuration + */ +struct rte_arbitration_t { + /** Default Weight for VF Fairness Arbitration */ + uint16_t round_robin_weight; + uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */ + uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */ +}; + +/** + * Structure to pass ACC100 configuration. + * Note: all VF Bundles will have the same configuration. + */ +struct acc100_conf { + bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */ + /** 1 if input '1' bit is represented by a positive LLR value, 0 if '1' + * bit is represented by a negative value. + */ + bool input_pos_llr_1_bit; + /** 1 if output '1' bit is represented by a positive value, 0 if '1' + * bit is represented by a negative value.
+ */ + bool output_pos_llr_1_bit; + uint16_t num_vf_bundles; /**< Number of VF bundles to setup */ + /** Queue topology for each operation type */ + struct rte_q_topology_t q_ul_4g; + struct rte_q_topology_t q_dl_4g; + struct rte_q_topology_t q_ul_5g; + struct rte_q_topology_t q_dl_5g; + /** Arbitration configuration for each operation type */ + struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS]; + struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS]; + struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS]; + struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS]; +}; + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_ACC100_CFG_H_ */ diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index 1b4cd13..7807a30 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -26,6 +26,184 @@ RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE); #endif +/* Read a register of a ACC100 device */ +static inline uint32_t +acc100_reg_read(struct acc100_device *d, uint32_t offset) +{ + + void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset); + uint32_t ret = *((volatile uint32_t *)(reg_addr)); + return rte_le_to_cpu_32(ret); +} + +/* Calculate the offset of the enqueue register */ +static inline uint32_t +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id) +{ + if (pf_device) + return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) + + HWPfQmgrIngressAq); + else + return ((qgrp_id << 7) + (aq_id << 3) + + HWVfQmgrIngressAq); +} + +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC}; + +/* Return the queue topology for a Queue Group Index */ +static inline void +qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum, + struct acc100_conf *acc100_conf) +{ + struct rte_q_topology_t *p_qtop; + p_qtop = NULL; + switch (acc_enum) { + case UL_4G: + p_qtop = &(acc100_conf->q_ul_4g); + break; + case UL_5G: + p_qtop = &(acc100_conf->q_ul_5g); + break; + case DL_4G: + p_qtop = &(acc100_conf->q_dl_4g); + break; + case DL_5G: + p_qtop = &(acc100_conf->q_dl_5g); + break; + default: + /* NOTREACHED */ + rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc"); + break; + } + *qtop = p_qtop; +} + +static void +initQTop(struct acc100_conf *acc100_conf) +{ + acc100_conf->q_ul_4g.num_aqs_per_groups = 0; + acc100_conf->q_ul_4g.num_qgroups = 0; + acc100_conf->q_ul_4g.first_qgroup_index = -1; + acc100_conf->q_ul_5g.num_aqs_per_groups = 0; + acc100_conf->q_ul_5g.num_qgroups = 0; + acc100_conf->q_ul_5g.first_qgroup_index = -1; + acc100_conf->q_dl_4g.num_aqs_per_groups = 0; + acc100_conf->q_dl_4g.num_qgroups = 0; + acc100_conf->q_dl_4g.first_qgroup_index = -1; + acc100_conf->q_dl_5g.num_aqs_per_groups = 0; + acc100_conf->q_dl_5g.num_qgroups = 0; + acc100_conf->q_dl_5g.first_qgroup_index = -1; +} + +static inline void +updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf, + struct acc100_device *d) { + uint32_t reg; + struct rte_q_topology_t *q_top = NULL; + qtopFromAcc(&q_top, acc, acc100_conf); + if (unlikely(q_top == NULL)) + return; + uint16_t aq; + q_top->num_qgroups++; + if (q_top->first_qgroup_index == -1) { + q_top->first_qgroup_index = qg; + /* Can be optimized to assume all are enabled by default */ + reg = acc100_reg_read(d, queue_offset(d->pf_device, + 0, qg, ACC100_NUM_AQS - 1)); + if (reg & QUEUE_ENABLE) { + q_top->num_aqs_per_groups = ACC100_NUM_AQS; + return; + } + q_top->num_aqs_per_groups = 0; + for (aq = 0; aq < ACC100_NUM_AQS; aq++) { + reg = acc100_reg_read(d, 
queue_offset(d->pf_device, + 0, qg, aq)); + if (reg & QUEUE_ENABLE) + q_top->num_aqs_per_groups++; + } + } +} + +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */ +static inline void +fetch_acc100_config(struct rte_bbdev *dev) +{ + struct acc100_device *d = dev->data->dev_private; + struct acc100_conf *acc100_conf = &d->acc100_conf; + const struct acc100_registry_addr *reg_addr; + uint8_t acc, qg; + uint32_t reg, reg_aq, reg_len0, reg_len1; + uint32_t reg_mode; + + /* No need to retrieve the configuration if it is already done */ + if (d->configured) + return; + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + + d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10; + + /* Single VF Bundle by VF */ + acc100_conf->num_vf_bundles = 1; + initQTop(acc100_conf); + + struct rte_q_topology_t *q_top = NULL; + int qman_func_id[5] = {0, 2, 1, 3, 4}; + reg = acc100_reg_read(d, reg_addr->qman_group_func); + for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) { + reg_aq = acc100_reg_read(d, + queue_offset(d->pf_device, 0, qg, 0)); + if (reg_aq & QUEUE_ENABLE) { + acc = qman_func_id[(reg >> (qg * 4)) & 0x7]; + updateQtop(acc, qg, acc100_conf, d); + } + } + + /* Check the depth of the AQs */ + reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset); + reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset); + for (acc = 0; acc < NUM_ACC; acc++) { + qtopFromAcc(&q_top, acc, acc100_conf); + if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD) + q_top->aq_depth_log2 = (reg_len0 >> + (q_top->first_qgroup_index * 4)) + & 0xF; + else + q_top->aq_depth_log2 = (reg_len1 >> + ((q_top->first_qgroup_index - + ACC100_NUM_QGRPS_PER_WORD) * 4)) + & 0xF; + } + + /* Read PF mode */ + if (d->pf_device) { + reg_mode = acc100_reg_read(d, HWPfHiPfMode); + acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0; + } + + rte_bbdev_log_debug( + "%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n", + (d->pf_device) ? "PF" : "VF", + (acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG", + (acc100_conf->output_pos_llr_1_bit) ?
"POS" : "NEG", + acc100_conf->q_ul_4g.num_qgroups, + acc100_conf->q_dl_4g.num_qgroups, + acc100_conf->q_ul_5g.num_qgroups, + acc100_conf->q_dl_5g.num_qgroups, + acc100_conf->q_ul_4g.num_aqs_per_groups, + acc100_conf->q_dl_4g.num_aqs_per_groups, + acc100_conf->q_ul_5g.num_aqs_per_groups, + acc100_conf->q_dl_5g.num_aqs_per_groups, + acc100_conf->q_ul_4g.aq_depth_log2, + acc100_conf->q_dl_4g.aq_depth_log2, + acc100_conf->q_ul_5g.aq_depth_log2, + acc100_conf->q_dl_5g.aq_depth_log2); +} + /* Free 64MB memory used for software rings */ static int acc100_dev_close(struct rte_bbdev *dev __rte_unused) @@ -33,8 +211,55 @@ return 0; } +/* Get ACC100 device info */ +static void +acc100_dev_info_get(struct rte_bbdev *dev, + struct rte_bbdev_driver_info *dev_info) +{ + struct acc100_device *d = dev->data->dev_private; + + static const struct rte_bbdev_op_cap bbdev_capabilities[] = { + RTE_BBDEV_END_OF_CAPABILITIES_LIST() + }; + + static struct rte_bbdev_queue_conf default_queue_conf; + default_queue_conf.socket = dev->data->socket_id; + default_queue_conf.queue_size = MAX_QUEUE_DEPTH; + + dev_info->driver_name = dev->device->driver->name; + + /* Read and save the populated config from ACC100 registers */ + fetch_acc100_config(dev); + + /* This isn't ideal because it reports the maximum number of queues but + * does not provide info on how many can be uplink/downlink or different + * priorities + */ + dev_info->max_num_queues = + d->acc100_conf.q_dl_5g.num_aqs_per_groups * + d->acc100_conf.q_dl_5g.num_qgroups + + d->acc100_conf.q_ul_5g.num_aqs_per_groups * + d->acc100_conf.q_ul_5g.num_qgroups + + d->acc100_conf.q_dl_4g.num_aqs_per_groups * + d->acc100_conf.q_dl_4g.num_qgroups + + d->acc100_conf.q_ul_4g.num_aqs_per_groups * + d->acc100_conf.q_ul_4g.num_qgroups; + dev_info->queue_size_lim = MAX_QUEUE_DEPTH; + dev_info->hardware_accelerated = true; + dev_info->max_dl_queue_priority = + d->acc100_conf.q_dl_4g.num_qgroups - 1; + dev_info->max_ul_queue_priority = + d->acc100_conf.q_ul_4g.num_qgroups - 1; + dev_info->default_queue_conf = default_queue_conf; + dev_info->cpu_flag_reqs = NULL; + dev_info->min_alignment = 64; + dev_info->capabilities = bbdev_capabilities; + dev_info->harq_buffer_size = d->ddr_size; +} + static const struct rte_bbdev_ops acc100_bbdev_ops = { .close = acc100_dev_close, + .info_get = acc100_dev_info_get, }; /* ACC100 PCI PF address map */ diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h index cd77570..662e2c8 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.h +++ b/drivers/baseband/acc100/rte_acc100_pmd.h @@ -7,6 +7,7 @@ #include "acc100_pf_enum.h" #include "acc100_vf_enum.h" +#include "rte_acc100_cfg.h" /* Helper macro for logging */ #define rte_bbdev_log(level, fmt, ...) 
\ @@ -520,6 +521,8 @@ struct acc100_registry_addr { /* Private data structure for each ACC100 device */ struct acc100_device { void *mmio_base; /**< Base address of MMIO registers (BAR0) */ + uint32_t ddr_size; /* Size in kB */ + struct acc100_conf acc100_conf; /* ACC100 Initial configuration */ bool pf_device; /**< True if this is a PF ACC100 device */ bool configured; /**< True if this ACC100 device is configured */ }; From patchwork Fri Sep 4 17:54:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76575 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 39471A04C5; Fri, 4 Sep 2020 19:58:20 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E5DA41C0CF; Fri, 4 Sep 2020 19:58:11 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 2A377CF3 for ; Fri, 4 Sep 2020 19:58:08 +0200 (CEST) IronPort-SDR: 4PHn46GRR3jqC7He0kj1svS0RbW3AOjFDojqOtFxQjl/+nDbPAFPFX30M0aLvdeCpIdfO7qOik MfxfLyYz5MTQ== X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="242614835" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614835" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: O0UqkR8ythzfrMyX3BOmbFYvHoWFbZ43OuJ1EPOEbShkDehTs+Kn42WL2OB7ZBaO5We7Fizuzd +6jOhHDgQYXA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765387" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:00 -0700 Message-Id: <1599242047-58232-5-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add functions to create and configure queues for the device. Still no processing capabilities (see the application-side sketch below).
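For context, the sketch below shows how an application would exercise these driver callbacks through the generic bbdev API. It is illustrative only and not part of this patch: the helper name example_acc100_bringup and the device id are assumptions, and configuring an op-specific queue only becomes functional once the capability patches later in this series land.

#include <rte_bbdev.h>

/* Illustrative sketch (not part of this patch): query the device via
 * rte_bbdev_info_get() (served by acc100_dev_info_get() from the previous
 * patch), then allocate the software rings and configure one queue
 * (served by acc100_setup_queues() and acc100_queue_setup() added here).
 */
static int
example_acc100_bringup(uint16_t dev_id)
{
	struct rte_bbdev_info info;
	struct rte_bbdev_queue_conf conf;
	int ret;

	ret = rte_bbdev_info_get(dev_id, &info);
	if (ret < 0)
		return ret;

	/* Allocate SW rings for a single queue on the device's socket */
	ret = rte_bbdev_setup_queues(dev_id, 1, info.socket_id);
	if (ret < 0)
		return ret;

	/* Start from the driver defaults reported by info_get() */
	conf = info.drv.default_queue_conf;
	conf.op_type = RTE_BBDEV_OP_LDPC_DEC;
	conf.priority = 0;
	ret = rte_bbdev_queue_configure(dev_id, 0, &conf);
	if (ret < 0)
		return ret;

	return rte_bbdev_start(dev_id);
}

The queue_conf priority maps onto the queue group ordering retrieved by fetch_acc100_config(), and the configured queue resolves to a {qgrp_id, vf_id, aq_id} triplet and an MMIO doorbell address in acc100_queue_setup() below.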
Signed-off-by: Nicolas Chautru Reviewed-by: Rosen Xu Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++- drivers/baseband/acc100/rte_acc100_pmd.h | 45 ++++ 2 files changed, 464 insertions(+), 1 deletion(-) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index 7807a30..7a21c57 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -26,6 +26,22 @@ RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE); #endif +/* Write to MMIO register address */ +static inline void +mmio_write(void *addr, uint32_t value) +{ + *((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); +} + +/* Write a register of a ACC100 device */ +static inline void +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload) +{ + void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset); + mmio_write(reg_addr, payload); + usleep(1000); +} + /* Read a register of a ACC100 device */ static inline uint32_t acc100_reg_read(struct acc100_device *d, uint32_t offset) @@ -36,6 +52,22 @@ return rte_le_to_cpu_32(ret); } +/* Basic Implementation of Log2 for exact 2^N */ +static inline uint32_t +log2_basic(uint32_t value) +{ + return (value == 0) ? 0 : __builtin_ctz(value); +} + +/* Calculate memory alignment offset assuming alignment is 2^N */ +static inline uint32_t +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment) +{ + rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem); + return (uint32_t)(alignment - + (unaligned_phy_mem & (alignment-1))); +} + /* Calculate the offset of the enqueue register */ static inline uint32_t queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id) @@ -204,10 +236,393 @@ acc100_conf->q_dl_5g.aq_depth_log2); } +static void +free_base_addresses(void **base_addrs, int size) +{ + int i; + for (i = 0; i < size; i++) + rte_free(base_addrs[i]); +} + +static inline uint32_t +get_desc_len(void) +{ + return sizeof(union acc100_dma_desc); +} + +/* Allocate the 2 * 64MB block for the sw rings */ +static int +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d, + int socket) +{ + uint32_t sw_ring_size = ACC100_SIZE_64MBYTE; + d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name, + 2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket); + if (d->sw_rings_base == NULL) { + rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + return -ENOMEM; + } + memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE); + uint32_t next_64mb_align_offset = calc_mem_alignment_offset( + d->sw_rings_base, ACC100_SIZE_64MBYTE); + d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset); + d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) + + next_64mb_align_offset; + d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len(); + d->sw_ring_max_depth = d->sw_ring_size / get_desc_len(); + + return 0; +} + +/* Attempt to allocate minimised memory space for sw rings */ +static void +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d, + uint16_t num_queues, int socket) +{ + rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy; + uint32_t next_64mb_align_offset; + rte_iova_t sw_ring_phys_end_addr; + void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS]; + void *sw_rings_base; + int i = 0; + uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len(); + uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues; + + /* Find an aligned 
block of memory to store sw rings */ + while (i < SW_RING_MEM_ALLOC_ATTEMPTS) { + /* + * sw_ring allocated memory is guaranteed to be aligned to + * q_sw_ring_size on the condition that the requested size is + * less than the page size + */ + sw_rings_base = rte_zmalloc_socket( + dev->device->driver->name, + dev_sw_ring_size, q_sw_ring_size, socket); + + if (sw_rings_base == NULL) { + rte_bbdev_log(ERR, + "Failed to allocate memory for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + break; + } + + sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base); + next_64mb_align_offset = calc_mem_alignment_offset( + sw_rings_base, ACC100_SIZE_64MBYTE); + next_64mb_align_addr_phy = sw_rings_base_phy + + next_64mb_align_offset; + sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size; + + /* Check if the end of the sw ring memory block is before the + * start of the next 64MB-aligned mem address + */ + if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) { + d->sw_rings_phys = sw_rings_base_phy; + d->sw_rings = sw_rings_base; + d->sw_rings_base = sw_rings_base; + d->sw_ring_size = q_sw_ring_size; + d->sw_ring_max_depth = MAX_QUEUE_DEPTH; + break; + } + /* Store the address of the unaligned mem block */ + base_addrs[i] = sw_rings_base; + i++; + } + + /* Free all unaligned blocks of mem allocated in the loop */ + free_base_addresses(base_addrs, i); +} + + +/* Allocate 64MB memory used for all software rings */ +static int +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id) +{ + uint32_t phys_low, phys_high, payload; + struct acc100_device *d = dev->data->dev_private; + const struct acc100_registry_addr *reg_addr; + + if (d->pf_device && !d->acc100_conf.pf_mode_en) { + rte_bbdev_log(NOTICE, + "%s has PF mode disabled.
This PF can't be used.", + dev->data->name); + return -ENODEV; + } + + alloc_sw_rings_min_mem(dev, d, num_queues, socket_id); + + /* If minimal memory space approach failed, then allocate + * the 2 * 64MB block for the sw rings + */ + if (d->sw_rings == NULL) + alloc_2x64mb_sw_rings_mem(dev, d, socket_id); + + /* Configure ACC100 with the base address for DMA descriptor rings + * Same descriptor rings used for UL and DL DMA Engines + * Note : Assuming only VF0 bundle is used for PF mode + */ + phys_high = (uint32_t)(d->sw_rings_phys >> 32); + phys_low = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1)); + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + + /* Read the populated cfg from ACC100 registers */ + fetch_acc100_config(dev); + + /* Mark as configured properly */ + d->configured = true; + + /* Release AXI from PF */ + if (d->pf_device) + acc100_reg_write(d, HWPfDmaAxiControl, 1); + + acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high); + acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low); + acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high); + acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low); + acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high); + acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low); + acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high); + acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low); + + /* + * Configure Ring Size to the max queue ring size + * (used for wrapping purpose) + */ + payload = log2_basic(d->sw_ring_size / 64); + acc100_reg_write(d, reg_addr->ring_size, payload); + + /* Configure tail pointer for use when SDONE enabled */ + d->tail_ptrs = rte_zmalloc_socket( + dev->device->driver->name, + ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t), + RTE_CACHE_LINE_SIZE, socket_id); + if (d->tail_ptrs == NULL) { + rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + rte_free(d->sw_rings); + return -ENOMEM; + } + d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs); + + phys_high = (uint32_t)(d->tail_ptr_phys >> 32); + phys_low = (uint32_t)(d->tail_ptr_phys); + acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high); + acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low); + acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high); + acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low); + acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high); + acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low); + acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high); + acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low); + + d->harq_layout = rte_zmalloc_socket("HARQ Layout", + ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout), + RTE_CACHE_LINE_SIZE, dev->data->socket_id); + + rte_bbdev_log_debug( + "ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#" + PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys); + + return 0; +} + /* Free 64MB memory used for software rings */ static int -acc100_dev_close(struct rte_bbdev *dev __rte_unused) +acc100_dev_close(struct rte_bbdev *dev) { + struct acc100_device *d = dev->data->dev_private; + if (d->sw_rings_base != NULL) { + rte_free(d->tail_ptrs); + rte_free(d->sw_rings_base); + d->sw_rings_base = NULL; + } + usleep(1000); + return 0; +} + + +/** + * Report a ACC100 queue index which is free + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available + * 
Note : Only supporting VF0 Bundle for PF mode + */ +static int +acc100_find_free_queue_idx(struct rte_bbdev *dev, + const struct rte_bbdev_queue_conf *conf) +{ + struct acc100_device *d = dev->data->dev_private; + int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G}; + int acc = op_2_acc[conf->op_type]; + struct rte_q_topology_t *qtop = NULL; + qtopFromAcc(&qtop, acc, &(d->acc100_conf)); + if (qtop == NULL) + return -1; + /* Identify matching QGroup Index which are sorted in priority order */ + uint16_t group_idx = qtop->first_qgroup_index; + group_idx += conf->priority; + if (group_idx >= ACC100_NUM_QGRPS || + conf->priority >= qtop->num_qgroups) { + rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u", + dev->data->name, conf->priority); + return -1; + } + /* Find a free AQ_idx */ + uint16_t aq_idx; + for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) { + if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) { + /* Mark the Queue as assigned */ + d->q_assigned_bit_map[group_idx] |= (1 << aq_idx); + /* Report the AQ Index */ + return (group_idx << GRP_ID_SHIFT) + aq_idx; + } + } + rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u", + dev->data->name, conf->priority); + return -1; +} + +/* Setup ACC100 queue */ +static int +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id, + const struct rte_bbdev_queue_conf *conf) +{ + struct acc100_device *d = dev->data->dev_private; + struct acc100_queue *q; + int16_t q_idx; + + /* Allocate the queue data structure. */ + q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q), + RTE_CACHE_LINE_SIZE, conf->socket); + if (q == NULL) { + rte_bbdev_log(ERR, "Failed to allocate queue memory"); + return -ENOMEM; + } + + q->d = d; + q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id)); + q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id); + + /* Prepare the Ring with default descriptor format */ + union acc100_dma_desc *desc = NULL; + unsigned int desc_idx, b_idx; + int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ? + ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ? 
+ ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN)); + + for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) { + desc = q->ring_addr + desc_idx; + desc->req.word0 = ACC100_DMA_DESC_TYPE; + desc->req.word1 = 0; /**< Timestamp */ + desc->req.word2 = 0; + desc->req.word3 = 0; + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET; + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset; + desc->req.data_ptrs[0].blen = fcw_len; + desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW; + desc->req.data_ptrs[0].last = 0; + desc->req.data_ptrs[0].dma_ext = 0; + for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1; + b_idx++) { + desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN; + desc->req.data_ptrs[b_idx].last = 1; + desc->req.data_ptrs[b_idx].dma_ext = 0; + b_idx++; + desc->req.data_ptrs[b_idx].blkid = + ACC100_DMA_BLKID_OUT_ENC; + desc->req.data_ptrs[b_idx].last = 1; + desc->req.data_ptrs[b_idx].dma_ext = 0; + } + /* Preset some fields of LDPC FCW */ + desc->req.fcw_ld.FCWversion = ACC100_FCW_VER; + desc->req.fcw_ld.gain_i = 1; + desc->req.fcw_ld.gain_h = 1; + } + + q->lb_in = rte_zmalloc_socket(dev->device->driver->name, + RTE_CACHE_LINE_SIZE, + RTE_CACHE_LINE_SIZE, conf->socket); + if (q->lb_in == NULL) { + rte_bbdev_log(ERR, "Failed to allocate lb_in memory"); + return -ENOMEM; + } + q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in); + q->lb_out = rte_zmalloc_socket(dev->device->driver->name, + RTE_CACHE_LINE_SIZE, + RTE_CACHE_LINE_SIZE, conf->socket); + if (q->lb_out == NULL) { + rte_bbdev_log(ERR, "Failed to allocate lb_out memory"); + return -ENOMEM; + } + q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out); + + /* + * Software queue ring wraps synchronously with the HW when it reaches + * the boundary of the maximum allocated queue size, no matter what the + * sw queue size is. This wrapping is guarded by setting the wrap_mask + * to represent the maximum queue size as allocated at the time when + * the device has been setup (in configure()). + * + * The queue depth is set to the queue size value (conf->queue_size). + * This limits the occupancy of the queue at any point of time, so that + * the queue does not get swamped with enqueue requests. + */ + q->sw_ring_depth = conf->queue_size; + q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1; + + q->op_type = conf->op_type; + + q_idx = acc100_find_free_queue_idx(dev, conf); + if (q_idx == -1) { + rte_free(q); + return -1; + } + + q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF; + q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F; + q->aq_id = q_idx & 0xF; + q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ? 
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+	return 0;
+}

@@ -258,8 +673,11 @@
 }

 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };

 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };

+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+ */ + union acc100_harq_layout_data *harq_layout; + uint32_t sw_ring_size; uint32_t ddr_size; /* Size in kB */ + uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */ + rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */ + /* Max number of entries available for each queue in device, depending + * on how many queues are enabled with configure() + */ + uint32_t sw_ring_max_depth; struct acc100_conf acc100_conf; /* ACC100 Initial configuration */ + /* Bitmap capturing which Queues have already been assigned */ + uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS]; bool pf_device; /**< True if this is a PF ACC100 device */ bool configured; /**< True if this ACC100 device is configured */ }; From patchwork Fri Sep 4 17:54:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76579 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C8967A04C5; Fri, 4 Sep 2020 19:59:04 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E3AC31C11D; Fri, 4 Sep 2020 19:58:17 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 30C5F1C0CA for ; Fri, 4 Sep 2020 19:58:11 +0200 (CEST) IronPort-SDR: Q2GttvexUwZWfj1Gz48SJH23vfAqseS6bn9wu6r2FfXPjRRe+ITOasmXaRMyrPrEu+JD/+tu55 /Xv3Yop1Bc8w== X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="242614836" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614836" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: 8IT9AFQLS7jsFU4ncwWRW2farYAjQ7qPInmdG3SjS6OxfAFjb960yW00+DUPRb+5WcEwoBjnPr ey9fb3m0kEJg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765390" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:01 -0700 Message-Id: <1599242047-58232-6-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Adding LDPC decode and encode processing operations Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++- drivers/baseband/acc100/rte_acc100_pmd.h | 3 + 2 files changed, 1626 insertions(+), 2 deletions(-) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c 
b/drivers/baseband/acc100/rte_acc100_pmd.c index 7a21c57..7f64695 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -15,6 +15,9 @@ #include #include #include +#ifdef RTE_BBDEV_OFFLOAD_COST +#include +#endif #include #include @@ -449,7 +452,6 @@ return 0; } - /** * Report a ACC100 queue index which is free * Return 0 to 16k for a valid queue_idx or -1 when no queue is available @@ -634,6 +636,46 @@ struct acc100_device *d = dev->data->dev_private; static const struct rte_bbdev_op_cap bbdev_capabilities[] = { + { + .type = RTE_BBDEV_OP_LDPC_ENC, + .cap.ldpc_enc = { + .capability_flags = + RTE_BBDEV_LDPC_RATE_MATCH | + RTE_BBDEV_LDPC_CRC_24B_ATTACH | + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS, + .num_buffers_src = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_dst = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + } + }, + { + .type = RTE_BBDEV_OP_LDPC_DEC, + .cap.ldpc_dec = { + .capability_flags = + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK | + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP | + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE | + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE | +#ifdef ACC100_EXT_MEM + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE | + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE | +#endif + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE | + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS | + RTE_BBDEV_LDPC_DECODE_BYPASS | + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER | + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION | + RTE_BBDEV_LDPC_LLR_COMPRESSION, + .llr_size = 8, + .llr_decimals = 1, + .num_buffers_src = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_hard_out = + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, + .num_buffers_soft_out = 0, + } + }, RTE_BBDEV_END_OF_CAPABILITIES_LIST() }; @@ -669,9 +711,14 @@ dev_info->cpu_flag_reqs = NULL; dev_info->min_alignment = 64; dev_info->capabilities = bbdev_capabilities; +#ifdef ACC100_EXT_MEM dev_info->harq_buffer_size = d->ddr_size; +#else + dev_info->harq_buffer_size = 0; +#endif } + static const struct rte_bbdev_ops acc100_bbdev_ops = { .setup_queues = acc100_setup_queues, .close = acc100_dev_close, @@ -696,6 +743,1577 @@ {.device_id = 0}, }; +/* Read flag value 0/1 from bitmap */ +static inline bool +check_bit(uint32_t bitmap, uint32_t bitmask) +{ + return bitmap & bitmask; +} + +static inline char * +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len) +{ + if (unlikely(len > rte_pktmbuf_tailroom(m))) + return NULL; + + char *tail = (char *)m->buf_addr + m->data_off + m->data_len; + m->data_len = (uint16_t)(m->data_len + len); + m_head->pkt_len = (m_head->pkt_len + len); + return tail; +} + +/* Compute value of k0. + * Based on 3GPP 38.212 Table 5.4.2.1-2 + * Starting position of different redundancy versions, k0 + */ +static inline uint16_t +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index) +{ + if (rv_index == 0) + return 0; + uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c; + if (n_cb == n) { + if (rv_index == 1) + return (bg == 1 ? K0_1_1 : K0_1_2) * z_c; + else if (rv_index == 2) + return (bg == 1 ? K0_2_1 : K0_2_2) * z_c; + else + return (bg == 1 ? K0_3_1 : K0_3_2) * z_c; + } + /* LBRM case - includes a division by N */ + if (rv_index == 1) + return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb) + / n) * z_c; + else if (rv_index == 2) + return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb) + / n) * z_c; + else + return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb) + / n) * z_c; +} + +/* Fill in a frame control word for LDPC encoding. 
*/ +static inline void +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op, + struct acc100_fcw_le *fcw, int num_cb) +{ + fcw->qm = op->ldpc_enc.q_m; + fcw->nfiller = op->ldpc_enc.n_filler; + fcw->BG = (op->ldpc_enc.basegraph - 1); + fcw->Zc = op->ldpc_enc.z_c; + fcw->ncb = op->ldpc_enc.n_cb; + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph, + op->ldpc_enc.rv_index); + fcw->rm_e = op->ldpc_enc.cb_params.e; + fcw->crc_select = check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_CRC_24B_ATTACH); + fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS); + fcw->mcb_count = num_cb; +} + +/* Fill in a frame control word for LDPC decoding. */ +static inline void +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw, + union acc100_harq_layout_data *harq_layout) +{ + uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset; + uint16_t harq_index; + uint32_t l; + bool harq_prun = false; + + fcw->qm = op->ldpc_dec.q_m; + fcw->nfiller = op->ldpc_dec.n_filler; + fcw->BG = (op->ldpc_dec.basegraph - 1); + fcw->Zc = op->ldpc_dec.z_c; + fcw->ncb = op->ldpc_dec.n_cb; + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph, + op->ldpc_dec.rv_index); + if (op->ldpc_dec.code_block_mode == 1) + fcw->rm_e = op->ldpc_dec.cb_params.e; + else + fcw->rm_e = (op->ldpc_dec.tb_params.r < + op->ldpc_dec.tb_params.cab) ? + op->ldpc_dec.tb_params.ea : + op->ldpc_dec.tb_params.eb; + + fcw->hcin_en = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE); + fcw->hcout_en = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE); + fcw->crc_select = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK); + fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DECODE_BYPASS); + fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS); + if (op->ldpc_dec.q_m == 1) { + fcw->bypass_intlv = 1; + fcw->qm = 2; + } + fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_LLR_COMPRESSION); + harq_index = op->ldpc_dec.harq_combined_output.offset / + ACC100_HARQ_OFFSET; +#ifdef ACC100_EXT_MEM + /* Limit cases when HARQ pruning is valid */ + harq_prun = ((op->ldpc_dec.harq_combined_output.offset % + ACC100_HARQ_OFFSET) == 0) && + (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX + * ACC100_HARQ_OFFSET); +#endif + if (fcw->hcin_en > 0) { + harq_in_length = op->ldpc_dec.harq_combined_input.length; + if (fcw->hcin_decomp_mode > 0) + harq_in_length = harq_in_length * 8 / 6; + harq_in_length = RTE_ALIGN(harq_in_length, 64); + if ((harq_layout[harq_index].offset > 0) & harq_prun) { + rte_bbdev_log_debug("HARQ IN offset unexpected for now\n"); + fcw->hcin_size0 = harq_layout[harq_index].size0; + fcw->hcin_offset = harq_layout[harq_index].offset; + fcw->hcin_size1 = harq_in_length - + harq_layout[harq_index].offset; + } else { + fcw->hcin_size0 = harq_in_length; + fcw->hcin_offset = 0; + fcw->hcin_size1 = 0; + } + } else { + fcw->hcin_size0 = 0; + fcw->hcin_offset = 0; + fcw->hcin_size1 = 0; + } + + fcw->itmax = op->ldpc_dec.iter_max; + fcw->itstop = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE); + fcw->synd_precoder = fcw->itstop; + /* + * These are all implicitly set + * fcw->synd_post = 0; + * fcw->so_en = 0; + 
* fcw->so_bypass_rm = 0; + * fcw->so_bypass_intlv = 0; + * fcw->dec_convllr = 0; + * fcw->hcout_convllr = 0; + * fcw->hcout_size1 = 0; + * fcw->so_it = 0; + * fcw->hcout_offset = 0; + * fcw->negstop_th = 0; + * fcw->negstop_it = 0; + * fcw->negstop_en = 0; + * fcw->gain_i = 1; + * fcw->gain_h = 1; + */ + if (fcw->hcout_en > 0) { + parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8) + * op->ldpc_dec.z_c - op->ldpc_dec.n_filler; + k0_p = (fcw->k0 > parity_offset) ? + fcw->k0 - op->ldpc_dec.n_filler : fcw->k0; + ncb_p = fcw->ncb - op->ldpc_dec.n_filler; + l = k0_p + fcw->rm_e; + harq_out_length = (uint16_t) fcw->hcin_size0; + harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p); + harq_out_length = (harq_out_length + 0x3F) & 0xFFC0; + if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) && + harq_prun) { + fcw->hcout_size0 = (uint16_t) fcw->hcin_size0; + fcw->hcout_offset = k0_p & 0xFFC0; + fcw->hcout_size1 = harq_out_length - fcw->hcout_offset; + } else { + fcw->hcout_size0 = harq_out_length; + fcw->hcout_size1 = 0; + fcw->hcout_offset = 0; + } + harq_layout[harq_index].offset = fcw->hcout_offset; + harq_layout[harq_index].size0 = fcw->hcout_size0; + } else { + fcw->hcout_size0 = 0; + fcw->hcout_size1 = 0; + fcw->hcout_offset = 0; + } +} + +/** + * Fills descriptor with data pointers of one block type. + * + * @param desc + * Pointer to DMA descriptor. + * @param input + * Pointer to pointer to input data which will be encoded. It can be changed + * and points to next segment in scatter-gather case. + * @param offset + * Input offset in rte_mbuf structure. It is used for calculating the point + * where data is starting. + * @param cb_len + * Length of currently processed Code Block + * @param seg_total_left + * It indicates how many bytes still left in segment (mbuf) for further + * processing. + * @param op_flags + * Store information about device capabilities + * @param next_triplet + * Index for ACC100 DMA Descriptor triplet + * + * @return + * Returns index of next triplet on success, other value if lengths of + * pkt and processed cb do not match. + * + */ +static inline int +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc, + struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len, + uint32_t *seg_total_left, int next_triplet) +{ + uint32_t part_len; + struct rte_mbuf *m = *input; + + part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len; + cb_len -= part_len; + *seg_total_left -= part_len; + + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(m, *offset); + desc->data_ptrs[next_triplet].blen = part_len; + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN; + desc->data_ptrs[next_triplet].last = 0; + desc->data_ptrs[next_triplet].dma_ext = 0; + *offset += part_len; + next_triplet++; + + while (cb_len > 0) { + if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS && + m->next != NULL) { + + m = m->next; + *seg_total_left = rte_pktmbuf_data_len(m); + part_len = (*seg_total_left < cb_len) ? 
+ *seg_total_left : + cb_len; + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_mtophys(m); + desc->data_ptrs[next_triplet].blen = part_len; + desc->data_ptrs[next_triplet].blkid = + ACC100_DMA_BLKID_IN; + desc->data_ptrs[next_triplet].last = 0; + desc->data_ptrs[next_triplet].dma_ext = 0; + cb_len -= part_len; + *seg_total_left -= part_len; + /* Initializing offset for next segment (mbuf) */ + *offset = part_len; + next_triplet++; + } else { + rte_bbdev_log(ERR, + "Some data still left for processing: " + "data_left: %u, next_triplet: %u, next_mbuf: %p", + cb_len, next_triplet, m->next); + return -EINVAL; + } + } + /* Storing new mbuf as it could be changed in scatter-gather case*/ + *input = m; + + return next_triplet; +} + +/* Fills descriptor with data pointers of one block type. + * Returns index of next triplet on success, other value if lengths of + * output data and processed mbuf do not match. + */ +static inline int +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc, + struct rte_mbuf *output, uint32_t out_offset, + uint32_t output_len, int next_triplet, int blk_id) +{ + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(output, out_offset); + desc->data_ptrs[next_triplet].blen = output_len; + desc->data_ptrs[next_triplet].blkid = blk_id; + desc->data_ptrs[next_triplet].last = 0; + desc->data_ptrs[next_triplet].dma_ext = 0; + next_triplet++; + + return next_triplet; +} + +static inline int +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op, + struct acc100_dma_req_desc *desc, struct rte_mbuf **input, + struct rte_mbuf *output, uint32_t *in_offset, + uint32_t *out_offset, uint32_t *out_length, + uint32_t *mbuf_total_left, uint32_t *seg_total_left) +{ + int next_triplet = 1; /* FCW already done */ + uint16_t K, in_length_in_bits, in_length_in_bytes; + struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc; + + desc->word0 = ACC100_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + K = (enc->basegraph == 1 ? 
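/*
 * K being computed here is the number of systematic columns times the
 * lifting size: 22 columns for base graph 1, 10 for base graph 2. Worked
 * example under assumed parameters (BG1, Zc = 384, no filler bits, CRC24B
 * attached): K = 22 * 384 = 8448 bits, so the encoder reads
 * (8448 - 24) / 8 = 1053 payload bytes from the mbuf before the CRC and
 * parity are generated by the device.
 */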
22 : 10) * enc->z_c; + in_length_in_bits = K - enc->n_filler; + if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) || + (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH)) + in_length_in_bits -= 24; + in_length_in_bytes = in_length_in_bits >> 3; + + if (unlikely((*mbuf_total_left == 0) || + (*mbuf_total_left < in_length_in_bytes))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, in_length_in_bytes); + return -1; + } + + next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, + in_length_in_bytes, + seg_total_left, next_triplet); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= in_length_in_bytes; + + /* Set output length */ + /* Integer round up division by 8 */ + *out_length = (enc->cb_params.e + 7) >> 3; + + next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset, + *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + op->ldpc_enc.output.length += *out_length; + *out_offset += *out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->data_ptrs[next_triplet - 1].dma_ext = 0; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline int +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op, + struct acc100_dma_req_desc *desc, + struct rte_mbuf **input, struct rte_mbuf *h_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *h_out_length, uint32_t *mbuf_total_left, + uint32_t *seg_total_left, + struct acc100_fcw_ld *fcw) +{ + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec; + int next_triplet = 1; /* FCW already done */ + uint32_t input_length; + uint16_t output_length, crc24_overlap = 0; + uint16_t sys_cols, K, h_p_size, h_np_size; + bool h_comp = check_bit(dec->op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION); + + desc->word0 = ACC100_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP)) + crc24_overlap = 24; + + /* Compute some LDPC BG lengths */ + input_length = dec->cb_params.e; + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_LLR_COMPRESSION)) + input_length = (input_length * 3 + 3) / 4; + sys_cols = (dec->basegraph == 1) ? 
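/*
 * Same systematic-column arithmetic as on the encode side, selected here
 * for the decoder. Worked example under assumed parameters (BG1, Zc = 256,
 * 120 filler bits, CRC24B dropped): K = 22 * 256 = 5632 bits and the hard
 * output length is 5632 - 120 - 24 = 5488 bits, i.e. 686 bytes after the
 * >> 3 further down. With RTE_BBDEV_LDPC_LLR_COMPRESSION the e rate-matched
 * LLRs arrive packed 6 bits each, which is the (e * 3 + 3) / 4 input size
 * computed just above.
 */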
22 : 10; + K = sys_cols * dec->z_c; + output_length = K - dec->n_filler - crc24_overlap; + + if (unlikely((*mbuf_total_left == 0) || + (*mbuf_total_left < input_length))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, input_length); + return -1; + } + + next_triplet = acc100_dma_fill_blk_type_in(desc, input, + in_offset, input_length, + seg_total_left, next_triplet); + + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) { + h_p_size = fcw->hcin_size0 + fcw->hcin_size1; + if (h_comp) + h_p_size = (h_p_size * 3 + 3) / 4; + desc->data_ptrs[next_triplet].address = + dec->harq_combined_input.offset; + desc->data_ptrs[next_triplet].blen = h_p_size; + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ; + desc->data_ptrs[next_triplet].dma_ext = 1; +#ifndef ACC100_EXT_MEM + acc100_dma_fill_blk_type_out( + desc, + op->ldpc_dec.harq_combined_input.data, + op->ldpc_dec.harq_combined_input.offset, + h_p_size, + next_triplet, + ACC100_DMA_BLKID_IN_HARQ); +#endif + next_triplet++; + } + + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= input_length; + + next_triplet = acc100_dma_fill_blk_type_out(desc, h_output, + *h_out_offset, output_length >> 3, next_triplet, + ACC100_DMA_BLKID_OUT_HARD); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) { + /* Pruned size of the HARQ */ + h_p_size = fcw->hcout_size0 + fcw->hcout_size1; + /* Non-Pruned size of the HARQ */ + h_np_size = fcw->hcout_offset > 0 ? 
+ fcw->hcout_offset + fcw->hcout_size1 : + h_p_size; + if (h_comp) { + h_np_size = (h_np_size * 3 + 3) / 4; + h_p_size = (h_p_size * 3 + 3) / 4; + } + dec->harq_combined_output.length = h_np_size; + desc->data_ptrs[next_triplet].address = + dec->harq_combined_output.offset; + desc->data_ptrs[next_triplet].blen = h_p_size; + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ; + desc->data_ptrs[next_triplet].dma_ext = 1; +#ifndef ACC100_EXT_MEM + acc100_dma_fill_blk_type_out( + desc, + dec->harq_combined_output.data, + dec->harq_combined_output.offset, + h_p_size, + next_triplet, + ACC100_DMA_BLKID_OUT_HARQ); +#endif + next_triplet++; + } + + *h_out_length = output_length >> 3; + dec->hard_output.length += *h_out_length; + *h_out_offset += *h_out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline void +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op, + struct acc100_dma_req_desc *desc, + struct rte_mbuf *input, struct rte_mbuf *h_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *h_out_length, + union acc100_harq_layout_data *harq_layout) +{ + int next_triplet = 1; /* FCW already done */ + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(input, *in_offset); + next_triplet++; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) { + struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input; + desc->data_ptrs[next_triplet].address = hi.offset; +#ifndef ACC100_EXT_MEM + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(hi.data, hi.offset); +#endif + next_triplet++; + } + + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(h_output, *h_out_offset); + *h_out_length = desc->data_ptrs[next_triplet].blen; + next_triplet++; + + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) { + desc->data_ptrs[next_triplet].address = + op->ldpc_dec.harq_combined_output.offset; + /* Adjust based on previous operation */ + struct rte_bbdev_dec_op *prev_op = desc->op_addr; + op->ldpc_dec.harq_combined_output.length = + prev_op->ldpc_dec.harq_combined_output.length; + int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset / + ACC100_HARQ_OFFSET; + int16_t prev_hq_idx = + prev_op->ldpc_dec.harq_combined_output.offset + / ACC100_HARQ_OFFSET; + harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val; +#ifndef ACC100_EXT_MEM + struct rte_bbdev_op_data ho = + op->ldpc_dec.harq_combined_output; + desc->data_ptrs[next_triplet].address = + rte_pktmbuf_iova_offset(ho.data, ho.offset); +#endif + next_triplet++; + } + + op->ldpc_dec.hard_output.length += *h_out_length; + desc->op_addr = op; +} + + +/* Enqueue a number of operations to HW and update software rings */ +static inline void +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n, + struct rte_bbdev_stats *queue_stats) +{ + union acc100_enqueue_reg_fmt enq_req; +#ifdef RTE_BBDEV_OFFLOAD_COST + uint64_t start_time = 0; + queue_stats->acc_offload_cycles = 0; + RTE_SET_USED(queue_stats); +#else + RTE_SET_USED(queue_stats); +#endif + + enq_req.val = 0; + /* Setting offset, 100b for 256 DMA Desc */ + enq_req.addr_offset = ACC100_DESC_OFFSET; + + /* Split ops into batches */ + do { + union acc100_dma_desc *desc; + uint16_t enq_batch_size; + uint64_t offset; + rte_iova_t req_elem_addr; + + enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE); + + /* Set flag on last descriptor in a batch */ + desc = q->ring_addr + ((q->sw_ring_head 
+				enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+
+}
+
+/* Enqueue a number of muxed encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/** This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* All num CBs (ops) were successfully prepared to enqueue */
+	return num;
+}
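/*
 * Descriptor layout sketch for the muxed path above, assuming num = 4
 * compatible code blocks: triplet 0 carries the shared FCW, then each CB
 * contributes one input and one output pointer, hence m2dlen = 1 + 4 and
 * d2mlen = 4, and a single DMA response completes all four ops at once.
 * The caller bounds num by MUX_5GDL_DESC, presumably so that 1 + 2 * num
 * stays within ACC100_DMA_MAX_NUM_POINTERS.
 */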
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
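/*
 * The same_op fast path above replaces a full descriptor fill with two
 * small copies (bytes 0-15 for the control words, bytes 36-75 for the
 * BDESC triplets) plus a clone of the FCW at ACC100_DESC_FCW_OFFSET; only
 * the per-op data pointers are then patched by acc100_dma_desc_ld_update().
 * A caller-side sketch of when this path triggers, mirroring
 * cmp_ldpc_dec_op() (illustrative only):
 *
 *   same_op = (i > 0) && (memcmp(
 *           (uint8_t *)(&ops[i - 1]->ldpc_dec) + DEC_OFFSET,
 *           (uint8_t *)(&ops[i]->ldpc_dec) + DEC_OFFSET,
 *           CMP_DEC_SIZE) == 0);
 *
 * i.e. two back-to-back code blocks whose FCW-relevant fields match byte
 * for byte.
 */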
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+				sizeof(desc->req.fcw_ld) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */ +static inline uint8_t +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc) +{ + uint8_t c, c_neg, r, crc24_bits = 0; + uint16_t k, k_neg, k_pos; + uint8_t cbs_in_tb = 0; + int32_t length; + + length = turbo_enc->input.length; + r = turbo_enc->tb_params.r; + c = turbo_enc->tb_params.c; + c_neg = turbo_enc->tb_params.c_neg; + k_neg = turbo_enc->tb_params.k_neg; + k_pos = turbo_enc->tb_params.k_pos; + crc24_bits = 0; + if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH)) + crc24_bits = 24; + while (length > 0 && r < c) { + k = (r < c_neg) ? k_neg : k_pos; + length -= (k - crc24_bits) >> 3; + r++; + cbs_in_tb++; + } + + return cbs_in_tb; +} + +/* Calculates number of CBs in processed decoder TB based on 'r' and input + * length. + */ +static inline uint16_t +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec) +{ + uint8_t c, c_neg, r = 0; + uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0; + int32_t length; + + length = turbo_dec->input.length; + r = turbo_dec->tb_params.r; + c = turbo_dec->tb_params.c; + c_neg = turbo_dec->tb_params.c_neg; + k_neg = turbo_dec->tb_params.k_neg; + k_pos = turbo_dec->tb_params.k_pos; + while (length > 0 && r < c) { + k = (r < c_neg) ? k_neg : k_pos; + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3; + length -= kw; + r++; + cbs_in_tb++; + } + + return cbs_in_tb; +} + +/* Calculates number of CBs in processed decoder TB based on 'r' and input + * length. + */ +static inline uint16_t +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec) +{ + uint16_t r, cbs_in_tb = 0; + int32_t length = ldpc_dec->input.length; + r = ldpc_dec->tb_params.r; + while (length > 0 && r < ldpc_dec->tb_params.c) { + length -= (r < ldpc_dec->tb_params.cab) ? + ldpc_dec->tb_params.ea : + ldpc_dec->tb_params.eb; + r++; + cbs_in_tb++; + } + return cbs_in_tb; +} + +/* Check we can mux encode operations with common FCW */ +static inline bool +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) { + uint16_t i; + if (num == 1) + return false; + for (i = 1; i < num; ++i) { + /* Only mux compatible code blocks */ + if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET, + (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET, + CMP_ENC_SIZE) != 0) + return false; + } + return true; +} + +/** Enqueue encode operations for ACC100 device in CB mode. 
+ */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check if we can mux decode operations with a common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
+		return false;
+	} else
+		return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max,
ops[i]->ldpc_dec.iter_count, + ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c, + ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m, + ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e, + same_op); + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op); + if (ret < 0) + break; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc100_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + +/* Enqueue decode operations for ACC100 device. */ +static uint16_t +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + int32_t aq_avail = q->aq_depth + + (q->aq_dequeued - q->aq_enqueued) / 128; + + if (unlikely((aq_avail == 0) || (num == 0))) + return 0; + + if (ops[0]->ldpc_dec.code_block_mode == 0) + return acc100_enqueue_ldpc_dec_tb(q_data, ops, num); + else + return acc100_enqueue_ldpc_dec_cb(q_data, ops, num); +} + + +/* Dequeue one encode operations from ACC100 device in CB mode */ +static inline int +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op, + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued) +{ + union acc100_dma_desc *desc, atom_desc; + union acc100_dma_rsp_desc rsp; + struct rte_bbdev_enc_op *op; + int i; + + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs) + & q->sw_ring_wrap_mask); + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, + __ATOMIC_RELAXED); + + /* Check fdone bit */ + if (!(atom_desc.rsp.val & ACC100_FDONE)) + return -1; + + rsp.val = atom_desc.rsp.val; + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val); + + /* Dequeue */ + op = desc->req.op_addr; + + /* Clearing status, it will be set based on response */ + op->status = 0; + + op->status |= ((rsp.input_err) + ? (1 << RTE_BBDEV_DATA_ERROR) : 0); + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); + op->status |= ((rsp.fcw_err) ? 
+			(1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One descriptor (carrying numCBs ops) was successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
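/*
 * Every dequeue path repeats the same response-to-status translation; a
 * compacted sketch of that mapping (hypothetical helper, the driver keeps
 * it inline):
 *
 *   static inline uint32_t
 *   acc100_rsp_to_status(union acc100_dma_rsp_desc rsp)
 *   {
 *       uint32_t status = 0;
 *       if (rsp.input_err)
 *           status |= 1 << RTE_BBDEV_DATA_ERROR;
 *       if (rsp.dma_err || rsp.fcw_err)
 *           status |= 1 << RTE_BBDEV_DRV_ERROR;
 *       return status;
 *   }
 *
 * FDONE flags a completed descriptor while SDONE is only raised on the
 * last descriptor of a batch, which is why the TB paths re-check the last
 * CB's descriptor before draining the whole transport block.
 */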
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC status only when no other error is flagged */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC status only when no other error is flagged */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ?
avail : num; + + for (i = 0; i < dequeue_num; ++i) { + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs) + & q->sw_ring_wrap_mask))->req.op_addr; + if (op->ldpc_dec.code_block_mode == 0) + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs, + &aq_dequeued); + else + ret = dequeue_ldpc_dec_one_op_cb( + q_data, q, &ops[i], dequeued_cbs, + &aq_dequeued); + + if (ret < 0) + break; + dequeued_cbs += ret; + } + + q->aq_dequeued += aq_dequeued; + q->sw_ring_tail += dequeued_cbs; + + /* Update enqueue stats */ + q_data->queue_stats.dequeued_count += i; + + return i; +} + /* Initialization Function */ static void acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) @@ -703,6 +2321,10 @@ struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device); dev->dev_ops = &acc100_bbdev_ops; + dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc; + dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec; + dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc; + dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec; ((struct acc100_device *) dev->data->dev_private)->pf_device = !strcmp(drv->driver.name, @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev) RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map); RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver); RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map); - diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h index 0e2b79c..78686c1 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.h +++ b/drivers/baseband/acc100/rte_acc100_pmd.h @@ -88,6 +88,8 @@ #define TMPL_PRI_3 0x0f0e0d0c #define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */ #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4) +#define ACC100_FDONE 0x80000000 +#define ACC100_SDONE 0x40000000 #define ACC100_NUM_TMPL 32 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */ @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc { union acc100_dma_desc { struct acc100_dma_req_desc req; union acc100_dma_rsp_desc rsp; + uint64_t atom_hdr; }; From patchwork Fri Sep 4 17:54:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76577 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 99C4DA04C5; Fri, 4 Sep 2020 19:58:42 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5465B1C10D; Fri, 4 Sep 2020 19:58:15 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 8F60C1C0B4 for ; Fri, 4 Sep 2020 19:58:10 +0200 (CEST) IronPort-SDR: iFjpvF1oq0WA3eUj6op585EiL64tiY7SL30oWSpdr4LzXUm71E//0z8YTI8JtLtnnmdK73i3/E 49jYekWU0qCA== X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="242614838" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614838" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: 3vY28Up2osXyrRRQuh43pjwJLegPm6QvqHexErzVihPZDzkHFwJ/62BMFzvh+Fcvz0ezR4yyH1 CzxNrSRhd+Og== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765393" 
Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:02 -0700 Message-Id: <1599242047-58232-7-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Additional support for HARQ memory loopback Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++ 1 file changed, 158 insertions(+) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index 7f64695..5b011a1 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -658,6 +658,7 @@ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE | RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE | #ifdef ACC100_EXT_MEM + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK | RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE | RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE | #endif @@ -1480,12 +1481,169 @@ return 1; } +static inline int +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs) { + struct acc100_fcw_ld *fcw; + union acc100_dma_desc *desc; + int next_triplet = 1; + struct rte_mbuf *hq_output_head, *hq_output; + uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length; + if (harq_in_length == 0) { + rte_bbdev_log(ERR, "Loopback of invalid null size\n"); + return -EINVAL; + } + + int h_comp = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION + ) ? 1 : 0; + if (h_comp == 1) + harq_in_length = harq_in_length * 8 / 6; + harq_in_length = RTE_ALIGN(harq_in_length, 64); + uint16_t harq_dma_length_in = (h_comp == 0) ? + harq_in_length : + harq_in_length * 6 / 8; + uint16_t harq_dma_length_out = harq_dma_length_in; + bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE); + union acc100_harq_layout_data *harq_layout = q->d->harq_layout; + uint16_t harq_index = (ddr_mem_in ? 
+ op->ldpc_dec.harq_combined_input.offset : + op->ldpc_dec.harq_combined_output.offset) + / ACC100_HARQ_OFFSET; + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + fcw = &desc->req.fcw_ld; + /* Set the FCW from loopback into DDR */ + memset(fcw, 0, sizeof(struct acc100_fcw_ld)); + fcw->FCWversion = ACC100_FCW_VER; + fcw->qm = 2; + fcw->Zc = 384; + if (harq_in_length < 16 * N_ZC_1) + fcw->Zc = 16; + fcw->ncb = fcw->Zc * N_ZC_1; + fcw->rm_e = 2; + fcw->hcin_en = 1; + fcw->hcout_en = 1; + + rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n", + ddr_mem_in, harq_index, + harq_layout[harq_index].offset, harq_in_length, + harq_dma_length_in); + + if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) { + fcw->hcin_size0 = harq_layout[harq_index].size0; + fcw->hcin_offset = harq_layout[harq_index].offset; + fcw->hcin_size1 = harq_in_length - fcw->hcin_offset; + harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1); + if (h_comp == 1) + harq_dma_length_in = harq_dma_length_in * 6 / 8; + } else { + fcw->hcin_size0 = harq_in_length; + } + harq_layout[harq_index].val = 0; + rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n", + fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1); + fcw->hcout_size0 = harq_in_length; + fcw->hcin_decomp_mode = h_comp; + fcw->hcout_comp_mode = h_comp; + fcw->gain_i = 1; + fcw->gain_h = 1; + + /* Set the prefix of descriptor. This could be done at polling */ + desc->req.word0 = ACC100_DMA_DESC_TYPE; + desc->req.word1 = 0; /**< Timestamp could be disabled */ + desc->req.word2 = 0; + desc->req.word3 = 0; + desc->req.numCBs = 1; + + /* Null LLR input for Decoder */ + desc->req.data_ptrs[next_triplet].address = + q->lb_in_addr_phys; + desc->req.data_ptrs[next_triplet].blen = 2; + desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN; + desc->req.data_ptrs[next_triplet].last = 0; + desc->req.data_ptrs[next_triplet].dma_ext = 0; + next_triplet++; + + /* HARQ Combine input from either Memory interface */ + if (!ddr_mem_in) { + next_triplet = acc100_dma_fill_blk_type_out(&desc->req, + op->ldpc_dec.harq_combined_input.data, + op->ldpc_dec.harq_combined_input.offset, + harq_dma_length_in, + next_triplet, + ACC100_DMA_BLKID_IN_HARQ); + } else { + desc->req.data_ptrs[next_triplet].address = + op->ldpc_dec.harq_combined_input.offset; + desc->req.data_ptrs[next_triplet].blen = + harq_dma_length_in; + desc->req.data_ptrs[next_triplet].blkid = + ACC100_DMA_BLKID_IN_HARQ; + desc->req.data_ptrs[next_triplet].dma_ext = 1; + next_triplet++; + } + desc->req.data_ptrs[next_triplet - 1].last = 1; + desc->req.m2dlen = next_triplet; + + /* Dropped decoder hard output */ + desc->req.data_ptrs[next_triplet].address = + q->lb_out_addr_phys; + desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD; + desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD; + desc->req.data_ptrs[next_triplet].last = 0; + desc->req.data_ptrs[next_triplet].dma_ext = 0; + next_triplet++; + + /* HARQ Combine output to either Memory interface */ + if (check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE + )) { + desc->req.data_ptrs[next_triplet].address = + op->ldpc_dec.harq_combined_output.offset; + desc->req.data_ptrs[next_triplet].blen = + harq_dma_length_out; + desc->req.data_ptrs[next_triplet].blkid = + ACC100_DMA_BLKID_OUT_HARQ; + desc->req.data_ptrs[next_triplet].dma_ext = 1; + next_triplet++; + } else { + hq_output_head = op->ldpc_dec.harq_combined_output.data; + 
hq_output = op->ldpc_dec.harq_combined_output.data; + next_triplet = acc100_dma_fill_blk_type_out( + &desc->req, + op->ldpc_dec.harq_combined_output.data, + op->ldpc_dec.harq_combined_output.offset, + harq_dma_length_out, + next_triplet, + ACC100_DMA_BLKID_OUT_HARQ); + /* HARQ output */ + mbuf_append(hq_output_head, hq_output, harq_dma_length_out); + op->ldpc_dec.harq_combined_output.length = + harq_dma_length_out; + } + desc->req.data_ptrs[next_triplet - 1].last = 1; + desc->req.d2mlen = next_triplet - desc->req.m2dlen; + desc->req.op_addr = op; + + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + /** Enqueue one decode operations for ACC100 device in CB mode */ static inline int enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op, uint16_t total_enqueued_cbs, bool same_op) { int ret; + if (unlikely(check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) { + ret = harq_loopback(q, op, total_enqueued_cbs); + return ret; + } union acc100_dma_desc *desc; uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) From patchwork Fri Sep 4 17:54:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76580 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 95944A04C5; Fri, 4 Sep 2020 19:59:17 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1AF8F1C120; Fri, 4 Sep 2020 19:58:19 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 7EC9A1C0CC for ; Fri, 4 Sep 2020 19:58:11 +0200 (CEST) IronPort-SDR: j7V5jdbnDucI9bXl3UZ7KkUUeD3xVXSevWeuUXLaVQGs/EaAgROBN1X701ojvLr604FdmOUaEp Zcnxl0HANGWw== X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="242614840" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614840" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: dVw4UkkBrRQWZpieLwjDB0LzNXWIKJC8aJvqwfKNIwZ3cCHUjC1goya6hNYmsokv+1SAzBBFhz 85EkAxTxAejw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765395" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:03 -0700 Message-Id: <1599242047-58232-8-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: 
"dev" Adding capability for 4G encode and decoder processing Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++-- 1 file changed, 943 insertions(+), 67 deletions(-) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index 5b011a1..bd07def 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -339,7 +339,6 @@ free_base_addresses(base_addrs, i); } - /* Allocate 64MB memory used for all software rings */ static int acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id) @@ -637,6 +636,41 @@ static const struct rte_bbdev_op_cap bbdev_capabilities[] = { { + .type = RTE_BBDEV_OP_TURBO_DEC, + .cap.turbo_dec = { + .capability_flags = + RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE | + RTE_BBDEV_TURBO_CRC_TYPE_24B | + RTE_BBDEV_TURBO_HALF_ITERATION_EVEN | + RTE_BBDEV_TURBO_EARLY_TERMINATION | + RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN | + RTE_BBDEV_TURBO_MAP_DEC | + RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | + RTE_BBDEV_TURBO_DEC_SCATTER_GATHER, + .max_llr_modulus = INT8_MAX, + .num_buffers_src = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_hard_out = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_soft_out = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + } + }, + { + .type = RTE_BBDEV_OP_TURBO_ENC, + .cap.turbo_enc = { + .capability_flags = + RTE_BBDEV_TURBO_CRC_24B_ATTACH | + RTE_BBDEV_TURBO_RV_INDEX_BYPASS | + RTE_BBDEV_TURBO_RATE_MATCH | + RTE_BBDEV_TURBO_ENC_SCATTER_GATHER, + .num_buffers_src = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + .num_buffers_dst = + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, + } + }, + { .type = RTE_BBDEV_OP_LDPC_ENC, .cap.ldpc_enc = { .capability_flags = @@ -719,7 +753,6 @@ #endif } - static const struct rte_bbdev_ops acc100_bbdev_ops = { .setup_queues = acc100_setup_queues, .close = acc100_dev_close, @@ -763,6 +796,58 @@ return tail; } +/* Fill in a frame control word for turbo encoding. */ +static inline void +acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw) +{ + fcw->code_block_mode = op->turbo_enc.code_block_mode; + if (fcw->code_block_mode == 0) { /* For TB mode */ + fcw->k_neg = op->turbo_enc.tb_params.k_neg; + fcw->k_pos = op->turbo_enc.tb_params.k_pos; + fcw->c_neg = op->turbo_enc.tb_params.c_neg; + fcw->c = op->turbo_enc.tb_params.c; + fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg; + fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos; + + if (check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RATE_MATCH)) { + fcw->bypass_rm = 0; + fcw->cab = op->turbo_enc.tb_params.cab; + fcw->ea = op->turbo_enc.tb_params.ea; + fcw->eb = op->turbo_enc.tb_params.eb; + } else { + /* E is set to the encoding output size when RM is + * bypassed. + */ + fcw->bypass_rm = 1; + fcw->cab = fcw->c_neg; + fcw->ea = 3 * fcw->k_neg + 12; + fcw->eb = 3 * fcw->k_pos + 12; + } + } else { /* For CB mode */ + fcw->k_pos = op->turbo_enc.cb_params.k; + fcw->ncb_pos = op->turbo_enc.cb_params.ncb; + + if (check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RATE_MATCH)) { + fcw->bypass_rm = 0; + fcw->eb = op->turbo_enc.cb_params.e; + } else { + /* E is set to the encoding output size when RM is + * bypassed. 
+ */ + fcw->bypass_rm = 1; + fcw->eb = 3 * fcw->k_pos + 12; + } + } + + fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_RV_INDEX_BYPASS); + fcw->code_block_crc = check_bit(op->turbo_enc.op_flags, + RTE_BBDEV_TURBO_CRC_24B_ATTACH); + fcw->rv_idx1 = op->turbo_enc.rv_index; +} + /* Compute value of k0. * Based on 3GPP 38.212 Table 5.4.2.1-2 * Starting position of different redundancy versions, k0 @@ -813,6 +898,25 @@ fcw->mcb_count = num_cb; } +/* Fill in a frame control word for turbo decoding. */ +static inline void +acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw) +{ + /* Note : Early termination is always enabled for 4GUL */ + fcw->fcw_ver = 1; + if (op->turbo_dec.code_block_mode == 0) + fcw->k_pos = op->turbo_dec.tb_params.k_pos; + else + fcw->k_pos = op->turbo_dec.cb_params.k; + fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_CRC_TYPE_24B); + fcw->bypass_sb_deint = 0; + fcw->raw_decoder_input_on = 0; + fcw->max_iter = op->turbo_dec.iter_max; + fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_HALF_ITERATION_EVEN); +} + /* Fill in a frame control word for LDPC decoding. */ static inline void acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw, @@ -1042,6 +1146,87 @@ } static inline int +acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op, + struct acc100_dma_req_desc *desc, struct rte_mbuf **input, + struct rte_mbuf *output, uint32_t *in_offset, + uint32_t *out_offset, uint32_t *out_length, + uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r) +{ + int next_triplet = 1; /* FCW already done */ + uint32_t e, ea, eb, length; + uint16_t k, k_neg, k_pos; + uint8_t cab, c_neg; + + desc->word0 = ACC100_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + if (op->turbo_enc.code_block_mode == 0) { + ea = op->turbo_enc.tb_params.ea; + eb = op->turbo_enc.tb_params.eb; + cab = op->turbo_enc.tb_params.cab; + k_neg = op->turbo_enc.tb_params.k_neg; + k_pos = op->turbo_enc.tb_params.k_pos; + c_neg = op->turbo_enc.tb_params.c_neg; + e = (r < cab) ? ea : eb; + k = (r < c_neg) ? 
k_neg : k_pos; + } else { + e = op->turbo_enc.cb_params.e; + k = op->turbo_enc.cb_params.k; + } + + if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH)) + length = (k - 24) >> 3; + else + length = k >> 3; + + if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + *mbuf_total_left, length); + return -1; + } + + next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, + length, seg_total_left, next_triplet); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + desc->data_ptrs[next_triplet - 1].last = 1; + desc->m2dlen = next_triplet; + *mbuf_total_left -= length; + + /* Set output length */ + if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH)) + /* Integer round up division by 8 */ + *out_length = (e + 7) >> 3; + else + *out_length = (k >> 3) * 3 + 2; + + next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset, + *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC); + if (unlikely(next_triplet < 0)) { + rte_bbdev_log(ERR, + "Mismatch between data to process and mbuf data length in bbdev_op: %p", + op); + return -1; + } + op->turbo_enc.output.length += *out_length; + *out_offset += *out_length; + desc->data_ptrs[next_triplet - 1].last = 1; + desc->d2mlen = next_triplet - desc->m2dlen; + + desc->op_addr = op; + + return 0; +} + +static inline int acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op, struct acc100_dma_req_desc *desc, struct rte_mbuf **input, struct rte_mbuf *output, uint32_t *in_offset, @@ -1110,6 +1295,117 @@ } static inline int +acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op, + struct acc100_dma_req_desc *desc, struct rte_mbuf **input, + struct rte_mbuf *h_output, struct rte_mbuf *s_output, + uint32_t *in_offset, uint32_t *h_out_offset, + uint32_t *s_out_offset, uint32_t *h_out_length, + uint32_t *s_out_length, uint32_t *mbuf_total_left, + uint32_t *seg_total_left, uint8_t r) +{ + int next_triplet = 1; /* FCW already done */ + uint16_t k; + uint16_t crc24_overlap = 0; + uint32_t e, kw; + + desc->word0 = ACC100_DMA_DESC_TYPE; + desc->word1 = 0; /**< Timestamp could be disabled */ + desc->word2 = 0; + desc->word3 = 0; + desc->numCBs = 1; + + if (op->turbo_dec.code_block_mode == 0) { + k = (r < op->turbo_dec.tb_params.c_neg) + ? op->turbo_dec.tb_params.k_neg + : op->turbo_dec.tb_params.k_pos; + e = (r < op->turbo_dec.tb_params.cab) + ? op->turbo_dec.tb_params.ea + : op->turbo_dec.tb_params.eb; + } else { + k = op->turbo_dec.cb_params.k; + e = op->turbo_dec.cb_params.e; + } + + if ((op->turbo_dec.code_block_mode == 0) + && !check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP)) + crc24_overlap = 24; + + /* Calculates circular buffer size. + * According to 3gpp 36.212 section 5.1.4.2 + * Kw = 3 * Kpi, + * where: + * Kpi = nCol * nRow + * where nCol is 32 and nRow can be calculated from: + * D =< nCol * nRow + * where D is the size of each output from turbo encoder block (k + 4). 
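+ * For example, assuming a hypothetical CB of k = 6144: D = 6148,
+ * Kpi = RTE_ALIGN_CEIL(6148, 32) = 6176, so Kw = 3 * 6176 = 18528 LLRs.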
+ */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return
1; +} + +/* Enqueue one encode operations for ACC100 device in CB mode */ +static inline int enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops, uint16_t total_enqueued_cbs, int16_t num) { @@ -1481,78 +1828,235 @@ return 1; } -static inline int -harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op, - uint16_t total_enqueued_cbs) { - struct acc100_fcw_ld *fcw; - union acc100_dma_desc *desc; - int next_triplet = 1; - struct rte_mbuf *hq_output_head, *hq_output; - uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length; - if (harq_in_length == 0) { - rte_bbdev_log(ERR, "Loopback of invalid null size\n"); - return -EINVAL; - } - int h_comp = check_bit(op->ldpc_dec.op_flags, - RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION - ) ? 1 : 0; - if (h_comp == 1) - harq_in_length = harq_in_length * 8 / 6; - harq_in_length = RTE_ALIGN(harq_in_length, 64); - uint16_t harq_dma_length_in = (h_comp == 0) ? - harq_in_length : - harq_in_length * 6 / 8; - uint16_t harq_dma_length_out = harq_dma_length_in; - bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags, - RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE); - union acc100_harq_layout_data *harq_layout = q->d->harq_layout; - uint16_t harq_index = (ddr_mem_in ? - op->ldpc_dec.harq_combined_input.offset : - op->ldpc_dec.harq_combined_output.offset) - / ACC100_HARQ_OFFSET; +/* Enqueue one encode operations for ACC100 device in TB mode. */ +static inline int +enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op, + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) +{ + union acc100_dma_desc *desc = NULL; + int ret; + uint8_t r, c; + uint32_t in_offset, out_offset, out_length, mbuf_total_left, + seg_total_left; + struct rte_mbuf *input, *output_head, *output; + uint16_t current_enqueued_cbs = 0; uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; - fcw = &desc->req.fcw_ld; - /* Set the FCW from loopback into DDR */ - memset(fcw, 0, sizeof(struct acc100_fcw_ld)); - fcw->FCWversion = ACC100_FCW_VER; - fcw->qm = 2; - fcw->Zc = 384; - if (harq_in_length < 16 * N_ZC_1) - fcw->Zc = 16; - fcw->ncb = fcw->Zc * N_ZC_1; - fcw->rm_e = 2; - fcw->hcin_en = 1; - fcw->hcout_en = 1; + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET; + acc100_fcw_te_fill(op, &desc->req.fcw_te); - rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n", - ddr_mem_in, harq_index, - harq_layout[harq_index].offset, harq_in_length, - harq_dma_length_in); + input = op->turbo_enc.input.data; + output_head = output = op->turbo_enc.output.data; + in_offset = op->turbo_enc.input.offset; + out_offset = op->turbo_enc.output.offset; + out_length = 0; + mbuf_total_left = op->turbo_enc.input.length; - if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) { - fcw->hcin_size0 = harq_layout[harq_index].size0; - fcw->hcin_offset = harq_layout[harq_index].offset; - fcw->hcin_size1 = harq_in_length - fcw->hcin_offset; - harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1); - if (h_comp == 1) - harq_dma_length_in = harq_dma_length_in * 6 / 8; - } else { - fcw->hcin_size0 = harq_in_length; - } - harq_layout[harq_index].val = 0; - rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n", - fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1); - fcw->hcout_size0 = harq_in_length; - fcw->hcin_decomp_mode = h_comp; - fcw->hcout_comp_mode = h_comp; - fcw->gain_i = 1; - fcw->gain_h = 1; + c = op->turbo_enc.tb_params.c; + r = op->turbo_enc.tb_params.r; - /* Set the prefix of 
descriptor. This could be done at polling */ + while (mbuf_total_left > 0 && r < c) { + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset; + desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN; + + ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output, + &in_offset, &out_offset, &out_length, + &mbuf_total_left, &seg_total_left, r); + if (unlikely(ret < 0)) + return ret; + mbuf_append(output_head, output, out_length); + + /* Set total number of CBs in TB */ + desc->req.cbs_in_tb = cbs_in_tb; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_te, + sizeof(desc->req.fcw_te) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + if (seg_total_left == 0) { + /* Go to the next mbuf */ + input = input->next; + in_offset = 0; + output = output->next; + out_offset = 0; + } + + total_enqueued_cbs++; + current_enqueued_cbs++; + r++; + } + + if (unlikely(desc == NULL)) + return current_enqueued_cbs; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Check if any CBs left for processing */ + if (mbuf_total_left != 0) { + rte_bbdev_log(ERR, + "Some date still left for processing: mbuf_total_left = %u", + mbuf_total_left); + return -EINVAL; + } +#endif + + /* Set SDone on last CB descriptor for TB mode. */ + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + return current_enqueued_cbs; +} + +/** Enqueue one decode operations for ACC100 device in CB mode */ +static inline int +enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs) +{ + union acc100_dma_desc *desc = NULL; + int ret; + uint32_t in_offset, h_out_offset, s_out_offset, s_out_length, + h_out_length, mbuf_total_left, seg_total_left; + struct rte_mbuf *input, *h_output_head, *h_output, + *s_output_head, *s_output; + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + acc100_fcw_td_fill(op, &desc->req.fcw_td); + + input = op->turbo_dec.input.data; + h_output_head = h_output = op->turbo_dec.hard_output.data; + s_output_head = s_output = op->turbo_dec.soft_output.data; + in_offset = op->turbo_dec.input.offset; + h_out_offset = op->turbo_dec.hard_output.offset; + s_out_offset = op->turbo_dec.soft_output.offset; + h_out_length = s_out_length = 0; + mbuf_total_left = op->turbo_dec.input.length; + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(input == NULL)) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + return -EFAULT; + } +#endif + + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + + ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output, + s_output, &in_offset, &h_out_offset, &s_out_offset, + &h_out_length, &s_out_length, &mbuf_total_left, + &seg_total_left, 0); + + if (unlikely(ret < 0)) + return ret; + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + + /* Soft output */ + if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) + mbuf_append(s_output_head, s_output, s_out_length); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_td, + sizeof(desc->req.fcw_td) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); + + /* Check if any CBs left for processing */ + if 
(mbuf_total_left != 0) { + rte_bbdev_log(ERR, + "Some date still left after processing one CB: mbuf_total_left = %u", + mbuf_total_left); + return -EINVAL; + } +#endif + + /* One CB (one op) was successfully prepared to enqueue */ + return 1; +} + +static inline int +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs) { + struct acc100_fcw_ld *fcw; + union acc100_dma_desc *desc; + int next_triplet = 1; + struct rte_mbuf *hq_output_head, *hq_output; + uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length; + if (harq_in_length == 0) { + rte_bbdev_log(ERR, "Loopback of invalid null size\n"); + return -EINVAL; + } + + int h_comp = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION + ) ? 1 : 0; + if (h_comp == 1) + harq_in_length = harq_in_length * 8 / 6; + harq_in_length = RTE_ALIGN(harq_in_length, 64); + uint16_t harq_dma_length_in = (h_comp == 0) ? + harq_in_length : + harq_in_length * 6 / 8; + uint16_t harq_dma_length_out = harq_dma_length_in; + bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE); + union acc100_harq_layout_data *harq_layout = q->d->harq_layout; + uint16_t harq_index = (ddr_mem_in ? + op->ldpc_dec.harq_combined_input.offset : + op->ldpc_dec.harq_combined_output.offset) + / ACC100_HARQ_OFFSET; + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + fcw = &desc->req.fcw_ld; + /* Set the FCW from loopback into DDR */ + memset(fcw, 0, sizeof(struct acc100_fcw_ld)); + fcw->FCWversion = ACC100_FCW_VER; + fcw->qm = 2; + fcw->Zc = 384; + if (harq_in_length < 16 * N_ZC_1) + fcw->Zc = 16; + fcw->ncb = fcw->Zc * N_ZC_1; + fcw->rm_e = 2; + fcw->hcin_en = 1; + fcw->hcout_en = 1; + + rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n", + ddr_mem_in, harq_index, + harq_layout[harq_index].offset, harq_in_length, + harq_dma_length_in); + + if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) { + fcw->hcin_size0 = harq_layout[harq_index].size0; + fcw->hcin_offset = harq_layout[harq_index].offset; + fcw->hcin_size1 = harq_in_length - fcw->hcin_offset; + harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1); + if (h_comp == 1) + harq_dma_length_in = harq_dma_length_in * 6 / 8; + } else { + fcw->hcin_size0 = harq_in_length; + } + harq_layout[harq_index].val = 0; + rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n", + fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1); + fcw->hcout_size0 = harq_in_length; + fcw->hcin_decomp_mode = h_comp; + fcw->hcout_comp_mode = h_comp; + fcw->gain_i = 1; + fcw->gain_h = 1; + + /* Set the prefix of descriptor. 
This could be done at polling */ desc->req.word0 = ACC100_DMA_DESC_TYPE; desc->req.word1 = 0; /**< Timestamp could be disabled */ desc->req.word2 = 0; @@ -1816,6 +2320,107 @@ return current_enqueued_cbs; } +/* Enqueue one decode operations for ACC100 device in TB mode */ +static inline int +enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op, + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) +{ + union acc100_dma_desc *desc = NULL; + int ret; + uint8_t r, c; + uint32_t in_offset, h_out_offset, s_out_offset, s_out_length, + h_out_length, mbuf_total_left, seg_total_left; + struct rte_mbuf *input, *h_output_head, *h_output, + *s_output_head, *s_output; + uint16_t current_enqueued_cbs = 0; + + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc = q->ring_addr + desc_idx; + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET; + acc100_fcw_td_fill(op, &desc->req.fcw_td); + + input = op->turbo_dec.input.data; + h_output_head = h_output = op->turbo_dec.hard_output.data; + s_output_head = s_output = op->turbo_dec.soft_output.data; + in_offset = op->turbo_dec.input.offset; + h_out_offset = op->turbo_dec.hard_output.offset; + s_out_offset = op->turbo_dec.soft_output.offset; + h_out_length = s_out_length = 0; + mbuf_total_left = op->turbo_dec.input.length; + c = op->turbo_dec.tb_params.c; + r = op->turbo_dec.tb_params.r; + + while (mbuf_total_left > 0 && r < c) { + + seg_total_left = rte_pktmbuf_data_len(input) - in_offset; + + /* Set up DMA descriptor */ + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs) + & q->sw_ring_wrap_mask); + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset; + desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN; + ret = acc100_dma_desc_td_fill(op, &desc->req, &input, + h_output, s_output, &in_offset, &h_out_offset, + &s_out_offset, &h_out_length, &s_out_length, + &mbuf_total_left, &seg_total_left, r); + + if (unlikely(ret < 0)) + return ret; + + /* Hard output */ + mbuf_append(h_output_head, h_output, h_out_length); + + /* Soft output */ + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SOFT_OUTPUT)) + mbuf_append(s_output_head, s_output, s_out_length); + + /* Set total number of CBs in TB */ + desc->req.cbs_in_tb = cbs_in_tb; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + rte_memdump(stderr, "FCW", &desc->req.fcw_td, + sizeof(desc->req.fcw_td) - 8); + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc)); +#endif + + if (seg_total_left == 0) { + /* Go to the next mbuf */ + input = input->next; + in_offset = 0; + h_output = h_output->next; + h_out_offset = 0; + + if (check_bit(op->turbo_dec.op_flags, + RTE_BBDEV_TURBO_SOFT_OUTPUT)) { + s_output = s_output->next; + s_out_offset = 0; + } + } + + total_enqueued_cbs++; + current_enqueued_cbs++; + r++; + } + + if (unlikely(desc == NULL)) + return current_enqueued_cbs; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Check if any CBs left for processing */ + if (mbuf_total_left != 0) { + rte_bbdev_log(ERR, + "Some date still left for processing: mbuf_total_left = %u", + mbuf_total_left); + return -EINVAL; + } +#endif + /* Set SDone on last CB descriptor for TB mode */ + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + return current_enqueued_cbs; +} /* Calculates number of CBs in processed encoder TB based on 'r' and input * length. @@ -1893,6 +2498,45 @@ return cbs_in_tb; } +/* Enqueue encode operations for ACC100 device in CB mode. 
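+ *
+ * Each op consumes one ring descriptor; SDone and the queue's IRQ
+ * enable setting are applied only to the last descriptor, so a single
+ * completion covers the whole enqueued burst.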
*/ +static uint16_t +acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head; + uint16_t i; + union acc100_dma_desc *desc; + int ret; + + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail - 1 < 0)) + break; + avail -= 1; + + ret = enqueue_enc_one_op_cb(q, ops[i], i); + if (ret < 0) + break; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc100_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + return i; +} + /* Check we can mux encode operations with common FCW */ static inline bool check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) { @@ -1960,6 +2604,52 @@ return i; } +/* Enqueue encode operations for ACC100 device in TB mode. */ +static uint16_t +acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head; + uint16_t i, enqueued_cbs = 0; + uint8_t cbs_in_tb; + int ret; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc); + /* Check if there are available space for further processing */ + if (unlikely(avail - cbs_in_tb < 0)) + break; + avail -= cbs_in_tb; + + ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb); + if (ret < 0) + break; + enqueued_cbs += ret; + } + + acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + +/* Enqueue encode operations for ACC100 device. */ +static uint16_t +acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + if (unlikely(num == 0)) + return 0; + if (ops[0]->turbo_enc.code_block_mode == 0) + return acc100_enqueue_enc_tb(q_data, ops, num); + else + return acc100_enqueue_enc_cb(q_data, ops, num); +} + /* Enqueue encode operations for ACC100 device. 
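 *
 * The first op's code_block_mode selects the path for the whole burst:
 * transport block mode when 0, code block mode otherwise.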
*/ static uint16_t acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, @@ -1967,7 +2657,51 @@ { if (unlikely(num == 0)) return 0; - return acc100_enqueue_ldpc_enc_cb(q_data, ops, num); + if (ops[0]->ldpc_enc.code_block_mode == 0) + return acc100_enqueue_enc_tb(q_data, ops, num); + else + return acc100_enqueue_ldpc_enc_cb(q_data, ops, num); +} + + +/* Enqueue decode operations for ACC100 device in CB mode */ +static uint16_t +acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head; + uint16_t i; + union acc100_dma_desc *desc; + int ret; + + for (i = 0; i < num; ++i) { + /* Check if there are available space for further processing */ + if (unlikely(avail - 1 < 0)) + break; + avail -= 1; + + ret = enqueue_dec_one_op_cb(q, ops[i], i); + if (ret < 0) + break; + } + + if (unlikely(i == 0)) + return 0; /* Nothing to enqueue */ + + /* Set SDone in last CB in enqueued ops for CB mode*/ + desc = q->ring_addr + ((q->sw_ring_head + i - 1) + & q->sw_ring_wrap_mask); + desc->req.sdone_enable = 1; + desc->req.irq_enable = q->irq_enable; + + acc100_dma_enqueue(q, i, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; } /* Check we can mux encode operations with common FCW */ @@ -2065,6 +2799,53 @@ return i; } + +/* Enqueue decode operations for ACC100 device in TB mode */ +static uint16_t +acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head; + uint16_t i, enqueued_cbs = 0; + uint8_t cbs_in_tb; + int ret; + + for (i = 0; i < num; ++i) { + cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec); + /* Check if there are available space for further processing */ + if (unlikely(avail - cbs_in_tb < 0)) + break; + avail -= cbs_in_tb; + + ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb); + if (ret < 0) + break; + enqueued_cbs += ret; + } + + acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats); + + /* Update stats */ + q_data->queue_stats.enqueued_count += i; + q_data->queue_stats.enqueue_err_count += num - i; + + return i; +} + +/* Enqueue decode operations for ACC100 device. */ +static uint16_t +acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_dec_op **ops, uint16_t num) +{ + if (unlikely(num == 0)) + return 0; + if (ops[0]->turbo_dec.code_block_mode == 0) + return acc100_enqueue_dec_tb(q_data, ops, num); + else + return acc100_enqueue_dec_cb(q_data, ops, num); +} + /* Enqueue decode operations for ACC100 device. */ static uint16_t acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, @@ -2388,6 +3169,51 @@ return cb_idx; } +/* Dequeue encode operations from ACC100 device. */ +static uint16_t +acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + struct acc100_queue *q = q_data->queue_private; + uint16_t dequeue_num; + uint32_t avail = q->sw_ring_head - q->sw_ring_tail; + uint32_t aq_dequeued = 0; + uint16_t i; + uint16_t dequeued_cbs = 0; + struct rte_bbdev_enc_op *op; + int ret; + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + if (unlikely(ops == 0 && q == NULL)) + return 0; +#endif + + dequeue_num = (avail < num) ? 
avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device.
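 *
 * The op pointer is read back from the completed descriptor (op_addr)
 * and its code_block_mode selects the TB or CB dequeue path per op.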
*/
static uint16_t
acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);

 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;

From patchwork Fri Sep 4 17:54:04 2020
X-Patchwork-Submitter: Nicolas Chautru
X-Patchwork-Id: 76584
X-Patchwork-Delegate: gakhil@marvell.com
From: Nicolas Chautru
To: dev@dpdk.org, akhil.goyal@nxp.com
Cc: bruce.richardson@intel.com, rosen.xu@intel.com,
 dave.burley@accelercomm.com, aidan.goddard@accelercomm.com,
 ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru
Date: Fri, 4 Sep 2020 10:54:04 -0700
Message-Id: <1599242047-58232-9-git-send-email-nicolas.chautru@intel.com>
In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com>
 <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD

Adding capability and functions to support MSI interrupts, callbacks and
the info ring.
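For context, an illustrative sketch (not part of this patch) of how an
application would typically consume these events through the public bbdev
API; dequeue_event_cb and setup_dequeue_irq are hypothetical names, and
reading queue_id from ret_param assumes the dequeue details structure
added below, whose first field is the queue id.

	#include <stdio.h>
	#include <rte_bbdev.h>

	/* Hypothetical callback: for RTE_BBDEV_EVENT_DEQUEUE the PMD
	 * passes the dequeue details as ret_param. */
	static void
	dequeue_event_cb(uint16_t dev_id, enum rte_bbdev_event_type event,
			void *cb_arg, void *ret_param)
	{
		(void)cb_arg;
		if (event == RTE_BBDEV_EVENT_DEQUEUE && ret_param != NULL)
			printf("dev %u: queue %u ready for dequeue\n",
					dev_id, *(uint16_t *)ret_param);
	}

	/* Hypothetical helper wiring the callback and per-queue IRQ */
	static int
	setup_dequeue_irq(uint16_t dev_id, uint16_t queue_id)
	{
		int ret = rte_bbdev_callback_register(dev_id,
				RTE_BBDEV_EVENT_DEQUEUE, dequeue_event_cb,
				NULL);

		if (ret == 0)
			ret = rte_bbdev_queue_intr_enable(dev_id, queue_id);
		return ret;
	}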
Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++- drivers/baseband/acc100/rte_acc100_pmd.h | 15 ++ 2 files changed, 300 insertions(+), 3 deletions(-) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index bd07def..54b5917 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -339,6 +339,213 @@ free_base_addresses(base_addrs, i); } +/* + * Find queue_id of a device queue based on details from the Info Ring. + * If a queue isn't found UINT16_MAX is returned. + */ +static inline uint16_t +get_queue_id_from_ring_info(struct rte_bbdev_data *data, + const union acc100_info_ring_data ring_data) +{ + uint16_t queue_id; + + for (queue_id = 0; queue_id < data->num_queues; ++queue_id) { + struct acc100_queue *acc100_q = + data->queues[queue_id].queue_private; + if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id && + acc100_q->qgrp_id == ring_data.qg_id && + acc100_q->vf_id == ring_data.vf_id) + return queue_id; + } + + return UINT16_MAX; +} + +/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */ +static inline void +acc100_check_ir(struct acc100_device *acc100_dev) +{ + volatile union acc100_info_ring_data *ring_data; + uint16_t info_ring_head = acc100_dev->info_ring_head; + if (acc100_dev->info_ring == NULL) + return; + + ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head & + ACC100_INFO_RING_MASK); + + while (ring_data->valid) { + if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || ( + ring_data->int_nb > + ACC100_PF_INT_DMA_DL5G_DESC_IRQ)) + rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x", + ring_data->int_nb, ring_data->detailed_info); + /* Initialize Info Ring entry and move forward */ + ring_data->val = 0; + info_ring_head++; + ring_data = acc100_dev->info_ring + + (info_ring_head & ACC100_INFO_RING_MASK); + } +} + +/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */ +static inline void +acc100_pf_interrupt_handler(struct rte_bbdev *dev) +{ + struct acc100_device *acc100_dev = dev->data->dev_private; + volatile union acc100_info_ring_data *ring_data; + struct acc100_deq_intr_details deq_intr_det; + + ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head & + ACC100_INFO_RING_MASK); + + while (ring_data->valid) { + + rte_bbdev_log_debug( + "ACC100 PF Interrupt received, Info Ring data: 0x%x", + ring_data->val); + + switch (ring_data->int_nb) { + case ACC100_PF_INT_DMA_DL_DESC_IRQ: + case ACC100_PF_INT_DMA_UL_DESC_IRQ: + case ACC100_PF_INT_DMA_UL5G_DESC_IRQ: + case ACC100_PF_INT_DMA_DL5G_DESC_IRQ: + deq_intr_det.queue_id = get_queue_id_from_ring_info( + dev->data, *ring_data); + if (deq_intr_det.queue_id == UINT16_MAX) { + rte_bbdev_log(ERR, + "Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u", + ring_data->aq_id, + ring_data->qg_id, + ring_data->vf_id); + return; + } + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det); + break; + default: + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_ERROR, NULL); + break; + } + + /* Initialize Info Ring entry and move forward */ + ring_data->val = 0; + ++acc100_dev->info_ring_head; + ring_data = acc100_dev->info_ring + + (acc100_dev->info_ring_head & + ACC100_INFO_RING_MASK); + } +} + +/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */ +static inline void +acc100_vf_interrupt_handler(struct rte_bbdev *dev) +{ + 
struct acc100_device *acc100_dev = dev->data->dev_private; + volatile union acc100_info_ring_data *ring_data; + struct acc100_deq_intr_details deq_intr_det; + + ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head & + ACC100_INFO_RING_MASK); + + while (ring_data->valid) { + + rte_bbdev_log_debug( + "ACC100 VF Interrupt received, Info Ring data: 0x%x", + ring_data->val); + + switch (ring_data->int_nb) { + case ACC100_VF_INT_DMA_DL_DESC_IRQ: + case ACC100_VF_INT_DMA_UL_DESC_IRQ: + case ACC100_VF_INT_DMA_UL5G_DESC_IRQ: + case ACC100_VF_INT_DMA_DL5G_DESC_IRQ: + /* VFs are not aware of their vf_id - it's set to 0 in + * queue structures. + */ + ring_data->vf_id = 0; + deq_intr_det.queue_id = get_queue_id_from_ring_info( + dev->data, *ring_data); + if (deq_intr_det.queue_id == UINT16_MAX) { + rte_bbdev_log(ERR, + "Couldn't find queue: aq_id: %u, qg_id: %u", + ring_data->aq_id, + ring_data->qg_id); + return; + } + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det); + break; + default: + rte_bbdev_pmd_callback_process(dev, + RTE_BBDEV_EVENT_ERROR, NULL); + break; + } + + /* Initialize Info Ring entry and move forward */ + ring_data->valid = 0; + ++acc100_dev->info_ring_head; + ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head + & ACC100_INFO_RING_MASK); + } +} + +/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */ +static void +acc100_dev_interrupt_handler(void *cb_arg) +{ + struct rte_bbdev *dev = cb_arg; + struct acc100_device *acc100_dev = dev->data->dev_private; + + /* Read info ring */ + if (acc100_dev->pf_device) + acc100_pf_interrupt_handler(dev); + else + acc100_vf_interrupt_handler(dev); +} + +/* Allocate and setup inforing */ +static int +allocate_inforing(struct rte_bbdev *dev) +{ + struct acc100_device *d = dev->data->dev_private; + const struct acc100_registry_addr *reg_addr; + rte_iova_t info_ring_phys; + uint32_t phys_low, phys_high; + + if (d->info_ring != NULL) + return 0; /* Already configured */ + + /* Choose correct registry addresses for the device type */ + if (d->pf_device) + reg_addr = &pf_reg_addr; + else + reg_addr = &vf_reg_addr; + /* Allocate InfoRing */ + d->info_ring = rte_zmalloc_socket("Info Ring", + ACC100_INFO_RING_NUM_ENTRIES * + sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE, + dev->data->socket_id); + if (d->info_ring == NULL) { + rte_bbdev_log(ERR, + "Failed to allocate Info Ring for %s:%u", + dev->device->driver->name, + dev->data->dev_id); + return -ENOMEM; + } + info_ring_phys = rte_malloc_virt2iova(d->info_ring); + + /* Setup Info Ring */ + phys_high = (uint32_t)(info_ring_phys >> 32); + phys_low = (uint32_t)(info_ring_phys); + acc100_reg_write(d, reg_addr->info_ring_hi, phys_high); + acc100_reg_write(d, reg_addr->info_ring_lo, phys_low); + acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL); + d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) & + 0xFFF) / sizeof(union acc100_info_ring_data); + return 0; +} + + /* Allocate 64MB memory used for all software rings */ static int acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id) @@ -426,6 +633,7 @@ acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high); acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low); + allocate_inforing(dev); d->harq_layout = rte_zmalloc_socket("HARQ Layout", ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout), RTE_CACHE_LINE_SIZE, dev->data->socket_id); @@ -437,13 +645,53 @@ return 0; } +static int +acc100_intr_enable(struct rte_bbdev *dev) +{ + int 
ret; + struct acc100_device *d = dev->data->dev_private; + + /* Only MSI are currently supported */ + if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI || + dev->intr_handle->type == RTE_INTR_HANDLE_UIO) { + + allocate_inforing(dev); + + ret = rte_intr_enable(dev->intr_handle); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't enable interrupts for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + ret = rte_intr_callback_register(dev->intr_handle, + acc100_dev_interrupt_handler, dev); + if (ret < 0) { + rte_bbdev_log(ERR, + "Couldn't register interrupt callback for device: %s", + dev->data->name); + rte_free(d->info_ring); + return ret; + } + + return 0; + } + + rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts", + dev->data->name); + return -ENOTSUP; +} + /* Free 64MB memory used for software rings */ static int acc100_dev_close(struct rte_bbdev *dev) { struct acc100_device *d = dev->data->dev_private; + acc100_check_ir(d); if (d->sw_rings_base != NULL) { rte_free(d->tail_ptrs); + rte_free(d->info_ring); rte_free(d->sw_rings_base); d->sw_rings_base = NULL; } @@ -643,6 +891,7 @@ RTE_BBDEV_TURBO_CRC_TYPE_24B | RTE_BBDEV_TURBO_HALF_ITERATION_EVEN | RTE_BBDEV_TURBO_EARLY_TERMINATION | + RTE_BBDEV_TURBO_DEC_INTERRUPTS | RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN | RTE_BBDEV_TURBO_MAP_DEC | RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | @@ -663,6 +912,7 @@ RTE_BBDEV_TURBO_CRC_24B_ATTACH | RTE_BBDEV_TURBO_RV_INDEX_BYPASS | RTE_BBDEV_TURBO_RATE_MATCH | + RTE_BBDEV_TURBO_ENC_INTERRUPTS | RTE_BBDEV_TURBO_ENC_SCATTER_GATHER, .num_buffers_src = RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, @@ -676,7 +926,8 @@ .capability_flags = RTE_BBDEV_LDPC_RATE_MATCH | RTE_BBDEV_LDPC_CRC_24B_ATTACH | - RTE_BBDEV_LDPC_INTERLEAVER_BYPASS, + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS | + RTE_BBDEV_LDPC_ENC_INTERRUPTS, .num_buffers_src = RTE_BBDEV_LDPC_MAX_CODE_BLOCKS, .num_buffers_dst = @@ -701,7 +952,8 @@ RTE_BBDEV_LDPC_DECODE_BYPASS | RTE_BBDEV_LDPC_DEC_SCATTER_GATHER | RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION | - RTE_BBDEV_LDPC_LLR_COMPRESSION, + RTE_BBDEV_LDPC_LLR_COMPRESSION | + RTE_BBDEV_LDPC_DEC_INTERRUPTS, .llr_size = 8, .llr_decimals = 1, .num_buffers_src = @@ -751,14 +1003,39 @@ #else dev_info->harq_buffer_size = 0; #endif + acc100_check_ir(d); +} + +static int +acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id) +{ + struct acc100_queue *q = dev->data->queues[queue_id].queue_private; + + if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI && + dev->intr_handle->type != RTE_INTR_HANDLE_UIO) + return -ENOTSUP; + + q->irq_enable = 1; + return 0; +} + +static int +acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id) +{ + struct acc100_queue *q = dev->data->queues[queue_id].queue_private; + q->irq_enable = 0; + return 0; } static const struct rte_bbdev_ops acc100_bbdev_ops = { .setup_queues = acc100_setup_queues, + .intr_enable = acc100_intr_enable, .close = acc100_dev_close, .info_get = acc100_dev_info_get, .queue_setup = acc100_queue_setup, .queue_release = acc100_queue_release, + .queue_intr_enable = acc100_queue_intr_enable, + .queue_intr_disable = acc100_queue_intr_disable }; /* ACC100 PCI PF address map */ @@ -3018,8 +3295,10 @@ ? (1 << RTE_BBDEV_DATA_ERROR) : 0); op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0); op->status |= ((rsp.fcw_err) ? 
(1 << RTE_BBDEV_DRV_ERROR) : 0); - if (op->status != 0) + if (op->status != 0) { q_data->queue_stats.dequeue_err_count++; + acc100_check_ir(q->d); + } /* CRC invalid if error exists */ if (!op->status) @@ -3076,6 +3355,9 @@ op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR; op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt; + if (op->status & (1 << RTE_BBDEV_DRV_ERROR)) + acc100_check_ir(q->d); + /* Check if this is the last desc in batch (Atomic Queue) */ if (desc->req.last_desc_in_batch) { (*aq_dequeued)++; diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h index 78686c1..8980fa5 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.h +++ b/drivers/baseband/acc100/rte_acc100_pmd.h @@ -559,7 +559,14 @@ struct acc100_device { /* Virtual address of the info memory routed to the this function under * operation, whether it is PF or VF. */ + union acc100_info_ring_data *info_ring; + union acc100_harq_layout_data *harq_layout; + /* Virtual Info Ring head */ + uint16_t info_ring_head; + /* Number of bytes available for each queue in device, depending on + * how many queues are enabled with configure() + */ uint32_t sw_ring_size; uint32_t ddr_size; /* Size in kB */ uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */ @@ -575,4 +582,12 @@ struct acc100_device { bool configured; /**< True if this ACC100 device is configured */ }; +/** + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to + * the callback function. + */ +struct acc100_deq_intr_details { + uint16_t queue_id; +}; + #endif /* _RTE_ACC100_PMD_H_ */ From patchwork Fri Sep 4 17:54:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76581 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B3182A04C5; Fri, 4 Sep 2020 19:59:28 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 2FDA01C125; Fri, 4 Sep 2020 19:58:20 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id AC33B1C0BF for ; Fri, 4 Sep 2020 19:58:11 +0200 (CEST) IronPort-SDR: xaymRzR7UiKNuJxUa7KIzlC//ED0kQ/zwfRyJAZNEUtl7alACFBw/ItNrmIOnDgsNZcw2Lz+Ak 2nHhHpyMqOpQ== X-IronPort-AV: E=McAfee;i="6000,8403,9734"; a="242614843" X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="242614843" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 10:58:06 -0700 IronPort-SDR: 4sqnN5HG1PWE4jm7p43DvxGYpQe81tkRNK8ULHuWrjdiTqQQhViuxXKcKzkgSgy4eLaPFaVIP0 pjQW+7p8OGyQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,390,1592895600"; d="scan'208";a="326765403" Received: from skx-5gnr-sc12-4.sc.intel.com ([172.25.69.210]) by fmsmga004.fm.intel.com with ESMTP; 04 Sep 2020 10:58:06 -0700 From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:05 -0700 Message-Id: <1599242047-58232-10-git-send-email-nicolas.chautru@intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: 
From patchwork Fri Sep 4 17:54:05 2020 X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76581 X-Patchwork-Delegate: gakhil@marvell.com From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:05 -0700 Message-Id: <1599242047-58232-10-git-send-email-nicolas.chautru@intel.com> In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input Add debug functions to validate the input parameters passed through the user API. These checks are only enabled in DEBUG mode at build time. Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++ 1 file changed, 424 insertions(+) diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index 54b5917..e64d5e2 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -1945,6 +1945,231 @@ } +#ifdef RTE_LIBRTE_BBDEV_DEBUG +/* Validates turbo encoder parameters */ +static inline int +validate_enc_op(struct rte_bbdev_enc_op *op) +{ + struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc; + struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL; + struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL; + uint16_t kw, kw_neg, kw_pos; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if (turbo_enc->input.data == NULL) { + rte_bbdev_log(ERR, "Invalid input pointer"); + return -1; + } + if (turbo_enc->output.data == NULL) { + rte_bbdev_log(ERR, "Invalid output pointer"); + return -1; + } + if (turbo_enc->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + turbo_enc->rv_index); + return -1; + } + if (turbo_enc->code_block_mode != 0 && + turbo_enc->code_block_mode != 1) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + turbo_enc->code_block_mode); + return -1; + } + + if (turbo_enc->code_block_mode == 0) { + tb = &turbo_enc->tb_params; + if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c_neg > 0) { + rte_bbdev_log(ERR, + "k_neg (%u) is out of range %u <= value <= %u", + tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k_pos (%u) is out of range %u <= value <= %u", + tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) + rte_bbdev_log(ERR, + "c_neg (%u) is out of range 0 <= value <= %u", + tb->c_neg, + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1); + if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) { + rte_bbdev_log(ERR, + "c (%u) is out of range 1 <= value <= %u", + tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS); + return -1; + } + if (tb->cab > tb->c) { + rte_bbdev_log(ERR, + "cab (%u) is greater than c (%u)", + tb->cab, tb->c); + return -1; + } + if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2)) + && tb->r < tb->cab) { + rte_bbdev_log(ERR, + "ea (%u) is less than %u or it is not even", + tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2)) + && tb->c > tb->cab) { + rte_bbdev_log(ERR, + "eb (%u) is less than %u or it is not even", + tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + + kw_neg = 3 *
RTE_ALIGN_CEIL(tb->k_neg + 4, + RTE_BBDEV_TURBO_C_SUBBLOCK); + if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) { + rte_bbdev_log(ERR, + "ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg", + tb->ncb_neg, tb->k_neg, kw_neg); + return -1; + } + + kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4, + RTE_BBDEV_TURBO_C_SUBBLOCK); + if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) { + rte_bbdev_log(ERR, + "ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos", + tb->ncb_pos, tb->k_pos, kw_pos); + return -1; + } + if (tb->r > (tb->c - 1)) { + rte_bbdev_log(ERR, + "r (%u) is greater than c - 1 (%u)", + tb->r, tb->c - 1); + return -1; + } + } else { + cb = &turbo_enc->cb_params; + if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE + || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k (%u) is out of range %u <= value <= %u", + cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + + if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) { + rte_bbdev_log(ERR, + "e (%u) is less than %u or it is not even", + cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + + kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3; + if (cb->ncb < cb->k || cb->ncb > kw) { + rte_bbdev_log(ERR, + "ncb (%u) is out of range (%u) k <= value <= (%u) kw", + cb->ncb, cb->k, kw); + return -1; + } + } + + return 0; +} +/* Validates LDPC encoder parameters */ +static inline int +validate_ldpc_enc_op(struct rte_bbdev_enc_op *op) +{ + struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if (ldpc_enc->input.data == NULL) { + rte_bbdev_log(ERR, "Invalid input pointer"); + return -1; + } + if (ldpc_enc->output.data == NULL) { + rte_bbdev_log(ERR, "Invalid output pointer"); + return -1; + } + if (ldpc_enc->input.length > + RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) { + rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d", + ldpc_enc->input.length, + RTE_BBDEV_LDPC_MAX_CB_SIZE); + return -1; + } + if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) { + rte_bbdev_log(ERR, + "BG (%u) is out of range 1 <= value <= 2", + ldpc_enc->basegraph); + return -1; + } + if (ldpc_enc->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + ldpc_enc->rv_index); + return -1; + } + if (ldpc_enc->code_block_mode > 1) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + ldpc_enc->code_block_mode); + return -1; + } + + return 0; +} + +/* Validates LDPC decoder parameters */ +static inline int +validate_ldpc_dec_op(struct rte_bbdev_dec_op *op) +{ + struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) { + rte_bbdev_log(ERR, + "BG (%u) is out of range 1 <= value <= 2", + ldpc_dec->basegraph); + return -1; + } + if (ldpc_dec->iter_max == 0) { + rte_bbdev_log(ERR, + "iter_max (%u) is equal to 0", + ldpc_dec->iter_max); + return -1; + } + if (ldpc_dec->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + ldpc_dec->rv_index); + return -1; + } + if (ldpc_dec->code_block_mode > 1) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + ldpc_dec->code_block_mode); + return -1; + } + + return 0; +} +#endif + /* Enqueue one encode operations for ACC100 device in CB mode */ static inline int 
enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op, @@ -1956,6 +2181,14 @@ seg_total_left; struct rte_mbuf *input, *output_head, *output; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_enc_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo encoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2008,6 +2241,14 @@ uint16_t in_length_in_bytes; struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_enc_op(ops[0]) == -1) { + rte_bbdev_log(ERR, "LDPC encoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2065,6 +2306,14 @@ seg_total_left; struct rte_mbuf *input, *output_head, *output; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_enc_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC encoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2119,6 +2368,14 @@ struct rte_mbuf *input, *output_head, *output; uint16_t current_enqueued_cbs = 0; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_enc_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo encoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2191,6 +2448,142 @@ return current_enqueued_cbs; } +#ifdef RTE_LIBRTE_BBDEV_DEBUG +/* Validates turbo decoder parameters */ +static inline int +validate_dec_op(struct rte_bbdev_dec_op *op) +{ + struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec; + struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL; + struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL; + + if (op->mempool == NULL) { + rte_bbdev_log(ERR, "Invalid mempool pointer"); + return -1; + } + if (turbo_dec->input.data == NULL) { + rte_bbdev_log(ERR, "Invalid input pointer"); + return -1; + } + if (turbo_dec->hard_output.data == NULL) { + rte_bbdev_log(ERR, "Invalid hard_output pointer"); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) && + turbo_dec->soft_output.data == NULL) { + rte_bbdev_log(ERR, "Invalid soft_output pointer"); + return -1; + } + if (turbo_dec->rv_index > 3) { + rte_bbdev_log(ERR, + "rv_index (%u) is out of range 0 <= value <= 3", + turbo_dec->rv_index); + return -1; + } + if (turbo_dec->iter_min < 1) { + rte_bbdev_log(ERR, + "iter_min (%u) is less than 1", + turbo_dec->iter_min); + return -1; + } + if (turbo_dec->iter_max <= 2) { + rte_bbdev_log(ERR, + "iter_max (%u) is less than or equal to 2", + turbo_dec->iter_max); + return -1; + } + if (turbo_dec->iter_min > turbo_dec->iter_max) { + rte_bbdev_log(ERR, + "iter_min (%u) is greater than iter_max (%u)", + turbo_dec->iter_min, turbo_dec->iter_max); + return -1; + } + if (turbo_dec->code_block_mode != 0 && + turbo_dec->code_block_mode != 1) { + rte_bbdev_log(ERR, + "code_block_mode (%u) is out of range 0 <= value <= 1", + turbo_dec->code_block_mode); + return -1; + } + + if (turbo_dec->code_block_mode == 0) { + tb = &turbo_dec->tb_params; + if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c_neg > 0) { + rte_bbdev_log(ERR, + 
"k_neg (%u) is out of range %u <= value <= %u", + tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE + || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) + && tb->c > tb->c_neg) { + rte_bbdev_log(ERR, + "k_pos (%u) is out of range %u <= value <= %u", + tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) + rte_bbdev_log(ERR, + "c_neg (%u) is out of range 0 <= value <= %u", + tb->c_neg, + RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1); + if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) { + rte_bbdev_log(ERR, + "c (%u) is out of range 1 <= value <= %u", + tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS); + return -1; + } + if (tb->cab > tb->c) { + rte_bbdev_log(ERR, + "cab (%u) is greater than c (%u)", + tb->cab, tb->c); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE + || (tb->ea % 2)) + && tb->cab > 0) { + rte_bbdev_log(ERR, + "ea (%u) is less than %u or it is not even", + tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE + || (tb->eb % 2)) + && tb->c > tb->cab) { + rte_bbdev_log(ERR, + "eb (%u) is less than %u or it is not even", + tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE); + } + } else { + cb = &turbo_dec->cb_params; + if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE + || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) { + rte_bbdev_log(ERR, + "k (%u) is out of range %u <= value <= %u", + cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE, + RTE_BBDEV_TURBO_MAX_CB_SIZE); + return -1; + } + if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) && + (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || + (cb->e % 2))) { + rte_bbdev_log(ERR, + "e (%u) is less than %u or it is not even", + cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE); + return -1; + } + } + + return 0; +} +#endif + /** Enqueue one decode operations for ACC100 device in CB mode */ static inline int enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op, @@ -2203,6 +2596,14 @@ struct rte_mbuf *input, *h_output_head, *h_output, *s_output_head, *s_output; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2426,6 +2827,13 @@ return ret; } +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif union acc100_dma_desc *desc; uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); @@ -2521,6 +2929,14 @@ struct rte_mbuf *input, *h_output_head, *h_output; uint16_t current_enqueued_cbs = 0; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2611,6 +3027,14 @@ *s_output_head, *s_output; uint16_t current_enqueued_cbs = 0; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); 
/** Enqueue one decode operations for ACC100 device in CB mode */ static inline int enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op, @@ -2203,6 +2596,14 @@ struct rte_mbuf *input, *h_output_head, *h_output, *s_output_head, *s_output; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2426,6 +2827,13 @@ return ret; } +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif union acc100_dma_desc *desc; uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); @@ -2521,6 +2929,14 @@ struct rte_mbuf *input, *h_output_head, *h_output; uint16_t current_enqueued_cbs = 0; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_ldpc_dec_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC decoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; @@ -2611,6 +3027,14 @@ *s_output_head, *s_output; uint16_t current_enqueued_cbs = 0; +#ifdef RTE_LIBRTE_BBDEV_DEBUG + /* Validate op structure */ + if (validate_dec_op(op) == -1) { + rte_bbdev_log(ERR, "Turbo decoder validation failed"); + return -EINVAL; + } +#endif + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask); desc = q->ring_addr + desc_idx; From patchwork Fri Sep 4 17:54:06 2020 X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76583 X-Patchwork-Delegate: gakhil@marvell.com From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:06 -0700 Message-Id: <1599242047-58232-11-git-send-email-nicolas.chautru@intel.com> In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function Add a configure function to set up the PF from within bbdev-test itself, without requiring an external application to configure the device.
Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- app/test-bbdev/test_bbdev_perf.c | 72 +++ drivers/baseband/acc100/Makefile | 3 + drivers/baseband/acc100/meson.build | 2 + drivers/baseband/acc100/rte_acc100_cfg.h | 17 + drivers/baseband/acc100/rte_acc100_pmd.c | 505 +++++++++++++++++++++ .../acc100/rte_pmd_bbdev_acc100_version.map | 7 + 6 files changed, 606 insertions(+) diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c index 45c0d62..32f23ff 100644 --- a/app/test-bbdev/test_bbdev_perf.c +++ b/app/test-bbdev/test_bbdev_perf.c @@ -52,6 +52,18 @@ #define FLR_5G_TIMEOUT 610 #endif +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100 +#include <rte_acc100_cfg.h> +#define ACC100PF_DRIVER_NAME ("intel_acc100_pf") +#define ACC100VF_DRIVER_NAME ("intel_acc100_vf") +#define ACC100_QMGR_NUM_AQS 16 +#define ACC100_QMGR_NUM_QGS 2 +#define ACC100_QMGR_AQ_DEPTH 5 +#define ACC100_QMGR_INVALID_IDX -1 +#define ACC100_QMGR_RR 1 +#define ACC100_QOS_GBR 0 +#endif + #define OPS_CACHE_SIZE 256U #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */ @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad, info->dev_name); } #endif +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100 + if ((get_init_device() == true) && + (!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) { + struct acc100_conf conf; + unsigned int i; + + printf("Configure ACC100 FEC Driver %s with default values\n", + info->drv.driver_name); + + /* clear default configuration before initialization */ + memset(&conf, 0, sizeof(struct acc100_conf)); + + /* Always set in PF mode for built-in configuration */ + conf.pf_mode_en = true; + for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) { + conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR; + conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR; + conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR; + conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR; + conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR; + conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR; + conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR; + conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR; + conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR; + conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR; + conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR; + conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR; + } + + conf.input_pos_llr_1_bit = true; + conf.output_pos_llr_1_bit = true; + conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */ + + conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS; + conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX; + conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS; + conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH; + conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS; + conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX; + conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS; + conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH; + conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS; + conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX; + conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS; + conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH; + conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS; + conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX; + conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS; + conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH; + + /* setup PF with configuration information */ + ret = acc100_configure(info->dev_name, &conf); + TEST_ASSERT_SUCCESS(ret, + "Failed to configure ACC100 PF for bbdev %s", + info->dev_name);
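The same call can be made from any application that links the PMD, not only bbdev-test. A minimal standalone sketch, assuming a placeholder PCI address ("0000:b0:00.0") and mirroring the test defaults above (only the 5G UL group is shown; the other three are filled in the same way):

#include <string.h>
#include <stdbool.h>
#include <rte_acc100_cfg.h>

static int
configure_acc100_pf(void)
{
	struct acc100_conf conf;

	memset(&conf, 0, sizeof(conf));
	conf.pf_mode_en = true;
	conf.num_vf_bundles = 1;

	conf.q_ul_5g.num_qgroups = 2;
	conf.q_ul_5g.first_qgroup_index = -1;	/* ACC100_QMGR_INVALID_IDX */
	conf.q_ul_5g.num_aqs_per_groups = 16;
	conf.q_ul_5g.aq_depth_log2 = 5;
	/* ... q_ul_4g / q_dl_4g / q_dl_5g configured the same way ... */

	/* "0000:b0:00.0" is a placeholder BDF for the PF */
	return acc100_configure("0000:b0:00.0", &conf);
}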
+ /* Refresh the device info now that the PF has been configured */ + } + rte_bbdev_info_get(dev_id, info); +#endif + nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues); nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES); diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile index c79e487..37e73af 100644 --- a/drivers/baseband/acc100/Makefile +++ b/drivers/baseband/acc100/Makefile @@ -22,4 +22,7 @@ LIBABIVER := 1 # library source files SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c +# export include files +SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h + include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build index 8afafc2..7ac44dc 100644 --- a/drivers/baseband/acc100/meson.build +++ b/drivers/baseband/acc100/meson.build @@ -4,3 +4,5 @@ deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci'] sources = files('rte_acc100_pmd.c') + +install_headers('rte_acc100_cfg.h') diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h index 73bbe36..7f523bc 100644 --- a/drivers/baseband/acc100/rte_acc100_cfg.h +++ b/drivers/baseband/acc100/rte_acc100_cfg.h @@ -89,6 +89,23 @@ struct acc100_conf { struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS]; }; +/** + * Configure an ACC100 device + * + * @param dev_name + * The name of the device. This is the short form of PCI BDF, e.g. 00:01.0. + * It can also be retrieved for a bbdev device from the dev_name field in the + * rte_bbdev_info structure returned by rte_bbdev_info_get(). + * @param conf + * Configuration to apply to ACC100 HW. + * + * @return + * Zero on success, negative value on failure. + */ +__rte_experimental +int +acc100_configure(const char *dev_name, struct acc100_conf *conf); + #ifdef __cplusplus } #endif diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c index e64d5e2..f039ed3 100644 --- a/drivers/baseband/acc100/rte_acc100_pmd.c +++ b/drivers/baseband/acc100/rte_acc100_pmd.c @@ -85,6 +85,26 @@ enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC}; +/* Return the accelerator enum for a Queue Group Index */ +static inline int +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf) +{ + int accQg[ACC100_NUM_QGRPS]; + int NumQGroupsPerFn[NUM_ACC]; + int acc, qgIdx, qgIndex = 0; + for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++) + accQg[qgIdx] = 0; + NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups; + NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups; + NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups; + NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups; + for (acc = UL_4G; acc < NUM_ACC; acc++) + for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++) + accQg[qgIndex++] = acc; + acc = accQg[qg_idx]; + return acc; +} + /* Return the queue topology for a Queue Group Index */ static inline void qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum, @@ -113,6 +133,30 @@ *qtop = p_qtop; } +/* Return the AQ depth for a Queue Group Index */ +static inline int +aqDepth(int qg_idx, struct acc100_conf *acc100_conf) +{ + struct rte_q_topology_t *q_top = NULL; + int acc_enum = accFromQgid(qg_idx, acc100_conf); + qtopFromAcc(&q_top, acc_enum, acc100_conf); + if (unlikely(q_top == NULL)) + return 0; + return q_top->aq_depth_log2; +} +
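These helpers encode a small piece of topology arithmetic: queue groups are laid out contiguously in the order UL_4G, UL_5G, DL_4G, DL_5G, so a queue group index resolves first to an accelerator and from there to that accelerator's AQ count and depth. A self-contained sketch of the same mapping, with an assumed sample split of 2+2+2+2 queue groups as in the bbdev-test defaults:

#include <stdio.h>

enum acc { UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC };

/* Sample split: 2 queue groups per accelerator (assumption) */
static const int num_qgroups[NUM_ACC] = { 2, 2, 2, 2 };

static int
acc_from_qgid(int qg_idx)
{
	int acc, base = 0;

	for (acc = UL_4G; acc < NUM_ACC; acc++) {
		if (qg_idx < base + num_qgroups[acc])
			return acc;
		base += num_qgroups[acc];
	}
	return -1;	/* out of range */
}

int
main(void)
{
	static const char * const name[NUM_ACC] =
			{ "UL_4G", "UL_5G", "DL_4G", "DL_5G" };
	int qg;

	/* qgroup 0..1 -> UL_4G, 2..3 -> UL_5G, 4..5 -> DL_4G, 6..7 -> DL_5G */
	for (qg = 0; qg < 8; qg++)
		printf("qgroup %d -> %s\n", qg, name[acc_from_qgid(qg)]);
	return 0;
}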
+/* Return the number of AQs for a Queue Group Index */ +static inline int +aqNum(int qg_idx, struct acc100_conf *acc100_conf) +{ + struct rte_q_topology_t *q_top = NULL; + int acc_enum = accFromQgid(qg_idx, acc100_conf); + qtopFromAcc(&q_top, acc_enum, acc100_conf); + if (unlikely(q_top == NULL)) + return 0; + return q_top->num_aqs_per_groups; +} + static void initQTop(struct acc100_conf *acc100_conf) { @@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev) RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map); RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver); RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map); + +/* + * Implementation to fix the power on status of some 5GUL engines + * This requires DMA permission if ported outside DPDK + */ +static void +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d, + struct acc100_conf *conf) +{ + int i, template_idx, qg_idx; + uint32_t address, status, payload; + printf("Need to clear power-on 5GUL status in internal memory\n"); + /* Reset LDPC Cores */ + for (i = 0; i < ACC100_ENGINES_MAX; i++) + acc100_reg_write(d, HWPfFecUl5gCntrlReg + + ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI); + usleep(LONG_WAIT); + for (i = 0; i < ACC100_ENGINES_MAX; i++) + acc100_reg_write(d, HWPfFecUl5gCntrlReg + + ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO); + usleep(LONG_WAIT); + /* Prepare dummy workload */ + alloc_2x64mb_sw_rings_mem(bbdev, d, 0); + /* Set base addresses */ + uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32); + uint32_t phys_low = (uint32_t)(d->sw_rings_phys & + ~(ACC100_SIZE_64MBYTE-1)); + acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high); + acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low); + + /* Descriptor for a dummy 5GUL code block processing */ + union acc100_dma_desc *desc = NULL; + desc = d->sw_rings; + desc->req.data_ptrs[0].address = d->sw_rings_phys + + ACC100_DESC_FCW_OFFSET; + desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN; + desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW; + desc->req.data_ptrs[0].last = 0; + desc->req.data_ptrs[0].dma_ext = 0; + desc->req.data_ptrs[1].address = d->sw_rings_phys + 512; + desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN; + desc->req.data_ptrs[1].last = 1; + desc->req.data_ptrs[1].dma_ext = 0; + desc->req.data_ptrs[1].blen = 44; + desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024; + desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC; + desc->req.data_ptrs[2].last = 1; + desc->req.data_ptrs[2].dma_ext = 0; + desc->req.data_ptrs[2].blen = 5; + /* Dummy FCW */ + desc->req.fcw_ld.FCWversion = ACC100_FCW_VER; + desc->req.fcw_ld.qm = 1; + desc->req.fcw_ld.nfiller = 30; + desc->req.fcw_ld.BG = 2 - 1; + desc->req.fcw_ld.Zc = 7; + desc->req.fcw_ld.ncb = 350; + desc->req.fcw_ld.rm_e = 4; + desc->req.fcw_ld.itmax = 10; + desc->req.fcw_ld.gain_i = 1; + desc->req.fcw_ld.gain_h = 1; + + int engines_to_restart[SIG_UL_5G_LAST + 1] = {0}; + int num_failed_engine = 0; + /* Detect engines in undefined state */ + for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST; + template_idx++) { + /* Check engine power-on status */ + address = HwPfFecUl5gIbDebugReg + + ACC100_ENGINE_OFFSET * template_idx; + status = (acc100_reg_read(d, address) >> 4) & 0xF; + if (status == 0) { + engines_to_restart[num_failed_engine] = template_idx; + num_failed_engine++; + } + } + + int numQqsAcc = conf->q_ul_5g.num_qgroups; + int numQgs = conf->q_ul_5g.num_qgroups; + payload = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + payload |= (1 << qg_idx); + /* Force each engine which is in unspecified state */ + for (i = 0; i < num_failed_engine; i++) { + int
failed_engine = engines_to_restart[i]; + printf("Force engine %d\n", failed_engine); + for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD * template_idx; + if (template_idx == failed_engine) + acc100_reg_write(d, address, payload); + else + acc100_reg_write(d, address, 0); + } + /* Reset descriptor header */ + desc->req.word0 = ACC100_DMA_DESC_TYPE; + desc->req.word1 = 0; + desc->req.word2 = 0; + desc->req.word3 = 0; + desc->req.numCBs = 1; + desc->req.m2dlen = 2; + desc->req.d2mlen = 1; + /* Enqueue the code block for processing */ + union acc100_enqueue_reg_fmt enq_req; + enq_req.val = 0; + enq_req.addr_offset = ACC100_DESC_OFFSET; + enq_req.num_elem = 1; + enq_req.req_elem_addr = 0; + rte_wmb(); + acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val); + usleep(LONG_WAIT * 100); + if (desc->req.word0 != 2) + printf("DMA Response %#"PRIx32"\n", desc->req.word0); + } + + /* Reset LDPC Cores */ + for (i = 0; i < ACC100_ENGINES_MAX; i++) + acc100_reg_write(d, HWPfFecUl5gCntrlReg + + ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI); + usleep(LONG_WAIT); + for (i = 0; i < ACC100_ENGINES_MAX; i++) + acc100_reg_write(d, HWPfFecUl5gCntrlReg + + ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO); + usleep(LONG_WAIT); + acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD); + usleep(LONG_WAIT); + int numEngines = 0; + /* Check engine power-on status again */ + for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST; + template_idx++) { + address = HwPfFecUl5gIbDebugReg + + ACC100_ENGINE_OFFSET * template_idx; + status = (acc100_reg_read(d, address) >> 4) & 0xF; + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD * template_idx; + if (status == 1) { + acc100_reg_write(d, address, payload); + numEngines++; + } else + acc100_reg_write(d, address, 0); + } + printf("Number of 5GUL engines %d\n", numEngines); + + if (d->sw_rings_base != NULL) + rte_free(d->sw_rings_base); + usleep(LONG_WAIT); +} + +/* Initial configuration of an ACC100 device prior to running configure() */ +int +acc100_configure(const char *dev_name, struct acc100_conf *conf) +{ + rte_bbdev_log(INFO, "acc100_configure"); + uint32_t payload, address, status; + int qg_idx, template_idx, vf_idx, acc, i; + struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name); + + /* Compile time checks */ + RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256); + RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256); + RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24); + RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32); + + if (bbdev == NULL) { + rte_bbdev_log(ERR, + "Invalid dev_name (%s), or device is not yet initialised", + dev_name); + return -ENODEV; + } + struct acc100_device *d = bbdev->data->dev_private; + + /* Store configuration */ + rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf)); + + /* PCIe Bridge configuration */ + acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE); + for (i = 1; i < 17; i++) + acc100_reg_write(d, + HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh + + i * 16, 0); + + /* PCIe Link Training and Status State Machine */ + acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000); + + /* Prevent blocking AXI read on BRESP for AXI Write */ + address = HwPfPcieGpexAxiPioControl; + payload = ACC100_CFG_PCI_AXI; + acc100_reg_write(d, address, payload); + + /* 5GDL PLL phase shift */ + acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1); + + /* Explicitly releasing AXI as this may be stopped
after PF FLR/BME */ + address = HWPfDmaAxiControl; + payload = 1; + acc100_reg_write(d, address, payload); + + /* DDR Configuration */ + address = HWPfDdrBcTim6; + payload = acc100_reg_read(d, address); + payload &= 0xFFFFFFFB; /* Bit 2 */ +#ifdef ACC100_DDR_ECC_ENABLE + payload |= 0x4; +#endif + acc100_reg_write(d, address, payload); + address = HWPfDdrPhyDqsCountNum; +#ifdef ACC100_DDR_ECC_ENABLE + payload = 9; +#else + payload = 8; +#endif + acc100_reg_write(d, address, payload); + + /* Set default descriptor signature */ + address = HWPfDmaDescriptorSignatuture; + payload = 0; + acc100_reg_write(d, address, payload); + + /* Enable the Error Detection in DMA */ + payload = ACC100_CFG_DMA_ERROR; + address = HWPfDmaErrorDetectionEn; + acc100_reg_write(d, address, payload); + + /* AXI Cache configuration */ + payload = ACC100_CFG_AXI_CACHE; + address = HWPfDmaAxcacheReg; + acc100_reg_write(d, address, payload); + + /* Default DMA Configuration (Qmgr Enabled) */ + address = HWPfDmaConfig0Reg; + payload = 0; + acc100_reg_write(d, address, payload); + address = HWPfDmaQmanen; + payload = 0; + acc100_reg_write(d, address, payload); + + /* Default RLIM/ALEN configuration */ + address = HWPfDmaConfig1Reg; + payload = (1 << 31) + (23 << 8) + (1 << 6) + 7; + acc100_reg_write(d, address, payload); + + /* Configure DMA Qmanager addresses */ + address = HWPfDmaQmgrAddrReg; + payload = HWPfQmgrEgressQueuesTemplate; + acc100_reg_write(d, address, payload); + + /* ===== Qmgr Configuration ===== */ + /* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */ + int totalQgs = conf->q_ul_4g.num_qgroups + + conf->q_ul_5g.num_qgroups + + conf->q_dl_4g.num_qgroups + + conf->q_dl_5g.num_qgroups; + for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) { + address = HWPfQmgrDepthLog2Grp + + BYTES_IN_WORD * qg_idx; + payload = aqDepth(qg_idx, conf); + acc100_reg_write(d, address, payload); + address = HWPfQmgrTholdGrp + + BYTES_IN_WORD * qg_idx; + payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1)); + acc100_reg_write(d, address, payload); + } + + /* Template Priority in incremental order */ + for (template_idx = 0; template_idx < ACC100_NUM_TMPL; + template_idx++) { + address = HWPfQmgrGrpTmplateReg0Indx + + BYTES_IN_WORD * (template_idx % 8); + payload = TMPL_PRI_0; + acc100_reg_write(d, address, payload); + address = HWPfQmgrGrpTmplateReg1Indx + + BYTES_IN_WORD * (template_idx % 8); + payload = TMPL_PRI_1; + acc100_reg_write(d, address, payload); + address = HWPfQmgrGrpTmplateReg2indx + + BYTES_IN_WORD * (template_idx % 8); + payload = TMPL_PRI_2; + acc100_reg_write(d, address, payload); + address = HWPfQmgrGrpTmplateReg3Indx + + BYTES_IN_WORD * (template_idx % 8); + payload = TMPL_PRI_3; + acc100_reg_write(d, address, payload); + } + + address = HWPfQmgrGrpPriority; + payload = ACC100_CFG_QMGR_HI_P; + acc100_reg_write(d, address, payload); + + /* Template Configuration */ + for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) { + payload = 0; + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD * template_idx; + acc100_reg_write(d, address, payload); + } + /* 4GUL */ + int numQgs = conf->q_ul_4g.num_qgroups; + int numQqsAcc = 0; + payload = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + payload |= (1 << qg_idx); + for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD*template_idx; + acc100_reg_write(d, address, payload); + } + /* 5GUL */ + numQqsAcc += numQgs; + numQgs = 
conf->q_ul_5g.num_qgroups; + payload = 0; + int numEngines = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + payload |= (1 << qg_idx); + for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST; + template_idx++) { + /* Check engine power-on status */ + address = HwPfFecUl5gIbDebugReg + + ACC100_ENGINE_OFFSET * template_idx; + status = (acc100_reg_read(d, address) >> 4) & 0xF; + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD * template_idx; + if (status == 1) { + acc100_reg_write(d, address, payload); + numEngines++; + } else + acc100_reg_write(d, address, 0); + #if RTE_ACC100_SINGLE_FEC == 1 + payload = 0; + #endif + } + printf("Number of 5GUL engines %d\n", numEngines); + /* 4GDL */ + numQqsAcc += numQgs; + numQgs = conf->q_dl_4g.num_qgroups; + payload = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + payload |= (1 << qg_idx); + for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD*template_idx; + acc100_reg_write(d, address, payload); + #if RTE_ACC100_SINGLE_FEC == 1 + payload = 0; + #endif + } + /* 5GDL */ + numQqsAcc += numQgs; + numQgs = conf->q_dl_5g.num_qgroups; + payload = 0; + for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++) + payload |= (1 << qg_idx); + for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST; + template_idx++) { + address = HWPfQmgrGrpTmplateReg4Indx + + BYTES_IN_WORD*template_idx; + acc100_reg_write(d, address, payload); + #if RTE_ACC100_SINGLE_FEC == 1 + payload = 0; + #endif + } + + /* Queue Group Function mapping */ + int qman_func_id[5] = {0, 2, 1, 3, 4}; + address = HWPfQmgrGrpFunction0; + payload = 0; + for (qg_idx = 0; qg_idx < 8; qg_idx++) { + acc = accFromQgid(qg_idx, conf); + payload |= qman_func_id[acc]<<(qg_idx * 4); + } + acc100_reg_write(d, address, payload); + + /* Configuration of the Arbitration QGroup depth to 1 */ + for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) { + address = HWPfQmgrArbQDepthGrp + + BYTES_IN_WORD * qg_idx; + payload = 0; + acc100_reg_write(d, address, payload); + } + + /* Enabling AQueues through the Queue hierarchy */ + for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) { + for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) { + payload = 0; + if (vf_idx < conf->num_vf_bundles && + qg_idx < totalQgs) + payload = (1 << aqNum(qg_idx, conf)) - 1; + address = HWPfQmgrAqEnableVf + + vf_idx * BYTES_IN_WORD; + payload += (qg_idx << 16); + acc100_reg_write(d, address, payload); + } + } + + /* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */ + uint32_t aram_address = 0; + for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) { + for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) { + address = HWPfQmgrVfBaseAddr + vf_idx + * BYTES_IN_WORD + qg_idx + * BYTES_IN_WORD * 64; + payload = aram_address; + acc100_reg_write(d, address, payload); + /* Offset ARAM Address for next memory bank + * - increment of 4B + */ + aram_address += aqNum(qg_idx, conf) * + (1 << aqDepth(qg_idx, conf)); + } + } + + if (aram_address > WORDS_IN_ARAM_SIZE) { + rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n", + aram_address, WORDS_IN_ARAM_SIZE); + return -EINVAL; + }
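The ARAM accounting above is worth a worked example. ARAM is 256 kB addressed as 4-byte words (the "shifted by 2" comment), i.e. 64K word addresses in total, and each (queue group, VF) bank consumes aqNum * 2^aqDepth words. A small sketch under the assumed bbdev-test defaults (8 queue groups in total, 16 AQs of depth log2 5, one VF bundle):

#include <stdio.h>

/* 256 kB of ARAM, one address per 4-byte word: 65536 words */
#define WORDS_IN_ARAM (256 * 1024 / 4)

int
main(void)
{
	const int total_qgroups = 8;	/* 4 accelerators x 2 qgroups */
	const int num_vf_bundles = 1;
	const int aq_num = 16;
	const int aq_depth_log2 = 5;

	unsigned int words = 0;
	int qg, vf;

	for (qg = 0; qg < total_qgroups; qg++)
		for (vf = 0; vf < num_vf_bundles; vf++)
			words += aq_num * (1u << aq_depth_log2);

	/* 8 * 1 * 16 * 32 = 4096 words, well under the 65536 limit */
	printf("ARAM words used: %u of %u\n", words, WORDS_IN_ARAM);
	return words > WORDS_IN_ARAM;	/* mirrors the -EINVAL check */
}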
+ /* ==== HI Configuration ==== */ + + /* Prevent Block on Transmit Error */ + address = HWPfHiBlockTransmitOnErrorEn; + payload = 0; + acc100_reg_write(d, address, payload); + /* Prevent MSI from being dropped */ + address = HWPfHiMsiDropEnableReg; + payload = 0; + acc100_reg_write(d, address, payload); + /* Set the PF Mode register */ + address = HWPfHiPfMode; + payload = (conf->pf_mode_en) ? 2 : 0; + acc100_reg_write(d, address, payload); + /* Enable Error Detection in HW */ + address = HWPfDmaErrorDetectionEn; + payload = 0x3D7; + acc100_reg_write(d, address, payload); + + /* QoS overflow init */ + payload = 1; + address = HWPfQosmonAEvalOverflow0; + acc100_reg_write(d, address, payload); + address = HWPfQosmonBEvalOverflow0; + acc100_reg_write(d, address, payload); + + /* HARQ DDR Configuration */ + unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */ + for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) { + address = HWPfDmaVfDdrBaseRw + vf_idx + * 0x10; + payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) + + (ddrSizeInMb - 1); + acc100_reg_write(d, address, payload); + } + usleep(LONG_WAIT); + + if (numEngines < (SIG_UL_5G_LAST + 1)) + poweron_cleanup(bbdev, d, conf); + + rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name); + return 0; +} diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map index 4a76d1d..91c234d 100644 --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map @@ -1,3 +1,10 @@ DPDK_21 { local: *; }; + +EXPERIMENTAL { + global: + + acc100_configure; + +};
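One detail in the HARQ DDR loop above is worth spelling out: reading the expression, each VF's window register packs the base offset (counted in 64 MB units) into the upper 16 bits and the window size in MB minus one into the lower 16. A short check of that arithmetic for the fixed 512 MB-per-VF split used here:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	const unsigned int ddr_size_in_mb = 512;	/* per VF, as above */
	unsigned int vf_idx;

	for (vf_idx = 0; vf_idx < 2; vf_idx++) {
		uint32_t payload = ((vf_idx * (ddr_size_in_mb / 64)) << 16) +
				(ddr_size_in_mb - 1);
		/* VF0 -> 0x000001FF, VF1 -> 0x000801FF */
		printf("VF%u HWPfDmaVfDdrBaseRw payload: 0x%08X\n",
				vf_idx, payload);
	}
	return 0;
}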
From patchwork Fri Sep 4 17:54:07 2020 X-Patchwork-Submitter: Nicolas Chautru X-Patchwork-Id: 76582 X-Patchwork-Delegate: gakhil@marvell.com From: Nicolas Chautru To: dev@dpdk.org, akhil.goyal@nxp.com Cc: bruce.richardson@intel.com, rosen.xu@intel.com, dave.burley@accelercomm.com, aidan.goddard@accelercomm.com, ferruh.yigit@intel.com, tianjiao.liu@intel.com, Nicolas Chautru Date: Fri, 4 Sep 2020 10:54:07 -0700 Message-Id: <1599242047-58232-12-git-send-email-nicolas.chautru@intel.com> In-Reply-To: <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> References: <1597796731-57841-12-git-send-email-nicolas.chautru@intel.com> <1599242047-58232-1-git-send-email-nicolas.chautru@intel.com> Subject: [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table Correct the overview matrix to use the acc100 name. Signed-off-by: Nicolas Chautru Acked-by: Liu Tianjiao --- doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++ doc/guides/bbdevs/features/mbc.ini | 14 -------------- 2 files changed, 14 insertions(+), 14 deletions(-) create mode 100644 doc/guides/bbdevs/features/acc100.ini delete mode 100644 doc/guides/bbdevs/features/mbc.ini diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini new file mode 100644 index 0000000..642cd48 --- /dev/null +++ b/doc/guides/bbdevs/features/acc100.ini @@ -0,0 +1,14 @@ +; +; Supported features of the 'acc100' bbdev driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Turbo Decoder (4G) = Y +Turbo Encoder (4G) = Y +LDPC Decoder (5G) = Y +LDPC Encoder (5G) = Y +LLR/HARQ Compression = Y +External DDR Access = Y +HW Accelerated = Y +BBDEV API = Y diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini deleted file mode 100644 index 78a7b95..0000000 --- a/doc/guides/bbdevs/features/mbc.ini +++ /dev/null @@ -1,14 +0,0 @@ -; -; Supported features of the 'mbc' bbdev driver. -; -; Refer to default.ini for the full list of available PMD features. -; -[Features] -Turbo Decoder (4G) = Y -Turbo Encoder (4G) = Y -LDPC Decoder (5G) = Y -LDPC Encoder (5G) = Y -LLR/HARQ Compression = Y -External DDR Access = Y -HW Accelerated = Y -BBDEV API = Y