From patchwork Fri Jul 1 09:02:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113604 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 89D04A00C2; Fri, 1 Jul 2022 11:03:08 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2B782427EC; Fri, 1 Jul 2022 11:03:05 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id D5FB940E03 for ; Fri, 1 Jul 2022 11:03:03 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 0CA7D20D5905; Fri, 1 Jul 2022 02:03:03 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 0CA7D20D5905 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666183; bh=C8QHjW2Yj8q/1w6ProQEDsB7N4ntkL611rW11dbbbWc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=gZn6Tn94aenUs/iY8n1eY/ZTEV0HkyGjxet26c5LoHgdFTmWj5S9l7TR4Tah8pa/C JTl2paIYIKMhISa407qderZq8xA8oRZzxCghGWIIwjxbLKLmtaUDmA/ouJSoehNoUn 6CPPOhbldOTpApoFSC2OL6o9w4oZgc7YS7RQKZWg= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 01/17] net/mana: add basic driver, build environment and doc Date: Fri, 1 Jul 2022 02:02:31 -0700 Message-Id: <1656666167-26035-2-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li MANA is a PCI device. It uses IB verbs to access hardware through the kernel RDMA layer. This patch introduces build environment and basic device probe functions. 
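For context, the probe path this patch adds is built on standard libibverbs enumeration: list the IB devices, open each candidate, and query its attributes. The sketch below is illustrative only (it is not part of the patch) and assumes nothing beyond stock rdma-core calls, with error handling trimmed:

    #include <stdio.h>
    #include <infiniband/verbs.h>

    /* Minimal sketch of the verbs enumeration a PMD probe relies on. */
    int main(void)
    {
        int num_devices;
        struct ibv_device **list = ibv_get_device_list(&num_devices);

        if (!list)
            return 1;

        for (int i = 0; i < num_devices; i++) {
            struct ibv_context *ctx = ibv_open_device(list[i]);
            struct ibv_device_attr_ex attr;

            if (!ctx)
                continue;

            /* The extended query reports port count and queue/MR limits */
            if (!ibv_query_device_ex(ctx, NULL, &attr))
                printf("%s: %u port(s)\n", list[i]->name,
                       attr.orig_attr.phys_port_cnt);

            ibv_close_device(ctx);
        }

        ibv_free_device_list(list);
        return 0;
    }

The driver below follows the same sequence, additionally matching each IB device's PCI address against the device being probed and reading the port MAC address from sysfs before registering an ethdev port.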
Signed-off-by: Long Li --- MAINTAINERS | 6 + doc/guides/nics/features/mana.ini | 10 + doc/guides/nics/index.rst | 1 + doc/guides/nics/mana.rst | 54 +++ drivers/net/mana/mana.c | 729 ++++++++++++++++++++++++++++++ drivers/net/mana/mana.h | 214 +++++++++ drivers/net/mana/meson.build | 34 ++ drivers/net/mana/mp.c | 257 +++++++++++ drivers/net/mana/version.map | 3 + drivers/net/meson.build | 1 + 10 files changed, 1309 insertions(+) create mode 100644 doc/guides/nics/features/mana.ini create mode 100644 doc/guides/nics/mana.rst create mode 100644 drivers/net/mana/mana.c create mode 100644 drivers/net/mana/mana.h create mode 100644 drivers/net/mana/meson.build create mode 100644 drivers/net/mana/mp.c create mode 100644 drivers/net/mana/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 18d9edaf88..b8bda48a33 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -837,6 +837,12 @@ F: buildtools/options-ibverbs-static.sh F: doc/guides/nics/mlx5.rst F: doc/guides/nics/features/mlx5.ini +Microsoft mana +M: Long Li +F: drivers/net/mana +F: doc/guides/nics/mana.rst +F: doc/guides/nics/features/mana.ini + Microsoft vdev_netvsc - EXPERIMENTAL M: Matan Azrad F: drivers/net/vdev_netvsc/ diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini new file mode 100644 index 0000000000..9d8676089b --- /dev/null +++ b/doc/guides/nics/features/mana.ini @@ -0,0 +1,10 @@ +; +; Supported features of the 'cnxk' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Linux = Y +Multiprocess aware = Y +Usage doc = Y +x86-64 = Y diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index 1c94caccea..2725d1d9f0 100644 --- a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -41,6 +41,7 @@ Network Interface Controller Drivers intel_vf kni liquidio + mana memif mlx4 mlx5 diff --git a/doc/guides/nics/mana.rst b/doc/guides/nics/mana.rst new file mode 100644 index 0000000000..a871db35a7 --- /dev/null +++ b/doc/guides/nics/mana.rst @@ -0,0 +1,54 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright 2022 Microsoft Corporation + +MANA poll mode driver library +============================= + +The MANA poll mode driver library (**librte_net_mana**) implements support +for Microsoft Azure Network Adatper VF in SR-IOV context. + +Features +-------- + +Features of the MANA Ethdev PMD are: + +Prerequisites +------------- + +This driver relies on external libraries and kernel drivers for resources +allocations and initialization. The following dependencies are not part of +DPDK and must be installed separately: + +- **libibverbs** (provided by rdma-core package) + + User space verbs framework used by librte_net_mana. This library provides + a generic interface between the kernel and low-level user space drivers + such as libmana. + + It allows slow and privileged operations (context initialization, hardware + resources allocations) to be managed by the kernel and fast operations to + never leave user space. + +- **libmana** (provided by rdma-core package) + + Low-level user space driver library for Microsoft Azure Network Adatper + devices, it is automatically loaded by libibverbs. + +- **Kernel modules** + + They provide the kernel-side verbs API and low level device drivers that + manage actual hardware initialization and resources sharing with user + space processes. 
+ + Unlike most other PMDs, these modules must remain loaded and bound to + their devices: + + - mana: Ethernet device driver that provides kernel network interfaces. + - mana_ib: InifiniBand device driver. + - ib_uverbs: user space driver for verbs (entry point for libibverbs). + +Driver compilation and testing +------------------------------ + +Refer to the document :ref:`compiling and testing a PMD for a NIC ` +for details. diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c new file mode 100644 index 0000000000..893e2b1e23 --- /dev/null +++ b/drivers/net/mana/mana.c @@ -0,0 +1,729 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include + +#include "mana.h" + +/* Shared memory between primary/secondary processes, per driver */ +struct mana_shared_data *mana_shared_data; +const struct rte_memzone *mana_shared_mz; +static const char *MZ_MANA_SHARED_DATA = "mana_shared_data"; + +struct mana_shared_data mana_local_data; + +/* Spinlock for mana_shared_data */ +static rte_spinlock_t mana_shared_data_lock = RTE_SPINLOCK_INITIALIZER; + +/* Allocate a buffer on the stack and fill it with a printf format string. */ +#define MKSTR(name, ...) \ + int mkstr_size_##name = snprintf(NULL, 0, "" __VA_ARGS__); \ + char name[mkstr_size_##name + 1]; \ + \ + memset(name, 0, mkstr_size_##name + 1); \ + snprintf(name, sizeof(name), "" __VA_ARGS__) + +int mana_logtype_driver; +int mana_logtype_init; + +const struct eth_dev_ops mana_dev_ops = { +}; + +const struct eth_dev_ops mana_dev_sec_ops = { +}; + +uint16_t +mana_rx_burst_removed(void *dpdk_rxq __rte_unused, + struct rte_mbuf **pkts __rte_unused, + uint16_t pkts_n __rte_unused) +{ + rte_mb(); + return 0; +} + +uint16_t +mana_tx_burst_removed(void *dpdk_rxq __rte_unused, + struct rte_mbuf **pkts __rte_unused, + uint16_t pkts_n __rte_unused) +{ + rte_mb(); + return 0; +} + +static const char *mana_init_args[] = { + "mac", + NULL, +}; + +/* Support of parsing up to 8 mac address from EAL command line */ +#define MAX_NUM_ADDRESS 8 +struct mana_conf { + struct rte_ether_addr mac_array[MAX_NUM_ADDRESS]; + unsigned int index; +}; + +static int mana_arg_parse_callback(const char *key, const char *val, + void *private) +{ + struct mana_conf *conf = (struct mana_conf *)private; + int ret; + + DRV_LOG(INFO, "key=%s value=%s index=%d", key, val, conf->index); + + if (conf->index >= MAX_NUM_ADDRESS) { + DRV_LOG(ERR, "Exceeding max MAC address"); + return 1; + } + + ret = rte_ether_unformat_addr(val, &conf->mac_array[conf->index]); + if (ret) { + DRV_LOG(ERR, "Invalid MAC address %s", val); + return ret; + } + + conf->index++; + + return 0; +} + +static int mana_parse_args(struct rte_devargs *devargs, struct mana_conf *conf) +{ + struct rte_kvargs *kvlist; + unsigned int arg_count; + int ret = 0; + + kvlist = rte_kvargs_parse(devargs->args, mana_init_args); + if (!kvlist) { + DRV_LOG(ERR, "failed to parse kvargs args=%s", devargs->args); + return -EINVAL; + } + + arg_count = rte_kvargs_count(kvlist, mana_init_args[0]); + if (arg_count > MAX_NUM_ADDRESS) { + ret = -EINVAL; + goto free_kvlist; + } + ret = rte_kvargs_process(kvlist, mana_init_args[0], + mana_arg_parse_callback, conf); + if (ret) { + DRV_LOG(ERR, "error parsing args"); + goto 
free_kvlist; + } + +free_kvlist: + rte_kvargs_free(kvlist); + return ret; +} + +static int get_port_mac(struct ibv_device *device, unsigned int port, + struct rte_ether_addr *addr) +{ + FILE *file; + int ret = 0; + DIR *dir; + struct dirent *dent; + unsigned int dev_port; + char mac[20]; + + MKSTR(path, "%s/device/net", device->ibdev_path); + + dir = opendir(path); + if (!dir) + return -ENOENT; + + while ((dent = readdir(dir))) { + char *name = dent->d_name; + + MKSTR(filepath, "%s/%s/dev_port", path, name); + + /* Ignore . and .. */ + if ((name[0] == '.') && + ((name[1] == '\0') || + ((name[1] == '.') && (name[2] == '\0')))) + continue; + + file = fopen(filepath, "rb"); + if (!file) + continue; + + ret = fscanf(file, "%u", &dev_port); + fclose(file); + + if (ret != 1) + continue; + + /* Ethernet ports start at 0, IB port start at 1 */ + if (dev_port == port - 1) { + MKSTR(filepath, "%s/%s/address", path, name); + + file = fopen(filepath, "rb"); + if (!file) + continue; + + ret = fscanf(file, "%s", mac); + fclose(file); + + if (ret < 0) + break; + + ret = rte_ether_unformat_addr(mac, addr); + if (ret) + DRV_LOG(ERR, "unrecognized mac addr %s", mac); + break; + } + } + + closedir(dir); + return ret; +} + +static int mana_ibv_device_to_pci_addr(const struct ibv_device *device, + struct rte_pci_addr *pci_addr) +{ + FILE *file; + char line[32]; + + MKSTR(path, "%s/device/uevent", device->ibdev_path); + + file = fopen(path, "rb"); + if (!file) + return -errno; + + while (fgets(line, sizeof(line), file) == line) { + size_t len = strlen(line); + int ret; + + /* Truncate long lines. */ + if (len == (sizeof(line) - 1)) + while (line[(len - 1)] != '\n') { + ret = fgetc(file); + if (ret == EOF) + break; + line[(len - 1)] = ret; + } + /* Extract information. */ + if (sscanf(line, + "PCI_SLOT_NAME=" + "%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 "\n", + &pci_addr->domain, + &pci_addr->bus, + &pci_addr->devid, + &pci_addr->function) == 4) { + break; + } + } + fclose(file); + return 0; +} + +static int mana_proc_priv_init(struct rte_eth_dev *dev) +{ + struct mana_process_priv *priv; + + priv = rte_zmalloc_socket("mana_proc_priv", + sizeof(struct mana_process_priv), + RTE_CACHE_LINE_SIZE, + dev->device->numa_node); + if (!priv) + return -ENOMEM; + + dev->process_private = priv; + return 0; +} + +static int mana_map_doorbell_secondary(struct rte_eth_dev *eth_dev, int fd) +{ + struct mana_process_priv *priv = eth_dev->process_private; + + void *addr; + + addr = mmap(NULL, rte_mem_page_size(), PROT_WRITE, MAP_SHARED, fd, 0); + if (addr == MAP_FAILED) { + DRV_LOG(ERR, "Failed to map secondary doorbell port %u", + eth_dev->data->port_id); + return -ENOMEM; + } + + DRV_LOG(INFO, "Secondary doorbell mapped to %p", addr); + + priv->db_page = addr; + + return 0; +} + +/* Initialize shared data for the driver (all devices) */ +static int mana_init_shared_data(void) +{ + int ret = 0; + const struct rte_memzone *secondary_mz; + + rte_spinlock_lock(&mana_shared_data_lock); + + /* Skip if shared data is already initialized */ + if (mana_shared_data) + goto exit; + + if (rte_eal_process_type() == RTE_PROC_PRIMARY) { + mana_shared_mz = rte_memzone_reserve(MZ_MANA_SHARED_DATA, + sizeof(*mana_shared_data), + SOCKET_ID_ANY, 0); + if (!mana_shared_mz) { + DRV_LOG(ERR, "Cannot allocate mana shared data"); + ret = -rte_errno; + goto exit; + } + + mana_shared_data = mana_shared_mz->addr; + memset(mana_shared_data, 0, sizeof(*mana_shared_data)); + rte_spinlock_init(&mana_shared_data->lock); + } else { + secondary_mz = 
rte_memzone_lookup(MZ_MANA_SHARED_DATA); + if (!secondary_mz) { + DRV_LOG(ERR, "Cannot attach mana shared data"); + ret = -rte_errno; + goto exit; + } + + mana_shared_data = secondary_mz->addr; + memset(&mana_local_data, 0, sizeof(mana_local_data)); + } + +exit: + rte_spinlock_unlock(&mana_shared_data_lock); + + return ret; +} + +static int mana_init_once(void) +{ + int ret; + + ret = mana_init_shared_data(); + if (ret) + return ret; + + rte_spinlock_lock(&mana_shared_data->lock); + + switch (rte_eal_process_type()) { + case RTE_PROC_PRIMARY: + if (mana_shared_data->init_done) + break; + + ret = mana_mp_init_primary(); + if (ret) + break; + DRV_LOG(ERR, "MP INIT PRIMARY"); + + mana_shared_data->init_done = 1; + break; + + case RTE_PROC_SECONDARY: + + if (mana_local_data.init_done) + break; + + ret = mana_mp_init_secondary(); + if (ret) + break; + + DRV_LOG(ERR, "MP INIT SECONDARY"); + + mana_local_data.init_done = 1; + break; + + default: + /* Impossible, internal error */ + ret = -EPROTO; + break; + } + + rte_spinlock_unlock(&mana_shared_data->lock); + + return ret; +} + +static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused, + struct rte_pci_device *pci_dev, + struct rte_ether_addr *mac_addr) +{ + struct ibv_device **ibv_list; + int ibv_idx; + struct ibv_context *ctx; + struct ibv_device_attr_ex dev_attr; + int num_devices; + int ret = 0; + unsigned int port; + struct mana_priv *priv = NULL; + struct rte_eth_dev *eth_dev = NULL; + bool found_port; + + ibv_list = ibv_get_device_list(&num_devices); + for (ibv_idx = 0; ibv_idx < num_devices; ibv_idx++) { + struct ibv_device *ibdev = ibv_list[ibv_idx]; + struct rte_pci_addr pci_addr; + + DRV_LOG(INFO, "Probe device name %s dev_name %s ibdev_path %s", + ibdev->name, ibdev->dev_name, ibdev->ibdev_path); + + if (mana_ibv_device_to_pci_addr(ibdev, &pci_addr)) + continue; + + /* Ignore if this IB device is not this PCI device */ + if (pci_dev->addr.domain != pci_addr.domain || + pci_dev->addr.bus != pci_addr.bus || + pci_dev->addr.devid != pci_addr.devid || + pci_dev->addr.function != pci_addr.function) + continue; + + ctx = ibv_open_device(ibdev); + if (!ctx) { + DRV_LOG(ERR, "Failed to open IB device %s", + ibdev->name); + continue; + } + + ret = ibv_query_device_ex(ctx, NULL, &dev_attr); + DRV_LOG(INFO, "dev_attr.orig_attr.phys_port_cnt %u", + dev_attr.orig_attr.phys_port_cnt); + found_port = false; + + for (port = 1; port <= dev_attr.orig_attr.phys_port_cnt; + port++) { + struct ibv_parent_domain_init_attr attr = {}; + struct rte_ether_addr addr; + char address[64]; + char name[RTE_ETH_NAME_MAX_LEN]; + + ret = get_port_mac(ibdev, port, &addr); + if (ret) + continue; + + if (mac_addr && !rte_is_same_ether_addr(&addr, mac_addr)) + continue; + + rte_ether_format_addr(address, sizeof(address), &addr); + DRV_LOG(INFO, "device located port %u address %s", + port, address); + found_port = true; + + priv = rte_zmalloc_socket(NULL, sizeof(*priv), + RTE_CACHE_LINE_SIZE, + SOCKET_ID_ANY); + if (!priv) { + ret = -ENOMEM; + goto failed; + } + + snprintf(name, sizeof(name), "%s_port%d", + pci_dev->device.name, port); + + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + int fd; + + eth_dev = rte_eth_dev_attach_secondary(name); + if (!eth_dev) { + DRV_LOG(ERR, "Can't attach to dev %s", + name); + ret = -ENOMEM; + goto failed; + } + + eth_dev->device = &pci_dev->device; + eth_dev->dev_ops = &mana_dev_sec_ops; + ret = mana_proc_priv_init(eth_dev); + if (ret) + goto failed; + priv->process_priv = eth_dev->process_private; + + /* Get the 
IB FD from the primary process */ + fd = mana_mp_req_verbs_cmd_fd(eth_dev); + if (fd < 0) { + DRV_LOG(ERR, "Failed to get FD %d", fd); + ret = -ENODEV; + goto failed; + } + + ret = mana_map_doorbell_secondary(eth_dev, fd); + if (ret) { + DRV_LOG(ERR, "Failed secondary map %d", + fd); + goto failed; + } + + /* fd is no not used after mapping doorbell */ + close(fd); + + rte_spinlock_lock(&mana_shared_data->lock); + mana_shared_data->secondary_cnt++; + mana_local_data.secondary_cnt++; + rte_spinlock_unlock(&mana_shared_data->lock); + + rte_eth_copy_pci_info(eth_dev, pci_dev); + rte_eth_dev_probing_finish(eth_dev); + + /* Impossible to have more than one port + * matching a MAC address + */ + continue; + } + + eth_dev = rte_eth_dev_allocate(name); + if (!eth_dev) { + ret = -ENOMEM; + goto failed; + } + + eth_dev->data->mac_addrs = + rte_calloc("mana_mac", 1, + sizeof(struct rte_ether_addr), 0); + if (!eth_dev->data->mac_addrs) { + ret = -ENOMEM; + goto failed; + } + + rte_ether_addr_copy(&addr, eth_dev->data->mac_addrs); + + priv->ib_pd = ibv_alloc_pd(ctx); + if (!priv->ib_pd) { + DRV_LOG(ERR, "ibv_alloc_pd failed port %d", port); + ret = -ENOMEM; + goto failed; + } + + /* Create a parent domain with the port number */ + attr.pd = priv->ib_pd; + attr.comp_mask = IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT; + attr.pd_context = (void *)(uint64_t)port; + priv->ib_parent_pd = ibv_alloc_parent_domain(ctx, &attr); + if (!priv->ib_parent_pd) { + DRV_LOG(ERR, + "ibv_alloc_parent_domain failed port %d", + port); + ret = -ENOMEM; + goto failed; + } + + priv->ib_ctx = ctx; + priv->port_id = eth_dev->data->port_id; + priv->dev_port = port; + eth_dev->data->dev_private = priv; + priv->dev_data = eth_dev->data; + + priv->max_rx_queues = dev_attr.orig_attr.max_qp; + priv->max_tx_queues = dev_attr.orig_attr.max_qp; + + priv->max_rx_desc = + RTE_MIN(dev_attr.orig_attr.max_qp_wr, + dev_attr.orig_attr.max_cqe); + priv->max_tx_desc = + RTE_MIN(dev_attr.orig_attr.max_qp_wr, + dev_attr.orig_attr.max_cqe); + + priv->max_send_sge = dev_attr.orig_attr.max_sge; + priv->max_recv_sge = dev_attr.orig_attr.max_sge; + + priv->max_mr = dev_attr.orig_attr.max_mr; + priv->max_mr_size = dev_attr.orig_attr.max_mr_size; + + DRV_LOG(INFO, "dev %s max queues %d desc %d sge %d\n", + name, priv->max_rx_queues, priv->max_rx_desc, + priv->max_send_sge); + + rte_spinlock_lock(&mana_shared_data->lock); + mana_shared_data->primary_cnt++; + rte_spinlock_unlock(&mana_shared_data->lock); + + eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV; + + eth_dev->device = &pci_dev->device; + eth_dev->data->dev_flags |= + RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS; + + DRV_LOG(INFO, "device %s at port %u", + name, eth_dev->data->port_id); + + eth_dev->rx_pkt_burst = mana_rx_burst_removed; + eth_dev->tx_pkt_burst = mana_tx_burst_removed; + eth_dev->dev_ops = &mana_dev_ops; + + rte_eth_copy_pci_info(eth_dev, pci_dev); + rte_eth_dev_probing_finish(eth_dev); + } + + /* Secondary process doesn't need an ibv_ctx. It maps the + * doorbell pages using the IB cmd_fd passed from the primary + * process and send messages to primary process for memory + * registartions. 
+ */ + if (!found_port || rte_eal_process_type() == RTE_PROC_SECONDARY) + ibv_close_device(ctx); + } + + ibv_free_device_list(ibv_list); + return 0; + +failed: + /* Free the resource for the port failed */ + if (priv) { + if (priv->ib_parent_pd) + ibv_dealloc_pd(priv->ib_parent_pd); + + if (priv->ib_pd) + ibv_dealloc_pd(priv->ib_pd); + } + + if (eth_dev) + rte_eth_dev_release_port(eth_dev); + + rte_free(priv); + + ibv_close_device(ctx); + ibv_free_device_list(ibv_list); + + return ret; +} + +static int mana_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, + struct rte_pci_device *pci_dev) +{ + struct rte_devargs *args = pci_dev->device.devargs; + struct mana_conf conf = {}; + unsigned int i; + int ret; + + if (args && args->args) { + ret = mana_parse_args(args, &conf); + if (ret) { + DRV_LOG(ERR, "failed to parse parameters args = %s", + args->args); + return ret; + } + } + + ret = mana_init_once(); + if (ret) { + DRV_LOG(ERR, "Failed to init PMD global data %d", ret); + return ret; + } + + /* If there are no driver parameters, probe on all ports */ + if (!conf.index) + return mana_pci_probe_mac(pci_drv, pci_dev, NULL); + + for (i = 0; i < conf.index; i++) { + ret = mana_pci_probe_mac(pci_drv, pci_dev, &conf.mac_array[i]); + if (ret) + return ret; + } + + return 0; +} + +static int mana_dev_uninit(struct rte_eth_dev *dev) +{ + RTE_SET_USED(dev); + return 0; +} + +static int mana_pci_remove(struct rte_pci_device *pci_dev) +{ + if (rte_eal_process_type() == RTE_PROC_PRIMARY) { + rte_spinlock_lock(&mana_shared_data_lock); + + rte_spinlock_lock(&mana_shared_data->lock); + + RTE_VERIFY(mana_shared_data->primary_cnt > 0); + mana_shared_data->primary_cnt--; + if (!mana_shared_data->primary_cnt) { + DRV_LOG(DEBUG, "mp uninit primary"); + mana_mp_uninit_primary(); + } + + rte_spinlock_unlock(&mana_shared_data->lock); + + /* Also free the shared memory if this is the last */ + if (!mana_shared_data->primary_cnt) { + DRV_LOG(DEBUG, "free shared memezone data"); + rte_memzone_free(mana_shared_mz); + } + + rte_spinlock_unlock(&mana_shared_data_lock); + } else { + rte_spinlock_lock(&mana_shared_data_lock); + + rte_spinlock_lock(&mana_shared_data->lock); + RTE_VERIFY(mana_shared_data->secondary_cnt > 0); + mana_shared_data->secondary_cnt--; + rte_spinlock_unlock(&mana_shared_data->lock); + + RTE_VERIFY(mana_local_data.secondary_cnt > 0); + mana_local_data.secondary_cnt--; + if (!mana_local_data.secondary_cnt) { + DRV_LOG(DEBUG, "mp uninit secondary"); + mana_mp_uninit_secondary(); + } + + rte_spinlock_unlock(&mana_shared_data_lock); + } + + return rte_eth_dev_pci_generic_remove(pci_dev, mana_dev_uninit); +} + +static const struct rte_pci_id mana_pci_id_map[] = { + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MICROSOFT, + PCI_DEVICE_ID_MICROSOFT_MANA) + }, +}; + +static struct rte_pci_driver mana_pci_driver = { + .driver = { + .name = "mana_pci", + }, + .id_table = mana_pci_id_map, + .probe = mana_pci_probe, + .remove = mana_pci_remove, + .drv_flags = RTE_PCI_DRV_INTR_RMV, +}; + +RTE_INIT(rte_mana_pmd_init) +{ + rte_pci_register(&mana_pci_driver); +} + +RTE_PMD_EXPORT_NAME(net_mana, __COUNTER__); +RTE_PMD_REGISTER_PCI_TABLE(net_mana, mana_pci_id_map); +RTE_PMD_REGISTER_KMOD_DEP(net_mana, "* ib_uverbs & mana_ib"); +RTE_LOG_REGISTER_SUFFIX(mana_logtype_init, init, NOTICE); +RTE_LOG_REGISTER_SUFFIX(mana_logtype_driver, driver, NOTICE); diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h new file mode 100644 index 0000000000..dbef5420ff --- /dev/null +++ b/drivers/net/mana/mana.h @@ -0,0 +1,214 @@ +/* 
SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#ifndef __MANA_H__ +#define __MANA_H__ + +enum { + PCI_VENDOR_ID_MICROSOFT = 0x1414, +}; + +enum { + PCI_DEVICE_ID_MICROSOFT_MANA = 0x00ba, +}; + +/* Shared data between primary/secondary processes */ +struct mana_shared_data { + rte_spinlock_t lock; + int init_done; + unsigned int primary_cnt; + unsigned int secondary_cnt; +}; + +#define MIN_RX_BUF_SIZE 1024 +#define MAX_FRAME_SIZE RTE_ETHER_MAX_LEN +#define BNIC_MAX_MAC_ADDR 1 + +#define BNIC_DEV_RX_OFFLOAD_SUPPORT ( \ + DEV_RX_OFFLOAD_CHECKSUM | \ + DEV_RX_OFFLOAD_RSS_HASH) + +#define BNIC_DEV_TX_OFFLOAD_SUPPORT ( \ + RTE_ETH_TX_OFFLOAD_MULTI_SEGS | \ + RTE_ETH_TX_OFFLOAD_IPV4_CKSUM | \ + RTE_ETH_TX_OFFLOAD_TCP_CKSUM | \ + RTE_ETH_TX_OFFLOAD_UDP_CKSUM | \ + RTE_ETH_TX_OFFLOAD_TCP_TSO) + +#define INDIRECTION_TABLE_NUM_ELEMENTS 64 +#define TOEPLITZ_HASH_KEY_SIZE_IN_BYTES 40 +#define BNIC_ETH_RSS_SUPPORT ( \ + ETH_RSS_IPV4 | \ + ETH_RSS_NONFRAG_IPV4_TCP | \ + ETH_RSS_NONFRAG_IPV4_UDP | \ + ETH_RSS_IPV6 | \ + ETH_RSS_NONFRAG_IPV6_TCP | \ + ETH_RSS_NONFRAG_IPV6_UDP) + +#define MIN_BUFFERS_PER_QUEUE 64 +#define MAX_RECEIVE_BUFFERS_PER_QUEUE 256 +#define MAX_SEND_BUFFERS_PER_QUEUE 256 + +struct mana_process_priv { + void *db_page; +}; + +struct mana_priv { + struct rte_eth_dev_data *dev_data; + struct mana_process_priv *process_priv; + int num_queues; + + /* DPDK port */ + int port_id; + + /* IB device port */ + int dev_port; + + struct ibv_context *ib_ctx; + struct ibv_pd *ib_pd; + struct ibv_pd *ib_parent_pd; + struct ibv_rwq_ind_table *ind_table; + uint8_t ind_table_key[40]; + struct ibv_qp *rwq_qp; + void *db_page; + int max_rx_queues; + int max_tx_queues; + int max_rx_desc; + int max_tx_desc; + int max_send_sge; + int max_recv_sge; + int max_mr; + uint64_t max_mr_size; + rte_rwlock_t mr_list_lock; +}; + +struct mana_txq_desc { + struct rte_mbuf *pkt; + uint32_t wqe_size_in_bu; +}; + +struct mana_rxq_desc { + struct rte_mbuf *pkt; + uint32_t wqe_size_in_bu; +}; + +struct mana_gdma_queue { + void *buffer; + uint32_t count; /* in entries */ + uint32_t size; /* in bytes */ + uint32_t id; + uint32_t head; + uint32_t tail; +}; + +struct mana_stats { + uint64_t packets; + uint64_t bytes; + uint64_t errors; + uint64_t nombuf; +}; + +#define MANA_MR_BTREE_PER_QUEUE_N 64 +struct mana_txq { + struct mana_priv *priv; + uint32_t num_desc; + struct ibv_cq *cq; + struct ibv_qp *qp; + + struct mana_gdma_queue gdma_sq; + struct mana_gdma_queue gdma_cq; + + uint32_t tx_vp_offset; + + /* For storing pending requests */ + struct mana_txq_desc *desc_ring; + + /* desc_ring_head is where we put pending requests to ring, + * completion pull off desc_ring_tail + */ + uint32_t desc_ring_head, desc_ring_tail; + + struct mana_stats stats; + unsigned int socket; +}; + +struct mana_rxq { + struct mana_priv *priv; + uint32_t num_desc; + struct rte_mempool *mp; + struct ibv_cq *cq; + struct ibv_wq *wq; + + /* For storing pending requests */ + struct mana_rxq_desc *desc_ring; + + /* desc_ring_head is where we put pending requests to ring, + * completion pull off desc_ring_tail + */ + uint32_t desc_ring_head, desc_ring_tail; + + struct mana_gdma_queue gdma_rq; + struct mana_gdma_queue gdma_cq; + + struct mana_stats stats; + + unsigned int socket; +}; + +extern int mana_logtype_driver; +extern int mana_logtype_init; + +#define DRV_LOG(level, fmt, args...) 
\ + rte_log(RTE_LOG_ ## level, mana_logtype_driver, "%s(): " fmt "\n", \ + __func__, ## args) + +#define PMD_INIT_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, mana_logtype_init, "%s(): " fmt "\n",\ + __func__, ## args) + +#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>") + +const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev); + +uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, + uint16_t pkts_n); + +uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, + uint16_t pkts_n); + +void *mana_alloc_verbs_buf(size_t size, void *data); +void mana_free_verbs_buf(void *ptr, void *data); + +/** Request timeout for IPC. */ +#define MANA_MP_REQ_TIMEOUT_SEC 5 + +/* Request types for IPC. */ +enum mana_mp_req_type { + MANA_MP_REQ_VERBS_CMD_FD = 1, + MANA_MP_REQ_CREATE_MR, + MANA_MP_REQ_START_RXTX, + MANA_MP_REQ_STOP_RXTX, +}; + +/* Pameters for IPC. */ +struct mana_mp_param { + enum mana_mp_req_type type; + int port_id; + int result; + + /* MANA_MP_REQ_CREATE_MR */ + uintptr_t addr; + uint32_t len; +}; + +#define MANA_MP_NAME "net_mana_mp" +int mana_mp_init_primary(void); +int mana_mp_init_secondary(void); +void mana_mp_uninit_primary(void); +void mana_mp_uninit_secondary(void); +int mana_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev); + +void mana_mp_req_on_rxtx(struct rte_eth_dev *dev, enum mana_mp_req_type type); + +#endif diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build new file mode 100644 index 0000000000..7ab34c253c --- /dev/null +++ b/drivers/net/mana/meson.build @@ -0,0 +1,34 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2022 Microsoft Corporation + +if is_windows + build = false + reason = 'not supported on Windows' + subdir_done() +endif + +deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs'] + +sources += files( + 'mana.c', + 'mp.c', +) + +lib = cc.find_library('ibverbs', required:false) +if lib.found() + ext_deps += lib +else + build = false + reason = 'missing dependency ibverbs' + subdir_done() +endif + + +lib = cc.find_library('mana', required:false) +if lib.found() + ext_deps += lib +else + build = false + reason = 'missing dependency mana' + subdir_done() +endif diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c new file mode 100644 index 0000000000..b2f5f7ab49 --- /dev/null +++ b/drivers/net/mana/mp.c @@ -0,0 +1,257 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "mana.h" + +extern struct mana_shared_data *mana_shared_data; + +static void mp_init_msg(struct rte_mp_msg *msg, enum mana_mp_req_type type, + int port_id) +{ + struct mana_mp_param *param; + + strlcpy(msg->name, MANA_MP_NAME, sizeof(msg->name)); + msg->len_param = sizeof(*param); + + param = (struct mana_mp_param *)msg->param; + param->type = type; + param->port_id = port_id; +} + +static int mana_mp_primary_handle(const struct rte_mp_msg *mp_msg, + const void *peer) +{ + struct rte_eth_dev *dev; + const struct mana_mp_param *param = + (const struct mana_mp_param *)mp_msg->param; + struct rte_mp_msg mp_res = { 0 }; + struct mana_mp_param *res = (struct mana_mp_param *)mp_res.param; + int ret; + struct mana_priv *priv; + + if (!rte_eth_dev_is_valid_port(param->port_id)) { + DRV_LOG(ERR, "MP handle port ID %u 
invalid", param->port_id); + return -ENODEV; + } + + dev = &rte_eth_devices[param->port_id]; + priv = dev->data->dev_private; + + mp_init_msg(&mp_res, param->type, param->port_id); + + switch (param->type) { + case MANA_MP_REQ_VERBS_CMD_FD: + mp_res.num_fds = 1; + mp_res.fds[0] = priv->ib_ctx->cmd_fd; + res->result = 0; + ret = rte_mp_reply(&mp_res, peer); + break; + + default: + DRV_LOG(ERR, "Port %u unknown primary MP type %u", + param->port_id, param->type); + ret = -EINVAL; + } + + return ret; +} + +static int mana_mp_secondary_handle(const struct rte_mp_msg *mp_msg, + const void *peer) +{ + struct rte_mp_msg mp_res = { 0 }; + struct mana_mp_param *res = (struct mana_mp_param *)mp_res.param; + const struct mana_mp_param *param = + (const struct mana_mp_param *)mp_msg->param; + struct rte_eth_dev *dev; + int ret; + + if (!rte_eth_dev_is_valid_port(param->port_id)) { + DRV_LOG(ERR, "MP handle port ID %u invalid", param->port_id); + return -ENODEV; + } + + dev = &rte_eth_devices[param->port_id]; + + mp_init_msg(&mp_res, param->type, param->port_id); + + switch (param->type) { + case MANA_MP_REQ_START_RXTX: + DRV_LOG(INFO, "Port %u starting datapath", dev->data->port_id); + + rte_mb(); + + res->result = 0; + ret = rte_mp_reply(&mp_res, peer); + break; + + case MANA_MP_REQ_STOP_RXTX: + DRV_LOG(INFO, "Port %u stopping datapath", dev->data->port_id); + + dev->tx_pkt_burst = mana_tx_burst_removed; + dev->rx_pkt_burst = mana_rx_burst_removed; + + rte_mb(); + + res->result = 0; + ret = rte_mp_reply(&mp_res, peer); + break; + + default: + DRV_LOG(ERR, "Port %u unknown secondary MP type %u", + param->port_id, param->type); + ret = -EINVAL; + } + + return ret; +} + +int mana_mp_init_primary(void) +{ + int ret; + + ret = rte_mp_action_register(MANA_MP_NAME, mana_mp_primary_handle); + if (ret && rte_errno != ENOTSUP) { + DRV_LOG(ERR, "Failed to register primary handler %d %d", + ret, rte_errno); + return -1; + } + + return 0; +} + +void mana_mp_uninit_primary(void) +{ + rte_mp_action_unregister(MANA_MP_NAME); +} + +int mana_mp_init_secondary(void) +{ + return rte_mp_action_register(MANA_MP_NAME, mana_mp_secondary_handle); +} + +void mana_mp_uninit_secondary(void) +{ + rte_mp_action_unregister(MANA_MP_NAME); +} + +int mana_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev) +{ + struct rte_mp_msg mp_req = { 0 }; + struct rte_mp_msg *mp_res; + struct rte_mp_reply mp_rep; + struct mana_mp_param *res; + struct timespec ts = {.tv_sec = MANA_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0}; + int ret; + + mp_init_msg(&mp_req, MANA_MP_REQ_VERBS_CMD_FD, dev->data->port_id); + + ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts); + if (ret) { + DRV_LOG(ERR, "port %u request to primary process failed", + dev->data->port_id); + return ret; + } + + if (mp_rep.nb_received != 1) { + DRV_LOG(ERR, "primary replied %u messages", mp_rep.nb_received); + ret = -EPROTO; + goto exit; + } + + mp_res = &mp_rep.msgs[0]; + res = (struct mana_mp_param *)mp_res->param; + if (res->result) { + DRV_LOG(ERR, "failed to get CMD FD, port %u", + dev->data->port_id); + ret = res->result; + goto exit; + } + + if (mp_res->num_fds != 1) { + DRV_LOG(ERR, "got FDs %d unexpected", mp_res->num_fds); + ret = -EPROTO; + goto exit; + } + + ret = mp_res->fds[0]; + DRV_LOG(ERR, "port %u command FD from primary is %d", + dev->data->port_id, ret); +exit: + free(mp_rep.msgs); + return ret; +} + +void mana_mp_req_on_rxtx(struct rte_eth_dev *dev, enum mana_mp_req_type type) +{ + struct rte_mp_msg mp_req = { 0 }; + struct rte_mp_msg *mp_res; + struct rte_mp_reply mp_rep; + 
struct mana_mp_param *res; + struct timespec ts = {.tv_sec = MANA_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0}; + int i, ret; + + if (type != MANA_MP_REQ_START_RXTX && type != MANA_MP_REQ_STOP_RXTX) { + DRV_LOG(ERR, "port %u unknown request (req_type %d)", + dev->data->port_id, type); + return; + } + + if (!mana_shared_data->secondary_cnt) + return; + + mp_init_msg(&mp_req, type, dev->data->port_id); + + ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts); + if (ret) { + if (rte_errno != ENOTSUP) + DRV_LOG(ERR, "port %u failed to request Rx/Tx (%d)", + dev->data->port_id, type); + goto exit; + } + if (mp_rep.nb_sent != mp_rep.nb_received) { + DRV_LOG(ERR, "port %u not all secondaries responded (%d)", + dev->data->port_id, type); + goto exit; + } + for (i = 0; i < mp_rep.nb_received; i++) { + mp_res = &mp_rep.msgs[i]; + res = (struct mana_mp_param *)mp_res->param; + if (res->result) { + DRV_LOG(ERR, "port %u request failed on secondary %d", + dev->data->port_id, i); + goto exit; + } + } +exit: + free(mp_rep.msgs); +} diff --git a/drivers/net/mana/version.map b/drivers/net/mana/version.map new file mode 100644 index 0000000000..c2e0723b4c --- /dev/null +++ b/drivers/net/mana/version.map @@ -0,0 +1,3 @@ +DPDK_22 { + local: *; +}; diff --git a/drivers/net/meson.build b/drivers/net/meson.build index 2355d1cde8..0b111a6ebb 100644 --- a/drivers/net/meson.build +++ b/drivers/net/meson.build @@ -34,6 +34,7 @@ drivers = [ 'ixgbe', 'kni', 'liquidio', + 'mana', 'memif', 'mlx4', 'mlx5', From patchwork Fri Jul 1 09:02:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113605 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0EA1DA00C2; Fri, 1 Jul 2022 11:03:18 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 930AF4284D; Fri, 1 Jul 2022 11:03:06 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 5671042670 for ; Fri, 1 Jul 2022 11:03:04 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id ABB3D20D5906; Fri, 1 Jul 2022 02:03:03 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com ABB3D20D5906 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666183; bh=qv1V20I7DwO8wxob5/bIKItLq/37U9CyUtMkoulLG3s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=czFooDAvWeq2xRlintr+NY81PjBJw+LS0Sm56tFSxj1FmTu/6fj2Yoq2n18DYMtQh 3879l+0SvItyWpICegP/7yrsLoR2scFa0Qr8zP0zY472UqaQ16Pq0eDdlxhg0guX8r jBDafL+jA5JK6meiWoXs9a+z9E22JP6Jvy3LIfJw= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 02/17] net/mana: add device configuration and stop Date: Fri, 1 Jul 2022 02:02:32 -0700 Message-Id: <1656666167-26035-3-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: 
dev-bounces@dpdk.org From: Long Li MANA defines its memory allocation functions to override IB layer default functions to allocate device queues. This patch adds the code for device configuration and stop. Signed-off-by: Long Li --- drivers/net/mana/mana.c | 87 ++++++++++++++++++++++++++++++++++++++++- drivers/net/mana/mana.h | 3 -- 2 files changed, 85 insertions(+), 5 deletions(-) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 893e2b1e23..882a38d7df 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -57,7 +57,91 @@ static rte_spinlock_t mana_shared_data_lock = RTE_SPINLOCK_INITIALIZER; int mana_logtype_driver; int mana_logtype_init; +static void *mana_alloc_verbs_buf(size_t size, void *data) +{ + void *ret; + size_t alignment = rte_mem_page_size(); + int socket = (int)(uintptr_t)data; + + DRV_LOG(DEBUG, "size=%lu socket=%d", size, socket); + + if (alignment == (size_t)-1) { + DRV_LOG(ERR, "Failed to get mem page size"); + rte_errno = ENOMEM; + return NULL; + } + + ret = rte_zmalloc_socket("mana_verb_buf", size, alignment, socket); + if (!ret && size) + rte_errno = ENOMEM; + return ret; +} + +static void mana_free_verbs_buf(void *ptr, void *data __rte_unused) +{ + rte_free(ptr); +} + +static int mana_dev_configure(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + struct rte_eth_conf *dev_conf = &dev->data->dev_conf; + const struct rte_eth_rxmode *rxmode = &dev_conf->rxmode; + const struct rte_eth_txmode *txmode = &dev_conf->txmode; + + if (dev_conf->rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG) + dev_conf->rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH; + + if (txmode->offloads & ~BNIC_DEV_TX_OFFLOAD_SUPPORT) { + DRV_LOG(ERR, "Unsupported TX offload: %lx", txmode->offloads); + return -EINVAL; + } + + if (rxmode->offloads & ~BNIC_DEV_RX_OFFLOAD_SUPPORT) { + DRV_LOG(ERR, "Unsupported RX offload: %lx", rxmode->offloads); + return -EINVAL; + } + + if (dev->data->nb_rx_queues != dev->data->nb_tx_queues) { + DRV_LOG(ERR, "Only support equal number of rx/tx queues"); + return -EINVAL; + } + + if (!rte_is_power_of_2(dev->data->nb_rx_queues)) { + DRV_LOG(ERR, "number of TX/RX queues must be power of 2"); + return -EINVAL; + } + + priv->num_queues = dev->data->nb_rx_queues; + + manadv_set_context_attr(priv->ib_ctx, MANADV_CTX_ATTR_BUF_ALLOCATORS, + (void *)((uintptr_t)&(struct manadv_ctx_allocators){ + .alloc = &mana_alloc_verbs_buf, + .free = &mana_free_verbs_buf, + .data = 0, + })); + + return 0; +} + +static int +mana_dev_close(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + int ret; + + ret = ibv_close_device(priv->ib_ctx); + if (ret) { + ret = errno; + return ret; + } + + return 0; +} + const struct eth_dev_ops mana_dev_ops = { + .dev_configure = mana_dev_configure, + .dev_close = mana_dev_close, }; const struct eth_dev_ops mana_dev_sec_ops = { @@ -652,8 +736,7 @@ static int mana_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, static int mana_dev_uninit(struct rte_eth_dev *dev) { - RTE_SET_USED(dev); - return 0; + return mana_dev_close(dev); } static int mana_pci_remove(struct rte_pci_device *pci_dev) diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index dbef5420ff..9609bee4de 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -177,9 +177,6 @@ uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); -void *mana_alloc_verbs_buf(size_t size, void *data); -void 
mana_free_verbs_buf(void *ptr, void *data); - /** Request timeout for IPC. */ #define MANA_MP_REQ_TIMEOUT_SEC 5 From patchwork Fri Jul 1 09:02:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113606 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BDC75A00C2; Fri, 1 Jul 2022 11:03:25 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9DD1642B6D; Fri, 1 Jul 2022 11:03:07 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id E1F2940E03 for ; Fri, 1 Jul 2022 11:03:04 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 4A8C220D590B; Fri, 1 Jul 2022 02:03:04 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 4A8C220D590B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666184; bh=siRoLBE3g5BfHQCjYYMpNeE8pcRb7wE93//jvCc98fo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=JhEN9Fl99xZ5KNQOMQgnaDWhjLTv+hp/uy1ph5jMv6hbbmAe5QFRjK6y/zy0Z+6hU nLzy+zhdIoYWWR9sSnhEj40+n2PyUlsQwC5LpyHqeF8fmI1YF5lSP2sWRK1upY1ga+ xCcilVHUJznU0N38TFPzYo5vo44ItsS0SyD8y4X4= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 03/17] net/mana: add function to report support ptypes Date: Fri, 1 Jul 2022 02:02:33 -0700 Message-Id: <1656666167-26035-4-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li Report supported protocol types. 
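For reference (illustrative only, not part of the patch), an application reads the list exposed by this new callback through the regular ethdev API; a minimal sketch, assuming a configured port and trimming error handling:

    #include <stdio.h>
    #include <rte_common.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf_ptype.h>

    /* Print the packet types a port claims to recognize. */
    static void show_ptypes(uint16_t port_id)
    {
        uint32_t ptypes[32];
        int i, n;

        n = rte_eth_dev_get_supported_ptypes(port_id, RTE_PTYPE_ALL_MASK,
                                              ptypes, RTE_DIM(ptypes));
        for (i = 0; i < n && i < (int)RTE_DIM(ptypes); i++)
            printf("supported ptype: 0x%08x\n", ptypes[i]);
    }

The callback itself only has to return a static array terminated by RTE_PTYPE_UNKNOWN, which is what the patch adds.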
Signed-off-by: Long Li --- drivers/net/mana/mana.c | 16 ++++++++++++++++ drivers/net/mana/mana.h | 2 -- 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 882a38d7df..77796ce40d 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -139,9 +139,25 @@ mana_dev_close(struct rte_eth_dev *dev) return 0; } +static const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused) +{ + static const uint32_t ptypes[] = { + RTE_PTYPE_L2_ETHER, + RTE_PTYPE_L3_IPV4_EXT_UNKNOWN, + RTE_PTYPE_L3_IPV6_EXT_UNKNOWN, + RTE_PTYPE_L4_FRAG, + RTE_PTYPE_L4_TCP, + RTE_PTYPE_L4_UDP, + RTE_PTYPE_UNKNOWN + }; + + return ptypes; +} + const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_close = mana_dev_close, + .dev_supported_ptypes_get = mana_supported_ptypes, }; const struct eth_dev_ops mana_dev_sec_ops = { diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 9609bee4de..b0571a0516 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -169,8 +169,6 @@ extern int mana_logtype_init; #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>") -const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev); - uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); From patchwork Fri Jul 1 09:02:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113607 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8066DA00C2; Fri, 1 Jul 2022 11:03:32 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A349D42B70; Fri, 1 Jul 2022 11:03:08 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 72AB140E03 for ; Fri, 1 Jul 2022 11:03:05 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id CD2B320D5912; Fri, 1 Jul 2022 02:03:04 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com CD2B320D5912 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666184; bh=nSLvb+0DjWYrWW/L0i2WBgmdVNwNPaAHimpcowlC5PY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=s9GY1lhyyQ7Uk1jPm416fGsh1xFZ2jR5mqXCxmiwBcBgBJAsBQH6VN5+biKYeGdXy LXQq/zl/IARTK1RTcKHaRwS7SZL3CdGBHmXXtDOU5qWhYSJ0WT/5FhBTEtnxmvBy22 JGK7PuMbFaaS6Ltf/k/6P74FQFKdoEHtMtohqW48= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 04/17] net/mana: add link update Date: Fri, 1 Jul 2022 02:02:34 -0700 Message-Id: <1656666167-26035-5-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li The carrier state is managed by the Azure host. MANA runs as a VF and always reports UP. 
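To show what this fixed link state looks like from the application side (an illustrative sketch, not part of the patch), the values set by the new callback are what rte_eth_link_get_nowait() returns:

    #include <stdio.h>
    #include <rte_ethdev.h>

    /* Read and print the link state reported by the PMD. */
    static void show_link(uint16_t port_id)
    {
        struct rte_eth_link link;

        if (rte_eth_link_get_nowait(port_id, &link) != 0)
            return;

        printf("port %u: %s, %u Mbps, %s duplex\n", port_id,
               link.link_status == RTE_ETH_LINK_UP ? "up" : "down",
               link.link_speed,
               link.link_duplex == RTE_ETH_LINK_FULL_DUPLEX ? "full" : "half");
    }

Since the VF has no carrier concept, this always prints an UP, 200G, full-duplex link regardless of the underlying host state.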
Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.c | 17 +++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index 9d8676089b..b7e7cc510b 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -4,6 +4,7 @@ ; Refer to default.ini for the full list of available PMD features. ; [Features] +Link status = P Linux = Y Multiprocess aware = Y Usage doc = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 77796ce40d..7b495e1aa1 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -154,10 +154,27 @@ static const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev __rte_unuse return ptypes; } +static int mana_dev_link_update(struct rte_eth_dev *dev, + int wait_to_complete __rte_unused) +{ + struct rte_eth_link link; + + /* MANA has no concept of carrier state, always reporting UP */ + link = (struct rte_eth_link) { + .link_duplex = RTE_ETH_LINK_FULL_DUPLEX, + .link_autoneg = RTE_ETH_LINK_SPEED_FIXED, + .link_speed = RTE_ETH_SPEED_NUM_200G, + .link_status = RTE_ETH_LINK_UP, + }; + + return rte_eth_linkstatus_set(dev, &link); +} + const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_close = mana_dev_close, .dev_supported_ptypes_get = mana_supported_ptypes, + .link_update = mana_dev_link_update, }; const struct eth_dev_ops mana_dev_sec_ops = { From patchwork Fri Jul 1 09:02:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113608 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 408BEA00C2; Fri, 1 Jul 2022 11:03:39 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9C5D642B76; Fri, 1 Jul 2022 11:03:09 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 9E47D427F3 for ; Fri, 1 Jul 2022 11:03:05 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 53B4120D590C; Fri, 1 Jul 2022 02:03:05 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 53B4120D590C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666185; bh=sOOFGULXqn01aaHOZL1D2d6Ji8akcy6e++3W4eJfRM8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=hLBTBE3rtsM0dyQbWq1DkCFcTbqKGBqpxs+dsGIP7RS2MUleCRSZQdcbRoqedhyKZ NCRZQeaKf/dv0ypIr8YnmcAWEeLzfIlRYEuKYnRUzKnRJidmMcQinTSIgc23A74Akv Mr+6DRBfiY9jgFuu+hM6b6NOnl2u0+9pfTf7xeZ8= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 05/17] net/mana: add function for device removal interrupts Date: Fri, 1 Jul 2022 02:02:35 -0700 Message-Id: <1656666167-26035-6-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: 
dev-bounces@dpdk.org From: Long Li MANA supports PCI hot plug events. Add this interrupt to DPDK core so its parent PMD can detect device removal during Azure servicing or live migration. Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.c | 97 +++++++++++++++++++++++++++++++ drivers/net/mana/mana.h | 1 + 3 files changed, 99 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index b7e7cc510b..47e20754eb 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -7,5 +7,6 @@ Link status = P Linux = Y Multiprocess aware = Y +Removal event = Y Usage doc = Y x86-64 = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 7b495e1aa1..f03908b6e4 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -124,12 +124,18 @@ static int mana_dev_configure(struct rte_eth_dev *dev) return 0; } +static int mana_intr_uninstall(struct mana_priv *priv); + static int mana_dev_close(struct rte_eth_dev *dev) { struct mana_priv *priv = dev->data->dev_private; int ret; + ret = mana_intr_uninstall(priv); + if (ret) + return ret; + ret = ibv_close_device(priv->ib_ctx); if (ret) { ret = errno; @@ -364,6 +370,90 @@ static int mana_ibv_device_to_pci_addr(const struct ibv_device *device, return 0; } +static void mana_intr_handler(void *arg) +{ + struct mana_priv *priv = arg; + struct ibv_context *ctx = priv->ib_ctx; + struct ibv_async_event event; + + /* Read and ack all messages from IB device */ + while (true) { + if (ibv_get_async_event(ctx, &event)) + break; + + if (event.event_type == IBV_EVENT_DEVICE_FATAL) { + struct rte_eth_dev *dev; + + dev = &rte_eth_devices[priv->port_id]; + if (dev->data->dev_conf.intr_conf.rmv) + rte_eth_dev_callback_process(dev, + RTE_ETH_EVENT_INTR_RMV, NULL); + } + + ibv_ack_async_event(&event); + } +} + +static int mana_intr_uninstall(struct mana_priv *priv) +{ + int ret; + + ret = rte_intr_callback_unregister(priv->intr_handle, + mana_intr_handler, priv); + if (ret <= 0) { + DRV_LOG(ERR, "Failed to unregister intr callback ret %d", ret); + return ret; + } + + rte_intr_instance_free(priv->intr_handle); + + return 0; +} + +static int mana_intr_install(struct mana_priv *priv) +{ + int ret, flags; + struct ibv_context *ctx = priv->ib_ctx; + + priv->intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_SHARED); + if (!priv->intr_handle) { + DRV_LOG(ERR, "Failed to allocate intr_handle"); + rte_errno = ENOMEM; + return -ENOMEM; + } + + rte_intr_fd_set(priv->intr_handle, -1); + + flags = fcntl(ctx->async_fd, F_GETFL); + ret = fcntl(ctx->async_fd, F_SETFL, flags | O_NONBLOCK); + if (ret) { + DRV_LOG(ERR, "Failed to change async_fd to NONBLOCK"); + goto free_intr; + } + + rte_intr_fd_set(priv->intr_handle, ctx->async_fd); + rte_intr_type_set(priv->intr_handle, RTE_INTR_HANDLE_EXT); + + ret = rte_intr_callback_register(priv->intr_handle, + mana_intr_handler, priv); + if (ret) { + DRV_LOG(ERR, "Failed to register intr callback"); + rte_intr_fd_set(priv->intr_handle, -1); + goto restore_fd; + } + + return 0; + +restore_fd: + fcntl(ctx->async_fd, F_SETFL, flags); + +free_intr: + rte_intr_instance_free(priv->intr_handle); + priv->intr_handle = NULL; + + return ret; +} + static int mana_proc_priv_init(struct rte_eth_dev *dev) { struct mana_process_priv *priv; @@ -677,6 +767,13 @@ static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused, name, priv->max_rx_queues, priv->max_rx_desc, priv->max_send_sge); + /* Create async interrupt 
handler */ + ret = mana_intr_install(priv); + if (ret) { + DRV_LOG(ERR, "Failed to install intr handler"); + goto failed; + } + rte_spinlock_lock(&mana_shared_data->lock); mana_shared_data->primary_cnt++; rte_spinlock_unlock(&mana_shared_data->lock); diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index b0571a0516..71c82e4bd2 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -72,6 +72,7 @@ struct mana_priv { uint8_t ind_table_key[40]; struct ibv_qp *rwq_qp; void *db_page; + struct rte_intr_handle *intr_handle; int max_rx_queues; int max_tx_queues; int max_rx_desc; From patchwork Fri Jul 1 09:02:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113609 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A7725A00C2; Fri, 1 Jul 2022 11:03:45 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AB0D542B7A; Fri, 1 Jul 2022 11:03:10 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 234EE40E03 for ; Fri, 1 Jul 2022 11:03:06 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id CCB8720D5914; Fri, 1 Jul 2022 02:03:05 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com CCB8720D5914 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666185; bh=3pp7hrAz8Su3/Ry/DJw0icrKW2wGDgCsJ7oxls7Gnw4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=eetpkgkrpwYyWBXVTlrNDsLWNYeErdPQv0IFoS5uUOoK36+jxs/tZe1rCKwpR4NlM 7wdGgoEf4UcAlKeo6hsKBMaj1Xcyki8q0+zsq/BWoaHNjnOUn52gEDQ+2S1CDk29KP 8vl7eoFtIgxszZzxPZzyc4PgYn3HS1/NilrljeQU= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 06/17] net/mana: add device info Date: Fri, 1 Jul 2022 02:02:36 -0700 Message-Id: <1656666167-26035-7-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li Add the function to get device info. 
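For context (an illustrative sketch, not part of the patch), the limits filled in here are what applications see through rte_eth_dev_info_get() when sizing their queues:

    #include <stdio.h>
    #include <rte_ethdev.h>

    /* Print a few of the limits the PMD reports in dev_infos_get. */
    static void show_dev_info(uint16_t port_id)
    {
        struct rte_eth_dev_info info;

        if (rte_eth_dev_info_get(port_id, &info) != 0)
            return;

        printf("port %u: up to %u Rx / %u Tx queues, %u-%u Rx descriptors\n",
               port_id, info.max_rx_queues, info.max_tx_queues,
               info.rx_desc_lim.nb_min, info.rx_desc_lim.nb_max);
    }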
Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.c | 82 +++++++++++++++++++++++++++++++ 2 files changed, 83 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index 47e20754eb..5183c6d3d0 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -8,5 +8,6 @@ Link status = P Linux = Y Multiprocess aware = Y Removal event = Y +Speed capabilities = P Usage doc = Y x86-64 = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index f03908b6e4..1513d5904b 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -145,6 +145,86 @@ mana_dev_close(struct rte_eth_dev *dev) return 0; } +static int mana_dev_info_get(struct rte_eth_dev *dev, + struct rte_eth_dev_info *dev_info) +{ + struct mana_priv *priv = dev->data->dev_private; + + dev_info->max_mtu = RTE_ETHER_MTU; + + /* RX params */ + dev_info->min_rx_bufsize = MIN_RX_BUF_SIZE; + dev_info->max_rx_pktlen = MAX_FRAME_SIZE; + + dev_info->max_rx_queues = priv->max_rx_queues; + dev_info->max_tx_queues = priv->max_tx_queues; + + dev_info->max_mac_addrs = BNIC_MAX_MAC_ADDR; + dev_info->max_hash_mac_addrs = 0; + + dev_info->max_vfs = 1; + + /* Offload params */ + dev_info->rx_offload_capa = BNIC_DEV_RX_OFFLOAD_SUPPORT; + + dev_info->tx_offload_capa = BNIC_DEV_TX_OFFLOAD_SUPPORT; + + /* RSS */ + dev_info->reta_size = INDIRECTION_TABLE_NUM_ELEMENTS; + dev_info->hash_key_size = TOEPLITZ_HASH_KEY_SIZE_IN_BYTES; + dev_info->flow_type_rss_offloads = BNIC_ETH_RSS_SUPPORT; + + /* Thresholds */ + dev_info->default_rxconf = (struct rte_eth_rxconf){ + .rx_thresh = { + .pthresh = 8, + .hthresh = 8, + .wthresh = 0, + }, + .rx_free_thresh = 32, + /* If no descriptors available, pkts are dropped by default */ + .rx_drop_en = 1, + }; + + dev_info->default_txconf = (struct rte_eth_txconf){ + .tx_thresh = { + .pthresh = 32, + .hthresh = 0, + .wthresh = 0, + }, + .tx_rs_thresh = 32, + .tx_free_thresh = 32, + }; + + /* Buffer limits */ + dev_info->rx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE; + dev_info->rx_desc_lim.nb_max = priv->max_rx_desc; + dev_info->rx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE; + dev_info->rx_desc_lim.nb_seg_max = priv->max_recv_sge; + dev_info->rx_desc_lim.nb_mtu_seg_max = priv->max_recv_sge; + + dev_info->tx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE; + dev_info->tx_desc_lim.nb_max = priv->max_tx_desc; + dev_info->tx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE; + dev_info->tx_desc_lim.nb_seg_max = priv->max_send_sge; + dev_info->rx_desc_lim.nb_mtu_seg_max = priv->max_recv_sge; + + /* Speed */ + dev_info->speed_capa = ETH_LINK_SPEED_100G; + + /* RX params */ + dev_info->default_rxportconf.burst_size = 1; + dev_info->default_rxportconf.ring_size = MAX_RECEIVE_BUFFERS_PER_QUEUE; + dev_info->default_rxportconf.nb_queues = 1; + + /* TX params */ + dev_info->default_txportconf.burst_size = 1; + dev_info->default_txportconf.ring_size = MAX_SEND_BUFFERS_PER_QUEUE; + dev_info->default_txportconf.nb_queues = 1; + + return 0; +} + static const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused) { static const uint32_t ptypes[] = { @@ -179,11 +259,13 @@ static int mana_dev_link_update(struct rte_eth_dev *dev, const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_close = mana_dev_close, + .dev_infos_get = mana_dev_info_get, .dev_supported_ptypes_get = mana_supported_ptypes, .link_update = mana_dev_link_update, }; const struct eth_dev_ops mana_dev_sec_ops = { + .dev_infos_get = 
mana_dev_info_get, }; uint16_t From patchwork Fri Jul 1 09:02:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113610 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D64C1A00C2; Fri, 1 Jul 2022 11:03:51 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AE6E342B7E; Fri, 1 Jul 2022 11:03:11 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 9C29040E03 for ; Fri, 1 Jul 2022 11:03:06 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 4F9BA20D591C; Fri, 1 Jul 2022 02:03:06 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 4F9BA20D591C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666186; bh=7tCQjosm4uPKsQI/WQdrJsowRPBjqczZ9NQfBRyaxoM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=XKbl6u9Y/6ytmKzq1el3yV5scdmOPy5yK0Ji/ZX1gW6qgCii6f9xbHpAFD3Ja4+X0 4cdO3fQ2KX2krGA0NSFmb2SQgz8y5fZIZUCJN9EBMmcs+lwYdl/Ccl9hkiLIgi69fw 3FaJPDUy3tkbzQnef+lpyAhMjj6pe4f3VzflwH1E= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 07/17] net/mana: add function to configure RSS Date: Fri, 1 Jul 2022 02:02:37 -0700 Message-Id: <1656666167-26035-8-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li Currently this PMD supports RSS configuration when the device is stopped. Configuring RSS in running state will be supported in the future. 
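Because the hash can only be programmed on a stopped port, the expected caller-side ordering looks roughly like the sketch below (illustrative only; the port, hash types and error handling are assumptions, not code from this series):

#include <rte_ethdev.h>

static int configure_mana_rss(uint16_t port_id)
{
	struct rte_eth_rss_conf rss_conf = {
		.rss_key = NULL,          /* keep the current/default Toeplitz key */
		.rss_key_len = 0,
		.rss_hf = RTE_ETH_RSS_IPV4 | RTE_ETH_RSS_NONFRAG_IPV4_TCP,
	};
	int ret;

	/* Must run before rte_eth_dev_start(); this PMD rejects it afterwards */
	ret = rte_eth_dev_rss_hash_update(port_id, &rss_conf);
	if (ret != 0)
		return ret;

	return rte_eth_dev_start(port_id);
}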
Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.c | 61 +++++++++++++++++++++++++++++++ drivers/net/mana/mana.h | 1 + 3 files changed, 63 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index 5183c6d3d0..9ba4767a06 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -8,6 +8,7 @@ Link status = P Linux = Y Multiprocess aware = Y Removal event = Y +RSS hash = Y Speed capabilities = P Usage doc = Y x86-64 = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 1513d5904b..46b1d5502d 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -240,6 +240,65 @@ static const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev __rte_unuse return ptypes; } +static int mana_rss_hash_update(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + struct mana_priv *priv = dev->data->dev_private; + + /* Currently can only update RSS hash when device is stopped */ + if (dev->data->dev_started) { + DRV_LOG(ERR, "Can't update RSS after device has started"); + return -ENODEV; + } + + if (rss_conf->rss_hf & ~BNIC_ETH_RSS_SUPPORT) { + DRV_LOG(ERR, "Port %u invalid RSS HF 0x%lx", + dev->data->port_id, rss_conf->rss_hf); + return -EINVAL; + } + + if (rss_conf->rss_key && rss_conf->rss_key_len) { + if (rss_conf->rss_key_len != TOEPLITZ_HASH_KEY_SIZE_IN_BYTES) { + DRV_LOG(ERR, "Port %u key len must be %u long", + dev->data->port_id, + TOEPLITZ_HASH_KEY_SIZE_IN_BYTES); + return -EINVAL; + } + + priv->rss_conf.rss_key_len = rss_conf->rss_key_len; + priv->rss_conf.rss_key = + rte_zmalloc("mana_rss", rss_conf->rss_key_len, + RTE_CACHE_LINE_SIZE); + if (!priv->rss_conf.rss_key) + return -ENOMEM; + memcpy(priv->rss_conf.rss_key, rss_conf->rss_key, + rss_conf->rss_key_len); + } + priv->rss_conf.rss_hf = rss_conf->rss_hf; + + return 0; +} + +static int mana_rss_hash_conf_get(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + struct mana_priv *priv = dev->data->dev_private; + + if (!rss_conf) + return -EINVAL; + + if (rss_conf->rss_key && + rss_conf->rss_key_len >= priv->rss_conf.rss_key_len) { + memcpy(rss_conf->rss_key, priv->rss_conf.rss_key, + priv->rss_conf.rss_key_len); + } + + rss_conf->rss_key_len = priv->rss_conf.rss_key_len; + rss_conf->rss_hf = priv->rss_conf.rss_hf; + + return 0; +} + static int mana_dev_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused) { @@ -261,6 +320,8 @@ const struct eth_dev_ops mana_dev_ops = { .dev_close = mana_dev_close, .dev_infos_get = mana_dev_info_get, .dev_supported_ptypes_get = mana_supported_ptypes, + .rss_hash_update = mana_rss_hash_update, + .rss_hash_conf_get = mana_rss_hash_conf_get, .link_update = mana_dev_link_update, }; diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 71c82e4bd2..1efb2330ee 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -72,6 +72,7 @@ struct mana_priv { uint8_t ind_table_key[40]; struct ibv_qp *rwq_qp; void *db_page; + struct rte_eth_rss_conf rss_conf; struct rte_intr_handle *intr_handle; int max_rx_queues; int max_tx_queues; From patchwork Fri Jul 1 09:02:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113611 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by 
inbox.dpdk.org (Postfix) with ESMTP id 0F2E4A00C2; Fri, 1 Jul 2022 11:03:58 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AADB842B83; Fri, 1 Jul 2022 11:03:12 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 2283542B6D for ; Fri, 1 Jul 2022 11:03:07 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id D1ED720D5915; Fri, 1 Jul 2022 02:03:06 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D1ED720D5915 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666186; bh=JUkJeMzHXnL6mrtRUzV1XIcdwzNRr6z06X6s5glvoL0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=eXViHTQzVLa55y2za0b7lB+WT9RodKIw+pC+evCV11ioVmk7hTHKZliwhPvYpOiWl EGyN5qFV8vS7ILaYS+A1qbJIOiWzKYT5VQ+3+nQJtsTDa7CK1ZnDikj8jdgiIFwwbX YxOWNxq8ASkbgqp3KI3mVaSSsMN2Pkw6cAUm21DY= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 08/17] net/mana: add function to configure RX queues Date: Fri, 1 Jul 2022 02:02:38 -0700 Message-Id: <1656666167-26035-9-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li RX hardware queue is allocated when starting the queue. This function is for queue configuration pre starting. 
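The setup op only records the ring parameters; the hardware RQ itself is created later at dev_start. A typical caller-side ordering is sketched below (illustrative only; pool sizing and descriptor counts are assumptions):

#include <errno.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

static int setup_mana_rx(uint16_t port_id, uint16_t nb_rx_q)
{
	struct rte_mempool *mp;
	uint16_t q;
	int ret;

	mp = rte_pktmbuf_pool_create("mana_rx_pool", 8192, 256, 0,
				     RTE_MBUF_DEFAULT_BUF_SIZE,
				     rte_socket_id());
	if (mp == NULL)
		return -ENOMEM;

	for (q = 0; q < nb_rx_q; q++) {
		/* Only records nb_desc and the mempool; the HW queue is
		 * allocated when the port is started.
		 */
		ret = rte_eth_rx_queue_setup(port_id, q, 512, rte_socket_id(),
					     NULL, mp);
		if (ret != 0)
			return ret;
	}

	/* rte_eth_dev_start() is what triggers the device queue allocation */
	return 0;
}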
Signed-off-by: Long Li --- drivers/net/mana/mana.c | 68 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 46b1d5502d..951fc418b6 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -225,6 +225,16 @@ static int mana_dev_info_get(struct rte_eth_dev *dev, return 0; } +static void mana_dev_rx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id, + struct rte_eth_rxq_info *qinfo) +{ + struct mana_rxq *rxq = dev->data->rx_queues[queue_id]; + + qinfo->mp = rxq->mp; + qinfo->nb_desc = rxq->num_desc; + qinfo->conf.offloads = dev->data->dev_conf.rxmode.offloads; +} + static const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused) { static const uint32_t ptypes[] = { @@ -299,6 +309,61 @@ static int mana_rss_hash_conf_get(struct rte_eth_dev *dev, return 0; } +static int mana_dev_rx_queue_setup(struct rte_eth_dev *dev, + uint16_t queue_idx, uint16_t nb_desc, + unsigned int socket_id, + const struct rte_eth_rxconf *rx_conf __rte_unused, + struct rte_mempool *mp) +{ + struct mana_priv *priv = dev->data->dev_private; + struct mana_rxq *rxq; + int ret; + + rxq = rte_zmalloc_socket("mana_rxq", sizeof(*rxq), 0, socket_id); + if (!rxq) { + DRV_LOG(ERR, "failed to allocate rxq"); + return -ENOMEM; + } + + DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u", + queue_idx, nb_desc, socket_id); + + rxq->socket = socket_id; + + rxq->desc_ring = rte_zmalloc_socket("mana_rx_mbuf_ring", + sizeof(struct mana_rxq_desc) * + nb_desc, + RTE_CACHE_LINE_SIZE, socket_id); + + if (!rxq->desc_ring) { + DRV_LOG(ERR, "failed to allocate rxq desc_ring"); + ret = -ENOMEM; + goto fail; + } + + rxq->num_desc = nb_desc; + + rxq->priv = priv; + rxq->num_desc = nb_desc; + rxq->mp = mp; + dev->data->rx_queues[queue_idx] = rxq; + + return 0; + +fail: + rte_free(rxq->desc_ring); + rte_free(rxq); + return ret; +} + +static void mana_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t qid) +{ + struct mana_rxq *rxq = dev->data->rx_queues[qid]; + + rte_free(rxq->desc_ring); + rte_free(rxq); +} + static int mana_dev_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused) { @@ -319,9 +384,12 @@ const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_close = mana_dev_close, .dev_infos_get = mana_dev_info_get, + .rxq_info_get = mana_dev_rx_queue_info, .dev_supported_ptypes_get = mana_supported_ptypes, .rss_hash_update = mana_rss_hash_update, .rss_hash_conf_get = mana_rss_hash_conf_get, + .rx_queue_setup = mana_dev_rx_queue_setup, + .rx_queue_release = mana_dev_rx_queue_release, .link_update = mana_dev_link_update, }; From patchwork Fri Jul 1 09:02:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113612 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 39C90A00C2; Fri, 1 Jul 2022 11:04:04 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A80C142B87; Fri, 1 Jul 2022 11:03:13 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 9456F42670 for ; Fri, 1 Jul 2022 11:03:07 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 4F6EF20D591D; Fri, 1 
Jul 2022 02:03:07 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 4F6EF20D591D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666187; bh=Q5A9vfJ4RQe/BLRln+1enS2JMyIoL7pgEqPcDTpJS+o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=IfDLTHIla61TuGGVT7jj1iQczE5VoVBOEGwONZhxz7FAayclVrU0aLLRV5tS5oKCv qM9NeyiExFpAASkqOYF65khTYXvIIwdb9YxTFt/ls7mFAbiwhtS1N0WEXieHaLGZ9I gZjuQ2x5vtITi21HVlDf/VVfRuSixMeSG8vN2HPc= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 09/17] net/mana: add function to configure TX queues Date: Fri, 1 Jul 2022 02:02:39 -0700 Message-Id: <1656666167-26035-10-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li TX hardware queue is allocated when starting the queue, this is for pre configuration. Signed-off-by: Long Li --- drivers/net/mana/mana.c | 65 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 951fc418b6..6b1c3ee035 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -225,6 +225,15 @@ static int mana_dev_info_get(struct rte_eth_dev *dev, return 0; } +static void mana_dev_tx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id, + struct rte_eth_txq_info *qinfo) +{ + struct mana_txq *txq = dev->data->tx_queues[queue_id]; + + qinfo->conf.offloads = dev->data->dev_conf.txmode.offloads; + qinfo->nb_desc = txq->num_desc; +} + static void mana_dev_rx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id, struct rte_eth_rxq_info *qinfo) { @@ -309,6 +318,59 @@ static int mana_rss_hash_conf_get(struct rte_eth_dev *dev, return 0; } +static int mana_dev_tx_queue_setup(struct rte_eth_dev *dev, + uint16_t queue_idx, uint16_t nb_desc, + unsigned int socket_id, + const struct rte_eth_txconf *tx_conf __rte_unused) + +{ + struct mana_priv *priv = dev->data->dev_private; + struct mana_txq *txq; + int ret; + + txq = rte_zmalloc_socket("mana_txq", sizeof(*txq), 0, socket_id); + if (!txq) { + DRV_LOG(ERR, "failed to allocate txq"); + return -ENOMEM; + } + + txq->socket = socket_id; + + txq->desc_ring = rte_malloc_socket("mana_tx_desc_ring", + sizeof(struct mana_txq_desc) * + nb_desc, + RTE_CACHE_LINE_SIZE, socket_id); + if (!txq->desc_ring) { + DRV_LOG(ERR, "failed to allocate txq desc_ring"); + ret = -ENOMEM; + goto fail; + } + + DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u txq->desc_ring %p", + queue_idx, nb_desc, socket_id, txq->desc_ring); + + txq->desc_ring_head = 0; + txq->desc_ring_tail = 0; + txq->priv = priv; + txq->num_desc = nb_desc; + dev->data->tx_queues[queue_idx] = txq; + + return 0; + +fail: + rte_free(txq->desc_ring); + rte_free(txq); + return ret; +} + +static void mana_dev_tx_queue_release(struct rte_eth_dev *dev, uint16_t qid) +{ + struct mana_txq *txq = dev->data->tx_queues[qid]; + + rte_free(txq->desc_ring); + rte_free(txq); +} + static int mana_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx, uint16_t nb_desc, unsigned int socket_id, @@ -384,10 +446,13 @@ const struct 
eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_close = mana_dev_close, .dev_infos_get = mana_dev_info_get, + .txq_info_get = mana_dev_tx_queue_info, .rxq_info_get = mana_dev_rx_queue_info, .dev_supported_ptypes_get = mana_supported_ptypes, .rss_hash_update = mana_rss_hash_update, .rss_hash_conf_get = mana_rss_hash_conf_get, + .tx_queue_setup = mana_dev_tx_queue_setup, + .tx_queue_release = mana_dev_tx_queue_release, .rx_queue_setup = mana_dev_rx_queue_setup, .rx_queue_release = mana_dev_rx_queue_release, .link_update = mana_dev_link_update, From patchwork Fri Jul 1 09:02:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113613 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 29CD4A00C2; Fri, 1 Jul 2022 11:04:10 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AAA9B42B8B; Fri, 1 Jul 2022 11:03:14 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 1B8F242B70 for ; Fri, 1 Jul 2022 11:03:08 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id CAF6420D5F19; Fri, 1 Jul 2022 02:03:07 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com CAF6420D5F19 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666187; bh=OakLvaiZ0cpGTchFe8JoVLtbG1SZ9wqLH/FypIzB5zo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=ZczvL/N3i+do+GP4IRro/DmMv6LCQ4XJ4h/aM72YsdlbZ1yv8zbORImYxyPYCK4iY 5sS0w9qItfbF6tqzR9lBXr+ZHDmO3FCUNDAPAuD4ecckAd1u6qhkWwkDtGG7kCQlnB wAxhG+H77C0+/lVCSaXJQauq5TzRtLcfq9q/bPkw= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 10/17] net/mana: implement memory registration Date: Fri, 1 Jul 2022 02:02:40 -0700 Message-Id: <1656666167-26035-11-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li MANA hardware has iommu built-in, that provides hardware safe access to user memory through memory registration. Since memory registration is an expensive operation, this patch implements a two level memory registartion cache mechanisum for each queue and for each port. 
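To make the lookup order concrete, here is a datapath-side sketch (illustrative only, not code from this series; txq, priv and mbuf are assumed locals of a burst function) of how an lkey is resolved through the two cache levels:

/* find_pmd_mr() checks the per-queue b-tree first (no locking, the queue
 * is only touched by its own lcore), then the per-port b-tree under
 * priv->mr_list_lock, and registers a new verbs MR only on a full miss.
 */
struct mana_mr_cache *mr;

mr = find_pmd_mr(&txq->mr_btree, priv, mbuf);
if (!mr) {
	/* registration failed or the b-tree overflowed: drop the packet */
	rte_pktmbuf_free(mbuf);
	return 0;
}

/* mr->lkey is what gets programmed into the WQE scatter/gather entry */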
Signed-off-by: Long Li --- drivers/net/mana/mana.c | 20 +++ drivers/net/mana/mana.h | 38 ++++ drivers/net/mana/meson.build | 1 + drivers/net/mana/mp.c | 85 +++++++++ drivers/net/mana/mr.c | 339 +++++++++++++++++++++++++++++++++++ 5 files changed, 483 insertions(+) create mode 100644 drivers/net/mana/mr.c diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 6b1c3ee035..6c8983cd6a 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -132,6 +132,8 @@ mana_dev_close(struct rte_eth_dev *dev) struct mana_priv *priv = dev->data->dev_private; int ret; + remove_all_mr(priv); + ret = mana_intr_uninstall(priv); if (ret) return ret; @@ -346,6 +348,13 @@ static int mana_dev_tx_queue_setup(struct rte_eth_dev *dev, goto fail; } + ret = mana_mr_btree_init(&txq->mr_btree, + MANA_MR_BTREE_PER_QUEUE_N, 0); + if (ret) { + DRV_LOG(ERR, "Failed to init TXQ MR btree"); + goto fail; + } + DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u txq->desc_ring %p", queue_idx, nb_desc, socket_id, txq->desc_ring); @@ -367,6 +376,8 @@ static void mana_dev_tx_queue_release(struct rte_eth_dev *dev, uint16_t qid) { struct mana_txq *txq = dev->data->tx_queues[qid]; + mana_mr_btree_free(&txq->mr_btree); + rte_free(txq->desc_ring); rte_free(txq); } @@ -403,6 +414,13 @@ static int mana_dev_rx_queue_setup(struct rte_eth_dev *dev, goto fail; } + ret = mana_mr_btree_init(&rxq->mr_btree, + MANA_MR_BTREE_PER_QUEUE_N, socket_id); + if (ret) { + DRV_LOG(ERR, "Failed to init RXQ MR btree"); + goto fail; + } + rxq->num_desc = nb_desc; rxq->priv = priv; @@ -422,6 +440,8 @@ static void mana_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t qid) { struct mana_rxq *rxq = dev->data->rx_queues[qid]; + mana_mr_btree_free(&rxq->mr_btree); + rte_free(rxq->desc_ring); rte_free(rxq); } diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 1efb2330ee..b1ef9ce60b 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -50,6 +50,22 @@ struct mana_shared_data { #define MAX_RECEIVE_BUFFERS_PER_QUEUE 256 #define MAX_SEND_BUFFERS_PER_QUEUE 256 +struct mana_mr_cache { + uint32_t lkey; + uintptr_t addr; + size_t len; + void *verb_obj; +}; + +#define MANA_MR_BTREE_CACHE_N 512 +struct mana_mr_btree { + uint16_t len; /* Used entries */ + uint16_t size; /* Total entries */ + int overflow; + int socket; + struct mana_mr_cache *table; +}; + struct mana_process_priv { void *db_page; }; @@ -82,6 +98,7 @@ struct mana_priv { int max_recv_sge; int max_mr; uint64_t max_mr_size; + struct mana_mr_btree mr_btree; rte_rwlock_t mr_list_lock; }; @@ -132,6 +149,7 @@ struct mana_txq { uint32_t desc_ring_head, desc_ring_tail; struct mana_stats stats; + struct mana_mr_btree mr_btree; unsigned int socket; }; @@ -154,6 +172,7 @@ struct mana_rxq { struct mana_gdma_queue gdma_cq; struct mana_stats stats; + struct mana_mr_btree mr_btree; unsigned int socket; }; @@ -177,6 +196,24 @@ uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); +struct mana_mr_cache *find_pmd_mr(struct mana_mr_btree *local_tree, + struct mana_priv *priv, + struct rte_mbuf *mbuf); +int new_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv, + struct rte_mempool *pool); +void remove_all_mr(struct mana_priv *priv); +void del_pmd_mr(struct mana_mr_cache *mr); + +void mana_mempool_chunk_cb(struct rte_mempool *mp, void *opaque, + struct rte_mempool_memhdr *memhdr, unsigned int idx); + +struct mana_mr_cache *mana_mr_btree_lookup(struct 
mana_mr_btree *bt, + uint16_t *idx, + uintptr_t addr, size_t len); +int mana_mr_btree_insert(struct mana_mr_btree *bt, struct mana_mr_cache *entry); +int mana_mr_btree_init(struct mana_mr_btree *bt, int n, int socket); +void mana_mr_btree_free(struct mana_mr_btree *bt); + /** Request timeout for IPC. */ #define MANA_MP_REQ_TIMEOUT_SEC 5 @@ -205,6 +242,7 @@ int mana_mp_init_secondary(void); void mana_mp_uninit_primary(void); void mana_mp_uninit_secondary(void); int mana_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev); +int mana_mp_req_mr_create(struct mana_priv *priv, uintptr_t addr, uint32_t len); void mana_mp_req_on_rxtx(struct rte_eth_dev *dev, enum mana_mp_req_type type); diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build index 7ab34c253c..fc0dbaabb3 100644 --- a/drivers/net/mana/meson.build +++ b/drivers/net/mana/meson.build @@ -11,6 +11,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs'] sources += files( 'mana.c', + 'mr.c', 'mp.c', ) diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c index b2f5f7ab49..9cb3c09d32 100644 --- a/drivers/net/mana/mp.c +++ b/drivers/net/mana/mp.c @@ -34,6 +34,52 @@ extern struct mana_shared_data *mana_shared_data; +static int mana_mp_mr_create(struct mana_priv *priv, uintptr_t addr, + uint32_t len) +{ + struct ibv_mr *ibv_mr; + int ret; + struct mana_mr_cache *mr; + + ibv_mr = ibv_reg_mr(priv->ib_pd, (void *)addr, len, + IBV_ACCESS_LOCAL_WRITE); + + if (!ibv_mr) + return -errno; + + DRV_LOG(DEBUG, "MR (2nd) lkey %u addr %px len 0x%lx", + ibv_mr->lkey, ibv_mr->addr, ibv_mr->length); + + mr = rte_calloc("MANA MR", 1, sizeof(*mr), 0); + if (!mr) { + DRV_LOG(ERR, "(2nd) Failed to allocate MR"); + ret = -ENOMEM; + goto fail_alloc; + } + mr->lkey = ibv_mr->lkey; + mr->addr = (uintptr_t)ibv_mr->addr; + mr->len = ibv_mr->length; + mr->verb_obj = ibv_mr; + + rte_rwlock_write_lock(&priv->mr_list_lock); + ret = mana_mr_btree_insert(&priv->mr_btree, mr); + rte_rwlock_write_unlock(&priv->mr_list_lock); + if (ret) { + DRV_LOG(ERR, "(2nd) Failed to add to global MR btree"); + goto fail_btree; + } + + return 0; + +fail_btree: + rte_free(mr); + +fail_alloc: + ibv_dereg_mr(ibv_mr); + + return ret; +} + static void mp_init_msg(struct rte_mp_msg *msg, enum mana_mp_req_type type, int port_id) { @@ -69,6 +115,12 @@ static int mana_mp_primary_handle(const struct rte_mp_msg *mp_msg, mp_init_msg(&mp_res, param->type, param->port_id); switch (param->type) { + case MANA_MP_REQ_CREATE_MR: + ret = mana_mp_mr_create(priv, param->addr, param->len); + res->result = ret; + ret = rte_mp_reply(&mp_res, peer); + break; + case MANA_MP_REQ_VERBS_CMD_FD: mp_res.num_fds = 1; mp_res.fds[0] = priv->ib_ctx->cmd_fd; @@ -211,6 +263,39 @@ int mana_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev) return ret; } +int mana_mp_req_mr_create(struct mana_priv *priv, uintptr_t addr, uint32_t len) +{ + struct rte_mp_msg mp_req = { 0 }; + struct rte_mp_msg *mp_res; + struct rte_mp_reply mp_rep; + struct mana_mp_param *req = (struct mana_mp_param *)mp_req.param; + struct mana_mp_param *res; + struct timespec ts = {.tv_sec = MANA_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0}; + int ret; + + mp_init_msg(&mp_req, MANA_MP_REQ_CREATE_MR, priv->port_id); + req->addr = addr; + req->len = len; + + ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts); + if (ret) { + DRV_LOG(ERR, "Port %u request to primary failed", + req->port_id); + return ret; + } + + if (mp_rep.nb_received != 1) + return -EPROTO; + + mp_res = &mp_rep.msgs[0]; + res = (struct mana_mp_param *)mp_res->param; + ret = res->result; + + 
free(mp_rep.msgs); + + return ret; +} + void mana_mp_req_on_rxtx(struct rte_eth_dev *dev, enum mana_mp_req_type type) { struct rte_mp_msg mp_req = { 0 }; diff --git a/drivers/net/mana/mr.c b/drivers/net/mana/mr.c new file mode 100644 index 0000000000..926b3a6ebc --- /dev/null +++ b/drivers/net/mana/mr.c @@ -0,0 +1,339 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "mana.h" + +struct mana_range { + uintptr_t start; + uintptr_t end; + uint32_t len; +}; + +void mana_mempool_chunk_cb(struct rte_mempool *mp __rte_unused, void *opaque, + struct rte_mempool_memhdr *memhdr, unsigned int idx) +{ + struct mana_range *ranges = opaque; + struct mana_range *range = &ranges[idx]; + uint64_t page_size = rte_mem_page_size(); + + range->start = RTE_ALIGN_FLOOR((uintptr_t)memhdr->addr, page_size); + range->end = RTE_ALIGN_CEIL((uintptr_t)memhdr->addr + memhdr->len, + page_size); + range->len = range->end - range->start; +} + +int new_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv, + struct rte_mempool *pool) +{ + struct ibv_mr *ibv_mr; + struct mana_range ranges[pool->nb_mem_chunks]; + uint32_t i; + struct mana_mr_cache *mr; + int ret; + + rte_mempool_mem_iter(pool, mana_mempool_chunk_cb, ranges); + + for (i = 0; i < pool->nb_mem_chunks; i++) { + if (ranges[i].len > priv->max_mr_size) { + DRV_LOG(ERR, "memory chunk size %u exceeding max MR\n", + ranges[i].len); + return -ENOMEM; + } + + DRV_LOG(DEBUG, "registering memory chunk start 0x%lx len 0x%x", + ranges[i].start, ranges[i].len); + + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + /* Send a message to the primary to do MR */ + ret = mana_mp_req_mr_create(priv, ranges[i].start, + ranges[i].len); + if (ret) { + DRV_LOG(ERR, "MR failed start 0x%lx len 0x%x", + ranges[i].start, ranges[i].len); + return ret; + } + continue; + } + + ibv_mr = ibv_reg_mr(priv->ib_pd, (void *)ranges[i].start, + ranges[i].len, IBV_ACCESS_LOCAL_WRITE); + if (ibv_mr) { + DRV_LOG(DEBUG, "MR lkey %u addr %px len 0x%lx", + ibv_mr->lkey, ibv_mr->addr, ibv_mr->length); + + mr = rte_calloc("MANA MR", 1, sizeof(*mr), 0); + mr->lkey = ibv_mr->lkey; + mr->addr = (uintptr_t)ibv_mr->addr; + mr->len = ibv_mr->length; + mr->verb_obj = ibv_mr; + + rte_rwlock_write_lock(&priv->mr_list_lock); + ret = mana_mr_btree_insert(&priv->mr_btree, mr); + rte_rwlock_write_unlock(&priv->mr_list_lock); + if (ret) { + ibv_dereg_mr(ibv_mr); + DRV_LOG(ERR, "Failed to add to global MR btree"); + return ret; + } + + ret = mana_mr_btree_insert(local_tree, mr); + if (ret) { + /* Don't need to clean up MR as it's already + * in the global tree + */ + DRV_LOG(ERR, "Failed to add to local MR btree"); + return ret; + } + } else { + DRV_LOG(ERR, "MR failed start 0x%lx len 0x%x", + ranges[i].start, ranges[i].len); + return -errno; + } + } + return 0; +} + +void del_pmd_mr(struct mana_mr_cache *mr) +{ + int ret; + struct ibv_mr *ibv_mr = (struct ibv_mr *)mr->verb_obj; + + ret = ibv_dereg_mr(ibv_mr); + if (ret) + DRV_LOG(ERR, "dereg MR failed ret %d", ret); +} + +struct mana_mr_cache *find_pmd_mr(struct mana_mr_btree *local_mr_btree, + struct mana_priv *priv, struct rte_mbuf *mbuf) +{ + struct rte_mempool *pool = mbuf->pool; + int ret, second_try = 0; + struct mana_mr_cache *mr; + uint16_t idx; + 
+ DRV_LOG(DEBUG, "finding mr for mbuf addr %p len %d", + mbuf->buf_addr, mbuf->buf_len); + +try_again: + /* First try to find the MR in local queue tree */ + mr = mana_mr_btree_lookup(local_mr_btree, &idx, + (uintptr_t)mbuf->buf_addr, mbuf->buf_len); + if (mr) { + DRV_LOG(DEBUG, "Local mr lkey %u addr %lx len %lu", + mr->lkey, mr->addr, mr->len); + return mr; + } + + /* If not found, try to find the MR in global tree */ + rte_rwlock_read_lock(&priv->mr_list_lock); + mr = mana_mr_btree_lookup(&priv->mr_btree, &idx, + (uintptr_t)mbuf->buf_addr, + mbuf->buf_len); + rte_rwlock_read_unlock(&priv->mr_list_lock); + + /* If found in the global tree, add it to the local tree */ + if (mr) { + ret = mana_mr_btree_insert(local_mr_btree, mr); + if (ret) { + DRV_LOG(DEBUG, "Failed to add MR to local tree."); + return NULL; + } + + DRV_LOG(DEBUG, "Added local mr lkey %u addr %lx len %lu", + mr->lkey, mr->addr, mr->len); + return mr; + } + + if (second_try) { + DRV_LOG(ERR, "Internal error second try failed"); + return NULL; + } + + ret = new_pmd_mr(local_mr_btree, priv, pool); + if (ret) { + DRV_LOG(ERR, "Failed to allocate MR ret %d addr %p len %d", + ret, mbuf->buf_addr, mbuf->buf_len); + return NULL; + } + + second_try = 1; + goto try_again; +} + +void remove_all_mr(struct mana_priv *priv) +{ + struct mana_mr_btree *bt = &priv->mr_btree; + struct mana_mr_cache *mr; + struct ibv_mr *ibv_mr; + uint16_t i; + + rte_rwlock_write_lock(&priv->mr_list_lock); + /* Start with index 1 as the 1st entry is always NULL */ + for (i = 1; i < bt->len; i++) { + mr = &bt->table[i]; + ibv_mr = mr->verb_obj; + ibv_dereg_mr(ibv_mr); + } + bt->len = 1; + rte_rwlock_write_unlock(&priv->mr_list_lock); +} + +static int mana_mr_btree_expand(struct mana_mr_btree *bt, int n) +{ + void *mem; + + mem = rte_realloc_socket(bt->table, n * sizeof(struct mana_mr_cache), + 0, bt->socket); + if (!mem) { + DRV_LOG(ERR, "Failed to expand btree size %d", n); + return -1; + } + + DRV_LOG(ERR, "Expanded btree to size %d", n); + bt->table = mem; + bt->size = n; + + return 0; +} + +struct mana_mr_cache *mana_mr_btree_lookup(struct mana_mr_btree *bt, + uint16_t *idx, + uintptr_t addr, size_t len) +{ + struct mana_mr_cache *table; + uint16_t n; + uint16_t base = 0; + int ret; + + n = bt->len; + + /* Try to double the cache if it's full */ + if (n == bt->size) { + ret = mana_mr_btree_expand(bt, bt->size << 1); + if (ret) + return NULL; + } + + table = bt->table; + + /* Do binary search on addr */ + do { + uint16_t delta = n >> 1; + + if (addr < table[base + delta].addr) { + n = delta; + } else { + base += delta; + n -= delta; + } + } while (n > 1); + + *idx = base; + + if (addr + len <= table[base].addr + table[base].len) + return &table[base]; + + DRV_LOG(DEBUG, "addr %lx len %lu idx %u sum %lx not found", + addr, len, *idx, addr + len); + + return NULL; +} + +int mana_mr_btree_init(struct mana_mr_btree *bt, int n, int socket) +{ + memset(bt, 0, sizeof(*bt)); + bt->table = rte_calloc_socket("MANA B-tree table", + n, + sizeof(struct mana_mr_cache), + 0, socket); + if (!bt->table) { + DRV_LOG(ERR, "Failed to allocate B-tree n %d socket %d", + n, socket); + return -ENOMEM; + } + + bt->socket = socket; + bt->size = n; + + /* First entry must be NULL for binary search to work */ + bt->table[0] = (struct mana_mr_cache) { + .lkey = UINT32_MAX, + }; + bt->len = 1; + + DRV_LOG(ERR, "B-tree initialized table %p size %d len %d", + bt->table, n, bt->len); + + return 0; +} + +void mana_mr_btree_free(struct mana_mr_btree *bt) +{ + rte_free(bt->table); + 
memset(bt, 0, sizeof(*bt)); +} + +int mana_mr_btree_insert(struct mana_mr_btree *bt, struct mana_mr_cache *entry) +{ + struct mana_mr_cache *table; + uint16_t idx = 0; + uint16_t shift; + + if (mana_mr_btree_lookup(bt, &idx, entry->addr, entry->len)) { + DRV_LOG(DEBUG, "Addr %lx len %lu exists in btree", + entry->addr, entry->len); + return 0; + } + + if (bt->len >= bt->size) { + bt->overflow = 1; + return -1; + } + + table = bt->table; + + idx++; + shift = (bt->len - idx) * sizeof(struct mana_mr_cache); + if (shift) { + DRV_LOG(DEBUG, "Moving %u bytes from idx %u to %u", + shift, idx, idx + 1); + memmove(&table[idx + 1], &table[idx], shift); + } + + table[idx] = *entry; + bt->len++; + + DRV_LOG(DEBUG, "Inserted MR b-tree table %p idx %d addr %lx len %lu", + table, idx, entry->addr, entry->len); + + return 0; +} From patchwork Fri Jul 1 09:02:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113614 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id ADB3AA00C2; Fri, 1 Jul 2022 11:04:16 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C7EBF42B90; Fri, 1 Jul 2022 11:03:15 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 9848E42B71 for ; Fri, 1 Jul 2022 11:03:08 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 53B9220D4D79; Fri, 1 Jul 2022 02:03:08 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 53B9220D4D79 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666188; bh=K9hkJL9flEPQO/8wunzVGKO5rERYB99+dz1kF1XyTys=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=COv6mptCe/+u9MY5EhHVnHM5lV9Ck3bvFHG2NP7QcgJVWqsXgnmv780lMWJC6xbE+ 5LkW4aWD8BHsk4K56CBkstLk0mDwCC3ZKIu9aTitq4WKIMUiWfzWGx2UtIYlKUgIqZ 1lCQmgy2/gDsIqyE4+lx8IUEyuSzHV5znJnKlS4Y= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 11/17] net/mana: implement the hardware layer operations Date: Fri, 1 Jul 2022 02:02:41 -0700 Message-Id: <1656666167-26035-12-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li The hardware layer of MANA understands the device queue and doorbell formats. Those functions are implemented for use by packet RX/TX code. 
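As a rough illustration of how the RX/TX code is expected to drive these helpers (simplified sketch only; buf_iova, lkey, data_len and tx_oob are placeholders for values the datapath would supply):

struct gdma_sgl_element sge = {
	.address = buf_iova,        /* placeholder: IOVA of the packet buffer */
	.memory_key = lkey,         /* placeholder: lkey from the MR cache */
	.size = data_len,           /* placeholder: bytes to send */
};
struct gdma_work_request req = {
	.sgl = &sge,
	.num_sgl_elements = 1,
	.inline_oob_data = &tx_oob,                 /* placeholder client OOB */
	.inline_oob_size_in_bytes = sizeof(tx_oob),
	.client_data_unit = NOT_USING_CLIENT_DATA_UNIT,
};
struct gdma_posted_wqe_info info = {0};

/* Post the WQE into the send work queue, then ring its doorbell */
if (gdma_post_work_request(&txq->gdma_sq, &req, &info) == 0)
	mana_ring_doorbell(priv->db_page, gdma_queue_send, txq->gdma_sq.id,
			   txq->gdma_sq.head * GDMA_WQE_ALIGNMENT_UNIT_SIZE);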
Signed-off-by: Long Li --- drivers/net/mana/gdma.c | 309 +++++++++++++++++++++++++++++++++++ drivers/net/mana/mana.h | 183 +++++++++++++++++++++ drivers/net/mana/meson.build | 1 + 3 files changed, 493 insertions(+) create mode 100644 drivers/net/mana/gdma.c diff --git a/drivers/net/mana/gdma.c b/drivers/net/mana/gdma.c new file mode 100644 index 0000000000..c86ee69bdd --- /dev/null +++ b/drivers/net/mana/gdma.c @@ -0,0 +1,309 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "mana.h" + +uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue) +{ + uint32_t offset_in_bytes = + (queue->head * GDMA_WQE_ALIGNMENT_UNIT_SIZE) & + (queue->size - 1); + + DRV_LOG(DEBUG, "txq sq_head %u sq_size %u offset_in_bytes %u", + queue->head, queue->size, offset_in_bytes); + + if (offset_in_bytes + GDMA_WQE_ALIGNMENT_UNIT_SIZE > queue->size) + DRV_LOG(ERR, "fatal error: offset_in_bytes %u too big", + offset_in_bytes); + + return ((uint8_t *)queue->buffer) + offset_in_bytes; +} + +static uint32_t +write_dma_client_oob(uint8_t *work_queue_buffer_pointer, + const struct gdma_work_request *work_request, + uint32_t client_oob_size) +{ + uint8_t *p = work_queue_buffer_pointer; + + struct gdma_wqe_dma_oob *header = (struct gdma_wqe_dma_oob *)p; + + memset(header, 0, sizeof(struct gdma_wqe_dma_oob)); + header->num_sgl_entries = work_request->num_sgl_elements; + header->inline_client_oob_size_in_dwords = + client_oob_size / sizeof(uint32_t); + header->client_data_unit = work_request->client_data_unit; + + DRV_LOG(DEBUG, "queue buf %p sgl %u oob_h %u du %u oob_buf %p oob_b %u", + work_queue_buffer_pointer, header->num_sgl_entries, + header->inline_client_oob_size_in_dwords, + header->client_data_unit, work_request->inline_oob_data, + work_request->inline_oob_size_in_bytes); + + p += sizeof(struct gdma_wqe_dma_oob); + if (work_request->inline_oob_data && + work_request->inline_oob_size_in_bytes > 0) { + memcpy(p, work_request->inline_oob_data, + work_request->inline_oob_size_in_bytes); + if (client_oob_size > work_request->inline_oob_size_in_bytes) + memset(p + work_request->inline_oob_size_in_bytes, 0, + client_oob_size - + work_request->inline_oob_size_in_bytes); + } + + return sizeof(struct gdma_wqe_dma_oob) + client_oob_size; +} + +static uint32_t +write_scatter_gather_list(uint8_t *work_queue_head_pointer, + uint8_t *work_queue_end_pointer, + uint8_t *work_queue_cur_pointer, + struct gdma_work_request *work_request) +{ + struct gdma_sgl_element *sge_list; + struct gdma_sgl_element dummy_sgl[1]; + uint8_t *address; + uint32_t size; + uint32_t num_sge; + uint32_t size_to_queue_end; + uint32_t sge_list_size; + + DRV_LOG(DEBUG, "work_queue_cur_pointer %p work_request->flags %x", + work_queue_cur_pointer, work_request->flags); + + num_sge = work_request->num_sgl_elements; + sge_list = work_request->sgl; + size_to_queue_end = (uint32_t)(work_queue_end_pointer - + work_queue_cur_pointer); + + if (num_sge == 0) { + /* Per spec, the case of an empty SGL should be handled as + * follows to avoid corrupted WQE errors: + * Write one dummy SGL entry + * Set the address to 1, leave the rest as 0 + */ + dummy_sgl[num_sge].address = 1; + dummy_sgl[num_sge].size = 0; + dummy_sgl[num_sge].memory_key = 0; + num_sge++; + 
sge_list = dummy_sgl; + } + + sge_list_size = 0; + { + address = (uint8_t *)sge_list; + size = sizeof(struct gdma_sgl_element) * num_sge; + if (size_to_queue_end < size) { + memcpy(work_queue_cur_pointer, address, + size_to_queue_end); + work_queue_cur_pointer = work_queue_head_pointer; + address += size_to_queue_end; + size -= size_to_queue_end; + } + + memcpy(work_queue_cur_pointer, address, size); + sge_list_size = size; + } + + DRV_LOG(DEBUG, "sge %u address 0x%lx size %u key %u sge_list_size %u", + num_sge, sge_list->address, sge_list->size, + sge_list->memory_key, sge_list_size); + + return sge_list_size; +} + +int gdma_post_work_request(struct mana_gdma_queue *queue, + struct gdma_work_request *work_req, + struct gdma_posted_wqe_info *wqe_info) +{ + uint32_t client_oob_size = + work_req->inline_oob_size_in_bytes > + INLINE_OOB_SMALL_SIZE_IN_BYTES ? + INLINE_OOB_LARGE_SIZE_IN_BYTES : + INLINE_OOB_SMALL_SIZE_IN_BYTES; + + uint32_t sgl_data_size = sizeof(struct gdma_sgl_element) * + RTE_MAX((uint32_t)1, work_req->num_sgl_elements); + uint32_t wqe_size = + RTE_ALIGN(sizeof(struct gdma_wqe_dma_oob) + + client_oob_size + sgl_data_size, + GDMA_WQE_ALIGNMENT_UNIT_SIZE); + uint8_t *wq_buffer_pointer; + uint32_t queue_free_units = queue->count - (queue->head - queue->tail); + + if (wqe_size / GDMA_WQE_ALIGNMENT_UNIT_SIZE > queue_free_units) { + DRV_LOG(DEBUG, "WQE size %u queue count %u head %u tail %u", + wqe_size, queue->count, queue->head, queue->tail); + return -EBUSY; + } + + DRV_LOG(DEBUG, "client_oob_size %u sgl_data_size %u wqe_size %u", + client_oob_size, sgl_data_size, wqe_size); + + if (wqe_info) { + wqe_info->wqe_index = + ((queue->head * GDMA_WQE_ALIGNMENT_UNIT_SIZE) & + (queue->size - 1)) / GDMA_WQE_ALIGNMENT_UNIT_SIZE; + wqe_info->unmasked_queue_offset = queue->head; + wqe_info->wqe_size_in_bu = + wqe_size / GDMA_WQE_ALIGNMENT_UNIT_SIZE; + } + + wq_buffer_pointer = gdma_get_wqe_pointer(queue); + wq_buffer_pointer += write_dma_client_oob(wq_buffer_pointer, work_req, + client_oob_size); + if (wq_buffer_pointer >= ((uint8_t *)queue->buffer) + queue->size) + wq_buffer_pointer -= queue->size; + + write_scatter_gather_list((uint8_t *)queue->buffer, + (uint8_t *)queue->buffer + queue->size, + wq_buffer_pointer, work_req); + + queue->head += wqe_size / GDMA_WQE_ALIGNMENT_UNIT_SIZE; + + return 0; +} + +union gdma_doorbell_entry { + uint64_t as_uint64; + + struct { + uint64_t id : 24; + uint64_t reserved : 8; + uint64_t tail_ptr : 31; + uint64_t arm : 1; + } cq; + + struct { + uint64_t id : 24; + uint64_t wqe_cnt : 8; + uint64_t tail_ptr : 32; + } rq; + + struct { + uint64_t id : 24; + uint64_t reserved : 8; + uint64_t tail_ptr : 32; + } sq; + + struct { + uint64_t id : 16; + uint64_t reserved : 16; + uint64_t tail_ptr : 31; + uint64_t arm : 1; + } eq; +}; /* HW DATA */ + +#define DOORBELL_OFFSET_SQ 0x0 +#define DOORBELL_OFFSET_RQ 0x400 +#define DOORBELL_OFFSET_CQ 0x800 +#define DOORBELL_OFFSET_EQ 0xFF8 + +int mana_ring_doorbell(void *db_page, enum gdma_queue_types queue_type, + uint32_t queue_id, uint32_t tail) +{ + uint8_t *addr = db_page; + union gdma_doorbell_entry e = {}; + + switch (queue_type) { + case gdma_queue_send: + e.sq.id = queue_id; + e.sq.tail_ptr = tail; + addr += DOORBELL_OFFSET_SQ; + break; + + case gdma_queue_receive: + e.rq.id = queue_id; + e.rq.tail_ptr = tail; + e.rq.wqe_cnt = 1; + addr += DOORBELL_OFFSET_RQ; + break; + + case gdma_queue_completion: + e.cq.id = queue_id; + e.cq.tail_ptr = tail; + e.cq.arm = 1; + addr += DOORBELL_OFFSET_CQ; + break; + + default: + 
DRV_LOG(ERR, "Unsupported queue type %d", queue_type); + return -1; + } + + rte_wmb(); + DRV_LOG(DEBUG, "db_page %p addr %p queue_id %u type %u tail %u", + db_page, addr, queue_id, queue_type, tail); + + rte_write64(e.as_uint64, addr); + return 0; +} + +int gdma_poll_completion_queue(struct mana_gdma_queue *cq, + struct gdma_comp *comp) +{ + struct gdma_hardware_completion_entry *cqe; + uint32_t head = cq->head % cq->count; + uint32_t new_owner_bits, old_owner_bits; + uint32_t cqe_owner_bits; + struct gdma_hardware_completion_entry *buffer = cq->buffer; + + cqe = &buffer[head]; + new_owner_bits = (cq->head / cq->count) & COMPLETION_QUEUE_OWNER_MASK; + old_owner_bits = (cq->head / cq->count - 1) & + COMPLETION_QUEUE_OWNER_MASK; + cqe_owner_bits = cqe->owner_bits; + + DRV_LOG(DEBUG, "comp cqe bits 0x%x owner bits 0x%x", + cqe_owner_bits, old_owner_bits); + + if (cqe_owner_bits == old_owner_bits) + return 0; /* No new entry */ + + if (cqe_owner_bits != new_owner_bits) { + DRV_LOG(ERR, "CQ overflowed, ID %u cqe 0x%x new 0x%x", + cq->id, cqe_owner_bits, new_owner_bits); + return -1; + } + + comp->work_queue_number = cqe->wq_num; + comp->send_work_queue = cqe->is_sq; + + memcpy(comp->completion_data, cqe->dma_client_data, GDMA_COMP_DATA_SIZE); + + cq->head++; + + DRV_LOG(DEBUG, "comp new 0x%x old 0x%x cqe 0x%x wq %u sq %u head %u", + new_owner_bits, old_owner_bits, cqe_owner_bits, + comp->work_queue_number, comp->send_work_queue, cq->head); + return 1; +} diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index b1ef9ce60b..1847902054 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -50,6 +50,178 @@ struct mana_shared_data { #define MAX_RECEIVE_BUFFERS_PER_QUEUE 256 #define MAX_SEND_BUFFERS_PER_QUEUE 256 +#define GDMA_WQE_ALIGNMENT_UNIT_SIZE 32 + +#define COMP_ENTRY_SIZE 64 +#define MAX_TX_WQE_SIZE 512 +#define MAX_RX_WQE_SIZE 256 + +/* Values from the GDMA specification document, WQE format description */ +#define INLINE_OOB_SMALL_SIZE_IN_BYTES 8 +#define INLINE_OOB_LARGE_SIZE_IN_BYTES 24 + +#define NOT_USING_CLIENT_DATA_UNIT 0 + +enum gdma_queue_types { + gdma_queue_type_invalid = 0, + gdma_queue_send, + gdma_queue_receive, + gdma_queue_completion, + gdma_queue_event, + gdma_queue_type_max = 16, + /*Room for expansion */ + + /* This enum can be expanded to add more queue types but + * it's expected to be done in a contiguous manner. + * Failing that will result in unexpected behavior. + */ +}; + +#define WORK_QUEUE_NUMBER_BASE_BITS 10 + +struct gdma_header { + /* size of the entire gdma structure, including the entire length of + * the struct that is formed by extending other gdma struct. i.e. + * GDMA_BASE_SPEC extends gdma_header, GDMA_EVENT_QUEUE_SPEC extends + * GDMA_BASE_SPEC, StructSize for GDMA_EVENT_QUEUE_SPEC will be size of + * GDMA_EVENT_QUEUE_SPEC which includes size of GDMA_BASE_SPEC and size + * of gdma_header. 
+ * Above example is for illustration purpose and is not in code + */ + size_t struct_size; +}; + +/* The following macros are from GDMA SPEC 3.6, "Table 2: CQE data structure" + * and "Table 4: Event Queue Entry (EQE) data format" + */ +#define GDMA_COMP_DATA_SIZE 0x3C /* Must be a multiple of 4 */ +#define GDMA_COMP_DATA_SIZE_IN_UINT32 (GDMA_COMP_DATA_SIZE / 4) + +#define COMPLETION_QUEUE_ENTRY_WORK_QUEUE_INDEX 0 +#define COMPLETION_QUEUE_ENTRY_WORK_QUEUE_SIZE 24 +#define COMPLETION_QUEUE_ENTRY_SEND_WORK_QUEUE_INDEX 24 +#define COMPLETION_QUEUE_ENTRY_SEND_WORK_QUEUE_SIZE 1 +#define COMPLETION_QUEUE_ENTRY_OWNER_BITS_INDEX 29 +#define COMPLETION_QUEUE_ENTRY_OWNER_BITS_SIZE 3 + +#define COMPLETION_QUEUE_OWNER_MASK \ + ((1 << (COMPLETION_QUEUE_ENTRY_OWNER_BITS_SIZE)) - 1) + +struct gdma_comp { + struct gdma_header gdma_header; + + /* Filled by GDMA core */ + uint32_t completion_data[GDMA_COMP_DATA_SIZE_IN_UINT32]; + + /* Filled by GDMA core */ + uint32_t work_queue_number; + + /* Filled by GDMA core */ + bool send_work_queue; +}; + +struct gdma_hardware_completion_entry { + char dma_client_data[GDMA_COMP_DATA_SIZE]; + union { + uint32_t work_queue_owner_bits; + struct { + uint32_t wq_num : 24; + uint32_t is_sq : 1; + uint32_t reserved : 4; + uint32_t owner_bits : 3; + }; + }; +}; /* HW DATA */ + +struct gdma_posted_wqe_info { + struct gdma_header gdma_header; + + /* size of the written wqe in basic units (32B), filled by GDMA core. + * Use this value to progress the work queue after the wqe is processed + * by hardware. + */ + uint32_t wqe_size_in_bu; + + /* At the time of writing the wqe to the work queue, the offset in the + * work queue buffer where by the wqe will be written. Each unit + * represents 32B of buffer space. + */ + uint32_t wqe_index; + + /* Unmasked offset in the queue to which the WQE was written. + * In 32 byte units. 
+ */ + uint32_t unmasked_queue_offset; +}; + +struct gdma_sgl_element { + uint64_t address; + uint32_t memory_key; + uint32_t size; +}; + +#define MAX_SGL_ENTRIES_FOR_TRANSMIT 30 + +struct one_sgl { + struct gdma_sgl_element gdma_sgl[MAX_SGL_ENTRIES_FOR_TRANSMIT]; +}; + +struct gdma_work_request { + struct gdma_header gdma_header; + struct gdma_sgl_element *sgl; + uint32_t num_sgl_elements; + uint32_t inline_oob_size_in_bytes; + void *inline_oob_data; + uint32_t flags; /* From _gdma_work_request_FLAGS */ + uint32_t client_data_unit; /* For LSO, this is the MTU of the data */ +}; + +enum mana_cqe_type { + CQE_INVALID = 0, +}; + +struct mana_cqe_header { + uint32_t cqe_type : 6; + uint32_t client_type : 2; + uint32_t vendor_err : 24; +}; /* HW DATA */ + +/* NDIS HASH Types */ +#define BIT(nr) (1 << (nr)) +#define NDIS_HASH_IPV4 BIT(0) +#define NDIS_HASH_TCP_IPV4 BIT(1) +#define NDIS_HASH_UDP_IPV4 BIT(2) +#define NDIS_HASH_IPV6 BIT(3) +#define NDIS_HASH_TCP_IPV6 BIT(4) +#define NDIS_HASH_UDP_IPV6 BIT(5) +#define NDIS_HASH_IPV6_EX BIT(6) +#define NDIS_HASH_TCP_IPV6_EX BIT(7) +#define NDIS_HASH_UDP_IPV6_EX BIT(8) + +#define MANA_HASH_L3 (NDIS_HASH_IPV4 | NDIS_HASH_IPV6 | NDIS_HASH_IPV6_EX) +#define MANA_HASH_L4 \ + (NDIS_HASH_TCP_IPV4 | NDIS_HASH_UDP_IPV4 | NDIS_HASH_TCP_IPV6 | \ + NDIS_HASH_UDP_IPV6 | NDIS_HASH_TCP_IPV6_EX | NDIS_HASH_UDP_IPV6_EX) + +struct gdma_wqe_dma_oob { + uint32_t reserved:24; + uint32_t last_Vbytes:8; + union { + uint32_t flags; + struct { + uint32_t num_sgl_entries:8; + uint32_t inline_client_oob_size_in_dwords:3; + uint32_t client_oob_in_sgl:1; + uint32_t consume_credit:1; + uint32_t fence:1; + uint32_t reserved1:2; + uint32_t client_data_unit:14; + uint32_t check_sn:1; + uint32_t sgl_direct:1; + }; + }; +}; + struct mana_mr_cache { uint32_t lkey; uintptr_t addr; @@ -190,12 +362,23 @@ extern int mana_logtype_init; #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>") +int mana_ring_doorbell(void *db_page, enum gdma_queue_types queue_type, + uint32_t queue_id, uint32_t tail); + +int gdma_post_work_request(struct mana_gdma_queue *queue, + struct gdma_work_request *work_req, + struct gdma_posted_wqe_info *wqe_info); +uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue); + uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); +int gdma_poll_completion_queue(struct mana_gdma_queue *cq, + struct gdma_comp *comp); + struct mana_mr_cache *find_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv, struct rte_mbuf *mbuf); diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build index fc0dbaabb3..4a80189428 100644 --- a/drivers/net/mana/meson.build +++ b/drivers/net/mana/meson.build @@ -12,6 +12,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs'] sources += files( 'mana.c', 'mr.c', + 'gdma.c', 'mp.c', ) From patchwork Fri Jul 1 09:02:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113615 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5914FA00C2; Fri, 1 Jul 2022 11:04:23 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D6BB742B93; Fri, 1 Jul 2022 11:03:16 +0200 (CEST) 
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 18708427F3 for ; Fri, 1 Jul 2022 11:03:09 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id C7F1720D5905; Fri, 1 Jul 2022 02:03:08 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com C7F1720D5905 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666188; bh=pSH4F+QJf8aONcyvbezpXKCPFpBcY7JbiRGAxuy36po=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=NT/XRLs9n+k1wpYpEHI6t03WVNSb9SBMzJ5a83IL+Go0dQBgMBk1dvIqijNgKRxT5 sTVp8SZDH9N9dvg5XdEkGi6dRzhc6uVGUIPLJU3E541oq1JxoECxP0hz1bR4vyWIPa tzb9kcGKZaIbZOr/C/FJFvossampH6b6btplvSXQ= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 12/17] net/mana: add function to start/stop TX queues Date: Fri, 1 Jul 2022 02:02:42 -0700 Message-Id: <1656666167-26035-13-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li MANA allocate device queues through the IB layer when starting TX queues. When device is stopped all the queues are unmapped and freed. Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.h | 4 + drivers/net/mana/meson.build | 1 + drivers/net/mana/tx.c | 180 ++++++++++++++++++++++++++++++ 4 files changed, 186 insertions(+) create mode 100644 drivers/net/mana/tx.c diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index 9ba4767a06..7546c99ea3 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -7,6 +7,7 @@ Link status = P Linux = Y Multiprocess aware = Y +Queue start/stop = Y Removal event = Y RSS hash = Y Speed capabilities = P diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 1847902054..fef646a9a7 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -379,6 +379,10 @@ uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, int gdma_poll_completion_queue(struct mana_gdma_queue *cq, struct gdma_comp *comp); +int start_tx_queues(struct rte_eth_dev *dev); + +int stop_tx_queues(struct rte_eth_dev *dev); + struct mana_mr_cache *find_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv, struct rte_mbuf *mbuf); diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build index 4a80189428..34bb9c6b2f 100644 --- a/drivers/net/mana/meson.build +++ b/drivers/net/mana/meson.build @@ -11,6 +11,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs'] sources += files( 'mana.c', + 'tx.c', 'mr.c', 'gdma.c', 'mp.c', diff --git a/drivers/net/mana/tx.c b/drivers/net/mana/tx.c new file mode 100644 index 0000000000..dde911e548 --- /dev/null +++ b/drivers/net/mana/tx.c @@ -0,0 +1,180 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include 
+#include +#include + +#include +#include + +#include "mana.h" + +int stop_tx_queues(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + int i; + + for (i = 0; i < priv->num_queues; i++) { + struct mana_txq *txq = dev->data->tx_queues[i]; + + if (txq->qp) { + ibv_destroy_qp(txq->qp); + txq->qp = NULL; + } + + if (txq->cq) { + ibv_destroy_cq(txq->cq); + txq->cq = NULL; + } + + /* Drain and free posted WQEs */ + while (txq->desc_ring_tail != txq->desc_ring_head) { + struct mana_txq_desc *desc = + &txq->desc_ring[txq->desc_ring_tail]; + + rte_pktmbuf_free(desc->pkt); + + txq->desc_ring_tail = + (txq->desc_ring_tail + 1) % txq->num_desc; + } + txq->desc_ring_head = 0; + txq->desc_ring_tail = 0; + + memset(&txq->gdma_sq, 0, sizeof(txq->gdma_sq)); + memset(&txq->gdma_cq, 0, sizeof(txq->gdma_cq)); + } + + return 0; +} + +int start_tx_queues(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + int ret, i; + + /* start TX queues */ + for (i = 0; i < priv->num_queues; i++) { + struct mana_txq *txq; + struct ibv_qp_init_attr qp_attr = { 0 }; + struct manadv_obj obj = {}; + struct manadv_qp dv_qp; + struct manadv_cq dv_cq; + + txq = dev->data->tx_queues[i]; + + manadv_set_context_attr(priv->ib_ctx, + MANADV_CTX_ATTR_BUF_ALLOCATORS, + (void *)((uintptr_t)&(struct manadv_ctx_allocators){ + .alloc = &mana_alloc_verbs_buf, + .free = &mana_free_verbs_buf, + .data = (void *)(uintptr_t)txq->socket, + })); + + txq->cq = ibv_create_cq(priv->ib_ctx, txq->num_desc, + NULL, NULL, 0); + if (!txq->cq) { + DRV_LOG(ERR, "failed to create cq queue index %d", i); + ret = -errno; + goto fail; + } + + qp_attr.send_cq = txq->cq; + qp_attr.recv_cq = txq->cq; + qp_attr.cap.max_send_wr = txq->num_desc; + qp_attr.cap.max_send_sge = priv->max_send_sge; + + /* Skip setting qp_attr.cap.max_inline_data */ + + qp_attr.qp_type = IBV_QPT_RAW_PACKET; + qp_attr.sq_sig_all = 0; + + txq->qp = ibv_create_qp(priv->ib_parent_pd, &qp_attr); + if (!txq->qp) { + DRV_LOG(ERR, "Failed to create qp queue index %d", i); + ret = -errno; + goto fail; + } + + /* Get the addresses of CQ, QP and DB */ + obj.qp.in = txq->qp; + obj.qp.out = &dv_qp; + obj.cq.in = txq->cq; + obj.cq.out = &dv_cq; + ret = manadv_init_obj(&obj, MANADV_OBJ_QP | MANADV_OBJ_CQ); + if (ret) { + DRV_LOG(ERR, "Failed to get manadv objects"); + goto fail; + } + + txq->gdma_sq.buffer = obj.qp.out->sq_buf; + txq->gdma_sq.count = obj.qp.out->sq_count; + txq->gdma_sq.size = obj.qp.out->sq_size; + txq->gdma_sq.id = obj.qp.out->sq_id; + + txq->tx_vp_offset = obj.qp.out->tx_vp_offset; + priv->db_page = obj.qp.out->db_page; + DRV_LOG(INFO, "txq sq id %u vp_offset %u db_page %px " + " buf %px count %u size %u", + txq->gdma_sq.id, txq->tx_vp_offset, + priv->db_page, + txq->gdma_sq.buffer, txq->gdma_sq.count, + txq->gdma_sq.size); + + txq->gdma_cq.buffer = obj.cq.out->buf; + txq->gdma_cq.count = obj.cq.out->count; + txq->gdma_cq.size = txq->gdma_cq.count * COMP_ENTRY_SIZE; + txq->gdma_cq.id = obj.cq.out->cq_id; + + /* CQ head starts with count (not 0) */ + txq->gdma_cq.head = txq->gdma_cq.count; + + DRV_LOG(INFO, "txq cq id %u buf %px count %u size %u head %u", + txq->gdma_cq.id, txq->gdma_cq.buffer, + txq->gdma_cq.count, txq->gdma_cq.size, + txq->gdma_cq.head); + } + + return 0; + +fail: + stop_tx_queues(dev); + return ret; +} + +static inline uint16_t get_vsq_frame_num(uint32_t vsq) +{ + union { + uint32_t gdma_txq_id; + struct { + uint32_t reserved1 : 10; + uint32_t vsq_frame : 14; + uint32_t reserved2 : 8; + }; + } v; + + 
v.gdma_txq_id = vsq; + return v.vsq_frame; +} From patchwork Fri Jul 1 09:02:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113616 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 538C2A00C2; Fri, 1 Jul 2022 11:04:34 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8F12A42BAC; Fri, 1 Jul 2022 11:03:18 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 93ACF42B75 for ; Fri, 1 Jul 2022 11:03:09 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 4EBC620D4D79; Fri, 1 Jul 2022 02:03:09 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 4EBC620D4D79 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666189; bh=83sYIDdBeMnZnOUZwA2igKK7PWw7eLl8XHgySVR8BOU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=H40+sFSSz4Wna9NVyVMdzbH5gAzG21Ok02DUu21G/i46qk0hzLndJhgw79koc8S7b 4vNH4KAafUin1DwqffgU4vR73j09SgL/1+47qLjorJmSX6b4rBMu748JPDlzG1JM4P 35HYfACI2qY5bAGqMjIN1rAHh2IpsZtcU+cduLtI= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 13/17] net/mana: add function to start/stop RX queues Date: Fri, 1 Jul 2022 02:02:43 -0700 Message-Id: <1656666167-26035-14-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li MANA allocates device queues through the IB layer when starting RX queues. When device is stopped all the queues are unmapped and freed. 
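For reference, the RX bring-up in this patch reduces to the following libibverbs call sequence (a simplified sketch only: error handling, the manadv object queries and the per-queue buffer allocators are omitted, and ctx, pd, num_desc, num_queues, wqs and qp_attr_ex stand in for the driver's own fields):

	struct ibv_cq *cq = ibv_create_cq(ctx, num_desc, NULL, NULL, 0);

	struct ibv_wq_init_attr wq_attr = {
		.wq_type = IBV_WQT_RQ,
		.max_wr  = num_desc,
		.max_sge = 1,
		.pd      = pd,
		.cq      = cq,
	};
	struct ibv_wq *wq = ibv_create_wq(ctx, &wq_attr);	/* one WQ per RX queue */

	struct ibv_rwq_ind_table_init_attr tbl_attr = {
		.log_ind_tbl_size = rte_log2_u32(num_queues),	/* table size is a power of two */
		.ind_tbl = wqs,					/* array holding every RX WQ */
	};
	struct ibv_rwq_ind_table *tbl = ibv_create_rwq_ind_table(ctx, &tbl_attr);

	/* a single RAW_PACKET QP with Toeplitz RSS spreads flows across the table */
	struct ibv_qp *qp = ibv_create_qp_ex(ctx, &qp_attr_ex);

Once the queues exist, every RX descriptor gets one mbuf posted as a single-SGE work request and the receive doorbell is rung once per queue.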
Signed-off-by: Long Li --- drivers/net/mana/mana.h | 5 + drivers/net/mana/meson.build | 1 + drivers/net/mana/rx.c | 369 +++++++++++++++++++++++++++++++++++ 3 files changed, 375 insertions(+) create mode 100644 drivers/net/mana/rx.c diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index fef646a9a7..5052ec9061 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -364,6 +364,7 @@ extern int mana_logtype_init; int mana_ring_doorbell(void *db_page, enum gdma_queue_types queue_type, uint32_t queue_id, uint32_t tail); +int rq_ring_doorbell(struct mana_rxq *rxq); int gdma_post_work_request(struct mana_gdma_queue *queue, struct gdma_work_request *work_req, @@ -379,10 +380,14 @@ uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, int gdma_poll_completion_queue(struct mana_gdma_queue *cq, struct gdma_comp *comp); +int start_rx_queues(struct rte_eth_dev *dev); int start_tx_queues(struct rte_eth_dev *dev); +int stop_rx_queues(struct rte_eth_dev *dev); int stop_tx_queues(struct rte_eth_dev *dev); +int alloc_and_post_rx_wqe(struct mana_rxq *rxq); + struct mana_mr_cache *find_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv, struct rte_mbuf *mbuf); diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build index 34bb9c6b2f..8233c04eee 100644 --- a/drivers/net/mana/meson.build +++ b/drivers/net/mana/meson.build @@ -11,6 +11,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs'] sources += files( 'mana.c', + 'rx.c', 'tx.c', 'mr.c', 'gdma.c', diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c new file mode 100644 index 0000000000..bcc9f308f3 --- /dev/null +++ b/drivers/net/mana/rx.c @@ -0,0 +1,369 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2022 Microsoft Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "mana.h" + +static uint8_t mana_rss_hash_key_default[TOEPLITZ_HASH_KEY_SIZE_IN_BYTES] = { + 0x2c, 0xc6, 0x81, 0xd1, + 0x5b, 0xdb, 0xf4, 0xf7, + 0xfc, 0xa2, 0x83, 0x19, + 0xdb, 0x1a, 0x3e, 0x94, + 0x6b, 0x9e, 0x38, 0xd9, + 0x2c, 0x9c, 0x03, 0xd1, + 0xad, 0x99, 0x44, 0xa7, + 0xd9, 0x56, 0x3d, 0x59, + 0x06, 0x3c, 0x25, 0xf3, + 0xfc, 0x1f, 0xdc, 0x2a, +}; + +int rq_ring_doorbell(struct mana_rxq *rxq) +{ + struct mana_priv *priv = rxq->priv; + int ret; + void *db_page = priv->db_page; + + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + struct rte_eth_dev *dev = + &rte_eth_devices[priv->dev_data->port_id]; + struct mana_process_priv *process_priv = dev->process_private; + + db_page = process_priv->db_page; + } + + ret = mana_ring_doorbell(db_page, gdma_queue_receive, + rxq->gdma_rq.id, + rxq->gdma_rq.head * + GDMA_WQE_ALIGNMENT_UNIT_SIZE); + + if (ret) + DRV_LOG(ERR, "failed to ring RX doorbell ret %d", ret); + + return ret; +} + +int alloc_and_post_rx_wqe(struct mana_rxq *rxq) +{ + struct rte_mbuf *mbuf = NULL; + struct gdma_sgl_element sgl[1]; + struct gdma_work_request request = {0}; + struct gdma_posted_wqe_info wqe_info = {0}; + struct mana_priv *priv = rxq->priv; + int ret; + struct mana_mr_cache *mr; + + mbuf = rte_pktmbuf_alloc(rxq->mp); + if (!mbuf) { + rxq->stats.nombuf++; + return -ENOMEM; + } + + mr = find_pmd_mr(&rxq->mr_btree, priv, mbuf); + if (!mr) { + DRV_LOG(ERR, "failed to register RX MR"); + rte_pktmbuf_free(mbuf); + return -ENOMEM; + } + + 
request.gdma_header.struct_size = sizeof(request); + wqe_info.gdma_header.struct_size = sizeof(wqe_info); + + sgl[0].address = rte_cpu_to_le_64(rte_pktmbuf_mtod(mbuf, uint64_t)); + sgl[0].memory_key = mr->lkey; + sgl[0].size = + rte_pktmbuf_data_room_size(rxq->mp) - + RTE_PKTMBUF_HEADROOM; + + request.sgl = sgl; + request.num_sgl_elements = 1; + request.inline_oob_data = NULL; + request.inline_oob_size_in_bytes = 0; + request.flags = 0; + request.client_data_unit = NOT_USING_CLIENT_DATA_UNIT; + + ret = gdma_post_work_request(&rxq->gdma_rq, &request, &wqe_info); + if (!ret) { + struct mana_rxq_desc *desc = + &rxq->desc_ring[rxq->desc_ring_head]; + + /* update queue for tracking pending packets */ + desc->pkt = mbuf; + desc->wqe_size_in_bu = wqe_info.wqe_size_in_bu; + rxq->desc_ring_head = (rxq->desc_ring_head + 1) % rxq->num_desc; + } else { + DRV_LOG(ERR, "failed to post recv ret %d", ret); + return ret; + } + + return 0; +} + +static int alloc_and_post_rx_wqes(struct mana_rxq *rxq) +{ + int ret; + + for (uint32_t i = 0; i < rxq->num_desc; i++) { + ret = alloc_and_post_rx_wqe(rxq); + if (ret) { + DRV_LOG(ERR, "failed to post RX ret = %d", ret); + return ret; + } + } + + rq_ring_doorbell(rxq); + + return ret; +} + +int stop_rx_queues(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + int ret, i; + + if (priv->rwq_qp) { + ret = ibv_destroy_qp(priv->rwq_qp); + if (ret) + DRV_LOG(ERR, "rx_queue destory_qp failed %d", ret); + priv->rwq_qp = NULL; + } + + if (priv->ind_table) { + ret = ibv_destroy_rwq_ind_table(priv->ind_table); + if (ret) + DRV_LOG(ERR, "destroy rwq ind table failed %d", ret); + priv->ind_table = NULL; + } + + for (i = 0; i < priv->num_queues; i++) { + struct mana_rxq *rxq = dev->data->rx_queues[i]; + + if (rxq->wq) { + ret = ibv_destroy_wq(rxq->wq); + if (ret) + DRV_LOG(ERR, + "rx_queue destroy_wq failed %d", ret); + rxq->wq = NULL; + } + + if (rxq->cq) { + ret = ibv_destroy_cq(rxq->cq); + if (ret) + DRV_LOG(ERR, + "rx_queue destroy_cq failed %d", ret); + rxq->cq = NULL; + } + + /* Drain and free posted WQEs */ + while (rxq->desc_ring_tail != rxq->desc_ring_head) { + struct mana_rxq_desc *desc = + &rxq->desc_ring[rxq->desc_ring_tail]; + + rte_pktmbuf_free(desc->pkt); + + rxq->desc_ring_tail = + (rxq->desc_ring_tail + 1) % rxq->num_desc; + } + rxq->desc_ring_head = 0; + rxq->desc_ring_tail = 0; + + memset(&rxq->gdma_rq, 0, sizeof(rxq->gdma_rq)); + memset(&rxq->gdma_cq, 0, sizeof(rxq->gdma_cq)); + } + return 0; +} + +int start_rx_queues(struct rte_eth_dev *dev) +{ + struct mana_priv *priv = dev->data->dev_private; + int ret, i; + struct ibv_wq *ind_tbl[priv->num_queues]; + + DRV_LOG(INFO, "start rx queues"); + for (i = 0; i < priv->num_queues; i++) { + struct mana_rxq *rxq = dev->data->rx_queues[i]; + struct ibv_wq_init_attr wq_attr = {}; + + manadv_set_context_attr(priv->ib_ctx, + MANADV_CTX_ATTR_BUF_ALLOCATORS, + (void *)((uintptr_t)&(struct manadv_ctx_allocators){ + .alloc = &mana_alloc_verbs_buf, + .free = &mana_free_verbs_buf, + .data = (void *)(uintptr_t)rxq->socket, + })); + + rxq->cq = ibv_create_cq(priv->ib_ctx, rxq->num_desc, + NULL, NULL, 0); + if (!rxq->cq) { + ret = -errno; + DRV_LOG(ERR, "failed to create rx cq queue %d", i); + goto fail; + } + + wq_attr.wq_type = IBV_WQT_RQ; + wq_attr.max_wr = rxq->num_desc; + wq_attr.max_sge = 1; + wq_attr.pd = priv->ib_parent_pd; + wq_attr.cq = rxq->cq; + + rxq->wq = ibv_create_wq(priv->ib_ctx, &wq_attr); + if (!rxq->wq) { + ret = -errno; + DRV_LOG(ERR, "failed to create rx wq %d", i); + goto fail; 
+ } + + ind_tbl[i] = rxq->wq; + } + + struct ibv_rwq_ind_table_init_attr ind_table_attr = { + .log_ind_tbl_size = rte_log2_u32(RTE_DIM(ind_tbl)), + .ind_tbl = ind_tbl, + .comp_mask = 0, + }; + + priv->ind_table = ibv_create_rwq_ind_table(priv->ib_ctx, + &ind_table_attr); + if (!priv->ind_table) { + ret = -errno; + DRV_LOG(ERR, "failed to create ind_table ret %d", ret); + goto fail; + } + + DRV_LOG(INFO, "ind_table handle %d num %d", + priv->ind_table->ind_tbl_handle, + priv->ind_table->ind_tbl_num); + + struct ibv_qp_init_attr_ex qp_attr_ex = { + .comp_mask = IBV_QP_INIT_ATTR_PD | + IBV_QP_INIT_ATTR_RX_HASH | + IBV_QP_INIT_ATTR_IND_TABLE, + .qp_type = IBV_QPT_RAW_PACKET, + .pd = priv->ib_parent_pd, + .rwq_ind_tbl = priv->ind_table, + .rx_hash_conf = { + .rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ, + .rx_hash_key_len = TOEPLITZ_HASH_KEY_SIZE_IN_BYTES, + .rx_hash_key = mana_rss_hash_key_default, + .rx_hash_fields_mask = + IBV_RX_HASH_SRC_IPV4 | IBV_RX_HASH_DST_IPV4, + }, + + }; + + /* overwrite default if rss key is set */ + if (priv->rss_conf.rss_key_len && priv->rss_conf.rss_key) + qp_attr_ex.rx_hash_conf.rx_hash_key = + priv->rss_conf.rss_key; + + /* overwrite default if rss hash fields are set */ + if (priv->rss_conf.rss_hf) { + qp_attr_ex.rx_hash_conf.rx_hash_fields_mask = 0; + + if (priv->rss_conf.rss_hf & ETH_RSS_IPV4) + qp_attr_ex.rx_hash_conf.rx_hash_fields_mask |= + IBV_RX_HASH_SRC_IPV4 | IBV_RX_HASH_DST_IPV4; + + if (priv->rss_conf.rss_hf & ETH_RSS_IPV6) + qp_attr_ex.rx_hash_conf.rx_hash_fields_mask |= + IBV_RX_HASH_SRC_IPV6 | IBV_RX_HASH_SRC_IPV6; + + if (priv->rss_conf.rss_hf & + (ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV6_TCP)) + qp_attr_ex.rx_hash_conf.rx_hash_fields_mask |= + IBV_RX_HASH_SRC_PORT_TCP | + IBV_RX_HASH_DST_PORT_TCP; + + if (priv->rss_conf.rss_hf & + (ETH_RSS_NONFRAG_IPV4_UDP | ETH_RSS_NONFRAG_IPV6_UDP)) + qp_attr_ex.rx_hash_conf.rx_hash_fields_mask |= + IBV_RX_HASH_SRC_PORT_UDP | + IBV_RX_HASH_DST_PORT_UDP; + } + + priv->rwq_qp = ibv_create_qp_ex(priv->ib_ctx, &qp_attr_ex); + if (!priv->rwq_qp) { + ret = -errno; + DRV_LOG(ERR, "rx ibv_create_qp_ex failed"); + goto fail; + } + + for (i = 0; i < priv->num_queues; i++) { + struct mana_rxq *rxq = dev->data->rx_queues[i]; + struct manadv_obj obj = {}; + struct manadv_cq dv_cq; + struct manadv_rwq dv_wq; + + obj.cq.in = rxq->cq; + obj.cq.out = &dv_cq; + obj.rwq.in = rxq->wq; + obj.rwq.out = &dv_wq; + ret = manadv_init_obj(&obj, MANADV_OBJ_CQ | MANADV_OBJ_RWQ); + if (ret) { + DRV_LOG(ERR, "manadv_init_obj failed ret %d", ret); + goto fail; + } + + rxq->gdma_cq.buffer = obj.cq.out->buf; + rxq->gdma_cq.count = obj.cq.out->count; + rxq->gdma_cq.size = rxq->gdma_cq.count * COMP_ENTRY_SIZE; + rxq->gdma_cq.id = obj.cq.out->cq_id; + + /* CQ head starts with count */ + rxq->gdma_cq.head = rxq->gdma_cq.count; + + DRV_LOG(INFO, "rxq cq id %u buf %px count %u size %u", + rxq->gdma_cq.id, rxq->gdma_cq.buffer, + rxq->gdma_cq.count, rxq->gdma_cq.size); + + priv->db_page = obj.rwq.out->db_page; + + rxq->gdma_rq.buffer = obj.rwq.out->buf; + rxq->gdma_rq.count = obj.rwq.out->count; + rxq->gdma_rq.size = obj.rwq.out->size; + rxq->gdma_rq.id = obj.rwq.out->wq_id; + + DRV_LOG(INFO, "rxq rq id %u buf %px count %u size %u", + rxq->gdma_rq.id, rxq->gdma_rq.buffer, + rxq->gdma_rq.count, rxq->gdma_rq.size); + } + + for (i = 0; i < priv->num_queues; i++) { + ret = alloc_and_post_rx_wqes(dev->data->rx_queues[i]); + if (ret) + goto fail; + } + + return 0; + +fail: + stop_rx_queues(dev); + return ret; +} From patchwork Fri Jul 1 09:02:44 
2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113617 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 269DEA00C2; Fri, 1 Jul 2022 11:04:40 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9276942BB2; Fri, 1 Jul 2022 11:03:19 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 1A67142B78 for ; Fri, 1 Jul 2022 11:03:10 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id C894220D5906; Fri, 1 Jul 2022 02:03:09 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com C894220D5906 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666189; bh=jH1LBVX6IYb+nSal3s79jqXRnqtkHn+kWM6eCDVU+0w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=Ud9Ui9+oKokccDtDa/agycn9CIYwTq30/vL0dIlt76EMh09d/rLKMbaKRlt3zpXsI 3cT/WsT+PISjysi6KyA+bHEjELHQ+ehR8GqtoIvO+NHVV0NGUI6zkCD0Q91lqI+Stj IoX1TlWvJvkTFx9torQh22Ha1Uk36wa+RUBK48gA= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 14/17] net/mana: add function to receive packets Date: Fri, 1 Jul 2022 02:02:44 -0700 Message-Id: <1656666167-26035-15-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li With all the RX queues created, MANA can use those queues to receive packets. 
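Nothing MANA-specific is needed on the application side: the standard burst API reaches mana_rx_burst() once the port is started, and the checksum verdicts carried in the RX completion OOB are reported through the usual mbuf offload flags. A minimal receive loop, shown only for illustration (port_id, queue_id and bad_csum are placeholders, not part of this patch):

	struct rte_mbuf *pkts[32];
	uint16_t nb = rte_eth_rx_burst(port_id, queue_id, pkts, 32);

	for (uint16_t i = 0; i < nb; i++) {
		/* IP/L4 checksum results from the completion end up in ol_flags */
		if (pkts[i]->ol_flags & (RTE_MBUF_F_RX_IP_CKSUM_BAD |
					 RTE_MBUF_F_RX_L4_CKSUM_BAD))
			bad_csum++;
		rte_pktmbuf_free(pkts[i]);
	}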
Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 2 + drivers/net/mana/mana.c | 2 + drivers/net/mana/mana.h | 37 +++++++++++ drivers/net/mana/mp.c | 2 + drivers/net/mana/rx.c | 104 ++++++++++++++++++++++++++++++ 5 files changed, 147 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index 7546c99ea3..b47860554d 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -6,6 +6,8 @@ [Features] Link status = P Linux = Y +L3 checksum offload = Y +L4 checksum offload = Y Multiprocess aware = Y Queue start/stop = Y Removal event = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 6c8983cd6a..6d8a0512c1 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -987,6 +987,8 @@ static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused, /* fd is no not used after mapping doorbell */ close(fd); + eth_dev->rx_pkt_burst = mana_rx_burst; + rte_spinlock_lock(&mana_shared_data->lock); mana_shared_data->secondary_cnt++; mana_local_data.secondary_cnt++; diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 5052ec9061..626abc431a 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -178,6 +178,11 @@ struct gdma_work_request { enum mana_cqe_type { CQE_INVALID = 0, + + CQE_RX_OKAY = 1, + CQE_RX_COALESCED_4 = 2, + CQE_RX_OBJECT_FENCE = 3, + CQE_RX_TRUNCATED = 4, }; struct mana_cqe_header { @@ -203,6 +208,35 @@ struct mana_cqe_header { (NDIS_HASH_TCP_IPV4 | NDIS_HASH_UDP_IPV4 | NDIS_HASH_TCP_IPV6 | \ NDIS_HASH_UDP_IPV6 | NDIS_HASH_TCP_IPV6_EX | NDIS_HASH_UDP_IPV6_EX) +struct mana_rx_comp_per_packet_info { + uint32_t packet_length : 16; + uint32_t reserved0 : 16; + uint32_t reserved1; + uint32_t packet_hash; +}; /* HW DATA */ +#define RX_COM_OOB_NUM_PACKETINFO_SEGMENTS 4 + +struct mana_rx_comp_oob { + struct mana_cqe_header cqe_hdr; + + uint32_t rx_vlan_id : 12; + uint32_t rx_vlan_tag_present : 1; + uint32_t rx_outer_ip_header_checksum_succeeded : 1; + uint32_t rx_outer_ip_header_checksum_failed : 1; + uint32_t reserved : 1; + uint32_t rx_hash_type : 9; + uint32_t rx_ip_header_checksum_succeeded : 1; + uint32_t rx_ip_header_checksum_failed : 1; + uint32_t rx_tcp_checksum_succeeded : 1; + uint32_t rx_tcp_checksum_failed : 1; + uint32_t rx_udp_checksum_succeeded : 1; + uint32_t rx_udp_checksum_failed : 1; + uint32_t reserved1 : 1; + struct mana_rx_comp_per_packet_info + packet_info[RX_COM_OOB_NUM_PACKETINFO_SEGMENTS]; + uint32_t received_wqe_offset; +}; /* HW DATA */ + struct gdma_wqe_dma_oob { uint32_t reserved:24; uint32_t last_Vbytes:8; @@ -371,6 +405,9 @@ int gdma_post_work_request(struct mana_gdma_queue *queue, struct gdma_posted_wqe_info *wqe_info); uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue); +uint16_t mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **rx_pkts, + uint16_t pkts_n); + uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c index 9cb3c09d32..a4c612c1a3 100644 --- a/drivers/net/mana/mp.c +++ b/drivers/net/mana/mp.c @@ -160,6 +160,8 @@ static int mana_mp_secondary_handle(const struct rte_mp_msg *mp_msg, case MANA_MP_REQ_START_RXTX: DRV_LOG(INFO, "Port %u starting datapath", dev->data->port_id); + dev->rx_pkt_burst = mana_rx_burst; + rte_mb(); res->result = 0; diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c index bcc9f308f3..4e43299144 100644 --- a/drivers/net/mana/rx.c +++ b/drivers/net/mana/rx.c @@ -367,3 
+367,107 @@ int start_rx_queues(struct rte_eth_dev *dev) stop_rx_queues(dev); return ret; } + +uint16_t mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n) +{ + uint16_t pkt_received = 0, cqe_processed = 0; + struct mana_rxq *rxq = dpdk_rxq; + struct mana_priv *priv = rxq->priv; + struct gdma_comp comp; + struct rte_mbuf *mbuf; + int ret; + + while (pkt_received < pkts_n && + gdma_poll_completion_queue(&rxq->gdma_cq, &comp) == 1) { + struct mana_rxq_desc *desc; + struct mana_rx_comp_oob *oob = + (struct mana_rx_comp_oob *)&comp.completion_data[0]; + + if (comp.work_queue_number != rxq->gdma_rq.id) { + DRV_LOG(ERR, "rxq comp id mismatch wqid=0x%x rcid=0x%x", + comp.work_queue_number, rxq->gdma_rq.id); + rxq->stats.errors++; + break; + } + + desc = &rxq->desc_ring[rxq->desc_ring_tail]; + rxq->gdma_rq.tail += desc->wqe_size_in_bu; + mbuf = desc->pkt; + + switch (oob->cqe_hdr.cqe_type) { + case CQE_RX_OKAY: + /* Proceed to process mbuf */ + break; + + case CQE_RX_TRUNCATED: + DRV_LOG(ERR, "Drop a truncated packet"); + rxq->stats.errors++; + rte_pktmbuf_free(mbuf); + goto drop; + + case CQE_RX_COALESCED_4: + DRV_LOG(ERR, "RX coalescing is not supported"); + continue; + + default: + DRV_LOG(ERR, "Unknown RX CQE type %d", + oob->cqe_hdr.cqe_type); + continue; + } + + DRV_LOG(DEBUG, "mana_rx_comp_oob CQE_RX_OKAY rxq %p", rxq); + + mbuf->data_off = RTE_PKTMBUF_HEADROOM; + mbuf->nb_segs = 1; + mbuf->next = NULL; + mbuf->pkt_len = oob->packet_info[0].packet_length; + mbuf->data_len = oob->packet_info[0].packet_length; + mbuf->port = priv->port_id; + + if (oob->rx_ip_header_checksum_succeeded) + mbuf->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD; + + if (oob->rx_ip_header_checksum_failed) + mbuf->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD; + + if (oob->rx_outer_ip_header_checksum_failed) + mbuf->ol_flags |= RTE_MBUF_F_RX_OUTER_IP_CKSUM_BAD; + + if (oob->rx_tcp_checksum_succeeded || + oob->rx_udp_checksum_succeeded) + mbuf->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD; + + if (oob->rx_tcp_checksum_failed || + oob->rx_udp_checksum_failed) + mbuf->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD; + + if (oob->rx_hash_type == MANA_HASH_L3 || + oob->rx_hash_type == MANA_HASH_L4) { + mbuf->ol_flags |= RTE_MBUF_F_RX_RSS_HASH; + mbuf->hash.rss = oob->packet_info[0].packet_hash; + } + + pkts[pkt_received++] = mbuf; + rxq->stats.packets++; + rxq->stats.bytes += mbuf->data_len; + +drop: + rxq->desc_ring_tail++; + if (rxq->desc_ring_tail >= rxq->num_desc) + rxq->desc_ring_tail = 0; + + cqe_processed++; + + /* Post another request */ + ret = alloc_and_post_rx_wqe(rxq); + if (ret) { + DRV_LOG(ERR, "failed to post rx wqe ret=%d", ret); + break; + } + } + + if (cqe_processed) + rq_ring_doorbell(rxq); + + return pkt_received; +} From patchwork Fri Jul 1 09:02:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113618 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A35F1A00C2; Fri, 1 Jul 2022 11:04:45 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8FB7442BB7; Fri, 1 Jul 2022 11:03:20 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 969C742B75 for ; Fri, 1 Jul 2022 11:03:10 +0200 (CEST) Received: by 
linux.microsoft.com (Postfix, from userid 1004) id 5208A20D4D79; Fri, 1 Jul 2022 02:03:10 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 5208A20D4D79 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666190; bh=ynUtGjDNr7xvh6+cgzlb3rczEND3KPsu1ve7DmLa8vc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=aXDLLM2tM/Umf4dcg/gsoAPuYFYfal9Uw7jKleHN/pVWmQS+RfLuj7lVV7c7zT1oP IbUmG8r+/pTcXNgRDrRbrq/zdsG079VRr/duAXJTTKSKVytb1WrjNv0UeK0TLdq6LJ l8Rm9SbBrfbQ+nUDmFK5IC9aUItWEDz04cSMvwew= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 15/17] net/mana: add function to send packets Date: Fri, 1 Jul 2022 02:02:45 -0700 Message-Id: <1656666167-26035-16-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li With all the TX queues created, MANA can send packets over those queues. Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 1 + drivers/net/mana/mana.c | 1 + drivers/net/mana/mana.h | 65 ++++++++ drivers/net/mana/mp.c | 1 + drivers/net/mana/tx.c | 240 ++++++++++++++++++++++++++++++ 5 files changed, 308 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index b47860554d..bd50fe81d6 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -4,6 +4,7 @@ ; Refer to default.ini for the full list of available PMD features. 
; [Features] +Free Tx mbuf on demand = Y Link status = P Linux = Y L3 checksum offload = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 6d8a0512c1..0ffa2882e0 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -987,6 +987,7 @@ static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused, /* fd is no not used after mapping doorbell */ close(fd); + eth_dev->tx_pkt_burst = mana_tx_burst; eth_dev->rx_pkt_burst = mana_rx_burst; rte_spinlock_lock(&mana_shared_data->lock); diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h index 626abc431a..2a74e54007 100644 --- a/drivers/net/mana/mana.h +++ b/drivers/net/mana/mana.h @@ -62,6 +62,47 @@ struct mana_shared_data { #define NOT_USING_CLIENT_DATA_UNIT 0 +enum tx_packet_format_v2 { + short_packet_format = 0, + long_packet_format = 1 +}; + +struct transmit_short_oob_v2 { + enum tx_packet_format_v2 packet_format : 2; + uint32_t tx_is_outer_IPv4 : 1; + uint32_t tx_is_outer_IPv6 : 1; + uint32_t tx_compute_IP_header_checksum : 1; + uint32_t tx_compute_TCP_checksum : 1; + uint32_t tx_compute_UDP_checksum : 1; + uint32_t suppress_tx_CQE_generation : 1; + uint32_t VCQ_number : 24; + uint32_t tx_transport_header_offset : 10; + uint32_t VSQ_frame_num : 14; + uint32_t short_vport_offset : 8; +}; + +struct transmit_long_oob_v2 { + uint32_t TxIsEncapsulatedPacket : 1; + uint32_t TxInnerIsIPv6 : 1; + uint32_t TxInnerTcpOptionsPresent : 1; + uint32_t InjectVlanPriorTag : 1; + uint32_t Reserved1 : 12; + uint32_t PriorityCodePoint : 3; + uint32_t DropEligibleIndicator : 1; + uint32_t VlanIdentifier : 12; + uint32_t TxInnerFrameOffset : 10; + uint32_t TxInnerIpHeaderRelativeOffset : 6; + uint32_t LongVportOffset : 12; + uint32_t Reserved3 : 4; + uint32_t Reserved4 : 32; + uint32_t Reserved5 : 32; +}; + +struct transmit_oob_v2 { + struct transmit_short_oob_v2 short_oob; + struct transmit_long_oob_v2 long_oob; +}; + enum gdma_queue_types { gdma_queue_type_invalid = 0, gdma_queue_send, @@ -183,6 +224,17 @@ enum mana_cqe_type { CQE_RX_COALESCED_4 = 2, CQE_RX_OBJECT_FENCE = 3, CQE_RX_TRUNCATED = 4, + + CQE_TX_OKAY = 32, + CQE_TX_SA_DROP = 33, + CQE_TX_MTU_DROP = 34, + CQE_TX_INVALID_OOB = 35, + CQE_TX_INVALID_ETH_TYPE = 36, + CQE_TX_HDR_PROCESSING_ERROR = 37, + CQE_TX_VF_DISABLED = 38, + CQE_TX_VPORT_IDX_OUT_OF_RANGE = 39, + CQE_TX_VPORT_DISABLED = 40, + CQE_TX_VLAN_TAGGING_VIOLATION = 41, }; struct mana_cqe_header { @@ -191,6 +243,17 @@ struct mana_cqe_header { uint32_t vendor_err : 24; }; /* HW DATA */ +struct mana_tx_comp_oob { + struct mana_cqe_header cqe_hdr; + + uint32_t tx_data_offset; + + uint32_t tx_sgl_offset : 5; + uint32_t tx_wqe_offset : 27; + + uint32_t reserved[12]; +}; /* HW DATA */ + /* NDIS HASH Types */ #define BIT(nr) (1 << (nr)) #define NDIS_HASH_IPV4 BIT(0) @@ -407,6 +470,8 @@ uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue); uint16_t mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **rx_pkts, uint16_t pkts_n); +uint16_t mana_tx_burst(void *dpdk_txq, struct rte_mbuf **tx_pkts, + uint16_t pkts_n); uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c index a4c612c1a3..ea9014db8d 100644 --- a/drivers/net/mana/mp.c +++ b/drivers/net/mana/mp.c @@ -160,6 +160,7 @@ static int mana_mp_secondary_handle(const struct rte_mp_msg *mp_msg, case MANA_MP_REQ_START_RXTX: DRV_LOG(INFO, "Port %u starting datapath", dev->data->port_id); + dev->tx_pkt_burst = mana_tx_burst; dev->rx_pkt_burst = 
mana_rx_burst; rte_mb(); diff --git a/drivers/net/mana/tx.c b/drivers/net/mana/tx.c index dde911e548..211af6e124 100644 --- a/drivers/net/mana/tx.c +++ b/drivers/net/mana/tx.c @@ -178,3 +178,243 @@ static inline uint16_t get_vsq_frame_num(uint32_t vsq) v.gdma_txq_id = vsq; return v.vsq_frame; } + +uint16_t mana_tx_burst(void *dpdk_txq, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct mana_txq *txq = dpdk_txq; + struct mana_priv *priv = txq->priv; + struct gdma_comp comp; + int ret; + void *db_page; + + /* Process send completions from GDMA */ + while (gdma_poll_completion_queue(&txq->gdma_cq, &comp) == 1) { + struct mana_txq_desc *desc = + &txq->desc_ring[txq->desc_ring_tail]; + struct mana_tx_comp_oob *oob = + (struct mana_tx_comp_oob *)&comp.completion_data[0]; + + if (oob->cqe_hdr.cqe_type != CQE_TX_OKAY) { + DRV_LOG(ERR, + "mana_tx_comp_oob cqe_type %u vendor_err %u", + oob->cqe_hdr.cqe_type, oob->cqe_hdr.vendor_err); + txq->stats.errors++; + } else { + DRV_LOG(DEBUG, "mana_tx_comp_oob CQE_TX_OKAY"); + txq->stats.packets++; + } + + if (!desc->pkt) { + DRV_LOG(ERR, "mana_txq_desc has a NULL pkt"); + } else { + txq->stats.bytes += desc->pkt->data_len; + rte_pktmbuf_free(desc->pkt); + } + + desc->pkt = NULL; + txq->desc_ring_tail = (txq->desc_ring_tail + 1) % txq->num_desc; + txq->gdma_sq.tail += desc->wqe_size_in_bu; + } + + /* Post send requests to GDMA */ + uint16_t pkt_idx; + + for (pkt_idx = 0; pkt_idx < nb_pkts; pkt_idx++) { + struct rte_mbuf *m_pkt = tx_pkts[pkt_idx]; + struct rte_mbuf *m_seg = m_pkt; + struct transmit_oob_v2 tx_oob = {0}; + struct one_sgl sgl = {0}; + + /* Drop the packet if it exceeds max segments */ + if (m_pkt->nb_segs > priv->max_send_sge) { + DRV_LOG(ERR, "send packet segments %d exceeding max", + m_pkt->nb_segs); + continue; + } + + /* Fill in the oob */ + tx_oob.short_oob.packet_format = short_packet_format; + tx_oob.short_oob.tx_is_outer_IPv4 = + m_pkt->ol_flags & RTE_MBUF_F_TX_IPV4 ? 1 : 0; + tx_oob.short_oob.tx_is_outer_IPv6 = + m_pkt->ol_flags & RTE_MBUF_F_TX_IPV6 ? 1 : 0; + + tx_oob.short_oob.tx_compute_IP_header_checksum = + m_pkt->ol_flags & RTE_MBUF_F_TX_IP_CKSUM ? 
1 : 0; + + if ((m_pkt->ol_flags & RTE_MBUF_F_TX_L4_MASK) == + RTE_MBUF_F_TX_TCP_CKSUM) { + struct rte_tcp_hdr *tcp_hdr; + + /* HW needs partial TCP checksum */ + + tcp_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_tcp_hdr *, + m_pkt->l2_len + m_pkt->l3_len); + + if (m_pkt->ol_flags & RTE_MBUF_F_TX_IPV4) { + struct rte_ipv4_hdr *ip_hdr; + + ip_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_ipv4_hdr *, + m_pkt->l2_len); + tcp_hdr->cksum = rte_ipv4_phdr_cksum(ip_hdr, + m_pkt->ol_flags); + + } else if (m_pkt->ol_flags & RTE_MBUF_F_TX_IPV6) { + struct rte_ipv6_hdr *ip_hdr; + + ip_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_ipv6_hdr *, + m_pkt->l2_len); + tcp_hdr->cksum = rte_ipv6_phdr_cksum(ip_hdr, + m_pkt->ol_flags); + } else { + DRV_LOG(ERR, "Invalid input for TCP CKSUM"); + } + + tx_oob.short_oob.tx_compute_TCP_checksum = 1; + tx_oob.short_oob.tx_transport_header_offset = + m_pkt->l2_len + m_pkt->l3_len; + } + + if ((m_pkt->ol_flags & RTE_MBUF_F_TX_L4_MASK) == + RTE_MBUF_F_TX_UDP_CKSUM) { + struct rte_udp_hdr *udp_hdr; + + /* HW needs partial UDP checksum */ + udp_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_udp_hdr *, + m_pkt->l2_len + m_pkt->l3_len); + + if (m_pkt->ol_flags & RTE_MBUF_F_TX_IPV4) { + struct rte_ipv4_hdr *ip_hdr; + + ip_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_ipv4_hdr *, + m_pkt->l2_len); + + udp_hdr->dgram_cksum = + rte_ipv4_phdr_cksum(ip_hdr, + m_pkt->ol_flags); + + } else if (m_pkt->ol_flags & RTE_MBUF_F_TX_IPV6) { + struct rte_ipv6_hdr *ip_hdr; + + ip_hdr = rte_pktmbuf_mtod_offset(m_pkt, + struct rte_ipv6_hdr *, + m_pkt->l2_len); + + udp_hdr->dgram_cksum = + rte_ipv6_phdr_cksum(ip_hdr, + m_pkt->ol_flags); + + } else { + DRV_LOG(ERR, "Invalid input for UDP CKSUM"); + } + + tx_oob.short_oob.tx_compute_UDP_checksum = 1; + } + + tx_oob.short_oob.suppress_tx_CQE_generation = 0; + tx_oob.short_oob.VCQ_number = txq->gdma_cq.id; + + tx_oob.short_oob.VSQ_frame_num = + get_vsq_frame_num(txq->gdma_sq.id); + tx_oob.short_oob.short_vport_offset = txq->tx_vp_offset; + + DRV_LOG(DEBUG, "tx_oob packet_format %u ipv4 %u ipv6 %u", + tx_oob.short_oob.packet_format, + tx_oob.short_oob.tx_is_outer_IPv4, + tx_oob.short_oob.tx_is_outer_IPv6); + + DRV_LOG(DEBUG, "tx_oob checksum ip %u tcp %u udp %u offset %u", + tx_oob.short_oob.tx_compute_IP_header_checksum, + tx_oob.short_oob.tx_compute_TCP_checksum, + tx_oob.short_oob.tx_compute_UDP_checksum, + tx_oob.short_oob.tx_transport_header_offset); + + DRV_LOG(DEBUG, "pkt[%d]: buf_addr 0x%p, nb_segs %d, pkt_len %d", + pkt_idx, m_pkt->buf_addr, m_pkt->nb_segs, + m_pkt->pkt_len); + + /* Create SGL for packet data buffers */ + for (uint16_t seg_idx = 0; seg_idx < m_pkt->nb_segs; seg_idx++) { + struct mana_mr_cache *mr = + find_pmd_mr(&txq->mr_btree, priv, m_seg); + + if (!mr) { + DRV_LOG(ERR, "failed to get MR, pkt_idx %u", + pkt_idx); + return pkt_idx; + } + + sgl.gdma_sgl[seg_idx].address = + rte_cpu_to_le_64(rte_pktmbuf_mtod(m_seg, + uint64_t)); + sgl.gdma_sgl[seg_idx].size = m_seg->data_len; + sgl.gdma_sgl[seg_idx].memory_key = mr->lkey; + + DRV_LOG(DEBUG, "seg idx %u address %lx size %x key %x", + seg_idx, sgl.gdma_sgl[seg_idx].address, + sgl.gdma_sgl[seg_idx].size, + sgl.gdma_sgl[seg_idx].memory_key); + + m_seg = m_seg->next; + } + + struct gdma_work_request work_req = {0}; + struct gdma_posted_wqe_info wqe_info = {0}; + + work_req.gdma_header.struct_size = sizeof(work_req); + wqe_info.gdma_header.struct_size = sizeof(wqe_info); + + work_req.sgl = sgl.gdma_sgl; + work_req.num_sgl_elements = m_pkt->nb_segs; + 
work_req.inline_oob_size_in_bytes = + sizeof(struct transmit_short_oob_v2); + work_req.inline_oob_data = &tx_oob; + work_req.flags = 0; + work_req.client_data_unit = NOT_USING_CLIENT_DATA_UNIT; + + ret = gdma_post_work_request(&txq->gdma_sq, &work_req, + &wqe_info); + if (!ret) { + struct mana_txq_desc *desc = + &txq->desc_ring[txq->desc_ring_head]; + + /* Update queue for tracking pending requests */ + desc->pkt = m_pkt; + desc->wqe_size_in_bu = wqe_info.wqe_size_in_bu; + txq->desc_ring_head = + (txq->desc_ring_head + 1) % txq->num_desc; + + DRV_LOG(DEBUG, "nb_pkts %u pkt[%d] sent", + nb_pkts, pkt_idx); + } else { + DRV_LOG(INFO, "pkt[%d] failed to post send ret %d", + pkt_idx, ret); + break; + } + } + + /* Ring hardware door bell */ + db_page = priv->db_page; + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + struct rte_eth_dev *dev = + &rte_eth_devices[priv->dev_data->port_id]; + struct mana_process_priv *process_priv = dev->process_private; + + db_page = process_priv->db_page; + } + + ret = mana_ring_doorbell(db_page, gdma_queue_send, + txq->gdma_sq.id, + txq->gdma_sq.head * + GDMA_WQE_ALIGNMENT_UNIT_SIZE); + if (ret) + DRV_LOG(ERR, "mana_ring_doorbell failed ret %d", ret); + + return pkt_idx; +} From patchwork Fri Jul 1 09:02:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113619 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 27F37A00C2; Fri, 1 Jul 2022 11:04:53 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0216C42BC1; Fri, 1 Jul 2022 11:03:22 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 24E4442B7E for ; Fri, 1 Jul 2022 11:03:11 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id D215520C356C; Fri, 1 Jul 2022 02:03:10 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D215520C356C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666190; bh=EKt0yGbEX0csvVGw40Zw6WI7d3Qp4psN7EcNY2Ih5ZU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=T9Hw0z4jY/hye6cTNrI5tTqgc3Ibe3yQAdMuMS8JaF0xsT/2ImC8wEq6DeNUuxc8D MT/kGUS2+ZDLUol1kIMQ5/vHQG3W5eFqLZGScDW8I33il5PtHWXJqBCzY6ynYFrZlV SMoq46kwasRgVXCd2m8dj2kWhfGiIv7KtUfhsF0k= From: longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 16/17] net/mana: add function to start/stop device Date: Fri, 1 Jul 2022 02:02:46 -0700 Message-Id: <1656666167-26035-17-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li Add support for starting/stopping the device. 
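mana_dev_start() builds the TX and RX queues through the IB layer and then switches in the real burst functions; mana_dev_stop() switches the dummy burst functions back in, asks secondary processes to quiesce and only then frees the queues. From an application this is driven by the generic ethdev calls; a sketch only, assuming port_conf and the per-queue setup are done as for any other PMD:

	ret = rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q, &port_conf);
	/* rte_eth_rx_queue_setup()/rte_eth_tx_queue_setup() for each queue ... */

	ret = rte_eth_dev_start(port_id);	/* ends up in mana_dev_start() */
	/* ... datapath ... */
	ret = rte_eth_dev_stop(port_id);	/* ends up in mana_dev_stop() */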
Signed-off-by: Long Li --- drivers/net/mana/mana.c | 70 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 0ffa2882e0..b919d86500 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -126,6 +126,74 @@ static int mana_dev_configure(struct rte_eth_dev *dev) static int mana_intr_uninstall(struct mana_priv *priv); +static int +mana_dev_start(struct rte_eth_dev *dev) +{ + int ret; + struct mana_priv *priv = dev->data->dev_private; + + rte_rwlock_init(&priv->mr_list_lock); + ret = mana_mr_btree_init(&priv->mr_btree, MANA_MR_BTREE_CACHE_N, + dev->device->numa_node); + if (ret) { + DRV_LOG(ERR, "Failed to init device MR btree %d", ret); + return ret; + } + + ret = start_tx_queues(dev); + if (ret) { + DRV_LOG(ERR, "failed to start tx queues %d", ret); + return ret; + } + + ret = start_rx_queues(dev); + if (ret) { + DRV_LOG(ERR, "failed to start rx queues %d", ret); + stop_tx_queues(dev); + return ret; + } + + rte_wmb(); + + dev->tx_pkt_burst = mana_tx_burst; + dev->rx_pkt_burst = mana_rx_burst; + + DRV_LOG(INFO, "TX/RX queues have started"); + + /* Enable datapath for secondary processes */ + mana_mp_req_on_rxtx(dev, MANA_MP_REQ_START_RXTX); + + return 0; +} + +static int +mana_dev_stop(struct rte_eth_dev *dev __rte_unused) +{ + int ret; + + dev->tx_pkt_burst = mana_tx_burst_removed; + dev->rx_pkt_burst = mana_rx_burst_removed; + + /* Stop datapath on secondary processes */ + mana_mp_req_on_rxtx(dev, MANA_MP_REQ_STOP_RXTX); + + rte_wmb(); + + ret = stop_tx_queues(dev); + if (ret) { + DRV_LOG(ERR, "failed to stop tx queues"); + return ret; + } + + ret = stop_rx_queues(dev); + if (ret) { + DRV_LOG(ERR, "failed to stop tx queues"); + return ret; + } + + return 0; +} + static int mana_dev_close(struct rte_eth_dev *dev) { @@ -464,6 +532,8 @@ static int mana_dev_link_update(struct rte_eth_dev *dev, const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, + .dev_start = mana_dev_start, + .dev_stop = mana_dev_stop, .dev_close = mana_dev_close, .dev_infos_get = mana_dev_info_get, .txq_info_get = mana_dev_tx_queue_info, From patchwork Fri Jul 1 09:02:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 113620 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7BC7FA00C2; Fri, 1 Jul 2022 11:04:58 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1ED1442BC6; Fri, 1 Jul 2022 11:03:23 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 13BC342B6E for ; Fri, 1 Jul 2022 11:03:12 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1004) id 75CB520D4D79; Fri, 1 Jul 2022 02:03:11 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 75CB520D4D79 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxonhyperv.com; s=default; t=1656666191; bh=NlRSNW3AsUWmwGeLeYv42WFknHvIC4UUHQ/iR8NLWlM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To:From; b=s6nFyURGW4lvLfxAWX9reNL9FfegONDfKIS2M5I+ZqMfhnyqmtAB+Vk+DjfWskBNz KdJcEdVCLFWPXKX+JwIUDnYEPBabRYTBVVJuyGKtBhhM9LLY3yw7Mn2C8t+RdQ/ueG uw6BJuN4mZM7oWq3vJLierhUnu6YaL6d6CnyX1BQ= From: 
longli@linuxonhyperv.com To: Ferruh Yigit Cc: dev@dpdk.org, Ajay Sharma , Stephen Hemminger , Long Li Subject: [PATCH 17/17] net/mana: add function to report queue stats Date: Fri, 1 Jul 2022 02:02:47 -0700 Message-Id: <1656666167-26035-18-git-send-email-longli@linuxonhyperv.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> References: <1656666167-26035-1-git-send-email-longli@linuxonhyperv.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: longli@microsoft.com Errors-To: dev-bounces@dpdk.org From: Long Li Report packet statistics. Signed-off-by: Long Li --- doc/guides/nics/features/mana.ini | 2 + drivers/net/mana/mana.c | 77 +++++++++++++++++++++++++++++++ 2 files changed, 79 insertions(+) diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini index bd50fe81d6..a77d6f2249 100644 --- a/doc/guides/nics/features/mana.ini +++ b/doc/guides/nics/features/mana.ini @@ -4,6 +4,7 @@ ; Refer to default.ini for the full list of available PMD features. ; [Features] +Basic stats = Y Free Tx mbuf on demand = Y Link status = P Linux = Y @@ -14,5 +15,6 @@ Queue start/stop = Y Removal event = Y RSS hash = Y Speed capabilities = P +Stats per queue = Y Usage doc = Y x86-64 = Y diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index b919d86500..b514a4cfef 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -530,6 +530,79 @@ static int mana_dev_link_update(struct rte_eth_dev *dev, return rte_eth_linkstatus_set(dev, &link); } +static int mana_dev_stats_get(struct rte_eth_dev *dev, + struct rte_eth_stats *stats) +{ + unsigned int i; + + for (i = 0; i < dev->data->nb_tx_queues; i++) { + struct mana_txq *txq = dev->data->tx_queues[i]; + + if (!txq) + continue; + + stats->opackets = txq->stats.packets; + stats->obytes = txq->stats.bytes; + stats->oerrors = txq->stats.errors; + + if (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) { + stats->q_opackets[i] = txq->stats.packets; + stats->q_obytes[i] = txq->stats.bytes; + } + } + + stats->rx_nombuf = 0; + for (i = 0; i < dev->data->nb_rx_queues; i++) { + struct mana_rxq *rxq = dev->data->rx_queues[i]; + + if (!rxq) + continue; + + stats->ipackets = rxq->stats.packets; + stats->ibytes = rxq->stats.bytes; + stats->ierrors = rxq->stats.errors; + + /* There is no good way to get stats->imissed, not setting it */ + + if (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) { + stats->q_ipackets[i] = rxq->stats.packets; + stats->q_ibytes[i] = rxq->stats.bytes; + } + + stats->rx_nombuf += rxq->stats.nombuf; + } + + return 0; +} + +static int +mana_dev_stats_reset(struct rte_eth_dev *dev __rte_unused) +{ + unsigned int i; + + PMD_INIT_FUNC_TRACE(); + + for (i = 0; i < dev->data->nb_tx_queues; i++) { + struct mana_txq *txq = dev->data->tx_queues[i]; + + if (!txq) + continue; + + memset(&txq->stats, 0, sizeof(txq->stats)); + } + + for (i = 0; i < dev->data->nb_rx_queues; i++) { + struct mana_rxq *rxq = dev->data->rx_queues[i]; + + if (!rxq) + continue; + + memset(&rxq->stats, 0, sizeof(rxq->stats)); + } + + return 0; +} + const struct eth_dev_ops mana_dev_ops = { .dev_configure = mana_dev_configure, .dev_start = mana_dev_start, @@ -546,9 +619,13 @@ const struct eth_dev_ops mana_dev_ops = { .rx_queue_setup = mana_dev_rx_queue_setup, .rx_queue_release = mana_dev_rx_queue_release, .link_update = mana_dev_link_update, + .stats_get = mana_dev_stats_get, + 
.stats_reset = mana_dev_stats_reset, }; const struct eth_dev_ops mana_dev_sec_ops = { + .stats_get = mana_dev_stats_get, + .stats_reset = mana_dev_stats_reset, .dev_infos_get = mana_dev_info_get, };
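With these callbacks wired up, applications read the counters through the generic ethdev stats API; per-queue counters are only reported for the first RTE_ETHDEV_QUEUE_STAT_CNTRS queues and imissed is left unset. A small illustrative reader (assumes the usual <stdio.h>/<inttypes.h> includes and a valid port_id):

	struct rte_eth_stats st;

	if (rte_eth_stats_get(port_id, &st) == 0)
		printf("rx %" PRIu64 " pkts, tx %" PRIu64 " pkts, rx_nombuf %" PRIu64 "\n",
		       st.ipackets, st.opackets, st.rx_nombuf);

	rte_eth_stats_reset(port_id);	/* clears the per-queue counters via mana_dev_stats_reset() */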