From patchwork Tue Jul 10 17:25:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alejandro Lucero X-Patchwork-Id: 42742 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 68C971B438; Tue, 10 Jul 2018 19:26:09 +0200 (CEST) Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110]) by dpdk.org (Postfix) with ESMTP id 858361B3AE; Tue, 10 Jul 2018 19:26:06 +0200 (CEST) Received: from netronome.com (localhost [127.0.0.1]) by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w6AHPuxC007842; Tue, 10 Jul 2018 18:25:56 +0100 Received: (from alucero@localhost) by netronome.com (8.14.4/8.14.4/Submit) id w6AHPuvs007841; Tue, 10 Jul 2018 18:25:56 +0100 From: Alejandro Lucero To: dev@dpdk.org Cc: stable@dpdk.org, anatoly.burakov@intel.com Date: Tue, 10 Jul 2018 18:25:48 +0100 Message-Id: <1531243552-7795-2-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> References: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> Subject: [dpdk-dev] [PATCH v4 1/5] mem: add function for checking memsegs IOVAs addresses X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" A device can suffer addressing limitations. This function checks that all memseg IOVAs are within the supported range based on the DMA mask. A PMD should call it during initialization if its supported devices suffer addressing limitations, and return an error if the function reports memsegs out of range. 
Another potential usage is for emulated IOMMU hardware with addressing limitations. Applicable to v17.11.3 only. Signed-off-by: Alejandro Lucero Acked-by: Anatoly Burakov Acked-by: Eelco Chaudron --- lib/librte_eal/common/eal_common_memory.c | 48 ++++++++++++++++++++++++++++++ lib/librte_eal/common/include/rte_memory.h | 3 ++ lib/librte_eal/rte_eal_version.map | 1 + 3 files changed, 52 insertions(+) diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index fc6c44d..00ab393 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -109,6 +109,54 @@ } } +#if defined(RTE_ARCH_X86) +#define X86_VA_WIDTH 47 /* From Documentation/x86/x86_64/mm.txt */ +#define MAX_DMA_MASK_BITS X86_VA_WIDTH +#else +/* 63 bits is good enough for a sanity check */ +#define MAX_DMA_MASK_BITS 63 +#endif + +/* check memseg iovas are within the required range based on dma mask */ +int +rte_eal_check_dma_mask(uint8_t maskbits) +{ + + const struct rte_mem_config *mcfg; + uint64_t mask; + int i; + + /* sanity check */ + if (maskbits > MAX_DMA_MASK_BITS) { + RTE_LOG(INFO, EAL, "wrong dma mask size %u (Max: %u)\n", + maskbits, MAX_DMA_MASK_BITS); + return -1; + } + + /* create dma mask */ + mask = ~((1ULL << maskbits) - 1); + + /* get pointer to global configuration */ + mcfg = rte_eal_get_configuration()->mem_config; + + for (i = 0; i < RTE_MAX_MEMSEG; i++) { + if (mcfg->memseg[i].addr == NULL) + break; + + if (mcfg->memseg[i].iova & mask) { + RTE_LOG(INFO, EAL, + "memseg[%d] iova %"PRIx64" out of range:\n", + i, mcfg->memseg[i].iova); + + RTE_LOG(INFO, EAL, "\tusing dma mask %"PRIx64"\n", + mask); + return -1; + } + } + + return 0; +} + /* return the number of memory channels */ unsigned rte_memory_get_nchannel(void) { diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h index 80a8fc0..b2a0168 100644 --- a/lib/librte_eal/common/include/rte_memory.h +++ 
b/lib/librte_eal/common/include/rte_memory.h @@ -209,6 +209,9 @@ struct rte_memseg { */ unsigned rte_memory_get_nrank(void); +/* check memsegs iovas are within a range based on dma mask */ +int rte_eal_check_dma_mask(uint8_t maskbits); + /** * Drivers based on uio will not load unless physical * addresses are obtainable. It is only possible to get diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index f4f46c1..aa6cf87 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -184,6 +184,7 @@ DPDK_17.11 { rte_eal_create_uio_dev; rte_bus_get_iommu_class; + rte_eal_check_dma_mask; rte_eal_has_pci; rte_eal_iova_mode; rte_eal_mbuf_default_mempool_ops; From patchwork Tue Jul 10 17:25:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alejandro Lucero X-Patchwork-Id: 42743 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 98F351B44E; Tue, 10 Jul 2018 19:26:11 +0200 (CEST) Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110]) by dpdk.org (Postfix) with ESMTP id AD89D1B3BB; Tue, 10 Jul 2018 19:26:06 +0200 (CEST) Received: from netronome.com (localhost [127.0.0.1]) by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w6AHPuCN007846; Tue, 10 Jul 2018 18:25:56 +0100 Received: (from alucero@localhost) by netronome.com (8.14.4/8.14.4/Submit) id w6AHPuER007845; Tue, 10 Jul 2018 18:25:56 +0100 From: Alejandro Lucero To: dev@dpdk.org Cc: stable@dpdk.org, anatoly.burakov@intel.com Date: Tue, 10 Jul 2018 18:25:49 +0100 Message-Id: <1531243552-7795-3-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> References: 
<1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> Subject: [dpdk-dev] [PATCH v4 2/5] bus/pci: use IOVAs check when setting IOVA mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Although VT-d emulation currently supports only 39 bits, the memseg IOVAs may still fall within that supported range. This patch allows IOVA VA mode in such a case. Indeed, the memory initialization code can be modified to use lower virtual addresses than those the kernel uses by default for 64-bit processes, so memseg IOVAs can fit in 39 bits or less on most systems. This is very likely always the case for VMs. Applicable to v17.11.3 only. Signed-off-by: Alejandro Lucero --- drivers/bus/pci/linux/pci.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c index 74deef3..792c819 100644 --- a/drivers/bus/pci/linux/pci.c +++ b/drivers/bus/pci/linux/pci.c @@ -43,6 +43,7 @@ #include #include #include +#include #include "eal_private.h" #include "eal_filesystem.h" @@ -613,10 +614,12 @@ fclose(fp); mgaw = ((vtd_cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT) + 1; - if (mgaw < X86_VA_WIDTH) + + if (!rte_eal_check_dma_mask(mgaw)) + return true; + else return false; - return true; } #elif defined(RTE_ARCH_PPC_64) static bool @@ -640,13 +643,17 @@ { struct rte_pci_device *dev = NULL; struct rte_pci_driver *drv = NULL; + int iommu_dma_mask_check_done = 0; FOREACH_DRIVER_ON_PCIBUS(drv) { FOREACH_DEVICE_ON_PCIBUS(dev) { if (!rte_pci_match(drv, dev)) continue; - if (!pci_one_device_iommu_support_va(dev)) - return false; + if (!iommu_dma_mask_check_done) { + if (!pci_one_device_iommu_support_va(dev)) + return false; + iommu_dma_mask_check_done = 1; + } } } return true; From patchwork Tue Jul 10 17:25:50 2018 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alejandro Lucero X-Patchwork-Id: 42744 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 69B761B463; Tue, 10 Jul 2018 19:26:14 +0200 (CEST) Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110]) by dpdk.org (Postfix) with ESMTP id DD3561B426; Tue, 10 Jul 2018 19:26:06 +0200 (CEST) Received: from netronome.com (localhost [127.0.0.1]) by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w6AHPvdu007850; Tue, 10 Jul 2018 18:25:57 +0100 Received: (from alucero@localhost) by netronome.com (8.14.4/8.14.4/Submit) id w6AHPvae007849; Tue, 10 Jul 2018 18:25:57 +0100 From: Alejandro Lucero To: dev@dpdk.org Cc: stable@dpdk.org, anatoly.burakov@intel.com Date: Tue, 10 Jul 2018 18:25:50 +0100 Message-Id: <1531243552-7795-4-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> References: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> Subject: [dpdk-dev] [PATCH v4 3/5] mem: use address hint for mapping hugepages X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The Linux kernel uses a very high address as the starting address for serving mmap calls. If there are addressing limitations and the IOVA mode is VA, this starting address is likely too high for the affected devices. However, it is possible to use a lower address in the process virtual address space, since a 64-bit address space leaves plenty of room. This patch adds an address hint as the starting address for 64-bit systems. 
Applicable to v17.11.3 only. Signed-off-by: Alejandro Lucero Acked-by: Anatoly Burakov Acked-by: Eelco Chaudron --- lib/librte_eal/linuxapp/eal/eal_memory.c | 55 ++++++++++++++++++++++++++------ 1 file changed, 46 insertions(+), 9 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 17c20d4..2ed4017 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -88,6 +88,23 @@ static uint64_t baseaddr_offset; +#ifdef RTE_ARCH_64 +/* + * Linux kernel uses a really high address as starting address for serving + * mmap calls. If there exist addressing limitations and IOVA mode is VA, + * this starting address is likely too high for those devices. However, it + * is possible to use a lower address in the process virtual address space + * as with 64 bits there is a lot of available space. + * + * Current known limitations are 39 or 40 bits. Setting the starting address + * at 4GB implies there are 508GB or 1020GB for mapping the available + * hugepages. This is likely enough for most systems, although a device with + * addressing limitations should call rte_eal_check_dma_mask to ensure all + * memory is within the supported range. + */ +static uint64_t baseaddr = 0x100000000; +#endif + static bool phys_addrs_available = true; #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space" @@ -250,6 +267,23 @@ } } +static void * +get_addr_hint(void) +{ + if (internal_config.base_virtaddr != 0) { + return (void *) (uintptr_t) + (internal_config.base_virtaddr + + baseaddr_offset); + } else { +#ifdef RTE_ARCH_64 + return (void *) (uintptr_t) (baseaddr + + baseaddr_offset); +#else + return NULL; +#endif + } +} + /* * Try to mmap *size bytes in /dev/zero. If it is successful, return the * pointer to the mmap'd area and keep *size unmodified. 
Else, retry @@ -260,16 +294,10 @@ static void * get_virtual_area(size_t *size, size_t hugepage_sz) { - void *addr; + void *addr, *addr_hint; int fd; long aligned_addr; - if (internal_config.base_virtaddr != 0) { - addr = (void*) (uintptr_t) (internal_config.base_virtaddr + - baseaddr_offset); - } - else addr = NULL; - RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size); fd = open("/dev/zero", O_RDONLY); @@ -278,7 +306,9 @@ return NULL; } do { - addr = mmap(addr, + addr_hint = get_addr_hint(); + + addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ, #ifdef RTE_ARCH_PPC_64 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, @@ -286,8 +316,15 @@ MAP_PRIVATE, #endif fd, 0); - if (addr == MAP_FAILED) + if (addr == MAP_FAILED) { + /* map failed. Let's try with less memory */ *size -= hugepage_sz; + } else if (addr_hint && addr != addr_hint) { + /* hint was not used. Try with another offset */ + munmap(addr, (*size) + hugepage_sz); + addr = MAP_FAILED; + baseaddr_offset += 0x100000000; + } } while (addr == MAP_FAILED && *size > 0); if (addr == MAP_FAILED) { From patchwork Tue Jul 10 17:25:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alejandro Lucero X-Patchwork-Id: 42745 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 955BF1B46E; Tue, 10 Jul 2018 19:26:15 +0200 (CEST) Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110]) by dpdk.org (Postfix) with ESMTP id 224EA1B39E; Tue, 10 Jul 2018 19:26:07 +0200 (CEST) Received: from netronome.com (localhost [127.0.0.1]) by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w6AHPv36007854; Tue, 10 Jul 2018 18:25:57 +0100 Received: (from alucero@localhost) by netronome.com (8.14.4/8.14.4/Submit) id w6AHPvhL007853; Tue, 10 Jul 2018 18:25:57 +0100 
From: Alejandro Lucero To: dev@dpdk.org Cc: stable@dpdk.org, anatoly.burakov@intel.com Date: Tue, 10 Jul 2018 18:25:51 +0100 Message-Id: <1531243552-7795-5-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> References: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> Subject: [dpdk-dev] [PATCH v4 4/5] net/nfp: check hugepages IOVAs based on DMA mask X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" NFP devices cannot handle DMA addresses requiring more than 40 bits. This patch calls rte_eal_check_dma_mask with 40 bits and aborts device initialization if any memory is out of the NFP range. Applicable to v17.11.3 only. Signed-off-by: Alejandro Lucero Acked-by: Eelco Chaudron --- drivers/net/nfp/nfp_net.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c index d9cd047..8fc1b8f 100644 --- a/drivers/net/nfp/nfp_net.c +++ b/drivers/net/nfp/nfp_net.c @@ -2649,6 +2649,14 @@ uint32_t nfp_net_txq_full(struct nfp_net_txq *txq) pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev); + /* NFP can not handle DMA addresses requiring more than 40 bits */ + if (rte_eal_check_dma_mask(40) < 0) { + RTE_LOG(INFO, PMD, "device %s can not be used:", + pci_dev->device.name); + RTE_LOG(INFO, PMD, "\trestricted dma mask to 40 bits!\n"); + return -ENODEV; + } + if ((pci_dev->id.device_id == PCI_DEVICE_ID_NFP4000_PF_NIC) || (pci_dev->id.device_id == PCI_DEVICE_ID_NFP6000_PF_NIC)) { port = get_pf_port_number(eth_dev->data->name); From patchwork Tue Jul 10 17:25:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alejandro Lucero X-Patchwork-Id: 42746 X-Patchwork-Delegate: thomas@monjalon.net 
Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B49761B475; Tue, 10 Jul 2018 19:26:16 +0200 (CEST) Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110]) by dpdk.org (Postfix) with ESMTP id 4E48A1B3AE; Tue, 10 Jul 2018 19:26:07 +0200 (CEST) Received: from netronome.com (localhost [127.0.0.1]) by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w6AHPv52007858; Tue, 10 Jul 2018 18:25:57 +0100 Received: (from alucero@localhost) by netronome.com (8.14.4/8.14.4/Submit) id w6AHPvKQ007857; Tue, 10 Jul 2018 18:25:57 +0100 From: Alejandro Lucero To: dev@dpdk.org Cc: stable@dpdk.org, anatoly.burakov@intel.com Date: Tue, 10 Jul 2018 18:25:52 +0100 Message-Id: <1531243552-7795-6-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> References: <1531243552-7795-1-git-send-email-alejandro.lucero@netronome.com> Subject: [dpdk-dev] [PATCH v4 5/5] net/nfp: support IOVA VA mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" NFP can handle IOVA as VA. This requires checking that those IOVAs are within the supported range, which is done during initialization. Applicable to v17.11.3 only. 
Signed-off-by: Alejandro Lucero Acked-by: Eelco Chaudron --- drivers/net/nfp/nfp_net.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c index 8fc1b8f..98098f6 100644 --- a/drivers/net/nfp/nfp_net.c +++ b/drivers/net/nfp/nfp_net.c @@ -3053,14 +3053,16 @@ static int eth_nfp_pci_remove(struct rte_pci_device *pci_dev) static struct rte_pci_driver rte_nfp_net_pf_pmd = { .id_table = pci_id_nfp_pf_net_map, - .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC | + RTE_PCI_DRV_IOVA_AS_VA, .probe = nfp_pf_pci_probe, .remove = eth_nfp_pci_remove, }; static struct rte_pci_driver rte_nfp_net_vf_pmd = { .id_table = pci_id_nfp_vf_net_map, - .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC | + RTE_PCI_DRV_IOVA_AS_VA, .probe = eth_nfp_pci_probe, .remove = eth_nfp_pci_remove, };
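For reference, the masking logic behind the check added in patch 1/5 can be exercised outside DPDK. What follows is a minimal standalone sketch, not code from this series: check_iovas_fit() is a hypothetical helper that takes a plain array of addresses instead of walking mcfg->memseg[], and it assumes a fixed upper bound of 63 bits rather than the per-architecture MAX_DMA_MASK_BITS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not part of the series).
 * Returns 0 if every address in iovas[] fits in maskbits bits,
 * -1 if maskbits is out of range or any address has a bit set
 * at position maskbits or above.
 */
static int
check_iovas_fit(const uint64_t *iovas, size_t n, uint8_t maskbits)
{
	uint64_t mask;
	size_t i;

	assert(iovas != NULL || n == 0);

	/* assumed bound: shifting by 64 would be undefined behaviour */
	if (maskbits > 63)
		return -1;

	/* all bits at position maskbits and above are set in the mask */
	mask = ~((1ULL << maskbits) - 1);

	for (i = 0; i < n; i++) {
		if (iovas[i] & mask)
			return -1;
	}

	return 0;
}
```

With a 40-bit limit, as used for NFP in patch 4/5, the 4GB address hint from patch 3/5 passes this check, while any address at or above 1ULL << 40 fails it.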