From patchwork Thu Aug 30 15:21:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alejandro Lucero
X-Patchwork-Id: 44048
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 384725398;
	Thu, 30 Aug 2018 17:22:32 +0200 (CEST)
Received: from netronome.com (host-79-78-33-110.static.as9105.net [79.78.33.110])
	by dpdk.org (Postfix) with ESMTP id AFDC54CBD;
	Thu, 30 Aug 2018 17:22:27 +0200 (CEST)
Received: from netronome.com (localhost [127.0.0.1])
	by netronome.com (8.14.4/8.14.4/Debian-4.1ubuntu1) with ESMTP id w7UFLZcv021911;
	Thu, 30 Aug 2018 16:21:35 +0100
Received: (from alucero@localhost)
	by netronome.com (8.14.4/8.14.4/Submit) id w7UFLZti021910;
	Thu, 30 Aug 2018 16:21:35 +0100
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Thu, 30 Aug 2018 16:21:28 +0100
Message-Id: <1535642492-21831-2-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH 1/5] mem: add function for checking memsegs IOVAs addresses
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

A device can suffer from addressing limitations. This function checks
that memseg IOVAs are within the range supported by a given DMA mask.
A PMD whose devices have such limitations should call it during
initialization and fail the probe if any memseg is out of range.
Another potential user is emulated IOMMU hardware with addressing
limitations.
It is also necessary to save the most restrictive DMA mask seen, so that
memory allocated dynamically after initialization can be checked as well.

Signed-off-by: Alejandro Lucero
---
 lib/librte_eal/common/eal_common_memory.c         | 56 +++++++++++++++++++++++
 lib/librte_eal/common/include/rte_eal_memconfig.h |  3 ++
 lib/librte_eal/common/include/rte_memory.h        |  3 ++
 lib/librte_eal/common/malloc_heap.c               | 12 +++++
 lib/librte_eal/linuxapp/eal/eal.c                 |  2 +
 lib/librte_eal/rte_eal_version.map                |  1 +
 6 files changed, 77 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fbfb1b0..1e8312b 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -383,6 +383,62 @@ struct virtiova {
 	rte_memseg_walk(dump_memseg, f);
 }
 
+static int
+check_iova(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
+{
+	uint64_t *mask = arg;
+	rte_iova_t iova;
+
+	/* higher address within segment */
+	iova = (ms->iova + ms->len) - 1;
+	if (!(iova & *mask))
+		return 0;
+
+	RTE_LOG(INFO, EAL, "memseg iova %"PRIx64", len %zx, out of range\n",
+			ms->iova, ms->len);
+
+	RTE_LOG(INFO, EAL, "\tusing dma mask %"PRIx64"\n", *mask);
+	/* Stop the walk and change mask */
+	*mask = 0;
+	return 1;
+}
+
+#if defined(RTE_ARCH_64)
+#define MAX_DMA_MASK_BITS 63
+#else
+#define MAX_DMA_MASK_BITS 31
+#endif
+
+/* check memseg iovas are within the required range based on dma mask */
+int __rte_experimental
+rte_eal_check_dma_mask(uint8_t maskbits)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	uint64_t mask;
+
+	/* sanity check */
+	if (maskbits > MAX_DMA_MASK_BITS) {
+		RTE_LOG(INFO, EAL, "wrong dma mask size %u (Max: %u)\n",
+				maskbits, MAX_DMA_MASK_BITS);
+		return -1;
+	}
+
+	/* keep the more restricted maskbit */
+	if (!mcfg->dma_maskbits || maskbits < mcfg->dma_maskbits)
+		mcfg->dma_maskbits = maskbits;
+
+	/* create dma mask: all bits above maskbits set */
+	mask = ~((1ULL << maskbits) - 1);
+
+	rte_memseg_walk(check_iova, &mask);
+
+	if (!mask)
+		return -1;
+
+	return 0;
+}
+
 /* return the number of memory channels */
 unsigned rte_memory_get_nchannel(void)
 {
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index aff0688..aea44cb 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -77,6 +77,9 @@ struct rte_mem_config {
 	 * exact same address the primary process maps it.
 	 */
 	uint64_t mem_cfg_addr;
+
+	/* keeps the more restricted dma mask */
+	uint8_t dma_maskbits;
 } __attribute__((__packed__));
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index c4b7f4c..cd439e3 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -357,6 +357,9 @@ typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
 */
 unsigned rte_memory_get_nrank(void);
 
+/* check memsegs iovas are within a range based on dma mask */
+int rte_eal_check_dma_mask(uint8_t maskbits);
+
 /**
  * Drivers based on uio will not load unless physical
  * addresses are obtainable. It is only possible to get
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 12aaf2d..255d717 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -259,11 +259,12 @@ struct malloc_elem *
 		int socket, unsigned int flags, size_t align, size_t bound,
 		bool contig, struct rte_memseg **ms, int n_segs)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct rte_memseg_list *msl;
 	struct malloc_elem *elem = NULL;
 	size_t alloc_sz;
 	int allocd_pages;
 	void *ret, *map_addr;
 
 	alloc_sz = (size_t)pg_sz * n_segs;
@@ -291,6 +292,17 @@ struct malloc_elem *
 		goto fail;
 	}
 
+	/* rte_eal_check_dma_mask() takes the number of mask bits,
+	 * not the mask itself */
+	if (mcfg->dma_maskbits) {
+		if (rte_eal_check_dma_mask(mcfg->dma_maskbits)) {
+			RTE_LOG(DEBUG, EAL,
+				"%s(): couldn't allocate memory due to DMA mask\n",
+				__func__);
+			goto fail;
+		}
+	}
+
 	/* add newly minted memsegs to malloc heap */
 	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e59ac65..616723e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -263,6 +263,8 @@ enum rte_iova_mode
 	 * processes could later map the config into this exact location */
 	rte_config.mem_config->mem_cfg_addr = (uintptr_t) rte_mem_cfg_addr;
 
+	rte_config.mem_config->dma_maskbits = 0;
+
 }
 
 /* attach to an existing shared memory config */
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 344a43d..85e6212 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -284,6 +284,7 @@ EXPERIMENTAL {
 	rte_devargs_parsef;
 	rte_devargs_remove;
 	rte_devargs_type_count;
+	rte_eal_check_dma_mask;
 	rte_eal_cleanup;
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;

From patchwork Thu Aug 30 15:21:29 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alejandro Lucero
X-Patchwork-Id: 44049
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Thu, 30 Aug 2018 16:21:29 +0100
Message-Id: <1535642492-21831-3-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH 2/5] mem: use address hint for mapping hugepages

The Linux kernel uses a really high address as the starting address for
serving mmap calls. If there are addressing limitations and the IOVA
mode is VA, this starting address is likely too high for those devices.
However, it is possible to use a lower address in the process virtual
address space, as with 64 bits there is a lot of available space.

This patch adds an address hint as the starting address for 64-bit
systems.
Signed-off-by: Alejandro Lucero
---
 lib/librte_eal/common/eal_common_memory.c | 35 ++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 1e8312b..914f1d8 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -37,6 +37,23 @@
 static void *next_baseaddr;
 static uint64_t system_page_sz;
 
+#ifdef RTE_ARCH_64
+/*
+ * Linux kernel uses a really high address as starting address for serving
+ * mmaps calls. If there exist addressing limitations and IOVA mode is VA,
+ * this starting address is likely too high for those devices. However, it
+ * is possible to use a lower address in the process virtual address space
+ * as with 64 bits there is a lot of available space.
+ *
+ * Current known limitations are 39 or 40 bits. Setting the starting address
+ * at 4GB implies there are 508GB or 1020GB for mapping the available
+ * hugepages. This is likely enough for most systems, although a device with
+ * addressing limitations should call rte_eal_check_dma_mask for ensuring all
+ * memory is within supported range.
+ */
+static uint64_t baseaddr = 0x100000000;
+#endif
+
 void *
 eal_get_virtual_area(void *requested_addr, size_t *size,
 		size_t page_sz, int flags, int mmap_flags)
@@ -60,6 +77,11 @@
 	    rte_eal_process_type() == RTE_PROC_PRIMARY)
 		next_baseaddr = (void *) internal_config.base_virtaddr;
 
+#ifdef RTE_ARCH_64
+	if (next_baseaddr == NULL && internal_config.base_virtaddr == 0 &&
+	    rte_eal_process_type() == RTE_PROC_PRIMARY)
+		next_baseaddr = (void *) baseaddr;
+#endif
 	if (requested_addr == NULL && next_baseaddr != NULL) {
 		requested_addr = next_baseaddr;
 		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
@@ -89,9 +111,20 @@
 		mapped_addr = mmap(requested_addr, (size_t)map_sz, PROT_READ,
 				mmap_flags, -1, 0);
+
 		if (mapped_addr == MAP_FAILED && allow_shrink)
 			*size -= page_sz;
-	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+		if (mapped_addr != MAP_FAILED && addr_is_hint &&
+		    mapped_addr != requested_addr) {
+			/* hint was not used. Try with another offset */
+			munmap(mapped_addr, map_sz);
+			mapped_addr = MAP_FAILED;
+			next_baseaddr = RTE_PTR_ADD(next_baseaddr, 0x100000000);
+			requested_addr = next_baseaddr;
+		}
+	} while ((allow_shrink || addr_is_hint) &&
+		mapped_addr == MAP_FAILED && *size > 0);
 
 	/* align resulting address - if map failed, we will ignore the value
	 * anyway, so no need to add additional checks.
From patchwork Thu Aug 30 15:21:30 2018
X-Patchwork-Submitter: Alejandro Lucero
X-Patchwork-Id: 44050
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Thu, 30 Aug 2018 16:21:30 +0100
Message-Id: <1535642492-21831-4-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH 3/5] bus/pci: use IOVAs check when setting IOVA mode

Although VT-d emulation currently only supports 39 bits, the IOVAs in
use could still be within that supported range. This patch allows IOVA
VA mode in such a case. Indeed, the memory initialization code can be
modified to use lower virtual addresses than those the kernel gives
64-bit processes by default, so memseg IOVAs can fit within 39 bits or
fewer on most systems. For VMs this is almost certainly the case.
Signed-off-by: Alejandro Lucero
---
 drivers/bus/pci/linux/pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 04648ac..215dc10 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -588,10 +588,11 @@
 	fclose(fp);
 
 	mgaw = ((vtd_cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT) + 1;
-	if (mgaw < X86_VA_WIDTH)
-		return false;
 
-	return true;
+	return rte_eal_check_dma_mask(mgaw) == 0;
 }
 #elif defined(RTE_ARCH_PPC_64)
 static bool
@@ -615,13 +616,17 @@
 {
 	struct rte_pci_device *dev = NULL;
 	struct rte_pci_driver *drv = NULL;
+	int iommu_dma_mask_check_done = 0;
 
 	FOREACH_DRIVER_ON_PCIBUS(drv) {
 		FOREACH_DEVICE_ON_PCIBUS(dev) {
 			if (!rte_pci_match(drv, dev))
 				continue;
-			if (!pci_one_device_iommu_support_va(dev))
-				return false;
+			if (!iommu_dma_mask_check_done) {
+				if (!pci_one_device_iommu_support_va(dev))
+					return false;
+				iommu_dma_mask_check_done = 1;
+			}
 		}
 	}
 	return true;

From patchwork Thu Aug 30 15:21:31 2018
X-Patchwork-Submitter: Alejandro Lucero
X-Patchwork-Id: 44051
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Thu, 30 Aug 2018 16:21:31 +0100
Message-Id: <1535642492-21831-5-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH 4/5] net/nfp: check hugepages IOVAs based on DMA mask

NFP devices cannot handle DMA addresses requiring more than 40 bits.
This patch uses rte_eal_check_dma_mask with 40 bits and avoids device
initialization if any memseg is out of the NFP addressable range.

Signed-off-by: Alejandro Lucero
---
 drivers/net/nfp/nfp_net.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 6e5e305..aedca31 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2688,6 +2688,14 @@ uint32_t nfp_net_txq_full(struct nfp_net_txq *txq)
 
 	pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
 
+	/* NFP can not handle DMA addresses requiring more than 40 bits */
+	if (rte_eal_check_dma_mask(40) < 0) {
+		RTE_LOG(INFO, PMD,
+			"device %s can not be used: restricted dma mask to 40 bits!\n",
+			pci_dev->device.name);
+		return -ENODEV;
+	}
+
 	if ((pci_dev->id.device_id == PCI_DEVICE_ID_NFP4000_PF_NIC) ||
 	    (pci_dev->id.device_id == PCI_DEVICE_ID_NFP6000_PF_NIC)) {
 		port = get_pf_port_number(eth_dev->data->name);

From patchwork Thu Aug 30 15:21:32 2018
X-Patchwork-Submitter: Alejandro Lucero
X-Patchwork-Id: 44052
From: Alejandro Lucero
To: dev@dpdk.org
Cc: stable@dpdk.org
Date: Thu, 30 Aug 2018 16:21:32 +0100
Message-Id: <1535642492-21831-6-git-send-email-alejandro.lucero@netronome.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
References: <1535642492-21831-1-git-send-email-alejandro.lucero@netronome.com>
Subject: [dpdk-dev] [PATCH 5/5] net/nfp: support IOVA VA mode

The NFP can handle IOVA as VA. This requires checking that those IOVAs
are within the supported range, which is done during initialization.
Signed-off-by: Alejandro Lucero
---
 drivers/net/nfp/nfp_net.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index aedca31..1a7d58c 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -3273,14 +3273,16 @@ static int eth_nfp_pci_remove(struct rte_pci_device *pci_dev)
 
 static struct rte_pci_driver rte_nfp_net_pf_pmd = {
 	.id_table = pci_id_nfp_pf_net_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		     RTE_PCI_DRV_IOVA_AS_VA,
 	.probe = nfp_pf_pci_probe,
 	.remove = eth_nfp_pci_remove,
 };
 
 static struct rte_pci_driver rte_nfp_net_vf_pmd = {
 	.id_table = pci_id_nfp_vf_net_map,
-	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		     RTE_PCI_DRV_IOVA_AS_VA,
 	.probe = eth_nfp_pci_probe,
 	.remove = eth_nfp_pci_remove,
 };