Message ID | 1450269064-23608-1-git-send-email-david.marchand@6wind.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Thomas Monjalon |
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id D9EA78E79; Wed, 16 Dec 2015 13:31:38 +0100 (CET) Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44]) by dpdk.org (Postfix) with ESMTP id 20F7E952 for <dev@dpdk.org>; Wed, 16 Dec 2015 13:31:37 +0100 (CET) Received: by mail-wm0-f44.google.com with SMTP id l126so36322454wml.0 for <dev@dpdk.org>; Wed, 16 Dec 2015 04:31:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=p+AP+0v43MyYzdn9B625MN8PBQ+RljRXbZ485qE37a4=; b=xK9viWsdvHPUb9hORMn6dig7rmcdVe+yNUt10XPMhXk/DHJQbd5DgDDtJm+oFV0qwp SlmjGqz/kfuWsQerXw6zjTCaHAJ61UJBGY73Vcixo/i7BMnoA64fe57emQyvw7jINjyJ lZolsXeFMNcqHOuMHy3+81/HZxyejbQ2ASJR04axmFAa1l+DgNwEYm9vhAdq7F8Pz6rt 5iJTWDWLFj8xDu+vTopLPYIqHoBdoH+GzhLhNswIwxInsRvUyDbTazN/OimmETzbpMIO acDhqe956FiwThAEgJe7dHIbvM+xov1IuH80oXTa3AM3TyIsIJD56cl3Tz4mtFMuNz5x QwBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=p+AP+0v43MyYzdn9B625MN8PBQ+RljRXbZ485qE37a4=; b=QCREqARcQqBkrfnj0LBF4QeMk9p/M+CG7OUnEVMbnd6pElMqdff4YLf5l5OyGXbTx7 c+a2G6sy5r7EGLofhY2JBSphVcpvN9IJTwT2yYQhH8WqUKAIacadjrlun/mWOmTb5tT+ YMMGamcri+MgBwzt7WlRGX0FHlAdNxhX88GzXlne1YLhaFLRgSQr2Z+9A53fhhv+lg20 mOx6+1wLt42UoSZLm+DwillCWQ09fV1+IRvBcyLLfZqQQrsyREU3YR1QVG5/McvgBarn b1P9j7P2KByn7BfyUPALdsf4BXlkoeNVAaZhcIVtkPNQPdl/1i7v0+lnG8E/unbehmKb Zojg== X-Gm-Message-State: ALoCoQndg4UsMwrpaUcSDTdn4o9vOMMJovozgaEUZz/MF6eRNG7TZhk05ZVIxheIz6dTA2QJNlBlRFnbcNofkCxIr9R15DjFVw== X-Received: by 10.28.48.194 with SMTP id w185mr4306024wmw.73.1450269096899; Wed, 16 Dec 2015 04:31:36 -0800 (PST) Received: from alcyon.dev.6wind.com (guy78-3-82-239-227-177.fbx.proxad.net. [82.239.227.177]) by smtp.gmail.com with ESMTPSA id qm9sm5807683wjc.39.2015.12.16.04.31.35 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 16 Dec 2015 04:31:36 -0800 (PST) From: David Marchand <david.marchand@6wind.com> To: sshukla@mvista.com Date: Wed, 16 Dec 2015 13:31:04 +0100 Message-Id: <1450269064-23608-1-git-send-email-david.marchand@6wind.com> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <CAAyOgsaMO7V1K8Yxh=Zf-E4iodDevMFG+rRBPgZXBysca9JopA@mail.gmail.com> References: <CAAyOgsaMO7V1K8Yxh=Zf-E4iodDevMFG+rRBPgZXBysca9JopA@mail.gmail.com> Cc: dev@dpdk.org Subject: [dpdk-dev] [PATCH] eal: map io resources for non x86 architectures X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK <dev.dpdk.org> List-Unsubscribe: <http://dpdk.org/ml/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://dpdk.org/ml/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <http://dpdk.org/ml/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Commit Message
David Marchand
Dec. 16, 2015, 12:31 p.m. UTC
x86 requires a special set of instructions to access ioports, but other
architectures let you remap io resources.
So let eal remap io resources by accepting IORESOURCE_IO flag for
architectures other than x86.
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
lib/librte_eal/common/include/rte_pci.h | 3 ++-
lib/librte_eal/linuxapp/eal/eal_pci.c | 21 +++++++++++++++------
2 files changed, 17 insertions(+), 7 deletions(-)
Comments
On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > x86 requires a special set of instructions to access ioports, but other > architectures let you remap io resources. > So let eal remap io resources by accepting IORESOURCE_IO flag for > architectures other than x86. One question: this patch could be a replacement of the igbuio_iomap patch from Santosh? If so, I like it: It's more elegant. --yliu > > Signed-off-by: David Marchand <david.marchand@6wind.com> > --- > lib/librte_eal/common/include/rte_pci.h | 3 ++- > lib/librte_eal/linuxapp/eal/eal_pci.c | 21 +++++++++++++++------ > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h > index 334c12e..8aaab4a 100644 > --- a/lib/librte_eal/common/include/rte_pci.h > +++ b/lib/librte_eal/common/include/rte_pci.h > @@ -105,7 +105,8 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices. > /** Nb. of values in PCI resource format. */ > #define PCI_RESOURCE_FMT_NVAL 3 > > -/** IO resource type: memory address space */ > +/** IO resource type: */ > +#define IORESOURCE_IO 0x00000100 > #define IORESOURCE_MEM 0x00000200 > > /** > diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c > index bc5b5be..9c4651d 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_pci.c > +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c > @@ -236,12 +236,21 @@ pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev) > goto error; > } > > - if (flags & IORESOURCE_MEM) { > - dev->mem_resource[i].phys_addr = phys_addr; > - dev->mem_resource[i].len = end_addr - phys_addr + 1; > - /* not mapped for now */ > - dev->mem_resource[i].addr = NULL; > - } > + /* we only care about IORESOURCE_IO or IORESOURCE_MEM */ > + if (!(flags & IORESOURCE_IO) && > + !(flags & IORESOURCE_MEM)) > + continue; > + > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) > + /* x86 can not remap ioports, so skip it, remapping code will > + * look at dev->mem_resource[i].phys_addr == 0 and skip it */ > + if (flags & IORESOURCE_IO) > + continue; > +#endif > + dev->mem_resource[i].phys_addr = phys_addr; > + dev->mem_resource[i].len = end_addr - phys_addr + 1; > + /* not mapped for now */ > + dev->mem_resource[i].addr = NULL; > } > fclose(f); > return 0; > -- > 1.7.10.4
On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > x86 requires a special set of instructions to access ioports, but other > architectures let you remap io resources. > So let eal remap io resources by accepting IORESOURCE_IO flag for > architectures other than x86. > > Signed-off-by: David Marchand <david.marchand@6wind.com> > --- > lib/librte_eal/common/include/rte_pci.h | 3 ++- > lib/librte_eal/linuxapp/eal/eal_pci.c | 21 +++++++++++++++------ > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h > index 334c12e..8aaab4a 100644 > --- a/lib/librte_eal/common/include/rte_pci.h > +++ b/lib/librte_eal/common/include/rte_pci.h > @@ -105,7 +105,8 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices. > /** Nb. of values in PCI resource format. */ > #define PCI_RESOURCE_FMT_NVAL 3 > > -/** IO resource type: memory address space */ > +/** IO resource type: */ > +#define IORESOURCE_IO 0x00000100 > #define IORESOURCE_MEM 0x00000200 > > /** > diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c > index bc5b5be..9c4651d 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_pci.c > +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c > @@ -236,12 +236,21 @@ pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev) > goto error; > } > > - if (flags & IORESOURCE_MEM) { > - dev->mem_resource[i].phys_addr = phys_addr; > - dev->mem_resource[i].len = end_addr - phys_addr + 1; > - /* not mapped for now */ > - dev->mem_resource[i].addr = NULL; > - } > + /* we only care about IORESOURCE_IO or IORESOURCE_MEM */ > + if (!(flags & IORESOURCE_IO) && > + !(flags & IORESOURCE_MEM)) > + continue; > + > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) > + /* x86 can not remap ioports, so skip it, remapping code will > + * look at dev->mem_resource[i].phys_addr == 0 and skip it */ > + if (flags & IORESOURCE_IO) > + continue; > +#endif As a tangential comment: We maybe could look to make certain preprocessor defines available as C globals as well. There is no reason that the ifdef here could not be implemented as a runtime check in C code. /Bruce
Bruce, On Wed, Dec 16, 2015 at 2:15 PM, Bruce Richardson < bruce.richardson@intel.com> wrote: > > > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) > > + /* x86 can not remap ioports, so skip it, remapping code > will > > + * look at dev->mem_resource[i].phys_addr == 0 and skip it > */ > > + if (flags & IORESOURCE_IO) > > + continue; > > +#endif > > As a tangential comment: We maybe could look to make certain preprocessor > defines available as C globals as well. There is no reason that the ifdef > here > could not be implemented as a runtime check in C code. > > Well, instead of having the same information as the preprocessor define, maybe some capability per arch/cpu would be better "arch supports io remap". Maybe we can extend the cpuflags ?
Yuanhan, On Wed, Dec 16, 2015 at 1:48 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > > x86 requires a special set of instructions to access ioports, but other > > architectures let you remap io resources. > > So let eal remap io resources by accepting IORESOURCE_IO flag for > > architectures other than x86. > > One question: this patch could be a replacement of the igbuio_iomap patch > from Santosh? If so, I like it: It's more elegant. > Well, yes, unless I missed something since I am no guru :-).
On Wed, Dec 16, 2015 at 02:34:35PM +0100, David Marchand wrote: > Yuanhan, > > On Wed, Dec 16, 2015 at 1:48 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> > wrote: > > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > > x86 requires a special set of instructions to access ioports, but other > > architectures let you remap io resources. > > So let eal remap io resources by accepting IORESOURCE_IO flag for > > architectures other than x86. > > One question: this patch could be a replacement of the igbuio_iomap patch > from Santosh? If so, I like it: It's more elegant. > > > Well, yes, unless I missed something since I am no guru :-). Great then. If there is a test-by, I could give my Ack :) (I have no arm or other platform for testing). --yliu
On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >> x86 requires a special set of instructions to access ioports, but other >> architectures let you remap io resources. >> So let eal remap io resources by accepting IORESOURCE_IO flag for >> architectures other than x86. > > One question: this patch could be a replacement of the igbuio_iomap patch > from Santosh? If so, I like it: It's more elegant. > > --yliu > I did tried similar in past but not in parse_sysfs (such that mem.resource_addr to accept IO_RESOURCE_IO types) and observed that pci_map_resource not able to map address hence segfault at tespmd initialization. i was getting these: EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): Invalid argument (0xffffffffffffffff) after enabling RTE_PCI_NEED_DRV_MAPPING flags in virtio_ethdev. I guess patch assume that flag enabled for driver right? >> >> Signed-off-by: David Marchand <david.marchand@6wind.com> >> --- >> lib/librte_eal/common/include/rte_pci.h | 3 ++- >> lib/librte_eal/linuxapp/eal/eal_pci.c | 21 +++++++++++++++------ >> 2 files changed, 17 insertions(+), 7 deletions(-) >> >> diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h >> index 334c12e..8aaab4a 100644 >> --- a/lib/librte_eal/common/include/rte_pci.h >> +++ b/lib/librte_eal/common/include/rte_pci.h >> @@ -105,7 +105,8 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices. >> /** Nb. of values in PCI resource format. */ >> #define PCI_RESOURCE_FMT_NVAL 3 >> >> -/** IO resource type: memory address space */ >> +/** IO resource type: */ >> +#define IORESOURCE_IO 0x00000100 >> #define IORESOURCE_MEM 0x00000200 >> >> /** >> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c >> index bc5b5be..9c4651d 100644 >> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c >> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c >> @@ -236,12 +236,21 @@ pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev) >> goto error; >> } >> >> - if (flags & IORESOURCE_MEM) { >> - dev->mem_resource[i].phys_addr = phys_addr; >> - dev->mem_resource[i].len = end_addr - phys_addr + 1; >> - /* not mapped for now */ >> - dev->mem_resource[i].addr = NULL; >> - } >> + /* we only care about IORESOURCE_IO or IORESOURCE_MEM */ >> + if (!(flags & IORESOURCE_IO) && >> + !(flags & IORESOURCE_MEM)) >> + continue; >> + >> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) >> + /* x86 can not remap ioports, so skip it, remapping code will >> + * look at dev->mem_resource[i].phys_addr == 0 and skip it */ >> + if (flags & IORESOURCE_IO) >> + continue; >> +#endif >> + dev->mem_resource[i].phys_addr = phys_addr; >> + dev->mem_resource[i].len = end_addr - phys_addr + 1; >> + /* not mapped for now */ >> + dev->mem_resource[i].addr = NULL; >> } >> fclose(f); >> return 0; >> -- >> 1.7.10.4
On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: > On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu > <yuanhan.liu@linux.intel.com> wrote: > > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > >> x86 requires a special set of instructions to access ioports, but other > >> architectures let you remap io resources. > >> So let eal remap io resources by accepting IORESOURCE_IO flag for > >> architectures other than x86. > > > > One question: this patch could be a replacement of the igbuio_iomap patch > > from Santosh? If so, I like it: It's more elegant. > > > > --yliu > > > > I did tried similar in past but not in parse_sysfs (such that > mem.resource_addr to accept IO_RESOURCE_IO types) and observed that > pci_map_resource not able to map address hence segfault at tespmd > initialization. > > i was getting these: > EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): > Invalid argument (0xffffffffffffffff) That's because ARM (at least the kernel) doesn't allow an IO map: arch/arm/kernel/bios32.c ------------------------ 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, 619 enum pci_mmap_state mmap_state, int write_combine) 620 { 621 if (mmap_state == pci_mmap_io) 622 return -EINVAL; And with a quick glimpse of powerpc, I see no such limitation. Hence, this peice of code may work only on Powerpc platform (and maybe a few others we don't care). So, apparently, this will not work for ARM. --yliu
On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: >> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu >> <yuanhan.liu@linux.intel.com> wrote: >> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >> >> x86 requires a special set of instructions to access ioports, but other >> >> architectures let you remap io resources. >> >> So let eal remap io resources by accepting IORESOURCE_IO flag for >> >> architectures other than x86. >> > >> > One question: this patch could be a replacement of the igbuio_iomap patch >> > from Santosh? If so, I like it: It's more elegant. >> > >> > --yliu >> > >> >> I did tried similar in past but not in parse_sysfs (such that >> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that >> pci_map_resource not able to map address hence segfault at tespmd >> initialization. >> >> i was getting these: >> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): >> Invalid argument (0xffffffffffffffff) > > That's because ARM (at least the kernel) doesn't allow an IO map: > > arch/arm/kernel/bios32.c > ------------------------ > 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, > 619 enum pci_mmap_state mmap_state, int write_combine) > 620 { > 621 if (mmap_state == pci_mmap_io) > 622 return -EINVAL; > > And with a quick glimpse of powerpc, I see no such limitation. Hence, > this peice of code may work only on Powerpc platform (and maybe a few > others we don't care). > > So, apparently, this will not work for ARM. > Right and I did shared detailed explanation on why it wont work on this link [1], infact this patch shouldn;t work for mips too. As I mentioned earlier I did tried similar approach and so to get everything working like iomem is currently in dpdk; we need to add something like pci_remap_iospace --> ioremap_page_range() but this api not really pci_mmap_page_range types. user need to write more code on top so to use this api efficiently, also this api looks like meant to use by arch file only in kernel space. > --yliu
On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: > On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu > <yuanhan.liu@linux.intel.com> wrote: >> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: >>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu >>> <yuanhan.liu@linux.intel.com> wrote: >>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >>> >> x86 requires a special set of instructions to access ioports, but other >>> >> architectures let you remap io resources. >>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for >>> >> architectures other than x86. >>> > >>> > One question: this patch could be a replacement of the igbuio_iomap patch >>> > from Santosh? If so, I like it: It's more elegant. >>> > >>> > --yliu >>> > >>> >>> I did tried similar in past but not in parse_sysfs (such that >>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that >>> pci_map_resource not able to map address hence segfault at tespmd >>> initialization. >>> >>> i was getting these: >>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): >>> Invalid argument (0xffffffffffffffff) >> >> That's because ARM (at least the kernel) doesn't allow an IO map: >> >> arch/arm/kernel/bios32.c >> ------------------------ >> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, >> 619 enum pci_mmap_state mmap_state, int write_combine) >> 620 { >> 621 if (mmap_state == pci_mmap_io) >> 622 return -EINVAL; >> >> And with a quick glimpse of powerpc, I see no such limitation. Hence, >> this peice of code may work only on Powerpc platform (and maybe a few >> others we don't care). >> >> So, apparently, this will not work for ARM. >> > > Right and I did shared detailed explanation on why it wont work on > this link [1], infact this patch shouldn;t work for mips too. > > As I mentioned earlier I did tried similar approach and so to get > everything working like iomem is currently in dpdk; we need to add > something like pci_remap_iospace --> ioremap_page_range() but this api > not really pci_mmap_page_range types. user need to write more code on > top so to use this api efficiently, also this api looks like meant to > use by arch file only in kernel space. > > missed link; [1] http://dpdk.org/dev/patchwork/patch/9365/ >> --yliu
On Thu, Dec 17, 2015 at 3:32 PM, Santosh Shukla <sshukla@mvista.com> wrote: > On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: >> On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu >> <yuanhan.liu@linux.intel.com> wrote: >>> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: >>>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu >>>> <yuanhan.liu@linux.intel.com> wrote: >>>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >>>> >> x86 requires a special set of instructions to access ioports, but other >>>> >> architectures let you remap io resources. >>>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for >>>> >> architectures other than x86. >>>> > >>>> > One question: this patch could be a replacement of the igbuio_iomap patch >>>> > from Santosh? If so, I like it: It's more elegant. >>>> > >>>> > --yliu >>>> > >>>> >>>> I did tried similar in past but not in parse_sysfs (such that >>>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that >>>> pci_map_resource not able to map address hence segfault at tespmd >>>> initialization. >>>> >>>> i was getting these: >>>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): >>>> Invalid argument (0xffffffffffffffff) >>> >>> That's because ARM (at least the kernel) doesn't allow an IO map: >>> >>> arch/arm/kernel/bios32.c >>> ------------------------ >>> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, >>> 619 enum pci_mmap_state mmap_state, int write_combine) >>> 620 { >>> 621 if (mmap_state == pci_mmap_io) >>> 622 return -EINVAL; >>> >>> And with a quick glimpse of powerpc, I see no such limitation. Hence, >>> this peice of code may work only on Powerpc platform (and maybe a few >>> others we don't care). >>> >>> So, apparently, this will not work for ARM. >>> >> >> Right and I did shared detailed explanation on why it wont work on >> this link [1], infact this patch shouldn;t work for mips too. >> >> As I mentioned earlier I did tried similar approach and so to get >> everything working like iomem is currently in dpdk; we need to add >> something like pci_remap_iospace --> ioremap_page_range() but this api >> not really pci_mmap_page_range types. user need to write more code on >> top so to use this api efficiently, also this api looks like meant to >> use by arch file only in kernel space. >> >> > missed link; > > [1] http://dpdk.org/dev/patchwork/patch/9365/ > IMO, it is worth keeping one special device file who could work across archs like arm/arm64/powerpc and others, who could map iopci bar to dpdk user-space. also this approach has no kernel version dependency too. BTW; I did mentioned in second approach in to add /dev/ioport interface in drivers/char/mem.c which could read more than byte in one single operation, but that has kernel dependency. However that approach too is arch agnostic. Let me know! >>> --yliu
Hi, 2015-12-17 15:37, Santosh Shukla: > On Thu, Dec 17, 2015 at 3:32 PM, Santosh Shukla <sshukla@mvista.com> wrote: > > On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: > >> On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu > >> <yuanhan.liu@linux.intel.com> wrote: > >>> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: > >>>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu > >>>> <yuanhan.liu@linux.intel.com> wrote: > >>>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > >>>> >> x86 requires a special set of instructions to access ioports, but other > >>>> >> architectures let you remap io resources. > >>>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for > >>>> >> architectures other than x86. > >>>> > > >>>> > One question: this patch could be a replacement of the igbuio_iomap patch > >>>> > from Santosh? If so, I like it: It's more elegant. > >>>> > > >>>> > --yliu > >>>> > > >>>> > >>>> I did tried similar in past but not in parse_sysfs (such that > >>>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that > >>>> pci_map_resource not able to map address hence segfault at tespmd > >>>> initialization. > >>>> > >>>> i was getting these: > >>>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): > >>>> Invalid argument (0xffffffffffffffff) > >>> > >>> That's because ARM (at least the kernel) doesn't allow an IO map: > >>> > >>> arch/arm/kernel/bios32.c > >>> ------------------------ > >>> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, > >>> 619 enum pci_mmap_state mmap_state, int write_combine) > >>> 620 { > >>> 621 if (mmap_state == pci_mmap_io) > >>> 622 return -EINVAL; > >>> > >>> And with a quick glimpse of powerpc, I see no such limitation. Hence, > >>> this peice of code may work only on Powerpc platform (and maybe a few > >>> others we don't care). > >>> > >>> So, apparently, this will not work for ARM. > >>> > >> > >> Right and I did shared detailed explanation on why it wont work on > >> this link [1], infact this patch shouldn;t work for mips too. > >> > >> As I mentioned earlier I did tried similar approach and so to get > >> everything working like iomem is currently in dpdk; we need to add > >> something like pci_remap_iospace --> ioremap_page_range() but this api > >> not really pci_mmap_page_range types. user need to write more code on > >> top so to use this api efficiently, also this api looks like meant to > >> use by arch file only in kernel space. > >> > >> > > missed link; > > > > [1] http://dpdk.org/dev/patchwork/patch/9365/ > > > > IMO, it is worth keeping one special device file who could work across > archs like arm/arm64/powerpc and others, who could map iopci bar to > dpdk user-space. also this approach has no kernel version dependency > too. BTW; I did mentioned in second approach in to add /dev/ioport > interface in drivers/char/mem.c which could read more than byte in one > single operation, but that has kernel dependency. However that > approach too is arch agnostic. Your first approach use an out-of-tree kernel module (igb_uio), so we cannot really say there is no kernel dependency. We should try to remove the need for any out-of-tree kernel module. That's why the Linux upstream approach is a better solution.
On Thu, Dec 17, 2015 at 3:44 PM, Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > Hi, > > 2015-12-17 15:37, Santosh Shukla: >> On Thu, Dec 17, 2015 at 3:32 PM, Santosh Shukla <sshukla@mvista.com> wrote: >> > On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: >> >> On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu >> >> <yuanhan.liu@linux.intel.com> wrote: >> >>> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: >> >>>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu >> >>>> <yuanhan.liu@linux.intel.com> wrote: >> >>>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >> >>>> >> x86 requires a special set of instructions to access ioports, but other >> >>>> >> architectures let you remap io resources. >> >>>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for >> >>>> >> architectures other than x86. >> >>>> > >> >>>> > One question: this patch could be a replacement of the igbuio_iomap patch >> >>>> > from Santosh? If so, I like it: It's more elegant. >> >>>> > >> >>>> > --yliu >> >>>> > >> >>>> >> >>>> I did tried similar in past but not in parse_sysfs (such that >> >>>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that >> >>>> pci_map_resource not able to map address hence segfault at tespmd >> >>>> initialization. >> >>>> >> >>>> i was getting these: >> >>>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): >> >>>> Invalid argument (0xffffffffffffffff) >> >>> >> >>> That's because ARM (at least the kernel) doesn't allow an IO map: >> >>> >> >>> arch/arm/kernel/bios32.c >> >>> ------------------------ >> >>> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, >> >>> 619 enum pci_mmap_state mmap_state, int write_combine) >> >>> 620 { >> >>> 621 if (mmap_state == pci_mmap_io) >> >>> 622 return -EINVAL; >> >>> >> >>> And with a quick glimpse of powerpc, I see no such limitation. Hence, >> >>> this peice of code may work only on Powerpc platform (and maybe a few >> >>> others we don't care). >> >>> >> >>> So, apparently, this will not work for ARM. >> >>> >> >> >> >> Right and I did shared detailed explanation on why it wont work on >> >> this link [1], infact this patch shouldn;t work for mips too. >> >> >> >> As I mentioned earlier I did tried similar approach and so to get >> >> everything working like iomem is currently in dpdk; we need to add >> >> something like pci_remap_iospace --> ioremap_page_range() but this api >> >> not really pci_mmap_page_range types. user need to write more code on >> >> top so to use this api efficiently, also this api looks like meant to >> >> use by arch file only in kernel space. >> >> >> >> >> > missed link; >> > >> > [1] http://dpdk.org/dev/patchwork/patch/9365/ >> > >> >> IMO, it is worth keeping one special device file who could work across >> archs like arm/arm64/powerpc and others, who could map iopci bar to >> dpdk user-space. also this approach has no kernel version dependency >> too. BTW; I did mentioned in second approach in to add /dev/ioport >> interface in drivers/char/mem.c which could read more than byte in one >> single operation, but that has kernel dependency. However that >> approach too is arch agnostic. > > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot > really say there is no kernel dependency. Agree but I mentioned kernel __version__ dependency. > We should try to remove the need for any out-of-tree kernel module. > That's why the Linux upstream approach is a better solution. IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in pci_mmap_page_range()?
2015-12-17 15:51, Santosh Shukla: > On Thu, Dec 17, 2015 at 3:44 PM, Thomas Monjalon > <thomas.monjalon@6wind.com> wrote: > > Hi, > > > > 2015-12-17 15:37, Santosh Shukla: > >> On Thu, Dec 17, 2015 at 3:32 PM, Santosh Shukla <sshukla@mvista.com> wrote: > >> > On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: > >> >> On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu > >> >> <yuanhan.liu@linux.intel.com> wrote: > >> >>> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: > >> >>>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu > >> >>>> <yuanhan.liu@linux.intel.com> wrote: > >> >>>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: > >> >>>> >> x86 requires a special set of instructions to access ioports, but other > >> >>>> >> architectures let you remap io resources. > >> >>>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for > >> >>>> >> architectures other than x86. > >> >>>> > > >> >>>> > One question: this patch could be a replacement of the igbuio_iomap patch > >> >>>> > from Santosh? If so, I like it: It's more elegant. > >> >>>> > > >> >>>> > --yliu > >> >>>> > > >> >>>> > >> >>>> I did tried similar in past but not in parse_sysfs (such that > >> >>>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that > >> >>>> pci_map_resource not able to map address hence segfault at tespmd > >> >>>> initialization. > >> >>>> > >> >>>> i was getting these: > >> >>>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): > >> >>>> Invalid argument (0xffffffffffffffff) > >> >>> > >> >>> That's because ARM (at least the kernel) doesn't allow an IO map: > >> >>> > >> >>> arch/arm/kernel/bios32.c > >> >>> ------------------------ > >> >>> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, > >> >>> 619 enum pci_mmap_state mmap_state, int write_combine) > >> >>> 620 { > >> >>> 621 if (mmap_state == pci_mmap_io) > >> >>> 622 return -EINVAL; > >> >>> > >> >>> And with a quick glimpse of powerpc, I see no such limitation. Hence, > >> >>> this peice of code may work only on Powerpc platform (and maybe a few > >> >>> others we don't care). > >> >>> > >> >>> So, apparently, this will not work for ARM. > >> >>> > >> >> > >> >> Right and I did shared detailed explanation on why it wont work on > >> >> this link [1], infact this patch shouldn;t work for mips too. > >> >> > >> >> As I mentioned earlier I did tried similar approach and so to get > >> >> everything working like iomem is currently in dpdk; we need to add > >> >> something like pci_remap_iospace --> ioremap_page_range() but this api > >> >> not really pci_mmap_page_range types. user need to write more code on > >> >> top so to use this api efficiently, also this api looks like meant to > >> >> use by arch file only in kernel space. > >> >> > >> >> > >> > missed link; > >> > > >> > [1] http://dpdk.org/dev/patchwork/patch/9365/ > >> > > >> > >> IMO, it is worth keeping one special device file who could work across > >> archs like arm/arm64/powerpc and others, who could map iopci bar to > >> dpdk user-space. also this approach has no kernel version dependency > >> too. BTW; I did mentioned in second approach in to add /dev/ioport > >> interface in drivers/char/mem.c which could read more than byte in one > >> single operation, but that has kernel dependency. However that > >> approach too is arch agnostic. > > > > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot > > really say there is no kernel dependency. > > Agree but I mentioned kernel __version__ dependency. Yes you did. One of the main issue with out-of-tree kernel modules is the version dependency. Probably that igb_uio from DPDK 2.3 will not compile with the kernel 5.0. > > We should try to remove the need for any out-of-tree kernel module. > > That's why the Linux upstream approach is a better solution. > > IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in > pci_mmap_page_range()? I don't know what is the best solution in the kernel. First we need to be sure that there is absolutely no solution without kernel changes. Then we can try a pci_mmap solution or, as you suggest, an interface in drivers/char/mem.c
On Thu, Dec 17, 2015 at 4:03 PM, Thomas Monjalon <thomas.monjalon@6wind.com> wrote: > 2015-12-17 15:51, Santosh Shukla: >> On Thu, Dec 17, 2015 at 3:44 PM, Thomas Monjalon >> <thomas.monjalon@6wind.com> wrote: >> > Hi, >> > >> > 2015-12-17 15:37, Santosh Shukla: >> >> On Thu, Dec 17, 2015 at 3:32 PM, Santosh Shukla <sshukla@mvista.com> wrote: >> >> > On Thu, Dec 17, 2015 at 3:31 PM, Santosh Shukla <sshukla@mvista.com> wrote: >> >> >> On Thu, Dec 17, 2015 at 3:08 PM, Yuanhan Liu >> >> >> <yuanhan.liu@linux.intel.com> wrote: >> >> >>> On Wed, Dec 16, 2015 at 07:21:55PM +0530, Santosh Shukla wrote: >> >> >>>> On Wed, Dec 16, 2015 at 6:18 PM, Yuanhan Liu >> >> >>>> <yuanhan.liu@linux.intel.com> wrote: >> >> >>>> > On Wed, Dec 16, 2015 at 01:31:04PM +0100, David Marchand wrote: >> >> >>>> >> x86 requires a special set of instructions to access ioports, but other >> >> >>>> >> architectures let you remap io resources. >> >> >>>> >> So let eal remap io resources by accepting IORESOURCE_IO flag for >> >> >>>> >> architectures other than x86. >> >> >>>> > >> >> >>>> > One question: this patch could be a replacement of the igbuio_iomap patch >> >> >>>> > from Santosh? If so, I like it: It's more elegant. >> >> >>>> > >> >> >>>> > --yliu >> >> >>>> > >> >> >>>> >> >> >>>> I did tried similar in past but not in parse_sysfs (such that >> >> >>>> mem.resource_addr to accept IO_RESOURCE_IO types) and observed that >> >> >>>> pci_map_resource not able to map address hence segfault at tespmd >> >> >>>> initialization. >> >> >>>> >> >> >>>> i was getting these: >> >> >>>> EAL: pci_map_resource(): cannot mmap(19, 0x7fa5c00000, 0x20, 0x0): >> >> >>>> Invalid argument (0xffffffffffffffff) >> >> >>> >> >> >>> That's because ARM (at least the kernel) doesn't allow an IO map: >> >> >>> >> >> >>> arch/arm/kernel/bios32.c >> >> >>> ------------------------ >> >> >>> 618 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, >> >> >>> 619 enum pci_mmap_state mmap_state, int write_combine) >> >> >>> 620 { >> >> >>> 621 if (mmap_state == pci_mmap_io) >> >> >>> 622 return -EINVAL; >> >> >>> >> >> >>> And with a quick glimpse of powerpc, I see no such limitation. Hence, >> >> >>> this peice of code may work only on Powerpc platform (and maybe a few >> >> >>> others we don't care). >> >> >>> >> >> >>> So, apparently, this will not work for ARM. >> >> >>> >> >> >> >> >> >> Right and I did shared detailed explanation on why it wont work on >> >> >> this link [1], infact this patch shouldn;t work for mips too. >> >> >> >> >> >> As I mentioned earlier I did tried similar approach and so to get >> >> >> everything working like iomem is currently in dpdk; we need to add >> >> >> something like pci_remap_iospace --> ioremap_page_range() but this api >> >> >> not really pci_mmap_page_range types. user need to write more code on >> >> >> top so to use this api efficiently, also this api looks like meant to >> >> >> use by arch file only in kernel space. >> >> >> >> >> >> >> >> > missed link; >> >> > >> >> > [1] http://dpdk.org/dev/patchwork/patch/9365/ >> >> > >> >> >> >> IMO, it is worth keeping one special device file who could work across >> >> archs like arm/arm64/powerpc and others, who could map iopci bar to >> >> dpdk user-space. also this approach has no kernel version dependency >> >> too. BTW; I did mentioned in second approach in to add /dev/ioport >> >> interface in drivers/char/mem.c which could read more than byte in one >> >> single operation, but that has kernel dependency. However that >> >> approach too is arch agnostic. >> > >> > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot >> > really say there is no kernel dependency. >> >> Agree but I mentioned kernel __version__ dependency. > > Yes you did. > One of the main issue with out-of-tree kernel modules is the version > dependency. Probably that igb_uio from DPDK 2.3 will not compile with > the kernel 5.0. > don't know kernel 5.0 feature list so I guess your may be right. is uio obsoleted for 5.0 kernel? >> > We should try to remove the need for any out-of-tree kernel module. >> > That's why the Linux upstream approach is a better solution. >> >> IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in >> pci_mmap_page_range()? > > I don't know what is the best solution in the kernel. > First we need to be sure that there is absolutely no solution without > kernel changes. I guess we have done enough evaluation / investigation that suggest - so to map iopci region to userspace in arch agnostic-way - # either we need to modify kernel - Make sure all the non-x86 arch to support mapping for iopci region (i.e. pci_mmap_page_range). I don;t think its a correct approach though. or - include /dev/ioport char-mem device file who could do more than byte operation, Note that this implementation does not exist in kernel. I could send an RFC to lkml. OR keep device file in user space (current approach) Right now Virtio-for-arm patches are blocked on this, let me know if someone has better approach/thought in mind. Thanks. > Then we can try a pci_mmap solution or, as you suggest, an interface in > drivers/char/mem.c >
On Thu, Dec 17, 2015 at 04:52:00PM +0530, Santosh Shukla wrote: > >> >> IMO, it is worth keeping one special device file who could work across > >> >> archs like arm/arm64/powerpc and others, who could map iopci bar to > >> >> dpdk user-space. also this approach has no kernel version dependency > >> >> too. BTW; I did mentioned in second approach in to add /dev/ioport > >> >> interface in drivers/char/mem.c which could read more than byte in one > >> >> single operation, but that has kernel dependency. However that > >> >> approach too is arch agnostic. > >> > > >> > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot > >> > really say there is no kernel dependency. > >> > >> Agree but I mentioned kernel __version__ dependency. > > > > Yes you did. > > One of the main issue with out-of-tree kernel modules is the version > > dependency. Probably that igb_uio from DPDK 2.3 will not compile with > > the kernel 5.0. > > > > don't know kernel 5.0 feature list so I guess your may be right. is > uio obsoleted for 5.0 kernel? Nope, Thomas meant to say that we should keep the old DPDK will work with newer kernel, not just the newest DPDK work on newest linux kernel only. The out-of-tree kernel makes no garantee on that. > >> > We should try to remove the need for any out-of-tree kernel module. > >> > That's why the Linux upstream approach is a better solution. > >> > >> IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in > >> pci_mmap_page_range()? > > > > I don't know what is the best solution in the kernel. > > First we need to be sure that there is absolutely no solution without > > kernel changes. > > I guess we have done enough evaluation / investigation that suggest - > so to map iopci region to userspace in arch agnostic-way - > > # either we need to modify kernel > - Make sure all the non-x86 arch to support mapping for > iopci region (i.e. pci_mmap_page_range). I don;t think its a correct > approach though. > or > - include /dev/ioport char-mem device file who could do > more than byte operation, Note that this implementation does not exist > in kernel. I could send an RFC to lkml. Maybe you could propose the two to lkml, to get some feedbacks from those kernel/ARM gurus? Please cc me if you do so. --yliu > > OR keep device file in user space (current approach) > Right now Virtio-for-arm patches are blocked on this, let me know if > someone has better approach/thought in mind. > > Thanks. > > > Then we can try a pci_mmap solution or, as you suggest, an interface in > > drivers/char/mem.c > >
On Fri, Dec 18, 2015 at 01:30:53PM +0800, Yuanhan Liu wrote: > On Thu, Dec 17, 2015 at 04:52:00PM +0530, Santosh Shukla wrote: > > >> >> IMO, it is worth keeping one special device file who could work across > > >> >> archs like arm/arm64/powerpc and others, who could map iopci bar to > > >> >> dpdk user-space. also this approach has no kernel version dependency > > >> >> too. BTW; I did mentioned in second approach in to add /dev/ioport > > >> >> interface in drivers/char/mem.c which could read more than byte in one > > >> >> single operation, but that has kernel dependency. However that > > >> >> approach too is arch agnostic. > > >> > > > >> > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot > > >> > really say there is no kernel dependency. > > >> > > >> Agree but I mentioned kernel __version__ dependency. > > > > > > Yes you did. > > > One of the main issue with out-of-tree kernel modules is the version > > > dependency. Probably that igb_uio from DPDK 2.3 will not compile with > > > the kernel 5.0. > > > > > > > don't know kernel 5.0 feature list so I guess your may be right. is > > uio obsoleted for 5.0 kernel? > > Nope, Thomas meant to say that we should keep the old DPDK will work > with newer kernel, not just the newest DPDK work on newest linux kernel > only. The out-of-tree kernel makes no garantee on that. > > > >> > We should try to remove the need for any out-of-tree kernel module. > > >> > That's why the Linux upstream approach is a better solution. > > >> > > >> IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in > > >> pci_mmap_page_range()? > > > > > > I don't know what is the best solution in the kernel. > > > First we need to be sure that there is absolutely no solution without > > > kernel changes. > > > > I guess we have done enough evaluation / investigation that suggest - > > so to map iopci region to userspace in arch agnostic-way - > > > > # either we need to modify kernel > > - Make sure all the non-x86 arch to support mapping for > > iopci region (i.e. pci_mmap_page_range). I don;t think its a correct > > approach though. > > or > > - include /dev/ioport char-mem device file who could do > > more than byte operation, Note that this implementation does not exist > > in kernel. I could send an RFC to lkml. > > Maybe you could propose the two to lkml, to get some feedbacks from > those kernel/ARM gurus? Please cc me if you do so. IMO, We need to support DPDK-ARM for the old kernels also. If there is no means to expose iomap region to userspace in the existing kernel then we add that support through out-of-tree driver like igb_uio and when make become part of the kernel then we can turn off of the out-of-tree driver based on the kernel major and minor version numbers. Jerin > > --yliu > > > > OR keep device file in user space (current approach) > > Right now Virtio-for-arm patches are blocked on this, let me know if > > someone has better approach/thought in mind. > > > > Thanks. > > > > > Then we can try a pci_mmap solution or, as you suggest, an interface in > > > drivers/char/mem.c > > >
On Fri, Dec 18, 2015 at 11:00 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Thu, Dec 17, 2015 at 04:52:00PM +0530, Santosh Shukla wrote: >> >> >> IMO, it is worth keeping one special device file who could work across >> >> >> archs like arm/arm64/powerpc and others, who could map iopci bar to >> >> >> dpdk user-space. also this approach has no kernel version dependency >> >> >> too. BTW; I did mentioned in second approach in to add /dev/ioport >> >> >> interface in drivers/char/mem.c which could read more than byte in one >> >> >> single operation, but that has kernel dependency. However that >> >> >> approach too is arch agnostic. >> >> > >> >> > Your first approach use an out-of-tree kernel module (igb_uio), so we cannot >> >> > really say there is no kernel dependency. >> >> >> >> Agree but I mentioned kernel __version__ dependency. >> > >> > Yes you did. >> > One of the main issue with out-of-tree kernel modules is the version >> > dependency. Probably that igb_uio from DPDK 2.3 will not compile with >> > the kernel 5.0. >> > >> >> don't know kernel 5.0 feature list so I guess your may be right. is >> uio obsoleted for 5.0 kernel? > > Nope, Thomas meant to say that we should keep the old DPDK will work > with newer kernel, not just the newest DPDK work on newest linux kernel > only. The out-of-tree kernel makes no garantee on that. > >> >> > We should try to remove the need for any out-of-tree kernel module. >> >> > That's why the Linux upstream approach is a better solution. >> >> >> >> IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in >> >> pci_mmap_page_range()? >> > >> > I don't know what is the best solution in the kernel. >> > First we need to be sure that there is absolutely no solution without >> > kernel changes. >> >> I guess we have done enough evaluation / investigation that suggest - >> so to map iopci region to userspace in arch agnostic-way - >> >> # either we need to modify kernel >> - Make sure all the non-x86 arch to support mapping for >> iopci region (i.e. pci_mmap_page_range). I don;t think its a correct >> approach though. >> or >> - include /dev/ioport char-mem device file who could do >> more than byte operation, Note that this implementation does not exist >> in kernel. I could send an RFC to lkml. > > Maybe you could propose the two to lkml, to get some feedbacks from > those kernel/ARM gurus? Please cc me if you do so. > The latter one I already shared old lkml thread, Pl. revisit my v1 0/6 patch [1] and in that refer [2]. Josh has already proposed to lkml but for some reason thread didn't went far. I can restart that discussion giving dpdk use-case as an example/ requirement. And for the former one, I'll have to check with linux-arm why iopci region not mappable. By grepping kernel source all i could see two commit - latest one is 415ae101 and older one 1da177e. Both has nothing to explain why if (mmap_state == pci_mmap_io) return -EINVAL; Setting pci io region to -EINVAL, should have fundamental reason for sure. But we'll have to check, for that I could post as a query rather a patch to lkml. Note that dpdk already has out-of-tree implementation for dom0/xen-case too, it creates special device file which maps pci resources. so keep one more igb_uio char device so to map iopci region wont hurt much though! [1] http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/29531 > --yliu >> >> OR keep device file in user space (current approach) >> Right now Virtio-for-arm patches are blocked on this, let me know if >> someone has better approach/thought in mind. >> >> Thanks. >> >> > Then we can try a pci_mmap solution or, as you suggest, an interface in >> > drivers/char/mem.c >> >
On Fri, Dec 18, 2015 at 12:04:48PM +0530, Jerin Jacob wrote: > > > # either we need to modify kernel > > > - Make sure all the non-x86 arch to support mapping for > > > iopci region (i.e. pci_mmap_page_range). I don;t think its a correct > > > approach though. > > > or > > > - include /dev/ioport char-mem device file who could do > > > more than byte operation, Note that this implementation does not exist > > > in kernel. I could send an RFC to lkml. > > > > Maybe you could propose the two to lkml, to get some feedbacks from > > those kernel/ARM gurus? Please cc me if you do so. > > IMO, We need to support DPDK-ARM for the old kernels also. If there is no means > to expose iomap region to userspace in the existing kernel then we add that > support through out-of-tree driver like igb_uio and when make become part of > the kernel then we can turn off of the out-of-tree driver based > on the kernel major and minor version numbers. Agreed. (BTW, I have no idea why the CC list of my last reply was shortened to Santosh only. I double checked with my sent box, the mail header sent out looks normal: Date: Fri, 18 Dec 2015 13:30:53 +0800 From: Yuanhan Liu <yuanhan.liu@linux.intel.com> To: Santosh Shukla <sshukla@mvista.com> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>, dev@dpdk.org, David Marchand <david.marchand@6wind.com> Not sure it's a bug from my mutt email client, or the mailing list) --yliu
On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla wrote: > >> I guess we have done enough evaluation / investigation that suggest - > >> so to map iopci region to userspace in arch agnostic-way - > >> > >> # either we need to modify kernel > >> - Make sure all the non-x86 arch to support mapping for > >> iopci region (i.e. pci_mmap_page_range). I don;t think its a correct > >> approach though. > >> or > >> - include /dev/ioport char-mem device file who could do > >> more than byte operation, Note that this implementation does not exist > >> in kernel. I could send an RFC to lkml. > > > > Maybe you could propose the two to lkml, to get some feedbacks from > > those kernel/ARM gurus? Please cc me if you do so. > > > > The latter one I already shared old lkml thread, Pl. revisit my v1 0/6 > patch [1] and in that refer [2]. Oops, sorry, a bit busy, that I didn't look at it carefully. My bad, anyway. > Josh has already proposed to lkml but for some reason thread didn't > went far. I can restart that discussion giving dpdk use-case as an > example/ requirement. I had a quick go through of the discussion. Both hpa and Arnd seem to be fine with the ioctl interface on /dev/port. Have you tried that? And if you want to restart it, ioctl might be a better option than /dev/ioport, judging from the discussion. > > And for the former one, I'll have to check with linux-arm why iopci > region not mappable. By grepping kernel source all i could see two > commit - latest one is 415ae101 and older one 1da177e. Both has > nothing to explain why > if (mmap_state == pci_mmap_io) > return -EINVAL; > > Setting pci io region to -EINVAL, should have fundamental reason for > sure. But we'll have to check, for that I could post as a query rather > a patch to lkml. No idea, such check should have been there in the very beginning for some reason. And here is a quote from Arnd: Only powerpc, microblaze, alpha, sparc and xtensa allow users to mmap I/O space, even though a lot of others could. > > Note that dpdk already has out-of-tree implementation for > dom0/xen-case too, it creates special device file which maps pci > resources. so keep one more igb_uio char device so to map iopci region > wont hurt much though! Xen is not a good example here; I'm even not sure it still works or not :) --yliu > [1] http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/29531 > > > --yliu > >> > >> OR keep device file in user space (current approach) > >> Right now Virtio-for-arm patches are blocked on this, let me know if > >> someone has better approach/thought in mind. > >> > >> Thanks. > >> > >> > Then we can try a pci_mmap solution or, as you suggest, an interface in > >> > drivers/char/mem.c > >> >
2015-12-18 12:04, Jerin Jacob: > On Fri, Dec 18, 2015 at 01:30:53PM +0800, Yuanhan Liu wrote: > > On Thu, Dec 17, 2015 at 04:52:00PM +0530, Santosh Shukla wrote: > > > > One of the main issue with out-of-tree kernel modules is the version > > > > dependency. Probably that igb_uio from DPDK 2.3 will not compile with > > > > the kernel 5.0. > > > > > > don't know kernel 5.0 feature list so I guess your may be right. is > > > uio obsoleted for 5.0 kernel? > > > > Nope, Thomas meant to say that we should keep the old DPDK will work > > with newer kernel, not just the newest DPDK work on newest linux kernel > > only. The out-of-tree kernel makes no garantee on that. > > > > > >> > We should try to remove the need for any out-of-tree kernel module. > > > >> > That's why the Linux upstream approach is a better solution. > > > >> > > > >> IIUC, your suggesting archs like arm/arm64 to support io_mappe_io in > > > >> pci_mmap_page_range()? > > > > > > > > I don't know what is the best solution in the kernel. > > > > First we need to be sure that there is absolutely no solution without > > > > kernel changes. > > > > > > I guess we have done enough evaluation / investigation that suggest - > > > so to map iopci region to userspace in arch agnostic-way - > > > > > > # either we need to modify kernel > > > - Make sure all the non-x86 arch to support mapping for > > > iopci region (i.e. pci_mmap_page_range). I don;t think its a correct > > > approach though. > > > or > > > - include /dev/ioport char-mem device file who could do > > > more than byte operation, Note that this implementation does not exist > > > in kernel. I could send an RFC to lkml. > > > > Maybe you could propose the two to lkml, to get some feedbacks from > > those kernel/ARM gurus? Please cc me if you do so. > > IMO, We need to support DPDK-ARM for the old kernels also. If there is no means > to expose iomap region to userspace in the existing kernel then we add that > support through out-of-tree driver like igb_uio and when make become part of > the kernel then we can turn off of the out-of-tree driver based > on the kernel major and minor version numbers. Of course, we have to support DPDK for old/current kernels. But if we continue to implement features in out-of-tree modules without trying to have a solution with upstream kernel, we are stuck with our legacy forever. So, as sane policy, we should not accept new changes in our kernel modules if there is no alternative in Linus' repo.
On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com> wrote: > On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla wrote: >> >> I guess we have done enough evaluation / investigation that suggest - >> >> so to map iopci region to userspace in arch agnostic-way - >> >> >> >> # either we need to modify kernel >> >> - Make sure all the non-x86 arch to support mapping for >> >> iopci region (i.e. pci_mmap_page_range). I don;t think its a correct >> >> approach though. >> >> or >> >> - include /dev/ioport char-mem device file who could do >> >> more than byte operation, Note that this implementation does not exist >> >> in kernel. I could send an RFC to lkml. >> > >> > Maybe you could propose the two to lkml, to get some feedbacks from >> > those kernel/ARM gurus? Please cc me if you do so. >> > >> >> The latter one I already shared old lkml thread, Pl. revisit my v1 0/6 >> patch [1] and in that refer [2]. > > Oops, sorry, a bit busy, that I didn't look at it carefully. My bad, > anyway. > >> Josh has already proposed to lkml but for some reason thread didn't >> went far. I can restart that discussion giving dpdk use-case as an >> example/ requirement. > > I had a quick go through of the discussion. Both hpa and Arnd seem > to be fine with the ioctl interface on /dev/port. Have you tried > that? And if you want to restart it, ioctl might be a better option > than /dev/ioport, judging from the discussion. > I tried legacy patch and re-writing with ioctl-way; doing changes in dpdk port as well in kernel, had to test on quite other hw not only arm64 though! so it will take time for me, I am travelling tomorrow so bit delayed, We'll post patch to lkml and share dpdk-virtio feedback perhaps by Monday. >> >> And for the former one, I'll have to check with linux-arm why iopci >> region not mappable. By grepping kernel source all i could see two >> commit - latest one is 415ae101 and older one 1da177e. Both has >> nothing to explain why >> if (mmap_state == pci_mmap_io) >> return -EINVAL; >> >> Setting pci io region to -EINVAL, should have fundamental reason for >> sure. But we'll have to check, for that I could post as a query rather >> a patch to lkml. > > No idea, such check should have been there in the very beginning > for some reason. And here is a quote from Arnd: > > Only powerpc, microblaze, alpha, sparc and xtensa allow > users to mmap I/O space, even though a lot of others could. > >> >> Note that dpdk already has out-of-tree implementation for >> dom0/xen-case too, it creates special device file which maps pci >> resources. so keep one more igb_uio char device so to map iopci region >> wont hurt much though! > > Xen is not a good example here; I'm even not sure it still works > or not :) > > --yliu >> [1] http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/29531 >> >> > --yliu >> >> >> >> OR keep device file in user space (current approach) >> >> Right now Virtio-for-arm patches are blocked on this, let me know if >> >> someone has better approach/thought in mind. >> >> >> >> Thanks. >> >> >> >> > Then we can try a pci_mmap solution or, as you suggest, an interface in >> >> > drivers/char/mem.c >> >> >
On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.com> wrote: > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu > <yuanhan.liu@linux.intel.com> wrote: >> On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla wrote: >>> >> I guess we have done enough evaluation / investigation that suggest - >>> >> so to map iopci region to userspace in arch agnostic-way - >>> >> >>> >> # either we need to modify kernel >>> >> - Make sure all the non-x86 arch to support mapping for >>> >> iopci region (i.e. pci_mmap_page_range). I don;t think its a correct >>> >> approach though. >>> >> or >>> >> - include /dev/ioport char-mem device file who could do >>> >> more than byte operation, Note that this implementation does not exist >>> >> in kernel. I could send an RFC to lkml. >>> > >>> > Maybe you could propose the two to lkml, to get some feedbacks from >>> > those kernel/ARM gurus? Please cc me if you do so. >>> > >>> >>> The latter one I already shared old lkml thread, Pl. revisit my v1 0/6 >>> patch [1] and in that refer [2]. >> >> Oops, sorry, a bit busy, that I didn't look at it carefully. My bad, >> anyway. >> >>> Josh has already proposed to lkml but for some reason thread didn't >>> went far. I can restart that discussion giving dpdk use-case as an >>> example/ requirement. >> >> I had a quick go through of the discussion. Both hpa and Arnd seem >> to be fine with the ioctl interface on /dev/port. Have you tried >> that? And if you want to restart it, ioctl might be a better option >> than /dev/ioport, judging from the discussion. >> > > I tried legacy patch and re-writing with ioctl-way; doing changes in > dpdk port as well in kernel, had to test on quite other hw not only > arm64 though! so it will take time for me, I am travelling tomorrow so > bit delayed, We'll post patch to lkml and share dpdk-virtio feedback > perhaps by Monday. > I posted a query about /dev/ioports approach in lkml thread [1], and Arnd suggested to use vfio framework but it looks like vfio too does not map ioresource_io region. Same communicated to Arnd and waiting for his reply. In mean time I like to ask general question; - Has anyone tried vfio/non-iommu approach for virtio pmd driver? If not then Is there any plan? Someone planning to take up. [1] https://lkml.org/lkml/2015/12/23/145
Hi Santosh, > On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.com> > wrote: > > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu > > <yuanhan.liu@linux.intel.com> wrote: > >> On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla wrote: > >>> >> I guess we have done enough evaluation / investigation that > >>> >> suggest - so to map iopci region to userspace in arch > >>> >> agnostic-way - > >>> >> > >>> >> # either we need to modify kernel > >>> >> - Make sure all the non-x86 arch to support > >>> >> mapping for iopci region (i.e. pci_mmap_page_range). I don;t > >>> >> think its a correct approach though. > >>> >> or > >>> >> - include /dev/ioport char-mem device file who > >>> >> could do more than byte operation, Note that this implementation > >>> >> does not exist in kernel. I could send an RFC to lkml. > >>> > > >>> > Maybe you could propose the two to lkml, to get some feedbacks > >>> > from those kernel/ARM gurus? Please cc me if you do so. > >>> > > >>> > >>> The latter one I already shared old lkml thread, Pl. revisit my v1 > >>> 0/6 patch [1] and in that refer [2]. > >> > >> Oops, sorry, a bit busy, that I didn't look at it carefully. My bad, > >> anyway. > >> > >>> Josh has already proposed to lkml but for some reason thread didn't > >>> went far. I can restart that discussion giving dpdk use-case as an > >>> example/ requirement. > >> > >> I had a quick go through of the discussion. Both hpa and Arnd seem to > >> be fine with the ioctl interface on /dev/port. Have you tried that? > >> And if you want to restart it, ioctl might be a better option than > >> /dev/ioport, judging from the discussion. > >> > > > > I tried legacy patch and re-writing with ioctl-way; doing changes in > > dpdk port as well in kernel, had to test on quite other hw not only > > arm64 though! so it will take time for me, I am travelling tomorrow so > > bit delayed, We'll post patch to lkml and share dpdk-virtio feedback > > perhaps by Monday. > > > > I posted a query about /dev/ioports approach in lkml thread [1], and Arnd > suggested to use vfio framework but it looks like vfio too does not map > ioresource_io region. Same communicated to Arnd and waiting for his reply. > > In mean time I like to ask general question; > - Has anyone tried vfio/non-iommu approach for virtio pmd driver? If not > then Is there any plan? Someone planning to take up. > [1] https://lkml.org/lkml/2015/12/23/145 I have submitted a patch to support no-iommu mode, but I'm not aware of anyone trying VFIO-noiommu at all. That's probably expected since it's Christmas/New Year in a lot of places, and everyone is on a break. That said, I'm not sure I completely understand what is it that you're asking about. The code you're referring to (in vfio_pci.c, line 854 as of kernel 4.3) is checking if a PCI BAR is available for IO (hence checking if the IORESOURCE_MEM bit is set). There isn't any "ioresource_mem" region as far as VFIO is concerned, there are only BARs, ROM, VGA and PCI config regions (linux/vfio.h, line 314 as of kernel 4.3). So if you're missing some PCI regions for VFIO to map, they would first need to be added to VFIO kernel implementation before they can be used by DPDK. That is, unless I'm misunderstanding something :) Thanks, Anatoly
On Tue, Dec 29, 2015 at 3:26 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote: > Hi Santosh, > >> On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.com> >> wrote: >> > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu >> > <yuanhan.liu@linux.intel.com> wrote: >> >> On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla wrote: >> >>> >> I guess we have done enough evaluation / investigation that >> >>> >> suggest - so to map iopci region to userspace in arch >> >>> >> agnostic-way - >> >>> >> >> >>> >> # either we need to modify kernel >> >>> >> - Make sure all the non-x86 arch to support >> >>> >> mapping for iopci region (i.e. pci_mmap_page_range). I don;t >> >>> >> think its a correct approach though. >> >>> >> or >> >>> >> - include /dev/ioport char-mem device file who >> >>> >> could do more than byte operation, Note that this implementation >> >>> >> does not exist in kernel. I could send an RFC to lkml. >> >>> > >> >>> > Maybe you could propose the two to lkml, to get some feedbacks >> >>> > from those kernel/ARM gurus? Please cc me if you do so. >> >>> > >> >>> >> >>> The latter one I already shared old lkml thread, Pl. revisit my v1 >> >>> 0/6 patch [1] and in that refer [2]. >> >> >> >> Oops, sorry, a bit busy, that I didn't look at it carefully. My bad, >> >> anyway. >> >> >> >>> Josh has already proposed to lkml but for some reason thread didn't >> >>> went far. I can restart that discussion giving dpdk use-case as an >> >>> example/ requirement. >> >> >> >> I had a quick go through of the discussion. Both hpa and Arnd seem to >> >> be fine with the ioctl interface on /dev/port. Have you tried that? >> >> And if you want to restart it, ioctl might be a better option than >> >> /dev/ioport, judging from the discussion. >> >> >> > >> > I tried legacy patch and re-writing with ioctl-way; doing changes in >> > dpdk port as well in kernel, had to test on quite other hw not only >> > arm64 though! so it will take time for me, I am travelling tomorrow so >> > bit delayed, We'll post patch to lkml and share dpdk-virtio feedback >> > perhaps by Monday. >> > >> >> I posted a query about /dev/ioports approach in lkml thread [1], and Arnd >> suggested to use vfio framework but it looks like vfio too does not map >> ioresource_io region. Same communicated to Arnd and waiting for his reply. >> >> In mean time I like to ask general question; >> - Has anyone tried vfio/non-iommu approach for virtio pmd driver? If not >> then Is there any plan? Someone planning to take up. >> [1] https://lkml.org/lkml/2015/12/23/145 > > I have submitted a patch to support no-iommu mode, but I'm not aware of anyone trying VFIO-noiommu at all. That's probably expected since it's Christmas/New Year in a lot of places, and everyone is on a break. > > That said, I'm not sure I completely understand what is it that you're asking about. The code you're referring to (in vfio_pci.c, line 854 as of kernel 4.3) is checking if a PCI BAR is available for IO (hence checking if the IORESOURCE_MEM Thanks for reply! You comment might help to move this discuss to next level. Look at kernel/resource.c, it exports two symbol ioport_resource and iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) uses pci _io bar region for device initialization as virtio headers are locate at pci _io bar region. Since it uses pci _iobar region so likely it update pci_resource.[index].flag = IORESOURCE_IO. and vfio mmap function wont handle ioresource_io (i guess). And that is why I asked same to lkml thread. bit is set). There isn't any "ioresource_mem" region as far as VFIO is concerned, there are only BARs, ROM, VGA and PCI config regions (linux/vfio.h, line 314 as of kernel 4.3). So if you're missing some PCI regions for VFIO to map, they would first need to be added to VFIO kernel implementation before they can be used by DPDK. That is, unless I'm misunderstanding something :) > Agree. As mentioned above, I guess ioresource_io pci bar implementation missing in vfio kernel? but before adding code support in kernel I would like to hear from experts example, You, Alex! (looping alex) > Thanks, > Anatoly
Hi Santosh, > Look at kernel/resource.c, it exports two symbol ioport_resource and > iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and > IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. > _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) uses pci > _io bar region for device initialization as virtio headers are locate at pci _io bar > region. Since it uses pci _iobar region so likely it update > pci_resource.[index].flag = IORESOURCE_IO. and vfio mmap function wont > handle ioresource_io (i guess). And that is why I asked same to lkml thread. Yes, I can see that this is what I was misunderstanding (that IORESOURCE_IO isn't a BAR but rather a flag). Technically, if you also set IORESOURCE_MEM along with IORESOURCE_IO, then the BAR would be mapped by VFIO, but this sounds like a dangerous hack :) other than that, yes, I'm afraid it's up to kernel guys to add support for it. Once done, provided IO BARs are meant to be worked with the same way as MEM BARs, there's a good chance it would work out of the box with DPDK. Thanks, Anatoly
On Tue, Dec 29, 2015 at 4:36 PM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote: > Hi Santosh, > >> Look at kernel/resource.c, it exports two symbol ioport_resource and >> iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and >> IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. >> _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) uses pci >> _io bar region for device initialization as virtio headers are locate at pci _io bar >> region. Since it uses pci _iobar region so likely it update >> pci_resource.[index].flag = IORESOURCE_IO. and vfio mmap function wont >> handle ioresource_io (i guess). And that is why I asked same to lkml thread. > > Yes, I can see that this is what I was misunderstanding (that IORESOURCE_IO isn't a BAR but rather a flag). Technically, if you also set IORESOURCE_MEM along with IORESOURCE_IO, then the BAR would be mapped by VFIO, but this sounds like a dangerous hack :) other than that, yes, I'm afraid it's up to kernel guys to add support for it. Once done, provided IO BARs are meant to be worked with the same way as MEM BARs, there's a good chance it would work out of the box with DPDK. > I guess so, I'll give it a try; although before that I need to port / use your vfio/noiommu patch for virtio pmd driver and perhaps do dependant changes, I will post my test feedback. Thanks > Thanks, > Anatoly
On Tue, 2015-12-29 at 16:17 +0530, Santosh Shukla wrote: > On Tue, Dec 29, 2015 at 3:26 PM, Burakov, Anatoly > <anatoly.burakov@intel.com> wrote: > > Hi Santosh, > > > > > On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.c > > > om> > > > wrote: > > > > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu > > > > <yuanhan.liu@linux.intel.com> wrote: > > > > > On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla > > > > > wrote: > > > > > > > > I guess we have done enough evaluation / investigation > > > > > > > > that > > > > > > > > suggest - so to map iopci region to userspace in arch > > > > > > > > agnostic-way - > > > > > > > > > > > > > > > > # either we need to modify kernel > > > > > > > > - Make sure all the non-x86 arch to > > > > > > > > support > > > > > > > > mapping for iopci region (i.e. pci_mmap_page_range). I > > > > > > > > don;t > > > > > > > > think its a correct approach though. > > > > > > > > or > > > > > > > > - include /dev/ioport char-mem device > > > > > > > > file who > > > > > > > > could do more than byte operation, Note that this > > > > > > > > implementation > > > > > > > > does not exist in kernel. I could send an RFC to lkml. > > > > > > > > > > > > > > Maybe you could propose the two to lkml, to get some > > > > > > > feedbacks > > > > > > > from those kernel/ARM gurus? Please cc me if you do so. > > > > > > > > > > > > > > > > > > > The latter one I already shared old lkml thread, Pl. > > > > > > revisit my v1 > > > > > > 0/6 patch [1] and in that refer [2]. > > > > > > > > > > Oops, sorry, a bit busy, that I didn't look at it carefully. > > > > > My bad, > > > > > anyway. > > > > > > > > > > > Josh has already proposed to lkml but for some reason > > > > > > thread didn't > > > > > > went far. I can restart that discussion giving dpdk use- > > > > > > case as an > > > > > > example/ requirement. > > > > > > > > > > I had a quick go through of the discussion. Both hpa and Arnd > > > > > seem to > > > > > be fine with the ioctl interface on /dev/port. Have you tried > > > > > that? > > > > > And if you want to restart it, ioctl might be a better option > > > > > than > > > > > /dev/ioport, judging from the discussion. > > > > > > > > > > > > > I tried legacy patch and re-writing with ioctl-way; doing > > > > changes in > > > > dpdk port as well in kernel, had to test on quite other hw not > > > > only > > > > arm64 though! so it will take time for me, I am travelling > > > > tomorrow so > > > > bit delayed, We'll post patch to lkml and share dpdk-virtio > > > > feedback > > > > perhaps by Monday. > > > > > > > > > > I posted a query about /dev/ioports approach in lkml thread [1], > > > and Arnd > > > suggested to use vfio framework but it looks like vfio too does > > > not map > > > ioresource_io region. Same communicated to Arnd and waiting for > > > his reply. > > > > > > In mean time I like to ask general question; > > > - Has anyone tried vfio/non-iommu approach for virtio pmd driver? > > > If not > > > then Is there any plan? Someone planning to take up. > > > [1] https://lkml.org/lkml/2015/12/23/145 > > > > I have submitted a patch to support no-iommu mode, but I'm not > > aware of anyone trying VFIO-noiommu at all. That's probably > > expected since it's Christmas/New Year in a lot of places, and > > everyone is on a break. > > > > That said, I'm not sure I completely understand what is it that > > you're asking about. The code you're referring to (in vfio_pci.c, > > line 854 as of kernel 4.3) is checking if a PCI BAR is available > > for IO (hence checking if the IORESOURCE_MEM > > > Thanks for reply! You comment might help to move this discuss to next > level. > > Look at kernel/resource.c, it exports two symbol ioport_resource and > iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and > IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. > _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) > uses pci _io bar region for device initialization as virtio headers > are locate at pci _io bar region. Since it uses pci _iobar region so > likely it update pci_resource.[index].flag = IORESOURCE_IO. and vfio > mmap function wont handle ioresource_io (i guess). And that is why I > asked same to lkml thread. > > > bit is set). There isn't any "ioresource_mem" region as far as VFIO > is > concerned, there are only BARs, ROM, VGA and PCI > config regions (linux/vfio.h, line 314 as of kernel 4.3). So if > you're > missing some PCI regions for VFIO to map, they would first need to be > added to VFIO kernel implementation before they can be used by DPDK. > That is, unless I'm misunderstanding something :) > > > > Agree. As mentioned above, I guess ioresource_io pci bar > implementation missing in vfio kernel? but before adding code support > in kernel I would like to hear from experts example, You, Alex! > (looping alex) MMIO and I/O port space are simply regions as far as VFIO is concerned. The userspace API supports the concept of I/O port regions that can be mmap'd by userspace, but the implementation does not since I/O port regions cannot be mmap'd on x86. Someone needs to propose some code for vfio-pci to support it. Thanks, Alex
On Tue, Dec 29, 2015 at 7:34 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2015-12-29 at 16:17 +0530, Santosh Shukla wrote: >> On Tue, Dec 29, 2015 at 3:26 PM, Burakov, Anatoly >> <anatoly.burakov@intel.com> wrote: >> > Hi Santosh, >> > >> > > On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.c >> > > om> >> > > wrote: >> > > > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu >> > > > <yuanhan.liu@linux.intel.com> wrote: >> > > > > On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla >> > > > > wrote: >> > > > > > > > I guess we have done enough evaluation / investigation >> > > > > > > > that >> > > > > > > > suggest - so to map iopci region to userspace in arch >> > > > > > > > agnostic-way - >> > > > > > > > >> > > > > > > > # either we need to modify kernel >> > > > > > > > - Make sure all the non-x86 arch to >> > > > > > > > support >> > > > > > > > mapping for iopci region (i.e. pci_mmap_page_range). I >> > > > > > > > don;t >> > > > > > > > think its a correct approach though. >> > > > > > > > or >> > > > > > > > - include /dev/ioport char-mem device >> > > > > > > > file who >> > > > > > > > could do more than byte operation, Note that this >> > > > > > > > implementation >> > > > > > > > does not exist in kernel. I could send an RFC to lkml. >> > > > > > > >> > > > > > > Maybe you could propose the two to lkml, to get some >> > > > > > > feedbacks >> > > > > > > from those kernel/ARM gurus? Please cc me if you do so. >> > > > > > > >> > > > > > >> > > > > > The latter one I already shared old lkml thread, Pl. >> > > > > > revisit my v1 >> > > > > > 0/6 patch [1] and in that refer [2]. >> > > > > >> > > > > Oops, sorry, a bit busy, that I didn't look at it carefully. >> > > > > My bad, >> > > > > anyway. >> > > > > >> > > > > > Josh has already proposed to lkml but for some reason >> > > > > > thread didn't >> > > > > > went far. I can restart that discussion giving dpdk use- >> > > > > > case as an >> > > > > > example/ requirement. >> > > > > >> > > > > I had a quick go through of the discussion. Both hpa and Arnd >> > > > > seem to >> > > > > be fine with the ioctl interface on /dev/port. Have you tried >> > > > > that? >> > > > > And if you want to restart it, ioctl might be a better option >> > > > > than >> > > > > /dev/ioport, judging from the discussion. >> > > > > >> > > > >> > > > I tried legacy patch and re-writing with ioctl-way; doing >> > > > changes in >> > > > dpdk port as well in kernel, had to test on quite other hw not >> > > > only >> > > > arm64 though! so it will take time for me, I am travelling >> > > > tomorrow so >> > > > bit delayed, We'll post patch to lkml and share dpdk-virtio >> > > > feedback >> > > > perhaps by Monday. >> > > > >> > > >> > > I posted a query about /dev/ioports approach in lkml thread [1], >> > > and Arnd >> > > suggested to use vfio framework but it looks like vfio too does >> > > not map >> > > ioresource_io region. Same communicated to Arnd and waiting for >> > > his reply. >> > > >> > > In mean time I like to ask general question; >> > > - Has anyone tried vfio/non-iommu approach for virtio pmd driver? >> > > If not >> > > then Is there any plan? Someone planning to take up. >> > > [1] https://lkml.org/lkml/2015/12/23/145 >> > >> > I have submitted a patch to support no-iommu mode, but I'm not >> > aware of anyone trying VFIO-noiommu at all. That's probably >> > expected since it's Christmas/New Year in a lot of places, and >> > everyone is on a break. >> > >> > That said, I'm not sure I completely understand what is it that >> > you're asking about. The code you're referring to (in vfio_pci.c, >> > line 854 as of kernel 4.3) is checking if a PCI BAR is available >> > for IO (hence checking if the IORESOURCE_MEM >> >> >> Thanks for reply! You comment might help to move this discuss to next >> level. >> >> Look at kernel/resource.c, it exports two symbol ioport_resource and >> iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and >> IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. >> _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) >> uses pci _io bar region for device initialization as virtio headers >> are locate at pci _io bar region. Since it uses pci _iobar region so >> likely it update pci_resource.[index].flag = IORESOURCE_IO. and vfio >> mmap function wont handle ioresource_io (i guess). And that is why I >> asked same to lkml thread. >> >> >> bit is set). There isn't any "ioresource_mem" region as far as VFIO >> is >> concerned, there are only BARs, ROM, VGA and PCI >> config regions (linux/vfio.h, line 314 as of kernel 4.3). So if >> you're >> missing some PCI regions for VFIO to map, they would first need to be >> added to VFIO kernel implementation before they can be used by DPDK. >> That is, unless I'm misunderstanding something :) >> > >> >> Agree. As mentioned above, I guess ioresource_io pci bar >> implementation missing in vfio kernel? but before adding code support >> in kernel I would like to hear from experts example, You, Alex! >> (looping alex) > > MMIO and I/O port space are simply regions as far as VFIO is concerned. > The userspace API supports the concept of I/O port regions that can be > mmap'd by userspace, but the implementation does not since I/O port > regions cannot be mmap'd on x86. Someone needs to propose some code > for vfio-pci to support it. Thanks, > I will work on this and post an RFC soon to LKML, That RFC targetted to map IOport region for non-x86 platform in particular. Thanks for confirming the vfio behaviour. > Alex
On Tue, Dec 29, 2015 at 8:21 PM, Santosh Shukla <sshukla@mvista.com> wrote: > On Tue, Dec 29, 2015 at 7:34 PM, Alex Williamson > <alex.williamson@redhat.com> wrote: >> On Tue, 2015-12-29 at 16:17 +0530, Santosh Shukla wrote: >>> On Tue, Dec 29, 2015 at 3:26 PM, Burakov, Anatoly >>> <anatoly.burakov@intel.com> wrote: >>> > Hi Santosh, >>> > >>> > > On Fri, Dec 18, 2015 at 6:25 PM, Santosh Shukla <sshukla@mvista.c >>> > > om> >>> > > wrote: >>> > > > On Fri, Dec 18, 2015 at 1:51 PM, Yuanhan Liu >>> > > > <yuanhan.liu@linux.intel.com> wrote: >>> > > > > On Fri, Dec 18, 2015 at 01:24:41PM +0530, Santosh Shukla >>> > > > > wrote: >>> > > > > > > > I guess we have done enough evaluation / investigation >>> > > > > > > > that >>> > > > > > > > suggest - so to map iopci region to userspace in arch >>> > > > > > > > agnostic-way - >>> > > > > > > > >>> > > > > > > > # either we need to modify kernel >>> > > > > > > > - Make sure all the non-x86 arch to >>> > > > > > > > support >>> > > > > > > > mapping for iopci region (i.e. pci_mmap_page_range). I >>> > > > > > > > don;t >>> > > > > > > > think its a correct approach though. >>> > > > > > > > or >>> > > > > > > > - include /dev/ioport char-mem device >>> > > > > > > > file who >>> > > > > > > > could do more than byte operation, Note that this >>> > > > > > > > implementation >>> > > > > > > > does not exist in kernel. I could send an RFC to lkml. >>> > > > > > > >>> > > > > > > Maybe you could propose the two to lkml, to get some >>> > > > > > > feedbacks >>> > > > > > > from those kernel/ARM gurus? Please cc me if you do so. >>> > > > > > > >>> > > > > > >>> > > > > > The latter one I already shared old lkml thread, Pl. >>> > > > > > revisit my v1 >>> > > > > > 0/6 patch [1] and in that refer [2]. >>> > > > > >>> > > > > Oops, sorry, a bit busy, that I didn't look at it carefully. >>> > > > > My bad, >>> > > > > anyway. >>> > > > > >>> > > > > > Josh has already proposed to lkml but for some reason >>> > > > > > thread didn't >>> > > > > > went far. I can restart that discussion giving dpdk use- >>> > > > > > case as an >>> > > > > > example/ requirement. >>> > > > > >>> > > > > I had a quick go through of the discussion. Both hpa and Arnd >>> > > > > seem to >>> > > > > be fine with the ioctl interface on /dev/port. Have you tried >>> > > > > that? >>> > > > > And if you want to restart it, ioctl might be a better option >>> > > > > than >>> > > > > /dev/ioport, judging from the discussion. >>> > > > > >>> > > > >>> > > > I tried legacy patch and re-writing with ioctl-way; doing >>> > > > changes in >>> > > > dpdk port as well in kernel, had to test on quite other hw not >>> > > > only >>> > > > arm64 though! so it will take time for me, I am travelling >>> > > > tomorrow so >>> > > > bit delayed, We'll post patch to lkml and share dpdk-virtio >>> > > > feedback >>> > > > perhaps by Monday. >>> > > > >>> > > >>> > > I posted a query about /dev/ioports approach in lkml thread [1], >>> > > and Arnd >>> > > suggested to use vfio framework but it looks like vfio too does >>> > > not map >>> > > ioresource_io region. Same communicated to Arnd and waiting for >>> > > his reply. >>> > > >>> > > In mean time I like to ask general question; >>> > > - Has anyone tried vfio/non-iommu approach for virtio pmd driver? >>> > > If not >>> > > then Is there any plan? Someone planning to take up. >>> > > [1] https://lkml.org/lkml/2015/12/23/145 >>> > >>> > I have submitted a patch to support no-iommu mode, but I'm not >>> > aware of anyone trying VFIO-noiommu at all. That's probably >>> > expected since it's Christmas/New Year in a lot of places, and >>> > everyone is on a break. >>> > >>> > That said, I'm not sure I completely understand what is it that >>> > you're asking about. The code you're referring to (in vfio_pci.c, >>> > line 854 as of kernel 4.3) is checking if a PCI BAR is available >>> > for IO (hence checking if the IORESOURCE_MEM >>> >>> >>> Thanks for reply! You comment might help to move this discuss to next >>> level. >>> >>> Look at kernel/resource.c, it exports two symbol ioport_resource and >>> iomem_resource and sets appropriate flag type i.e.. IORESOURCE_IO and >>> IORESOURCE_MEM. In virtio-net case; it creates both pci region i.e.. >>> _io bar and _mem bar. dpdk virtio pmd driver (<= 0.95 virtio spec) >>> uses pci _io bar region for device initialization as virtio headers >>> are locate at pci _io bar region. Since it uses pci _iobar region so >>> likely it update pci_resource.[index].flag = IORESOURCE_IO. and vfio >>> mmap function wont handle ioresource_io (i guess). And that is why I >>> asked same to lkml thread. >>> >>> >>> bit is set). There isn't any "ioresource_mem" region as far as VFIO >>> is >>> concerned, there are only BARs, ROM, VGA and PCI >>> config regions (linux/vfio.h, line 314 as of kernel 4.3). So if >>> you're >>> missing some PCI regions for VFIO to map, they would first need to be >>> added to VFIO kernel implementation before they can be used by DPDK. >>> That is, unless I'm misunderstanding something :) >>> > >>> >>> Agree. As mentioned above, I guess ioresource_io pci bar >>> implementation missing in vfio kernel? but before adding code support >>> in kernel I would like to hear from experts example, You, Alex! >>> (looping alex) >> >> MMIO and I/O port space are simply regions as far as VFIO is concerned. >> The userspace API supports the concept of I/O port regions that can be >> mmap'd by userspace, but the implementation does not since I/O port >> regions cannot be mmap'd on x86. Someone needs to propose some code >> for vfio-pci to support it. Thanks, >> > > I will work on this and post an RFC soon to LKML, That RFC targetted > to map IOport region for non-x86 platform in particular. Thanks for > confirming the vfio behaviour. > Some update: Patches used for setup - - Used Alex(s) vfio-noiommu linux-next patch - then used Anatoly(s) vfio-noiommu dpdk patch [1] - Added ioport pci bar map code support in kernel - Did similar change at dpdk(s) eal_pci_vfio.c file I could able to run testpmd on virtio-net dpdk's pmd driver.. I am cleaning the patches / changes in kernel and dpdk, soon posting the v3 patchset.. Thanks for the input! [1] http://dpdk.org/dev/patchwork/patch/9619/ >> Alex
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h index 334c12e..8aaab4a 100644 --- a/lib/librte_eal/common/include/rte_pci.h +++ b/lib/librte_eal/common/include/rte_pci.h @@ -105,7 +105,8 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices. /** Nb. of values in PCI resource format. */ #define PCI_RESOURCE_FMT_NVAL 3 -/** IO resource type: memory address space */ +/** IO resource type: */ +#define IORESOURCE_IO 0x00000100 #define IORESOURCE_MEM 0x00000200 /** diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c index bc5b5be..9c4651d 100644 --- a/lib/librte_eal/linuxapp/eal/eal_pci.c +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c @@ -236,12 +236,21 @@ pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev) goto error; } - if (flags & IORESOURCE_MEM) { - dev->mem_resource[i].phys_addr = phys_addr; - dev->mem_resource[i].len = end_addr - phys_addr + 1; - /* not mapped for now */ - dev->mem_resource[i].addr = NULL; - } + /* we only care about IORESOURCE_IO or IORESOURCE_MEM */ + if (!(flags & IORESOURCE_IO) && + !(flags & IORESOURCE_MEM)) + continue; + +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) + /* x86 can not remap ioports, so skip it, remapping code will + * look at dev->mem_resource[i].phys_addr == 0 and skip it */ + if (flags & IORESOURCE_IO) + continue; +#endif + dev->mem_resource[i].phys_addr = phys_addr; + dev->mem_resource[i].len = end_addr - phys_addr + 1; + /* not mapped for now */ + dev->mem_resource[i].addr = NULL; } fclose(f); return 0;