raw/ioat: fix bus requiring virtual addressing when no devs

Message ID 20210506150908.797110-1-bruce.richardson@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers
Series raw/ioat: fix bus requiring virtual addressing when no devs |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/iol-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/github-robot success github build: passed
ci/Intel-compilation fail Compilation issues
ci/iol-mellanox-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Functional success Functional Testing PASS

Commit Message

Bruce Richardson May 6, 2021, 3:09 p.m. UTC
  If after a bus scan, there are no devices using a particular bus, then
that bus should not be taken into account when deciding whether DPDK
should be run in VA or PA addressing mode. This becomes an issue when
the DSA bus driver code is used on a system without an IOMMU. The PCI
bus correctly reports that it only works in PA mode, while the DSA bus -
also correctly - reports that it works only in VA mode. The difference
is that there will be no devices found in a scan for the DSA bus, since
the kernel driver can only present those to userspace in the presence of
an IOMMU.

While we could change DSA instance to always report that it does not
care about the addressing mode, this would imply that it could be used
with DPDK in PA mode which is not the case. Therefore, this patch
changes the driver to report DC (don't care) in the case where no
devices are present, and VA otherwise.

NOTE: this addressing mode use of VA-only applies only in the case of
using DSA through the idxd kernel driver. The use of DSA though vfio-pci
is unaffected and works as with other PCI devices.

Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")

Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/raw/ioat/idxd_bus.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Burakov, Anatoly May 6, 2021, 3:16 p.m. UTC | #1
On 06-May-21 4:09 PM, Bruce Richardson wrote:
> If after a bus scan, there are no devices using a particular bus, then
> that bus should not be taken into account when deciding whether DPDK
> should be run in VA or PA addressing mode. This becomes an issue when
> the DSA bus driver code is used on a system without an IOMMU. The PCI
> bus correctly reports that it only works in PA mode, while the DSA bus -
> also correctly - reports that it works only in VA mode. The difference
> is that there will be no devices found in a scan for the DSA bus, since
> the kernel driver can only present those to userspace in the presence of
> an IOMMU.
> 
> While we could change DSA instance to always report that it does not
> care about the addressing mode, this would imply that it could be used
> with DPDK in PA mode which is not the case. Therefore, this patch
> changes the driver to report DC (don't care) in the case where no
> devices are present, and VA otherwise.
> 
> NOTE: this addressing mode use of VA-only applies only in the case of
> using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> is unaffected and works as with other PCI devices.
> 
> Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> 
> Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
Burakov, Anatoly May 6, 2021, 3:27 p.m. UTC | #2
On 06-May-21 4:09 PM, Bruce Richardson wrote:
> If after a bus scan, there are no devices using a particular bus, then
> that bus should not be taken into account when deciding whether DPDK
> should be run in VA or PA addressing mode. This becomes an issue when
> the DSA bus driver code is used on a system without an IOMMU. The PCI
> bus correctly reports that it only works in PA mode, while the DSA bus -
> also correctly - reports that it works only in VA mode. The difference
> is that there will be no devices found in a scan for the DSA bus, since
> the kernel driver can only present those to userspace in the presence of
> an IOMMU.
> 
> While we could change DSA instance to always report that it does not
> care about the addressing mode, this would imply that it could be used
> with DPDK in PA mode which is not the case. Therefore, this patch
> changes the driver to report DC (don't care) in the case where no
> devices are present, and VA otherwise.
> 
> NOTE: this addressing mode use of VA-only applies only in the case of
> using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> is unaffected and works as with other PCI devices.
> 
> Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> 
> Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

This would be a good opportunity to start a discussion on a "proper" fix 
for this, because while this hack is acceptable being so close to 
release, IMO we need a more general solution. Maybe a bus API reporting 
number of scanned devices would be nice to have, as EAL will then be 
able to ignore the opinion of those buses who don't have any devices.

The "number of devices" is a bit of a fluid concept, as number of 
scanned devices may not be the same as number of *probed* devices, and 
there's also hotplug. However, i think that we can ignore all that, 
because we care about initialization only - any other issues can be 
addressed by forcing --iova-mode by the user, as we cannot really do 
anything about hotplug or devices that fail to probe.

Thoughts?
  
Van Haaren, Harry May 6, 2021, 3:27 p.m. UTC | #3
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Thursday, May 6, 2021 4:17 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org
> Cc: Van Haaren, Harry <harry.van.haaren@intel.com>
> Subject: Re: [PATCH] raw/ioat: fix bus requiring virtual addressing when no devs
> 
> On 06-May-21 4:09 PM, Bruce Richardson wrote:
> > If after a bus scan, there are no devices using a particular bus, then
> > that bus should not be taken into account when deciding whether DPDK
> > should be run in VA or PA addressing mode. This becomes an issue when
> > the DSA bus driver code is used on a system without an IOMMU. The PCI
> > bus correctly reports that it only works in PA mode, while the DSA bus -
> > also correctly - reports that it works only in VA mode. The difference
> > is that there will be no devices found in a scan for the DSA bus, since
> > the kernel driver can only present those to userspace in the presence of
> > an IOMMU.
> >
> > While we could change DSA instance to always report that it does not
> > care about the addressing mode, this would imply that it could be used
> > with DPDK in PA mode which is not the case. Therefore, this patch
> > changes the driver to report DC (don't care) in the case where no
> > devices are present, and VA otherwise.
> >
> > NOTE: this addressing mode use of VA-only applies only in the case of
> > using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> > is unaffected and works as with other PCI devices.
> >
> > Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> >
> > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> 
> Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>

Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
  
Bruce Richardson May 6, 2021, 3:34 p.m. UTC | #4
On Thu, May 06, 2021 at 04:27:03PM +0100, Burakov, Anatoly wrote:
> On 06-May-21 4:09 PM, Bruce Richardson wrote:
> > If after a bus scan, there are no devices using a particular bus, then
> > that bus should not be taken into account when deciding whether DPDK
> > should be run in VA or PA addressing mode. This becomes an issue when
> > the DSA bus driver code is used on a system without an IOMMU. The PCI
> > bus correctly reports that it only works in PA mode, while the DSA bus -
> > also correctly - reports that it works only in VA mode. The difference
> > is that there will be no devices found in a scan for the DSA bus, since
> > the kernel driver can only present those to userspace in the presence of
> > an IOMMU.
> > 
> > While we could change DSA instance to always report that it does not
> > care about the addressing mode, this would imply that it could be used
> > with DPDK in PA mode which is not the case. Therefore, this patch
> > changes the driver to report DC (don't care) in the case where no
> > devices are present, and VA otherwise.
> > 
> > NOTE: this addressing mode use of VA-only applies only in the case of
> > using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> > is unaffected and works as with other PCI devices.
> > 
> > Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> > 
> > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> 
> This would be a good opportunity to start a discussion on a "proper" fix for
> this, because while this hack is acceptable being so close to release, IMO
> we need a more general solution. Maybe a bus API reporting number of scanned
> devices would be nice to have, as EAL will then be able to ignore the
> opinion of those buses who don't have any devices.
> 
> The "number of devices" is a bit of a fluid concept, as number of scanned
> devices may not be the same as number of *probed* devices, and there's also
> hotplug. However, i think that we can ignore all that, because we care about
> initialization only - any other issues can be addressed by forcing
> --iova-mode by the user, as we cannot really do anything about hotplug or
> devices that fail to probe.
> 
> Thoughts?
> 
Perhaps rather than just "number of devices" we should have buses report
whether they are relevant or not. For example, there are a number of
SoC-specific bus drivers in DPDK, which are presumably only relevant on
particular SoCs. Similarly here for the DSA bus, the presence of the
devices is a useful proxy for relevance - especially since it doesn't
support hotplug - but if we had a way to report relevance/non-relevance DSA
could just report itself as not-relevant in case where there is no IOMMU
enabled on the system, or when the "idxd" kernel driver is not present, for
example, and then have EAL skip even asking about PA/VA support

/Bruce
  
Conor Walsh May 7, 2021, 12:30 p.m. UTC | #5
<snip>
> > > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> >
> > Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Tested-by: Harry van Haaren <harry.van.haaren@intel.com>

I was having an issue while running l3fwd after rc2 where I had to force
IOVA mode to PA, this patch fixes that issue.

Tested-by: Conor Walsh <conor.walsh@intel.com>
  
Thomas Monjalon May 10, 2021, 10:13 a.m. UTC | #6
06/05/2021 17:09, Bruce Richardson:
> +static unsigned int devcount = 0;

ERROR:INITIALISED_STATIC: do not initialise statics to 0

I will fix (remove = 0) if you agree.
  
Thomas Monjalon May 10, 2021, 10:18 a.m. UTC | #7
06/05/2021 17:34, Bruce Richardson:
> On Thu, May 06, 2021 at 04:27:03PM +0100, Burakov, Anatoly wrote:
> > On 06-May-21 4:09 PM, Bruce Richardson wrote:
> > > If after a bus scan, there are no devices using a particular bus, then
> > > that bus should not be taken into account when deciding whether DPDK
> > > should be run in VA or PA addressing mode. This becomes an issue when
> > > the DSA bus driver code is used on a system without an IOMMU. The PCI
> > > bus correctly reports that it only works in PA mode, while the DSA bus -
> > > also correctly - reports that it works only in VA mode. The difference
> > > is that there will be no devices found in a scan for the DSA bus, since
> > > the kernel driver can only present those to userspace in the presence of
> > > an IOMMU.
> > > 
> > > While we could change DSA instance to always report that it does not
> > > care about the addressing mode, this would imply that it could be used
> > > with DPDK in PA mode which is not the case. Therefore, this patch
> > > changes the driver to report DC (don't care) in the case where no
> > > devices are present, and VA otherwise.
> > > 
> > > NOTE: this addressing mode use of VA-only applies only in the case of
> > > using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> > > is unaffected and works as with other PCI devices.
> > > 
> > > Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> > > 
> > > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > 
> > This would be a good opportunity to start a discussion on a "proper" fix for
> > this, because while this hack is acceptable being so close to release, IMO
> > we need a more general solution. Maybe a bus API reporting number of scanned
> > devices would be nice to have, as EAL will then be able to ignore the
> > opinion of those buses who don't have any devices.
> > 
> > The "number of devices" is a bit of a fluid concept, as number of scanned
> > devices may not be the same as number of *probed* devices, and there's also
> > hotplug. However, i think that we can ignore all that, because we care about
> > initialization only - any other issues can be addressed by forcing
> > --iova-mode by the user, as we cannot really do anything about hotplug or
> > devices that fail to probe.
> > 
> > Thoughts?
> > 
> Perhaps rather than just "number of devices" we should have buses report
> whether they are relevant or not. For example, there are a number of
> SoC-specific bus drivers in DPDK, which are presumably only relevant on
> particular SoCs. Similarly here for the DSA bus, the presence of the
> devices is a useful proxy for relevance - especially since it doesn't
> support hotplug - but if we had a way to report relevance/non-relevance DSA
> could just report itself as not-relevant in case where there is no IOMMU
> enabled on the system, or when the "idxd" kernel driver is not present, for
> example, and then have EAL skip even asking about PA/VA support

I think what you are describing is the same as reporting "don't care".
Why do you consider it as a hack?
  
Bruce Richardson May 10, 2021, 10:22 a.m. UTC | #8
On Mon, May 10, 2021 at 12:13:43PM +0200, Thomas Monjalon wrote:
> 06/05/2021 17:09, Bruce Richardson:
> > +static unsigned int devcount = 0;
> 
> ERROR:INITIALISED_STATIC: do not initialise statics to 0
> 
> I will fix (remove = 0) if you agree.
> 
Sure. Thanks.
  
Thomas Monjalon May 10, 2021, 10:37 a.m. UTC | #9
07/05/2021 14:30, Walsh, Conor:
> <snip>
> > > > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > >
> > > Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > 
> > Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
> 
> I was having an issue while running l3fwd after rc2 where I had to force
> IOVA mode to PA, this patch fixes that issue.
> 
> Tested-by: Conor Walsh <conor.walsh@intel.com>

Applied (with checkpatch fixed), thanks
  
Bruce Richardson May 10, 2021, 11:20 a.m. UTC | #10
On Mon, May 10, 2021 at 12:18:50PM +0200, Thomas Monjalon wrote:
> 06/05/2021 17:34, Bruce Richardson:
> > On Thu, May 06, 2021 at 04:27:03PM +0100, Burakov, Anatoly wrote:
> > > On 06-May-21 4:09 PM, Bruce Richardson wrote:
> > > > If after a bus scan, there are no devices using a particular bus, then
> > > > that bus should not be taken into account when deciding whether DPDK
> > > > should be run in VA or PA addressing mode. This becomes an issue when
> > > > the DSA bus driver code is used on a system without an IOMMU. The PCI
> > > > bus correctly reports that it only works in PA mode, while the DSA bus -
> > > > also correctly - reports that it works only in VA mode. The difference
> > > > is that there will be no devices found in a scan for the DSA bus, since
> > > > the kernel driver can only present those to userspace in the presence of
> > > > an IOMMU.
> > > > 
> > > > While we could change DSA instance to always report that it does not
> > > > care about the addressing mode, this would imply that it could be used
> > > > with DPDK in PA mode which is not the case. Therefore, this patch
> > > > changes the driver to report DC (don't care) in the case where no
> > > > devices are present, and VA otherwise.
> > > > 
> > > > NOTE: this addressing mode use of VA-only applies only in the case of
> > > > using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> > > > is unaffected and works as with other PCI devices.
> > > > 
> > > > Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> > > > 
> > > > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > > 
> > > This would be a good opportunity to start a discussion on a "proper" fix for
> > > this, because while this hack is acceptable being so close to release, IMO
> > > we need a more general solution. Maybe a bus API reporting number of scanned
> > > devices would be nice to have, as EAL will then be able to ignore the
> > > opinion of those buses who don't have any devices.
> > > 
> > > The "number of devices" is a bit of a fluid concept, as number of scanned
> > > devices may not be the same as number of *probed* devices, and there's also
> > > hotplug. However, i think that we can ignore all that, because we care about
> > > initialization only - any other issues can be addressed by forcing
> > > --iova-mode by the user, as we cannot really do anything about hotplug or
> > > devices that fail to probe.
> > > 
> > > Thoughts?
> > > 
> > Perhaps rather than just "number of devices" we should have buses report
> > whether they are relevant or not. For example, there are a number of
> > SoC-specific bus drivers in DPDK, which are presumably only relevant on
> > particular SoCs. Similarly here for the DSA bus, the presence of the
> > devices is a useful proxy for relevance - especially since it doesn't
> > support hotplug - but if we had a way to report relevance/non-relevance DSA
> > could just report itself as not-relevant in case where there is no IOMMU
> > enabled on the system, or when the "idxd" kernel driver is not present, for
> > example, and then have EAL skip even asking about PA/VA support
> 
> I think what you are describing is the same as reporting "don't care".
> Why do you consider it as a hack?
>
It's not the same as don't-care, because the bus just won't work with
iova-as-pa, it must have iova-as-va. Therefore to respond as anything other
than VA here would be incorrect.

The issue is that if one bus must have PA and another must have VA, then
obviously both cannot be in use at the same time, and we have no way of
sanely resolving that conflict, or specifying which to ignore. The reason
this patch is a hack, is that this dsa bus driver is taking it on itself to
basic say "ignore me" (by returning DC rather than VA), when no devices are
found.

Maybe it's the case that busses are supposed to behave that way, but it's
not clear from the doxygen comments. I would have thought that we should
have a way for a bus to signal back to EAL that the whole bus is not
relevant, and then can be skipped for querying PA/VA and for probe etc.

/Bruce
  
Thomas Monjalon May 10, 2021, 12:02 p.m. UTC | #11
10/05/2021 13:20, Bruce Richardson:
> On Mon, May 10, 2021 at 12:18:50PM +0200, Thomas Monjalon wrote:
> > 06/05/2021 17:34, Bruce Richardson:
> > > On Thu, May 06, 2021 at 04:27:03PM +0100, Burakov, Anatoly wrote:
> > > > On 06-May-21 4:09 PM, Bruce Richardson wrote:
> > > > > If after a bus scan, there are no devices using a particular bus, then
> > > > > that bus should not be taken into account when deciding whether DPDK
> > > > > should be run in VA or PA addressing mode. This becomes an issue when
> > > > > the DSA bus driver code is used on a system without an IOMMU. The PCI
> > > > > bus correctly reports that it only works in PA mode, while the DSA bus -
> > > > > also correctly - reports that it works only in VA mode. The difference
> > > > > is that there will be no devices found in a scan for the DSA bus, since
> > > > > the kernel driver can only present those to userspace in the presence of
> > > > > an IOMMU.
> > > > > 
> > > > > While we could change DSA instance to always report that it does not
> > > > > care about the addressing mode, this would imply that it could be used
> > > > > with DPDK in PA mode which is not the case. Therefore, this patch
> > > > > changes the driver to report DC (don't care) in the case where no
> > > > > devices are present, and VA otherwise.
> > > > > 
> > > > > NOTE: this addressing mode use of VA-only applies only in the case of
> > > > > using DSA through the idxd kernel driver. The use of DSA though vfio-pci
> > > > > is unaffected and works as with other PCI devices.
> > > > > 
> > > > > Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
> > > > > 
> > > > > Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > ---
> > > > 
> > > > This would be a good opportunity to start a discussion on a "proper" fix for
> > > > this, because while this hack is acceptable being so close to release, IMO
> > > > we need a more general solution. Maybe a bus API reporting number of scanned
> > > > devices would be nice to have, as EAL will then be able to ignore the
> > > > opinion of those buses who don't have any devices.
> > > > 
> > > > The "number of devices" is a bit of a fluid concept, as number of scanned
> > > > devices may not be the same as number of *probed* devices, and there's also
> > > > hotplug. However, i think that we can ignore all that, because we care about
> > > > initialization only - any other issues can be addressed by forcing
> > > > --iova-mode by the user, as we cannot really do anything about hotplug or
> > > > devices that fail to probe.
> > > > 
> > > > Thoughts?
> > > > 
> > > Perhaps rather than just "number of devices" we should have buses report
> > > whether they are relevant or not. For example, there are a number of
> > > SoC-specific bus drivers in DPDK, which are presumably only relevant on
> > > particular SoCs. Similarly here for the DSA bus, the presence of the
> > > devices is a useful proxy for relevance - especially since it doesn't
> > > support hotplug - but if we had a way to report relevance/non-relevance DSA
> > > could just report itself as not-relevant in case where there is no IOMMU
> > > enabled on the system, or when the "idxd" kernel driver is not present, for
> > > example, and then have EAL skip even asking about PA/VA support
> > 
> > I think what you are describing is the same as reporting "don't care".
> > Why do you consider it as a hack?
> >
> It's not the same as don't-care, because the bus just won't work with
> iova-as-pa, it must have iova-as-va. Therefore to respond as anything other
> than VA here would be incorrect.
> 
> The issue is that if one bus must have PA and another must have VA, then
> obviously both cannot be in use at the same time, and we have no way of
> sanely resolving that conflict, or specifying which to ignore. The reason
> this patch is a hack, is that this dsa bus driver is taking it on itself to
> basic say "ignore me" (by returning DC rather than VA), when no devices are
> found.
> 
> Maybe it's the case that busses are supposed to behave that way, but it's
> not clear from the doxygen comments. I would have thought that we should
> have a way for a bus to signal back to EAL that the whole bus is not
> relevant, and then can be skipped for querying PA/VA and for probe etc.

We can have more functions to better describe the real status,
but would it make any difference compared as the self "don't care"
meaning "ignore me at init stage"?
I think it could make a difference only for detecting issues on hotplug,
but it would not solve hotplug conflict :/
  

Patch

diff --git a/drivers/raw/ioat/idxd_bus.c b/drivers/raw/ioat/idxd_bus.c
index 5199786785..d7c5a19cec 100644
--- a/drivers/raw/ioat/idxd_bus.c
+++ b/drivers/raw/ioat/idxd_bus.c
@@ -19,6 +19,8 @@ 
 #define DSA_DEV_PATH "/dev/dsa"
 #define DSA_SYSFS_PATH "/sys/bus/dsa/devices"
 
+static unsigned int devcount = 0;
+
 /** unique identifier for a DSA device/WQ instance */
 struct dsa_wq_addr {
 	uint16_t device_id;
@@ -307,6 +309,7 @@  dsa_scan(void)
 		dev->device.bus = &dsa_bus.bus;
 		strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name));
 		TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next);
+		devcount++;
 
 		read_device_int(dev, "numa_node", &numa_node);
 		dev->device.numa_node = numa_node;
@@ -338,7 +341,8 @@  dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 static enum rte_iova_mode
 dsa_get_iommu_class(void)
 {
-	return RTE_IOVA_VA;
+	/* if there are no devices, report don't care, otherwise VA mode */
+	return devcount > 0 ? RTE_IOVA_VA : RTE_IOVA_DC;
 }
 
 static int