[v3,4/4] doc: add VFIO troubleshooting section

Message ID: 75243d22963555e1460b243df6bb12efd146abf4.1605785484.git.anatoly.burakov@intel.com (mailing list archive)
State: Accepted, archived
Delegated to: Thomas Monjalon
Series: [v3,1/4] doc: move VFIO driver to be first

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK
ci/travis-robot       success  Travis build: passed

Commit Message

Burakov, Anatoly Nov. 19, 2020, 11:32 a.m. UTC
There are common problems with VFIO that get asked over and over on the
mailing list. Document common problems with VFIO and how to fix them or
at least figure out what went wrong.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 doc/guides/linux_gsg/linux_drivers.rst | 43 ++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
  

Comments

Thomas Monjalon Nov. 27, 2020, 3:30 p.m. UTC | #1
19/11/2020 12:32, Anatoly Burakov:
> There are common problems with VFIO that get asked over and over on the
> mailing list. Document common problems with VFIO and how to fix them or
> at least figure out what went wrong.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Thanks for the effort.
I hope there will be some reviews for 21.02.
It seems too late for 20.11.
  
Bruce Richardson Nov. 27, 2020, 3:49 p.m. UTC | #2
On Thu, Nov 19, 2020 at 11:32:32AM +0000, Anatoly Burakov wrote:
> There are common problems with VFIO that get asked over and over on the
> mailing list. Document common problems with VFIO and how to fix them or
> at least figure out what went wrong.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---

One typo spotted below.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

>  doc/guides/linux_gsg/linux_drivers.rst | 43 ++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
> index 9c61850dbb..f3c06c68d1 100644
> --- a/doc/guides/linux_gsg/linux_drivers.rst
> +++ b/doc/guides/linux_gsg/linux_drivers.rst
> @@ -276,3 +276,46 @@ To restore device ``82:00.0`` to its original kernel binding:
>  .. code-block:: console
>  
>      ./usertools/dpdk-devbind.py --bind=ixgbe 82:00.0
> +
> +Troubleshooting VFIO
> +--------------------
> +
> +In certain situations, using ``dpdk-devbind.py`` script to bing a device to VFIO

typo: s/bing/bind/
  
Burakov, Anatoly Nov. 27, 2020, 3:57 p.m. UTC | #3
On 27-Nov-20 3:30 PM, Thomas Monjalon wrote:
> 19/11/2020 12:32, Anatoly Burakov:
>> There are common problems with VFIO that get asked over and over on the
>> mailing list. Document common problems with VFIO and how to fix them or
>> at least figure out what went wrong.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Thanks for the effort.
> I hope there will be some reviews for 21.02.
> It seems too late for 20.11.
> 
> 
> 

All but one of the patches seem to be acked now. Is it still possible to get
the series into LTS? Or maybe we'll get it into 21.02 and backport it then?
It seems like a missed opportunity for good documentation.
  
Kevin Traynor Nov. 27, 2020, 4:29 p.m. UTC | #4
On 19/11/2020 11:32, Anatoly Burakov wrote:
> There are common problems with VFIO that get asked over and over on the
> mailing list. Document common problems with VFIO and how to fix them or
> at least figure out what went wrong.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  doc/guides/linux_gsg/linux_drivers.rst | 43 ++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
> index 9c61850dbb..f3c06c68d1 100644
> --- a/doc/guides/linux_gsg/linux_drivers.rst
> +++ b/doc/guides/linux_gsg/linux_drivers.rst
> @@ -276,3 +276,46 @@ To restore device ``82:00.0`` to its original kernel binding:
>  .. code-block:: console
>  
>      ./usertools/dpdk-devbind.py --bind=ixgbe 82:00.0
> +
> +Troubleshooting VFIO
> +--------------------
> +
> +In certain situations, using ``dpdk-devbind.py`` script to bing a device to VFIO
> +driver may fail. The first place to check is the kernel messages:
> +
> +.. code-block:: console
> +
> +    # dmesg | tail
> +    ...
> +    [ 1297.875090] vfio-pci: probe of 0000:31:00.0 failed with error -22
> +    ...
> +
> +In most cases, the ``error -22`` indicates that the VFIO subsystem couldn't be
> +enabled because there is no IOMMU support. To check whether the kernel has been
> +booted with correct parameters, one can check the kernel command-line:
> +
> +.. code-block:: console
> +
> +    cat /proc/cmdline
> +
> +Please refer to earlier sections on how to configure kernel parameters correctly
> +for your system.
> +
> +If the kernel is configured correctly, one also has to make sure that the BIOS
> +configuration has virtualization features (such as Intel® VT-d). There is no
> +standard way to check if the platform is configured correctly, so please check
> +with your platform documentation to see if it has such features, and how to
> +enable them.
> +
> +In certain distributions, default kernel configuration is such that the no-IOMMU
> +mode is disabled altogether at compile time. This can be checked in the boot
> +configuration of your system:
> +
> +.. code-block:: console
> +
> +    # cat /boot/config-$(uname -r) | grep NOIOMMU
> +    # CONFIG_VFIO_NOIOMMU is not set
> +
> +If ``CONFIG_VFIO_NOIOMMU`` is not enabled in the kernel configuration, VFIO
> +driver will not support the no-IOMMU mode, and other alternatives (such as UIO
> +drivers) will have to be used.
> 

Good to have some debug hints and it avoids a backport ;-)

With Bruce's s/bing/bind/

Acked-by: Kevin Traynor <ktraynor@redhat.com>
  
Thomas Monjalon Nov. 27, 2020, 5:27 p.m. UTC | #5
27/11/2020 17:29, Kevin Traynor:
> On 19/11/2020 11:32, Anatoly Burakov wrote:
> > There are common problems with VFIO that get asked over and over on the
> > mailing list. Document common problems with VFIO and how to fix them or
> > at least figure out what went wrong.
> > 
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
[...]
> With Bruce's s/bing/bind/
> 
> Acked-by: Kevin Traynor <ktraynor@redhat.com>

Series applied with some formatting adjustments, thanks
  

Patch

diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
index 9c61850dbb..f3c06c68d1 100644
--- a/doc/guides/linux_gsg/linux_drivers.rst
+++ b/doc/guides/linux_gsg/linux_drivers.rst
@@ -276,3 +276,46 @@  To restore device ``82:00.0`` to its original kernel binding:
 .. code-block:: console
 
     ./usertools/dpdk-devbind.py --bind=ixgbe 82:00.0
+
+Troubleshooting VFIO
+--------------------
+
+In certain situations, using ``dpdk-devbind.py`` script to bing a device to VFIO
+driver may fail. The first place to check is the kernel messages:
+
+.. code-block:: console
+
+    # dmesg | tail
+    ...
+    [ 1297.875090] vfio-pci: probe of 0000:31:00.0 failed with error -22
+    ...
+
+In most cases, the ``error -22`` indicates that the VFIO subsystem couldn't be
+enabled because there is no IOMMU support. To check whether the kernel has been
+booted with correct parameters, one can check the kernel command-line:
+
+.. code-block:: console
+
+    cat /proc/cmdline
+
+Please refer to earlier sections on how to configure kernel parameters correctly
+for your system.
+
+If the kernel is configured correctly, one also has to make sure that the BIOS
+configuration has virtualization features (such as Intel® VT-d). There is no
+standard way to check if the platform is configured correctly, so please check
+with your platform documentation to see if it has such features, and how to
+enable them.
+
+In certain distributions, default kernel configuration is such that the no-IOMMU
+mode is disabled altogether at compile time. This can be checked in the boot
+configuration of your system:
+
+.. code-block:: console
+
+    # cat /boot/config-$(uname -r) | grep NOIOMMU
+    # CONFIG_VFIO_NOIOMMU is not set
+
+If ``CONFIG_VFIO_NOIOMMU`` is not enabled in the kernel configuration, VFIO
+driver will not support the no-IOMMU mode, and other alternatives (such as UIO
+drivers) will have to be used.
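
The three checks described in the patch (kernel messages, kernel command
line, and kernel build configuration) can be collected into a few shell
helpers. This is a minimal sketch and not part of the patch or of DPDK: the
function names are hypothetical, and treating intel_iommu=on as the parameter
to look for is an assumption that only applies to Intel platforms.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the troubleshooting steps in the patch.
# None of these names exist in DPDK; they are for illustration only.

# Filter kernel messages (read from stdin) down to failed vfio-pci probes.
vfio_probe_failures() {
    grep 'vfio-pci: probe of .* failed with error'
}

# Succeed if the kernel command line passed as $1 requests the Intel IOMMU.
# Assumption: intel_iommu=on is the relevant parameter; other platforms
# use different parameters or enable the IOMMU by default.
cmdline_requests_iommu() {
    case " $1 " in
        *" intel_iommu=on"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the state of CONFIG_VFIO_NOIOMMU in the kernel config file $1:
# "enabled", "disabled", or "unknown" if the file cannot be read.
noiommu_config_state() {
    if [ ! -r "$1" ]; then
        echo unknown
    elif grep -q '^CONFIG_VFIO_NOIOMMU=y' "$1"; then
        echo enabled
    else
        echo disabled
    fi
}
```

On a live system these could be called as
`dmesg | vfio_probe_failures`,
`cmdline_requests_iommu "$(cat /proc/cmdline)"`, and
`noiommu_config_state "/boot/config-$(uname -r)"`. Reading `dmesg` may
require root privileges, and not every distribution installs the kernel
configuration under /boot.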