[dpdk-dev] VFIO: Avoid to enable vfio while the module not loaded
Commit Message
When the kernel supports VFIO but the vfio module is not loaded,
the routine still tries to open the container to get a file
descriptor.
This is not safe, and of course it produces error messages:
EAL: Detected 40 lcore(s)
EAL: unsupported IOMMU type!
EAL: VFIO support could not be initialized
EAL: Setting up memory...
This may confuse users; this patch makes the behavior more reasonable
and much smoother for the user.
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
---
lib/librte_eal/common/include/rte_common.h | 37 ++++++++++++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 23 +++++++++++++------
2 files changed, 53 insertions(+), 7 deletions(-)
Comments
Hi Michael
> When the kernel supports VFIO but the vfio module is not loaded, the
> routine still tries to open the container to get a file descriptor.
>
> This is not safe, and of course it produces error messages:
>
> EAL: Detected 40 lcore(s)
> EAL: unsupported IOMMU type!
> EAL: VFIO support could not be initialized
> EAL: Setting up memory...
>
> This may confuse users; this patch makes the behavior more reasonable and
> much smoother for the user.
Not sure I agree with the premise of this patch.
First of all, if VFIO driver is not enabled, the container file would not be present and you would get a different error (namely, "cannot open VFIO container", in pci_vfio_get_container_fd()). If you have a container file, that means VFIO driver is loaded, so I'm not sure why you get the "unsupported IOMMU type" error. I suppose it could happen when vfio is loaded but vfio_iommu_type1 isn't?
And even then, this error is harmless and doesn't do anything, so I'm not sure what this patch is supposed to fix. The error message tells the user exactly what happened.
Thanks,
Anatoly
On 12/4/2014 9:12 PM, Burakov, Anatoly wrote:
> Hi Michael
>
>> When the kernel supports VFIO but the vfio module is not loaded, the
>> routine still tries to open the container to get a file descriptor.
>>
>> This is not safe, and of course it produces error messages:
>>
>> EAL: Detected 40 lcore(s)
>> EAL: unsupported IOMMU type!
>> EAL: VFIO support could not be initialized
>> EAL: Setting up memory...
>>
>> This may confuse users; this patch makes the behavior more reasonable and
>> much smoother for the user.
> Not sure I agree with the premise of this patch.
>
> First of all, if VFIO driver is not enabled, the container file would not be present and you would get a different error (namely, "cannot open VFIO container", in pci_vfio_get_container_fd()). If you have a container file, that means VFIO driver is loaded, so I'm not sure why you get the "unsupported IOMMU type" error. I suppose it could happen when vfio is loaded but vfio_iommu_type1 isn't?
But indeed, when I unload both vfio and vfio_iommu_type1,
/dev/vfio/vfio is still there. I was surprised too.
My environment is Fedora 20, kernel version 3.6.7-200, x86_64.
Believe it or not, you can try it yourself; it seems to be a kernel issue.
When you unload both modules and then open /dev/vfio/vfio, you will
find it can be opened with no errors (but at that point both modules
are loaded automatically, strangely enough).
You can also use ioctl to get the API version. But when you try to get the
IOMMU type, it returns 0 instead of the expected value of 1.
Then you can shut down DPDK and start it again (testpmd, for example), and everything works fine :)
I will take a deeper look at the kernel side to find out why this happens.
Thanks,
Michael
> And even then, this error is harmless and doesn't do anything, so I'm not sure what this patch is supposed to fix. The error message tells the user exactly what happened.
>
> Thanks,
> Anatoly
>
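The probing sequence described in this message (open the container node, query the API version, then check for the Type1 IOMMU) can be sketched with the ioctls from <linux/vfio.h>. This is only an illustrative sketch, not the code in eal_pci_vfio.c: the function name probe_vfio_container() and the result codes are hypothetical, and the node path is passed in as a parameter.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical result codes for this sketch; they are not part of DPDK. */
enum vfio_probe_result {
	VFIO_PROBE_BAD_API = -2,      /* unexpected VFIO API version */
	VFIO_PROBE_NO_CONTAINER = -1, /* container node could not be opened */
	VFIO_PROBE_NO_TYPE1 = 0,      /* node opened, Type1 IOMMU not reported */
	VFIO_PROBE_OK = 1             /* Type1 IOMMU backend is available */
};

/* Probe a VFIO container node such as /dev/vfio/vfio. */
static int
probe_vfio_container(const char *path)
{
	int fd, ret;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return VFIO_PROBE_NO_CONTAINER;

	/* The kernel reports its API version; it should match the header. */
	if (ioctl(fd, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
		close(fd);
		return VFIO_PROBE_BAD_API;
	}

	/*
	 * VFIO_CHECK_EXTENSION returns a positive value when the extension
	 * is supported. A return of 0 (or an error) is treated here as
	 * "Type1 not available", which is the case reported in this thread.
	 */
	ret = ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU);
	close(fd);

	return ret > 0 ? VFIO_PROBE_OK : VFIO_PROBE_NO_TYPE1;
}
```

Calling probe_vfio_container("/dev/vfio/vfio") on the setup described above would be expected to give VFIO_PROBE_NO_TYPE1 the first time, which distinguishes "node present but IOMMU backend missing" from "node absent".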
Hi Michael
> But indeed, when I unload both vfio and vfio_iommu_type1,
> /dev/vfio/vfio is still there. I was surprised too.
>
> My environment is Fedora 20, kernel version 3.6.7-200, x86_64.
>
> Believe it or not, you can try it yourself; it seems to be a kernel issue.
>
> When you unload both modules and then open /dev/vfio/vfio, you will find
> it can be opened with no errors (but at that point both modules are loaded
> automatically, strangely enough).
>
Thanks to Sergio, we found the most likely cause for this: a Linux kernel patch by Alex Williamson of Red Hat:
https://lkml.org/lkml/2013/12/12/421
It seems, however, that it was merged in 3.14. Your kernel, by your own admission, is 3.6. Are you sure this is the right kernel version? My own machine has Fedora 18 with a 3.11 kernel, and it (correctly) does not display this behavior. So unless Fedora 20 backported those changes to kernel 3.6, this shouldn't happen on your setup. (It doesn't really matter, just FYI: the patch should still be fixed and resubmitted, as we discussed.)
Thanks,
Anatoly
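The version reasoning above (3.6 and 3.11 are older than 3.14, while 3.16 is newer) is easy to get wrong when release strings are compared as text. A minimal sketch of a numeric comparison follows, assuming, as suggested in this message, that 3.14 is the release where the behavior changed. The helper names parse_kernel_release() and kernel_at_least() are hypothetical.

```c
#include <stdio.h>

/*
 * Parse the leading "major.minor" of a kernel release string such as
 * "3.6.7-200". Returns 0 on success, -1 if the string cannot be parsed.
 */
static int
parse_kernel_release(const char *release, int *major, int *minor)
{
	if (release == NULL || sscanf(release, "%d.%d", major, minor) != 2)
		return -1;
	return 0;
}

/*
 * Returns 1 if the release is at least want_major.want_minor,
 * 0 if it is older, and -1 if the release string cannot be parsed.
 */
static int
kernel_at_least(const char *release, int want_major, int want_minor)
{
	int major, minor;

	if (parse_kernel_release(release, &major, &minor) != 0)
		return -1;
	if (major != want_major)
		return major > want_major;
	return minor >= want_minor;
}
```

On a live system the release string could be taken from the release field that uname() fills in (see <sys/utsname.h>). With the versions mentioned in this thread, kernel_at_least("3.6.7-200", 3, 14) is 0 and kernel_at_least("3.16.7-200", 3, 14) is 1.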
On 12/5/2014 12:31 AM, Burakov, Anatoly wrote:
> Hi Michael
>
>> But indeed, when I unload both vfio and vfio_iommu_type1,
>> /dev/vfio/vfio is still there. I was surprised too.
>>
>> My environment is Fedora 20, kernel version 3.6.7-200, x86_64.
>>
>> Believe it or not, you can try it yourself; it seems to be a kernel issue.
>>
>> When you unload both modules and then open /dev/vfio/vfio, you will find
>> it can be opened with no errors (but at that point both modules are loaded
>> automatically, strangely enough).
>>
> Thanks to Sergio, we found the most likely cause for this: a Linux kernel patch by Alex Williamson of Red Hat:
>
> https://lkml.org/lkml/2013/12/12/421
>
> It seems, however, that it was merged in 3.14. Your kernel, by your own admission, is 3.6. Are you sure this is the right kernel version? My own machine has Fedora 18 with a 3.11 kernel, and it (correctly) does not
Sorry, the kernel version is 3.16 :) I just made a mistake.
Thanks,
Michael
> display this behavior. So unless Fedora 20 backported those changes to kernel 3.6, this shouldn't happen on your setup. (It doesn't really matter, just FYI: the patch should still be fixed and resubmitted, as we discussed.)
>
> Thanks,
> Anatoly
>
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -50,6 +50,8 @@ extern "C" {
 #include <ctype.h>
 #include <errno.h>
 #include <limits.h>
+#include <string.h>
+#include <stdio.h>
 
 /*********** Macros to eliminate unused variable warnings ********/
 
@@ -382,6 +384,41 @@ rte_exit(int exit_code, const char *format, ...)
 	__attribute__((noreturn))
 	__attribute__((format(printf, 2, 3)));
 
+/**
+ * Function is to check if the kernel module(like, vfio, vfio_iommu_type1,
+ * etc.) loaded.
+ *
+ * @param module_name
+ *   The module's name which need to be checked
+ *
+ * @return
+ *   -1 means some error happens(NULL pointer or open failure)
+ *   0 means the module not loaded
+ *   1 means the module loaded
+ */
+static inline int
+check_module(const char *module_name)
+{
+	char mod_name[30]; /* Any module names can be longer than 30 bytes? */
+	int ret = 0;
+
+	if (NULL == module_name)
+		return -1;
+	FILE * fd = fopen("/proc/modules", "r");
+	if( fd == NULL)
+		return -1;
+	while(!feof(fd)) {
+		fscanf(fd, "%s %*[^\n]", mod_name);
+		if(!strcmp(mod_name, module_name)) {
+			ret = 1;
+			break;
+		}
+	}
+	fclose(fd);
+
+	return ret;
+}
+
 #ifdef __cplusplus
 }
 #endif
--- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
@@ -44,6 +44,7 @@
 #include <rte_tailq.h>
 #include <rte_eal_memconfig.h>
 #include <rte_malloc.h>
+#include <rte_common.h>
 
 #include "eal_filesystem.h"
 #include "eal_pci_init.h"
@@ -342,7 +343,8 @@ pci_vfio_get_container_fd(void)
 			RTE_LOG(ERR, EAL, " could not get IOMMU type, "
 					"error %i (%s)\n", errno, strerror(errno));
 		else
-			RTE_LOG(ERR, EAL, " unsupported IOMMU type!\n");
+			RTE_LOG(ERR, EAL, " unsupported IOMMU type! "
+					"expect: 1, actual: %d\n", ret);
 		close(vfio_container_fd);
 		return -1;
 	}
@@ -788,13 +790,20 @@ pci_vfio_enable(void)
 		vfio_cfg.vfio_groups[i].fd = -1;
 		vfio_cfg.vfio_groups[i].group_no = -1;
 	}
-	vfio_cfg.vfio_container_fd = pci_vfio_get_container_fd();
-	/* check if we have VFIO driver enabled */
-	if (vfio_cfg.vfio_container_fd != -1)
-		vfio_cfg.vfio_enabled = 1;
-	else
-		RTE_LOG(INFO, EAL, "VFIO support could not be initialized\n");
+	if (check_module("vfio") == 1 &&
+			check_module("vfio_iommu_type1") == 1) {
+		vfio_cfg.vfio_container_fd = pci_vfio_get_container_fd();
+
+		/* check if we have VFIO driver enabled */
+		if (vfio_cfg.vfio_container_fd != -1)
+			vfio_cfg.vfio_enabled = 1;
+		else
+			RTE_LOG(INFO, EAL, "VFIO support could not be"
+					" initialized\n");
+	} else
+		RTE_LOG(INFO, EAL, "VFIO modules are not all loaded,"
+				" skip VFIO support ...\n");
 
 	return 0;
 }
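
The check_module() helper in the patch reads /proc/modules with an unbounded "%s" into a 30-byte buffer and loops on feof(), so a long module name could overflow the buffer and a failed read could reuse stale data. One possible bounded variant is sketched below. It is not the version that was resubmitted. The names module_listed() and check_module_loaded() and the buffer sizes are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumed upper bound on a module name; keep in sync with "%63s" below. */
#define MOD_NAME_MAX 64

/*
 * Scan a /proc/modules-style stream for module_name.
 * Returns 1 if listed, 0 if not listed, -1 on a NULL argument.
 */
static int
module_listed(FILE *fp, const char *module_name)
{
	char line[256];
	char name[MOD_NAME_MAX];
	int at_line_start = 1;

	if (fp == NULL || module_name == NULL)
		return -1;

	while (fgets(line, sizeof(line), fp) != NULL) {
		size_t len = strlen(line);

		/* Only the first field of a real line is a module name. */
		if (at_line_start &&
		    sscanf(line, "%63s", name) == 1 &&
		    strcmp(name, module_name) == 0)
			return 1;

		/* A chunk without '\n' means the line continues. */
		at_line_start = (len > 0 && line[len - 1] == '\n');
	}

	return 0;
}

/* Same return convention as above; -1 also covers an open failure. */
static int
check_module_loaded(const char *module_name)
{
	FILE *fp;
	int ret;

	if (module_name == NULL)
		return -1;
	fp = fopen("/proc/modules", "r");
	if (fp == NULL)
		return -1;
	ret = module_listed(fp, module_name);
	fclose(fp);

	return ret;
}
```

Splitting the parser from the file open keeps the return convention documented in the patch (-1, 0, 1) and lets the matching logic be exercised on any stream, not only on /proc/modules.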