eal: fix detection of static or shared DPDK builds

Message ID 20210208163319.507567-1-bruce.richardson@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Series eal: fix detection of static or shared DPDK builds

Checks

Context                      Check    Description
ci/iol-broadcom-Functional   success  Functional Testing PASS
ci/Intel-compilation         success  Compilation OK
ci/iol-broadcom-Performance  success  Performance Testing PASS
ci/intel-Testing             success  Testing PASS
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-testing               warning  Testing issues
ci/travis-robot              warning  Travis build: failed
ci/checkpatch                success  coding style OK

Commit Message

Bruce Richardson Feb. 8, 2021, 4:33 p.m. UTC
When checking the loading of EAL shared lib to see if we have a shared
DPDK build, we only want to include part of the ABI version in the check
rather than the whole thing. For example, with ABI version 21.1 for DPDK
release 21.02, the linker links the binary against librte_eal.so.21,
without the ".1".

To avoid any further brittleness in this area, we can check for multiple
versions when doing the check, since just about any version of EAL implies
a shared build. Therefore we check for presence of librte_eal.so with full
ABI_VERSION extension, and then repeatedly remove the end part of the
filename after the last dot, checking each time. For example (debug log
output for static build):

  EAL: Checking presence of .so 'librte_eal.so.21.1'
  EAL: Checking presence of .so 'librte_eal.so.21'
  EAL: Checking presence of .so 'librte_eal.so'
  EAL: Detected static linkage of DPDK

Fixes: 7781950f4d38 ("eal: fix shared lib mode detection")
Cc: tredaelli@redhat.com
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_options.c | 35 +++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

--
2.27.0
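
The retry order described in the commit message can be sketched as a small
standalone helper. This is only an illustration, not the code from the patch:
the function name eal_so_candidates(), the NAME_LEN and MAX_CANDIDATES limits,
and passing the ABI version as a runtime string are all assumptions made for
the example. It lists the names that would be probed, longest first, without
calling dlopen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EAL_SO "librte_eal.so"	/* base name, as in the patch */
#define NAME_LEN 32		/* hypothetical buffer size for this sketch */
#define MAX_CANDIDATES 8	/* hypothetical upper bound on names returned */

/*
 * Fill out[] with the library names to probe for the given ABI version,
 * longest first. Returns the number of names written, or -1 if the full
 * name does not fit in NAME_LEN bytes.
 */
static int
eal_so_candidates(const char *abi_version, char out[][NAME_LEN], int max)
{
	const size_t minlen = strlen(EAL_SO);
	char name[NAME_LEN];
	char *dot;
	int count = 0;
	int n;

	assert(abi_version != NULL && out != NULL);

	n = snprintf(name, sizeof(name), "%s.%s", EAL_SO, abi_version);
	if (n < 0 || (size_t)n >= sizeof(name))
		return -1; /* truncated: do not probe a wrong name */

	/* stop once the name is shorter than the bare "librte_eal.so" */
	while (strlen(name) >= minlen && count < max) {
		strcpy(out[count++], name);

		/* drop the last ".component" and retry */
		dot = strrchr(name, '.');
		if (dot == NULL)
			break;
		*dot = '\0';
	}
	return count;
}
```

For ABI version "21.1" this yields the same three names as the debug log in
the commit message. A real check would pass each name to
dlopen(name, RTLD_LAZY | RTLD_NOLOAD) and stop at the first non-NULL result.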
  

Comments

Bruce Richardson Feb. 9, 2021, 12:49 p.m. UTC | #1

I saw this issue with OVS, where I was getting weird failures about ports
not being bound (in case of physical ports) or not being created (in case
of virtio ports), when using a shared build. Since it's potentially
serious, I'd appreciate it if someone could reproduce the issue and verify
the fix so we can consider it for 21.02 inclusion.

To demonstrate this with regular DPDK, do a usual build of DPDK and then do
"ninja install" to install system-wide. Then build an example app, e.g.
l2fwd, using "make" from the examples/l2fwd directory. Running the example
normally, e.g. ./build/l2fwd -c F00, leads to no drivers being loaded or
ports being found. Adding "-d /path/to/drivers" e.g.
"/usr/local/lib/x86_64-linux-gnu/dpdk/pmds-21.1" on my system works as
expected. This shows the driver loading is not correct.

After applying this patch and re-running "ninja install", l2fwd should run
the same with and without the "-d" flag.

/Bruce
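
The failure described above comes down to a name mismatch, which can be
modelled without a DPDK install. The sketch below is a simulation under stated
assumptions, not EAL code: mock_loaded() stands in for
"dlopen(name, RTLD_LAZY | RTLD_NOLOAD) != NULL", and loaded_name is assumed to
be the only EAL name known to the dynamic loader (librte_eal.so.21, per the
commit message). The function names shared_build_old() and shared_build_new()
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the one EAL name registered with the dynamic loader. */
static const char *loaded_name = "librte_eal.so.21";

/* Mock of "dlopen(name, RTLD_LAZY | RTLD_NOLOAD) != NULL". */
static bool
mock_loaded(const char *name)
{
	assert(name != NULL);
	return strcmp(name, loaded_name) == 0;
}

/* Pre-patch behaviour: only the name with the full ABI version is tried. */
static bool
shared_build_old(const char *abi_version)
{
	char name[32];

	snprintf(name, sizeof(name), "librte_eal.so.%s", abi_version);
	return mock_loaded(name);
}

/* Post-patch behaviour: retry with trailing ".N" components removed. */
static bool
shared_build_new(const char *abi_version)
{
	const size_t minlen = strlen("librte_eal.so");
	char name[32];
	char *dot;

	snprintf(name, sizeof(name), "librte_eal.so.%s", abi_version);
	while (strlen(name) >= minlen) {
		if (mock_loaded(name))
			return true;
		dot = strrchr(name, '.');
		if (dot == NULL)
			break;
		*dot = '\0';
	}
	return false;
}
```

With ABI version "21.1", shared_build_old() reports a static build even though
the shared EAL is loaded, which is consistent with the default driver
directory being skipped unless "-d" is given. shared_build_new() finds the
library on its second attempt.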
  
Sunil Pai G Feb. 9, 2021, 8:10 p.m. UTC | #2
Hi Bruce,

Thanks for the fix.
I do see the issue mentioned when using DPDK shared libs with OVS and this patch fixes it.

However, I saw the issue only for system-installed DPDK, not for directory-installed DPDK.

 
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
  
Thomas Monjalon Feb. 10, 2021, 9:25 a.m. UTC | #3

Applied, thanks
  

Patch

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 6b3707725f..94029bf7f1 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -494,6 +494,39 @@  eal_dlopen(const char *pathname)
 	return retval;
 }

+static int
+is_shared_build(void)
+{
+#define EAL_SO "librte_eal.so"
+	char soname[32];
+	size_t len, minlen = strlen(EAL_SO);
+
+	len = strlcpy(soname, EAL_SO"."ABI_VERSION, sizeof(soname));
+	if (len >= sizeof(soname)) {
+		RTE_LOG(ERR, EAL, "Shared lib name too long in shared build check\n");
+		len = sizeof(soname) - 1;
+	}
+
+	while (len >= minlen) {
+		/* check if we have this .so loaded, if so - shared build */
+		RTE_LOG(DEBUG, EAL, "Checking presence of .so '%s'\n", soname);
+		if (dlopen(soname, RTLD_LAZY | RTLD_NOLOAD) != NULL) {
+			RTE_LOG(INFO, EAL, "Detected shared linkage of DPDK\n");
+			return 1;
+		}
+
+		/* remove any version numbers off the end to retry */
+		while (len-- > 0)
+			if (soname[len] == '.') {
+				soname[len] = '\0';
+				break;
+			}
+	}
+
+	RTE_LOG(INFO, EAL, "Detected static linkage of DPDK\n");
+	return 0;
+}
+
 int
 eal_plugins_init(void)
 {
@@ -505,7 +538,7 @@  eal_plugins_init(void)
 	 * (Using dlopen with NOLOAD flag on EAL, will return NULL if the EAL
 	 * shared library is not already loaded i.e. it's statically linked.)
 	 */
-	if (dlopen("librte_eal.so."ABI_VERSION, RTLD_LAZY | RTLD_NOLOAD) != NULL &&
+	if (is_shared_build() &&
 			*default_solib_dir != '\0' &&
 			stat(default_solib_dir, &sb) == 0 &&
 			S_ISDIR(sb.st_mode))