[v3] usertools: fix py3 support with pyelftools>=0.24

Message ID 20191015123918.10775-1-robin.jarry@6wind.com (mailing list archive)
State Accepted, archived
Series [v3] usertools: fix py3 support with pyelftools>=0.24

Checks

Context                      Check    Description
ci/iol-compilation           success  Compile Testing PASS
ci/iol-intel-Performance     success  Performance Testing PASS
ci/checkpatch                success  coding style OK
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/Intel-compilation         success  Compilation OK

Commit Message

Robin Jarry Oct. 15, 2019, 12:39 p.m. UTC
Running dpdk-pmdinfo.py on Ubuntu 18.04 (bionic) with python 3 and
pyelftools installed produces no output, and no error is reported
either:

  ~$ python3 usertools/dpdk-pmdinfo.py -r build/app/testpmd
  ~$ echo $?
  0

With python 2, it works:

  ~# python2 usertools/dpdk-pmdinfo.py -r build/app/testpmd
  {"pci_ids": [], "name": "dpio"}
  {"pci_ids": [], "name": "dpbp"}
  {"pci_ids": [], "name": "dpaa2_qdma"}
  .....

On Ubuntu 18.04, pyelftools is version 0.24. The change log of
pyelftools v0.24 says:

 - Symbol/section names are strings internally now, not bytestrings
   (this may affect API usage in Python 3) (#76).
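
As a rough illustration of how such a type change can fail silently on
Python 3 (a sketch only, not the script's actual code path): bytes and
str never compare equal there, so a lookup with the wrong type finds
nothing and raises nothing. The `sections` table and its values below
are hypothetical.

```python
# Hypothetical section table keyed by str names, the way a
# pyelftools >= 0.24 style API would expose them.
sections = {".rodata": "read-only data", ".text": "code"}

# A bytes key raises no error on Python 3, it just misses.
by_bytes = sections.get(b".rodata")
by_str = sections.get(".rodata")

print(by_bytes)
print(by_str)
```

On Python 3 the first lookup yields None and the second finds the
entry, which matches the "no output, no error" symptom in spirit.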

We cannot guess which version of pyelftools is actually being used. The
elftools.__version__ symbol is not consistent with each distro's package
version. For example, on Ubuntu 16.04 (xenial), the .deb package version
is '0.23-2' but elftools.__version__ contains '0.25'. This is certainly
due to partial backports.

To have a more consistent behaviour of this script across all versions
of python, add the unicode_literals future import so that literal
strings are now always "unicode".

Add two utility functions to force a string into bytes or bytes into a
unicode string.

Force pyelftools return values to unicode strings (this is a no-op with
recent versions of pyelftools).
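
A minimal sketch of the two helpers, mirroring the ones added in the
patch further down, with a few checks of their behaviour on Python 3.
The sample values are made up for illustration.

```python
def force_unicode(s):
    # Only objects with a callable decode() (bytes on Python 3) are
    # converted; anything else is returned unchanged.
    if hasattr(s, 'decode') and callable(s.decode):
        s = s.decode('latin-1')  # same encoding as pyelftools py3compat
    return s


def force_bytes(s):
    # Only objects with a callable encode() (str on Python 3) are
    # converted; anything else is returned unchanged.
    if hasattr(s, 'encode') and callable(s.encode):
        s = s.encode('latin-1')
    return s


decoded = force_unicode(b'DT_NEEDED')   # bytes in, str out
untouched = force_unicode('DT_NEEDED')  # str in, same str out
encoded = force_bytes('.rodata')        # str in, bytes out
print(decoded, untouched, encoded)
```

Because latin-1 maps every byte value to a code point, decoding cannot
fail on arbitrary section data.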

If elffile.get_section_by_name returns None with a unicode section name,
try with the same one encoded as bytes.
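
The fallback can be sketched without pyelftools as follows. The
`OldStyleElf` class is a hypothetical stand-in for an ELFFile object
from an old pyelftools that keys sections by bytes, and `find_section`
is an illustrative name, not a function from the script.

```python
class OldStyleElf(object):
    """Hypothetical stand-in: section names stored as bytes."""

    def __init__(self):
        self._sections = {b".rodata": "rodata-section"}

    def get_section_by_name(self, name):
        # Returns None when the name is not found, as the commit
        # message describes for elffile.get_section_by_name.
        return self._sections.get(name)


def find_section(elffile, name):
    # First try the unicode name, which recent pyelftools expects.
    section = elffile.get_section_by_name(name)
    if section is None:
        # No match: retry with the same name encoded as bytes.
        section = elffile.get_section_by_name(name.encode("latin-1"))
    return section


found = find_section(OldStyleElf(), ".rodata")
missing = find_section(OldStyleElf(), ".nope")
print(found, missing)
```

Probing the behaviour this way avoids relying on elftools.__version__,
which the message above notes is unreliable across distros.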

Also, replace all open() calls with io.open(), which behaves like the
builtin open in python 3. The only non-binary opened file is
/usr/share/hwdata/pci.ids which is UTF-8 encoded text. Explicitly
specify that encoding.
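
A small sketch of the io.open() usage described here, run against a
temporary file rather than the real pci.ids. The sample line is made up
and only serves to include a non-ASCII character.

```python
import io
import os
import tempfile

# Made-up sample text with one non-ASCII character.
sample = u"sample vendor name with an accent: caf\u00e9\n"

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Text mode with an explicit encoding gives unicode strings on
    # both python 2 and python 3.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    with io.open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    # Binary mode returns raw bytes and takes no encoding argument.
    with io.open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print(lines)
print(type(raw))
```

Naming the encoding explicitly keeps the result independent of the
locale's default encoding.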

Link: https://github.com/eliben/pyelftools/blob/v0.24/CHANGES#L7
Link: https://github.com/eliben/pyelftools/commit/108eaea9e75a8b5a
Fixes: 54ca545dce4b ("make python scripts python2/3 compliant")
Cc: John McNamara <john.mcnamara@intel.com>
Cc: stable@dpdk.org
Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
---
Change log:

v2

  * Behavior is different depending on pyelftools versions. Handle that
    properly.
  * More bytes/unicode fixes.

v3

  * Add missing Fixes line
  * Add Cc: stable@dpdk.org

 usertools/dpdk-pmdinfo.py | 65 +++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 19 deletions(-)
  

Comments

Thomas Monjalon Oct. 27, 2019, 8:32 p.m. UTC | #1
15/10/2019 14:39, Robin Jarry:
> [...]

I am really scared by all of this,
but I trust you to fix it properly.

Applied, thanks
  

Patch

diff --git a/usertools/dpdk-pmdinfo.py b/usertools/dpdk-pmdinfo.py
index 03623d5b8b48..069a3bf124b2 100755
--- a/usertools/dpdk-pmdinfo.py
+++ b/usertools/dpdk-pmdinfo.py
@@ -8,13 +8,15 @@ 
 #
 # -------------------------------------------------------------------------
 from __future__ import print_function
+from __future__ import unicode_literals
 import json
+import io
 import os
 import platform
 import string
 import sys
 from elftools.common.exceptions import ELFError
-from elftools.common.py3compat import (byte2int, bytes2str, str2bytes)
+from elftools.common.py3compat import byte2int
 from elftools.elf.elffile import ELFFile
 from optparse import OptionParser
 
@@ -213,7 +215,8 @@  def readLocal(self, filename):
         """
         Reads the local file
         """
-        self.contents = open(filename).readlines()
+        with io.open(filename, 'r', encoding='utf-8') as f:
+            self.contents = f.readlines()
         self.date = self.findDate(self.contents)
 
     def loadLocal(self):
@@ -267,7 +270,13 @@  def _section_from_spec(self, spec):
                 return None
         except ValueError:
             # Not a number. Must be a name then
-            return self.elffile.get_section_by_name(str2bytes(spec))
+            section = self.elffile.get_section_by_name(force_unicode(spec))
+            if section is None:
+                # No match with a unicode name.
+                # Some versions of pyelftools (<= 0.23) store internal strings
+                # as bytes. Try again with the name encoded as bytes.
+                section = self.elffile.get_section_by_name(force_bytes(spec))
+            return section
 
     def pretty_print_pmdinfo(self, pmdinfo):
         global pcidb
@@ -339,7 +348,8 @@  def display_pmd_info_strings(self, section_spec):
             while endptr < len(data) and byte2int(data[endptr]) != 0:
                 endptr += 1
 
-            mystring = bytes2str(data[dataptr:endptr])
+            # pyelftools may return byte-strings, force decode them
+            mystring = force_unicode(data[dataptr:endptr])
             rc = mystring.find("PMD_INFO_STRING")
             if (rc != -1):
                 self.parse_pmd_info_string(mystring)
@@ -348,9 +358,10 @@  def display_pmd_info_strings(self, section_spec):
 
     def find_librte_eal(self, section):
         for tag in section.iter_tags():
-            if tag.entry.d_tag == 'DT_NEEDED':
-                if "librte_eal" in tag.needed:
-                    return tag.needed
+            # pyelftools may return byte-strings, force decode them
+            if force_unicode(tag.entry.d_tag) == 'DT_NEEDED':
+                if "librte_eal" in force_unicode(tag.needed):
+                    return force_unicode(tag.needed)
         return None
 
     def search_for_autoload_path(self):
@@ -373,7 +384,7 @@  def search_for_autoload_path(self):
                     return (None, None)
                 if raw_output is False:
                     print("Scanning for autoload path in %s" % library)
-                scanfile = open(library, 'rb')
+                scanfile = io.open(library, 'rb')
                 scanelf = ReadElf(scanfile, sys.stdout)
         except AttributeError:
             # Not a dynamic binary
@@ -403,7 +414,8 @@  def search_for_autoload_path(self):
             while endptr < len(data) and byte2int(data[endptr]) != 0:
                 endptr += 1
 
-            mystring = bytes2str(data[dataptr:endptr])
+            # pyelftools may return byte-strings, force decode them
+            mystring = force_unicode(data[dataptr:endptr])
             rc = mystring.find("DPDK_PLUGIN_PATH")
             if (rc != -1):
                 rc = mystring.find("=")
@@ -416,8 +428,9 @@  def search_for_autoload_path(self):
 
     def get_dt_runpath(self, dynsec):
         for tag in dynsec.iter_tags():
-            if tag.entry.d_tag == 'DT_RUNPATH':
-                return tag.runpath
+            # pyelftools may return byte-strings, force decode them
+            if force_unicode(tag.entry.d_tag) == 'DT_RUNPATH':
+                return force_unicode(tag.runpath)
         return ""
 
     def process_dt_needed_entries(self):
@@ -438,16 +451,16 @@  def process_dt_needed_entries(self):
             return
 
         for tag in dynsec.iter_tags():
-            if tag.entry.d_tag == 'DT_NEEDED':
-                rc = tag.needed.find(b"librte_pmd")
-                if (rc != -1):
-                    library = search_file(tag.needed,
+            # pyelftools may return byte-strings, force decode them
+            if force_unicode(tag.entry.d_tag) == 'DT_NEEDED':
+                if 'librte_pmd' in force_unicode(tag.needed):
+                    library = search_file(force_unicode(tag.needed),
                                           runpath + ":" + ldlibpath +
                                           ":/usr/lib64:/lib64:/usr/lib:/lib")
                     if library is not None:
                         if raw_output is False:
                             print("Scanning %s for pmd information" % library)
-                        with open(library, 'rb') as file:
+                        with io.open(library, 'rb') as file:
                             try:
                                 libelf = ReadElf(file, sys.stdout)
                             except ELFError:
@@ -458,6 +471,20 @@  def process_dt_needed_entries(self):
                             file.close()
 
 
+# compat: remove force_unicode & force_bytes when pyelftools<=0.23 support is
+# dropped.
+def force_unicode(s):
+    if hasattr(s, 'decode') and callable(s.decode):
+        s = s.decode('latin-1')  # same encoding used in pyelftools py3compat
+    return s
+
+
+def force_bytes(s):
+    if hasattr(s, 'encode') and callable(s.encode):
+        s = s.encode('latin-1')  # same encoding used in pyelftools py3compat
+    return s
+
+
 def scan_autoload_path(autoload_path):
     global raw_output
 
@@ -476,7 +503,7 @@  def scan_autoload_path(autoload_path):
             scan_autoload_path(dpath)
         if os.path.isfile(dpath):
             try:
-                file = open(dpath, 'rb')
+                file = io.open(dpath, 'rb')
                 readelf = ReadElf(file, sys.stdout)
             except ELFError:
                 # this is likely not an elf file, skip it
@@ -503,7 +530,7 @@  def scan_for_autoload_pmds(dpdk_path):
             print("Must specify a file name")
         return
 
-    file = open(dpdk_path, 'rb')
+    file = io.open(dpdk_path, 'rb')
     try:
         readelf = ReadElf(file, sys.stdout)
     except ElfError:
@@ -595,7 +622,7 @@  def main(stream=None):
         print("File not found")
         sys.exit(1)
 
-    with open(myelffile, 'rb') as file:
+    with io.open(myelffile, 'rb') as file:
         try:
             readelf = ReadElf(file, sys.stdout)
             readelf.process_dt_needed_entries()