Documentation/sphinx: fix kernel-doc decode for non-utf-8 locale
authorJani Nikula <jani.nikula@intel.com>
Thu, 31 Aug 2017 19:21:29 +0000 (22:21 +0300)
committerJonathan Corbet <corbet@lwn.net>
Thu, 31 Aug 2017 19:36:28 +0000 (13:36 -0600)
On python3, Popen() universal_newlines=True converts the subprocess
stdout to unicode text using a codec based on user preferences. Given
LANG indicating ascii and utf-8 stdout from the subprocess, you'd get:

WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno
../drivers/media/dvb-core/demux.h' processing failed with: 'ascii' codec can't
decode byte 0xe2 in position 6368: ordinal not in range(128)

Fix this by dropping universal_newlines=True and replacing the implicit
LANG specific decode with an explicit utf-8 decode. This also gets rid
of the annoying conditional code for python 2 vs. 3.

Fixes: ba3501859354 ("Documentation/sphinx: fix kernel-doc extension on python3")
Reference: http://mid.mail-archive.com/54c23e8e-89c0-5cea-0dcc-e938952c5642@infradead.org
Reported-and-tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Documentation/sphinx/kerneldoc.py

index d15e07f..39aa9e8 100644 (file)
@@ -27,6 +27,7 @@
 # Please make sure this works on both python2 and python3.
 #
 
+import codecs
 import os
 import subprocess
 import sys
@@ -88,13 +89,10 @@ class KernelDocDirective(Directive):
         try:
             env.app.verbose('calling kernel-doc \'%s\'' % (" ".join(cmd)))
 
-            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
+            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
             out, err = p.communicate()
 
-            # python2 needs conversion to unicode.
-            # python3 with universal_newlines=True returns strings.
-            if sys.version_info.major < 3:
-                out, err = unicode(out, 'utf-8'), unicode(err, 'utf-8')
+            out, err = codecs.decode(out, 'utf-8'), codecs.decode(err, 'utf-8')
 
             if p.returncode != 0:
                 sys.stderr.write(err)