kunit: tool: continue past invalid utf-8 output
author Daniel Latypov <dlatypov@google.com>
Wed, 20 Oct 2021 23:21:21 +0000 (16:21 -0700)
committer Shuah Khan <skhan@linuxfoundation.org>
Mon, 25 Oct 2021 19:06:45 +0000 (13:06 -0600)
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.

This can come from memory corruption or just inadvertently printing
out binary data as strings.

E.g. adding this line into a kunit test
  pr_info("\x80")
will cause this exception
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position
  1961: invalid start byte
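
The same failure can be reproduced in plain Python without a kernel. This is
only a minimal sketch: the byte string is a made-up stand-in for kernel
output, not real kunit output, and `raw` and `error_message` are hypothetical
names used for illustration.

```python
# Hypothetical stand-in for kernel output containing one stray 0x80 byte.
raw = b"ok 1 - example\n\x80\n"

try:
    # Strict decoding (the default error handler) rejects the lone 0x80 byte.
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as e:
    error_message = str(e)

print(error_message)
```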

We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers

Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.

Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
  * as long as the TAP lines themselves are valid, we're fine.
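
As a rough illustration of the behaviour described above, the sketch below
applies `errors='backslashreplace'` at three separate call sites: a direct
decode, `open()`, and a subprocess pipe. The sample bytes, the temporary file,
and the child command are assumptions made up for this example, and
`encoding='utf-8'` is passed explicitly only so the sketch behaves the same
under any locale; the patch itself does not set it.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical output: two valid TAP-style lines around one invalid byte.
raw = b"ok 1 - example\n\x80\nok 2 - other\n"

# 1. Direct decode: the bad byte becomes the four characters \x80.
decoded = raw.decode("utf-8", errors="backslashreplace")

# 2. Reading a file: the handler has to be repeated in open().
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name
try:
    with open(path, "r", encoding="utf-8", errors="backslashreplace") as f:
        file_lines = f.read().splitlines()
finally:
    os.unlink(path)

# 3. Reading a pipe: the handler has to be repeated for subprocess as well.
child_code = "import sys; sys.stdout.buffer.write(b'ok 1\\n\\x80\\n')"
proc = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
    encoding="utf-8",
    errors="backslashreplace",
)

print(decoded)
print(file_lines)
print(proc.stdout)
```

In each case the valid lines come through unchanged, so a parser that only
looks at the TAP lines can keep going.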

Fixed spelling/grammar in commit log:
Shuah Khan <skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
tools/testing/kunit/kunit.py
tools/testing/kunit/kunit_kernel.py

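diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index e1dd318..68e6f46 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py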
@@ -477,9 +477,10 @@ def main(argv, linux=None):
                        sys.exit(1)
        elif cli_args.subcommand == 'parse':
                if cli_args.file == None:
+                       sys.stdin.reconfigure(errors='backslashreplace')  # pytype: disable=attribute-error
                        kunit_output = sys.stdin
                else:
-                       with open(cli_args.file, 'r') as f:
+                       with open(cli_args.file, 'r', errors='backslashreplace') as f:
                                kunit_output = f.read().splitlines()
                request = KunitParseRequest(cli_args.raw_output,
                                            None,
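diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index faa6320..f08c6c3 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py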
@@ -135,7 +135,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
                                           stdin=subprocess.PIPE,
                                           stdout=subprocess.PIPE,
                                           stderr=subprocess.STDOUT,
-                                          text=True, shell=True)
+                                          text=True, shell=True, errors='backslashreplace')
 
 class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
        """An abstraction over command line operations performed on a source tree."""
@@ -172,7 +172,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
                                           stdin=subprocess.PIPE,
                                           stdout=subprocess.PIPE,
                                           stderr=subprocess.STDOUT,
-                                          text=True)
+                                          text=True, errors='backslashreplace')
 
 def get_kconfig_path(build_dir) -> str:
        return get_file_path(build_dir, KCONFIG_PATH)