Improve handling of static assert messages.
author Corentin Jabot <corentinjabot@gmail.com>
Fri, 20 Aug 2021 15:52:28 +0000 (17:52 +0200)
committer Corentin Jabot <corentinjabot@gmail.com>
Wed, 29 Jun 2022 12:57:35 +0000 (14:57 +0200)
commit 64ab2b1dcc5136a744fcac21d3d2c59e9cce040a
tree 9f02889306fcb41526e4cf66346d2188eb35fd87
parent 62e907e9f9e3c1762e6ce1f24f98f4b0ddc82353

Instead of dumping the string literal (which
quotes it and escapes every non-ASCII character),
we can use the content of the string when it is an
ordinary string (one with 1-byte characters).

Wide and UTF-8/UTF-16/UTF-32 strings are still completely
escaped until we clarify how these entities should
behave (cf. https://wg21.link/p2361).
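
To make the distinction between literal kinds concrete, here is a
minimal sketch of static assertions that use each kind of message
literal. Every condition holds, so the snippet compiles and no
diagnostic is produced; if a condition were false, only the first
form would have its content printed as-is under this change, while
the prefixed forms would still be escaped. The constant
`kLiteralKindsCompiled` is a hypothetical name used only for
illustration.

```cpp
#include <cassert>

// Ordinary (narrow) literal: on failure, its content is used directly,
// so the non-ASCII character and the newline would be rendered.
static_assert(sizeof(char) == 1, "ordinary literal: héllo,\nworld");

// Wide and UTF-8/UTF-16/UTF-32 literals: still completely escaped on
// failure, pending clarification (cf. https://wg21.link/p2361).
static_assert(sizeof(char) == 1, L"wide literal");
static_assert(sizeof(char) == 1, u8"UTF-8 literal");
static_assert(sizeof(char) == 1, u"UTF-16 literal");
static_assert(sizeof(char) == 1, U"UTF-32 literal");

// Hypothetical marker: it exists only if the declarations above compiled.
constexpr bool kLiteralKindsCompiled = true;
```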

`FormatDiagnostic` is modified to escape
non-printable characters and invalid UTF-8.
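
The escaping step can be sketched as a standalone function. This is
a simplified illustration, not the code in `Diagnostic.cpp`: the
names `escapeForDiagnostic`, `validUtf8Length`, and
`isPrintableSketch` are hypothetical, the `<U+XXXX>` and `<XX>`
output forms are assumptions chosen for the example, and the
printability test only rejects ASCII and C1 control characters
instead of consulting the Unicode tables.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: decodes one UTF-8 sequence starting at s[i].
// Returns its length in bytes, or 0 if the sequence is invalid.
static std::size_t validUtf8Length(const std::string &s, std::size_t i,
                                   std::uint32_t &cp) {
  auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
  const unsigned char b0 = byte(i);
  std::size_t len = 0;
  std::uint32_t minCp = 0;
  if (b0 < 0x80) { cp = b0; return 1; }
  if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; minCp = 0x80; }
  else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; minCp = 0x800; }
  else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; minCp = 0x10000; }
  else return 0;                       // stray continuation or invalid lead byte
  if (i + len > s.size()) return 0;    // truncated sequence
  for (std::size_t k = 1; k < len; ++k) {
    const unsigned char b = byte(i + k);
    if ((b & 0xC0) != 0x80) return 0;  // not a continuation byte
    cp = (cp << 6) | (b & 0x3F);
  }
  // Reject overlong encodings, surrogates, and out-of-range values.
  if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
  return len;
}

// Simplified assumption: printable ASCII, ASCII whitespace, and every
// non-ASCII code point outside the C1 control range count as printable.
static bool isPrintableSketch(std::uint32_t cp) {
  if (cp < 0x80)
    return (cp >= 0x20 && cp <= 0x7E) || cp == '\t' || cp == '\n' ||
           cp == '\v' || cp == '\f' || cp == '\r';
  return !(cp >= 0x80 && cp <= 0x9F);
}

// Copies printable text through unchanged, writes non-printable code
// points as <U+XXXX>, and writes each invalid byte as <XX>.
std::string escapeForDiagnostic(const std::string &s) {
  std::string out;
  char buf[16];
  for (std::size_t i = 0; i < s.size();) {
    std::uint32_t cp = 0;
    const std::size_t len = validUtf8Length(s, i, cp);
    if (len == 0) {
      std::snprintf(buf, sizeof buf, "<%02X>",
                    static_cast<unsigned>(static_cast<unsigned char>(s[i])));
      out += buf;
      ++i;
      continue;
    }
    if (isPrintableSketch(cp)) {
      out.append(s, i, len);
    } else {
      std::snprintf(buf, sizeof buf, "<U+%04X>", static_cast<unsigned>(cp));
      out += buf;
    }
    i += len;
  }
  return out;
}
```

With this sketch, a message containing a newline or a valid
non-ASCII character passes through unchanged, while a control
character or a stray byte such as 0xFF is replaced by a visible
placeholder.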

This ensures that Unicode characters, spaces, and
newlines are properly rendered in static assert messages.
This makes Clang more consistent with other implementations
and fixes this tweet
https://twitter.com/jfbastien/status/1298307325443231744 :)

Of note, `PaddingChecker` printed newlines that were
later removed by the diagnostic printing code.
To stay consistent with its tests, the newlines are removed
from the diagnostic.

The Unicode tables are updated to use both the Unicode
definitions and the Unicode 14.0 data.

U+00AD SOFT HYPHEN is still considered a printable character
to match existing practice in terminals, in addition to
being considered a formatting character as per Unicode.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D108469
clang/docs/ReleaseNotes.rst
clang/lib/Basic/Diagnostic.cpp
clang/lib/Sema/SemaDeclCXX.cpp
clang/lib/StaticAnalyzer/Checkers/PaddingChecker.cpp
clang/test/Lexer/null-character-in-literal.c
clang/test/Misc/diag-special-chars.c
clang/test/Misc/wrong-encoding.c
clang/test/SemaCXX/static-assert.cpp
llvm/include/llvm/Support/Unicode.h
llvm/lib/Support/Unicode.cpp