[sanitizer] Intercept glibc 2.38 __isoc23_* functions
authorFangrui Song <i@maskray.me>
Mon, 28 Aug 2023 07:49:49 +0000 (00:49 -0700)
committerTobias Hieta <tobias@hieta.se>
Wed, 30 Aug 2023 15:00:18 +0000 (17:00 +0200)
commitdd230efe703f34678ce52280e50238abf908aaa1
tree374c6d896e274ceaefe38732b097ddabe6895597
parentcf16374055c1eb3367ada29e4e544f84bc588413
[sanitizer] Intercept glibc 2.38 __isoc23_* functions

`strtol("0b1", 0, 0)` can be (pre-C23) 0 or (C23) 1.
`sscanf("0b10", "%i", &x)` is similar. glibc 2.38 introduced
`__isoc23_strtol` and `__isoc23_scanf` family functions for binary
compatibility.

When `_ISOC2X_SOURCE` is defined (implied by `_GNU_SOURCE`) or
`__STDC_VERSION__ > 201710L`, `__GLIBC_USE_ISOC2X` is defined to 1 and
these `__isoc23_*` symbols are used.

Add `__isoc23_` versions for the following interceptors:

* sanitizer_common_interceptors.inc implements strtoimax/strtoumax.
  Remove incorrect FIXME about https://github.com/google/sanitizers/issues/321
* asan_interceptors.cpp implements just strtol and strtoll. The default
  `replace_str` mode checks `nptr` is readable and `endptr` is writable.
  atoi reuses the existing strtol interceptor.
* msan_interceptors.cpp implements strtol family functions and their
  `_l` versions. Tested by lib/msan/tests/msan_test.cpp
* sanitizer_common_interceptors.inc implements scanf family functions.

The strtol family functions are spreaded, which is not great, but the
patch (intended for release/17.x) does not attempt to address the issue.

Add symbols to lib/sanitizer_common/symbolizer/scripts/global_symbols.txt to
support both glibc pre-2.38 and 2.38.

When build bots migrate to glibc 2.38+, we will lose test coverage for
non-isoc23 versions since the existing C++ unittests imply `_GNU_SOURCE`.
Add test/sanitizer_common/TestCases/{strtol.c,scanf.c}.
They catch msan false positive in the absence of the interceptors.

Fix https://github.com/llvm/llvm-project/issues/64388
Fix https://github.com/llvm/llvm-project/issues/64946

Link: https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html
("The GNU C Library version 2.38 is now available")

Reviewed By: #sanitizers, vitalybuka, mgorny

Differential Revision: https://reviews.llvm.org/D158943

(cherry picked from commit ad7e2501000da2494860f06a306dfe8c08cc07c3)
compiler-rt/lib/asan/asan_interceptors.cpp
compiler-rt/lib/msan/msan_interceptors.cpp
compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
compiler-rt/lib/sanitizer_common/symbolizer/scripts/global_symbols.txt
compiler-rt/test/sanitizer_common/TestCases/scanf.c [new file with mode: 0644]
compiler-rt/test/sanitizer_common/TestCases/strtol.c [new file with mode: 0644]