Arnamoy Bhattacharyya [Mon, 5 Apr 2021 16:58:00 +0000 (12:58 -0400)]
[flang][driver] Modify the existing test cases that use -Mstandard in f18, to use -pedantic and %flang_fc1 to share with the new driver
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D99518
Ta-Wei Tu [Mon, 5 Apr 2021 17:08:35 +0000 (01:08 +0800)]
[LoopFusion] Bails out if only the second candidate is guarded (PR48060)
If only the second candidate loop is guarded while the first one is not, fusing the
two loops might not be valid, but this check is currently missing.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48060
Reviewed By: sidbav
Differential Revision: https://reviews.llvm.org/D99716
Charusso [Mon, 5 Apr 2021 17:04:30 +0000 (19:04 +0200)]
[analyzer] DynamicSize: Store the dynamic size
This patch introduces a way to store the size.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69726
Arnamoy Bhattacharyya [Mon, 5 Apr 2021 16:41:46 +0000 (12:41 -0400)]
[flang][driver] Add options for -Werror
With this option, warnings are treated as errors.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D98657
Fraser Cormack [Wed, 31 Mar 2021 16:01:16 +0000 (17:01 +0100)]
[RISCV] Add support for bitcasts between scalars and fixed-length vectors
This patch supports bitcasts from scalar types to fixed-length vectors
and vice versa. It custom-lowers and custom-legalizes them to
EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT operations, using single-element
vectors to hold the scalar where appropriate.
Previously, some of these would fail to select, others would be expanded
through stack loads and stores. Effort was made to ensure the codegen
avoids the stack for both legal and illegal scalar types.
Some of the codegen could be improved, but at first glance that looks like
a general optimization of EXTRACT_VECTOR_ELT when extracting an i64
element on RV32.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99667
Sanjay Patel [Mon, 5 Apr 2021 16:14:49 +0000 (12:14 -0400)]
[InstCombine] fix potential miscompile in select value equivalence
As shown in the example based on:
https://llvm.org/PR49832
...and the existing test, we can't substitute
a vector value because the equality compare
replacement that we are attempting requires
that the comparison is true for the entire
value. Vector select can be partly true/false.
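A minimal sketch of why the substitution is unsafe for vectors, modelling vectors as Python lists. The helper names and the lane-reversing operation in the true arm are illustrative assumptions and are not taken from the PR49832 reproducer.

```python
def vec_select(cond, true_vals, false_vals):
    """Lane-wise select: each lane picks from true_vals or false_vals."""
    return [t if c else f for c, t, f in zip(cond, true_vals, false_vals)]


def reverse_lanes(v):
    """Stand-in for any operation that moves data across lanes."""
    return v[::-1]


x = [1, 5]
c = [1, 2]
y = [0, 0]

# The compare is lane-wise, so it can be true in one lane and false in another.
cond = [a == b for a, b in zip(x, c)]

original = vec_select(cond, reverse_lanes(x), y)     # true arm uses X
substituted = vec_select(cond, reverse_lanes(c), y)  # X replaced by C

print(original, substituted)
```

Lane 0 of the condition is true, but the true arm reads lane 1 of its operand, where X and C differ, so the two results disagree.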
Sanjay Patel [Mon, 5 Apr 2021 16:03:50 +0000 (12:03 -0400)]
[InstCombine] add test for miscompile from select value equivalence; NFC
The new test is reduced from:
https://llvm.org/PR49832
...but we already show a potential miscompile in the existing test too.
John Paul Adrian Glaubitz [Mon, 5 Apr 2021 16:22:59 +0000 (09:22 -0700)]
[M68k] Mark public functions with the LLVM_EXTERNAL_VISIBILITY macro
In 0dbcb3639451, most target symbols were made hidden by default
with the public ones marked with LLVM_EXTERNAL_VISIBILITY. When the
M68k target was added, this particular change was forgotten so that
external tools cannot make use of the public M68k target functions
in libLLVM.so. Thus, add the missing LLVM_EXTERNAL_VISIBILITY macro
to all public target functions in the M68k backend.
Differential Revision: https://reviews.llvm.org/D99869
Fraser Cormack [Wed, 31 Mar 2021 11:51:03 +0000 (12:51 +0100)]
[RISCV] Expand scalable-vector truncstores and extloads
Caught in internal testing, these operations are assumed legal by
default, even for scalable vector types. Expand them back into separate
truncations and stores, or loads and extensions.
Also add explicit fixed-length vector tests for these operations, even
though they should have been correct already.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99654
Erik Pilkington [Mon, 5 Apr 2021 13:05:56 +0000 (09:05 -0400)]
[SemaObjC] Fix a -Wbridge-cast false-positive
Clang used to emit a bad -Wbridge-cast diagnostic on the cast in the attached
test. This was because, after 09abecef7, struct __CFString was not added to
lookup, so the objc_bridge attribute wasn't getting duplicated onto the most
recent declaration, causing us to fail to find it in getObjCBridgeAttr. This
patch fixes this by instead walking through the redeclarations to find an
appropriate bridge attribute. rdar://72823399
Differential revision: https://reviews.llvm.org/D99661
Stefan Pintilie [Mon, 5 Apr 2021 13:07:16 +0000 (08:07 -0500)]
[PowerPC] Fix issue where binary uses a .got but is missing a .TOC.
From the PowerPC ELFv2 ABI section 4.2.3. Global Offset Table.
```
The GOT consists of an 8-byte header that contains the TOC base (the first TOC
base when multiple TOCs are present), followed by an array of 8-byte addresses.
```
Due to the introduction of PC Relative code it is now possible to require a GOT
without having a .TOC. symbol in the object that is being linked. Since LLD uses
the .TOC. symbol to determine whether or not a GOT is required, the GOT header is
not set up correctly and the 8-byte header is missing.
This patch allows the PowerPC GOT setup to happen when an element is added to
the GOT instead of at the very beginning. When this header is added, a .TOC.
symbol is also added.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D91426
Peyton, Jonathan L [Thu, 25 Feb 2021 18:49:12 +0000 (12:49 -0600)]
[OpenMP] Fix incorrect KMP_STRLEN() macro
The second argument to the strnlen_s(str, size) function should be
sizeof(str) when str is a true array of characters with known size
(instead of just a char*). Use type traits to determine if first
parameter is a character array and use the correct size based on that
trait.
Differential Revision: https://reviews.llvm.org/D98209
Alexey Bataev [Thu, 1 Apr 2021 16:16:31 +0000 (09:16 -0700)]
[SLP]Improve vectorization of the CmpInst instructions.
During vectorization, it is better to postpone the vectorization of the CmpInst
instructions till the end of the basic block. Otherwise we may vectorize
it too early and may miss some vectorization patterns, like reductions.
Reworked part of D57059
Differential Revision: https://reviews.llvm.org/D99796
Paul C. Anagnostopoulos [Fri, 2 Apr 2021 16:35:24 +0000 (12:35 -0400)]
[TableGen] [docs] Correct a couple of mistakes; use 'true' and 'false' in examples
Differential Revision: https://reviews.llvm.org/D99800
Alex Orlov [Mon, 5 Apr 2021 11:40:41 +0000 (15:40 +0400)]
* NFC. Refactored DIPrinter for better support of new print styles.
This patch introduces a DIPrinter interface to implement by different output style printer implementations. DIPrinterGNU and DIPrinterLLVM implement the GNU and LLVM output style printing respectively. No functional changes.
This refactoring clarifies and simplifies the code, and makes a new output style addition easier.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D98994
Fraser Cormack [Wed, 20 Jan 2021 07:49:53 +0000 (07:49 +0000)]
[RISCV] Add a test showing incorrect codegen
This patch adds a test which shows how the compiler incorrectly sets the
size and alignment of a stack object used to indirectly pass vector
types to functions.
In the particular example, the test passes a <4 x i8> vector type to a
function and creates a stack object of size and alignment equal to 4
bytes. However, the code generated to set up that parameter has been
scalarized and stores each element as individual XLEN-sized values. Thus
on RV32 this stores 16 bytes and on RV64 32 bytes, both of which clobber
the stack. Similarly, the alignment is set up as the alignment
of the vector type, which is not necessarily the natural alignment of XLEN.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D95025
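The byte counts quoted in the RISCV stack-object message above follow from simple arithmetic. A small sketch, with a hypothetical helper name, assuming each of the four elements is stored as one XLEN-sized value as the message describes:

```python
def scalarized_store_bytes(num_elements, xlen_bits):
    """Bytes written when each vector element is stored as one XLEN-sized value."""
    return num_elements * (xlen_bits // 8)


STACK_OBJECT_BYTES = 4  # size of the object created for a <4 x i8> vector

for xlen in (32, 64):
    written = scalarized_store_bytes(4, xlen)
    overrun = written - STACK_OBJECT_BYTES
    print(f"RV{xlen}: {written} bytes stored, {overrun} bytes past the object")
```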
Simon Pilgrim [Mon, 5 Apr 2021 10:40:29 +0000 (11:40 +0100)]
[X86] Fold xor(zext(xor(x,c1)),c2) -> xor(zext(x),xor(zext(c1),c2))
Fixes PR47603 (second case) by extending rG89afec348dbd3e5078f176e978971ee2d3b5dec8
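The fold in the title can be sanity-checked outside the compiler. The sketch below is only an arithmetic check, not the X86 backend code; it assumes an 8-bit inner value and a handful of arbitrarily chosen wider constants for c2.

```python
import itertools


def zext(value, from_bits):
    """Zero-extend: keep the low from_bits bits, upper bits become zero."""
    return value & ((1 << from_bits) - 1)


def before_fold(x, c1, c2):
    return zext(x ^ c1, 8) ^ c2


def after_fold(x, c1, c2):
    return zext(x, 8) ^ (zext(c1, 8) ^ c2)


def zext_fold_holds():
    wide_constants = (0, 31, 0x1234, 0xFFFFFFFF)
    return all(
        before_fold(x, c1, c2) == after_fold(x, c1, c2)
        for x, c1, c2 in itertools.product(range(256), range(256), wide_constants)
    )
```

The identity holds because zero-extension distributes over xor, after which the two constants can be combined.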
Simon Pilgrim [Mon, 5 Apr 2021 10:16:03 +0000 (11:16 +0100)]
[X86] Add second PR47603 test case
We had coverage for the xor(trunc(xor(x,31)),31) case but not xor(zext(xor(x,31)),31)
Thomas Preud'homme [Sat, 3 Apr 2021 07:52:39 +0000 (08:52 +0100)]
[DebugInfo, CallSites, test] Fix use of undef FileCheck var
Clang test CodeGen/debug-info-extern-call.c tries to check for the
absence of a sequence of instructions with several CHECK-NOT directives, one of
which uses a variable defined in another. However, CHECK-NOT directives
are checked independently, so that directive uses a variable defined in a
pattern that should not occur in the input.
This commit removes the CHECK-NOT for the retained line attribute
definition since the CHECK-NOT on the compile unit will already check
that there are no retained lines.
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D99830
Josh Berdine [Sun, 28 Mar 2021 22:18:01 +0000 (23:18 +0100)]
[NFC][OCaml] Reformat to clean up following CAMLprim removal
The removal of CAMLprim left the code in need of an application of
clang-format. There are various other changes made by clang-format
which it seems ought to be rolled together into this diff.
Differential Revision: https://reviews.llvm.org/D99477
Josh Berdine [Sun, 28 Mar 2021 21:52:55 +0000 (22:52 +0100)]
[NFC][OCaml] Remove vestigial CAMLprim declarations
The CAMLprim macro has not been needed since OCaml 3.11, and is
defined to the empty string. This diff removes all instances of it.
Differential Revision: https://reviews.llvm.org/D99476
Josh Berdine [Sun, 28 Mar 2021 20:54:25 +0000 (21:54 +0100)]
[OCaml] Omit unnecessary GC root registrations
The current code does not follow the simple interface to the OCaml GC,
where GC roots are registered conservatively, only initializing
allocations are performed, etc. This is intentional, as stated in the
opening file comments. On the other hand, the current code does
register GC roots in many situations where it is not strictly
necessary. This diff omits many of them.
Differential Revision: https://reviews.llvm.org/D99475
Josh Berdine [Sat, 27 Mar 2021 23:00:47 +0000 (23:00 +0000)]
[OCaml] Code simplification using string allocation functions
Using the `cstr_to_string` function that allocates and initializes an
OCaml `string` value enables simplifications in several cases. This
change also has the effect of avoiding calling `memcpy` on NULL
pointers even if only 0 bytes are to be copied.
Differential Revision: https://reviews.llvm.org/D99474
Josh Berdine [Sat, 27 Mar 2021 22:53:35 +0000 (22:53 +0000)]
[OCaml] Code simplification using option allocation functions
Using the `caml_alloc_some` and `ptr_to_option` functions that
allocate OCaml `option` values enables simplifications in many
cases. These simplifications also result in avoiding unnecessary
double initialization in many cases, so yield a minor optimization as
well.
Also, change to avoid using the old unprefixed functions such as
`alloc_small` and instead use the current `caml_alloc_small`.
A few of the changed functions were slightly rewritten in the
early-return style.
Differential Revision: https://reviews.llvm.org/D99473
Josh Berdine [Sat, 27 Mar 2021 16:54:16 +0000 (16:54 +0000)]
[OCaml] Minor optimizations by avoiding double initialization
In several functions an OCaml block is allocated and no further OCaml
allocation functions (or other functions that might trigger allocation
or collection) are performed before the block is fully initialized. In
these cases, it is safe and slightly more efficient to allocate an
uninitialized block.
Also, the code does not become more complex after the non-initializing
allocation, since in the case that a non-small allocation is made, the
initial values stored are definitely not pointers to OCaml young
blocks, and so initializing via direct assignment is still safe. That
is, in general if `caml_alloc_small` is called, initializing it with
direct assignments is safe, but if `caml_alloc_shr` is
called (e.g. for a block larger than `Max_young_wosize`), then
`caml_initialize` should be called to inform the GC of a potential
major to minor pointer. But if the initial value is definitely not a
young OCaml block, direct assignment is safe.
Differential Revision: https://reviews.llvm.org/D99472
Josh Berdine [Sat, 27 Mar 2021 15:16:25 +0000 (15:16 +0000)]
[OCaml] Fix unsafe uses of Store_field
Using `Store_field` to initialize fields of blocks allocated with
`caml_alloc_small` is unsafe. The fields of blocks allocated by
`caml_alloc_small` are not initialized, and `Store_field` calls the
OCaml GC write barrier. If the uninitialized value of a field happens
to point into the OCaml heap, then it will e.g. be added to a conflict
set or followed and have what the GC thinks are color bits
changed. This leads to crashes or memory corruption.
This diff fixes a few (I think all) instances of this problem. Some of
these are creating option values. OCaml 4.12 has a dedicated
`caml_alloc_some` function for this, so this diff adds a compatible
function with a version check to avoid conflict. With that, macros for
accessing option values are also added.
Differential Revision: https://reviews.llvm.org/D99471
Sylvestre Ledru [Mon, 5 Apr 2021 09:54:17 +0000 (11:54 +0200)]
ignore -flto= options recognized by GCC
As requested in https://bugs.llvm.org/show_bug.cgi?id=49553, submitting the proposed changes to just ignore the -flto= options which are recognized by GCC ("auto" and "jobserver").
GCC supports -flto=<auto|jobserver|N> to select the parallelism for LTO builds. LLVM also has -flto-jobs=<N>, which only seems to have a meaning when used with -flto=thin?
The attached patch just ignores the values "auto" and "jobserver", which doesn't change anything in functionality. Another option would be to map these values to either "thin" or "full", maybe in presence of the -ffat-lto-objects option?
-flto=<N> could also be translated to -flto-jobs=<N>.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D99501
Max Kazantsev [Mon, 5 Apr 2021 09:24:00 +0000 (16:24 +0700)]
[Test] Auto-update checks in a test
Max Kazantsev [Mon, 5 Apr 2021 07:51:29 +0000 (14:51 +0700)]
[Test] Split out new and old PM tests
This is to avoid overly complicated checks, as the old and new PM behave
differently with the fix patches.
Max Kazantsev [Mon, 5 Apr 2021 05:00:10 +0000 (12:00 +0700)]
[Test] Add tests for various scenarios of PRE of a loop load
Yaxun (Sam) Liu [Wed, 31 Mar 2021 21:23:11 +0000 (17:23 -0400)]
[CUDA][HIP] rename -fcuda-flush-denormals-to-zero
Rename it to -fgpu-flush-denormals-to-zero.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D99688
Dave Lee [Sun, 4 Apr 2021 21:52:26 +0000 (14:52 -0700)]
[lldb] Replace unneeded use of Foundation with ObjectiveC in tests (NFC)
When referencing `NSObject`, it's enough to import `objc/NSObject.h`. Importing `Foundation` is unnecessary in these cases.
Differential Revision: https://reviews.llvm.org/D99867
Dave Lee [Sun, 4 Apr 2021 15:28:10 +0000 (08:28 -0700)]
[lldb] Import ObjectiveC module instead of Foundation in test
Use `@import ObjectiveC` instead of `@import Foundation`, as the former is all
that's needed, and results in fewer clang modules being built.
This results in the following clang modules *not* being built for this test.
ApplicationServices
CFNetwork
ColorSync
CoreFoundation
CoreGraphics
CoreServices
CoreText
DiskArbitration
Dispatch
Foundation
IOKit
ImageIO
Security
XPC
_Builtin_intrinsics
launch
libkern
os_object
os_workgroup
Differential Revision: https://reviews.llvm.org/D99859
Craig Topper [Sun, 4 Apr 2021 22:40:41 +0000 (15:40 -0700)]
[RISCV] Use gorciw for i32 orc.b intrinsic when Zbp is enabled.
The W version of orc.b does not exist in Zbp so we need to use
gorci encoding. If we have Zbp, we can use gorciw which can avoid a
sext.w in some cases.
Fangrui Song [Sun, 4 Apr 2021 22:35:53 +0000 (15:35 -0700)]
[sanitizer] Simplify GetTls with dl_iterate_phdr on Linux
This was reverted by f176803ef1f4050a350e01868d64fe09a674d3bf due to
Ubuntu 16.04 x86-64 glibc 2.23 problems.
This commit additionally calls `__tls_get_addr({modid,0})` to work around the
dlpi_tls_data==NULL issues for glibc<2.25
(https://sourceware.org/bugzilla/show_bug.cgi?id=19826)
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded modules, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize` entirely. Use the simplified method with non-Android Linux for
now, but in theory this can be used with *BSD and potentially other ELF OSes.
This simplification enables D99566 for TLS Variant I architectures.
See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage
across various sanitizers.
Differential Revision: https://reviews.llvm.org/D98926
Arthur O'Dwyer [Sun, 4 Apr 2021 22:05:12 +0000 (18:05 -0400)]
[libc++] Fix test_macros.h in the same way as commit 49e5a896 fixed __config.
Since D99515, this header triggers -Wundef on Mac OSX older than 10.15.
This is now fixed.
Arthur O'Dwyer [Sun, 4 Apr 2021 21:39:50 +0000 (17:39 -0400)]
[libc++] Fix the header guard from _LIBCPP_STEAMBUF to _LIBCPP_STREAMBUF.
Roman Lebedev [Sun, 4 Apr 2021 20:25:29 +0000 (23:25 +0300)]
[InstCombine] dropRedundantMaskingOfLeftShiftInput(): check that adding shift amounts doesn't overflow (PR49778)
This is identical to 781d077afb0ed9771c513d064c40170c1ccd21c9,
but for the other function.
For certain shift amount bit widths, we must first ensure that adding
shift amounts is safe, that the sum won't have an unsigned overflow.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49778
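To illustrate the overflow concern, here is a minimal sketch; it is not LLVM's implementation. It assumes the two shift amounts are held in an integer type that may be narrower than the shifted value, and both function names and the chosen widths are hypothetical.

```python
def sum_of_shift_amounts_fits(value_bits, amount_bits):
    """True if the largest possible sum of two in-range shift amounts
    is representable as an unsigned amount_bits-wide integer."""
    max_total = 2 * (value_bits - 1)  # each amount is at most value_bits - 1
    max_representable = (1 << amount_bits) - 1
    return max_total <= max_representable


def wrapping_add(a, b, bits):
    """Unsigned addition as it behaves in a bits-wide integer."""
    return (a + b) & ((1 << bits) - 1)


print(sum_of_shift_amounts_fits(32, 6))  # 62 compared against 63
print(sum_of_shift_amounts_fits(32, 5))  # 62 compared against 31
print(wrapping_add(20, 20, 5))           # 40 does not fit in 5 bits
```

When the sum does not fit, the wrapped total looks like a small, valid shift amount, which is why the check has to happen before the amounts are added.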
Roman Lebedev [Sun, 4 Apr 2021 20:23:10 +0000 (23:23 +0300)]
[NFC][InstCombine] Extract canTryToConstantAddTwoShiftAmounts() as helper
Roman Lebedev [Sun, 4 Apr 2021 20:15:29 +0000 (23:15 +0300)]
[NFC][InstCombine] Add test for PR49778
Craig Topper [Sun, 4 Apr 2021 19:30:25 +0000 (12:30 -0700)]
[RISCV] Lower orc.b intrinsic to RISCVISD::GORCI.
This will allow us to share any future known bits, demanded bits,
or sign bits improvements.
Thomas Preud'homme [Sat, 3 Apr 2021 08:29:38 +0000 (09:29 +0100)]
[HIP, test] Fix use of undef FileCheck var
Clang test CodeGenCUDA/kernel-stub-name.cu uses a never-defined DKERN
variable in a CHECK-NOT directive. This commit replaces the variable with a
regex, thereby avoiding the issue.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D99832
Thomas Preud'homme [Sat, 3 Apr 2021 08:13:11 +0000 (09:13 +0100)]
[HIP-Clang, test] Fix use of undef FileCheck var
Commit 8129521318accc44c2a009647572f6ebd3fc56dd changed a line defining
PREFIX in clang test CodeGenCUDA/device-stub.cu into a CHECK-NOT
directive. All following lines using PREFIX are therefore using an
undefined variable since the pattern defining PREFIX is not supposed to
occur and CHECK-NOT are checked independently.
This commit replaces all uses of PREFIX by the regex used to define it,
thereby avoiding the problem.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D99831
Mark de Wever [Tue, 30 Mar 2021 18:19:12 +0000 (20:19 +0200)]
[libc++] Improve generate_feature_test_macro_components.py.
This improves the naming of the fields `depends`/`internal_depends`. It
also adds the documentation for this script. The changes are based on
D99290 and its review comments.
Differential Revision: https://reviews.llvm.org/D99615
Fangrui Song [Sun, 4 Apr 2021 17:15:12 +0000 (10:15 -0700)]
[Driver] Detect libstdc++ include paths for native gcc (-m32 and -m64) on Debian i386
Take gcc-8 on Debian i386 as an example. The target-specific libstdc++ search
path (`GPLUSPLUS_TOOL_INCLUDE_DIR`) uses the multiarch name `i386-linux-gnu`,
instead of the triple of the GCC installation `i686-linux-gnu` (the directory
under `usr/lib/gcc/`):
```
/usr/include/c++/8
/usr/include/i386-linux-gnu/c++/8
/usr/include/c++/8/backward
```
Clang currently detects `/usr/lib/gcc/i686-linux-gnu/8/../../../include/i686-linux-gnu/c++/8`.
This patch changes the second i686-linux-gnu to i386-linux-gnu so that
`/usr/include/i386-linux-gnu/c++/8` can be found.
Fix PR49827 - this was somehow regressed by my previous libstdc++ include path
cleanups and fixes for gcc-cross, but it seems that the paths were never properly tested before.
Differential Revision: https://reviews.llvm.org/D99852
Martin Storsjö [Wed, 24 Mar 2021 08:46:17 +0000 (10:46 +0200)]
[libcxx] [test] Link against msvcprt as C++ ABI library in tests
This matches what we link the library itself against (set in
CMakeLists.txt). When testing a static library version of libc++,
this is needed for essentially every test due to libc++ object files
requiring it.
Also with libc++ built as a DLL, some tests directly call functions that
are provided by msvcprt (such as std::set_new_handler), thus this fixes
a number of tests in that configuration too.
Differential Revision: https://reviews.llvm.org/D99263
Sanjay Patel [Sun, 4 Apr 2021 15:38:09 +0000 (11:38 -0400)]
[InstCombine] fold popcount of exactly one bit to shift
This is discussed in https://llvm.org/PR48999 ,
but it does not solve that request.
The difference in the vector test shows that some
other logic transform is limited to scalar types.
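As the title reads, when at most one fixed bit position of a value can be set, its population count is 0 or 1 and equals that bit shifted down to bit 0. A small exhaustive check of that identity over 8-bit values follows; producing such a value with `x & mask` is an assumption made for illustration.

```python
def popcount(value):
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def popcount_fold_is_valid(bit_index, width=8):
    """Check ctpop(x & mask) == (x & mask) >> bit_index for every x."""
    mask = 1 << bit_index
    return all(
        popcount(x & mask) == (x & mask) >> bit_index
        for x in range(1 << width)
    )
```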
Sanjay Patel [Sun, 4 Apr 2021 13:39:24 +0000 (09:39 -0400)]
[InstCombine] add tests for ctpop of power-of-2; NFC
PR48999
Nikita Popov [Sun, 4 Apr 2021 08:49:59 +0000 (10:49 +0200)]
[SimplifyCFG] Handle two equal cases in switch to select
When converting a switch with two cases and a default into a
select, also handle the degenerate case where two cases have the
same value.
Generate this case directly as
%or = or i1 %cmp1, %cmp2
%res = select i1 %or, i32 %val, i32 %default
rather than
%sel1 = select i1 %cmp1, i32 %val, i32 %default
%res = select i1 %cmp2, i32 %val, i32 %sel1
as InstCombine is going to canonicalize to the former anyway.
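The two IR forms above compute the same result. A quick check in Python, with function and parameter names chosen here for illustration:

```python
def nested_selects(x, case1, case2, val, default):
    """Mirror of the two chained selects."""
    sel1 = val if x == case1 else default
    return val if x == case2 else sel1


def or_select(x, case1, case2, val, default):
    """Mirror of the or followed by a single select."""
    return val if (x == case1 or x == case2) else default


def forms_agree(case1, case2, val, default, inputs=range(-8, 9)):
    return all(
        nested_selects(x, case1, case2, val, default)
        == or_select(x, case1, case2, val, default)
        for x in inputs
    )
```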
Nikita Popov [Sun, 4 Apr 2021 15:10:11 +0000 (17:10 +0200)]
[SimplifyCFG] Add switch-to-select test with two equal cases (NFC)
We handle the case where we have two cases and a default all having
different values, but not the case where two cases happen to have
the same one.
The PhaseOrdering test is a particularly bad example where this
showed up.
Nikita Popov [Sun, 4 Apr 2021 14:47:54 +0000 (16:47 +0200)]
[SimplifyCFG] Make test more robust (NFC)
These are supposed to test creation of a switch, so make sure
there is some actual code in the branches. Otherwise this could
be turned into a select instead.
Aaron Ballman [Sun, 4 Apr 2021 14:58:56 +0000 (10:58 -0400)]
Speculative fix for failing build bot.
This attempts to resolve an issue found by http://45.33.8.238/macm1/6821/step_6.txt
Roman Lebedev [Sun, 4 Apr 2021 12:56:43 +0000 (15:56 +0300)]
[llvm-exegesis] SnippetFile: do create source manager in MCContext
This way, once there's an error in the snippet file (like in the test),
llvm-exegesis won't crash with an assertion failure,
but print a nice diagnostic about the problem.
Nikita Popov [Sun, 4 Apr 2021 11:45:03 +0000 (13:45 +0200)]
[CVP] Add more tests for select with overdefined operand (NFC)
Also check the case where one operand isn't constant, which isn't
handled right now, because the SPF code requires both operands
to be ranges.
Move the tests to directly check ranges rather than go through an
and, to make it more obvious that this has no relation to bitmasks.
Roman Lebedev [Sun, 4 Apr 2021 11:36:56 +0000 (14:36 +0300)]
[llvm-exegesis] Don't erroneously refuse to measure POPCNT instruction
Dimitry Andric [Sat, 3 Apr 2021 10:20:13 +0000 (12:20 +0200)]
Don't check that std::pair is trivially copyable on FreeBSD
As FreeBSD already used libc++ before it changed its ABI, we still use
the non-trivially copyable version of std::pair, which used to be
exposed via `_LIBCPP_TRIVIAL_PAIR_COPY_CTOR`, but more recently via
`_LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR`.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D99834
Butygin [Sat, 3 Apr 2021 19:06:47 +0000 (22:06 +0300)]
[mlir][NFC] Fully spell mlir types names in LoopLikeOpInterface, so it can be used in ops defined outside mlir namespace
Differential Revision: https://reviews.llvm.org/D99844
Nikita Popov [Sun, 4 Apr 2021 08:52:22 +0000 (10:52 +0200)]
[LVI] Don't bail on overdefined value in select
Even if one of the operands is overdefined, we may still produce
a non-overdefined result, e.g. due to a min/max operation. This
matches our handling elsewhere, e.g. for binary operators.
The slot poisoning comment refers to a much older LVI cache
implementation.
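A minimal sketch of the min/max observation, modelling a value range as an inclusive `(low, high)` pair over unsigned 8-bit integers. The representation and the helper name are assumptions for illustration, not LVI's data structures.

```python
FULL_U8 = (0, 255)  # "overdefined": nothing is known about the value


def umin_range(a, b):
    """Inclusive range of umin(x, y) given inclusive ranges for x and y."""
    return (min(a[0], b[0]), min(a[1], b[1]))


# Even though one operand is overdefined, the result is bounded by the constant.
result = umin_range(FULL_U8, (10, 10))
print(result)
```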
Nikita Popov [Sun, 4 Apr 2021 09:05:59 +0000 (11:05 +0200)]
[CVP] Add test for and of min (NFC)
The and currently doesn't get optimized away because %a is
overdefined.
Jason Molenda [Sun, 4 Apr 2021 08:47:35 +0000 (01:47 -0700)]
Revert "Add support for fetching signed values from tagged pointers."
This reverts commit 4d9039c8dc2d1f0be1b5ee486d5a83b1614b038a.
This is causing the greendragon bots to fail most of the time when
running TestNSDictionarySynthetic.py. Reverting until Jim has a chance
to look at this on Monday. Running the commands from that test from
the command line, it fails 10-13% of the time on my desktop.
This is a revert of Jim's changes in https://reviews.llvm.org/D99694
Vitaly Buka [Sun, 4 Apr 2021 06:52:06 +0000 (23:52 -0700)]
[NFC][scudo] Restore !UseQuarantine check in tests
The check was removed in D99786 as it seems that quarantine is
irrelevant for the just created allocator. However, there are internal
issues with tagged memory access.
We should be able to fix iterateOverChunks for tagging later.
Craig Topper [Sun, 4 Apr 2021 06:05:34 +0000 (23:05 -0700)]
[RISCV] Don't convert fshr/fshl to target specific FSL/FSR node if shift amount is a constant.
As long as it's a constant we can directly pattern match it
without any problems. It's only when it isn't a constant that
we need to add an AND.
In theory this should allow more target independent optimizations
to remain active.
Timm Bäder [Thu, 25 Mar 2021 12:32:42 +0000 (13:32 +0100)]
[clang][parser] Set source ranges for GNU-style attributes
Set the source ranges for parsed GNU-style attributes in
ParseGNUAttributes(), the same way that ParseCXX11Attributes() does it.
Differential Revision: https://reviews.llvm.org/D75844
Juneyoung Lee [Sun, 4 Apr 2021 04:35:33 +0000 (13:35 +0900)]
[InstCombine] Conditionally fold select i1 into and/or
This patch fixes llvm.org/pr49688 by conditionally folding select i1 into and/or:
```
select cond, cond2, false
->
and cond, cond2
```
This is not safe if cond2 is poison whereas cond isn’t.
Unconditionally disabling this transformation affects later pipelines that depend on and/or i1s.
To minimize its impact, this patch conservatively checks whether cond2 is an instruction that
creates a poison or its operand creates a poison.
This approach is similar to what InstSimplify's SimplifyWithOpReplaced is doing.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99674
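The poison problem described in the select-to-and/or message above can be modelled with a sentinel value. This is a simplified sketch of the semantics, not LLVM code; `POISON`, `select`, and `and_i1` are names invented here.

```python
POISON = object()  # sentinel standing in for an LLVM poison value


def select(cond, true_val, false_val):
    """select propagates poison only from the condition or the chosen arm."""
    if cond is POISON:
        return POISON
    return true_val if cond else false_val


def and_i1(a, b):
    """and propagates poison from either operand."""
    if a is POISON or b is POISON:
        return POISON
    return a and b


# cond is false and cond2 is poison: select hides the poison, and does not.
print(select(False, POISON, False) is POISON)
print(and_i1(False, POISON) is POISON)
```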
Juneyoung Lee [Sun, 4 Apr 2021 04:46:32 +0000 (13:46 +0900)]
[InstSimplify] Add a test for folding comparison with a undef vector (NFC)
This is to fix https://reviews.llvm.org/D93990#2666922
Juneyoung Lee [Sun, 4 Apr 2021 04:29:32 +0000 (13:29 +0900)]
[InstCombine] precommit pr49688.ll (NFC)
This is going to be fixed by D99674
Juneyoung Lee [Sun, 4 Apr 2021 04:27:42 +0000 (13:27 +0900)]
[InstCombine] Reapply update_test_checks.py to unsigned-multiply-overflow-check.ll (NFC)
Thomas Preud'homme [Sat, 3 Apr 2021 10:45:59 +0000 (11:45 +0100)]
[C++20, test] Fix use of undef FileCheck variable
Commit f495de43bd5da50286da6020e508d106cfc60f57 forgot two lines when
removing checks for strong and weak equality, resulting in the use of an
undefined FileCheck variable.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99838
David Blaikie [Sat, 3 Apr 2021 21:02:11 +0000 (14:02 -0700)]
Preprocessor conditionalize some assert-only functions to suppress -Wunused-function
David Blaikie [Sat, 3 Apr 2021 21:01:15 +0000 (14:01 -0700)]
Add void cast to suppress -Wunused-member-variable on assert-only member
David Blaikie [Sat, 3 Apr 2021 21:00:30 +0000 (14:00 -0700)]
Add workaround for false positive in -Wfree-nonheap-object
David Blaikie [Sat, 3 Apr 2021 21:00:05 +0000 (14:00 -0700)]
Opaque pointers: Migrate examples to use load with explicit type
Mircea Trofin [Sat, 3 Apr 2021 06:45:35 +0000 (23:45 -0700)]
[mlgo] fix build rules
This was prompted by D95727, which had the side effect of breaking the
'release' mode build bot for ML-driven policies. The problem is that now
the pre-compiled object files don't get transitively carried through as
'source' anymore; that being said, the previous way of consuming them
was problematic, because it was only working for static builds; in
dynamic builds, the whole tf_xla_runtime was linked, which is
undesirable.
The alternative is to treat tf_xla_runtime as an archive, which then
leads to the desired effect.
Differential Revision: https://reviews.llvm.org/D99829
Roman Lebedev [Sat, 3 Apr 2021 19:23:17 +0000 (22:23 +0300)]
[NFC][X86] Split VPMOV* AVX2 instructions into their own sched class
At least on all three Zen's, all such instructions cleanly map
into this new class with no overrides needed.
Craig Topper [Sat, 3 Apr 2021 18:46:24 +0000 (11:46 -0700)]
[TableGen] Use StringRef instead of std::string to split up a string that's being parsed. NFCI
Philip Reames [Sat, 3 Apr 2021 16:44:28 +0000 (09:44 -0700)]
Speculative attempt to stabilize a test
New pass manager and old pass manager appear to differ on whether declarations are included in SCCs. For some reason, which you get appears to depend on build configuration.
Jez Ng [Sat, 3 Apr 2021 15:58:23 +0000 (11:58 -0400)]
[lld-macho] Another attempt at fixing 32-bit builds
Jez Ng [Sat, 3 Apr 2021 15:10:45 +0000 (11:10 -0400)]
[lld-macho] Fix build on 32-bit systems
Summary: Follow-up to D99633.
Nico Weber [Sat, 3 Apr 2021 14:56:09 +0000 (10:56 -0400)]
Revert "[lld-link] Enable addrsig table in COFF lto"
This reverts commit eabd55b1b2c5e322c3b36cb44348f178692890c8.
Speculative, for crbug.com/1195545
Nikita Popov [Tue, 9 Mar 2021 20:04:03 +0000 (21:04 +0100)]
[FastISel] Remove kill tracking
This is a followup to D98145: As far as I know, tracking of kill
flags in FastISel is just a compile-time optimization. However,
I'm not actually seeing any compile-time regression when removing
the tracking. This probably used to be more important in the past,
before FastRA was switched to allocate instructions in reverse
order, which means that it discovers kills as a matter of course.
As such, the kill tracking doesn't really seem to serve a purpose
anymore, and just adds additional complexity and potential for
errors. This patch removes it entirely. The primary changes are
dropping the hasTrivialKill() method and removing the kill
arguments from the emitFast methods. The rest is mechanical fixup.
Differential Revision: https://reviews.llvm.org/D98294
Christian Sigg [Thu, 1 Apr 2021 12:52:48 +0000 (14:52 +0200)]
Silence `-Wunused-private-field` warning on isIsolatedFromAbove.
NDEBUG builds currently warn because it's only used inside an assert.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D99721
Nikita Popov [Sat, 3 Apr 2021 13:27:46 +0000 (15:27 +0200)]
[InstCombine] Add load/store forwarding test with odd size (NFC)
Test the case where the type size doesn't equal the store size,
as suggested by bjope.
Simon Pilgrim [Sat, 3 Apr 2021 11:43:05 +0000 (12:43 +0100)]
[X86] Fold xor(truncate(xor(x,c1)),c2) -> xor(truncate(x),xor(truncate(c1),c2))
Fixes PR47603
This should probably be transferable to DAGCombine - the main limitation with the existing trunc(logicop) DAG fold is we don't know if legalization has tried to promote truncated logicops already. We might be able to peek through extensions as well.
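The truncate variant of the fold rests on truncation distributing over xor. A randomized arithmetic check, assuming 32-bit inputs truncated to 8 bits; the widths, seed, and helper names are choices made for this sketch.

```python
import random


def trunc(value, to_bits):
    """Truncate: keep only the low to_bits bits."""
    return value & ((1 << to_bits) - 1)


def before_trunc_fold(x, c1, c2):
    return trunc(x ^ c1, 8) ^ c2


def after_trunc_fold(x, c1, c2):
    return trunc(x, 8) ^ (trunc(c1, 8) ^ c2)


def trunc_fold_holds(samples=10000, seed=0):
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.getrandbits(32)
        c1 = rng.getrandbits(32)
        c2 = rng.getrandbits(8)
        if before_trunc_fold(x, c1, c2) != after_trunc_fold(x, c1, c2):
            return False
    return True
```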
Simon Pilgrim [Sat, 3 Apr 2021 10:59:05 +0000 (11:59 +0100)]
[X86] Add PR47603 test case
Simon Pilgrim [Sat, 3 Apr 2021 10:52:31 +0000 (11:52 +0100)]
[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper (REAPPLIED)
Use the getTargetShuffleInputs helper for all shuffle decoding
Reapplied (after reversion in rGfa0aff6d6960) with fix+test for subvector splitting - we weren't accounting for peeking through bitcasts changing the vector element count of the shuffle sources.
Bjorn Pettersson [Sat, 3 Apr 2021 10:25:37 +0000 (12:25 +0200)]
Fix build rules for LLVM_WITH_Z3 after D95727
Started to see build errors like this
../lib/Support/Z3Solver.cpp:19:10: fatal error: 'z3.h' file not found
#include <z3.h>
^~~~~~
1 error generated.
after commit 43ceb74eb1a5801662419fb66a6bf0d5414f1ec5.
The -isystem path to the Z3_INCLUDE_DIR went missing in the compile
commands. No idea why target_include_directories stopped working with
that commit, but using include_directories seems to work better.
Nikita Popov [Sat, 6 Mar 2021 15:10:21 +0000 (16:10 +0100)]
[Loads] Forward constant vector store to load of first element
InstCombine performs simple forwarding from stores to loads, but
currently only handles the case where the load and store have the
same size. This extends it to also handle a store of a constant
with a larger size followed by a load with a smaller size.
This is implemented through ConstantFoldLoadThroughBitcast() which
is fairly primitive (e.g. does not allow storing a large integer
and then loading a small one), but at least can forward the first
element of a vector store. Unfortunately it seems that we currently
don't have a generic helper for "read a constant value as a different
type", it's all tangled up with other logic in either
ConstantFolding or VNCoercion.
Differential Revision: https://reviews.llvm.org/D98114
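The memory-level reasoning behind the forwarding commit above can be sketched in plain C++: element 0 of a stored vector sits at offset 0, so a narrower load from the same address reads exactly that element. This is only an illustration of the access pattern, not of the InstCombine implementation; `loadFirstElement` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Stores a constant 4 x i32 "vector" and then loads a single i32 from the
// same address, mimicking a wide store followed by a narrow load.
int32_t loadFirstElement() {
  alignas(16) unsigned char memory[16];
  const std::array<int32_t, 4> stored = {1, 2, 3, 4};
  static_assert(sizeof(stored) == sizeof(memory), "store covers the buffer");
  std::memcpy(memory, stored.data(), sizeof(stored));

  int32_t loaded;
  std::memcpy(&loaded, memory, sizeof(loaded)); // narrow load at offset 0
  return loaded; // same bytes as stored[0]
}
```

Because the loaded bytes are exactly those of the first element, the load can be replaced by the constant `stored[0]`.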
Nikita Popov [Sat, 3 Apr 2021 08:53:56 +0000 (10:53 +0200)]
[BasicAA] Don't store AATags in cache key (NFC)
The AAMDNodes part of the MemoryLocation is not used by the BasicAA
cache, so don't store it. This reduces the size of each cache entry
from 112 bytes to 48 bytes.
Nikita Popov [Sat, 24 Oct 2020 08:39:24 +0000 (10:39 +0200)]
[BasicAA] Don't pass through AA metadata (NFCI)
BasicAA itself doesn't make use of AA metadata, but passes it
through to recursive queries and makes it part of the cache key.
Aliasing decisions that are based on AA metadata (i.e. TBAA and
ScopedAA) are based *only* on AA metadata, so checking them with
different pointer values or sizes is not useful, the result will
always be the same.
While this change is a mild compile-time improvement by itself,
the actual goal here is to reduce the size of AA cache keys in
a followup change.
Differential Revision: https://reviews.llvm.org/D90098
Thomas Preud'homme [Fri, 2 Apr 2021 23:06:55 +0000 (00:06 +0100)]
[PGO, test] Fix typo in FileCheck var
Reviewed By: xur
Differential Revision: https://reviews.llvm.org/D99821
Craig Topper [Sat, 3 Apr 2021 06:34:14 +0000 (23:34 -0700)]
[RISCV] Refactor conversion of B extensions to IR intrinsics a little to reduce clang binary size.
These all pass 1 type to getIntrinsic. So rather than assigning
IntrinsicTypes for each builtin which invokes the SmallVector
constructor, just select the intrinsic ID with a switch and
share a single assignment of IntrinsicTypes.
David Blaikie [Sat, 3 Apr 2021 03:47:49 +0000 (20:47 -0700)]
Add missing override to clang tblgen AttrEmitter
Matheus Izvekov [Sat, 3 Apr 2021 01:10:12 +0000 (03:10 +0200)]
[clang] NFC: remove trailing white spaces from some tests
Differential Revision: https://reviews.llvm.org/D99826
Fangrui Song [Sat, 3 Apr 2021 00:04:11 +0000 (17:04 -0700)]
[lld-macho] Fix -Wsuggest-override after D99633. NFC
Craig Topper [Fri, 2 Apr 2021 23:49:49 +0000 (16:49 -0700)]
[RISCV] Add signext attribute to i32 orc.b test for RV64 to match other Zbb tests.
Shows the sext.w at the end that would show up in C code. I'm thinking
orc.b would preserve sign bits from its input, but I'm not sure.
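The open question in the commit above can be explored with a small software emulation. This sketch assumes the usual description of orc.b, where each result byte is 0xFF if any bit of the corresponding input byte is set and 0x00 otherwise. The helpers `orcB` and `isSignExtended32` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Emulates orc.b on a 64-bit register under the assumption stated above.
constexpr uint64_t orcB(uint64_t x) {
  uint64_t result = 0;
  for (int i = 0; i < 8; ++i) {
    const uint64_t byte = (x >> (8 * i)) & 0xFF;
    if (byte != 0)
      result |= uint64_t{0xFF} << (8 * i);
  }
  return result;
}

// True when bits 63..31 are all equal, i.e. the value is a sign-extended i32.
constexpr bool isSignExtended32(uint64_t v) {
  const uint64_t upper = v >> 31; // the top 33 bits
  return upper == 0 || upper == 0x1FFFFFFFFULL;
}

// A negative i32 input stays sign-extended ...
static_assert(isSignExtended32(orcB(0xFFFFFFFF80000000ULL)), "");
// ... but a positive input with a non-zero top byte does not.
static_assert(orcB(0x000000007F000000ULL) == 0x00000000FF000000ULL, "");
static_assert(!isSignExtended32(orcB(0x000000007F000000ULL)), "");
```

Under that assumption, the sign-extended input 0x7F000000 produces a result whose bit 31 is set while the upper 32 bits are clear, which suggests the trailing sext.w cannot be dropped in general.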
Nico Weber [Fri, 2 Apr 2021 23:21:34 +0000 (19:21 -0400)]
[gn build] hook up tsan on macOS too
Mostly just works already.
Jez Ng [Fri, 2 Apr 2021 22:46:18 +0000 (18:46 -0400)]
[lld-macho][nfc] Refactor in preparation for 32-bit support
The main challenge was handling the different on-disk structures (e.g.
`mach_header` vs `mach_header_64`). I tried to strike a balance between
sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly)
and templatizing everything (causes code bloat, also ugly). I think I struck a
decent balance by judicious use of type erasure.
Note that LLD-ELF has a similar architecture, though it seems to use more templating.
Linking chromium_framework takes about the same time before and after this
change:
     N    Min    Max  Median     Avg       Stddev
x   20   4.52   4.67   4.595  4.5945  0.044423204
+   20    4.5   4.71   4.575   4.582  0.056344803
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99633
Nico Weber [Fri, 2 Apr 2021 22:21:37 +0000 (18:21 -0400)]
[gn build] (manually) port 4c58f333f141
Nico Weber [Fri, 2 Apr 2021 22:12:45 +0000 (18:12 -0400)]
Revert "[sanitizer] Simplify GetTls with dl_iterate_phdr"
This reverts commit 9be8f8b34d9b150cd1811e3556fe9d0cd735ae29.
This breaks tsan on Ubuntu 16.04:
$ cat tiny_race.c
#include <pthread.h>
int Global;
void *Thread1(void *x) {
  Global = 42;
  return x;
}
int main() {
  pthread_t t;
  pthread_create(&t, NULL, Thread1, NULL);
  Global = 43;
  pthread_join(t, NULL);
  return Global;
}
$ out/gn/bin/clang -fsanitize=thread -g -O1 tiny_race.c --sysroot ~/src/chrome/src/build/linux/debian_sid_amd64-sysroot/
$ docker run -v $PWD:/foo ubuntu:xenial /foo/a.out
FATAL: ThreadSanitizer CHECK failed: ../../compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:447 "((thr_beg)) >= ((tls_addr))" (0x7fddd76beb80, 0xfffffffffffff980)
#0 <null> <null> (a.out+0x4960b6)
#1 <null> <null> (a.out+0x4b677f)
#2 <null> <null> (a.out+0x49cf94)
#3 <null> <null> (a.out+0x499bd2)
#4 <null> <null> (a.out+0x42aaf1)
#5 <null> <null> (libpthread.so.0+0x76b9)
#6 <null> <null> (libc.so.6+0x1074dc)
(Get the sysroot from here: https://commondatastorage.googleapis.com/chrome-linux-sysroot/toolchain/500976182686961e34974ea7bdc0a21fca32be06/debian_sid_amd64_sysroot.tar.xz)
Also reverts follow-on commits:
This reverts commit 58c62fd9768594ec8dd57e8320ba2396bf8b87e5.
This reverts commit 31e541e37587100a5b21378380f54c028fda2d04.
Jinsong Ji [Fri, 2 Apr 2021 22:15:56 +0000 (22:15 +0000)]
[CSSPGO][Test] XFAIL profile-context-tracker-debug.ll on AIX
The test case started to fail after https://reviews.llvm.org/D99351.
It looks to me like the node order within the Context Profile Tree depends
on the implementation of std::hash<std::string>.
Unfortunately, the current clang implementation generates different values on
AIX (or on all big-endian systems?)
On Linux:
main:
2408804140(0x8f936f2c)
external:
896680882(0x357243b2)
externalA:
620231129(0x24f7f9d9)
On AIX:
main:
994322777(0x3b442959)
external:
3548191215(0xd37d19ef)
externalA:
1390365101(0x52df49ad)
XFAIL it for now while we discuss and look for a fix.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D99815
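The platform dependence described in the XFAIL commit above can be sketched as follows. The C++ standard does not fix the values std::hash<std::string> returns, so any order derived from them may differ between standard library implementations. The sketch assumes, as the commit suspects without confirming, that the node order follows the hash value; `orderByHash` is a hypothetical helper and not part of the profile code.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Sorts names by their std::hash value. The resulting order is only
// meaningful for one particular standard library implementation.
std::vector<std::string> orderByHash(std::vector<std::string> names) {
  auto byHash = [](const std::string &a, const std::string &b) {
    return std::hash<std::string>{}(a) < std::hash<std::string>{}(b);
  };
  std::sort(names.begin(), names.end(), byHash);
  // Sanity check: the result is ordered by hash on this platform.
  assert(std::is_sorted(names.begin(), names.end(), byHash));
  return names;
}
```

Calling `orderByHash({"main", "external", "externalA"})` on two platforms with different hash implementations may return the three names in different orders, which is why a test that hard-codes one order is fragile.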