Andrea Di Biagio [Wed, 25 Aug 2021 20:34:35 +0000 (21:34 +0100)]
[X86][MCA] Address the latest issues with MULX reported in PR51495.
It turns out that SchedWrite WriteIMulH was always assigned to the low half of
the result of a MULX (rather than to the high half).
To avoid confusion, this patch swaps the two MULX writes in the tablegen
definition of MULX32/64. That way, write names better describe what they
actually refer to; this also avoids further complications if in future we decide
to reuse the same MulH writes to also model other scalar integer multiply
instructions. I also had to swap the latency values for the two MULX writes to
make sure that the change is effectively an NFC. In fact, none of the existing
x86 tests were affected by this small refactoring.
This patch also fixes a bug in MCA: a wrong latency value was propagated for
instructions that perform multiple writes to the same register. This last issue
was found by Roman while testing MULX on targets that define a different latency
for the Low/High part of the result.
Differential Revision: https://reviews.llvm.org/D108727
Alex Richardson [Thu, 26 Aug 2021 10:11:56 +0000 (11:11 +0100)]
[sanitizer] Fix build on FreeBSD RISC-V
We have to avoid calling renameat2 and clone on FreeBSD.
Additionally, the mcontext structure has different members.
Reviewed By: jrtc27, luismarques
Differential Revision: https://reviews.llvm.org/D103886
Sindhu Chittireddy [Thu, 26 Aug 2021 10:58:56 +0000 (06:58 -0400)]
Assert pointer cannot be null; NFC
Klocwork static code analysis exposed this concern:
Pointer 'SubExpr' returned from call to getSubExpr() function which may
return NULL from 'cast_or_null<Expr>(Operand)', which will be
dereferenced in the statement following it
Add an assert on SubExpr to make it clear this pointer cannot be null.
Matthew Devereau [Thu, 26 Aug 2021 10:08:03 +0000 (11:08 +0100)]
[AArch64][SVE] Teach cost model masked gathers/scatters are cheap
Tell the cost model to use the scalable calculation for non-neon fixed vector.
This results in a cheaper cost for fixed-length SVE masked gathers/scatters
allowing the vectorizer to emit them more frequently.
Benjamin Kramer [Thu, 26 Aug 2021 10:11:02 +0000 (12:11 +0200)]
[X86] Don't write to the source directory in test
Roman Lebedev [Thu, 26 Aug 2021 08:51:28 +0000 (11:51 +0300)]
The maximal representable alignment in LLVM IR is 1GiB, not 512MiB
In LLVM IR, `AlignmentBitfieldElementT` is 5 bits wide.
But that means that the maximal alignment exponent is `(1<<5)-2`,
which is `30`, not `29`. And indeed, an alignment of `1073741824`
roundtrips IR serialization-deserialization.
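The arithmetic can be checked with a short C++ sketch. The names below are illustrative and are not LLVM's declarations, and the encoding described in the comment (the field stores the exponent plus one, with zero reserved for "no alignment") is an assumption about where the `-2` comes from:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constant; not the declaration used in LLVM.
constexpr unsigned AlignmentFieldBits = 5;

// Assumed encoding: stored value = log2(alignment) + 1, with 0 reserved for
// "no alignment". The largest stored value is (1 << 5) - 1 = 31, so the
// largest exponent is (1 << 5) - 2 = 30.
constexpr unsigned maxAlignmentExponent() {
  return (1u << AlignmentFieldBits) - 2;
}

// Largest alignment in bytes that such a field could describe.
constexpr std::uint64_t maxAlignment() {
  return std::uint64_t{1} << maxAlignmentExponent();
}

static_assert(maxAlignmentExponent() == 30, "exponent is 30, not 29");
static_assert(maxAlignment() == 1073741824ULL, "1 GiB");
```

Under these assumptions the limit is exactly twice the previously documented 512MiB.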
While this doesn't seem all that important, this doubles
the maximal supported alignment from 512MiB to 1GiB,
and there's actually one noticeable use-case for that;
On X86, the huge pages can have sizes of 2MiB and 1GiB (!).
So while this doesn't add support for truly huge alignments,
which i think we can easily-ish do if wanted, i think this adds
zero-cost support for a not-trivially-dismissable case.
I don't believe we need any upgrade infrastructure,
and since we don't explicitly record the IR version,
we don't need to bump one either.
As @craig.topper speculates in D108661#2963519,
this might be an artificial limit imposed by the original implementation
of the `getAlignment()` functions.
Differential Revision: https://reviews.llvm.org/D108661
Benjamin Kramer [Thu, 26 Aug 2021 09:37:07 +0000 (11:37 +0200)]
[libunwind] Don't include cet.h/immintrin.h unconditionally
These may not exist when CET isn't available.
Alex Richardson [Thu, 26 Aug 2021 08:51:23 +0000 (09:51 +0100)]
Make Value::MaxAlignment(Exponent) constexpr
This avoids references to the variables being generated when using e.g. max().
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95050
Alex Richardson [Thu, 26 Aug 2021 08:50:05 +0000 (09:50 +0100)]
Fix __attribute__((annotate(""))) with non-zero globals AS
The existing code attempting to bitcast from a value in the default globals AS
to i8 addrspace(0)* was triggering an assertion failure in our downstream fork.
I found this while compiling poppler for CHERI-RISC-V (we use AS200 for all
globals). The test case uses AMDGPU since that is one of the in-tree targets
with a non-zero default globals address space.
The new test previously triggered a "Invalid constantexpr bitcast!" assertion
and now correctly generates code with addrspace(1) pointers.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D105972
Alex Richardson [Thu, 26 Aug 2021 08:47:53 +0000 (09:47 +0100)]
Fix LLVM_ENABLE_THREADS check from 26a92d5852b2c6bf77efd26f6c0194c913f40285
We should be using #if instead of #ifdef here since LLVM_ENABLE_THREADS
is set using #cmakedefine01 so is always defined.
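A minimal C++ sketch of the difference, assuming the CMake option is off so that `#cmakedefine01` emits `#define LLVM_ENABLE_THREADS 0` (the two helper function names are hypothetical):

```cpp
#include <cassert>

// What `#cmakedefine01 LLVM_ENABLE_THREADS` produces when the option is off:
// the macro is still defined, just with the value 0.
#define LLVM_ENABLE_THREADS 0

// #ifdef only asks "is the macro defined?", so this reports true even though
// threads are disabled.
constexpr bool threadsEnabledAccordingToIfdef() {
#ifdef LLVM_ENABLE_THREADS
  return true;
#else
  return false;
#endif
}

// #if evaluates the macro's value, so this reports false as intended.
constexpr bool threadsEnabledAccordingToIf() {
#if LLVM_ENABLE_THREADS
  return true;
#else
  return false;
#endif
}
```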
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108110
Florian Hahn [Thu, 26 Aug 2021 09:08:00 +0000 (10:08 +0100)]
[ConstraintElimination] Initial support for using info from assumes.
This patch adds initial support to use facts from @llvm.assume calls. It
intentionally does not handle all possible cases to keep things simple
initially.
For now, the condition from an assume is made available on entry to the
containing block, if the assume is guaranteed to execute. Otherwise it
is only made available in the successor blocks.
Florian Hahn [Thu, 26 Aug 2021 09:07:46 +0000 (10:07 +0100)]
[ConstraintElimination] Add more assume tests.
David Green [Thu, 26 Aug 2021 08:43:44 +0000 (09:43 +0100)]
[AArch64] Remove unpredictable from narrowing instructions.
Like other similar instructions the xtn2 family do not have side
effects, and explicitly marking them as such can help improve scheduling
freedom.
David Green [Thu, 26 Aug 2021 08:13:30 +0000 (09:13 +0100)]
[AArch64] Add a Cortex-A55 NEON scheduler test case.
Jay Foad [Thu, 26 Aug 2021 08:27:01 +0000 (09:27 +0100)]
[MachineScheduler] Fix tracing
Consistently print a newline before "RegionInstrs:".
LLVM GN Syncbot [Thu, 26 Aug 2021 08:14:37 +0000 (08:14 +0000)]
[gn build] Port 21b25a1fb32e
gejin [Thu, 26 Aug 2021 08:20:38 +0000 (16:20 +0800)]
[libunwind] Support stack unwind in CET environment
Control-flow Enforcement Technology (CET), published by Intel,
introduces a shadow stack feature that aims to ensure a return from
a function is directed to where the function was called.
In a CET-enabled system, each function call pushes the return
address onto both the normal stack and the shadow stack. When the
function returns, the address stored in the shadow stack is popped and
compared with the return address; the program fails if the two
addresses don't match.
In exception handling, the control flow may skip some stack frames,
and we must adjust the shadow stack to avoid violating the CET restriction.
To achieve this, we count the number of stack frames skipped
and adjust the shadow stack by this number before jumping to the landing pad.
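The idea can be illustrated with a toy C++ model. This is only a sketch: real shadow stacks are maintained by hardware, and `ToyThread` and all of its members are hypothetical names, not libunwind code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: both stacks are plain vectors of return addresses.
struct ToyThread {
  std::vector<std::uintptr_t> Stack;       // return addresses on the normal stack
  std::vector<std::uintptr_t> ShadowStack; // protected copy of the same addresses

  // A call pushes the return address onto both stacks.
  void call(std::uintptr_t ReturnAddr) {
    Stack.push_back(ReturnAddr);
    ShadowStack.push_back(ReturnAddr);
  }

  // A return pops both stacks and compares the addresses. A result of false
  // stands for the fault a CET-enabled system would raise.
  bool ret() {
    if (Stack.empty() || ShadowStack.empty())
      return false;
    bool Match = Stack.back() == ShadowStack.back();
    Stack.pop_back();
    ShadowStack.pop_back();
    return Match;
  }

  // Unwinding that skips `Skipped` frames and pops the same number of
  // shadow stack entries, keeping both stacks in step.
  void unwind(std::size_t Skipped) {
    for (std::size_t I = 0; I < Skipped && !Stack.empty(); ++I) {
      Stack.pop_back();
      ShadowStack.pop_back();
    }
  }

  // For contrast: unwinding that forgets to adjust the shadow stack.
  void unwindWithoutShadowFix(std::size_t Skipped) {
    for (std::size_t I = 0; I < Skipped && !Stack.empty(); ++I)
      Stack.pop_back();
  }
};
```

After `unwind`, the next `ret()` sees matching addresses; after `unwindWithoutShadowFix`, it sees a stale shadow stack entry and reports a mismatch.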
Reviewed By: hjl.tools, compnerd, MaskRay
Differential Revision: https://reviews.llvm.org/D105968
Signed-off-by: gejin <ge.jin@intel.com>
Jean Perier [Thu, 26 Aug 2021 07:44:24 +0000 (09:44 +0200)]
[flang] Take result length into account in ApplyElementwise folding
ApplyElementwise on character operation was always creating a result
ArrayConstructor with the length of the left operand. This is not
correct for concatenation and SetLength operations.
Compute and thread the length to the spot creating the ArrayConstructor
so that the length is correct for those character operations.
Differential Revision: https://reviews.llvm.org/D108711
LLVM GN Syncbot [Thu, 26 Aug 2021 07:29:05 +0000 (07:29 +0000)]
[gn build] Port 3373e845398b
Gabor Bencze [Wed, 25 Aug 2021 18:22:15 +0000 (20:22 +0200)]
[clang-tidy] Add bugprone-suspicious-memory-comparison check
The check warns on suspicious calls to `memcmp`.
It currently checks for comparing types that do not have
unique object representations or are non-standard-layout.
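A small C++ sketch of the kind of comparison the check is aimed at. The struct and function names are hypothetical, and the presence of padding in `Padded` is an assumption that holds on common ABIs where `int` is more strictly aligned than `char`:

```cpp
#include <cassert>
#include <cstring>

// On common ABIs there are padding bytes between `Tag` and `Value`.
struct Padded {
  char Tag;
  int Value;
};

// Suspicious: memcmp also compares the padding bytes, whose contents are
// not guaranteed to be equal for otherwise equal objects.
inline bool equalBytes(const Padded &A, const Padded &B) {
  return std::memcmp(&A, &B, sizeof(Padded)) == 0;
}

// Safer: compare only the members that carry meaning.
inline bool equalMembers(const Padded &A, const Padded &B) {
  return A.Tag == B.Tag && A.Value == B.Value;
}

// Compiler builtin (GCC and Clang); false for types whose object
// representation contains bits that do not take part in the value.
constexpr bool paddedHasUniqueRepresentation() {
  return __has_unique_object_representations(Padded);
}
```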
Based on
https://wiki.sei.cmu.edu/confluence/display/c/EXP42-C.+Do+not+compare+padding+data
https://wiki.sei.cmu.edu/confluence/display/c/FLP37-C.+Do+not+use+object+representations+to+compare+floating-point+values
and part of
https://wiki.sei.cmu.edu/confluence/display/cplusplus/OOP57-CPP.+Prefer+special+member+functions+and+overloaded+operators+to+C+Standard+Library+functions
Add alias `cert-exp42-c` and `cert-flp37-c`.
Some tests are currently failing at head; the check depends on D89649.
Originally started in D71973
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D89651
Gabor Bencze [Wed, 25 Aug 2021 18:22:15 +0000 (20:22 +0200)]
Fix __has_unique_object_representations with no_unique_address
Fix incorrect behavior of `__has_unique_object_representations`
when using the no_unique_address attribute.
Based on the bug report: https://bugs.llvm.org/show_bug.cgi?id=47722
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D89649
Esme-Yi [Thu, 26 Aug 2021 07:17:06 +0000 (07:17 +0000)]
[llvm-readobj][XCOFF] Add support for `--needed-libs` option.
Summary: This patch is trying to add support for llvm-readobj
--needed-libs option under XCOFF.
For XCOFF, the needed libraries can be found from the Import
File ID Name Table of the Loader Section.
Currently, I am using binary inputs in the test since yaml2obj
does not yet support writing the Loader Section and the
import file table.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D106643
Lin Sun [Thu, 26 Aug 2021 06:50:17 +0000 (23:50 -0700)]
[Driver][Linux] Fix regression when -DLIBCXX_LIBDIR_SUFFIX=64
This patch allows an installed (`ninja install-clang`) Clang to find
`../lib64/libc++.so`
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D108286
Jan Svoboda [Wed, 25 Aug 2021 16:45:46 +0000 (18:45 +0200)]
[clang][deps] Reset non-modular language and preprocessor options
There are a number of language and preprocessor options that are reset in the `CompilerInvocation` that describes the build of an implicit module. This patch uses the logic for explicit modules as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108710
Aart Bik [Thu, 26 Aug 2021 04:01:12 +0000 (21:01 -0700)]
[mlir][sparse] add asCOO() functionality to sparse tensor object
This prepares general sparse to sparse conversions. The code that
needs to be generated using this new feature is now simply:
(1) coo = sparse_tensor_1->asCOO(); // source format1
(2) sparse_tensor_2 = newSparseTensor(coo); // destination format2
By using COO as an intermediate, we can do *all* conversions without
having to implement the full O(N^2) conversion matrix. Note that we
can always improve particular conversions individually if a faster
solution is required.
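To illustrate why one intermediate format is enough, here is a self-contained C++ sketch that converts CSR to CSC by way of COO. It does not use the MLIR sparse tensor runtime; every type and function name below is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Coordinate format: one (row, col, value) triple per stored element.
struct Entry { std::size_t row, col; double val; };
using COO = std::vector<Entry>;

struct CSR { // compressed sparse row
  std::size_t rows, cols;
  std::vector<std::size_t> rowPtr, colIdx;
  std::vector<double> vals;
};

struct CSC { // compressed sparse column
  std::size_t rows, cols;
  std::vector<std::size_t> colPtr, rowIdx;
  std::vector<double> vals;
};

// Source format -> COO: walk each row's slice of the index arrays.
COO toCOO(const CSR &m) {
  COO out;
  for (std::size_t r = 0; r < m.rows; ++r)
    for (std::size_t k = m.rowPtr[r]; k < m.rowPtr[r + 1]; ++k)
      out.push_back({r, m.colIdx[k], m.vals[k]});
  return out;
}

// COO -> destination format: count entries per column, prefix-sum the
// counts into column offsets, then scatter each entry into its column.
CSC fromCOO(const COO &coo, std::size_t rows, std::size_t cols) {
  CSC out{rows, cols, std::vector<std::size_t>(cols + 1, 0),
          std::vector<std::size_t>(coo.size()),
          std::vector<double>(coo.size())};
  for (const Entry &e : coo)
    ++out.colPtr[e.col + 1];
  for (std::size_t c = 0; c < cols; ++c)
    out.colPtr[c + 1] += out.colPtr[c];
  std::vector<std::size_t> next(out.colPtr.begin(), out.colPtr.end() - 1);
  for (const Entry &e : coo) {
    std::size_t pos = next[e.col]++;
    out.rowIdx[pos] = e.row;
    out.vals[pos] = e.val;
  }
  return out;
}
```

With N formats, this style needs one `toCOO` and one `fromCOO` per format rather than a dedicated converter for every pair.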
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D108681
Wenlei He [Tue, 24 Aug 2021 16:55:18 +0000 (09:55 -0700)]
[CSSPGO] Add switch for sample loader to honor global pre-inliner decision from llvm-profgen
The change adds a switch to allow sample loader to use global pre-inliner's decision instead. The pre-inliner in llvm-profgen makes inline decision globally based on whole program profile and function byte size as cost proxy.
Since pre-inliner also adjusts/merges context profile based on its inline decision, honoring its inline decision in sample loader would lead to better post-inline profile quality especially for thinlto where cross module profile merging isn't possible without pre-inliner.
A minor fix in the profile reader is also included. When the pre-inliner is in use, we now also turn off the default merging and trimming logic unless it's explicitly requested.
Differential Revision: https://reviews.llvm.org/D108677
Fangrui Song [Wed, 25 Aug 2021 23:59:06 +0000 (16:59 -0700)]
[LLVMgold.so][test] Make comdat-nodeduplicate.ll work with binutils<2.27
Sam Clegg [Wed, 25 Aug 2021 22:13:46 +0000 (18:13 -0400)]
[clang][Emscripten] Define __unix family of macros
This will allow us to remove these from the downstream
driver:
https://github.com/emscripten-core/emscripten/blob/57270ce8150a5107e591b4e9ec7cbeff0ba7c905/emcc.py#L860-L863
Differential Revision: https://reviews.llvm.org/D108735
Arthur Eubanks [Wed, 25 Aug 2021 23:13:40 +0000 (16:13 -0700)]
[gn build] Unbreak non-clang host builds
eecd5d0a broke non-clang host builds.
Some crt code is not always built with the just-built clang.
0da172b checked if the compiler is clang, not assert that the compiler
is clang.
Alexey Bataev [Wed, 25 Aug 2021 22:54:23 +0000 (15:54 -0700)]
[SLP][NFC]Add a test for non-optimal PHIs vectorization, NFC.
Heejin Ahn [Wed, 25 Aug 2021 10:53:22 +0000 (03:53 -0700)]
[WebAssembly] Use entry block only for initializations in EmSjLj
Emscripten SjLj transformation is done in four steps. This will be
mostly the same for the soon-to-be-added Wasm SjLj; steps 1, 3, and 4
will be shared and there will be a separate way of doing step 2.
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
We initialize `setjmpTable` and `setjmpTableSize` in the entry BB. But
if the entry BB contains a `setjmp` call, some `setjmp` handling
transformation will also happen in the entry BB, such as calling
`saveSetjmp`.
This is fine for Emscripten SjLj but not for Wasm SjLj, because in Wasm
SjLj we will add a dispatch BB that contains a `switch` right after the
entry BB, from which we jump to one of post-`setjmp` BBs. And this
dispatch BB should precede all `setjmp` calls.
Emscripten SjLj (current):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
...
call @saveSetjmp(...)
```
Wasm SjLj (follow-up):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
setjmp.dispatch:
...
; Jump to the right post-setjmp BB, if we are returning from a
; longjmp. If this is the first setjmp call, go to %entry.split.
switch i32 %no, label %entry.split [
i32 1, label %post.setjmp1
i32 2, label %post.setjmp2
...
i32 N, label %post.setjmpN
]
entry.split:
...
call @saveSetjmp(...)
```
So in Wasm SjLj we split the entry BB to make the entry block only for
`setjmpTable` and `setjmpTableSize` initialization and insert a
`setjmp.dispatch` BB. (This part is not in this CL. This will be a
follow-up.) But note that Emscripten SjLj and Wasm SjLj share all
steps except for step 2. If we split the entry BB only for Wasm
SjLj, there will be one more `if`-`else` and the code will be more
complicated.
So this CL splits the entry BB in Emscripten SjLj and puts only
initialization stuff there as follows:
Emscripten SjLj (this CL):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
br %entry.split
entry.split:
...
call @saveSetjmp(...)
```
This is just done to share code with Wasm SjLj. It adds an unnecessary
branch but this will be removed in later optimization passes anyway.
This is in effect NFC, meaning the program behavior will not change, but
existing ll tests files have changed because the entry block was split.
The reason I upload this in a separate CL is to make the Wasm SjLj diff
tidier, because this changes many existing Emscripten SjLj tests, which
can be confusing for the follow-up Wasm SjLj CL.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108729
Heejin Ahn [Wed, 25 Aug 2021 03:40:21 +0000 (20:40 -0700)]
[WebAssembly] Extract longjmp handling in EmSjLj to a function (NFC)
Emscripten SjLj and (soon-to-be-added) Wasm SjLj transformation share
many steps:
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
1, 3, and 4 are identical for Emscripten SjLj and Wasm SjLj. Only the
step 2 is different. This CL extracts the current Emscripten SjLj's
longjmp callsites handling into a function. The reason to make this a
separate CL is, without this, the diff tool cannot compare things well
in the presence of moved code and added code in the followup Wasm SjLj
CL, and it ends up mixing them together, making the diff unreadable.
Also fixes some typos and variable names. So far we've been calling the
buffer argument to `setjmp` and `longjmp` `jmpbuf`, but the name used in
the man page for those functions is `env`, so updated them to be
consistent.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108728
Dimitry Andric [Wed, 25 Aug 2021 21:08:13 +0000 (23:08 +0200)]
[libc++][NFC] Remove duplicate ranges entry in CMakeLists.txt.
The second entry got added accidentally as part of 5a3309f825769.
Reviewed By: cjdb
Differential Revision: https://reviews.llvm.org/D108726
Reid Kleckner [Wed, 25 Aug 2021 18:34:00 +0000 (11:34 -0700)]
Effectively revert 33c3d8a916c / D33782
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.
Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`
In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.
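For context, a short C++ sketch of what "operator names" means here. The helper functions are hypothetical; the point is only that `or`, `and`, and `not` are alternative spellings of `||`, `&&`, and `!` in standard C++:

```cpp
#include <cassert>

// With operator names enabled (the default in standard C++), these keywords
// behave exactly like the punctuation they stand for.
inline bool eitherSet(bool A, bool B) { return A or B; }
inline bool neitherSet(bool A, bool B) { return not A and not B; }

// Because `or` is an operator, a declaration such as `int or;` is rejected
// unless operator names are disabled, for example with -fno-operator-names.
```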
I added a release note for this, since it may break straightforward
users of the Windows SDK.
Fixes PR42427
Differential Revision: https://reviews.llvm.org/D108720
Vitaly Buka [Wed, 25 Aug 2021 21:33:06 +0000 (14:33 -0700)]
[sanitizer] Add new line to the test
Vitaly Buka [Wed, 25 Aug 2021 21:30:53 +0000 (14:30 -0700)]
[sanitizer] Fix VReport of symbol version
Version is already a string and does not need stringizing.
Vitaly Buka [Wed, 25 Aug 2021 21:24:02 +0000 (14:24 -0700)]
[sanitizers] Basic realpath test
Craig Topper [Wed, 25 Aug 2021 21:17:38 +0000 (14:17 -0700)]
[RISCV] Fix the check prefixes in some B extension tests. NFC
Looks like a bad merge happened after these were renamed in D107992.
Ricky Taylor [Wed, 25 Aug 2021 19:48:28 +0000 (20:48 +0100)]
[M68k][NFC] Rename M68kOperand::Kind to KindTy
Rename the M68kOperand::Kind enumeration to KindTy to avoid ambiguity
with the Kind field when referencing enumeration values e.g.
`Kind::Value`.
This works around a compilation error under GCC 5, where GCC won't
look up enum class values if you have a similarly named field
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60994).
The error in question is:
`M68kAsmParser.cpp:857:8: error: 'Kind' is not a class, namespace, or enumeration`
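A minimal C++ sketch of the pattern after the rename. `Operand`, its enumerators, and `isImm` are hypothetical stand-ins; only the `KindTy`/`Kind` naming follows the commit:

```cpp
#include <cassert>

struct Operand {
  // The enumeration and the field now have different names, so a qualified
  // name such as KindTy::Imm cannot be confused with the field `Kind`.
  enum class KindTy { Invalid, Token, Imm };
  KindTy Kind = KindTy::Invalid;

  bool isImm() const { return Kind == KindTy::Imm; }
};
```

With both the enumeration and the field called `Kind`, the qualified name `Kind::Imm` is where GCC 5 reports the error quoted above.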
Differential Revision: https://reviews.llvm.org/D108723
Heejin Ahn [Wed, 25 Aug 2021 02:43:28 +0000 (19:43 -0700)]
[WebAssembly] Rename wasm.catch.exn intrinsic back to wasm.catch
The plan was to use `wasm.catch.exn` intrinsic to catch exceptions and
add `wasm.catch.longjmp` intrinsic, that returns two values (setjmp
buffer and return value), later to catch longjmps. But because we
decided not to use multivalue support at the moment, we are going to use
one intrinsic that returns a single value for both exceptions and
longjmps. And even if it's not for that, I now think the naming of
`wasm.catch.exn` is a little weird, because the intrinsic can still take
a tag immediate, which means it can be used for anything, not only
exceptions, as long as that returns a single value.
This partially reverts D107405.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108683
Vitaly Buka [Wed, 25 Aug 2021 20:34:19 +0000 (13:34 -0700)]
Revert "Problem with realpath interceptor"
Breaks realpath(, nullptr) for all sanitizers.
Somehow INTERCEPT_FUNCTION and INTERCEPT_FUNCTION_VER return
false even if everything is seemingly right.
And this is the issue for COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN.
There is a check in every sanitizer:
if (!INTERCEPT_FUNCTION_VER(name, ver) && !INTERCEPT_FUNCTION(name))
For non-versioned interceptors when INTERCEPT_FUNCTION returns false
it's not considered fatal, and it just prints a warning.
However, INTERCEPT_FUNCTION_VER in this case will fall back to
INTERCEPT_FUNCTION replacing realpath with wrong version.
We need to investigate that before relanding the patch.
This reverts commit faef0d042f523357fe5590e7cb6a8391cf0351a8.
Omar Emara [Wed, 25 Aug 2021 20:54:49 +0000 (13:54 -0700)]
[LLDB][GUI] Add initial searcher support
This patch adds a new type of reusable UI component. Searcher Windows
contain a text field to enter a search keyword and present a scrollable
list of matches. The target match can be selected and executed,
which invokes a user callback to do something with the match.
This patch also adds one searcher delegate, which wraps the common
command completion searchers for simple use cases.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D108545
Andrea Di Biagio [Wed, 25 Aug 2021 20:24:58 +0000 (21:24 +0100)]
[X86][MCA] Add more tests for MULX (PR51495).
llvm-mca still reports a wrong latency for the case where
the two destination registers of MULX are the same.
Justas Janickas [Wed, 25 Aug 2021 20:20:06 +0000 (21:20 +0100)]
[OpenCL][NFC] Fix code example in __remove_address_space documentation.
Sanjay Patel [Wed, 25 Aug 2021 20:09:51 +0000 (16:09 -0400)]
[DAGCombiner] create binop nodes with all of expected values
This is another bug exposed by https://llvm.org/PR51612
(and the one that triggered the initial assertion) in the report.
That example was suppressed with: 985b48f18341
...but these would still crash because we created nodes
like UADDO without the expected 2 output values.
Alfonso Sánchez-Beato [Wed, 25 Aug 2021 20:03:32 +0000 (23:03 +0300)]
[llvm-objcopy] [COFF] Consider section flags when adding section
The --set-section-flags option was being ignored when adding a new
section. Take it into account if present.
Fixes https://llvm.org/PR51244
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D106942
Florian Hahn [Wed, 25 Aug 2021 19:39:33 +0000 (20:39 +0100)]
[ConstraintElimination] Add test cases with @llvm.assume.
Tobias Gysi [Wed, 25 Aug 2021 19:27:42 +0000 (19:27 +0000)]
[mlir][linalg] Tune hasTensorSemantics/hasBufferSemantics methods.
Optimize performance by iterating all operands at once.
Reviewed By: benvanik
Differential Revision: https://reviews.llvm.org/D108716
LLVM GN Syncbot [Wed, 25 Aug 2021 19:14:11 +0000 (19:14 +0000)]
[gn build] Port fe01014faa33
Patrick Holland [Sun, 22 Aug 2021 00:37:02 +0000 (17:37 -0700)]
[MCA] Moved View.h and View.cpp from /tools/llvm-mca/ to /lib/MCA/.
Moved View.h and View.cpp from /tools/llvm-mca/Views/ to /lib/MCA/ and
/include/llvm/MCA/. This is so that targets can define their own Views within
the /lib/Target/ directory (so that the View can use backend functionality).
To enable these Views within mca, targets will need to add them to the vector of
Views returned by their target's CustomBehaviour::getViews() methods.
Differential Revision: https://reviews.llvm.org/D108520
Nick Desaulniers [Wed, 25 Aug 2021 19:10:27 +0000 (12:10 -0700)]
[llvm][test][CodeGen] fix up D106030
Fixes missing -mtriple from llc tests, which were failing on non-x86
hosts.
Fixes: D106030
Reviewed By: arsenm, aaron.ballman
Differential Revision: https://reviews.llvm.org/D108718
David Green [Wed, 25 Aug 2021 19:10:18 +0000 (20:10 +0100)]
[ARM] Add Extra FpToIntSat tests.
This adds extra MVE vector fptosi.sat and fptoui.sat tests, along with
adding or adjusting the existing scalar tests to cover more
architectures and instruction combinations.
Tobias Gysi [Wed, 25 Aug 2021 18:43:41 +0000 (18:43 +0000)]
[mlir][linalg] Tune getTiedIndexingMap method (NFC).
Optimize the performance by using the range directly.
Reviewed By: benvanik
Differential Revision: https://reviews.llvm.org/D108715
Nico Weber [Tue, 24 Aug 2021 14:19:21 +0000 (10:19 -0400)]
[lld/COFF] Improve handling of the /manifestdependency: flag
If multiple /manifestdependency: flags are passed, they are
naively deduped, but after that each of them should have an
effect, instead of just the last one.
Also, /manifestdependency: flags are allowed in .drectve sections
(from `#pragma comment(linker, ...`). To make the interaction between
/manifestdependency: flags enabling manifest by default but
/manifest:no overriding this work, add an explicit ManifestKind::Default
state to represent no explicit /manifest flag being passed.
To make /manifestdependency: flags from input file .drectve sections
work with /manifest:embed, delay embedded manifest emission until
after input files have been read.
Differential Revision: https://reviews.llvm.org/D108628
Richard Smith [Wed, 25 Aug 2021 18:01:45 +0000 (11:01 -0700)]
PR51105: look through ConstantExpr when looking for a braced string literal initialization.
Aart Bik [Wed, 25 Aug 2021 04:53:34 +0000 (21:53 -0700)]
[mlir][sparse] add sparse-dense cases to storage integration test
Reviewed By: grosul1
Differential Revision: https://reviews.llvm.org/D108685
Arthur Eubanks [Wed, 25 Aug 2021 18:29:41 +0000 (11:29 -0700)]
[test] Precommit some tests for invariant group icmps
Sanjay Patel [Wed, 25 Aug 2021 17:44:22 +0000 (13:44 -0400)]
[DAGCombiner] check uses more strictly on select-of-binop fold
There are 2 bugs here:
1. We were not checking uses of operand 2 (the false value of the select).
2. We were not checking for multiple uses of nodes that produce >1 result.
Correcting those is enough to avoid the crash in the reduced test based on:
https://llvm.org/PR51612
The additional use check on operand 0 (the condition value of the select)
should not strictly be necessary because we are only replacing one use
with another (whether it makes performance sense to do the transform with
that pattern is not clear). But as noted in the TODO, changing that
uncovers another bug.
Note: there's at least one more bug here - we aren't propagating EVTs
correctly, but I plan to fix that in another patch.
Arthur Eubanks [Wed, 25 Aug 2021 18:03:42 +0000 (11:03 -0700)]
[test] Use update_test_checks on llvm/test/Transforms/InstCombine/invariant.group.ll
Michael Kruse [Wed, 25 Aug 2021 17:06:42 +0000 (12:06 -0500)]
[test] Fix indentation. NFC.
Michael Kruse [Wed, 25 Aug 2021 16:31:53 +0000 (11:31 -0500)]
[Preprocessor] Elide empty line(s) at start of file.
In -P mode, PrintPPOutputPPCallbacks::MoveToLine started at least one
newline if current and target line number mismatched. The method is also
called when entering a new file, be it the main file or an include file.
In this situation line numbers almost always mismatch, resulting in a
newline for each occurrence even if no tokens have been printed
in-between.
Empty lines at the beginning of the output must be trimmed because it
may be parsed by scripts expecting the result to appear on the first
output line, as done by LibreOffice's configure script.
Fix by only emitting a newline if tokens have been printed so far using
the EmittedTokensOnThisLine flag. Also add a test case of FileChanged
callbacks occurring with empty include files.
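The fix can be pictured with a toy printer in C++. This is a sketch, not clang's `PrintPPOutputPPCallbacks`: the class and its methods are hypothetical, and only the flag name mirrors the commit message:

```cpp
#include <cassert>
#include <string>

class ToyPrinter {
  std::string Out;
  bool EmittedTokensOnThisLine = false;

public:
  // Called on every line change and whenever a file is entered. A newline
  // is only written if something was printed on the current line.
  void moveToLine() {
    if (!EmittedTokensOnThisLine)
      return;
    Out += '\n';
    EmittedTokensOnThisLine = false;
  }

  // Appends a token, separating it from a previous token on the same line.
  void emitToken(const std::string &Tok) {
    if (EmittedTokensOnThisLine)
      Out += ' ';
    Out += Tok;
    EmittedTokensOnThisLine = true;
  }

  const std::string &str() const { return Out; }
};
```

Entering several empty include files before the first token calls `moveToLine()` repeatedly, but the output still begins with that first token.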
This fixes llvm.org/PR51616
Nick Desaulniers [Wed, 25 Aug 2021 17:18:13 +0000 (10:18 -0700)]
[Clang] add support for error+warning fn attrs
Add support for the GNU C style __attribute__((error(""))) and
__attribute__((warning(""))). These attributes are meant to be put on
declarations of functions that should not be called.
They are frequently used to provide compile time diagnostics similar to
_Static_assert, but which may rely on non-ICE conditions (ie. relying on
compiler optimizations). This is also similar to diagnose_if function
attribute, but can diagnose after optimizations have been run.
While users may instead simply call undefined functions in such cases to
get a linkage failure from the linker, these provide a much more
ergonomic and actionable diagnostic to users and do so at compile time
rather than at link time. Alternatively, users may be able to use inline
asm .err directives.
These are used throughout the Linux kernel in its implementation of
BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be
converted to use _Static_assert because many of the parameters are not
ICEs. The Linux kernel still needs to be modified to make use of these
when building with Clang; I have a patch that does so, which I will send once
this feature has landed.
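A simplified C++ sketch in the spirit of such a macro. This is not the kernel's definition; the function name and message are hypothetical, and the example relies on the compiler discarding the branch whose condition is a compile-time constant false:

```cpp
#include <cassert>

// Declared but intentionally never defined. Any call that survives to code
// generation is reported with the given message.
__attribute__((error("BUILD_BUG_ON failed")))
void build_bug_on_failed();

// Hypothetical, simplified macro: the call only remains if the compiler
// cannot prove `cond` to be false.
#define BUILD_BUG_ON(cond)                                                     \
  do {                                                                         \
    if (cond)                                                                  \
      build_bug_on_failed();                                                   \
  } while (0)

inline int checkedIntSize() {
  // Constant false, so the call is dropped and nothing is diagnosed.
  BUILD_BUG_ON(sizeof(int) < 2);
  return static_cast<int>(sizeof(int));
}
```

Unlike `_Static_assert`, the condition does not have to be an integer constant expression; it only has to be something the compiler can eliminate.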
To do so, we create a new IR level Function attribute, "dontcall" (both
error and warning boil down to one IR Fn Attr). Then, similar to calls
to inline asm, we attach a !srcloc Metadata node to call sites of such
attributed callees.
The backend diagnoses these during instruction selection, while we still
know that a call is a call (vs say a JMP that's a tail call) in an arch
agnostic manner.
The frontend then reconstructs the SourceLocation from that Metadata,
and determines whether to emit an error or warning based on the callee's
attribute.
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106030
Akira Hatanaka [Wed, 25 Aug 2021 16:55:50 +0000 (09:55 -0700)]
[Sema][ObjC] Allow conversions between pointers to ObjC pointers and pointers to structs
clang was just being conservative and trying to prevent users from
messing up the qualifier on the inner pointer type. Lifting this
restriction enables using some of the libc++ templates with ObjC pointer
arguments, which clang currently rejects.
rdar://79018677
Differential Revision: https://reviews.llvm.org/D107021
Nathan Sidwell [Fri, 9 Jul 2021 14:57:10 +0000 (07:57 -0700)]
[X86] pr51000 in-register struct return tailcalling
In-register structure returns are not special, and are handled by lowering
to multiple-value tuples. We can tail-call from non-sret fns to
structure-returning functions, except on i686 where the sret pointer
is callee-pop.
Differential Revision: https://reviews.llvm.org/D105807
Arthur Eubanks [Wed, 25 Aug 2021 17:12:51 +0000 (10:12 -0700)]
[gn build] Add missing dependency required by 832aae73
Stanislav Mekhanoshin [Tue, 3 Aug 2021 21:50:10 +0000 (14:50 -0700)]
[AMDGPU] Avoid assert for saved FP
With spilling into AGPRs enabled we cannot reliably predict
if we need to save FP or not. We may finally spill everything
into AGPRs and never touch stack. In this case we still may
save FP. This is a deficiency but not an error, so avoid the
assert.
Differential Revision: https://reviews.llvm.org/D107404
Alexey Bataev [Wed, 25 Aug 2021 14:27:03 +0000 (07:27 -0700)]
[SLP]No need to schedule/check parent for extract{element/value} instruction.
The instruction extractelement/extractvalue are not required to
be scheduled since they only depend on the source vector/aggregate (with
constant indices); the same applies to the parent basic block checks.
Improves compile time and saves scheduling budget.
Differential Revision: https://reviews.llvm.org/D108703
Rong Xu [Wed, 25 Aug 2021 16:07:34 +0000 (09:07 -0700)]
[SampleFDO] Set ProfileIsFS bit properly from the internal option
We have "-profile-isfs" internal option for text, binary, and
compactbinary format (mostly for debug and test purpose). We
need to set the related flag in FunctionSamples so that ProfileIsFS
is written to the header in extbinary format.
Differential Revision: https://reviews.llvm.org/D108707
Wenlei He [Thu, 19 Aug 2021 04:09:49 +0000 (21:09 -0700)]
[CSSPGO] Use probe inline tree to track zero size fully optimized context for pre-inliner
This is a follow-up diff for BinarySizeContextTracker to track zero size for fully optimized inlinees. When an inlinee is fully optimized away, we won't be able to get its size through symbolizing instructions, hence we will treat the corresponding context size as unknown. However, by traversing the inlined probe forest, we know what the original inlinees are regardless of optimization. If a context shows up in inlined probes, but not during symbolization, we know that it's fully optimized away, hence its size is zero instead of unknown. This should provide more accurate size cost estimation for the pre-inliner to make better inline decisions in llvm-profgen.
Differential Revision: https://reviews.llvm.org/D108350
Kazu Hirata [Wed, 25 Aug 2021 15:59:12 +0000 (08:59 -0700)]
[Transforms] Remove SplitCriticalEdge (NFC)
These functions have not been in use for at least one year.
Kirill Stoimenov [Tue, 24 Aug 2021 20:23:47 +0000 (20:23 +0000)]
[asan] Implemented intrinsic for the custom calling convention similar to the one used by HWASan for X86.
The implementation uses the int_asan_check_memaccess intrinsic to instrument the code. The intrinsic is replaced by a call to a function which performs the access check. The generated function names encode the input register name as a number using the formula Reg - X86::NoRegister.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107850
alex-t [Thu, 15 Jul 2021 16:43:56 +0000 (19:43 +0300)]
[AMDGPU] Divergence-driven compare operations instruction selection
Description: This change enables the compare operations to be selected to SALU/VALU form
depending on the SDNode divergence flag.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D106079
Neumann Hon [Wed, 25 Aug 2021 15:24:02 +0000 (11:24 -0400)]
[SystemZ] [NFC] Replace SpecialRegisters field with a unique_ptr instead of a raw pointer.
This patch replaces the SpecialRegisters field with a unique_ptr instead of a raw pointer. This is better practice, and allows us to remove the definition of the dtor for the SystemZSubtarget class.
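A minimal sketch of the pattern, using hypothetical stand-in types (`SpecialRegistersInfo`, `Subtarget`) rather than the real SystemZ classes:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the register description object.
struct SpecialRegistersInfo {
  int StackPointer = 15;
};

// Hypothetical stand-in for the subtarget class.
class Subtarget {
  // Owning pointer: the object is freed automatically when Subtarget is
  // destroyed, so no user-written destructor is needed.
  std::unique_ptr<SpecialRegistersInfo> SpecialRegisters;

public:
  Subtarget() : SpecialRegisters(std::make_unique<SpecialRegistersInfo>()) {}

  // Callers still receive a non-owning pointer, as with a raw-pointer field.
  SpecialRegistersInfo *getSpecialRegisters() const {
    assert(SpecialRegisters && "SpecialRegisters must be initialised");
    return SpecialRegisters.get();
  }
};
```

With the raw-pointer version, the class would need a destructor calling `delete` on the field; here the implicitly generated destructor is sufficient.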
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108639
Balazs Benics [Wed, 25 Aug 2021 14:47:13 +0000 (16:47 +0200)]
Revert "Revert "[analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs""
This reverts commit
df1f4e0cc6ec9a734aae41ffd48ee8b2007fcabb.
Now the test case explicitly specifies the target triple.
I decided to use x86_64 for that matter, to have a fixed
bitwidth for `size_t`.
Aside from that, relanding the original changes of:
https://reviews.llvm.org/D105184
Andrea Di Biagio [Wed, 25 Aug 2021 13:53:45 +0000 (14:53 +0100)]
[X86][SchedModel] Fix latency of the Hi register write of MULX (PR51495).
Before this patch, WriteIMulH reported a latency value which is correct for the
RR variant of MULX, but not for the RM variant.
This patch fixes the issue by introducing a new WriteIMulHLd, which is meant to
be used only by the RM variant of MULX.
Differential Revision: https://reviews.llvm.org/D108701
Vyacheslav Zakharin [Tue, 24 Aug 2021 23:19:49 +0000 (16:19 -0700)]
[CodeExtractor] Preserve topological order for the return blocks.
Differential Revision: https://reviews.llvm.org/D108673
Jon Chesterfield [Wed, 25 Aug 2021 14:53:47 +0000 (15:53 +0100)]
[openmp] Delete unused grid value field, missed from D108380
Thomas Johnson [Tue, 24 Aug 2021 18:40:04 +0000 (14:40 -0400)]
[ARC] Add ADC (addition with carry) and SBC (subtraction with carry) instructions
Differential Revision: https://reviews.llvm.org/D108672
Balazs Benics [Wed, 25 Aug 2021 14:43:25 +0000 (16:43 +0200)]
Revert "[analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs"
This reverts commit
360ced3b8fd2cfb9f2a26deb739e6c381e98b9a5.
Nicholas Guy [Mon, 16 Aug 2021 13:10:21 +0000 (14:10 +0100)]
[AArch64] Generate SMOV in place of sext(fmov(...))
A single smov instruction is capable of moving from a vector register while performing
the sign-extend during said move, rather than each step being performed by separate instructions.
Differential Revision: https://reviews.llvm.org/D108633
Balazs Benics [Wed, 25 Aug 2021 14:12:17 +0000 (16:12 +0200)]
[analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs
Currently only `ConstantArrayType` is considered for flexible array
members (FAMs) in `getStaticSize()`.
However, `IncompleteArrayType` also shows up in practice as FAMs.
This patch will ignore the `IncompleteArrayType` and return Unknown
for that case as well. This way it will be at least consistent with
the current behavior until we start modeling them accurately.
I'm expecting that this will resolve a bunch of false-positives
internally, caused by the `ArrayBoundV2`.
Reviewed By: ASDenysPetrov
Differential Revision: https://reviews.llvm.org/D105184
Jon Chesterfield [Wed, 25 Aug 2021 14:09:46 +0000 (15:09 +0100)]
[libomptarget][amdgpu][nfc] Make grid value access match devicertl
Jeremy Morse [Wed, 25 Aug 2021 13:56:05 +0000 (14:56 +0100)]
[DebugInfo][InstrRef] Don't use instr-ref for unoptimised functions
InstrRefBasedLDV is marginally slower than VarlocBasedLDV when analysing
optimised code -- however, it's much slower when analysing code compiled
-O0.
To avoid this: don't use instruction referencing for -O0 functions. In the
"pure" case of unoptimised code, this won't really harm the debugging
experience because most variables won't have been promoted off the stack,
so can't go missing. It becomes more complicated when optimised code is
inlined into functions marked optnone; however these are rare, and as -O0
doesn't run many optimisations there should be little damage to the debug
experience as a result.
I've taken the opportunity to refactor testing for instruction-referencing
into a MachineFunction method, which seems the most appropriate place to
put it.
Differential Revision: https://reviews.llvm.org/D108585
Jon Chesterfield [Wed, 25 Aug 2021 13:57:50 +0000 (14:57 +0100)]
[libomptarget][amdgpu] Refactor debug printing
Move most debug printing in rtl.cpp behind DP() macro
Adjust the print output for gpu arch mismatch when the architectures match
Convert an assert into graceful failure
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108562
Joe Nash [Tue, 24 Aug 2021 18:40:04 +0000 (14:40 -0400)]
[AMDGPU] Support global_atomic_fmin/max on gfx10
Makes patterns added for gfx90a usable with the gfx10 versions of the
insts.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D108654
Change-Id: I86167bf6b4823f975f74ccb619bd6190331ba16b
Andrea Di Biagio [Wed, 25 Aug 2021 13:15:17 +0000 (14:15 +0100)]
[X86][NFC] Pre-commit llvm-mca tests for PR51495.
WriteIMulH reports an incorrect latency for RM variants of MULX.
Louis Dionne [Tue, 24 Aug 2021 15:58:36 +0000 (11:58 -0400)]
[libc++] Assume that compilers support extended constexpr in C++14 mode
We don't support any compiler that doesn't support C++14 constexpr when
compiling in C++14 mode anymore, so we can just assume that we have C++14
extended constexpr when compiling in C++14 mode. This allows us to remove
some workarounds for older compilers.
Differential Revision: https://reviews.llvm.org/D108638
Florian Hahn [Wed, 25 Aug 2021 10:58:49 +0000 (11:58 +0100)]
[LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks.
Support for peeling with multiple exit blocks was added in D63921/
77bb3a486fa6.
So far it has only been enabled for loops where all non-latch exits are
'de-optimizing' exits (D63923). But peeling of multi-exit loops can be
highly beneficial in other cases too, like if all non-latch exiting
blocks are unreachable.
The motivating cases are loops with runtime checks, like the C++ example
below. The main issue preventing vectorization is that the invariant
accesses to load the bounds of B are conditionally executed in the loop
and cannot be hoisted out. If we peel off the first iteration, they
become dereferenceable in the loop, because they must execute before the
loop is executed, as all non-latch exits are terminated with
unreachable. This subsequently allows hoisting the loads and runtime
checks out of the loop, allowing vectorization of the loop.
  int sum(std::vector<int> *A, std::vector<int> *B, int N) {
    int cost = 0;
    for (int i = 0; i < N; ++i)
      cost += A->at(i) + B->at(i);
    return cost;
  }
This gives a ~20-30% increase of score for Geekbench5/HDR on AArch64.
Note that this requires a follow-up improvement to the peeling cost
model to actually peel iterations off loops as above. I will share that
shortly.
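For illustration only, a hand-peeled source-level version of the `sum` example might look like the sketch below. The pass itself operates on LLVM IR, and `sumPeeled` is a hypothetical name; the point is that the first iteration runs unconditionally before the remaining loop, so the accesses it performs are known to have executed.

```cpp
#include <cassert>
#include <vector>

// Hand-peeled sketch: iteration 0 is executed before the loop, and the loop
// then starts at i = 1. Behaviour matches the unpeeled function.
int sumPeeled(std::vector<int> *A, std::vector<int> *B, int N) {
  int cost = 0;
  if (N > 0) {
    // Peeled first iteration.
    cost += A->at(0) + B->at(0);
    // Remaining iterations.
    for (int i = 1; i < N; ++i)
      cost += A->at(i) + B->at(i);
  }
  return cost;
}
```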
Also, peeling of multi-exits might be beneficial for exit blocks with
other terminators, but I would like to keep the scope limited to known
high-reward cases for now.
I removed the option to disable peeling for multi-deopt exits because
the code is more general now. Alternatively, the option could also be
generalized, but I am not sure if there's much value in the option?
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D108108
Dawid Jurczak [Wed, 25 Aug 2021 11:13:18 +0000 (13:13 +0200)]
[LoopIdiom] Don't transform loop into memmove when load from body has more than one use
This change fixes an issue found by Markus: https://reviews.llvm.org/rG11338e998df1
Before this patch, the following code was transformed to memmove:
  for (int i = 15; i >= 1; i--) {
    p[i] = p[i-1];
    sum += p[i-1];
  }
However, the load from p[i-1] is used not only by the store to p[i] but also by the sum computation.
Therefore we cannot emit memmove in the loop header.
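A small sketch of why the extra use matters. The function names are hypothetical, and the second function only models what hoisting a memmove in front of the remaining loads would do; it is not the code the pass generated.

```cpp
#include <cassert>
#include <cstring>

// Original loop: every p[i-1] is read before it is overwritten, so sum
// accumulates the old values p[0..14].
int shiftAndSum(int *p) {
  int sum = 0;
  for (int i = 15; i >= 1; i--) {
    p[i] = p[i - 1];
    sum += p[i - 1];
  }
  return sum;
}

// Model of the incorrect transform: the copy happens first, so the loads
// feeding sum observe already-shifted values.
int shiftAndSumHoisted(int *p) {
  std::memmove(p + 1, p, 15 * sizeof(int));
  int sum = 0;
  for (int i = 15; i >= 1; i--)
    sum += p[i - 1];
  return sum;
}
```

For an array initialised to 0..15, the first function returns 105 while the second returns 91, even though both leave the array in the same final state.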
Differential Revision: https://reviews.llvm.org/D107964
Jan Kuehle [Wed, 25 Aug 2021 12:11:43 +0000 (14:11 +0200)]
[clang-format] Support TypeScript override keyword
TypeScript 4.3 added a new "override" keyword for class members. This
lets clang-format know about it, so it can format code using it
properly.
Reviewed By: krasimir
Differential Revision: https://reviews.llvm.org/D108692
Peilin Guo [Wed, 25 Aug 2021 11:31:00 +0000 (19:31 +0800)]
[DAGCombine] Check the legality of the index of EXTRACT_SUBVECTOR
For ISD::EXTRACT_SUBVECTOR, its second operand must be a constant
multiple of the known-minimum vector length of the result type.
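The constraint can be sketched as a simple predicate. This is a hypothetical helper for illustration, not the check added to DAGCombiner:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate: an EXTRACT_SUBVECTOR index is acceptable only if
// it is a multiple of the result type's known-minimum element count.
bool isLegalExtractSubvectorIndex(std::uint64_t Index,
                                  std::uint64_t ResultMinNumElts) {
  assert(ResultMinNumElts != 0 && "result vector must have elements");
  return Index % ResultMinNumElts == 0;
}
```

For a result type with four elements, indices 0, 4 and 8 pass, while 2 does not.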
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D107795
Jeremy Morse [Wed, 25 Aug 2021 11:04:59 +0000 (12:04 +0100)]
[DebugInfo][InstrRef] Avoid stack-slot-coloring changing codegen due to DI
Stack slot colouring adds "weight" to slots if a non-dbg-value instruction
refers to it. This, unfortunately, means that DBG_PHI instructions can have
an effect on codegen. The fix is very simple, replace isDebugValue with
isDebugInstr.
The regression test contains a scenario that reproduces this problem; I've
represented both normal-debug mode and instr-ref debug mode instructions
in comment lines prefixed with AAAAAA and BBBBBB, and un-comment them with
sed to test that the two different modes produce the same behaviour.
Differential Revision: https://reviews.llvm.org/D108627
River Riddle [Wed, 25 Aug 2021 09:26:56 +0000 (09:26 +0000)]
[mlir][AttrTypeGen] Add support for specifying an "accessor" type of a parameter
This allows for using a different type when accessing a parameter than the
one used for storage. This allows for returning parameters by reference,
enables using more optimized/convenient reference results, and more.
Differential Revision: https://reviews.llvm.org/D108593
River Riddle [Wed, 25 Aug 2021 09:26:39 +0000 (09:26 +0000)]
[mlir] Update DialectAsmParser::parseString to use std::string instead of StringRef
This allows for parsing strings that have escape sequences, which require constructing
a string (as they can't be represented by looking at the Token contents directly).
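To see why an owning string is needed, consider a hypothetical unescape helper (not MLIR's implementation, and handling only a few escapes). The decoded text differs from the raw token bytes, so it cannot be a view into them:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decodes \n, \t, \" and \\ from the raw token text.
// The result must be a new std::string because its bytes no longer match
// the source buffer.
std::string unescape(const std::string &Raw) {
  std::string Out;
  Out.reserve(Raw.size());
  for (std::string::size_type I = 0; I < Raw.size(); ++I) {
    // Ordinary character, or a trailing backslash with nothing after it.
    if (Raw[I] != '\\' || I + 1 == Raw.size()) {
      Out.push_back(Raw[I]);
      continue;
    }
    switch (Raw[++I]) {
    case 'n':  Out.push_back('\n'); break;
    case 't':  Out.push_back('\t'); break;
    case '"':  Out.push_back('"');  break;
    case '\\': Out.push_back('\\'); break;
    default:
      // Unknown escape: keep it verbatim.
      Out.push_back('\\');
      Out.push_back(Raw[I]);
    }
  }
  return Out;
}
```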
Differential Revision: https://reviews.llvm.org/D108589
River Riddle [Wed, 25 Aug 2021 09:26:23 +0000 (09:26 +0000)]
[mlir] Move the Operation use iteration utilities to ResultRange
This allows for iterating and interacting with the uses of a specific subset of
results as opposed to just the full range.
Differential Revision: https://reviews.llvm.org/D108586
Jean Perier [Wed, 25 Aug 2021 09:15:22 +0000 (11:15 +0200)]
[flang] Implement Posix version of DATE_AND_TIME runtime
Use gettimeofday and localtime_r to implement DATE_AND_TIME intrinsic.
The Windows version falls back to the "no date and time information
available" result defined by the standard (strings set to blanks and values to
-HUGE).
The implementation uses an ifdef to separate Windows from the rest because,
in my tests, the SFINAE approach leads to bogus undeclared-name errors
with clang 8, which seems to ignore that failure to instantiate is not an error
for the function names (i.e., it understands that it should not instantiate
the version using gettimeofday if it is not there, but still yields an
error that it is not declared at the point where it is called in the
uninstantiated version).
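A rough POSIX-only sketch of the gettimeofday/localtime_r combination. `currentDate` is a hypothetical helper that produces only the CCYYMMDD string; it is not the flang runtime code:

```cpp
#include <cassert>
#include <ctime>
#include <string>
#include <sys/time.h>

// Hypothetical helper: returns the local date as an 8-character CCYYMMDD
// string, or 8 blanks if the time cannot be obtained.
std::string currentDate() {
  timeval tv;
  if (gettimeofday(&tv, nullptr) != 0)
    return std::string(8, ' ');
  std::time_t secs = tv.tv_sec;
  std::tm local;
  // localtime_r is the re-entrant variant: it writes into our own buffer.
  if (localtime_r(&secs, &local) == nullptr)
    return std::string(8, ' ');
  char buf[9];
  if (std::strftime(buf, sizeof buf, "%Y%m%d", &local) == 0)
    return std::string(8, ' ');
  return buf;
}
```

The blank-string fallback mirrors the "no information available" convention mentioned above for the string results.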
Differential Revision: https://reviews.llvm.org/D108622
Jan Svoboda [Wed, 25 Aug 2021 08:56:15 +0000 (10:56 +0200)]
[clang][deps] Ensure deterministic order of TU '-fmodule-file=' arguments
Translation units with multiple direct modular dependencies trigger a non-deterministic ordering in `clang-scan-deps`. This boils down to usage of `std::unordered_map`, which gets replaced by `std::map` in this patch.
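The effect of the container choice can be sketched as follows. `moduleFileArgs` is a hypothetical helper, not the clang-scan-deps code; it relies only on the guarantee that `std::map` iterates in sorted key order, whereas `std::unordered_map` makes no ordering promise:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: turns a module-name -> file-path table into
// '-fmodule-file=' arguments. Iterating a std::map visits keys in sorted
// order, so the output order does not depend on insertion order or hashing.
std::vector<std::string>
moduleFileArgs(const std::map<std::string, std::string> &Modules) {
  std::vector<std::string> Args;
  for (const auto &Entry : Modules) {
    assert(!Entry.second.empty() && "module file path must not be empty");
    Args.push_back("-fmodule-file=" + Entry.second);
  }
  return Args;
}
```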
Depends on D103526.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103807
Rosie Sumpter [Tue, 24 Aug 2021 16:04:35 +0000 (17:04 +0100)]
[LoopFlatten] Add statistic for number of loops flattened. NFC
Differential Revision: https://reviews.llvm.org/D108644
Tres Popp [Mon, 23 Aug 2021 17:24:12 +0000 (19:24 +0200)]
[mlir] Add assertion in NamedAttrList to prevent adding null attributes
Differential Revision: https://reviews.llvm.org/D108570
LLVM GN Syncbot [Wed, 25 Aug 2021 09:02:08 +0000 (09:02 +0000)]
[gn build] Port
48958d02d294