Jessica Paquette [Wed, 28 Sep 2022 23:20:24 +0000 (16:20 -0700)]
[AArch64][GlobalISel] Make G_PTRTOINT only legal for s64 + p0
A few issues:
1. There was no legalizer test for G_PTRTOINT
2. Same clamping issue as in many other opcodes
3. AArch64 pointers can only be 64b, so in reality we always have to trunc or
extend with any size other than p0 anyway.
This seems to actually produce more correct selection for narrow types as well.
Differential Revision: https://reviews.llvm.org/D107588
Philip Reames [Wed, 28 Sep 2022 22:47:25 +0000 (15:47 -0700)]
[RISCV] Add test coverage for upcoming select lowering optimization
Test copied from X86 backend since I'm going to be taking the code from there too.
Jessica Paquette [Wed, 28 Sep 2022 23:01:19 +0000 (16:01 -0700)]
[AArch64][GlobalISel] Implement custom legalization for s32/s64 G_FCOPYSIGN
This is intended to be equivalent to the s32 + s64 cases in
AArch64TargetLowering::LowerFCOPYSIGN.
Widen everything and then use G_BIT + a mask to handle the actual copysign
operation. Then, narrow back down to s32/s64.
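The mask-based copysign can be sketched on raw IEEE-754 bit patterns. This is only an illustration in plain C++, not the legalizer code: `bitInsert` and `copySignBits32` are hypothetical names, and the operands are assumed to be binary32 values already reinterpreted as 32-bit integers.

```cpp
#include <cstdint>

// Hypothetical helper: take bits from Src where Mask is 1, keep Dst elsewhere.
constexpr std::uint32_t bitInsert(std::uint32_t Dst, std::uint32_t Src,
                                  std::uint32_t Mask) {
  return (Dst & ~Mask) | (Src & Mask);
}

// copysign on binary32 bit patterns: magnitude from Mag, sign bit from Sign.
constexpr std::uint32_t copySignBits32(std::uint32_t Mag, std::uint32_t Sign) {
  return bitInsert(Mag, Sign, 0x80000000u); // bit 31 is the sign bit
}

// 1.0f is 0x3F800000 and -2.0f is 0xC0000000, so copysign(1.0f, -2.0f)
// has the bit pattern of -1.0f, which is 0xBF800000.
static_assert(copySignBits32(0x3F800000u, 0xC0000000u) == 0xBF800000u,
              "sign bit is taken from the second operand");
```

Only the sign bit is selected by the mask, so the exponent and mantissa of the magnitude operand pass through untouched.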
I wasn't sure about what the best/most canonical INSERT_SUBREG-selectable
pattern is. I chose G_INSERT_VECTOR_ELT + an undef vector because it produces
reasonably okay codegen. (It doesn't produce INSERT_SUBREG right now though.)
If there's a better way to do this then I'm happy to change it.
We also have a couple codegen deficiencies with how we emit vector constants
right now. (We need a GISel equivalent to the tryAdvSIMDModImm64 stuff)
Differential Revision: https://reviews.llvm.org/D108725
Rafael Auler [Wed, 28 Sep 2022 22:59:58 +0000 (15:59 -0700)]
[PERF2BOLT] Fix unittest failure
Fix failure caused by commit e549ac072b "Do not issue parsing error on
weird build ids".
Florian Mayer [Wed, 28 Sep 2022 00:46:53 +0000 (17:46 -0700)]
[MTE] [HWASan] unify isInterestingAlloca
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D134779
Jessica Paquette [Wed, 28 Sep 2022 22:48:35 +0000 (15:48 -0700)]
[AArch64][GlobalISel] Add a target-specific G_BIT opcode.
This is necessary for custom-legalizing G_FCOPYSIGN.
This is equivalent to the BIT instruction (bitwise insert if true).
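As a rough model of "bitwise insert if true", assuming the usual reading that a destination bit is replaced by the source bit wherever the mask bit is set and is left alone otherwise, the operation on one 64-bit lane might look like the sketch below. `bitwiseInsertIfTrue` is a hypothetical name used for illustration, not an LLVM API.

```cpp
#include <cstdint>

// Model of one 64-bit lane: where Mask has a 1, the result bit comes from
// Src; where Mask has a 0, the result bit is the original Dst bit.
constexpr std::uint64_t bitwiseInsertIfTrue(std::uint64_t Dst,
                                            std::uint64_t Src,
                                            std::uint64_t Mask) {
  return Dst ^ ((Dst ^ Src) & Mask);
}

// The low 32 bits are inserted from Src, the high 32 bits of Dst are kept.
static_assert(bitwiseInsertIfTrue(0xFFFF0000FFFF0000ull, 0x1234567812345678ull,
                                  0x00000000FFFFFFFFull) ==
                  0xFFFF000012345678ull,
              "only masked bits are replaced");
```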
Add selection testcases for imported patterns.
Differential Revision: https://reviews.llvm.org/D108714
Jessica Paquette [Tue, 27 Sep 2022 21:26:37 +0000 (14:26 -0700)]
[llvm-remarkutil] Add an option to print out function sizes
This adds an `instruction-count` command to llvm-remarkutil.
```
llvm-remarkutil instruction-count --parser=<bitstream|yaml> <file>
```
This will, for now, only print out asm-printer `InstructionCount` remarks.
Frequently I need to find out things like "what are the top 10 largest
functions" in a given project.
This makes it so we can find that information quickly and easily from any
format of remarks.
I chose a CSV because I usually want to stick these into a spreadsheet, and
the data is two-dimensional.
In the future, we may want to change this to another format if we add more
complicated data.
Differential Revision: https://reviews.llvm.org/D134765
Jessica Paquette [Wed, 28 Sep 2022 22:43:26 +0000 (15:43 -0700)]
[GlobalISel] Add isConstFalseVal helper to Utils
Add a utility function which returns true if the given value is a constant
false value.
This is necessary to port one of the compare simplifications in
TargetLowering::SimplifySetCC.
Differential Revision: https://reviews.llvm.org/D91754
Greg Clayton [Fri, 23 Sep 2022 00:54:06 +0000 (17:54 -0700)]
Track which modules have debug info variable errors.
Now that we display an error when users try to get variables, but something in the debug info is preventing variables from showing up, track this with a new bool in each module's statistic information named "debugInfoHadVariableErrors".
This patch modifies the code to track when we have variable errors in a module and adds accessors to get/set this value. This value is used in the module statistics and we added a test to verify this value gets set correctly.
Differential Revision: https://reviews.llvm.org/D134508
Greg Clayton [Wed, 21 Sep 2022 03:58:08 +0000 (20:58 -0700)]
When there are variable errors, display an error in VS Code's local variables view.
After recent diffs that enable variable errors that stop variables from being correctly displayed when debugging, allow users to see these errors in the LOCALS variables in the VS Code UI. We do this by detecting when no variables are available and when there is an error to be displayed, and we add a single variable named "<error>" whose value is a string error that the user can read. This allows the user to be aware of the reason variables are not available and fix the issue. Previously if someone enabled "-gline-tables-only" or was debugging with DWARF in .o files or with .dwo files and those separate object files were missing or they were out of date, the user would see nothing in the variables view. Communicating these errors to the user is essential to a good debugging experience.
Differential Revision: https://reviews.llvm.org/D134333
Benjamin Kramer [Wed, 28 Sep 2022 22:12:46 +0000 (00:12 +0200)]
Rafael Auler [Fri, 23 Sep 2022 01:32:48 +0000 (18:32 -0700)]
[PERF2BOLT] Do not issue parsing error on weird build ids
In weird entries we were issuing a parse error. For example, in line 5 here:
6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
db758cb3c970044e78d5a4c99b011708a9995636 bin1
60326683eab31acfd03435d9ed4ff9a8 bin2
7d448e51851b4bdb33eac84f90e74628a14a5f00 b.so
742aa26e0211794356cc25f415c25230a26aa045 c.so
Error reading BOLT data input file: line 89, column 33: malformed field
Fix that.
Reviewed By: #bolt, Amir
Differential Revision: https://reviews.llvm.org/D134822
Fangrui Song [Wed, 28 Sep 2022 21:32:26 +0000 (14:32 -0700)]
[ELF] Remove unused Symbol::getSymbolSize. NFC
Kirsten Lee [Wed, 28 Sep 2022 19:18:24 +0000 (12:18 -0700)]
[mlir][transform] Add multi-buffering to the transform dialect
Add the plumbing necessary to call the memref dialect's multiBuffer
function. This will allow separation between choosing which buffers
to multi-buffer and the actual transform.
Alter the multibuffer function to return the newly created
allocation if multi-buffering succeeds. This is necessary to
communicate with the transform dialect hooks what allocation
multi-buffering created.
Reviewed By: ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D133985
LLVM GN Syncbot [Wed, 28 Sep 2022 20:36:01 +0000 (20:36 +0000)]
[gn build] Port e61d89efd78b
Daniel Thornburgh [Tue, 23 Aug 2022 20:39:33 +0000 (13:39 -0700)]
[NFC] [Object] Create library to fetch debug info by build ID.
This creates a library for fetching debug info by build ID, whether
locally or remotely via debuginfod. The functionality was refactored
out of existing code in the Symbolize library. Existing utilities
were refactored to use this library.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D132504
Mahesh Ravishankar [Wed, 7 Sep 2022 20:25:17 +0000 (20:25 +0000)]
[mlir][TilingInterface] NFC Refactor of tile and fuse using `TilingInterface`.
This patch refactors the tiling and tile + fuse implementation using
`TilingInterface`. Primarily, it exposes the functionality as simple
utility functions instead of as a Pattern to allow calling it from a
pattern as it is done in the test today or from within the transform
dialect (in the future). This is a step towards deprecating similar
methods in Linalg dialect.
- The utility methods do not erase the root operations.
- The return value provides the values to use for replacements.
Differential Revision: https://reviews.llvm.org/D134144
Xiang Li [Mon, 5 Sep 2022 08:02:01 +0000 (01:02 -0700)]
[DirectX backend] Support global ctor for DXILBitcodeWriter.
1. Save typed pointer type for GlobalVariable/Function instead of the ObjectType.
This will allow using GlobalVariable/Function as a value.
2. Save target type for global ctors for Constant.
3. In DXILBitcodeWriter::getTypeID, check PointerMap first for Constant case.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D133283
Arthur Eubanks [Wed, 28 Sep 2022 20:13:40 +0000 (13:13 -0700)]
[test][msan] Pin varg.cpp to -fno-sanitize-memory-param-retval
Should fix https://lab.llvm.org/buildbot#builders/19/builds/12736
Stanislav Mekhanoshin [Tue, 27 Sep 2022 15:16:20 +0000 (08:16 -0700)]
[AMDGPU] Move SIModeRegisterDefaults to SI MFI
It does not belong to a general AMDGPU MFI.
Differential Revision: https://reviews.llvm.org/D134666
Fangrui Song [Wed, 28 Sep 2022 20:11:31 +0000 (13:11 -0700)]
[ELF] Refactor Symbol initialization and overwriting
Symbol::replace intends to overwrite a few fields (mostly Elf{32,64}_Sym
fields), but the implementation copies all fields then restores some old fields.
This is error-prone and wasteful. Add Symbol::overwrite to copy just the
needed fields and add other overwrite member functions to copy the extra
fields.
Nico Weber [Wed, 28 Sep 2022 19:40:50 +0000 (15:40 -0400)]
try to fix build yet more after 16544cbe64b8
Nico Weber [Wed, 28 Sep 2022 19:35:43 +0000 (15:35 -0400)]
try to fix build more after 16544cbe64b8
Martin Sebor [Wed, 28 Sep 2022 19:24:54 +0000 (13:24 -0600)]
[clang] handle extended integer constant expressions in _Static_assert (PR #57687)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134311
Nico Weber [Wed, 28 Sep 2022 19:18:15 +0000 (15:18 -0400)]
try to fix build after 16544cbe64b8
Yonghong Song [Mon, 26 Sep 2022 14:08:24 +0000 (07:08 -0700)]
[clang][DebugInfo] Emit debuginfo for non-constant case value
Currently, clang does not emit debuginfo for the switch stmt
case value if it is an enum value. For example,
$ cat test.c
enum { AA = 1, BB = 2 };
int func1(int a) {
  switch(a) {
  case AA: return 10;
  case BB: return 11;
  default: break;
  }
  return 0;
}
$ llvm-dwarfdump test.o | grep AA
$
Note that gcc does emit debuginfo for the same test case.
This patch adds such support with an implementation similar to
CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch,
$ clang -g -c test.c
$ llvm-dwarfdump test.o | grep AA
DW_AT_name ("AA")
$
Differential Revision: https://reviews.llvm.org/D134705
serge-sans-paille [Wed, 28 Sep 2022 16:39:23 +0000 (18:39 +0200)]
[iwyu] Move <cmath> out of llvm/Support/MathExtras.h
Interestingly, MathExtras.h doesn't use any <cmath> declarations, so move the
include out of that header and add it where needed.
No functional change intended, but there's no longer a transitive include
from MathExtras.h to cmath.
serge-sans-paille [Wed, 28 Sep 2022 14:53:43 +0000 (16:53 +0200)]
[iwyu] Move <iostream> out of llvm/DebugInfo/Symbolize/Markup.h header
It's only used in the implementation. No functional change intended.
Aiden Grossman [Wed, 28 Sep 2022 18:18:50 +0000 (18:18 +0000)]
[MLGO] Add per-instruction MBB frequencies to regalloc dev features
This commit adds in two new features to the ML regalloc eviction
analysis that can be used in ML models, a vector of MBB frequencies and
a vector of indices mapping instructions to their corresponding basic
blocks. This will allow for further experimentation with per-instruction
features and give a lot more flexibility for future experimentation over
how we're extracting MBB frequency data currently.
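One way to picture the two new features: a consumer can recover a per-instruction frequency by indexing the block-frequency vector through the instruction-to-block index vector. The sketch below uses made-up data and a hypothetical `frequencyOfInstruction` helper. It does not reflect the actual feature tensor layout.

```cpp
#include <cstddef>

// Made-up data: three basic blocks and five instructions.
constexpr double MBBFrequencies[3] = {1.0, 8.0, 0.5};
// InstructionToMBB[I] is the index of the block that contains instruction I.
constexpr std::size_t InstructionToMBB[5] = {0, 0, 1, 1, 2};

// Hypothetical helper: frequency of the block an instruction belongs to.
constexpr double frequencyOfInstruction(std::size_t InstrIdx) {
  return MBBFrequencies[InstructionToMBB[InstrIdx]];
}

static_assert(frequencyOfInstruction(3) == 8.0,
              "instruction 3 lives in block 1");
```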
Reviewed By: mtrofin, jacobhegna
Differential Revision: https://reviews.llvm.org/D134166
Jay Foad [Wed, 28 Sep 2022 14:20:08 +0000 (15:20 +0100)]
[ISel] Fix DAG divergence after new FMA combine
D132837 introduced a new DAG combine that used MorphNodeTo to morph an
FMUL into an FMA. It turns out that MorphNodeTo does not properly update
the divergence bit for users of the morphed node, causing an assertion
failure on the new test case:
llc: SelectionDAG.cpp:10486: void llvm::SelectionDAG::VerifyDAGDivergence(): Assertion `calculateDivergence(N) == N->isDivergent() && "Divergence bit inconsistency detected"' failed.
Fixing MorphNodeTo to propagate the divergence bit is tricky because of
the way it is used to select machine instructions, so use getNode and
ReplaceAllUsesOfValueWith instead.
Differential Revision: https://reviews.llvm.org/D134810
Aaron Ballman [Wed, 28 Sep 2022 18:33:52 +0000 (14:33 -0400)]
Repairing the release notes
A code block was separated from its release note, so this re-associates
them again. It also adds an example code block to another potentially
breaking change entry.
Qiongsi Wu [Wed, 28 Sep 2022 18:10:47 +0000 (14:10 -0400)]
[LTO][AIX] Invoking AIX System Assembler in LTO CodeGen
This patch teaches LTOCodeGenerator to call into the AIX system assembler to generate object files. This is in contrast to the approach taken on other platforms, where the LTOCodeGenerator calls the integrated assembler to generate object files. We need to rely on the system assembler because the integrated assembler is incomplete at the moment.
Reviewed By: w2yehia, MaskRay
Differential Revision: https://reviews.llvm.org/D134375
Aaron Ballman [Wed, 28 Sep 2022 18:21:30 +0000 (14:21 -0400)]
Fix a tautological comparison bug caught during post-commit
This amends fd874e5fb119e1d9f427a299ffa5bbabaeba9455 to correctly set
the bit width of a '!' operator to be the same width as an 'int'. This
fixes a failed assertion about unexpected bit widths that was reported
during post-commit testing.
rkayaith [Wed, 28 Sep 2022 16:53:28 +0000 (12:53 -0400)]
[mlir] Use 'GEN_PASS_DECL' for pass declarations
Most dialects use a single header for all their passes; switch
them over to using `GEN_PASS_DECL` instead of the individual macros.
Reviewed By: jpienaar, mehdi_amini, mscuttari
Differential Revision: https://reviews.llvm.org/D134814
Mingming Liu [Mon, 19 Sep 2022 00:33:09 +0000 (17:33 -0700)]
[SimplifyCFG][TransformUtils] Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.
- Before this patch, the block's loop metadata (if any) overrode the metadata of each predecessor; if a predecessor block already had loop metadata, that original metadata was not preserved, which could cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll).
To illustrate how inner-loop metadata might be dropped before this patch:
CFG Before
      entry
        |
        v
 ---> while.cond  ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md1)
 |       |  |______|
 |       v
 |    while.cond.exit (md2)
 |       |
 |_______|
CFG After
      entry
        |
        v
 ---> while.cond.rewrite  ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md2)
 |_______|  |______|
Basically, when 'while.cond.exit' is folded into 'while.cond', 'md2' overrides 'md1' and 'md1' is dropped from the CFG.
Differential Revision: https://reviews.llvm.org/D134152
Aart Bik [Wed, 28 Sep 2022 00:06:20 +0000 (17:06 -0700)]
[mlir][sparse] add "sort" to the compress op codegen
This revision also adds convenience methods to test the
dim level type/property (with the codegen being first client)
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D134776
Aaron Ballman [Wed, 28 Sep 2022 17:38:57 +0000 (13:38 -0400)]
Speculatively fix the lldb build
This should fix the issues found by:
https://lab.llvm.org/buildbot/#/builders/68/builds/40172
Fangrui Song [Wed, 28 Sep 2022 17:39:31 +0000 (10:39 -0700)]
[ELF] Symbols: remove isPlaceholder() test for Defined/CommonSymbol. NFC
Aaron Ballman [Wed, 28 Sep 2022 17:25:58 +0000 (13:25 -0400)]
[C2x] implement typeof and typeof_unqual
This implements WG14 N2927 and WG14 N2930, which together define the
feature for typeof and typeof_unqual, which get the type of their
argument as either fully qualified or fully unqualified. The argument
to either operator is either a type name or an expression. If given a
type name, the type information is pulled directly from the given name.
If given an expression, the type information is pulled from the
expression. Recursive use of these operators is allowed and has the
expected behavior (the innermost operator is resolved to a type, and
that's used to resolve the next layer of typeof specifier, until a
fully resolved type is determined).
Note, we already supported typeof in GNU mode as a non-conforming
extension and we are *not* exposing typeof_unqual as a non-conforming
extension in that mode, nor are we exposing typeof or typeof_unqual as
a nonconforming extension in other language modes. The GNU variant of
typeof supports a form where the parentheses are elided from the
operator when given an expression (e.g., typeof 0 i = 12;). When in C2x
mode, we do not support this extension.
Differential Revision: https://reviews.llvm.org/D134286
Huan Nguyen [Wed, 28 Sep 2022 17:25:52 +0000 (19:25 +0200)]
[BOLT] Disable -lite when split function is present
In lite mode, BOLT only transforms a subset of functions, leaving the
remaining functions intact.
For NoPIC, it is fine. BOLT can scan relocations and fix-up all refs
that point to any function body in the subset.
For no-split function PIC, it is fine. Since jump tables are intra-
procedural transfer, BOLT can find both the jump table base and the
target within same function. Thus, BOLT can update and/or move jump
tables.
However, it is wrong to process a subset of functions in split function
PIC. This is because BOLT does not know if functions in the subset are
isolated, i.e., cannot be accessed by functions out of the subset,
especially via split jump table.
For example, BOLT only processes three functions A, B and C. Suppose that
A is reached via jump table from A.cold, which is not processed. When
A is moved (due to optimization), the jump table in A.cold is invalid.
We cannot fix-up this jump table since it is only recognized in A.cold,
which BOLT does not process.
Solution: Disable lite mode if split function is present.
Future improvement: In lite mode, if split function is found, BOLT
processes both functions in the subset and all of their sibling
fragments.
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D131283
Craig Topper [Wed, 28 Sep 2022 16:54:05 +0000 (09:54 -0700)]
[RISCV][SelectionDAGBuilder] Fix crash when copying a v1f32 vector between basic blocks.
On a rv64 without f32 or vector support, this will be passed across
the basic block as an i64. We need to use i32 as an intermediate type
with bitcast and anyext/trunc.
Fixes PR58025
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D134758
Slava Zakharin [Thu, 22 Sep 2022 22:45:10 +0000 (15:45 -0700)]
[flang][runtime] Fixed identity value for REAL(16) == __float128.
std::numeric_limits<__float128>::max/lowest return 0.0, so recreate
value of FLT128_MAX ourselves to avoid using quadmath.h's FLT128_MAX
that currently causes warnings with GCC -Wpedantic.
Differential Revision: https://reviews.llvm.org/D134496
Baptiste [Wed, 28 Sep 2022 16:59:30 +0000 (12:59 -0400)]
[AMDGPU] Avoid flushing the vmcnt counter in loop preheaders if not necessary
One of the conditions to flush the vmcnt counter in loop preheaders is: The loop
contains a use of a vgpr that is defined out of the loop. The code currently
checks if a waitcnt is needed by looking at the score of that vgpr in the score
brackets. This is not enough and may cause the generation of an unnecessary
vmcnt flush. This patch fixes that case.
Differential Revision: https://reviews.llvm.org/D130313
Matt Arsenault [Tue, 20 Sep 2022 21:57:40 +0000 (17:57 -0400)]
AtomicExpand: Add some more overaligned atomic tests
Matt Arsenault [Tue, 20 Sep 2022 19:29:47 +0000 (15:29 -0400)]
AtomicExpand: Use llvm.ptrmask instead of ptrtoint
This removes the ptrtoint from the load's pointer operand, although we
can't entirely eliminate these to get the LSB shift. In a future
patch, this will avoid ptrtoint in the case where the atomic is
overaligned to the word size.
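The address arithmetic behind this can be sketched on plain integers. Assuming a 4-byte word and a little-endian target (both assumptions for illustration, as are the helper names), the word-aligned address comes from masking off the low address bits, which is the part a pointer mask can express without leaving the pointer domain, while the low bits still have to be read as an integer to compute the shift.

```cpp
#include <cstdint>

constexpr std::uint64_t WordSize = 4; // bytes; assumed for this sketch

// Clear the low address bits to get the containing word's address.
constexpr std::uint64_t alignedWordAddress(std::uint64_t Addr) {
  return Addr & ~(WordSize - 1);
}

// Bit offset of the addressed byte inside its word (little-endian assumed).
constexpr unsigned shiftInBits(std::uint64_t Addr) {
  return static_cast<unsigned>(Addr & (WordSize - 1)) * 8;
}

static_assert(alignedWordAddress(0x1003) == 0x1000, "low two bits cleared");
static_assert(shiftInBits(0x1003) == 24, "byte 3 of the word starts at bit 24");
```

When the access is already aligned to the word size, the low bits are zero and the shift is zero too.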
Nicolas Lesser [Wed, 7 Sep 2022 00:33:54 +0000 (20:33 -0400)]
[C++2a] P0634r3: Down with typename!
This patch implements P0634r3 that removes the need for 'typename' in certain contexts.
For example,
```
template <typename T>
using foo = T::type; // ok
```
This is also allowed in previous language versions as an extension, because I think it's pretty useful. :)
Reviewed By: #clang-language-wg, erichkeane
Differential Revision: https://reviews.llvm.org/D53847
Alan Zhao [Sat, 24 Sep 2022 00:09:30 +0000 (17:09 -0700)]
Add missing `struct` keyword to the test p2-2.cpp
While working on D53847, I noticed that this test would fail once we
started recognizing the types in the modified `export` statement [0].
The tests would fail because Clang would emit a "declaration does not
declare anything" diagnostic instead of the expected namespace scope
diagnostic.
I believe that the test is currently incorrectly passing because Clang
doesn't parse the type and therefore doesn't treat the statement as a
declaration. My understanding is that the intention of this test case is
that it wants to export a `struct` type, which I believe requires a
`struct` keyword, even for types with template parameters. With this
change, the only error with these two statements should be the
namespace scope issue.
[0]: https://reviews.llvm.org/D53847?id=462032#inline-1297053
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D134578
Aaron Ballman [Wed, 28 Sep 2022 16:37:12 +0000 (12:37 -0400)]
Moving some C papers around on the status page; NFC
These three are basically related to the TS 18661 integration, so now
they're grouped there.
Arthur Eubanks [Tue, 27 Sep 2022 00:41:37 +0000 (17:41 -0700)]
[clang][msan] Turn on -fsanitize-memory-param-retval by default
This eagerly reports use of undef values when passed to noundef
parameters or returned from noundef functions.
This also decreases binary sizes under msan.
To go back to the previous behavior, pass `-fno-sanitize-memory-param-retval`.
Reviewed By: vitalybuka, MaskRay
Differential Revision: https://reviews.llvm.org/D134669
bipmis [Wed, 28 Sep 2022 16:32:47 +0000 (17:32 +0100)]
[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below
1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)
The pattern indicates that the loads are being merged into a wider load and that the only use of this pattern is as a wider load. In this case, for non-atomic, non-volatile loads, reduce the pattern to a combined load, which improves the cost of inlining, unrolling, vectorization, etc.
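Pattern 2 corresponds to a familiar source-level idiom: assembling a wider integer from adjacent narrow loads. The sketch below shows it for two bytes, assuming a little-endian layout. `loadLE16ByParts` is a hypothetical name, and the expected wide value is written as a constant rather than loaded, to keep the example free of aliasing concerns.

```cpp
#include <cstdint>

// (zext(P[1]) << 8) | zext(P[0]): two adjacent byte loads forming the same
// value a single 16-bit little-endian load from P would produce.
constexpr std::uint16_t loadLE16ByParts(const std::uint8_t *P) {
  return static_cast<std::uint16_t>(
      (static_cast<std::uint16_t>(P[1]) << 8) |
      static_cast<std::uint16_t>(P[0]));
}

constexpr std::uint8_t Bytes[4] = {0x34, 0x12, 0xFF, 0x00};

static_assert(loadLE16ByParts(Bytes) == 0x1234, "bytes 0 and 1 merged");
static_assert(loadLE16ByParts(Bytes + 2) == 0x00FF, "bytes 2 and 3 merged");
```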
Fix the error reported on reverse load merge.
Differential Revision: https://reviews.llvm.org/D127392
Katherine Rasmussen [Thu, 25 Aug 2022 21:33:59 +0000 (14:33 -0700)]
[flang] Add co_broadcast to the list of intrinsics
Add the collective subroutine, co_broadcast, to the list
of intrinsic subroutines. Add co_broadcast to the check
for coindexed objects for the first, third, and fourth dummy
arguments. Update the co_broadcast semantics test.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D134786
eopXD [Wed, 28 Sep 2022 10:04:29 +0000 (03:04 -0700)]
[RISCV][CodeGen][NFC] Add fixed vector type test cases for llvm.round.*
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D134799
bipmis [Wed, 28 Sep 2022 16:07:17 +0000 (17:07 +0100)]
remove LE,BE labels inserted incorrectly
Sanjay Patel [Wed, 28 Sep 2022 15:22:52 +0000 (11:22 -0400)]
[InstCombine] fold select shuffles with shared operand together
We don't combine generic shuffles together in IR, but select
shuffles are a special-case because a select shuffle of a
select shuffle is just another select shuffle; codegen is
expected to efficiently lower those (select shuffles are also
the canonical form of a vector select with constant condition).
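To see why a select shuffle of a select shuffle is again a select shuffle, it helps to model one as a per-lane choice between two vectors. The sketch below is a plain C++ model with hypothetical names (`Vec4`, `selectShuffle`, and a bitmask where a set bit picks the lane from the first operand). It is not the InstCombine code.

```cpp
struct Vec4 {
  int Lane[4];
};

// Lane I comes from A if bit I of Mask is set, otherwise from B.
constexpr Vec4 selectShuffle(const Vec4 &A, const Vec4 &B, unsigned Mask) {
  Vec4 R{};
  for (int I = 0; I < 4; ++I)
    R.Lane[I] = ((Mask >> I) & 1u) ? A.Lane[I] : B.Lane[I];
  return R;
}

constexpr bool sameLanes(const Vec4 &X, const Vec4 &Y) {
  for (int I = 0; I < 4; ++I)
    if (X.Lane[I] != Y.Lane[I])
      return false;
  return true;
}

constexpr Vec4 A = {{1, 2, 3, 4}};
constexpr Vec4 B = {{10, 20, 30, 40}};

// Two select shuffles that share operand B collapse into a single one whose
// mask is the AND of the two masks.
static_assert(sameLanes(selectShuffle(selectShuffle(A, B, 0b0111), B, 0b1101),
                        selectShuffle(A, B, 0b0101)),
              "nested select shuffle equals one select shuffle");
```

Under this model a lane ends up with the value from A only if both masks pick the first operand; in every other case it falls back to the shared operand B.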
Sanjay Patel [Tue, 27 Sep 2022 22:00:59 +0000 (18:00 -0400)]
[InstCombine] add tests for shuffle-of-shuffle; NFC
bipmis [Wed, 28 Sep 2022 15:50:51 +0000 (16:50 +0100)]
Add reverse load tests to test load combine patch
Nathan Sidwell [Wed, 14 Sep 2022 17:42:34 +0000 (10:42 -0700)]
[clang][DR2621] using enum NAME lookup fix
Although using-enum's grammar is 'using elaborated-enum-specifier',
the lookup for the enum is ordinary lookup (and not the tagged-type
lookup that normally occurs with a tagged-type specifier). Thus (a)
we can find typedefs and (b) do not find enum tags hidden by a non-tag
name (the struct stat thing).
This reimplements that part of using-enum handling, to address DR2621,
where clang's behaviour does not match std intent (and other
compilers).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134283
rkayaith [Tue, 27 Sep 2022 21:23:59 +0000 (17:23 -0400)]
[mlir] Add macro for enabling all generated pass declarations
Currently the generated pass declarations have to be enabled per-pass
using multiple `GEN_PASS_DECL_{PASSNAME}` defines. This adds
`GEN_PASS_DECL`, which enables the declarations for all passes in the
group with a single macro. This is convenient for cases where a single
header is used for all passes in the group.
Reviewed By: mehdi_amini, mscuttari
Differential Revision: https://reviews.llvm.org/D134766
Jon Chesterfield [Wed, 28 Sep 2022 15:30:01 +0000 (16:30 +0100)]
[amdgpu] Error, instead of miscompile, anonymous kernels using lds
The association between kernel and struct is done by symbol name.
This doesn't work robustly for anonymous kernels as shown by the modified
test case.
An alternative association between function and struct can be constructed
if necessary, probably through metadata, but on the basis that we currently
miscompile anonymous kernels and that they are difficult to construct from
application code and difficult to call from the runtime, this patch makes
it a fatal error for now.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134741
Sameer Sahasrabuddhe [Tue, 27 Sep 2022 05:19:56 +0000 (10:49 +0530)]
[AAPointerInfo] OffsetInfo: Unassigned is distinct from Unknown
A User like the PHINode may be visited multiple times for the same pointer along
different def-use edges. The uninitialized state of OffsetInfo at the first
visit needs to be distinct from the Unknown value that may be assigned after
processing the PHINode. Without that, a PHINode with all inputs Unknown is never
followed to its uses. This results in incorrect optimization because some
interfering accesses are missed.
Differential Revision: https://reviews.llvm.org/D134704
Benjamin Kramer [Wed, 28 Sep 2022 14:53:45 +0000 (16:53 +0200)]
Revert "[FunctionAttrs] Infer precise FMRB"
This reverts commit
97dfa536260c434e68913129d79d863b26c1c179.
It can make DSE crash. Reduced test case at
https://reviews.llvm.org/P8291
Daniel Bertalan [Wed, 28 Sep 2022 14:23:18 +0000 (16:23 +0200)]
[lld-macho] Don't create entries in isecPriorities during sorting (NFC)
If a value for a given key is not present, `DenseMap::operator[]`
default-constructs one, which is wasteful when we don't do anything with
it afterwards. Fix it by calling `lookup()` instead which only returns
the default value, but does not modify the map.
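The same pitfall exists with `std::unordered_map`, which makes for a self-contained illustration. The standard container stands in for `DenseMap` here, and `priorityOf` and `growthFromSubscript` are hypothetical helpers: `operator[]` inserts a default value for a missing key, while a `find`-based lookup returns a default without touching the map.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

using PriorityMap = std::unordered_map<std::string, int>;

// Read-only lookup: returns 0 for missing keys and never inserts.
inline int priorityOf(const PriorityMap &Priorities,
                      const std::string &Section) {
  auto It = Priorities.find(Section);
  return It == Priorities.end() ? 0 : It->second;
}

// Returns how many entries a copy of the map gains when a missing key is
// read through operator[].
inline std::size_t growthFromSubscript(PriorityMap Priorities,
                                       const std::string &Missing) {
  assert(Priorities.count(Missing) == 0 && "key is expected to be absent");
  const std::size_t Before = Priorities.size();
  (void)Priorities[Missing]; // default-constructs and inserts an entry
  return Priorities.size() - Before;
}
```

With a map holding only `"__text"`, calling `priorityOf(M, "__data")` leaves the size at 1, whereas `growthFromSubscript(M, "__data")` reports one extra entry.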
This speeds up linking a fair bit when only a small portion of all
sections are specified in the order file, like in the case of Chromium
Framework:
    N           Min           Max        Median           Avg        Stddev
x  25      3.727684     3.8808699      3.753552     3.7702461     0.0397282
+  25     3.6469049     3.7523289     3.6764321     3.6841622   0.025525047
Difference at 95.0% confidence
-0.0860839 +/- 0.0189924
-2.28324% +/- 0.503745%
(Student's t, pooled s = 0.0333906)
Differential Revision: https://reviews.llvm.org/D134811
Matt Arsenault [Tue, 27 Sep 2022 03:07:49 +0000 (23:07 -0400)]
AMDGPU: Make various vector undefs legal
Surprisingly these were getting legalized to something
zero initialized.
This fixes an infinite loop when combining some vector types.
Also fixes zero initializing some undef values.
SimplifyDemandedVectorElts / SimplifyDemandedBits are not checking
for the legality of the output undefs they are replacing unused
operations with. This resulted in turning vectors into undefs
that were later re-legalized back into zero vectors.
Matt Devereau [Thu, 28 Jul 2022 09:18:18 +0000 (09:18 +0000)]
[AArch64][SVE] Expand gather index to 32 bits instead of 64 bits
For gathers which load in 8 and 16 bit data then use that data
as an index, the index can be extended to 32 bits instead of
64 bits
Differential Revision: https://reviews.llvm.org/D130692
Krzysztof Drewniak [Tue, 27 Sep 2022 15:41:09 +0000 (15:41 +0000)]
[mlir] Use hip's config mode to find libraries
Instead of using find_package(HIP) to find FindHIP.cmake, which
doesn't seem to be the preferred way to find HIP anymore, use
find_package(hip CONFIG) to find the HIP configuration. Give
preference to ${ROCM_PATH} over ${ROCM_PATH}/hip in order to handle
the fact that newer ROCm versions prefer the include path to use
${ROCM_PATH}/include/hip over ${ROCM_PATH}/hip/include/hip (the
latter throws up a bunch of deprecation warnings)
Then, instead of trying to manually find the host-side headers and
runtime library by hand, use the hip::host and hip::amdhip64 libraries
that the config module defines.
This makes the CMake config much less error-prone and brings it in
line with the recommended approach to finding HIP.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D134753
Florian Hahn [Wed, 28 Sep 2022 14:35:12 +0000 (15:35 +0100)]
Revert "[AARCH64][CostModel] Modified the cost of mask vector load/store"
This reverts commit
1c62af3e23cab41074f7ce0ba86a93bea82b99b9.
The commit causes the test below to fail. Revert for now to get the bots
back to green.
Failing test:
llvm/test/Transforms/LoopVectorize/AArch64/masked-op-cost.ll
Nikita Popov [Tue, 27 Sep 2022 10:55:35 +0000 (12:55 +0200)]
[cmake] Export GetHostTriple.cmake
GetHostTriple is used by the runtimes build, so this cmake file
must be exported. Otherwise it is not possible to build runtimes
against a previously built LLVM.
Differential Revision: https://reviews.llvm.org/D134730
Florian Hahn [Wed, 28 Sep 2022 14:20:25 +0000 (15:20 +0100)]
[AArch64] break non-temporal loads over 256 into 256-loads and a smaller load
Currently, non-temporal loads over 256 bits are broken up inefficiently. For
example, `v17i32` gets broken into 2 128-bit loads. It is better if we can use
256-bit loads instead.
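The intended split can be worked through with simple arithmetic. Assuming the strategy is "as many 256-bit pieces as fit, then whatever is left" (the helper below is hypothetical and ignores how the leftover piece is itself legalized), a `v17i32` vector of 544 bits yields two 256-bit loads and a 32-bit remainder.

```cpp
struct LoadSplit {
  unsigned Num256BitLoads; // number of full 256-bit loads
  unsigned RemainderBits;  // size of the trailing smaller load, 0 if none
};

// Hypothetical helper: split a wide load into 256-bit pieces plus a tail.
constexpr LoadSplit splitNonTemporalLoad(unsigned TotalBits) {
  return {TotalBits / 256, TotalBits % 256};
}

// v17i32 is 17 * 32 = 544 bits = 2 * 256 + 32.
static_assert(splitNonTemporalLoad(17 * 32).Num256BitLoads == 2,
              "two full 256-bit loads");
static_assert(splitNonTemporalLoad(17 * 32).RemainderBits == 32,
              "one 32-bit load is left over");
```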
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D133421
Jon Chesterfield [Wed, 28 Sep 2022 13:55:14 +0000 (14:55 +0100)]
[amdgpu][nfc] Allocate kernel-specific LDS struct deterministically
A kernel may have an associated struct for laying out LDS variables.
This patch puts that instance, if present, at a deterministic address by
allocating it at the same time as the module scope instance.
This is relatively likely to be where the instance was allocated anyway (~NFC)
but will allow later patches to calculate where a given field can be found,
which means a function which is only reachable from a single kernel will be
able to access a LDS variable with zero overhead. That will be particularly
helpful for applications that instantiate a function template containing LDS
variables once per kernel.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127052
Archibald Elliott [Mon, 26 Sep 2022 15:26:30 +0000 (16:26 +0100)]
[AArch64] Correct v9.x-a Features
A change to D109517 during review stated it was disabling the crypto
extensions by default in armv9a, but it also ended up removing two other
non-crypto features: i8mm and bf16. This error was also made in D116158.
This patch re-adds those two extensions to the feature bitmaps for the
affected armv9a versions in the target parser.
Differential Revision: https://reviews.llvm.org/D134647
Archibald Elliott [Fri, 23 Sep 2022 17:41:58 +0000 (18:41 +0100)]
[ARM] Support fp16/bf16 using t constraint
fp16 and bf16 values can be used in GCC's inline assembly using the "t"
constraint, which means "VFP floating-point registers s0-s31" - fp16 and
bf16 values are stored in S registers too.
This change ensures that LLVM is compatible with GCC for programs that
use fp16 and the 't' constraint.
Fixes #57753
Differential Revision: https://reviews.llvm.org/D134553
David Truby [Mon, 26 Sep 2022 13:30:20 +0000 (13:30 +0000)]
[flang] Use libm over pgmath for complex number intrinsics
This patch changes the handling of complex number intrinsics that
have C libm equivalents to call into those instead of calling the
external pgmath library.
Currently complex numbers to integer powers are excluded as libm
has no powi equivalent function.
Differential Revision: https://reviews.llvm.org/D134655
Anubhab Ghosh [Wed, 28 Sep 2022 01:51:42 +0000 (07:21 +0530)]
[llvm-jitlink] Remove JITLinkSlabAllocator class
This class was used for testing JITLink with -noexec option and
also included slab allocation support. Its functionality has been
replaced with InProcessDeltaMapper and MapperJITLinkMemoryManager.
Differential Revision: https://reviews.llvm.org/D134781
David Spickett [Fri, 23 Sep 2022 14:55:14 +0000 (14:55 +0000)]
[LLDB] Remove the bool + RegisterInfo& version of GetRegisterInfo
All callers have been converted to the optional version.
Depends on D134540
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134541
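The difference between the two calling styles named in the commit above can be sketched roughly as below. Everything here is a hypothetical stand-in: `RegInfo`, `lookupRegisterLegacy` and `lookupRegister` are invented for illustration and are not LLDB's actual `RegisterInfo` type or `GetRegisterInfo` signatures.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a register description; not LLDB's RegisterInfo.
struct RegInfo {
  std::string Name;
  unsigned ByteSize;
};

// "bool + out-parameter" style: the caller must create a RegInfo first and
// remember to check the return value before reading it.
bool lookupRegisterLegacy(unsigned Num, RegInfo &Out) {
  if (Num != 0)
    return false; // Only register 0 exists in this toy model.
  Out = RegInfo{"x0", 8};
  return true;
}

// "optional" style: success and the value travel together, so there is no
// partially initialised object for the caller to misuse.
std::optional<RegInfo> lookupRegister(unsigned Num) {
  if (Num != 0)
    return std::nullopt;
  return RegInfo{"x0", 8};
}
```

With the optional style a caller can write `if (auto Info = lookupRegister(0)) { /* use Info->Name */ }`, which keeps the lookup and the validity check in one expression.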
David Spickett [Fri, 23 Sep 2022 14:47:06 +0000 (14:47 +0000)]
[LLDB][AArch64] Move instruction emulation to optional GetRegisterInfo
Depends on D134539
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134540
Louis Dionne [Mon, 19 Sep 2022 18:56:19 +0000 (14:56 -0400)]
[libc++] Remove MSVC tests checked into the libc++ test suite
We should strive to have our own tests, except when there is overwhelming
value in using another standard library's existing tests. The reason is
that it ensures that implementations don't all start relying on the same
interpretation of the Standard.
The unique_ptr tests did not add any test coverage AFAICT, and the
forward_like tests were moved to the style used everywhere in the
libc++ test suite.
Note that I got to this because this actually broke a downstream
configuration where we use -ffreestanding. The signature of main()
was not consistent with the signature we (need to) use everywhere
in the test suite.
Differential Revision: https://reviews.llvm.org/D134767
liqinweng [Wed, 28 Sep 2022 11:12:06 +0000 (19:12 +0800)]
[AARCH64][CostModel] Modified the cost of mask vector load/store
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D134413
Florian Hahn [Wed, 28 Sep 2022 11:08:20 +0000 (12:08 +0100)]
[SCEVExpander] Remove dead Root argument from expandCodeForImpl (NFC).
The argument is unused and can be removed.
Carl Ritson [Wed, 28 Sep 2022 10:46:46 +0000 (19:46 +0900)]
[AMDGPU] Add MIMG NSA threshold configuration attribute
Make MIMG NSA minimum addresses threshold an attribute that can
be set on a function or configured via command line.
This enables frontend tuning which allows increased NSA usage
where beneficial.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D134780
David Spickett [Fri, 23 Sep 2022 14:46:17 +0000 (14:46 +0000)]
[LLDB][MIPS] Move instruction emulation to optional GetRegisterInfo
Depends on D134538
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134539
David Spickett [Fri, 23 Sep 2022 14:45:37 +0000 (14:45 +0000)]
[LLDB][ARM] Move instruction emulation to optional GetRegisterInfo
Depends on D134537
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134538
liqinweng [Wed, 28 Sep 2022 09:18:33 +0000 (17:18 +0800)]
[RISCV] Add and update reverse mask tests, NFC
Reviewed By: Jimerlife
Differential Revision: https://reviews.llvm.org/D134520
David Spickett [Fri, 23 Sep 2022 14:42:25 +0000 (14:42 +0000)]
[LLDB] Move MIPS64/PPC64 and misc. to optional GetRegisterInfo
Depends on D134536
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134537
Florian Hahn [Wed, 28 Sep 2022 10:33:42 +0000 (11:33 +0100)]
[LoopDeletion] Forget block and loop dispositions after deleting loop.
After deleting a loop, the block and loop dispositions need to be
cleared. As we don't know which SCEVs in the loop/blocks may be
impacted, completely clear the cache. This should also fix some cases
where deleted loops remained in the LoopDispositions cache.
This fixes a verification failure surfaced by D134531.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D134613
Hui Xie [Tue, 9 Aug 2022 13:56:30 +0000 (14:56 +0100)]
[libc++] implement "pair" section of P2321R2 `zip`
Differential Revision: https://reviews.llvm.org/D131495
Simon Pilgrim [Wed, 28 Sep 2022 10:03:29 +0000 (11:03 +0100)]
[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds
Workaround for a chromium bug reported on D134605 - test case will be added later
Kristof Beyls [Tue, 27 Sep 2022 14:40:49 +0000 (16:40 +0200)]
Document use of Co-author-by git tag.
We are already using the Co-author-by git tag, but don't have documentation in
our developer policy about it. Fix that.
Differential Revision: https://reviews.llvm.org/D134740
Alvin Wong [Wed, 28 Sep 2022 09:46:27 +0000 (12:46 +0300)]
[lldb][COFF] Map symbols without base+complex type as 'Data' type
Both LLD and GNU ld write global/static variables to the COFF symbol
table with `IMAGE_SYM_TYPE_NULL` and `IMAGE_SYM_DTYPE_NULL` type. Map
these symbols as 'Data' type in the symtab to allow these symbols to be
used in expressions and printed.
Reviewed By: labath, DavidSpickett
Differential Revision: https://reviews.llvm.org/D134585
Alvin Wong [Wed, 28 Sep 2022 09:45:38 +0000 (12:45 +0300)]
[lldb][COFF] Add note to forwarder export symbols in symtab
Forwarder exports do not point to a real function or variable. Instead
they point to a string describing which DLL and symbol to forward to.
Any imports which use them will be redirected by the loader
transparently. These symbols do not have much use in LLDB, but keep them
just in case someone finds them useful. Also set a synthesized name with
the forwarder string for informational purposes.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134518
Alvin Wong [Wed, 28 Sep 2022 09:45:23 +0000 (12:45 +0300)]
[lldb][COFF] Load absolute symbols from COFF symbol table
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134517
Alvin Wong [Wed, 28 Sep 2022 09:44:03 +0000 (12:44 +0300)]
[lldb][COFF] Match symbols from COFF symbol table to export symbols
If a symbol is the same as an export symbol, mark it as 'Additional' to
prevent the duplicated symbol from being repeated in some commands (e.g.
`disas -n func`). If the RVA is the same but exported with a different
name, only synchronize the symbol types.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134426
Alvin Wong [Wed, 28 Sep 2022 09:43:14 +0000 (12:43 +0300)]
[lldb][COFF] Improve info of symbols from export table
- Skip dummy/invalid export symbols.
- Make the export ordinal of export symbols visible when dumping the
symtab.
- Stop setting the 'Debug' flag and set the 'External' flag instead to
better match the meaning of export symbols.
- Try to guess the type (code vs data) of the symbol from section flags.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134265
Alvin Wong [Wed, 28 Sep 2022 09:40:37 +0000 (12:40 +0300)]
[lldb][COFF] Rewrite ParseSymtab to list both export and symbol tables
This reimplements `ObjectFilePECOFF::ParseSymtab` to replace the manual
data extraction with what `COFFObjectFile` already provides. Also use
`SymTab::AddSymbol` instead of resizing the SymTab and then assigning each
element afterwards.
Previously, ParseSymtab loaded symbols from both the COFF symbol table
and the export table, but if there were any entries in the export table,
it overwrote all the symbols already loaded from the COFF symbol table.
Due to the change to use AddSymbol, this no longer happens, and so the
SymTab now contains all symbols from both tables as expected.
The export symbols are now ordered by ordinal, instead of by the name
table order.
In its current state, it is possible for symbols in the COFF symbol
table to be duplicated by those in the export table. This behaviour will
be modified in a separate change.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134196
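The overwrite problem described in the commit above can be modelled with a toy example. This is a simplified sketch under stated assumptions, not the LLDB code: symbols are plain strings, `Table`, `loadByResizeAndAssign` and `loadByAppending` are hypothetical names, and the ordering of exports by ordinal is not modelled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model: a symbol table is just a list of names.
using Table = std::vector<std::string>;

// Assumed shape of the old pattern: each source table resizes the result and
// assigns from index 0, so the export pass replaces the COFF symbols.
Table loadByResizeAndAssign(const Table &CoffSyms, const Table &Exports) {
  Table Symtab;
  Symtab.resize(CoffSyms.size());
  for (std::size_t I = 0; I < CoffSyms.size(); ++I)
    Symtab[I] = CoffSyms[I];
  if (!Exports.empty()) {
    Symtab.resize(Exports.size());
    for (std::size_t I = 0; I < Exports.size(); ++I)
      Symtab[I] = Exports[I]; // Overwrites what the first loop stored.
  }
  return Symtab;
}

// Append-only pattern: every symbol is added at the end, so entries from both
// tables survive (duplicates included).
Table loadByAppending(const Table &CoffSyms, const Table &Exports) {
  Table Symtab;
  for (const std::string &S : CoffSyms)
    Symtab.push_back(S);
  for (const std::string &S : Exports)
    Symtab.push_back(S);
  return Symtab;
}
```

In this model, a COFF table of three symbols plus one export leaves only the export after `loadByResizeAndAssign`, while `loadByAppending` keeps all four entries.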
Simon Pilgrim [Wed, 28 Sep 2022 09:55:16 +0000 (10:55 +0100)]
Fix MSVC "not all control paths return a value" warning. NFCI.
wanglei [Wed, 28 Sep 2022 09:31:44 +0000 (17:31 +0800)]
[LoongArch] Specify registers used in DWARF exception handling
Defines LoongArch registers for getExceptionPointerRegister() and
getExceptionSelectorRegister().
Differential Revision: https://reviews.llvm.org/D134709
River Riddle [Wed, 28 Sep 2022 09:50:06 +0000 (02:50 -0700)]
[vscode-mlir] Bump to version 0.0.11
Since version 0.0.10 we've:
* Added support for viewing/editing bytecode files
Muhammad Omair Javaid [Mon, 12 Sep 2022 11:46:04 +0000 (16:46 +0500)]
[LLVM] Fix GetErrcMessages.cmake module for WoA
The GetErrcMessages.cmake module makes use of cmake's try_run, which by
default builds its sources in debug mode unless configured with
CMAKE_TRY_COMPILE_CONFIGURATION. Debug builds on Windows sometimes fail
when the appropriate DLLs are not included in the path. Also, on Windows on
Arm machines, debug builds sometimes fail to link the correct debug DLLs.
To fix this I am setting CMAKE_TRY_COMPILE_CONFIGURATION to the active build
configuration of the currently configured LLVM project. This makes sure we
select the same build type for the try_run/try_compile cmake modules as
the currently configured LLVM project.
Reviewed By: zero9178
Differential Revision: https://reviews.llvm.org/D133482
Igor Kirillov [Fri, 9 Sep 2022 08:17:07 +0000 (09:17 +0100)]
[LoopVectorize][Fix] Crash when invariant store address is calculated inside loop
Fixes #57572
Generally the LICM pass is responsible for moving code that calculates
an invariant address out of the loop, as it only needs to be calculated
once. In the rare case where that does not happen, we will not vectorize
the loop.
Differential Revision: https://reviews.llvm.org/D133687
jacquesguan [Tue, 27 Sep 2022 08:37:25 +0000 (16:37 +0800)]
[LegalizeTypes] Use getVectorElementCount to avoid crash of scalable vector.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D134718
Cullen Rhodes [Wed, 28 Sep 2022 08:01:58 +0000 (08:01 +0000)]
[AArch64][SVE] Remove redundant ptest after match/nmatch
These instructions are flag-setting, so the ptest is redundant. The
TableGen class wasn't setting the element size for the predicate, causing
the checks in AArch64InstrInfo::optimizePTestInstr to fail.