Matthias Springer [Mon, 21 Jun 2021 08:45:16 +0000 (17:45 +0900)]
[mlir][NFC] Remove Standard dialect dependency on MemRef dialect
* Remove dependency: Standard --> MemRef
* Add dependencies: GPUToNVVMTransforms --> MemRef, Linalg --> MemRef, MemRef --> Tensor
* Note: The `subtensor_insert_propagate_dest_cast` test case in MemRef/canonicalize.mlir will be moved to Tensor/canonicalize.mlir in a subsequent commit, which moves over the remaining Tensor ops from the Standard dialect to the Tensor dialect.
Differential Revision: https://reviews.llvm.org/D104506
Nikita Popov [Mon, 21 Jun 2021 08:47:14 +0000 (10:47 +0200)]
[Mem2Reg] Use poison for unreachable cases
Use poison instead of undef for cases dealing with unreachable
code. This still leaves the more interesting case of "load from
uninitialized memory" as undef.
Nikita Popov [Mon, 21 Jun 2021 08:47:59 +0000 (10:47 +0200)]
[Mem2Reg] Regenerate test checks (NFC)
Juneyoung Lee [Mon, 21 Jun 2021 07:50:54 +0000 (16:50 +0900)]
[InstCombine] Fold icmp (select c,const,arg), null if icmp arg, null can be simplified
This patch folds icmp (select c,const,arg), null if icmp arg, null can be simplified.
Resolves llvm.org/pr48975.
Reviewed By: nikic, xbolva00
Differential Revision: https://reviews.llvm.org/D96663
Sjoerd Meijer [Thu, 17 Jun 2021 13:43:02 +0000 (14:43 +0100)]
[FuncSpec] Don't specialise functions with NoDuplicate instructions.
getSpecializationCost was returning INT_MAX for a case when specialisation
shouldn't happen, but this wasn't properly checked if specialisation was
forced.
Differential Revision: https://reviews.llvm.org/D104461
Matthias Springer [Mon, 21 Jun 2021 07:29:42 +0000 (16:29 +0900)]
[mlir][linalg] Support low padding in subtensor(pad_tensor) lowering
Differential Revision: https://reviews.llvm.org/D104591
LLVM GN Syncbot [Mon, 21 Jun 2021 07:27:34 +0000 (07:27 +0000)]
[gn build] Port 208332de8abf
Ruiling Song [Mon, 19 Apr 2021 02:45:41 +0000 (10:45 +0800)]
[AMDGPU] Add Optimize VGPR LiveRange Pass.
This pass aims to optimize VGPR live-range in a typical divergent if-else
control flow. For example:
  def(a)
  if(cond)
    use(a)
    ... // A
  else
    use(a)
As AMDGPU accesses VGPRs with respect to the active mask, we can mark `a` as
dead in region A. For details, please refer to the comments in the
implementation file.
The pass is enabled by default; the frontend can disable it through
"-amdgpu-opt-vgpr-liverange=false".
Differential Revision: https://reviews.llvm.org/D102212
Nicolas Vasilache [Mon, 21 Jun 2021 07:08:02 +0000 (07:08 +0000)]
[mlir][Linalg] NFC - Drop unused variable definition.
Nicolas Vasilache [Wed, 16 Jun 2021 11:39:40 +0000 (11:39 +0000)]
[mlir][Linalg] Introduce a BufferizationAliasInfo (6/n)
This revision adds a BufferizationAliasInfo which maintains and updates information about which tensors will alias once bufferized, which bufferized tensors are equivalent to others and how to handle clobbers.
Bufferization greedily tries to bufferize inplace by:
1. first trying to bufferize SubTensorInsertOp inplace, in reverse order (these are deemed the most expensive).
2. then trying to bufferize all non SubTensorOp / SubTensorInsertOp, in reverse order.
3. lastly trying to bufferize all SubTensorOp in reverse order.
Reverse order is a heuristic that seems to work nicely because structured tensor codegen very often proceeds by:
1. take a subset of a tensor
2. compute on that subset
3. insert the result subset into the full tensor and yield a new tensor.
BufferizationAliasInfo + equivalence sets + clobber analysis allows bufferizing nested
subtensor/compute/subtensor_insert sequences inplace to a certain extent.
To fully realize inplace bufferization, additional container-containee analysis will be necessary and is left for a subsequent commit.
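The greedy visiting order described above can be sketched in a few lines of Python. This is only an illustration of the three-phase, reverse-order heuristic, not the MLIR implementation: ops are modelled as hypothetical `(name, kind)` pairs, and `inplace_bufferization_order` is an invented helper.

```python
def inplace_bufferization_order(ops):
    """Return ops in the order the greedy heuristic would try them.

    `ops` is a list of (name, kind) pairs in program order. `kind` is
    "subtensor_insert", "subtensor", or any other string (hypothetical model).
    """
    reversed_ops = list(reversed(ops))
    # Phase 1: SubTensorInsertOp first, in reverse program order.
    inserts = [op for op in reversed_ops if op[1] == "subtensor_insert"]
    # Phase 2: everything that is neither SubTensorOp nor SubTensorInsertOp.
    others = [op for op in reversed_ops
              if op[1] not in ("subtensor", "subtensor_insert")]
    # Phase 3: SubTensorOp last.
    subtensors = [op for op in reversed_ops if op[1] == "subtensor"]
    return inserts + others + subtensors


# Two subtensor / compute / subtensor_insert sequences in program order.
program = [
    ("%0", "subtensor"),
    ("%1", "compute"),
    ("%2", "subtensor_insert"),
    ("%3", "subtensor"),
    ("%4", "compute"),
    ("%5", "subtensor_insert"),
]
order = [name for name, _ in inplace_bufferization_order(program)]
```

The sketch only covers the ordering; the alias, equivalence, and clobber analysis that decides whether each op can actually bufferize inplace is not modelled.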
Differential revision: https://reviews.llvm.org/D104110
LLVM GN Syncbot [Mon, 21 Jun 2021 06:23:08 +0000 (06:23 +0000)]
[gn build] Port 80fd5fa5269c
hsmahesha [Mon, 21 Jun 2021 05:25:23 +0000 (10:55 +0530)]
[AMDGPU] Replace non-kernel function uses of LDS globals by pointers.
The main motivation for replacing LDS uses within non-kernel functions by
pointers is to prevent the subsequent LDS lowering pass from directly packing
LDS (assume a large LDS) into a struct type, which would otherwise allocate
huge memory for a struct instance within every kernel.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103225
Pushpinder Singh [Fri, 18 Jun 2021 09:34:08 +0000 (09:34 +0000)]
[AMDGPU][Libomptarget] Remove redundant functions
There does not seem to be any use of these functions. They just
put the value to a local which is never used again.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D104512
Max Kazantsev [Mon, 21 Jun 2021 06:11:15 +0000 (13:11 +0700)]
[Test] Add some tests showing room for optimization exploiting undef and UB
Nathan Ridge [Mon, 7 Jun 2021 06:53:37 +0000 (02:53 -0400)]
[clangd] Type hints for C++14 return type deduction
Differential Revision: https://reviews.llvm.org/D103789
Esme-Yi [Mon, 21 Jun 2021 05:09:56 +0000 (05:09 +0000)]
[yaml2obj] Add support for writing the long symbol name.
Summary: This patch, as a follow-up of D95505, adds
support for writing the long symbol name by implementing
the StringTable. Only XCOFF32 is supported now.
Reviewed By: jhenderson, shchenz
Differential Revision: https://reviews.llvm.org/D103455
Max Kazantsev [Mon, 21 Jun 2021 04:37:06 +0000 (11:37 +0700)]
[LoopDeletion] Handle Phis with similar inputs from different blocks
This patch lifts the requirement that a Phi have only one live incoming
block. There can be multiple live blocks if the same value comes to the
phi from all of them.
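The relaxed condition can be illustrated with a small Python sketch. It is a simplified model rather than the LoopDeletion code: a phi is a hypothetical dict from incoming block name to value name, and `unique_live_incoming` is an invented helper.

```python
def unique_live_incoming(phi_incoming, live_blocks):
    """Return the single value reaching the phi from live blocks, else None.

    phi_incoming: dict mapping incoming block name -> incoming value name.
    live_blocks:  set of block names known to be live.
    """
    values = {value for block, value in phi_incoming.items()
              if block in live_blocks}
    # Several live blocks are fine as long as they all supply the same value.
    return next(iter(values)) if len(values) == 1 else None


phi = {"bb1": "%x", "bb2": "%x", "bb3": "%y"}
same_value = unique_live_incoming(phi, {"bb1", "bb2"})       # both supply %x
different_values = unique_live_incoming(phi, {"bb1", "bb3"})  # %x and %y
```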
Differential Revision: https://reviews.llvm.org/D103959
Reviewed By: nikic, lebedev.ri
Zhouyi Zhou [Mon, 21 Jun 2021 00:54:27 +0000 (08:54 +0800)]
[clang] NFC: adjust indentation of statements with more than one lines
Hi,
I think it looks better to adjust the indentation of statements that span more than one line.
In function TreeTransform<Derived>::TransformDependentScopeDeclRefExpr,
the second line of the statement
NestedNameSpecifierLoc QualifierLoc \newline = getDerived().TransformNestedNameSpecifierLoc(E->getQualifierLoc());
is indented no further than the first line.
There is a similar case in function TreeTransform<Derived>::TransformUnresolvedMemberExpr.
I also used clang-format to fix the above functions.
Thanks a lot
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D104145
Nico Weber [Mon, 21 Jun 2021 01:59:11 +0000 (21:59 -0400)]
[lld/mac] Make a variable more local; no behavior change
The variable used to need the wider scope, but doesn't after the
reland. See LC_LINKER_OPTIONS-related discussion on
https://reviews.llvm.org/D104353 for background.
Juneyoung Lee [Sun, 20 Jun 2021 06:00:15 +0000 (15:00 +0900)]
[InstCombine] Use poison constant to represent the result of unreachable instrs
This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction.
This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll .
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104602
Nico Weber [Sun, 20 Jun 2021 23:39:09 +0000 (19:39 -0400)]
[lld/mac] Test zerofill sections after __thread_bss
Real zerofill sections go after __thread_bss, since zerofill sections
must all be at the end of their segment and __thread_bss must be right
after __thread_data.
Works fine already, but wasn't tested as far as I can tell.
Also tweak comment about zerofill sections a bit.
No behavior change.
Differential Revision: https://reviews.llvm.org/D104609
Eli Friedman [Mon, 21 Jun 2021 00:16:28 +0000 (17:16 -0700)]
[NFC][ScalarEvolution] Clean up ExitLimit constructors.
Make all the constructors forward to one constructor. Remove redundant
assertions.
Jim Lin [Sun, 20 Jun 2021 23:56:22 +0000 (07:56 +0800)]
[IVDescriptors] Fix comment that getUnsafeAlgebraInst has been renamed to getExactFPMathInst
https://reviews.llvm.org/rG36a489d194750dc888f214240e9dec9122ca1f0e renamed the function call
in the test from getUnsafeAlgebraInst to getExactFPMathInst.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D104441
Jez Ng [Sun, 20 Jun 2021 23:49:13 +0000 (19:49 -0400)]
[lld-macho] Have inputOrder default to less than INT_MAX
We make it less than INT_MAX in order not to conflict with the ordering
of zerofill sections, which must always be placed at the end of their
segment.
This is the more structural fix for the issue addressed in {D104596}.
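A minimal Python sketch of why the default has to stay below INT_MAX. The constant values and the `layout` helper are assumptions for illustration, not the lld-macho code; the only property that matters is that the default order compares lower than the order used for zerofill sections.

```python
INT_MAX = 2**31 - 1
# Hypothetical default: any value strictly below INT_MAX has the same effect.
UNSPECIFIED_INPUT_ORDER = INT_MAX - 1


def layout(sections):
    """Order sections within a segment.

    sections: list of (name, explicit_order_or_None, is_zerofill) tuples.
    Zerofill sections are pinned to INT_MAX so they always sort last.
    """
    def key(section):
        _, order, is_zerofill = section
        if is_zerofill:
            return INT_MAX
        return UNSPECIFIED_INPUT_ORDER if order is None else order

    # sorted() is stable, so sections with equal keys keep their input order.
    return [name for name, _, _ in sorted(sections, key=key)]


segment = [
    ("__bss", None, True),      # zerofill, must end up last
    ("__data", 0, False),       # explicit order
    ("__custom", None, False),  # no explicit order, gets the default
]
placed = layout(segment)
```

Had the default been INT_MAX itself, `__custom` and `__bss` would compare equal and the zerofill section would no longer be guaranteed to come last.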
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104607
Fangrui Song [Sun, 20 Jun 2021 21:28:56 +0000 (14:28 -0700)]
[ELF] Delete --no-cref which does not exist in GNU ld
Also delete the single dash form which does not appear to be used.
Fangrui Song [Sun, 20 Jun 2021 21:20:14 +0000 (14:20 -0700)]
[ELF][test] Add missing tests for --no-export-dynamic & --no-warn-backrefs
Dmitri Gribenko [Sun, 20 Jun 2021 20:48:35 +0000 (22:48 +0200)]
[GCOVProfiling][test] Ensure that 'opt' drops any files in a temp directory
Jason Molenda [Sun, 20 Jun 2021 20:12:38 +0000 (13:12 -0700)]
Try to unbreak the windows CI
MSVC and clang seem to disagree with whether I can do this.
Craig Topper [Sun, 20 Jun 2021 02:52:52 +0000 (19:52 -0700)]
[TypePromotion] Prune Intrinsic includes. NFC
TypePromotion is meant to be a generic pass and doesn't reference
any ARM intrinsics so it shouldn't include IntrinsicsARM.h.
The other Intrinsic related headers appear to be unneeded as well.
Jason Molenda [Sun, 20 Jun 2021 19:19:50 +0000 (12:19 -0700)]
Add a corefile style option to process save-core; skinny corefiles
Add a new feature to process save-core on Darwin systems -- for
lldb to create a user process corefile with only the dirty (modified
memory) pages included. All of the binaries that were used in the
corefile are assumed to still exist on the system for the duration
of the use of the corefile. A new --style option to process save-core
is added, so a full corefile can be requested if portability across
systems, or across time, is needed for this corefile.
debugserver can now identify the dirty pages in a memory region
when queried with qMemoryRegionInfo, and the size of vm pages is
given in qHostInfo.
Create a new "all image infos" LC_NOTE for Mach-O which allows us
to describe all of the binaries that were loaded in the process --
load address, UUID, file path, segment load addresses, and optionally
whether code from the binary was executing on any thread. The old
"read dyld_all_image_infos and then the in-memory Mach-O load
commands to get segment load addresses" no longer works when we
only have dirty memory.
rdar://69670807
Differential Revision: https://reviews.llvm.org/D88387
Nikita Popov [Sat, 19 Jun 2021 07:44:28 +0000 (09:44 +0200)]
[LoopUnroll] Use smallest exact trip count from any exit
This is a more general alternative/extension to D102635. Rather than
handling the special case of "header exit with non-exiting latch",
this unrolls against the smallest exact trip count from any exit.
The latch exit is no longer treated as privileged when it comes to
full unrolling.
The motivating case is in full-unroll-one-unpredictable-exit.ll.
Here the header exit is an IV-based exit, while the latch exit is
a data comparison. This kind of loop does not get rotated, because
the latch is already exiting, and loop rotation doesn't try to
distinguish IV-based/analyzable latches.
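The selection rule can be sketched in Python as follows. This is a simplified model, not the LoopUnroll code: exits are a hypothetical dict from exiting block name to an exact trip count, with `None` standing for "not known exactly".

```python
def smallest_exact_trip_count(exit_trip_counts):
    """Pick the smallest exact trip count over all exits, or None if unknown.

    exit_trip_counts: dict mapping exiting block name -> exact trip count,
    or None when no exact count is known for that exit.
    """
    known = [count for count in exit_trip_counts.values() if count is not None]
    return min(known) if known else None


# Header exit is IV-based (exact count known); latch exit is a data comparison.
motivating = smallest_exact_trip_count({"header": 4, "latch": None})
two_known = smallest_exact_trip_count({"header": 8, "latch": 3})
```

In the motivating shape, the latch contributes no exact count, so the header exit alone determines the count used for full unrolling.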
Differential Revision: https://reviews.llvm.org/D102982
Fangrui Song [Sun, 20 Jun 2021 18:55:00 +0000 (11:55 -0700)]
[mlir] Fix -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC
Fangrui Song [Sun, 20 Jun 2021 18:35:01 +0000 (11:35 -0700)]
[lld-link] Fix -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC
Fangrui Song [Sun, 20 Jun 2021 18:09:07 +0000 (11:09 -0700)]
Fix -Wunused-variable and -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC
Michał Górny [Thu, 22 Apr 2021 17:42:52 +0000 (19:42 +0200)]
[lldb] [Process/elf-core] Fix reading NetBSD/i386 core dumps
Add support for extracting basic data from NetBSD/i386 core dumps.
FPU registers are not supported at the moment.
Differential Revision: https://reviews.llvm.org/D101091
David Green [Sun, 20 Jun 2021 16:03:30 +0000 (17:03 +0100)]
[DSE] Remove stores in the same loop iteration
DSE will currently only remove stores in the same block unless they can
be guaranteed to be loop invariant. This expands that to any stores that
are in the same Loop, at the same loop level. This should still account
for where AA/MSSA will not handle aliasing between loops, but allow the
dead stores to be removed where they overlap in the same loop iteration.
It requires adding loop info to DSE, but that looks fairly harmless.
The test case this helps is from code like this, which can come up in
certain matrix operations:
  for(i=..)
    dst[i] = 0;
    for(j=..)
      dst[i] += src[i*n+j];
After LICM, this becomes:
  for(i=..)
    dst[i] = 0;
    sum = 0;
    for(j=..)
      sum += src[i*n+j];
    dst[i] = sum;
The first store is dead, and with this patch is now removed.
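As a sanity check of the claim, here is the post-LICM loop transcribed to Python with and without the first store. The function names are illustrative, and the sketch assumes `src` and `dst` do not alias.

```python
def row_sums_with_dead_store(src, n):
    """Post-LICM form: dst[i] is zeroed, then overwritten in the same iteration."""
    dst = [None] * n
    for i in range(n):
        dst[i] = 0              # overwritten below before any read: dead store
        total = 0
        for j in range(n):
            total += src[i * n + j]
        dst[i] = total
    return dst


def row_sums_without_dead_store(src, n):
    """Same loop with the dead store removed."""
    dst = [None] * n
    for i in range(n):
        total = 0
        for j in range(n):
            total += src[i * n + j]
        dst[i] = total
    return dst


matrix = list(range(9))  # 3x3 matrix stored row-major
with_store = row_sums_with_dead_store(matrix, 3)
without_store = row_sums_without_dead_store(matrix, 3)
```

Both stores to `dst[i]` sit in the same iteration of the outer loop, which is the case the patch newly handles.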
Differential Revision: https://reviews.llvm.org/D100464
Sanjay Patel [Sun, 20 Jun 2021 15:26:11 +0000 (11:26 -0400)]
[InstCombine] fold ctpop-of-select with 1 or more constant arms
The general pattern is mentioned in:
https://llvm.org/PR50140
...but we need to do a bit more to handle intrinsics with extra operands
like ctlz/cttz.
Raul Tambre [Sun, 6 Jun 2021 16:02:20 +0000 (19:02 +0300)]
[libcxx] Implement P0883R2 ("Fixing Atomic Initialization")
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D103769
Peter Steinfeld [Sat, 19 Jun 2021 02:18:27 +0000 (19:18 -0700)]
[flang] Implement constant folding for the NOT intrinsic
I implemented constant folding for the NOT intrinsic and added some tests.
Differential Revision: https://reviews.llvm.org/D104587
Sanjay Patel [Sun, 20 Jun 2021 13:41:59 +0000 (09:41 -0400)]
[InstCombine] avoid infinite loops with select folds of constant expressions
This pair of transforms was added recently with:
8591640379ac9175a
And could lead to conflicting folds:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35399
Roman Lebedev [Sun, 20 Jun 2021 10:24:31 +0000 (13:24 +0300)]
[NFC][AArch64][ARM][Thumb][Hexagon] Autogenerate some tests
These all (and some others) are being affected by D104597,
but they are manually-written, which rather complicates
checking the effect that change has on them.
Roman Lebedev [Sun, 20 Jun 2021 10:14:33 +0000 (13:14 +0300)]
[UpdateTestUtils] Print test filename when complaining about conflicting prefix
Now that FileCheck eagerly complains when prefixes are unused,
the update script does the same, and it is becoming very common
to need to drop some prefixes. Yet figuring out which file
it complains about isn't obvious unless it actually tells us.
Roman Lebedev [Sun, 20 Jun 2021 09:33:57 +0000 (12:33 +0300)]
[SimplifyCFG] FoldTwoEntryPHINode(): don't fold if either block has it's address taken
Same as with HoistThenElseCodeToIf() (ad87761925c2790aab272138b5bbbde4a93e0383).
Roman Lebedev [Sun, 20 Jun 2021 09:18:15 +0000 (12:18 +0300)]
[SimplifyCFG] HoistThenElseCodeToIf(): don't hoist if either block has it's address taken
This problem is exposed by D104598, after it tail-merges `ret` in
`@test_inline_constraint_S_label`, the verifier would start complaining
`invalid operand for inline asm constraint 'S'`.
Essentially, taking address of a block is mismodelled in IR.
It should probably be an explicit instruction, a first one in block,
that isn't identical to any other instruction of the same type,
so that it can't be hoisted.
Juneyoung Lee [Sun, 20 Jun 2021 06:29:05 +0000 (15:29 +0900)]
[InstSimplify] icmp poison, X -> poison
This adds a simple transformation from icmp with poison constant to poison.
Comparing poison with something else is poison, so this is okay.
https://alive2.llvm.org/ce/z/e8iReb
https://alive2.llvm.org/ce/z/q4MurY
Fangrui Song [Sun, 20 Jun 2021 05:51:20 +0000 (22:51 -0700)]
[llvm-cov gcov] Support GCC 12 format
GCC 12 will change the length field to represent the number of bytes instead of
32-bit words. This avoids padding for strings.
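The effect on the length field can be sketched in Python. The layout here is an assumption made for illustration (a NUL-terminated string, padded to a 4-byte boundary only in the word-based variant), and `length_field` and `payload_bytes` are invented helpers, not llvm-cov functions.

```python
def length_field(text, in_bytes):
    """Value of the length field for a NUL-terminated string (assumed layout)."""
    size = len(text.encode()) + 1  # string bytes plus the terminating NUL
    if in_bytes:
        return size                # byte-based: exact byte count
    return (size + 3) // 4         # word-based: rounded up to 32-bit words


def payload_bytes(text, in_bytes):
    """Bytes occupied by the string payload after the length field."""
    unit = 1 if in_bytes else 4
    return length_field(text, in_bytes) * unit


name = "main.c"
words_variant = (length_field(name, False), payload_bytes(name, False))
bytes_variant = (length_field(name, True), payload_bytes(name, True))
```

For `"main.c"` the word-based variant occupies 8 bytes (one byte of padding) and the byte-based variant 7, which matches the note that the change avoids padding for strings.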
Fangrui Song [Sun, 20 Jun 2021 05:02:14 +0000 (22:02 -0700)]
[llvm-cov gcov] Change case to match the prevailing style && replace getString with readString
Michael Kruse [Sun, 20 Jun 2021 03:23:02 +0000 (22:23 -0500)]
[Flang][test] Fix Windows buildbot.
Add
REQUIRES: shell
to tests that execute a UNIX shell script to not run on Windows.
Fangrui Song [Sat, 19 Jun 2021 23:27:52 +0000 (16:27 -0700)]
[test] Fix nocompress.test
Petr Hosek [Sat, 19 Jun 2021 21:55:32 +0000 (14:55 -0700)]
[profile] Fix variable name
This fixes a bug introduced in d85c258fd1e7459cc8085b5f364e356f50b490a4.
Fangrui Song [Sat, 19 Jun 2021 21:54:24 +0000 (14:54 -0700)]
[llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style
The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style.
And add `error:` or `warning:` to CHECK lines wherever appropriate.
Petr Hosek [Wed, 19 May 2021 17:14:00 +0000 (10:14 -0700)]
[profile] Don't publish VMO if there are no counters
If there are no counters, there's no need to publish the VMO.
Differential Revision: https://reviews.llvm.org/D102786
Martin Storsjö [Thu, 17 Jun 2021 13:28:47 +0000 (16:28 +0300)]
[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC.
This is run every time around in the main linker loop. Once a match
has been found, stop trying to rematch such a symbol.
Not sure if this has any actual measurable performance impact though
(SymbolTable::findMangle() iterates over the whole symbol table for
each call and does fuzzy matching on top of that) but this makes the
code more reassuring to read at least. (This is in practice run for def
files listing undecorated stdcall functions to be exported.)
Differential Revision: https://reviews.llvm.org/D104529
Martin Storsjö [Fri, 18 Jun 2021 11:29:55 +0000 (14:29 +0300)]
[LLD] [MinGW] Print errors/warnings in lld-link with a "ld.lld" prefix
Pass the original argv[0] to the coff linker, as the coff linker uses
the basename of argv[0] as the log prefix.
This makes error messages print with a "ld.lld:" prefix
instead of "lld-link:". The current "lld-link:" prefix can be confusing
to users, as they're invoking the MinGW linker (and might not even have
a lld-link executable).
Keep the first argument as lld-link when printing the command line, to
make it an actually reproducible standalone command.
Differential Revision: https://reviews.llvm.org/D104526
Fangrui Song [Sat, 19 Jun 2021 19:20:45 +0000 (12:20 -0700)]
[llvm-profdata] Delete unneeded empty output filename check
Craig Topper [Sat, 19 Jun 2021 18:54:59 +0000 (11:54 -0700)]
[RISCV] Prevent formation of shXadd(.uw) and add.uw if it prevents the use of addi.
If the outer add has a simm12 immediate operand, we should prefer
addi over materializing the immediate in a register, which would
guarantee an extra instruction and a temporary register. Since we
don't check for one use on the shl or zext, we might generate more
instructions if there is an additional user.
Roman Lebedev [Sat, 19 Jun 2021 19:04:42 +0000 (22:04 +0300)]
[NFC] AMD Zen 3: fix typo in a comment
Fangrui Song [Sat, 19 Jun 2021 18:36:44 +0000 (11:36 -0700)]
Simplify some typedef struct
Nico Weber [Sat, 19 Jun 2021 17:03:51 +0000 (13:03 -0400)]
[gn build] (manually) port b9c05aff205b (MIRTests)
Nico Weber [Sat, 19 Jun 2021 14:55:48 +0000 (10:55 -0400)]
[lld/mac] Make sure __thread_ptrs is in front of __thread_bss
The exact location doesn't matter, but it should be in front
of __thread_bss. We put it right in front of __thread_data
which is where ld64 seems to put it as well.
Fixes PR50769.
(As mentioned on the bug, there is probably a more structural
fix too, see comment 5. If we don't address this, it's likely
we'll run into this again with other synthetic sections. But
for now, let's fix the immediate breakage.)
Differential Revision: https://reviews.llvm.org/D104596
Nico Weber [Sat, 19 Jun 2021 13:54:11 +0000 (09:54 -0400)]
[lld/mac] Give __DATA,__thread_ptrs type S_THREAD_LOCAL_VARIABLE_POINTERS
...instead of S_NON_LAZY_SYMBOL_POINTERS. This matches ld64.
Part of PR50769.
While here, also remove an old TODO that was done in D87178.
Differential Revision: https://reviews.llvm.org/D104594
Michael Liao [Wed, 26 May 2021 00:20:52 +0000 (20:20 -0400)]
[MIRPrinter] Add machine metadata support.
- Distinct metadata needs generating in the codegen to attach correct
AAInfo on the loads/stores after lowering, merging, and other relevant
transformations.
- This patch adds 'MachineModuleSlotTracker' to help assign slot
numbers to these newly generated unnamed metadata nodes.
- To help 'MachineModuleSlotTracker' track machine metadata, the
original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage',
which provides basic interfaces to create/retrieve metadata slots. In
addition, once LLVM IR is processed, additional hooks are also
introduced to help collect machine metadata and assign them slot
numbers.
- Finally, if there is any such machine metadata, 'MIRPrinter' outputs
an additional 'machineMetadataNodes' field containing all the
definitions of those nodes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D103205
Michael Liao [Fri, 11 Jun 2021 21:21:09 +0000 (17:21 -0400)]
[amdgpu] Improve the from f32 to i64.
- Follow the same principle as the conversion from f64 to i64, with the extra
necessary pre- and post-processing. It helps to reduce that conversion
sequence by half compared to the legacy one.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D104427
Sanjay Patel [Sat, 19 Jun 2021 16:20:45 +0000 (12:20 -0400)]
[InstCombine][test] add tests for select-of-bit-manip; NFC
Saleem Abdulrasool [Fri, 18 Jun 2021 23:00:01 +0000 (16:00 -0700)]
Revert "Re-Revert "DirectoryWatcher: add an implementation for Windows""
This reverts commit fb32de9e97af0921242a021e30020ffacf7aa6e2.
Remove the secondary synchronization point as noted by Adrian. This is
technically only to make the builders happier about tests and should not
be needed. This also pushes the condition variable setting to after the
watch is actually established (which was the source of the original race
condition, but would normally succeed as the thread shouldn't get put to
sleep immediately on the trigger of the condition variable).
This also was pretested on the chromium builders:
https://ci.chromium.org/ui/p/chromium/builders/try/win_upload_clang/1612/overview.
Tomas Matheson [Fri, 18 Jun 2021 09:57:59 +0000 (10:57 +0100)]
Allow building for release with EXPENSIVE_CHECKS
D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS,
but verify() is defined for debug builds only so this had the unintended
effect of breaking release builds with EXPENSIVE_CHECKS.
Fix by enabling verify() for both debug and EXPENSIVE_CHECKS.
Differential Revision: https://reviews.llvm.org/D104514
Tomas Matheson [Thu, 17 Jun 2021 18:11:25 +0000 (19:11 +0100)]
[ARM][NFC] Tidy up subtarget frame pointer routines
getFramePointerReg only depends on information in ARMSubtarget,
so move it in there so it can be accessed from more places.
Make use of ARMSubtarget::getFramePointerReg to remove duplicated code.
The main use of useR7AsFramePointer is getFramePointerReg, so inline it.
Differential Revision: https://reviews.llvm.org/D104476
Melanie Blower [Sat, 19 Jun 2021 11:59:21 +0000 (07:59 -0400)]
Revert "[clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp-contract=on"
This reverts commit a1449a10dbcfcf353f4f761281c4e572b4ce9308.
Seems like my changes to LNT had no effect -- puzzled.
The 21 tests pass on my sandbox with the clang patch but are
failing in exec time in the bot
LLVM GN Syncbot [Sat, 19 Jun 2021 11:49:56 +0000 (11:49 +0000)]
[gn build] Port 134723edd5bf
Louis Dionne [Thu, 17 Jun 2021 15:30:11 +0000 (11:30 -0400)]
[libcxx] Move all algorithms into their own headers
This is a fairly mechanical change: it just moves each algorithm into
its own header. This is intended to be an NFC.
This commit re-applies 7ed7d4ccb899, which was reverted in 692d7166f771
because the Modules build got broken. The modules build has now been
fixed, so we're re-committing this.
Differential Revision: https://reviews.llvm.org/D103583
Attribution note
----------------
I'm only committing this. This commit is a mix of D103583, D103330 and
D104171 authored by:
Co-authored-by: Christopher Di Bella <cjdb@google.com>
Co-authored-by: zoecarver <z.zoelec2@gmail.com>
Markus Böck [Sat, 19 Jun 2021 11:28:12 +0000 (13:28 +0200)]
[clang-cl] Don't expand /permissive- to /ZC:strictStrings yet
Follow up on rGc70b0e808da8
/Zc:strictStrings is an alias to an option part of the -W group. When the driver tries to render the option back to a string for the cc1 invocation, it sadly gets rendered with the original spelling instead of the alias, causing issues reported here: https://reviews.llvm.org/D103773#inline-989447
I think it is best to revert this part of the patch until I have figured out how to correctly add the arg and until /Zc:strictStrings- exists/is needed.
Melanie Blower [Sat, 19 Jun 2021 10:49:27 +0000 (06:49 -0400)]
[clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp-contract=on
This patch changes -ffp-model=precise to enable -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
Marius Brehler [Wed, 9 Jun 2021 13:38:10 +0000 (13:38 +0000)]
[mlir] Add EmitC dialect
This upstreams the EmitC dialect and the corresponding Cpp target, both
initially presented with [1], from [2] to MLIR core. For the related
discussion, see [3].
[1] https://reviews.llvm.org/D76571
[2] https://github.com/iml130/mlir-emitc
[3] https://llvm.discourse.group/t/emitc-generating-c-c-from-mlir/3388
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Co-authored-by: Oliver Scherf <oliver.scherf@iml.fraunhofer.de>
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103969
Nikita Popov [Thu, 17 Jun 2021 19:49:29 +0000 (21:49 +0200)]
[LoopUnroll] Push runtime unrolling decision up into tryToUnrollLoop()
Currently, UnrollLoop() is passed an AllowRuntime flag and decides
itself whether runtime unrolling should be used or not. This patch
pushes the decision into the caller and allows us to eliminate the
ULO.TripCount and ULO.TripMultiple parameters.
Differential Revision: https://reviews.llvm.org/D104487
Ben Shi [Sat, 19 Jun 2021 03:01:43 +0000 (11:01 +0800)]
[RISCV] Optimize add-mul in the zba extension with SH*ADD
This patch does the following optimization.
Ry + Rx * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry)
Ry + Rx * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry)
Ry + Rx * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry)
Ry + Rx * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry)
Ry + Rx * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry)
Ry + Rx * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry)
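A quick arithmetic check of the nested forms, as a Python sketch. It assumes the usual Zba reading of `SHnADD a, b` as `(a << n) + b` and ignores XLEN wraparound; `shadd`, `NESTED_FORMS`, and `nested` are illustrative names, not LLVM APIs. Under that reading, the operand fed twice into the inner SH*ADD is the one multiplied by the constant, and the remaining operand is the addend.

```python
def shadd(n, a, b):
    """Model of SHnADD a, b: (a << n) + b, ignoring XLEN wraparound."""
    return (a << n) + b


# multiplier -> (outer shift amount, inner shift amount)
NESTED_FORMS = {
    18: (1, 3),  # SH1ADD (SH3ADD Rx, Rx), Ry
    20: (2, 2),  # SH2ADD (SH2ADD Rx, Rx), Ry
    24: (3, 1),  # SH3ADD (SH1ADD Rx, Rx), Ry
    36: (2, 3),  # SH2ADD (SH3ADD Rx, Rx), Ry
    40: (3, 2),  # SH3ADD (SH2ADD Rx, Rx), Ry
    72: (3, 3),  # SH3ADD (SH3ADD Rx, Rx), Ry
}


def nested(multiplier, rx, ry):
    """Evaluate the two-instruction form for the given multiplier."""
    outer, inner = NESTED_FORMS[multiplier]
    return shadd(outer, shadd(inner, rx, rx), ry)


rx, ry = 7, 5
results = {k: nested(k, rx, ry) for k in NESTED_FORMS}
```

Each multiplier factors as (2^inner + 1) * 2^outer, which is why exactly these six constants are reachable with two SH*ADD instructions.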
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104588
Ben Shi [Sat, 19 Jun 2021 02:09:08 +0000 (10:09 +0800)]
[RISCV][test] Add new tests for add-mul optimization in the zba extension with SH*ADD
These tests will show the following optimization by future patches.
Ry + Rx * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry)
Ry + Rx * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry)
Ry + Rx * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry)
Ry + Rx * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry)
Ry + Rx * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry)
Ry + Rx * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry)
Rx * (3 << C) => (SLLI (SH1ADD Rx, Rx), C)
Rx * (5 << C) => (SLLI (SH2ADD Rx, Rx), C)
Rx * (9 << C) => (SLLI (SH3ADD Rx, Rx), C)
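The three SLLI forms can be checked the same way. This Python sketch again assumes `SHnADD a, b` means `(a << n) + b` and ignores XLEN wraparound; the helper names are illustrative only.

```python
def shadd(n, a, b):
    """Model of SHnADD a, b: (a << n) + b, ignoring XLEN wraparound."""
    return (a << n) + b


# odd factor -> shift amount of the SH*ADD that produces it (2^n + 1)
SHADD_FOR_FACTOR = {3: 1, 5: 2, 9: 3}


def mul_by_shifted_const(factor, c, rx):
    """Evaluate SLLI (SHnADD Rx, Rx), C for factor in {3, 5, 9}."""
    return shadd(SHADD_FOR_FACTOR[factor], rx, rx) << c


rx, c = 11, 4
checks = {f: mul_by_shifted_const(f, c, rx) for f in SHADD_FOR_FACTOR}
```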
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104507
Lang Hames [Sat, 19 Jun 2021 04:44:38 +0000 (14:44 +1000)]
[ORC][examples] Add missing library dependence
Matthias Springer [Sat, 19 Jun 2021 04:25:47 +0000 (13:25 +0900)]
[mlir][linalg] Lower subtensor(pad_tensor) to pad_tensor(subtensor)
Only high padding is supported at the moment. Low padding will be added in a separate commit.
Differential Revision: https://reviews.llvm.org/D104357
Jez Ng [Sat, 19 Jun 2021 02:30:57 +0000 (22:30 -0400)]
[re-land][lld-macho] Avoid force-loading the same archive twice
This reverts commit c9b241efd68c5a0f1f67e9250960ade454f3bc11, which was
a backout diff to fix the buildbots.
The real culprit of the crash is
https://github.com/llvm/llvm-project/commit/1d31fb8d122b1117cf20a9edc09812db8472e930,
which is being reverted.
Differential Revision: https://reviews.llvm.org/D104353
Jez Ng [Sat, 19 Jun 2021 02:19:09 +0000 (22:19 -0400)]
Revert "[lld-macho] Have path-related functions return std::string, not StringRef"
This reverts commit 1d31fb8d122b1117cf20a9edc09812db8472e930.
Making `rerootPath` return a temporary std::string caused a
use-after-free:
https://ci.chromium.org/ui/p/chromium/builders/try/win_upload_clang/1608/overview
Liqiang Tao [Sat, 19 Jun 2021 02:17:19 +0000 (10:17 +0800)]
[llvm][Inliner] Add an optional PriorityInlineOrder
This patch adds an optional PriorityInlineOrder, which uses the heap to order inlining.
A callsite with a smaller size gets a higher priority.
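The ordering idea can be sketched with Python's `heapq`. This is a model of a size-keyed worklist, not the LLVM class: `SizePriorityQueue` and the callsite names are hypothetical, and the insertion counter used for tie-breaking is an assumption of this sketch.

```python
import heapq


class SizePriorityQueue:
    """Worklist that always yields the callsite with the smallest size next."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: equal sizes pop in insertion order

    def push(self, callsite, size):
        heapq.heappush(self._heap, (size, self._counter, callsite))
        self._counter += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = SizePriorityQueue()
queue.push("call @big", 120)
queue.push("call @tiny", 3)
queue.push("call @medium", 40)

visit_order = []
while queue:
    visit_order.append(queue.pop())
```

A min-heap keeps both push and pop at O(log n), so callsites discovered during inlining can be added without re-sorting the whole worklist.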
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D104028
Lang Hames [Sat, 19 Jun 2021 01:41:42 +0000 (11:41 +1000)]
[ORC][C-bindings] Add access to LLJIT IRTransformLayer, ThreadSafeModule utils.
This patch was derived from Valentin Churavy's work in
https://reviews.llvm.org/D104480. It adds support for setting the transform on
an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also
adds access to the ThreadSafeModule::withModuleDo method for thread-safe
access to modules.
A new example has been added to show how to use these APIs to optimize a module
during materialization.
Thanks Valentin!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D103855
Lang Hames [Fri, 18 Jun 2021 23:22:45 +0000 (09:22 +1000)]
[ORC][examples] Fix file name in comment.
George Balatsouras [Fri, 18 Jun 2021 21:10:49 +0000 (21:10 +0000)]
[libfuzzer] Disable failing DFSan-related tests
These have been broken by https://reviews.llvm.org/D104494.
However, `lib/fuzzer/dataflow/` is unused (?) so addressing this is not a priority.
Added TODOs to re-enable them in the future.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104568
Guozhi Wei [Sat, 19 Jun 2021 01:01:34 +0000 (18:01 -0700)]
[InstCombine] Don't transform code if DoTransform is false
The patch in https://reviews.llvm.org/D72396 doesn't check DoTransform before transforming the code, and generates a wrong result for the attached test case.
Differential Revision: https://reviews.llvm.org/D104567
Nico Weber [Sat, 19 Jun 2021 00:22:15 +0000 (20:22 -0400)]
Revert "[lld-macho] Avoid force-loading the same archive twice"
This reverts commit 24706cd73cd150543753a2e169c68a2c68da46a1.
Test seems to fail flakily. See comments on https://reviews.llvm.org/D104353
for a hypothesis for why.
Fangrui Song [Sat, 19 Jun 2021 00:01:17 +0000 (17:01 -0700)]
[InstrProfiling][ELF] Make __profd_ private if the function does not use value profiling
On ELF, the D1003372 optimization can apply to more cases. There are two
prerequisites for making `__profd_` private:
* `__profc_` keeps `__profd_` live under compiler/linker GC
* `__profd_` is not referenced by code
The first is satisfied because all counters/data are in a section group (either
`comdat any` or `comdat noduplicates`). The second requires that the function
does not use value profiling.
Regarding the second point: `__profd_` may be referenced by other text sections
due to inlining. There will be a linker error if a prevailing text section
references the non-prevailing local symbol.
With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
clang is 4.2% smaller (1-169620032/177066968).
`stat -c %s **/*.o | awk '{s+=$1}END{print s}'` is 2.5% smaller.
Reviewed By: davidxl, rnk
Differential Revision: https://reviews.llvm.org/D103717
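The 4.2% figure quoted in the entry above follows from the two byte counts given there. A minimal sketch of that arithmetic, where the helper name `sizeReduction` is hypothetical and not part of the patch:

```cpp
#include <cassert>

// Fractional size reduction when going from oldSize to newSize,
// computed as 1 - newSize / oldSize.
// Hypothetical helper for illustration; not part of the patch.
double sizeReduction(double newSize, double oldSize) {
  assert(oldSize > 0 && "old size must be positive");
  return 1.0 - newSize / oldSize;
}
```

Calling `sizeReduction(169620032.0, 177066968.0)` yields roughly 0.042, which matches the quoted 4.2%.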
peter klausler [Fri, 18 Jun 2021 23:55:37 +0000 (16:55 -0700)]
[flang] Recode a switch() to dodge a sketchy warning
One of the buildbots uses a compiler (can't tell which) that
doesn't approve of a "default:" in a switch statement whose
cases appear to completely cover all possible values of an
enum class. But this switch is in raw data dumping code that
needs to allow for incorrect values in memory. So rewrite it
as a cascade of if statements; performance doesn't matter here.
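The rewrite described above might look roughly like the following sketch. The enum and function names are hypothetical stand-ins, not the ones in the flang sources; the point is only that the trailing `else` handles values that are not valid enumerators without using a `default:` label.

```cpp
#include <string>

// Hypothetical enum standing in for the one used by the dumping code.
enum class Kind : int { Integer = 0, Real = 1, Character = 2 };

// Describe a Kind value read from raw memory. The stored value may be
// corrupt, so values outside the declared enumerators must be handled too.
std::string describeKind(Kind k) {
  if (k == Kind::Integer) {
    return "Integer";
  } else if (k == Kind::Real) {
    return "Real";
  } else if (k == Kind::Character) {
    return "Character";
  } else {
    // Reached only for values that are not valid enumerators.
    return "invalid(" + std::to_string(static_cast<int>(k)) + ")";
  }
}
```

Because the enum has a fixed underlying type, converting an arbitrary `int` to `Kind` is well defined, so the final branch can be exercised with something like `describeKind(static_cast<Kind>(7))`.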
Fangrui Song [Fri, 18 Jun 2021 23:44:03 +0000 (16:44 -0700)]
[profile][test] Delete profraw directory so that tests are immune to format version upgrade
Matt Arsenault [Sat, 12 Jun 2021 15:21:57 +0000 (11:21 -0400)]
AMDGPU: Fix infinite loop in DAG combine with fneg + fma
We were not reporting isFNegFree for v2f32, although it is effectively
free after legalization. The generic combine was pulling fneg out of
the fma source operands, and the AMDGPU combine was doing the
opposite.
Nico Weber [Fri, 18 Jun 2021 22:50:44 +0000 (18:50 -0400)]
Re-Revert "DirectoryWatcher: add an implementation for Windows"
This reverts commit 76f1baa7875acd88bdd4b431eed6e2d2decfc0fe.
Also reverts 2 follow-ups:
1. Revert "DirectoryWatcher: also wait for the notifier thread"
This reverts commit 527a1821e6f8e115db3335a3341c7ac491725a0d.
2. Revert "DirectoryWatcher: close a possible window of race on Windows"
This reverts commit a6948da86ad7e78d66b26263c2681ef6385cc234.
Makes tests hang, see comments on https://reviews.llvm.org/D88666
Matt Arsenault [Mon, 14 Jun 2021 16:53:36 +0000 (12:53 -0400)]
AMDGPU: Fix assert on m0_lo16/m0_hi16
These get added (redundantly) to the bundle expanded for indirect
register accesses. We hit this path only when there is a call in the
function.
Shilei Tian [Fri, 18 Jun 2021 22:35:34 +0000 (18:35 -0400)]
[OpenMP] Make bug49334.cpp more reproducible
`bug49334.cpp` cannot detect the data race in `libomptarget` efficiently. It
is reported that with `N = 256` and `BS = 16`, the data race can be reproduced
more reliably. Upcoming patches will fix it, so this patch is expected to
fail for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104552
Hongtao Yu [Thu, 17 Jun 2021 18:09:13 +0000 (11:09 -0700)]
[CSSPGO] Undoing the concept of dangling pseudo probe
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the dangling probe related code in both the compiler and llvm-profgen.
I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Cinder. Certain benchmarks such as 602.gcc have a 20% size win. No obvious difference seen in build time for SPEC2017 and Cinder.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104477
peter klausler [Fri, 18 Jun 2021 22:11:55 +0000 (15:11 -0700)]
[flang] Fix clang build (struct/class mismatch warning)
A recent patch changed a struct into a class, but missed a
forward declaration. GCC didn't warn, but clang does. Fix.
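A minimal sketch of the kind of mismatch involved, using a hypothetical type name rather than the one touched by the patch. The forward declaration below uses the same class-key as the definition; had it been left as `struct Descriptor;`, the code would still be valid C++, but clang reports it under `-Wmismatched-tags`.

```cpp
// Hypothetical type name, used only for illustration.
class Descriptor; // forward declaration: class-key matches the definition below

// A declaration that only needs the incomplete type.
int rankOf(const Descriptor &d);

class Descriptor {
public:
  explicit Descriptor(int rank) : rank_(rank) {}
  int rank() const { return rank_; }

private:
  int rank_;
};

int rankOf(const Descriptor &d) { return d.rank(); }
```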
Nick Desaulniers [Fri, 18 Jun 2021 22:09:18 +0000 (15:09 -0700)]
Whitespace fixes for 193e41c987127aad86d0380df83e67a85266f1f1,
which reportedly fails on the mac builds.
Stella Laurenzo [Fri, 18 Jun 2021 21:42:56 +0000 (21:42 +0000)]
Partial rollback: Disable MLIR verifier parallelism.
Deadlocks have been found in several downstream projects as noted on the original patch: https://reviews.llvm.org/D104207
Disabling pending full root cause analysis.
Differential Revision: https://reviews.llvm.org/D104570
Nikita Popov [Thu, 17 Jun 2021 19:03:50 +0000 (21:03 +0200)]
[LoopUnroll] Simplify optimization remarks
Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and
debug code. For debug code, print information about all exits.
For optimization remarks, only include the unroll count and the
type of unroll (complete, partial or runtime), but omit detailed
information about exit folding, now that more than one exit may
be folded.
Differential Revision: https://reviews.llvm.org/D104482
Hongtao Yu [Fri, 18 Jun 2021 01:02:45 +0000 (18:02 -0700)]
[CSSPGO][llvm-profgen] Fix an issue in findDisjointRanges
We were using 0 as an indicator of an invalid offset when computing disjoint ranges. In reality, 0 can be a valid code offset, which stands for the first function in the .text section. I'm using UINT64_MAX as the invalid code offset instead.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104497
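The sentinel choice in the entry above can be illustrated with a small standalone sketch. This is not the llvm-profgen code: the constant name `kInvalidOffset` and the helper `findRangeBegin` are hypothetical, and only the use of UINT64_MAX as the invalid value comes from the commit message.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sentinel name; the commit only states that UINT64_MAX is used.
constexpr uint64_t kInvalidOffset = UINT64_MAX;

// Return the begin offset of the first [begin, end] range that contains
// `offset`, or kInvalidOffset when no range contains it.
uint64_t findRangeBegin(
    const std::vector<std::pair<uint64_t, uint64_t>> &ranges,
    uint64_t offset) {
  for (const auto &r : ranges)
    if (r.first <= offset && offset <= r.second)
      return r.first;
  return kInvalidOffset;
}
```

With ranges `{0, 15}` and `{32, 47}`, a lookup of offset 4 returns 0, which is a legitimate answer. If 0 were the sentinel, that result could not be told apart from "not found".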
River Riddle [Fri, 18 Jun 2021 20:30:16 +0000 (20:30 +0000)]
[mlir] Add support to SourceMgrDiagnosticHandler for filtering FileLineColLocs
This revision adds support for passing a functor to SourceMgrDiagnosticHandler for filtering out FileLineColLocs when emitting a diagnostic. More specifically, this can be useful in situations where there may be large CallSiteLocs with locations that aren't necessarily important/useful for users.
For now the filtering support is limited to FileLineColLocs, but conceptually we could allow filtering for all location types if a need arises in the future.
Differential Revision: https://reviews.llvm.org/D103649
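The idea of filtering locations with a user-supplied functor can be modelled without MLIR. The sketch below is a standalone illustration only: `FileLoc`, `LocFilter`, and `filterCallStack` are hypothetical names and do not reflect the actual SourceMgrDiagnosticHandler interface.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a file/line/column location; not an MLIR type.
struct FileLoc {
  std::string file;
  unsigned line;
  unsigned col;
};

// Hypothetical filter type: returns true when a location should be shown.
using LocFilter = std::function<bool(const FileLoc &)>;

// Keep only the locations of a call-site chain that the filter accepts.
// An empty filter keeps every location.
std::vector<FileLoc> filterCallStack(const std::vector<FileLoc> &stack,
                                     const LocFilter &shouldShow) {
  std::vector<FileLoc> shown;
  for (const FileLoc &loc : stack)
    if (!shouldShow || shouldShow(loc))
      shown.push_back(loc);
  return shown;
}
```

A caller could, for example, pass a lambda that rejects every location whose file path starts with `lib/`, so that only user-facing frames remain in a long call-site chain.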