Pavel Labath [Tue, 28 Sep 2021 12:44:42 +0000 (14:44 +0200)]
[lldb/test] Remove a check from TestLoadAfterAttach
The two module retrieval methods (qXfer:libraries-svr4 and manual list
traversal) differ in how they handle
manually-added-but-not-yet-loaded modules. The svr4 path will remove them,
while the manual one will keep them in the list.
It's likely the two paths ought to be synchronized, but right now,
this distinction is not relevant for the test.
Max Kazantsev [Tue, 28 Sep 2021 12:33:02 +0000 (19:33 +0700)]
Recommit "[Test] Add more tests with cycled phis"
Diana Picus [Tue, 28 Sep 2021 12:17:34 +0000 (12:17 +0000)]
Reland "[flang] GET_COMMAND_ARGUMENT runtime implementation"
Recommit https://reviews.llvm.org/D109813 and
https://reviews.llvm.org/D109814.
This implements the second and final entry point for GET_COMMAND_ARGUMENT,
handling the VALUE, STATUS and ERRMSG parameters.
It has a small fix in that we're now using memcpy instead of strncpy
(which was a bad idea to begin with, since we're not actually interested
in a string copy).
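To illustrate why a byte copy fits better than a string copy here: a Fortran CHARACTER result is a fixed-length, blank-padded buffer with no NUL terminator. The Python sketch below only models that copy-and-pad behaviour. The name `fill_value` and its status convention are hypothetical and are not taken from the flang runtime.

```python
from typing import Tuple


def fill_value(argument: bytes, value_length: int) -> Tuple[bytes, int]:
    """Copy `argument` into a fixed-length, blank-padded buffer.

    Returns (buffer, status). The status convention is illustrative:
    0 on success, -1 when the argument had to be truncated.
    """
    buffer = bytearray(b" " * value_length)  # Fortran pads with blanks
    count = min(len(argument), value_length)
    buffer[:count] = argument[:count]        # plain byte copy, no terminator
    status = -1 if len(argument) > value_length else 0
    return bytes(buffer), status
```

Because the destination length is known up front, nothing in the copy depends on finding a terminator in the source.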
Max Kazantsev [Tue, 28 Sep 2021 12:32:26 +0000 (19:32 +0700)]
Revert "[Test] Add more tests with cycled phis"
This reverts commit 7128a545b3baa62c1164843103fb08daeba5cd9d.
Need to regenerate tests after rebase.
Jingu Kang [Tue, 28 Sep 2021 12:27:13 +0000 (13:27 +0100)]
Revert "[AArch64] Split bitmask immediate of bitwise AND operation"
This reverts commit 864b206796ae8aa7f35f830655337751dbd9176c.
Reverting due to error on buildbots.
Frederic Cambus [Tue, 28 Sep 2021 12:15:31 +0000 (17:45 +0530)]
[CMake] Add detection for the mold linker in AddLLVM.cmake.
mold says it is compatible with GNU ld and gold linkers:
```
$ mold -v
mold 0.9.5 (compatible with GNU ld and GNU gold)
```
And thus it currently gets detected as Gold.
With the following diff, CMake now correctly reports the linker name, and mold keeps being identified as Gold internally for now.
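The misdetection comes from the order of substring checks on the version banner. A minimal Python sketch of the idea follows. It is not the CMake logic from AddLLVM.cmake, and `classify_linker` is a hypothetical helper.

```python
def classify_linker(version_output: str) -> str:
    """Map the banner printed by `<linker> -v` to a linker name."""
    # mold advertises compatibility with GNU ld and GNU gold in its banner,
    # so it has to be tested before the gold and ld checks.
    if "mold" in version_output:
        return "mold"
    if "GNU gold" in version_output:
        return "GNU gold"
    if "GNU ld" in version_output:
        return "GNU ld"
    return "unknown"
```

With the checks in the opposite order, the mold banner shown above would match the gold test first.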
Reviewed By: ldionne, MaskRay
Differential Revision: https://reviews.llvm.org/D110035
Pavel Labath [Mon, 27 Sep 2021 12:12:29 +0000 (14:12 +0200)]
[lldb/test] Add ability to specify environment when spawning processes
We only had that ability for regular debugger launches. This meant that
it was not possible to use the normal dlopen patterns in attach tests.
This fixes that.
Pavel Labath [Mon, 27 Sep 2021 14:54:00 +0000 (16:54 +0200)]
[lldb] Remove non-stop mode code
We added some support for this mode back in 2015, but the feature was
never productionized. It is completely untested, and there are known
major structural lldb issues that need to be resolved before this
feature can really be supported.
It also complicates making further changes to stop reply packet
handling, which is what I am about to do.
Differential Revision: https://reviews.llvm.org/D110553
Diana Picus [Tue, 28 Sep 2021 12:02:52 +0000 (12:02 +0000)]
Revert "[flang] GET_COMMAND_ARGUMENT(VALUE) runtime implementation"
This reverts commit 0446f1299f6be9fd35bc5f458c78b34dca3105f6 and
df6302311f88d0fbc666b6277d029aa371039945.
There's a warning on flang-aarch64-latest-gcc related to strncpy using
the result of strlen as a bound. I'll recommit with a fix.
Max Kazantsev [Tue, 28 Sep 2021 12:03:47 +0000 (19:03 +0700)]
[Test] Add more tests with cycled phis
Salman Javed [Tue, 28 Sep 2021 11:52:12 +0000 (07:52 -0400)]
Add support for `NOLINTBEGIN` ... `NOLINTEND` comments
Add support for NOLINTBEGIN ... NOLINTEND comments to suppress
clang-tidy warnings over multiple lines. All lines between the "begin"
and "end" markers are suppressed.
Example:
// NOLINTBEGIN(some-check)
<Code with warnings to be suppressed, line 1>
<Code with warnings to be suppressed, line 2>
<Code with warnings to be suppressed, line 3>
// NOLINTEND(some-check)
Follows similar syntax as the NOLINT and NOLINTNEXTLINE comments
that are already implemented, i.e. allows multiple checks to be provided
in parentheses; suppresses all checks if the parentheses are omitted,
etc.
If the comments are misused, e.g. using a NOLINTBEGIN but not
terminating it with a NOLINTEND, a clang-tidy-nolint diagnostic
message pointing to the misuse is generated.
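The begin/end behaviour described above can be sketched in Python as follows. This is a simplified model, not clang-tidy's implementation: it ignores the check names in parentheses, treats only the lines strictly between the markers as suppressed, and `suppressed_lines` and its messages are hypothetical.

```python
import re
from typing import List, Set, Tuple

MARKER = re.compile(r"//\s*(NOLINTBEGIN|NOLINTEND)\b")


def suppressed_lines(lines: List[str]) -> Tuple[Set[int], List[str]]:
    """Return (1-based suppressed line numbers, misuse messages)."""
    suppressed: Set[int] = set()
    misuse: List[str] = []
    open_begin = None  # line number of the currently unmatched NOLINTBEGIN
    for number, text in enumerate(lines, start=1):
        match = MARKER.search(text)
        kind = match.group(1) if match else None
        if kind == "NOLINTBEGIN":
            if open_begin is None:
                open_begin = number
        elif kind == "NOLINTEND":
            if open_begin is None:
                misuse.append(f"line {number}: NOLINTEND without NOLINTBEGIN")
            open_begin = None
        elif open_begin is not None:
            suppressed.add(number)  # ordinary line inside an open block
    if open_begin is not None:
        misuse.append(f"line {open_begin}: NOLINTBEGIN without NOLINTEND")
    return suppressed, misuse
```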
As part of implementing this feature, the following bugs were fixed in
existing code:
IsNOLINTFound(): IsNOLINTFound("NOLINT", Str) returns true when Str is
"NOLINTNEXTLINE". This is because the textual search finds NOLINT as
the stem of NOLINTNEXTLINE.
LineIsMarkedWithNOLINT(): NOLINTNEXTLINEs on the very first line of a
file are ignored. This is due to rsplit('\n').second returning a blank
string when there are no more newline chars to split on.
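Both bugs are easy to reproduce outside clang-tidy. The Python sketch below models them. `llvm_style_rsplit` mimics `StringRef::rsplit`, which returns the whole string and an empty second element when the separator is absent. The "fixed" variants are hypothetical illustrations, not the code that was committed.

```python
import re
from typing import Tuple


def naive_is_nolint_found(marker: str, text: str) -> bool:
    # Plain substring search: "NOLINT" also matches inside "NOLINTNEXTLINE".
    return marker in text


def whole_word_is_nolint_found(marker: str, text: str) -> bool:
    # Hypothetical fix: the marker must not be followed by another letter.
    return re.search(re.escape(marker) + r"(?![A-Za-z])", text) is not None


def llvm_style_rsplit(text: str, sep: str) -> Tuple[str, str]:
    """Split at the last `sep`; return (text, "") when `sep` is absent."""
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, "")


def previous_line_buggy(text_before: str) -> str:
    # With no newline left to split on, the second element is empty.
    return llvm_style_rsplit(text_before, "\n")[1]


def previous_line_fixed(text_before: str) -> str:
    # Hypothetical fix: fall back to the whole text on the first line.
    head, tail = llvm_style_rsplit(text_before, "\n")
    return tail if "\n" in text_before else head
```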
Florian Hahn [Tue, 28 Sep 2021 11:26:00 +0000 (12:26 +0100)]
[VectorCombine] Discard ScalarizationResult state in early exit.
ScalarizationResult's destructor makes sure ToFreeze is not ignored if
set. Currently, scalarizeLoadExtract has an early exit if the index is
not safe directly. But when it is SafeWithFreeze, we need to discard the
state first, otherwise we hit the assert in the destructor.
Fixes PR51992.
Shivam Gupta [Tue, 28 Sep 2021 11:33:12 +0000 (17:03 +0530)]
[Docs][NFC] Add doxygen comment for AtomicExpandPass in passes.h
Kirill Bobyrev [Tue, 28 Sep 2021 11:34:42 +0000 (13:34 +0200)]
serge-sans-paille [Thu, 16 Sep 2021 16:13:15 +0000 (18:13 +0200)]
Simplify handling of builtin with inline redefinition
It is a common practice in glibc headers to provide an inline redefinition of an
existing function. This is especially the case for fortified functions.
Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and the noinline attribute.
Simplify the logic by suffixing these functions with `.inline` during codegen, so
that they are not recognized as builtins by LLVM.
After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite
Differential Revision: https://reviews.llvm.org/D109967
LLVM GN Syncbot [Tue, 28 Sep 2021 10:58:48 +0000 (10:58 +0000)]
[gn build] Port 864b206796ae
Jingu Kang [Wed, 22 Sep 2021 16:01:21 +0000 (17:01 +0100)]
[AArch64] Split bitmask immediate of bitwise AND operation
MOVi32imm + ANDWrr ==> ANDWri + ANDWri
MOVi64imm + ANDXrr ==> ANDXri + ANDXri
The mov pseudo instruction could be expanded to multiple mov instructions later.
In this case, try to split the constant operand of the mov instruction into two
bitmask immediates. This yields only two AND instructions instead of multiple
mov + and instructions.
Added a peephole optimization pass on MIR level to implement it.
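As an illustration of the idea, the Python sketch below splits a constant into two masks whose AND reproduces it. It assumes one particular splitting strategy: a contiguous run covering the lowest to the highest set bit, plus the original value with everything outside that run set. This is not necessarily what the pass does, and `is_bitmask_imm` and `split_bitmask_imm` are hypothetical names. The encodability check follows the AArch64 rule that a logical immediate is a rotated run of ones replicated across the register.

```python
from typing import Optional, Tuple


def is_bitmask_imm(value: int, reg_size: int) -> bool:
    """True if `value` is a replicated, rotated run of ones."""
    mask = (1 << reg_size) - 1
    value &= mask
    if value == 0 or value == mask:
        return False
    size = 2
    while size <= reg_size:
        elt_mask = (1 << size) - 1
        elt = value & elt_mask
        replicated = 0
        for shift in range(0, reg_size, size):
            replicated |= elt << shift
        if replicated == value:
            # A rotated run of ones has exactly two bit transitions
            # when the element is viewed as a circle.
            rotated = ((elt >> 1) | ((elt & 1) << (size - 1))) & elt_mask
            return bin(elt ^ rotated).count("1") == 2
        size *= 2
    return False


def split_bitmask_imm(imm: int, reg_size: int) -> Optional[Tuple[int, int]]:
    """Return (imm1, imm2) with imm1 & imm2 == imm, or None if not applicable."""
    mask = (1 << reg_size) - 1
    imm &= mask
    if imm == 0 or is_bitmask_imm(imm, reg_size):
        return None  # nothing to split
    lowest = (imm & -imm).bit_length() - 1
    highest = imm.bit_length() - 1
    imm1 = ((2 << highest) - (1 << lowest)) & mask  # ones from lowest..highest
    imm2 = (imm | ~imm1) & mask                     # ones everywhere outside
    if not (is_bitmask_imm(imm1, reg_size) and is_bitmask_imm(imm2, reg_size)):
        return None
    return imm1, imm2
```

For example, under these assumptions `0x200400` (bits 10 and 21 set) is not encodable on its own but splits into `0x3FFC00` and `0xFFE007FF`.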
Differential Revision: https://reviews.llvm.org/D109963
M Bakinovsky [Tue, 28 Sep 2021 10:56:01 +0000 (06:56 -0400)]
Fix documentation typos; NFC
Fixes bugprone-virtual-near-miss & performance-type-promotion-in-math-fn.
mydeveloperday [Tue, 28 Sep 2021 10:42:19 +0000 (11:42 +0100)]
[clang-format][docs] mark new clang-format configuration options based on which version they would GA
Sometimes I see people unsure about which options they can use in specific versions of clang-format because
https://clang.llvm.org/docs/ClangFormatStyleOptions.html points to the latest and greatest versions.
The reality is that this says it's version 13.0, but actually anything we add now will not be in 13.0 GA but
instead in 14.0 GA (as 13.0 has already been branched).
How about we introduce some nomenclature to the Format.h so that we can mark which options in the
documentation were introduced for which version?
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D110432
Jay Foad [Tue, 28 Sep 2021 10:07:29 +0000 (11:07 +0100)]
[LiveIntervals] Fix another asan debug build failure
Call RemoveMachineInstrFromMaps before erasing instrs.
repairIntervalsInRange will do this for you after erasing the
instruction, but it's not safe to rely on it because assertions in
SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased
instruction.
This fixes asan buildbot failures caused by D110335.
Kirill Bobyrev [Tue, 28 Sep 2021 10:02:13 +0000 (12:02 +0200)]
Investigate D110386 failures even further
Eric Schweitz [Tue, 28 Sep 2021 09:56:32 +0000 (11:56 +0200)]
[fir] Add fir.save_result op
Add the fir.save_result operation. It is used to save an
array, box, or record function result SSA-value to a memory location.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D110407
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Florian Hahn [Tue, 28 Sep 2021 09:32:17 +0000 (10:32 +0100)]
Recommit "[SCEV] Look through single value PHIs." (take 2)
This reverts commit 8fdac7cb7abbeeaed016ef9eb7a087458e41e33f.
The issue causing the revert has been fixed a while ago in 60b852092c98.
Original message:
Now that SCEVExpander can preserve LCSSA form,
we do not have to worry about LCSSA form when
trying to look through PHIs. SCEVExpander will take
care of inserting LCSSA PHI nodes as required.
This increases precision of the analysis in some cases.
Reviewed By: mkazantsev, bmahjour
Differential Revision: https://reviews.llvm.org/D71539
“bhkumarn” [Mon, 23 Aug 2021 12:43:11 +0000 (18:13 +0530)]
[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item
This patch emits DW_TAG_namelist and DW_TAG_namelist_item for Fortran
namelist variables. DICompositeType is extended to support this Fortran
feature.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D108553
V Donaldson [Tue, 28 Sep 2021 09:01:20 +0000 (11:01 +0200)]
[fir] Update fir.insert_on_range op
Update the fir.insert_on_range operation. Add a better description,
builder and verifier.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D110389
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Diana Picus [Tue, 14 Sep 2021 11:51:19 +0000 (11:51 +0000)]
[flang] GET_COMMAND_ARGUMENT(ERRMSG) runtime implementation
Implement the final part of GET_COMMAND_ARGUMENT, i.e. the handling of
ERRMSG. This uses some of the infrastructure in stat.h and gets rid of
the magic numbers that we were using for return codes.
Differential Revision: https://reviews.llvm.org/D109814
Diana Picus [Wed, 25 Aug 2021 08:19:34 +0000 (08:19 +0000)]
[flang] GET_COMMAND_ARGUMENT(VALUE) runtime implementation
Partial implementation for the second entry point for
GET_COMMAND_ARGUMENT. It handles the VALUE and STATUS arguments, and
doesn't touch ERRMSG.
Differential Revision: https://reviews.llvm.org/D109813
Diana Picus [Fri, 3 Sep 2021 09:42:04 +0000 (09:42 +0000)]
[flang] GET_COMMAND_ARGUMENT(LENGTH) runtime implementation
Implement the ArgumentLength entry point of GET_COMMAND_ARGUMENT. Also
introduce a fixture for the tests.
Note that this also changes the interface for ArgumentLength from
returning a 4-byte integer to returning an 8-byte integer.
Differential Revision: https://reviews.llvm.org/D109227
Kirill Bobyrev [Tue, 28 Sep 2021 07:50:45 +0000 (09:50 +0200)]
Investigate D110386 Windows failures
Add more information for test failures inspection.
Alexander Belyaev [Tue, 28 Sep 2021 07:22:39 +0000 (09:22 +0200)]
[mlir] Add min/max operations to Standard.
[RFC: Add min/max ops](https://llvm.discourse.group/t/rfc-add-min-max-operations/4353)
I was following the naming style for Arith dialect in
https://reviews.llvm.org/D110200,
i.e. similar to DivSIOp and DivUIOp I defined MaxSIOp, MaxUIOp.
When Arith PR is landed, I will migrate these ops as well.
Differential Revision: https://reviews.llvm.org/D110540
Jay Foad [Fri, 24 Sep 2021 15:50:55 +0000 (16:50 +0100)]
[LiveIntervals] Repair subreg ranges in processTiedPairs
In TwoAddressInstructionPass::processTiedPairs, update subranges of the
live interval for RegB as well as the main range.
This is a small step towards switching TwoAddressInstructionPass over
from LiveVariables to LiveIntervals. Currently this path is only tested
if you explicitly enable -early-live-intervals.
Differential Revision: https://reviews.llvm.org/D110526
Jay Foad [Wed, 22 Sep 2021 10:03:20 +0000 (11:03 +0100)]
[LiveIntervals] Improve repair after convertToThreeAddress
After TwoAddressInstructionPass calls
TargetInstrInfo::convertToThreeAddress, improve the LiveIntervals repair
to cope with convertToThreeAddress creating more than one new
instruction.
This mostly seems to benefit X86. For example in
test/CodeGen/X86/zext-trunc.ll it converts:
%4:gr32 = ADD32rr %3:gr32(tied-def 0), %2:gr32, implicit-def dead $eflags
to:
undef %6.sub_32bit:gr64 = COPY %3:gr32
undef %7.sub_32bit:gr64_nosp = COPY %2:gr32
%4:gr32 = LEA64_32r killed %6:gr64, 1, killed %7:gr64_nosp, 0, $noreg
Differential Revision: https://reviews.llvm.org/D110335
Mehdi Amini [Mon, 27 Sep 2021 16:53:45 +0000 (16:53 +0000)]
Fix URLs to the prod/staging buildbot master in the doc
Differential Revision: https://reviews.llvm.org/D110565
Kirill Bobyrev [Tue, 28 Sep 2021 06:13:01 +0000 (08:13 +0200)]
Attempt to fix Windows builds after D110386
http://45.33.8.238/win/46013/summary.html
Kirill Bobyrev [Tue, 28 Sep 2021 05:44:18 +0000 (07:44 +0200)]
[clangd] Refactor IncludeStructure: use File (unsigned) for most computations
Preparation for D108194.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D110386
Liu, Chen3 [Tue, 28 Sep 2021 01:36:34 +0000 (09:36 +0800)]
[X86][FP16] Fix a bug when Combine the FADD(A, FMA(B, C, 0)) to FMA(B, C, A).
This bug was introduced by D109953. The operand order of the generated FMA
is wrong.
Differential Revision: https://reviews.llvm.org/D110606
Lang Hames [Tue, 28 Sep 2021 03:04:39 +0000 (20:04 -0700)]
[ORC] Fix the LLJITWithRemoteDebugging example.
This was broken by the switch from JITTargetAddress to ExecutorAddr in
21a06254a3a.
Xiang1 Zhang [Sat, 25 Sep 2021 02:41:37 +0000 (10:41 +0800)]
[ISel] Legalized arithmetic.fence.f128 for 32-bits target
Reviewed By: Craig Topper, Wang Pengfei
Differential Revision: https://reviews.llvm.org/D110467
Anna Thomas [Tue, 28 Sep 2021 01:27:01 +0000 (21:27 -0400)]
[LoopPred Test] Fix lld-x86_64-win BB failure
Need a more general CHECK line for testcase in 5df9112 for correctly
handling lld-x86_64-win buildbot.
Ahsan Saghir [Tue, 28 Sep 2021 01:17:17 +0000 (20:17 -0500)]
Revert "tsan: fix trace tests on darwin"
This reverts commit 94ea36649ecc854d290c6797e6adb91bdfac756d.
Reverting due to errors on buildbots.
Anna Thomas [Tue, 28 Sep 2021 00:51:04 +0000 (20:51 -0400)]
Reland "[LoopPredication] Add testcase showing BPI computation. NFC"
This relands commit 16a62d4f.
Relanded after fixing CHECK-LINES for opt pipeline output to be more
general (based on failures seen in buildbot).
Lang Hames [Tue, 28 Sep 2021 01:00:23 +0000 (18:00 -0700)]
clang-format
Lang Hames [Tue, 28 Sep 2021 00:59:15 +0000 (17:59 -0700)]
[llvm-jitlink] Add more information about allocation failures.
Slab allocator failures will now report requested size and remaining capacity.
Ahsan Saghir [Mon, 13 Sep 2021 01:19:41 +0000 (20:19 -0500)]
[PowerPC] MMA - Add __builtin_vsx_build_pair and __builtin_mma_build_acc builtins
This patch adds the following built-ins:
__builtin_vsx_build_pair
__builtin_mma_build_acc
Reviewed By: #powerpc, nemanjai, lei
Differential Revision: https://reviews.llvm.org/D107647
Lang Hames [Mon, 27 Sep 2021 23:47:24 +0000 (16:47 -0700)]
[ORC] Switch from JITTargetAddress to ExecutorAddr for EPC-call APIs.
Part of the ongoing move to ExecutorAddr.
Michael Kruse [Mon, 27 Sep 2021 01:10:26 +0000 (20:10 -0500)]
[Polly] Reject regions entered by an indirectbr/callbr.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimized code.
This patch rejects regions entered by an indirectbr/callbr to not fail
later at code generation.
This fixes llvm.org/PR51964
Recommit with "REQUIRES: asserts" in test that uses statistics.
Joe Loser [Mon, 27 Sep 2021 23:18:46 +0000 (19:18 -0400)]
[libc++][NFC] s/enable_if<...>::type/enable_if_t<...> in span
There is some use of `enable_if<...>::type` when the rest of the file
uses `enable_if_t`. So, use `enable_if_t` consistently throughout.
Haowei Wu [Mon, 27 Sep 2021 23:05:33 +0000 (16:05 -0700)]
Revert "[Polly] Reject reject regions entered by an indirectbr/callbr."
This reverts commit 91f46bb77e6d56955c3b96e9e844ae6a251c41e9 which
causes test failures when assertions are off.
Lang Hames [Mon, 27 Sep 2021 22:25:30 +0000 (15:25 -0700)]
[ORC] Hold shared_ptr<SymbolStringPool> in errors containing SymbolStringPtrs.
This allows these error values to remain valid, even if they tear down the JIT
itself.
Congzhe Cao [Mon, 27 Sep 2021 22:30:20 +0000 (18:30 -0400)]
[CodeMoverUtils] Enhance isSafeToMoveBefore() when control flow equivalence is satisfied
With improved analysis in determining CFG equivalence that does
not require strict dominance and post-dominance conditions, we
now relax isSafeToMoveBefore() such that an instruction I can
be moved before InsertPoint even if they do not strictly dominate
each other, as long as they follow the same control flow path.
For example, we can move Instruction 0 before Instruction 1,
and vice versa.
```
if (cond1)
// Instruction 0: %add = add i32 1, 2
if (cond1)
// Instruction 1: %add2 = add i32 2, 1
```
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110456
Kevin Athey [Mon, 27 Sep 2021 21:48:44 +0000 (14:48 -0700)]
Revert "tsan: add a test for stack init race"
This reverts commit b72176b9bc06146d12e495167977effe050dc326.
Broke bot: https://lab.llvm.org/buildbot/#/builders/70/builds/12193
LLVM GN Syncbot [Mon, 27 Sep 2021 21:56:39 +0000 (21:56 +0000)]
[gn build] Port 6cfb4d46bae1
Jozef Lawrynowicz [Mon, 27 Sep 2021 21:55:32 +0000 (00:55 +0300)]
[llvm-readobj] Support dumping of MSP430 ELF attributes
The MSP430 ABI supports build attributes for specifying
the ISA, code model, data model and enum size in ELF object files.
Differential Revision: https://reviews.llvm.org/D107969
Jon Chesterfield [Mon, 27 Sep 2021 21:21:07 +0000 (22:21 +0100)]
[libomptarget][amdgpu] Follow on to D110513, empty kernarg pools are not fatal
Jon Chesterfield [Mon, 27 Sep 2021 20:48:29 +0000 (21:48 +0100)]
[libomptarget][amdgpu] Report zero devices if plugin construction fails, instead of segv
Anna Thomas [Mon, 27 Sep 2021 21:08:28 +0000 (17:08 -0400)]
Revert "[LoopPredication] Add testcase showing BPI computation. NFC"
This reverts commit 16a62d4f3dca189b0e0565c7ebcd83ddfcc67629.
Needs some update to check lines to fix bb failure.
Louis Dionne [Thu, 23 Sep 2021 16:47:24 +0000 (12:47 -0400)]
[libc++] Do not enable P1951 before C++23, since it's a breaking change
In reaction to the issues raised by Richard in https://llvm.org/D109066,
this commit does not apply P1951 as a DR in previous standard modes,
since it breaks valid code.
I do believe it should be applied as a DR, however ideally we'd get some
sort of statement from the Committee to this effect (and all implementations
would behave consistently). In the meantime, only implement P1951 starting
with C++23 -- we can always come back and apply it as a DR if that's what
the Committee says.
Differential Revision: https://reviews.llvm.org/D110347
Anna Thomas [Mon, 27 Sep 2021 20:52:09 +0000 (16:52 -0400)]
[LoopPredication] Add testcase showing BPI computation. NFC
Precommit testcase for D110438. Since we do not preserve BPI in loop
pass manager, we are forced to compute BPI every time loop predication is
invoked.
The patch referenced changes that behaviour by preserving lossy BPI for
loop passes.
Simon Pilgrim [Mon, 27 Sep 2021 20:42:08 +0000 (21:42 +0100)]
[X86] Add slow/fast pmulld test coverage to vector-mul.ll
Kostya Kortchinsky [Mon, 27 Sep 2021 19:31:59 +0000 (12:31 -0700)]
[gwp-asan] Initialize AllocatorVersionMagic at runtime
GWP-ASan's `AllocatorState` was recently extended with a
`AllocatorVersionMagic` structure required so that GWP-ASan bug reports
can be understood by tools at different versions.
On Fuchsia, this is included in the `scudo::Allocator` structure, and
by having non-zero initializers, this effectively moved the static
allocator structure from the `.bss` segment to the `.data` segment, thus
increasing (significantly) the size of the libc.
This CL proposes to initialize the structure with its magic numbers at
runtime, allowing for the allocator to go back into the `.bss` segment.
I will work on adding a test on the Scudo side to ensure that this type
of change gets detected early on. Additional work is also needed to
reduce the footprint of the (large) memory-tagging related structures
that are currently part of the allocator.
Differential Revision: https://reviews.llvm.org/D110575
Roman Lebedev [Mon, 27 Sep 2021 20:47:23 +0000 (23:47 +0300)]
[NFC][X86] Add 'gather' optsize/minsize test coverage
Florian Mayer [Tue, 14 Sep 2021 15:54:18 +0000 (16:54 +0100)]
[NFC] [PSI] explain encoding of PercentileCutoff.
Reviewed By: mtrofin, davidxl
Differential Revision: https://reviews.llvm.org/D109764
Fangrui Song [Mon, 27 Sep 2021 20:28:40 +0000 (13:28 -0700)]
[Driver] Remove confusing *-linux-android detection with non-android --target=
These values allow, for example, `--target=aarch64` and
`--target=aarch64-linux-gnu` to detect `aarch64-linux-android`. This is
confusing. Users should specify `--target=aarch64-linux-android` to get Android GCC
installation.
Reverts D53463.
Reviewed By: nickdesaulniers, danalbert
Differential Revision: https://reviews.llvm.org/D110379
Roman Lebedev [Mon, 27 Sep 2021 19:40:25 +0000 (22:40 +0300)]
[NFC][X86] Add test showing that legal `GATHER`'s are expanded on Znver3
modimo [Mon, 27 Sep 2021 19:24:28 +0000 (12:24 -0700)]
[ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.
This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=0`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build:
1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities.
2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.
Implementation-wise this adds the following summary function attributes:
1. noUnwind: function is noUnwind
2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions)
3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)
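A minimal Python sketch of how the three summary attributes above could drive a bottom-up noUnwind propagation. It is a simplified model, not the ThinLTO implementation: `FunctionSummary` and `propagate_no_unwind` are hypothetical, callees without a summary are treated as unknown, and recursion is handled conservatively rather than per SCC.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FunctionSummary:
    may_throw: bool = False          # contains a non-call throwing instruction
    has_unknown_call: bool = False   # e.g. an indirect call
    callees: List[str] = field(default_factory=list)


def propagate_no_unwind(summaries: Dict[str, FunctionSummary]) -> Dict[str, bool]:
    """Visit callees before callers and derive noUnwind per function."""
    result: Dict[str, bool] = {}
    in_progress: Set[str] = set()

    def visit(name: str) -> bool:
        if name in result:
            return result[name]
        summary = summaries.get(name)
        if summary is None or name in in_progress:
            return False  # unknown callee or recursion: stay conservative
        in_progress.add(name)
        no_unwind = (
            not summary.may_throw
            and not summary.has_unknown_call
            and all(visit(callee) for callee in summary.callees)
        )
        in_progress.discard(name)
        result[name] = no_unwind
        return no_unwind

    for name in summaries:
        visit(name)
    return result
```

A function is marked noUnwind only when every callee has already been resolved as noUnwind, which is why the traversal has to reach the leaves of the call graph first.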
Testing:
Clang self-build passes and 2nd stage build passes check-all
ninja check-all with newly added tests passing
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D36850
Tobias Gysi [Mon, 27 Sep 2021 19:20:56 +0000 (19:20 +0000)]
[mlir][linalg] Finer-grained padding control.
Adapt the signature of the PaddingValueComputationFunction callback to either return the padding value or failure to signal padding is not desired.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110572
Roman Lebedev [Mon, 27 Sep 2021 19:18:41 +0000 (22:18 +0300)]
[X86][Costmodel] Load/store i16 Stride=4 VF=32 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For this tuple, measuring becomes problematic since there's a lot of spilling going on,
but apparently all these memory ops do not affect worst-case estimate at all here.
For load we have:
https://godbolt.org/z/zP4hd8MT6 - for intels `Block RThroughput: =150.0`; for ryzens, `Block RThroughput: <=59`
So pick cost of `150`.
For store we have:
https://godbolt.org/z/vKb8zTK8E - for intels `Block RThroughput: =64.0`; for ryzens, `Block RThroughput: <=24.0`
So pick cost of `64`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110548
Roman Lebedev [Mon, 27 Sep 2021 19:18:40 +0000 (22:18 +0300)]
[X86][Costmodel] Load/store i16 Stride=4 VF=16 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/Wd9cKab83 - for intels `Block RThroughput: =75.0`; for ryzens, `Block RThroughput: <=29.5`
So pick cost of `75`. (note that `# 32-byte Reload` does not affect throughput there.)
For store we have:
https://godbolt.org/z/Wd9cKab83 - for intels `Block RThroughput: =32.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `32`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110543
Roman Lebedev [Mon, 27 Sep 2021 19:18:36 +0000 (22:18 +0300)]
[X86][Costmodel] Load/store i16 Stride=4 VF=8 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/dd8T5P471 - for intels `Block RThroughput: =33.0`; for ryzens, `Block RThroughput: <=14.5`
So pick cost of `33`.
For store we have:
https://godbolt.org/z/zPxcKWhn4 - for intels `Block RThroughput: =10.0`; for ryzens, `Block RThroughput: <=6.0`
So pick cost of `10`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110541
Roman Lebedev [Mon, 27 Sep 2021 19:18:32 +0000 (22:18 +0300)]
[X86][Costmodel] Load/store i16 Stride=4 VF=4 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/rnsf639Wh - for intels `Block RThroughput: =17.0`; for ryzens, `Block RThroughput: <=7.5`
So pick cost of `17`.
For store we have:
https://godbolt.org/z/565KKrcY6 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: =2.0`
So pick cost of `6`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110537
Roman Lebedev [Mon, 27 Sep 2021 19:18:27 +0000 (22:18 +0300)]
[X86][Costmodel] Load/store i16 Stride=4 VF=2 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/5EYc6r9nh - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `6`.
For store we have:
https://godbolt.org/z/z61e5d6GE - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0`
So pick cost of `2`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110536
Chris Bieneman [Mon, 27 Sep 2021 19:16:28 +0000 (14:16 -0500)]
Fixing docs build
I always forget that new line...
Chris Bieneman [Mon, 27 Sep 2021 16:59:55 +0000 (11:59 -0500)]
Implement #pragma clang final extension
This patch adds a new preprocessor extension ``#pragma clang final``
which enables warning on undefinition and re-definition of macros.
The intent of this warning is to extend beyond ``-Wmacro-redefined`` to
warn against any and all alterations to macros that are marked `final`.
This warning is part of the ``-Wpedantic-macros`` diagnostics group.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108567
Sanjay Patel [Mon, 27 Sep 2021 18:54:40 +0000 (14:54 -0400)]
[InstCombine] reduce code for shl-of-sub transform; NFC
Sanjay Patel [Mon, 27 Sep 2021 18:32:42 +0000 (14:32 -0400)]
[InstCombine] add tests for shl-of-sub; NFC
Aart Bik [Fri, 24 Sep 2021 20:36:52 +0000 (13:36 -0700)]
[mlir][sparse] sampled matrix multiplication fusion test
This integration test runs a fused and non-fused version of
sampled matrix multiplication. Both should eventually have the
same performance!
NOTE: relies on pending tensor.init fix!
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110444
Nico Weber [Mon, 27 Sep 2021 18:38:18 +0000 (14:38 -0400)]
Revert "[clangd] Refactor IncludeStructure: use File (unsigned) for most computations"
This reverts commit 0b1eff1bc5d004b1964bb9b1667e3efc034f3f62.
Breaks check-clangd on Windows, see comments on
https://reviews.llvm.org/D110386
Jon Chesterfield [Mon, 27 Sep 2021 18:27:00 +0000 (19:27 +0100)]
Revert "[openmp] Add addrspacecast to getOrCreateIdent"
This reverts commit 1a761e5b7b50dc08e0ff7f7aea65e1da29c5cd80.
Failed CI, albeit with a different failure mode to BZ51982
Jon Chesterfield [Mon, 27 Sep 2021 18:23:11 +0000 (19:23 +0100)]
[openmp] Add addrspacecast to getOrCreateIdent
Fixes 51982. Minor refactor to remove `return x = y` construct.
Test case derived from https://github.com/ROCm-Developer-Tools/aomp/\
blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting
parts while checking the assertion failure still occurred.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110556
Aart Bik [Fri, 24 Sep 2021 20:15:17 +0000 (13:15 -0700)]
[mlir][sparse] preserve zero-initialization for materializing buffers
This revision makes sure that when the output buffer materializes locally
(in contrast with the passing in of output tensors either in-place or not
in-place), the zero initialization assumption is preserved. This also adds
a bit more documentation on our sparse kernel assumption (viz. TACO
assumptions).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110442
Aaron Ballman [Mon, 27 Sep 2021 18:18:58 +0000 (14:18 -0400)]
Add a missing include to appease the build bots
Sanjay Patel [Mon, 27 Sep 2021 17:49:13 +0000 (13:49 -0400)]
[InstCombine] add use check to shl transform
This bug was introduced with the refactoring in:
9075edc89bc9
...but there were no tests to detect it.
Sanjay Patel [Mon, 27 Sep 2021 17:42:55 +0000 (13:42 -0400)]
[InstCombine] add tests for opposing shifts separated by trunc; NFC
Jameson Nash [Thu, 22 Jul 2021 23:42:23 +0000 (19:42 -0400)]
Bad SLPVectorization shufflevector replacement, resulting in write to wrong memory location
We see that it might otherwise do:
%10 = getelementptr {}**, <2 x {}***> %9, <2 x i32> <i32 10, i32 4>
%11 = bitcast <2 x {}***> %10 to <2 x i64*>
...
%27 = extractelement <2 x i64*> %11, i32 0
%28 = bitcast i64* %27 to <2 x i64>*
store <2 x i64> %22, <2 x i64>* %28, align 4, !tbaa !2
Which is an out-of-bounds store (the extractelement got offset 10
instead of offset 4 as intended). With the fix, we correctly generate
extractelement for i32 1 and generate correct code.
Differential Revision: https://reviews.llvm.org/D106613
Carlos Galvez [Mon, 27 Sep 2021 18:02:53 +0000 (14:02 -0400)]
Fix bug in readability-uppercase-literal-suffix
Fixes https://bugs.llvm.org/show_bug.cgi?id=51790. The check triggers
incorrectly with non-type template parameters.
A bisect determined that the bug was introduced here:
https://github.com/llvm/llvm-project/commit/ea2225a10be986d226e041d20d36dff17e78daed
Unfortunately that patch can no longer be reverted on top of the main
branch, so add a fix instead. Add a unit test to avoid regression in
the future.
peter klausler [Thu, 16 Sep 2021 17:03:45 +0000 (10:03 -0700)]
[flang] Catch branching into FORALL/WHERE constructs
Enforce constraints C1034 & C1038, which disallow the use
of otherwise valid statements as branch targets when they
appear in FORALL &/or WHERE constructs. (And make the
diagnostic message somewhat more user-friendly.)
Differential Revision: https://reviews.llvm.org/D109936
Praveen Velliengiri [Mon, 27 Sep 2021 17:49:49 +0000 (11:49 -0600)]
[AMDGPU] Change ASAN init/fini kernels linkage to external.
The HSA runtime fails to find the symbols for the Init and Fini kernels
because they are marked with internal linkage. Change the linkage to
external to fix those errors.
Differential Revision: https://reviews.llvm.org/D110054
Sumesh Udayakumaran [Sat, 25 Sep 2021 22:46:03 +0000 (01:46 +0300)]
[mlir] Mode for explicitly controlling the fusion kind
Add a mode option that selects the fusion kind explicitly: the default fusion behavior that runs today, producer-consumer fusion only, or sibling fusion only. This should also help minimize the compile time of the fusion tests.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D110102
Quinn Pham [Thu, 16 Sep 2021 19:00:01 +0000 (14:00 -0500)]
[PowerPC] Fix td pattern for P10 VSLDBI and VSRDBI
This patch fixes the pattern for the P10 instructions Vector Shift Left
Double by Bit Immediate VN-form and Vector Shift Right Double by Bit
Immediate VN-form. The third argument should be a target constant (`timm`)
instead of an `i32` because an immediate is expected.
Reviewed By: lei
Differential Revision: https://reviews.llvm.org/D109920
Yaxun (Sam) Liu [Thu, 23 Sep 2021 03:45:27 +0000 (23:45 -0400)]
[HIP] Fix linking of asanrt.bc
HIP currently uses -mlink-builtin-bitcode to link all bitcode libraries, which
changes the linkage of functions to be internal once they are linked in. This
works for common bitcode libraries since these functions are not intended
to be exposed for external callers.
However, the functions in the sanitizer bitcode library are intended to be
called by instructions generated by the sanitizer pass. If their linkage is
changed to internal, their parameters may be altered by optimizations before
the sanitizer pass, which renders them unusable by the sanitizer pass.
To fix this issue, the HIP toolchain links the sanitizer bitcode library with
-mlink-bitcode-file, which does not change the linkage.
A struct BitCodeLibraryInfo is introduced in ToolChain as a generic
approach to pass the bitcode library information between ToolChain and Tool.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D110304
William S. Moses [Mon, 27 Sep 2021 16:55:24 +0000 (12:55 -0400)]
[MLIR][LLVM] Add error if using incorrect attribute type for specifying LLVM linkage
Address post-commit review in https://reviews.llvm.org/D108524 to add
appropriate diagnostics.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110566
peter klausler [Wed, 15 Sep 2021 15:28:48 +0000 (08:28 -0700)]
[flang] Enforce constraint: defined ass't in WHERE must be elemental
A defined assignment subroutine invoked in the context of a WHERE
statement or construct must necessarily be elemental (C1032).
Differential Revision: https://reviews.llvm.org/D109932
Craig Topper [Mon, 27 Sep 2021 16:45:30 +0000 (09:45 -0700)]
[RISCV] Fold store of vmv.x.s to a vse with VL=1.
This can avoid a loss of decoupling with the scalar unit on cores
with decoupled scalar and vector units.
We should support FP too, but those use extract_element and not a
custom ISD node so it is a little different. I also left a FIXME
in the test for i64 extract and store on RV32.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D109482
Fangrui Song [Mon, 27 Sep 2021 16:50:41 +0000 (09:50 -0700)]
[ELF] Support symbol names with space in linker script expressions
Fix PR51961
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D110490
Kazu Hirata [Mon, 27 Sep 2021 16:49:32 +0000 (09:49 -0700)]
[InstCombine] Fix an "unused variable" warning
Bixia Zheng [Sat, 25 Sep 2021 06:19:07 +0000 (23:19 -0700)]
Implement the conversion from sparse constant to sparse tensors.
The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format.
Add tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110373
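The three steps described in the sparse constant conversion above (split, fill, convert) can be sketched in plain Python. This is a rough model under stated assumptions, not the MLIR lowering itself: all function names are hypothetical, a dictionary stands in for the coordinate-format tensor, and CSR is picked arbitrarily as an example destination format.

```python
# Hypothetical illustration; none of these names come from the MLIR patch.

def split_sparse_constant(coo):
    """Split [(index_tuple, value), ...] into parallel index and value lists."""
    indices = [list(idx) for idx, _ in coo]
    values = [val for _, val in coo]
    return indices, values

def fill_coordinate_tensor(indices, values):
    """Loop over the entries and insert each one into a coordinate-format store."""
    tensor = {}
    for idx, val in zip(indices, values):
        tensor[tuple(idx)] = val
    return tensor

def to_csr(tensor, num_rows):
    """Convert a 2-D coordinate-format store to CSR (assumed destination format)."""
    pointers = [0] * (num_rows + 1)
    cols, vals = [], []
    for (row, col), val in sorted(tensor.items()):
        pointers[row + 1] += 1          # count entries per row
        cols.append(col)
        vals.append(val)
    for r in range(num_rows):
        pointers[r + 1] += pointers[r]  # prefix sum gives row start offsets
    return pointers, cols, vals

# Example sparse constant: three nonzeros of a 3x2 matrix, in arbitrary order.
coo = [((0, 1), 2.0), ((2, 0), 3.0), ((0, 0), 1.0)]
indices, values = split_sparse_constant(coo)
tensor = fill_coordinate_tensor(indices, values)
pointers, cols, vals = to_csr(tensor, num_rows=3)
```

Keeping the coordinate-format fill separate from the final conversion mirrors the description above: the loop only needs the index and value tensors, and the destination format is handled in a later step.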
@vladaindjic [Mon, 27 Sep 2021 16:44:46 +0000 (19:44 +0300)]
[OpenMP] libomp: Usage of TASK_TIED constant inside kmp_gsupport.cpp
This minor refactoring introduces the TASK_TIED constant in
kmp_gsupport.cpp as a replacement for the literal value 1.
The constant is now used in both kmp_tasking.cpp and
kmp_gsupport.cpp.
Differential Revision: https://reviews.llvm.org/D110441
Craig Topper [Mon, 27 Sep 2021 16:37:04 +0000 (09:37 -0700)]
[RISCV] Improve support for forming widening multiplies when one input is a scalar splat.
If one input of a fixed vector multiply is a sign/zero extend and
the other operand is a splat of a scalar, we can use a widening
multiply if the scalar value has sufficient sign/zero bits.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D110028
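The "sufficient sign/zero bits" condition for the splatted scalar in the widening multiply change above can be illustrated on concrete integers. This is a sketch under stated assumptions: the function names are hypothetical, and the real lowering reasons about DAG nodes rather than Python integers. The idea modeled here is only that the scalar must be representable in the narrow element type under the same kind of extension as the other operand.

```python
# Sketch only; function names are hypothetical and not taken from the patch.

def fits_sign_extended(value, narrow_bits):
    """True if value is representable as a signed narrow_bits-bit integer."""
    lo = -(1 << (narrow_bits - 1))
    hi = (1 << (narrow_bits - 1)) - 1
    return lo <= value <= hi

def fits_zero_extended(value, narrow_bits):
    """True if value is representable as an unsigned narrow_bits-bit integer."""
    return 0 <= value < (1 << narrow_bits)

def can_widen_multiply(splat_value, narrow_bits, other_is_sign_extended):
    """Check whether the splatted scalar can be narrowed to match the extend."""
    if other_is_sign_extended:
        return fits_sign_extended(splat_value, narrow_bits)
    return fits_zero_extended(splat_value, narrow_bits)
```

For example, with 8-bit narrow elements the scalar 200 qualifies when the other operand is zero-extended but not when it is sign-extended, because 200 does not fit in a signed 8-bit value.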
Daniil Fukalov [Mon, 27 Sep 2021 16:23:47 +0000 (19:23 +0300)]
[NFC][AMDGPU] Update cost model tests:
1. Convert to generated tests.
2. Add code-size cases in a few places.
Sanjay Patel [Mon, 27 Sep 2021 16:06:40 +0000 (12:06 -0400)]
[InstCombine] move shl-only folds out from under commonShiftTransforms(); NFCI
No functional change is intended, but calling transforms that require
'shl' only from visitShl() should make things slightly clearer and
more efficient. Further cleanup is possible.