Florian Hahn [Tue, 19 Jul 2022 10:23:24 +0000 (11:23 +0100)]
[LV] Remove unnecessary cast in widenCallInstruction. (NFC)
Simon Pilgrim [Tue, 19 Jul 2022 10:13:31 +0000 (11:13 +0100)]
Fix signed/unsigned comparison mismatch warning
Simon Pilgrim [Tue, 19 Jul 2022 09:58:27 +0000 (10:58 +0100)]
[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.
This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115
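The relaxed condition can be illustrated with a small brute-force check. This is only a sketch: it assumes a logical right shift on 8-bit values, and the helper names (`strict_match`, `relaxed_match`, `original`, `folded`) are made up for illustration and are not taken from the patch.

```python
WIDTH = 8                      # illustrative bit width, not taken from the patch
ALL_ONES = (1 << WIDTH) - 1

def strict_match(shift_c, xor_c):
    """Old condition: XorC must be exactly the shifted all-bits mask."""
    return xor_c == (ALL_ONES >> shift_c)

def relaxed_match(shift_c, xor_c, demanded):
    """Relaxed condition: XorC only has to agree on the demanded bits."""
    return (xor_c & demanded) == ((ALL_ONES >> shift_c) & demanded)

def original(x, shift_c, xor_c):
    """xor (X >> ShiftC), XorC"""
    return (x >> shift_c) ^ xor_c

def folded(x, shift_c):
    """(not X) >> ShiftC"""
    return ((~x) & ALL_ONES) >> shift_c

# XorC = 0b111 is not the full shifted mask (0b11111), but only the low
# three bits are demanded, so the fold is still valid on those bits.
shift_c, xor_c, demanded = 3, 0b00000111, 0b00000111
agree = all(
    (original(x, shift_c, xor_c) & demanded) == (folded(x, shift_c) & demanded)
    for x in range(1 << WIDTH)
)
```

Here the strict check rejects the pattern while the relaxed check accepts it, and both forms agree on every demanded bit for all 256 inputs.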
Alive2: https://alive2.llvm.org/ce/z/fl7T7K
Differential Revision: https://reviews.llvm.org/D129933
Abinav Puthan Purayil [Wed, 13 Jul 2022 06:40:02 +0000 (12:10 +0530)]
[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access
AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if
FuncInfo::InstCost outweighs MemInstCost even if we have a basic block
with relatively high global memory access. GCNSchedStrategy could revert
optimal scheduling in favour of occupancy which seems to degrade
performance for some kernels. This change introduces the
HasDenseGlobalMemAcc metric in the heuristic that makes the analysis
more conservative in these cases.
This fixes SWDEV-334259/SWDEV-343932
Differential Revision: https://reviews.llvm.org/D129759
Abinav Puthan Purayil [Wed, 13 Jul 2022 18:04:57 +0000 (23:34 +0530)]
[AMDGPU] Pre-commit tests for D129759
Differential Revision: https://reviews.llvm.org/D129760
David Spickett [Thu, 14 Jul 2022 09:36:03 +0000 (09:36 +0000)]
[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping
Fixes https://github.com/llvm/llvm-project/issues/56484
H registers are 16 bit views of AArch64's Neon registers and
B are the 8 bit views.
msvc does not support 16 bit float (some mention in DirectX but I
couldn't find a way to get to it) so for lack of a better reference
I'm using:
https://github.com/MicrosoftEdge/JsDbg/blob/85c9b41b33bb8f3496dbe400d912c32bb7cc496b/server/references/dia/include/cvconst.h
(the other microsoft-pdb repo is no longer up to date)
Luckily clang does support fp16 so a test is added for that.
There is no 8 bit float type so I had to get creative with the
test case. We're not testing for correct debug info here just
that we can select the B register and not crash in the process.
For FPCR it's never going to be passed as an argument so I've
not added a test for it. It is included to keep our list looking
the same as the reference.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D129774
Alexey Lapshin [Tue, 19 Jul 2022 09:16:24 +0000 (12:16 +0300)]
Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF."
This reverts commit e2147c26bd1522ad67a98836fbe94933eab869bb.
Markus Böck [Tue, 19 Jul 2022 08:58:25 +0000 (10:58 +0200)]
[mlir] Ignore effects on allocated results when checking whether the op is trivially dead.
Currently, this is only special-cased for Allocation effects, but any effects on results allocated by the operation may be ignored when checking whether the op may be removed, since none of them can be observed if the result is unused.
A use case for this is for IRs for languages which always initialize on allocation. To correctly model such operations, a Write as well as an Allocation effect should be placed on the result. This would prevent the Op from being deleted if unused however. This patch fixes that issue.
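A toy model of the rule, as a sketch only: the `Effect` class and `is_trivially_dead` function are hypothetical and do not mirror the MLIR API. The sketch also conservatively treats every effect that is not ignored as keeping the op alive, which is a simplification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Effect:
    kind: str    # e.g. "allocate", "write", "read" (illustrative labels)
    target: str  # name of the value the effect applies to

def is_trivially_dead(has_uses, results, effects):
    """Return True if an unused op can be removed in this toy model."""
    if has_uses:
        return False
    # Results that the op itself allocates.
    allocated = {
        e.target for e in effects
        if e.kind == "allocate" and e.target in results
    }
    # Effects on freshly allocated results cannot be observed when the
    # result is unused, so they are ignored. Anything else blocks removal.
    return all(e.target in allocated for e in effects)

# An "initialize on allocation" op: Allocate plus Write on its own result.
init_on_alloc = [Effect("allocate", "%r"), Effect("write", "%r")]
# An op that also writes to a value it did not allocate.
writes_elsewhere = init_on_alloc + [Effect("write", "%other")]
```

With this model the initialize-on-allocation op is removable when unused, while the op that writes to an unrelated value is not.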
Differential Revision: https://reviews.llvm.org/D129854
Max Kazantsev [Tue, 19 Jul 2022 08:22:44 +0000 (15:22 +0700)]
[LoopSimplifyCFG] Prevent use-def dominance breach by handling dead exits. PR56243
One of the transforms in LoopSimplifyCFG demands that the LCSSA form is
truly maintained for all values, tokens included, otherwise it may end up creating
a use that is not dominated by def (and Phi creation for tokens is impossible).
Detect this situation and prevent transform for it early.
Differential Revision: https://reviews.llvm.org/D129984
Reviewed By: efriedma
Jason Molenda [Tue, 19 Jul 2022 08:40:12 +0000 (01:40 -0700)]
Update docs to note lzfse open source implementation
Alexey Lapshin [Sun, 10 Jul 2022 17:11:55 +0000 (20:11 +0300)]
[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil is a tool for processing debug info (DWARF) located in built binary files to improve debug info quality and reduce debug info size. The patch currently implements a smaller set of command-line options than the proposal:
```
./llvm-dwarfutil [options] <input file> <output file>
--garbage-collection Do garbage collection for debug info(default)
-j <value> Alias for --num-threads
--no-garbage-collection Don`t do garbage collection for debug info
--no-odr-deduplication Don`t do ODR deduplication for debug types
--no-odr Alias for --no-odr-deduplication
--no-separate-debug-file
Create single output file, containing debug tables(default)
--num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
--odr-deduplication Do ODR deduplication for debug types(default)
--odr Alias for --odr-deduplication
--separate-debug-file Create two output files: file w/o debug tables and file with debug tables
--tombstone [bfd,maxpc,exec,universal]
Tombstone value used as a marker of invalid address(default: universal)
=bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
=maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
=exec - Match with address ranges of executable sections
=universal - Both: bfd and maxpc
```
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D86539
Cullen Rhodes [Tue, 19 Jul 2022 07:51:28 +0000 (07:51 +0000)]
[AArch64] Add patterns to fold zext(cmpeq(x, splat(0)))
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D129626
Xiang1 Zhang [Fri, 15 Jul 2022 03:18:33 +0000 (11:18 +0800)]
[X86] Add 64 bit implement for __SSC_MARK
Reviewed By: craig.topper, pengfei.wang, jinsong
Differential Revision: https://reviews.llvm.org/D129826
Nikita Popov [Fri, 15 Jul 2022 10:35:08 +0000 (12:35 +0200)]
[LoopInfo] Allow cloning of callbr
After D129288, callbr is safe to clone without special handling.
This permits optimizations like loop unroll and loop unswitch on
loops containing callbrs.
Fixes https://github.com/llvm/llvm-project/issues/41834.
Differential Revision: https://reviews.llvm.org/D129993
Haojian Wu [Fri, 15 Jul 2022 14:15:31 +0000 (16:15 +0200)]
[pseudo] Implement a guard to determine function declarator.
This eliminates some simple-declaration/function-definition false
parses.
- implement a function to determine whether a declarator ForestNode is a
function declarator;
- extend the standard declarator to two guarded function-declarator and
non-function-declarator nonterminals;
Differential Revision: https://reviews.llvm.org/D129222
Rosie Sumpter [Tue, 12 Jul 2022 07:40:59 +0000 (08:40 +0100)]
[AArch64][SVE] Fold fadda(ptrue, x, select(mask, y, -0.0)) into fadda(mask, x, y)
This patch adds an SVE pattern to recognize the use of a select with an
fadda in the form fadda(ptrue, x, select(mask, y, -0.0)). In this case
the select can be folded away, with the select mask used as the
predicate for fadda. This improves the codegen when vectorizing loops
with ordered fp reductions.
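Why the fold is sound can be sketched with a scalar model of an ordered reduction. The `fadda` and `select` functions below are plain Python stand-ins written for illustration, not the SVE intrinsics. The key point is that -0.0 is the identity for floating-point addition, so inactive lanes replaced by -0.0 contribute nothing, including to the sign of a zero result.

```python
import math

def fadda(pred, init, vec):
    """Ordered (strictly left-to-right) add reduction over active lanes."""
    acc = init
    for active, value in zip(pred, vec):
        if active:
            acc = acc + value
    return acc

def select(mask, vec, neutral):
    """Per-lane select: keep vec[i] where mask[i] is set, else neutral."""
    return [v if m else neutral for m, v in zip(mask, vec)]

def is_negative_zero(v):
    return v == 0.0 and math.copysign(1.0, v) < 0

mask = [True, False, True, False]
y = [1.5, 2.5, -3.0, 4.0]
x = 0.25
ptrue = [True] * len(y)

unfolded = fadda(ptrue, x, select(mask, y, -0.0))  # form before the fold
folded = fadda(mask, x, y)                          # form after the fold
```

Using +0.0 instead of -0.0 as the inactive value would not be equivalent: starting from -0.0 with no active lanes, the unfolded form would produce +0.0 while the folded form keeps -0.0.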
Differential Revision: https://reviews.llvm.org/D129623
Matthias Springer [Tue, 19 Jul 2022 07:20:32 +0000 (09:20 +0200)]
[mlir][sparse][NFC] Update remaining test cases
No more to_memref, memref.alloc or memref.dealloc when possible.
Differential Revision: https://reviews.llvm.org/D130023
Matthias Springer [Tue, 19 Jul 2022 07:13:53 +0000 (09:13 +0200)]
[mlir][bufferization][NFC] Move sparse_tensor.release to bufferization dialect
This op used to belong to the sparse dialect, but there are use cases for dense bufferization as well. (E.g., when a tensor alloc is returned from a function and should be deallocated at the call site.) This change moves the op to the bufferization dialect, which now has an `alloc_tensor` and a `dealloc_tensor` op.
Differential Revision: https://reviews.llvm.org/D129985
Nicolai Hähnle [Tue, 19 Jul 2022 07:10:27 +0000 (09:10 +0200)]
Revert change to clang/test/CodeGen/arm_acle.c
For some reason, update_cc_test_checks.py produced a failing test.
Partial revert of 301011fa6078b4f16bd3fc6158d9c6fddad7e118
serge-sans-paille [Tue, 12 Jul 2022 20:05:57 +0000 (22:05 +0200)]
[sanitizer] Don't call dlerror() after swift_demangle lookup through dlsym
Because the call to `dlerror()` may actually want to print something, which turns into a deadlock
as showcased in #49223.
Instead rely on further call to dlsym to clear `dlerror` internal state if they
need to check the return status.
Differential Revision: https://reviews.llvm.org/D128992
serge-sans-paille [Tue, 12 Jul 2022 20:57:16 +0000 (22:57 +0200)]
[llvm] Fix forward declaration in Support/JSON.h
Some methods of json::Array require json::Value to be completely defined, so
they can't be defined in-class. Fix that by defining them out of class.
Fix #55780
Bing1 Yu [Tue, 19 Jul 2022 06:51:58 +0000 (14:51 +0800)]
[X86][NFC] avx512-f16c-v16f16-fadd.ll avx512-skx-v32f16-fadd.ll - add nounwind to prevent cfi noise on tests
Nicolai Hähnle [Tue, 19 Jul 2022 06:45:31 +0000 (08:45 +0200)]
Rerun ./utils/update_cc_test.py on a bunch of tests
Due to update script changes; this reduces the size of a later
"real" diff.
Max Kazantsev [Tue, 19 Jul 2022 05:50:43 +0000 (12:50 +0700)]
[NFC] Introduce API to detect tokens penetrating LCSSA form
Following discussion in PR56243, we need to somehow detect the situation
when token values penetrate LCSSA form for transforms that require that
it is maintained by all values (for example, to sustain use-def dominance
invariants). This patch introduces a parameter to LCSSA checkers to control
their ignorance about tokens.
Differential Revision: https://reviews.llvm.org/D129983
Reviewed By: efriedma
LLVM GN Syncbot [Tue, 19 Jul 2022 06:42:58 +0000 (06:42 +0000)]
[gn build] Port 8ed702b83f20
Max Kazantsev [Tue, 19 Jul 2022 06:26:31 +0000 (13:26 +0700)]
Revert "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
This reverts commit 58dfaaaace4ea75ab3588a6e738f2cf58ebf77c2.
Massive AARCH test failures in buildbot.
Bing1 Yu [Tue, 19 Jul 2022 06:16:30 +0000 (14:16 +0800)]
[X86] Promote v32f16's fadd into v32f32's fadd when it is avx512 without avx512fp16
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D130059
Shraiysh Vaishay [Mon, 18 Jul 2022 09:55:54 +0000 (15:25 +0530)]
[OpenMP][IRBuilder] Add support for taskgroup
This patch adds support for generating taskgroup construct.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D128203
Jacques Pienaar [Tue, 19 Jul 2022 05:18:52 +0000 (22:18 -0700)]
[mlir] Add refineReturnTypes to InferTypeOpInterface
The refineReturnTypes method shares the same parameters as inferReturnTypes
but gets passed in the return types of the op if known that can be used
during refinement passes or for more op specific error reporting.
Currently the error reporting on failure is generic and doesn't allow
for specializing the returned result based on failure, with this change
what would previously have been a separate trait with specialized
verification can just be handled as part of inference rather than
duplicated.
refineReturnTypes behaves like inferReturnTypes if no result types are fed in,
while the current verification is recast as the default implementation for
refineReturnTypes with it calling inferReturnTypes (and so the default type
verification now goes through refine and allows for more op specific inference
mismatch errors).
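The relationship between the two hooks can be sketched as a toy model. The function names mirror the interface methods, but the signatures, the string type names, and the example inference rule are assumptions made for illustration and are not the MLIR C++ API.

```python
def infer_return_types(operand_types):
    """Hypothetical op rule: the result has the type of the first operand."""
    return [operand_types[0]]

def refine_return_types(operand_types, return_types):
    """Toy default implementation built on top of infer_return_types."""
    inferred = infer_return_types(operand_types)
    if not return_types:
        # No result types were fed in: behave exactly like inference.
        return inferred
    if inferred != return_types:
        # This is the spot where an op could report a more specific error.
        raise TypeError(
            f"inferred {inferred} is incompatible with given {return_types}"
        )
    return return_types
```

An op that overrides the refine hook could replace the generic mismatch error with one that explains which operand caused the disagreement.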
Differential Revision: https://reviews.llvm.org/D129955
Carlos Alberto Enciso [Tue, 19 Jul 2022 04:55:14 +0000 (05:55 +0100)]
Update the Windows packaging script.
As discussed on:
https://discourse.llvm.org/t/build-llvm-release-bat-script-options/63146/6
- Refactor the build/test steps into functions.
- Exit the script if the build directory already exists.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D129559
Nathan James [Tue, 19 Jul 2022 04:21:09 +0000 (05:21 +0100)]
[clang-tidy] Remove unnecessary code from ReadabilityModuleTest
D56303 added testing code that was then made redundant by the changes in D125026. However this code wasn't completely removed in the latter patch.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130026
Konstantin Varlamov [Tue, 19 Jul 2022 04:05:51 +0000 (21:05 -0700)]
[libc++][ranges] Implement `ranges::{,stable_}partition`.
Differential Revision: https://reviews.llvm.org/D129624
Lang Hames [Tue, 19 Jul 2022 03:36:17 +0000 (20:36 -0700)]
[ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>.
Avoids a zero-length memcpy from a null src, which caused errors on some of the
sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather
than pointing to a zero length range in the middle of the input buffer).
jacquesguan [Thu, 7 Jul 2022 08:48:55 +0000 (16:48 +0800)]
[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat.
This revision adds support for scalarizing a binary operation of two scalable splat vectors.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122791
jacquesguan [Thu, 7 Jul 2022 08:57:59 +0000 (16:57 +0800)]
[RISCV][test] Precommit test for D122791.
Differential Revision: https://reviews.llvm.org/D123362
Kazushi (Jam) Marukawa [Sat, 2 Jul 2022 04:51:20 +0000 (13:51 +0900)]
[VE] Support load/store/spill of vector mask registers
Support load/store/spill of vector mask registers and add regression
tests.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D129415
zhongyunde [Tue, 19 Jul 2022 00:59:26 +0000 (08:59 +0800)]
[AArch64][NFC] Set true for default of subfeature is more readable
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D129960
Jim Ingham [Tue, 19 Jul 2022 00:38:43 +0000 (17:38 -0700)]
Revert "Make hit point counts reliable for architectures that stop before evaluation."
This reverts commit 5778ada8e54edb2bc2869505b88a959d1915c02f.
The watchpoint tests all stall on aarch64-ubuntu bots. Reverting till I can
get my hands on a system to test this out.
Jim Ingham [Tue, 19 Jul 2022 00:37:13 +0000 (17:37 -0700)]
Revert "This is a followup to https://reviews.llvm.org/D129814"
This reverts commit 555ae5b8f5aa93ab090af853a8b7a83f815b3f20.
Apparently, there's something different about how Linux ARM handles watchpoints,
as all the watchpoint tests seem to stall on the Ubuntu aarch64 bots.
Reverting till I can get my hands on a linux system and see what is
wrong.
ksyx [Sun, 26 Jun 2022 01:36:02 +0000 (21:36 -0400)]
[RISCV][Clang] Add support for Zmmul extension
This patch implements the recently ratified Zmmul extension, a subextension
of M (Integer Multiplication and Division) consisting only of its
multiplication part.
Differential Revision: https://reviews.llvm.org/D103313
Reviewed By: craig.topper, jrtc27, asb
Argyrios Kyrtzidis [Mon, 18 Jul 2022 23:53:16 +0000 (16:53 -0700)]
[unittests/Tooling/DependencyScannerTest] Add a target triple for `ScanDepsWithFS` test
This should fix the `clang-ppc64-aix` builder.
Rahman Lavaee [Sat, 16 Jul 2022 07:48:50 +0000 (00:48 -0700)]
[llvm-objdump] Support --symbolize-operands when there is a single SHT_LLVM_BB_ADDR_MAP section for all text sections
When linking, using `-Wl,-z,keep-text-section-prefix` results in multiple text sections while all `SHT_LLVM_BB_ADDR_MAP` sections are linked into a single one.
In such case, we should not read the corresponding section for each text section, and instead read all `SHT_LLVM_BB_ADDR_MAP` sections before disassembly.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D129924
Jim Ingham [Mon, 18 Jul 2022 23:21:51 +0000 (16:21 -0700)]
This is a followup to https://reviews.llvm.org/D129814
That was causing hit counts to be double-counted on x86_64 Linux.
It looks like StopInfoWatchpoint::ShouldStopSynchronous gets called
twice for a given stop on Linux (not on Darwin). I had taken out the
"have I been called already" check when I reworked this part of the
code because it didn't seem necessary. Putting that back in because
it looks like it is on some systems.
Ellis Hoag [Mon, 18 Jul 2022 22:09:11 +0000 (15:09 -0700)]
[InstrProf] Allow CSIRPGO function entry coverage
The flag `-fcs-profile-generate` for enabling CSIRPGO moves the pass
`pgo-instrumentation` after inlining. Function entry coverage works fine
with this change, so remove the assert. I had originally left this
assert in because I had not tested this at the time.
Reviewed By: davidxl, MaskRay
Differential Revision: https://reviews.llvm.org/D129407
Jim Ingham [Mon, 18 Jul 2022 21:47:35 +0000 (14:47 -0700)]
When the module path for `command script import` is invalid, echo the path.
We were just emitting "invalid module" w/o saying which module. That's
not particularly helpful.
Differential Revision: https://reviews.llvm.org/D129338
Jim Ingham [Wed, 13 Jul 2022 01:34:24 +0000 (18:34 -0700)]
Make hit point counts reliable for architectures that stop before evaluation.
Since we want to present the "new & old" values for watchpoint hits, on architectures,
including the ARM family, that stop before the triggering instruction is run, we need
to single step over the instruction before stopping for realz. This was incorrectly
done directly in the StopInfoWatchpoint::ShouldStop. That causes problems if more than
one thread stops "for a reason" at the same time as the watchpoint, since the other actions
didn't expect the process to make progress in this part of the execution control machinery.
The correct way to do this is to schedule the step over using ThreadPlans, and then to restore
the stop info after that plan stops, so that the rest of the stop info actions can happen when
all the other threads have handled their immediate actions as well.
Differential Revision: https://reviews.llvm.org/D129814
Matt Arsenault [Fri, 24 Jun 2022 16:09:34 +0000 (12:09 -0400)]
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
Michael Jones [Mon, 18 Jul 2022 18:32:04 +0000 (11:32 -0700)]
[libc] fix strtofloatingpoint on rare edge case
Currently, there are two string parsers that can be used in a call to
strtofloatingpoint. There is the main parser used by Clinger's fast path
and Eisel-Lemire, and the backup parser used by Simple Decimal
Conversion. There was a bug in the backup parser where if the number had
more than 800 digits (the size of the SDC buffer) before the decimal
point, it would just ignore the digits after the 800th and not count
them into the exponent. This patch fixes that issue and adds regression
tests.
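The fixed behavior can be sketched with a simplified digit collector. This is not the libc code: `parse_decimal` and its `max_digits` parameter are hypothetical, the input is assumed to contain only decimal digits and at most one '.', and leading-zero handling is left out.

```python
def parse_decimal(text, max_digits=800):
    """Collect up to max_digits digits and a base-10 exponent.

    The parsed value is approximately int(digits) * 10**exponent.
    """
    digits = []
    exponent = 0
    seen_point = False
    for ch in text:
        if ch == ".":
            seen_point = True
            continue
        if len(digits) < max_digits:
            digits.append(int(ch))
            if seen_point:
                exponent -= 1  # a stored fractional digit scales down
        elif not seen_point:
            # Buffer is full, but this digit is still before the decimal
            # point, so it must be counted into the exponent.
            exponent += 1
        # Digits after the point that do not fit are simply dropped.
    return digits, exponent
```

For example, with a three-digit buffer, "12345.67" yields the digits 1, 2, 3 and an exponent of 2, that is 123 * 10**2. Dropping the `elif` branch reproduces the bug described above, where the overflowing digits are ignored and the magnitude is lost.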
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D130032
zr33 [Mon, 18 Jul 2022 21:20:22 +0000 (14:20 -0700)]
[BOLT][DWARF] Fix incorrect DW_AT_type offset for unittest
Some unit tests have incorrect DW_AT_type offsets because they were crafted manually; fix them to use the correct offsets.
Reviewed By: Amir, ayermolo
Differential Revision: https://reviews.llvm.org/D129828
zr33 [Mon, 18 Jul 2022 21:03:40 +0000 (14:03 -0700)]
[BOLT][DWARF] Add Unit test for DW_AT_high_pc [DW_FORM_addr]
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D127613
Sam McCall [Mon, 18 Jul 2022 20:38:24 +0000 (22:38 +0200)]
[pseudo] Add guards for module contextual keywords
Martin Storsjö [Thu, 14 Jul 2022 19:35:50 +0000 (22:35 +0300)]
[clang-tidy] Reduce the dependencies for the "make-confusable-table" tool
When cross compiling llvm, a separate recursive native cmake build
is generated, for building the tools that generate code (unless they're
provided externally by the caller).
This reduces the number of build steps for that native build from
1000+ steps to 162.
This matches how the clang-pseudo-gen tool is set up in
clang-tools-extra/pseudo/gen/CMakeLists.txt.
Differential Revision: https://reviews.llvm.org/D129797
Björn Schäpers [Sat, 16 Jul 2022 21:46:18 +0000 (23:46 +0200)]
[clang-format] Mark constexpr lambdas as lambda
Otherwise the brace was detected as a function brace, not wrong per se,
but when directly calling the lambda the calling parens were put on the
next line.
Differential Revision: https://reviews.llvm.org/D129946
Björn Schäpers [Wed, 13 Jul 2022 10:38:38 +0000 (12:38 +0200)]
[clang-format] Indent TT_CtorInitializerColon after requires clauses
Fixes https://github.com/llvm/llvm-project/issues/56215
Differential Revision: https://reviews.llvm.org/D129942
Björn Schäpers [Mon, 4 Jul 2022 08:53:34 +0000 (10:53 +0200)]
[clang-format] Fix misannotation of colon in presence of requires clause
For clauses without parentheses it was annotated as TT_InheritanceColon.
Relates to https://github.com/llvm/llvm-project/issues/56215
Differential Revision: https://reviews.llvm.org/D129940
Stanislav Mekhanoshin [Fri, 15 Jul 2022 22:16:04 +0000 (15:16 -0700)]
[AMDGPU] Support for gfx940 fp8 smfmac
Differential Revision: https://reviews.llvm.org/D129908
Stanislav Mekhanoshin [Fri, 15 Jul 2022 21:45:19 +0000 (14:45 -0700)]
[AMDGPU] Support for gfx940 fp8 mfma
Differential Revision: https://reviews.llvm.org/D129906
Stanislav Mekhanoshin [Fri, 15 Jul 2022 20:20:08 +0000 (13:20 -0700)]
[AMDGPU] Support for gfx940 fp8 conversions
Differential Revision: https://reviews.llvm.org/D129902
Florian Hahn [Mon, 18 Jul 2022 18:41:48 +0000 (19:41 +0100)]
[LV] Sink module variable and use State to set it in widenCall. (NFC)
Limits the lifetime of the variable and makes it independent of
CallInst.
Jay Foad [Mon, 18 Jul 2022 14:20:06 +0000 (15:20 +0100)]
[LiveIntervals] Find better anchoring end points when repairing ranges
r175673 changed repairIntervalsInRange to find anchoring end points for
ranges automatically, but the calculation of Begin included the first
instruction found that already had an index. This patch changes it to
exclude that instruction:
1. For symmetry, so that the half open range [Begin,End) only includes
instructions that do not already have indexes.
2. As a possible performance improvement, since repairOldRegInRange
will scan fewer instructions.
3. Because repairOldRegInRange hits assertion failures in some cases
when it sees a def that already has a live interval.
(3) fixes about ten tests in the CodeGen lit test suite when
-early-live-intervals is forced on.
Differential Revision: https://reviews.llvm.org/D110182
Mubariz Afzal [Mon, 18 Jul 2022 18:24:20 +0000 (14:24 -0400)]
Reland "[SystemZ][z/OS] Fix f32 variadic argument assertion"
This patch relands the f32 vararg assertion on z/OS fix that was reverted previously due to the testcase failing on non-z/OS platforms. It is now passing.
The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line, which bitcasts an f64 vararg to an i64 (so that it can be used in the GPRs). Thus it becomes a bitcast from f32 to i64. We don't handle bitcasts for f32s and so this causes an assertion to be thrown.
We fix this by simplifying the tablegen lines to explicitly show this behaviour, and allow the f32 in the bitcast case by first promoting it to an f64.
Mehdi Amini [Mon, 18 Jul 2022 18:07:36 +0000 (18:07 +0000)]
Revert "[MLIR] Generic 'malloc', 'aligned_alloc' and 'free' functions"
This reverts commit 3e21fb616d9a1b29bf9d1a1ba484add633d6d5b3.
A lot of integration tests are failing on the bot.
Craig Topper [Mon, 18 Jul 2022 16:58:54 +0000 (09:58 -0700)]
[RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1)
(and X, 0xffffffff) requires 2 shifts in the base ISA. Since we
know the result is being used by a compare, we can use a sext_inreg
instead of an AND if we also modify C1 to have 33 sign bits instead
of 32 leading zeros. This can also improve the generated code for
materializing C1.
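The equivalence behind this transform can be checked with 64-bit integer arithmetic. The helper names below are invented for this sketch and the sample values are arbitrary. A constant sign-extended from bit 31 is what the message refers to as having 33 sign bits.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def sext_inreg_32(x):
    """Sign-extend the low 32 bits of x to a 64-bit two's complement value."""
    low = x & MASK32
    return (low | (MASK64 ^ MASK32)) if low & 0x80000000 else low

def seteq_and(x, c1):
    """Original form: (seteq (and X, 0xffffffff), C1)."""
    return (x & MASK32) == c1

def seteq_sext(x, c1):
    """Rewritten form: compare sext_inreg(X) with the sign-extended C1."""
    return sext_inreg_32(x) == sext_inreg_32(c1)

xs = [0, 1, 0x80000000, 0xFFFFFFFF, 0x100000001,
      0xFFFFFFFF80000000, 0x123456789ABCDEF0]
c1s = [0, 1, 0x80000000, 0xFFFFFFFF, 0x9ABCDEF0]  # all have 32 leading zeros
same = all(seteq_and(x, c) == seteq_sext(x, c) for x in xs for c in c1s)
```

Both forms compare only the low 32 bits of X, which is why the AND can be replaced as long as C1 is adjusted in the same way.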
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D129980
Craig Topper [Mon, 18 Jul 2022 16:58:44 +0000 (09:58 -0700)]
[RISCV] Pre-commit tests for D129980. NFC
Differential Revision: https://reviews.llvm.org/D129981
LLVM GN Syncbot [Mon, 18 Jul 2022 17:45:05 +0000 (17:45 +0000)]
[gn build] Port e24b390dbc4e
LLVM GN Syncbot [Mon, 18 Jul 2022 17:45:04 +0000 (17:45 +0000)]
[gn build] Port 0f9d9edd2477
Arnold Schwaighofer [Fri, 15 Jul 2022 14:32:22 +0000 (07:32 -0700)]
[coro async] Fix code to run coro.async.end cleanup like the legacy pass did
The code executed for the Switch ABI does not change.
rdar://97074714
Differential Revision: https://reviews.llvm.org/D129865
Matt Arsenault [Mon, 27 Jun 2022 18:28:33 +0000 (14:28 -0400)]
llvm-reduce: Add reduction for instruction defs
Try to insert an implicit_def to replace the instruction's value,
replacing the original instruction's def with a dead register. If all
defs are dead, delete the instruction entirely.
This is pretty similar to the instruction reduction, but leaves the
new defs in the same place as the original instruction. This could
possibly replace it. I'm not sure if we should directly delete the
instructions here, or leave dead ones behind.
This could also be extended to replace physical register defs.
Matt Arsenault [Wed, 22 Jun 2022 17:13:17 +0000 (13:13 -0400)]
llvm-reduce: Add reduction for custom register masks
I have a register allocator failure that only reproduces with IPRA
enabled, and requires the specific regmask if I want to only run the
one relevant pass. The printed custom regmask is enormous and I would
like to reduce it.
This reduces each individual bit in the mask, but it would probably be
better to start at register units and clear all aliasing fields at a
time. This would require stricter verification that all aliasing bits
are set in regmasks (although I would prefer to switch regmasks to use
register units in the first place).
Alex Bradbury [Mon, 18 Jul 2022 17:37:09 +0000 (18:37 +0100)]
[docs] Remove unmaintained target feature matrix
Back in 2017, a table was added to the codegen documentation listing
which features various backends support. It received a few updates since
then, but not since the end of 2019. Having such a table is a nice idea,
but it hasn't been kept up to date, it isn't easy to ensure that it is
up to date, and the table probably isn't very discoverable for most
users who would be interested in this information anyway (it would be
better suited to some kind of "what can LLVM do for me?" page).
For all of the above reasons, I believe it makes sense to remove it.
Differential Revision: https://reviews.llvm.org/D129996
Daniel Bertalan [Mon, 18 Jul 2022 11:37:40 +0000 (13:37 +0200)]
[lld-macho] Devirtualize TargetInfo::getRelocAttrs
This method is called on each relocation when parsing input files, so
the overhead of using virtual functions ends up being quite large. We
now have a single non-virtual method, which reads from the appropriate
array of relocation attributes set in the TargetInfo constructor.
This change results in a modest 2.3% reduction in link time for
chromium_framework measured on an x86-64 VPS, and 0.7% on an arm64 Mac.
    N           Min           Max        Median           Avg        Stddev
x  10     11.869417     12.032609     11.935041     11.938268   0.045802324
+  10     11.581526     11.785265     11.649885     11.659507   0.054634834
Difference at 95.0% confidence
        -0.278761 +/- 0.0473673
        -2.33502% +/- 0.396768%
        (Student's t, pooled s = 0.0504124)
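As a quick arithmetic check, the reported difference follows directly from the two averages in the table. The variable names are illustrative only.

```python
avg_before = 11.938268  # "x" row, Avg column
avg_after = 11.659507   # "+" row, Avg column

difference = avg_after - avg_before
percent = 100.0 * difference / avg_before

print(f"{difference:.6f} ({percent:.5f}%)")
```

This reproduces the -0.278761 and -2.33502% figures, which is where the 2.3% reduction quoted for the x86-64 measurement comes from.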
Differential Revision: https://reviews.llvm.org/D130000
Arjun P [Mon, 18 Jul 2022 17:01:32 +0000 (18:01 +0100)]
[MLIR][Presburger] fix warning under g++ (NFC)
Igor Kudrin [Mon, 18 Jul 2022 16:56:07 +0000 (20:56 +0400)]
Reapply "[NVPTX] Use the mask() operator to initialize packed structs with pointers"
The original patch revealed an issue of reading incorrect values on BE hosts.
That is now changed to use `endian::read32le()` and `endian::read64le()`.
Original commit message:
The current implementation assumes that all pointers used in the
initialization of an aggregate are aligned according to the pointer size
of the target; that might not be so if the object is packed. In that
case, an array of .u8 should be used and pointers should be decorated
with the mask() operator.
The operator was introduced in PTX ISA 7.1, so an error is issued if the
case is detected for an earlier version.
Differential Revision: https://reviews.llvm.org/D127504
Craig Topper [Mon, 18 Jul 2022 16:10:53 +0000 (09:10 -0700)]
[RISCV] Add isel patterns for ineg+setge/le/uge/ule.
setge/le/uge/ule selected by themselves require an xori with 1.
If we're negating the setcc, we can fold the xori with the neg
to create an addi with -1.
This works because xori X, 1 is equivalent to 1 - X if X is either
0 or 1. So we're doing -(1 - X) which is X-1 or X+-1.
For some conditions, this improves the code for selecting between 0 and -1.
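The identity can be checked for both possible setcc results. This is a sketch using 64-bit two's complement arithmetic; the function names are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def neg_of_xori(x):
    """neg (xori X, 1), wrapped to 64 bits."""
    return (-(x ^ 1)) & MASK64

def addi_minus_one(x):
    """addi X, -1, wrapped to 64 bits."""
    return (x - 1) & MASK64

# X is a setcc result, so it is either 0 or 1.
results = {x: (neg_of_xori(x), addi_minus_one(x)) for x in (0, 1)}
```

For X = 0 both forms give all ones (that is, -1), and for X = 1 both give 0, so the xori and the neg collapse into one addi.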
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D129957
Nicolai Hähnle [Mon, 18 Jul 2022 16:46:58 +0000 (18:46 +0200)]
Rerun ./utils/update_cc_test.py on a bunch of tests
Due to update script changes; this reduces the size of a later "real"
diff.
Joseph Huber [Mon, 18 Jul 2022 16:43:50 +0000 (12:43 -0400)]
[LinkerWrapper] Rework passing args to the LLVM backend
Argyrios Kyrtzidis [Sat, 16 Jul 2022 00:31:49 +0000 (17:31 -0700)]
[Tooling/DependencyScanning] Enable passing a `vfs::FileSystem` object to `DependencyScanningTool`
Also include a unit test to validate that the `vfs::FileSystem` object is properly used.
Differential Revision: https://reviews.llvm.org/D129912
Fangrui Song [Mon, 18 Jul 2022 16:35:11 +0000 (09:35 -0700)]
[IR] Allow absence for Min module flags and make AArch64 BTI/PAC-RET flags backward compatible
D123493 introduced llvm::Module::Min to encode module flags metadata for AArch64
BTI/PAC-RET. llvm::Module::Min does not take effect when the flag is absent in
one module. This behavior is misleading and does not address backward
compatibility problems (when a bitcode with "branch-target-enforcement"==1 and
another without the flag are merged, the merge result is 1 instead of 0).
To address the problems, require Min flags to be non-negative and treat absence
as having a value of zero. For an old bitcode without
"branch-target-enforcement"/"sign-return-address", its value is as if 0.
Differential Revision: https://reviews.llvm.org/D129911
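The difference between the two merge rules can be sketched as follows. `None` stands for a module that lacks the flag, and both function names are hypothetical; this is not the IR linker code.

```python
def merge_min_old(a, b):
    """Earlier behavior: Min has no effect when one module lacks the flag."""
    if a is None:
        return b
    if b is None:
        return a
    return min(a, b)

def merge_min_new(a, b):
    """Behavior after this change: absence is treated as a value of zero."""
    a = 0 if a is None else a
    b = 0 if b is None else b
    if a < 0 or b < 0:
        raise ValueError("Min module flags must be non-negative")
    return min(a, b)
```

Merging a module with "branch-target-enforcement" set to 1 and an old module without the flag gives 1 under the earlier rule and 0 under the new one.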
Arjun P [Mon, 18 Jul 2022 16:34:40 +0000 (17:34 +0100)]
[MLIR][Presburger] Provide functions to convert between arrays of MPInt and int64_t
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D129509
Arjun P [Mon, 18 Jul 2022 16:33:24 +0000 (17:33 +0100)]
[MLIR][Presburger] SlowMPInt: fix bug in ceilDiv, floorDiv where widths weren't harmonized
This also adds tests for abs, ceilDiv, floorDiv, mod, gcd and lcm.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D129816
Aart Bik [Fri, 15 Jul 2022 23:41:02 +0000 (16:41 -0700)]
[mlir][sparse] migrate sparse rewriting to sparse transformations pass
The rules in the linalg file were very specific to sparse tensors, so they
will find a better home under the sparse tensor dialect than the linalg
dialect. Also moved some rewriting from sparsification into this new
"pre-rewriting" file.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D129910
Itay Bookstein [Wed, 13 Jul 2022 17:29:48 +0000 (20:29 +0300)]
[SDAG] Remove single-result restriction on commutative CSE
The DAG Combiner unnecessarily restricts commutative CSE
to nodes with a single result value. This commit removes
that restriction.
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D129666
Craig Topper [Mon, 18 Jul 2022 16:05:41 +0000 (09:05 -0700)]
[RISCV] Fold stack reload into sext.w by using lw instead of ld.
We can use lw to load 4 bytes from the stack and sign extend them
instead of loading all 8 bytes.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D129948
Igor Kudrin [Mon, 18 Jul 2022 16:08:15 +0000 (20:08 +0400)]
Revert "[NVPTX] Use the mask() operator to initialize packed structs with pointers"
The new test fails on BE hosts.
This reverts commit 04e978ccba1e6c8b600b2fbad1a82b4b64ffc34b.
Alexander Batashev [Mon, 18 Jul 2022 15:56:34 +0000 (11:56 -0400)]
[mlir][spirv] Allow unnamed entry point functions
SPIR-V specification does not require a function to have a name
if it is an entry point. Adjust deserializer to allow those kinds
of SPIR-V binaries.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D120181
Michele Scuttari [Mon, 18 Jul 2022 15:47:18 +0000 (17:47 +0200)]
[MLIR] Generic 'malloc', 'aligned_alloc' and 'free' functions
When converted to the LLVM dialect, the memref.alloc and memref.free operations were generating calls to hardcoded 'malloc' and 'free' functions. This left users no way to provide a custom implementation. Those operations now convert into calls to '_mlir_alloc' and '_mlir_free' functions, which have also been implemented in the runtime support library as wrappers around 'malloc' and 'free'. The same has been done for the 'aligned_alloc' function.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D128791
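A default pair of wrappers along these lines might look like the sketch below. The names '_mlir_alloc' and '_mlir_free' come from the commit message. The exact signatures are an assumption made for illustration and may differ from the runtime support library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Assumed signatures, for illustration only. A user wanting a custom
// allocator would supply their own definitions of these two symbols.
extern "C" void *_mlir_alloc(std::uint64_t size) {
  // Default behaviour: forward to the C allocator.
  return std::malloc(static_cast<std::size_t>(size));
}

extern "C" void _mlir_free(void *ptr) {
  // Default behaviour: forward to the C deallocator.
  std::free(ptr);
}
```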
Nicolai Hähnle [Mon, 18 Jul 2022 15:43:35 +0000 (17:43 +0200)]
Revert "Inliner: don't mark call sites as 'nounwind' if that would be redundant"
This reverts commit 9905c379819fafdc2246bcd24dd7165bd72d7659.
Looks like there are Clang changes that are affected in trivial ways. Will look into it.
Nicolai Hähnle [Fri, 15 Jul 2022 13:27:26 +0000 (15:27 +0200)]
Inliner: don't mark call sites as 'nounwind' if that would be redundant
When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind so as not to lose
information during inlining.
If H itself is nounwind (which often happens when H is an intrinsic), we
no longer mark the callsite explicitly as nounwind. Previously, there
were cases where the inlined call-site of H differs from a pre-existing
call-site of H in F *only* in the explicitly added nounwind attribute,
thus preventing common subexpression elimination.
v2:
- just check CI->doesNotThrow
Differential Revision: https://reviews.llvm.org/D129860
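The decision described in this commit can be modelled as a small predicate. This is a sketch with hypothetical types: InlinedCall and propagateNoUnwind are not LLVM's API, and alreadyDoesNotThrow stands in for what the CI->doesNotThrow check mentioned in v2 would report.

```cpp
#include <cassert>

// Hypothetical model of a call site copied into the caller by the inliner.
struct InlinedCall {
  bool alreadyDoesNotThrow;      // stand-in for a doesNotThrow-style query
  bool explicitNoUnwind = false; // attribute added on the call site itself
};

// G (the function being inlined) is nounwind, so calls it contained cannot
// unwind out of it. Only record that explicitly when the call site does not
// already carry the information, for example via a nounwind callee H.
inline void propagateNoUnwind(InlinedCall &call, bool inlinedFnIsNoUnwind) {
  if (inlinedFnIsNoUnwind && !call.alreadyDoesNotThrow)
    call.explicitNoUnwind = true;
}
```

Leaving the attribute off in the redundant case keeps the inlined call site identical to a pre-existing call to H in F, so the two remain candidates for common subexpression elimination.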
Sanjay Patel [Mon, 18 Jul 2022 14:37:18 +0000 (10:37 -0400)]
[InstCombine] reduce code for signbit folds; NFC
Lorenzo Albano [Mon, 18 Jul 2022 06:49:19 +0000 (08:49 +0200)]
[VP] IR expansion pass for VP gather and scatter
Add vp_gather and vp_scatter expansion to unpredicated intrinsics.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D120664
Hans Wennborg [Mon, 18 Jul 2022 14:34:34 +0000 (16:34 +0200)]
Revert "[libc++] Always build c++experimental.a"
This caused build failures when building Clang and libc++ together on Mac:
fatal error: 'experimental/memory_resource' file not found
See the code review for details. Reverting until the problem and how to
solve it are better understood.
(Updates to some test files were not reverted, since they seemed
unrelated and were later updated by 340b48b267b96.)
> This is the first part of a plan to ship experimental features
> by default while guarding them behind a compiler flag to avoid
> users accidentally depending on them. Subsequent patches will
> also encompass incomplete features (such as <format> and <ranges>)
> in that categorization. Basically, the idea is that we always
> build and ship the c++experimental library, however users can't
> use what's in it unless they pass the `-funstable` flag to Clang.
>
> Note that this patch intentionally does not start guarding
> existing <experimental/FOO> content behind the flag, because
> that would merely break users that might be relying on such
> content being in the headers unconditionally. Instead, we
> should start guarding new TSes behind the flag, and get rid
> of the existing TSes we have by shipping their Standard
> counterpart.
>
> Also, this patch must jump through a few hoops like defining
> _LIBCPP_ENABLE_EXPERIMENTAL because we still support compilers
> that do not implement -funstable yet.
>
> Differential Revision: https://reviews.llvm.org/D128927
This reverts commit bb939931a1adb9a47a2de13c359d6a72aeb277c8.
Ulrich Weigand [Mon, 18 Jul 2022 14:54:48 +0000 (16:54 +0200)]
[libunwind][SystemZ] Use process_vm_readv to avoid potential segfaults
Fix potential crashes during unwind when checking for signal frames
and the current PC is invalid.
The same bug was fixed for aarch64 in https://reviews.llvm.org/D126343.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D129856
zhijian [Mon, 18 Jul 2022 14:43:30 +0000 (10:43 -0400)]
[AIX] support read global symbol of big archive
Reviewers: James Henderson, Fangrui Song
Differential Revision: https://reviews.llvm.org/D124865
Jeff Bailey [Sun, 17 Jul 2022 02:41:55 +0000 (02:41 +0000)]
[libc] Fix API for remove_{prefix, suffix}
The API in StringView.h for remove_prefix was incorrect and was returning a
new StringView rather than just altering the view.
As part of this, also removed some of the safety features. The comment
correctly noted that the behaviour is undefined in some cases,
but the code and test cases checked for that.
One caller was relying on the old behaviour, so fixed it and added some
comments.
Tested:
check-libc
llvmlibc
libc-loader-tests
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D129950
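The in-place behaviour matches that of std::string_view::remove_prefix, which is used below as a stand-in for the libc StringView. dropFront is a hypothetical caller-side helper: since the operation is undefined when n exceeds the size, the check lives with the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical caller-side helper, using std::string_view as a stand-in.
inline void dropFront(std::string_view &view, std::size_t n) {
  // remove_prefix does not check bounds; n > size() is undefined behaviour.
  assert(n <= view.size() && "cannot remove more than the view holds");
  // Returns void and shrinks the view itself rather than producing a new one.
  view.remove_prefix(n);
}
```

A caller that previously wrote something like `rest = view.remove_prefix(n)` would instead copy the view first and then shrink the copy.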
Benjamin Kramer [Mon, 18 Jul 2022 14:34:36 +0000 (16:34 +0200)]
Don't include private gtest/gmock headers
Only gmock.h and gtest.h are supposed to be user-visible.
gbreynoo [Mon, 18 Jul 2022 13:53:20 +0000 (14:53 +0100)]
[llvm-size] Fix hang waiting for input on invalid short commandline option
When an invalid short command line option was used (e.g. -v) llvm-size
would hang waiting on input from stdin. This change fixes this issue by
bringing llvm-size in line with other llvm tools and exiting early when
this error is output.
Differential Revision: https://reviews.llvm.org/D129866
gbreynoo [Thu, 14 Jul 2022 09:48:52 +0000 (10:48 +0100)]
[llvm-ar][test] Add testing for bitcode file handling
Recommit after revert.
This change adds testing for handling of bitcode files in archives,
particularly the creation of symbol tables and through MRI scripts.
Although there is some testing of bitcode handling in the archive
library testing, this was not covered.
Differential Revision: https://reviews.llvm.org/D129088
Alex Zinenko [Mon, 18 Jul 2022 13:35:23 +0000 (15:35 +0200)]
[mlir] Fix Bazel for 5e83a5b4752da6631d79c446f21e5d128b5c5495
Export the __init__.py from _mlir_libs.
Nikita Popov [Mon, 18 Jul 2022 13:30:13 +0000 (15:30 +0200)]
[LoopSimplifyCFG] Revert accidental change
This change was included in an unrelated change
(b57d61384c9938e3dfa54b55bf8b2a0a05e67e28) and was of course not
intended for commit...
Nikita Popov [Mon, 18 Jul 2022 13:18:47 +0000 (15:18 +0200)]
[ConstantRangeTest] Migrate known bits test to generic infrastructure (NFC)
This can't make use of TestBinaryOpExhaustive, but it can make use
of the general TestRange approach that collects the precise elements
in a bit vector.
This allows us to remove the obsolete "op range gatherer" infrastructure.