Max Kazantsev [Mon, 25 Oct 2021 07:09:41 +0000 (14:09 +0700)]
[SCEV][NFC] Win some compile time from mass forgetMemoizedResults
Mass forgetMemoizedResults can be done more efficiently than a bunch
of individual invocations of the helper because we can traverse the maps
being updated just once, rather than doing this for each individual SCEV.
Should be NFC and supposedly improves compile time.
Differential Revision: https://reviews.llvm.org/D112294
Reviewed By: reames
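The single-traversal idea can be sketched outside of SCEV as follows. This is only an illustration under stated assumptions: `Key`, `Cache`, `forgetOneByOne` and `forgetMass` are hypothetical stand-ins, not the ScalarEvolution data structures or API.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: an int plays the role of a SCEV pointer, and the
// map plays the role of one memoization cache whose values refer to SCEVs.
using Key = int;
using Cache = std::unordered_map<std::string, Key>;

// Individual invalidation: one full pass over the cache per forgotten key.
inline void forgetOneByOne(Cache &C, const std::vector<Key> &Keys) {
  for (Key K : Keys) {
    for (auto It = C.begin(); It != C.end();) {
      if (It->second == K)
        It = C.erase(It);
      else
        ++It;
    }
  }
}

// Mass invalidation: a single pass over the cache, with a set lookup per entry.
inline void forgetMass(Cache &C, const std::vector<Key> &Keys) {
  const std::unordered_set<Key> ToForget(Keys.begin(), Keys.end());
  for (auto It = C.begin(); It != C.end();) {
    if (ToForget.count(It->second))
      It = C.erase(It);
    else
      ++It;
  }
}
```

Both functions leave the cache in the same state; only the number of traversals differs.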
Max Kazantsev [Mon, 25 Oct 2021 06:50:49 +0000 (13:50 +0700)]
[SCEV][NFC] Apply mass forgetMemoizedResults queries where possible
When forgetting multiple SCEVs, rather than doing this one by one, we can
instead use mass updates. We plan to make them more efficient than they
are now, potentially improving compile time.
Differential Revision: https://reviews.llvm.org/D111602
Reviewed By: reames
Max Kazantsev [Mon, 25 Oct 2021 06:32:53 +0000 (13:32 +0700)]
[SCEV][NFC] Introduce API for mass forgetMemoizedResults query
This patch changes signature of forgetMemoizedResults to be able to work with
multiple SCEVs. Usage will come in follow-ups. We also plan to optimize it in the
future to work faster than individual invalidation updates. Should not change
behavior in any sense.
Split-off from D111602.
Differential Revision: https://reviews.llvm.org/D112293
Reviewed By: reames
Shivam Gupta [Mon, 25 Oct 2021 06:28:58 +0000 (11:58 +0530)]
[NFC] Update test/CodeGen/RISCV/select-constant-xor.ll to use RV --check-prefix
This is only for consistency with test cases.
Differential Revision: https://reviews.llvm.org/D112364
Nicolas Vasilache [Sun, 24 Oct 2021 22:34:38 +0000 (22:34 +0000)]
[mlir][Linalg] NFC - Reorganize options nesting.
This removes duplication and makes nesting more clear.
It also reduces the amount of changes necessary for exposing future options.
Differential revision: https://reviews.llvm.org/D112344
Max Kazantsev [Mon, 25 Oct 2021 05:30:46 +0000 (12:30 +0700)]
[NFC][SCEV] Do not track users of SCEVConstants
Follow-up from D112295, suggested by Nikita: we can avoid tracking
users of SCEVConstants because dropping their cached info is unlikely
to give any new prospects for fact inference, and it should not introduce
any correctness problems.
Max Kazantsev [Mon, 25 Oct 2021 04:36:25 +0000 (11:36 +0700)]
[SCEV][NFC] API for tracking of SCEV users
This patch introduces an API that keeps track of SCEVs that are users
of other SCEVs. It is required to handle invalidation of users along
with operands, which comes in follow-up patches.
Differential Revision: https://reviews.llvm.org/D112295
Reviewed By: reames
Mehdi Amini [Mon, 25 Oct 2021 02:49:46 +0000 (02:49 +0000)]
Add a clear() method on the PassManager (NFC)
This allows clearing an OpPassManager and populating it again with a new
pipeline, while preserving all the other options (including instrumentations).
Differential Revision: https://reviews.llvm.org/D112393
Chen Zheng [Mon, 25 Oct 2021 03:27:16 +0000 (03:27 +0000)]
[PowerPC] common chains to reuse offsets to reduce register pressure.
Add a new preparation pattern in PPCLoopInstFormPrep pass to reduce register
pressure.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D108750
Jacques Pienaar [Mon, 25 Oct 2021 02:50:15 +0000 (19:50 -0700)]
[mlir] Give GenericAtomicRMW region a name
Some tools assume all regions have names, provide one to avoid breakage.
Jinsong Ji [Mon, 25 Oct 2021 01:52:03 +0000 (01:52 +0000)]
[AIX] Add i128 arg split tests
Address comments in D111078.
Reviewed By: hubert.reinterpretcast, lkail
Differential Revision: https://reviews.llvm.org/D112272
Vitaly Buka [Mon, 25 Oct 2021 02:13:43 +0000 (19:13 -0700)]
Revert "[NFC][sanitizer] constexpr a few functions"
This reverts a part of commit 8cd51a69e5b4cf9513eb4f416f113058ebd8f257
and 5bf24f0581ee7ab9971b4050497375464b894c59 to fix Windows.
Craig Topper [Mon, 25 Oct 2021 01:41:32 +0000 (18:41 -0700)]
[RISCV] Rename vmulh-sdnode-rv32.ll and add rv64 command line. NFC
Vitaly Buka [Mon, 25 Oct 2021 00:34:13 +0000 (17:34 -0700)]
[NFC][sanitizer] Use power of two in TwoLevelMap
Using divisions by non power of two makes
a difference on x86_64 and aarch64 benchmarks.
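As a rough illustration of why a power-of-two size helps: the split of a flat index into outer and inner parts then reduces to a shift and a mask. The constants and names below are hypothetical and are not the actual TwoLevelMap parameters.

```cpp
#include <cstdint>

// Hypothetical second-level size of 4096 entries, chosen as a power of two.
constexpr std::uint64_t kLog2Size2 = 12;
constexpr std::uint64_t kSize2 = 1ull << kLog2Size2;
static_assert((kSize2 & (kSize2 - 1)) == 0, "kSize2 must be a power of two");

struct Slot {
  std::uint64_t outer; // index into the first level
  std::uint64_t inner; // index into the second level
};

// General form: works for any size, but needs a division and a remainder.
constexpr Slot splitByDivision(std::uint64_t idx) {
  return Slot{idx / kSize2, idx % kSize2};
}

// Power-of-two form: the same result with a shift and a mask.
constexpr Slot splitByShift(std::uint64_t idx) {
  return Slot{idx >> kLog2Size2, idx & (kSize2 - 1)};
}
```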
Vitaly Buka [Mon, 25 Oct 2021 00:05:50 +0000 (17:05 -0700)]
[NFC][sanitizer] DCHECKs in hot code
Vitaly Buka [Sun, 24 Oct 2021 23:21:28 +0000 (16:21 -0700)]
[NFC][sanitizer] constexpr a few functions
Jacques Pienaar [Mon, 25 Oct 2021 01:36:33 +0000 (18:36 -0700)]
[mlir] Switch arith, llvm, std & shape dialects to accessors prefixed both form.
Following
https://llvm.discourse.group/t/psa-ods-generated-accessors-will-change-to-have-a-get-prefix-update-you-apis/4476,
this flips these dialects to the _Both prefixed form. This
changes the accessors to have a prefix. This was possible mostly without
breaking changes if the existing convenience methods were used.
(https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp
was used to migrate the callers post flipping, using the output from
Operator.cpp)
Differential Revision: https://reviews.llvm.org/D112383
Fangrui Song [Mon, 25 Oct 2021 01:29:45 +0000 (18:29 -0700)]
[ELF] Remove ignored options that likely nobody uses
GNU ld doesn't support `--no-pic-executable`.
`-p` has been removed from likely the only use case (Linux kernel) for over 2.5 years: https://git.kernel.org/linus/091bb549f7722723b284f63ac665e2aedcf9dec9
`--no-add-needed` was the pre-binutils-2.23 spelling for `--no-copy-dt-needed-entries`.
The legacy alias is irrelevant in 2021.
Jacques Pienaar [Mon, 25 Oct 2021 01:17:09 +0000 (18:17 -0700)]
[mlir] Rename to avoid overlap in accessor prefixing
Split out renaming from D112383 into standalone change.
Kazu Hirata [Mon, 25 Oct 2021 00:35:35 +0000 (17:35 -0700)]
[Target, Transforms] Use predecessors instead of pred_begin and pred_end (NFC)
Kazu Hirata [Mon, 25 Oct 2021 00:35:33 +0000 (17:35 -0700)]
Use llvm::any_of and llvm::none_of (NFC)
Matthias Braun [Sun, 24 Oct 2021 23:39:14 +0000 (16:39 -0700)]
Pre-committing tests for D110865
Matthias Braun [Fri, 24 Sep 2021 20:41:54 +0000 (13:41 -0700)]
X86InstrInfo: Look across basic blocks in optimizeCompareInstr
This extends `optimizeCompareInstr` to continue the backwards search
when it reached the beginning of a basic block. If there is a single
predecessor block then we can just continue the search in that block and
mark the EFLAGS register as live-in.
Differential Revision: https://reviews.llvm.org/D110862
Matthias Braun [Fri, 24 Sep 2021 18:43:43 +0000 (11:43 -0700)]
X86InstrInfo: Refactor and cleanup optimizeCompareInstr
Previously, the first part of `optimizeCompareInstr` was split into a
forward scan, for cases that re-use zero flags from a producer when
comparing with zero, and a backward scan for finding an instruction
equivalent to a compare.
The code now uses a single backward scan searching for the next
instruction that reads or writes EFLAGS.
Also:
- Add comments giving examples for the 3 cases handled.
- Check `MI` which contains the result of the zero-compare cases,
instead of re-checking `IsCmpZero`.
- Tweak coding style in some loops.
- Add new MIR based tests that test the optimization in isolation.
This also removes a check for flag readers in situations like this:
```
= SUB32rr %0, %1, implicit-def $eflags
... we no longer stop when there are $eflag users here
CMP32rr %0, %1 ; will be removed
...
```
Differential Revision: https://reviews.llvm.org/D110857
Philip Reames [Sun, 24 Oct 2021 19:39:26 +0000 (12:39 -0700)]
Treat branch on poison as immediate UB (under an off by default flag)
The LangRef clearly states that branching on an undef or poison value is immediate undefined behavior, but historically, we have not been consistent about implementing that interpretation in the optimizer. We used (in some cases) a more relaxed model which essentially looked for provable UB along both paths which were control dependent on the condition. However, we've never been 100% consistent here. For instance, SCEV uses the strong model for increments which form AddRecs (and only AddRecs).
At the moment, the last big blocker for finally making this switch is enabling the fix landed in D106041. Loop unswitching (in its classic form) is incorrect as it creates many "branch on poisons" when unswitching conditions originally unreachable within the loop.
This change adds a flag to value tracking which makes it easy to test the optimization potential of treating branch on poison as immediate UB. It's intended to help ease work on getting us finally through this transition and avoid multiple independent rediscoveries of the same issues.
Differential Revision: https://reviews.llvm.org/D112026
Philip Reames [Sun, 24 Oct 2021 01:07:21 +0000 (18:07 -0700)]
[instcombine] Fix oss-fuzz 39934 (mul matcher can match non-instruction)
Fixes a crash observed by oss-fuzz in 39934. Issue at hand is that code expects a pattern match on m_Mul to imply the operand is a mul instruction, however mul constexprs are also valid here.
Vitaly Buka [Tue, 12 Oct 2021 06:58:04 +0000 (23:58 -0700)]
[sanitizer] Remove tag from StackDepotNode
And share storage with size.
Depends on D111615.
Differential Revision: https://reviews.llvm.org/D111616
Vitaly Buka [Sat, 16 Oct 2021 20:31:59 +0000 (13:31 -0700)]
[sanitizer] Remove use_count from StackDepotNode
This is msan/dfsan data which does not need to waste the cache
of other sanitizers.
Depends on D111614.
Differential Revision: https://reviews.llvm.org/D111615
Fangrui Song [Sun, 24 Oct 2021 17:31:44 +0000 (10:31 -0700)]
[ARC] Fix -Wunused-variable. NFC
Kazu Hirata [Sun, 24 Oct 2021 16:32:59 +0000 (09:32 -0700)]
[llvm] Call *(Set|Map)::erase directly (NFC)
We can erase an item in a set or map without checking its membership
first.
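A minimal sketch of the pattern, using `std::set` as a stand-in for the LLVM set and map types touched by the change (the helper names are hypothetical):

```cpp
#include <set>

// Redundant form: the key is looked up twice.
inline bool eraseChecked(std::set<int> &S, int V) {
  if (S.count(V)) {
    S.erase(V);
    return true;
  }
  return false;
}

// Direct form: erase(key) does nothing for an absent key and returns the
// number of elements removed, so the membership check is unnecessary.
inline bool eraseDirect(std::set<int> &S, int V) { return S.erase(V) != 0; }
```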
Kazu Hirata [Sun, 24 Oct 2021 16:32:57 +0000 (09:32 -0700)]
Use llvm::is_contained (NFC)
Groverkss [Sun, 24 Oct 2021 14:36:03 +0000 (20:06 +0530)]
[MLIR] FlatAffineValueConstraints: Fix bug in mergeSymbolIds
This patch fixes a bug in implementation `mergeSymbolIds` where symbol
identifiers were not unique after merging them. Asserts for checking uniqueness
before and after the merge are also added. The asserts checking uniqueness
after the merge fail without the fix on existing test cases.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D111958
Nikita Popov [Sun, 24 Oct 2021 14:10:16 +0000 (16:10 +0200)]
[BasicAA] Add range test with multiple indices (NFC)
Sanjay Patel [Sun, 24 Oct 2021 12:08:30 +0000 (08:08 -0400)]
[x86] add tests for variants of usubsat; NFC
Sanjay Patel [Fri, 22 Oct 2021 20:16:53 +0000 (16:16 -0400)]
[AMDGPU] add tests for alternate form of usubsat; NFC
Kazu Hirata [Sun, 24 Oct 2021 03:41:48 +0000 (20:41 -0700)]
[TableGen] Use llvm::erase_value (NFC)
Kazu Hirata [Sun, 24 Oct 2021 03:41:46 +0000 (20:41 -0700)]
Use StringRef::contains (NFC)
Sylvestre Ledru [Sat, 23 Oct 2021 21:55:50 +0000 (23:55 +0200)]
Add support of the next Ubuntu (Ubuntu 22.04 - Jammy Jellyfish)
It is going to be an LTS release
Simon Pilgrim [Sat, 23 Oct 2021 20:06:03 +0000 (21:06 +0100)]
[X86] findEltLoadSrc - fix shift amount variable name. NFCI.
Fix the copy + paste, renaming shift amt from Idx to Amt
Nikita Popov [Sat, 23 Oct 2021 20:07:31 +0000 (22:07 +0200)]
[InstSimplify] Simplify fetching of index size (NFC)
Directly fetch the size instead of going through the index type
first.
Balazs Benics [Sat, 23 Oct 2021 19:01:59 +0000 (21:01 +0200)]
Revert "[analyzer][solver] Introduce reasoning for not equal to operator"
This reverts commit cac8808f154cef6446e507d55aba5721c3bd5352.
#5 0x00007f28ec629859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859)
#6 0x00007f28ec629729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729)
#7 0x00007f28ec63af36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
#8 0x00007f28ecc2cc46 llvm::APInt::compareSigned(llvm::APInt const&) const (libLLVMSupport.so.14git+0xeac46)
#9 0x00007f28e7bbf957 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) (libclangStaticAnalyzerCore.so.14git+0x1df957)
#10 0x00007f28e7bbf2db (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const*) (libclangStaticAnalyzerCore.so.14git+0x1df2db)
#11 0x00007f28e7bb2b5e (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&) (libclangStaticAnalyzerCore.so.14git+0x1d2b5e)
#12 0x00007f28e7bc67af clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, bool) (libclangStaticAnalyzerCore.so.14git+0x1e67af)
#13 0x00007f28e7be3578 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x203578)
#14 0x00007f28e7be33d8 clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x2033d8)
#15 0x00007f28e7be32fb clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (libclangStaticAnalyzerCore.so.14git+0x2032fb)
#16 0x00007f28e7b15dbc clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (libclangStaticAnalyzerCore.so.14git+0x135dbc)
#17 0x00007f28e7b4780f clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const*) (libclangStaticAnalyzerCore.so.14git+0x16780f)
This is known to be triggered on curl, tinyxml2, tmux, twin and on xerces.
But @bjope also reported similar crashes.
So, I'm reverting it to make our internal bots happy again.
Differential Revision: https://reviews.llvm.org/D106102
Nikita Popov [Sat, 23 Oct 2021 15:58:07 +0000 (17:58 +0200)]
[ConstantFolding] Accept offset in ConstantFoldLoadFromConstPtr (NFCI)
As this API is now internally offset-based, we can accept a starting
offset and remove the need to create a temporary bitcast+gep
sequence to perform an offset load. The API now mirrors the
ConstantFoldLoadFromConst() API.
Kazu Hirata [Sat, 23 Oct 2021 15:45:29 +0000 (08:45 -0700)]
Ensure newlines at the end of files (NFC)
Kazu Hirata [Sat, 23 Oct 2021 15:45:27 +0000 (08:45 -0700)]
[llvm] Use StringRef::contains (NFC)
Stefan Gränitz [Sat, 23 Oct 2021 13:54:38 +0000 (15:54 +0200)]
[Orc][examples] Re-enable test for LLJITWithRemoteDebugging
The test was removed temporarily in a539a847c9428e36722dcb43a1c953c9d66b7f0b to aid switching the RPC API in use in the LLJITWithRemoteDebugging example.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110649
Nikita Popov [Sat, 23 Oct 2021 14:58:09 +0000 (16:58 +0200)]
[ConstantFolding] Remove ConstantFoldLoadThroughGEPIndices() API (NFC)
The last user of this API went away in 4f5e9a2bb28e1cf4a12c9330f52e664542400ec7.
Nicolas Vasilache [Sat, 23 Oct 2021 13:46:22 +0000 (13:46 +0000)]
Revert "[mlir][Linalg] NFC - Reorganize options nesting."
This reverts commit 4703a07e6cc170666abb62d91307978ab4992d9c.
Didn't mean to push this yet, sorry about the noise.
Nikita Popov [Fri, 22 Oct 2021 19:35:08 +0000 (21:35 +0200)]
[SCEV] Remove computeLoadConstantCompareExitLimit() (NFCI)
The functionality of this method is already covered by
computeExitCountExhaustively() in a more general fashion. It was
added at a time when exhaustive exit count calculation did not
support constant folding loads yet. I double checked that dropping
this code causes no binary changes in test-suite.
Differential Revision: https://reviews.llvm.org/D112343
Nicolas Vasilache [Sat, 23 Oct 2021 13:01:40 +0000 (13:01 +0000)]
[mlir][Linalg] NFC - Reorganize options nesting.
This removes duplication and makes nesting more clear.
It also reduces the amount of changes necessary for exposing future options.
Differential revision: https://reviews.llvm.org/D112344
Emilio Cota [Sat, 23 Oct 2021 11:48:24 +0000 (04:48 -0700)]
[mlir] Add polynomial approximation for vectorized math::Rsqrt
This patch adds a polynomial approximation that matches the
approximation in Eigen.
Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.
The approximation is protected with a flag since it emits an AVX2
intrinsic (generated via the X86Vector). This is the only reasonably
clean way that I could find to generate the exact approximation that
I wanted (i.e. an identical one to Eigen's).
I considered two alternatives:
1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
I believe this is because there is no definition of Rsqrt that
all backends could agree on, since hardware instructions that
implement it have widely varying degrees of precision.
This is something that the standard could mandate, but Rsqrt is
not part of IEEE754, so I don't think this option is feasible.
2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
transformations. Although portable, this doesn't allow us
to generate exactly the code we want; it is the LLVM backend,
and not MLIR, who controls what code is generated based on the
target CPU.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D112192
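For readers unfamiliar with this kind of approximation, a common general shape is a cheap initial estimate refined by a Newton-Raphson step. The scalar sketch below only illustrates that shape and is not the code from this patch or from Eigen: the bit-trick estimate is an assumption standing in for a hardware estimate instruction, and all function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Crude estimate of 1/sqrt(x) for positive, finite, normal x. The bit trick
// is only a stand-in for a hardware reciprocal-square-root estimate.
inline float roughRsqrt(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  float y;
  std::memcpy(&y, &bits, sizeof y);
  return y;
}

// One Newton-Raphson step for f(y) = 1/(y*y) - x, which refines y toward
// 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y).
inline float refineRsqrt(float x, float y) {
  return y * (1.5f - 0.5f * x * y * y);
}

// Estimate followed by a single refinement step.
inline float approxRsqrt(float x) { return refineRsqrt(x, roughRsqrt(x)); }
```

Each additional refinement step roughly squares the relative error, which is why a single step after a reasonable estimate is often considered enough for single precision.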
Frederik Seiffert [Sat, 23 Oct 2021 10:37:36 +0000 (16:07 +0530)]
[www] Fix Ninja build instructions on Windows
The `clang` target used in the line below is only generated with `LLVM_ENABLE_PROJECTS=clang`.
Without this change, running `ninja clang` will fail with:
```
ninja: error: unknown target 'clang', did you mean 'clean'?
```
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D112257
Martin Storsjö [Sat, 23 Oct 2021 09:52:55 +0000 (12:52 +0300)]
[lldb] [Host/SerialPort] Fix build with GCC 7
Michał Górny [Sat, 23 Oct 2021 09:38:11 +0000 (11:38 +0200)]
[lldb] [Host/FreeBSD] Remove unused variable (NFC)
Salman Javed [Sat, 23 Oct 2021 07:07:36 +0000 (00:07 -0700)]
[clang-tidy] Tidy up spelling, grammar, and inconsistencies in documentation (NFC)
Differential Revision: https://reviews.llvm.org/D112356
Shivam Gupta [Sat, 23 Oct 2021 06:51:31 +0000 (12:21 +0530)]
[NFC] Correct arc draft option
Jessica Clarke [Sat, 23 Oct 2021 00:56:05 +0000 (01:56 +0100)]
[X86] Don't add implicit REP prefix to VIA PadLock xstore
Commit 8fa3e8fa1492 added an implicit REP prefix to all VIA PadLock
instructions, but GNU as doesn't add one to xstore, only all the others.
This resulted in a kernel panic regression in FreeBSD upon updating to
LLVM 11 (https://bugs.freebsd.org/259218) which includes the commit in
question. This partially reverts that commit.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D112355
Jessica Clarke [Sat, 23 Oct 2021 00:55:56 +0000 (01:55 +0100)]
[NFC][X86] Add MC tests for all untested VIA PadLock instructions
We currently only test the encoding of xstore but none of the other
instructions, which should all have their implicit REP prefix be
verified as working.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D112354
peter klausler [Fri, 22 Oct 2021 23:39:16 +0000 (16:39 -0700)]
[flang] Fix buildbot (new warnings on old code)
The clang-aarch64-full-2stage buildbot is complaining about a
warning with three instances in f18 code (none modified recently).
The warning is for using the | bitwise OR operator on bool operands.
In one instance, the bitwise operator was being used instead of the
logical || operator in order to avoid short-circuiting. The fix
requires using some temporary variables. In the other two instances,
the bitwise operator seemed more idiomatic in context, but can be
replaced without harm with the logical operator.
Pushing without review as confidence is high and nobody wants
a buildbot to stay sad for long.
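The temporary-variable approach can be sketched like this. The check functions are hypothetical and exist only to make the side effects observable; this is not the f18 code.

```cpp
// Hypothetical checks with side effects: each one bumps a call counter.
inline bool checkA(int &calls) {
  ++calls;
  return true;
}
inline bool checkB(int &calls) {
  ++calls;
  return false;
}

// Logical OR short-circuits: checkB is skipped because checkA returns true.
inline bool shortCircuit(int &calls) { return checkA(calls) || checkB(calls); }

// Evaluating into temporaries first keeps both side effects while still
// using the logical operator on the results.
inline bool withTemporaries(int &calls) {
  const bool a = checkA(calls);
  const bool b = checkB(calls);
  return a || b;
}
```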
Kazu Hirata [Sat, 23 Oct 2021 00:22:13 +0000 (17:22 -0700)]
[tools, utils] Use StringRef::contains (NFC)
LLVM GN Syncbot [Fri, 22 Oct 2021 23:26:01 +0000 (23:26 +0000)]
[gn build] Port e18ea6f2946a
Duncan P. N. Exon Smith [Thu, 23 Sep 2021 03:31:43 +0000 (23:31 -0400)]
Support: Skip buffering buffer_unique_ostream's owned stream
Change buffer_unique_ostream's constructor to call
raw_ostream::SetUnbuffered() on its owned stream. Otherwise,
buffer_unique_ostream's destructor could cause the owned stream to
temporarily allocate a buffer only to be immediately flushed.
Also add some tests for buffer_ostream and buffer_unique_ostream. Use
the same naming scheme as other raw_ostream-related tests (e.g.,
`raw_ostreamTest` for the fixture, `raw_ostream_test.cpp` for the
filename).
(I considered changing buffer_ostream in the same way (calling
SetUnbuffered on the referenced stream), but that seemed like overreach
since the client may have more things to write.)
(I considered merging buffer_ostream and buffer_unique_ostream into a
single class (with a `raw_ostream&` and a `std::unique_ptr` that is only
sometimes used), but that makes the class bigger and the small amount of
code deduplication seems uncompelling.)
Differential Revision: https://reviews.llvm.org/D110369
peter klausler [Thu, 21 Oct 2021 20:33:07 +0000 (13:33 -0700)]
[flang] Support legacy usage of 'A' edit descriptors for integer & real
The 'A' edit descriptor once served as a form of raw I/O of bytes
to/from variables that weren't of type CHARACTER (which itself
didn't exist until F'77). This usage was especially common for
output of numeric variables that had been initialized with Hollerith.
Differential Revision: https://reviews.llvm.org/D112346
peter klausler [Thu, 21 Oct 2021 21:13:21 +0000 (14:13 -0700)]
[flang] Fix NAMELIST input bug with multiple subscript triplets
NAMELIST input can contain array subscripts with triplet notation.
The calculation of the default effective stride for the constructed
array descriptor was simply incorrect after the first dimension.
Differential Revision: https://reviews.llvm.org/D112347
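To show why strides beyond the first dimension need care, here is a sketch of byte strides for a section of a contiguous column-major array. It assumes byte strides and a contiguous source array; `Triplet` and `sectionByteStrides` are hypothetical names, and the sketch does not claim to reproduce either the bug or the fix.

```cpp
#include <cstddef>
#include <vector>

// A subscript triplet lower:upper:step for one dimension.
struct Triplet {
  std::ptrdiff_t lower, upper, step;
};

// Byte stride of each dimension of the section. In column-major layout the
// stride of dimension k in the source array is the element size times the
// product of the extents of all earlier dimensions; the triplet step then
// scales it.
inline std::vector<std::ptrdiff_t>
sectionByteStrides(std::size_t elemBytes,
                   const std::vector<std::ptrdiff_t> &extents,
                   const std::vector<Triplet> &triplets) {
  std::vector<std::ptrdiff_t> strides;
  std::ptrdiff_t dimStride = static_cast<std::ptrdiff_t>(elemBytes);
  for (std::size_t k = 0; k < extents.size(); ++k) {
    strides.push_back(triplets[k].step * dimStride);
    dimStride *= extents[k]; // accumulate for the next dimension
  }
  return strides;
}
```

For a 10 by 20 array of 4-byte elements, a step of 2 in the second dimension gives a byte stride of 2 * 4 * 10 = 80, not 2 * 4.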
peter klausler [Thu, 21 Oct 2021 19:47:46 +0000 (12:47 -0700)]
[flang] Fix DOT_PRODUCT for logical
A build-time check in a template class instantiation was applying
a test that's meaningful only for numeric types.
Differential Revision: https://reviews.llvm.org/D112345
peter klausler [Thu, 21 Oct 2021 21:47:55 +0000 (14:47 -0700)]
[flang] Extension: allow tabs in output format strings
A CHARACTER variable used as an output format may contain
unquoted tab characters, which are treated as if they had
been quoted. This is an extension supported by all other
Fortran compilers to which I have access.
Differential Revision: https://reviews.llvm.org/D112350
peter klausler [Fri, 22 Oct 2021 19:22:42 +0000 (12:22 -0700)]
[flang] Fix crash on empty formatted external READs
ExternalFileUnit::BeginReadingRecord() must be called at least once
during an external formatted READ statement before FinishReadingRecord().
In the case of a formatted external READ with no data items, the call
to finish processing of the format (which might have lingering control
items that need doing) was taking place before the call to BeginReadingRecord
from ExternalIoStatementState::EndIoStatement. Add a call to
BeginReadingRecord on this path.
Differential Revision: https://reviews.llvm.org/D112351
Jon Chesterfield [Fri, 22 Oct 2021 22:28:42 +0000 (23:28 +0100)]
[libomptarget] Run GPU offloading tests on both new and old runtime
Implemented by patching python config instead of modifying all
the tests so that -generic and XFAIL work as usual. Expectation is for
this to be reverted once the old runtime is deleted.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D112225
Nikita Popov [Wed, 29 Sep 2021 20:18:54 +0000 (22:18 +0200)]
[BasicAA] Model implicit trunc of GEP indices
GEP indices larger than the GEP index size are implicitly truncated
to the index size. BasicAA currently doesn't model this, resulting
in incorrect alias analysis results.
Fix this by explicitly modelling truncation in CastedValue in the
same way we do zext and sext. Additionally we need to disable a
number of optimizations for truncated values, in particular
"non-zero" and "non-equal" may no longer hold after truncation.
I believe the constant offset heuristic is also not necessarily
correct for truncated values, but wasn't able to come up with a
test for that one.
A possible followup here would be to use the new mechanism to
model explicit trunc as well (which should be much more common,
as it is the canonical form). This is straightforward, but omitted
here to separate the correctness fix from the analysis improvement.
(Side note: While I say "index size" above, BasicAA currently uses
the pointer size instead. Something for another day...)
Differential Revision: https://reviews.llvm.org/D110977
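The loss of "non-zero" and "non-equal" facts under truncation can be seen with plain integers. The 32-bit index size below is an assumption made for illustration only, and the names are hypothetical.

```cpp
#include <cstdint>

// Models the implicit truncation of a 64-bit index to a hypothetical
// 32-bit index size.
constexpr std::uint32_t truncToIndex(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// Non-zero as a 64-bit value, but zero once truncated.
constexpr std::uint64_t kNonZero = 1ull << 32;
static_assert(kNonZero != 0 && truncToIndex(kNonZero) == 0,
              "non-zero does not survive truncation");

// Different as 64-bit values, but equal once truncated.
constexpr std::uint64_t kA = 5;
constexpr std::uint64_t kB = 5 + (1ull << 32);
static_assert(kA != kB && truncToIndex(kA) == truncToIndex(kB),
              "non-equal does not survive truncation");
```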
Peter Klausler [Tue, 19 Oct 2021 18:30:45 +0000 (11:30 -0700)]
[flang] Speed common runtime cases of DOT_PRODUCT & MATMUL
Look for contiguous numeric argument arrays at runtime and
use specialized code for them.
Differential Revision: https://reviews.llvm.org/D112239
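The general technique is a runtime check for unit stride that selects a simpler loop. The sketch below assumes element strides and equal lengths; `VectorView` and `dotProduct` are hypothetical and are not the flang runtime's descriptor or API.

```cpp
#include <cstddef>

// Hypothetical view of a rank-1 array: base pointer, element count, and
// stride measured in elements.
struct VectorView {
  const double *data;
  std::size_t n;
  std::ptrdiff_t stride;
};

// Assumes x.n == y.n.
inline double dotProduct(const VectorView &x, const VectorView &y) {
  double sum = 0.0;
  if (x.stride == 1 && y.stride == 1) {
    // Specialized path: contiguous operands, simple indexing that a
    // compiler can vectorize easily.
    for (std::size_t i = 0; i < x.n; ++i)
      sum += x.data[i] * y.data[i];
  } else {
    // General path: honor arbitrary strides.
    for (std::size_t i = 0; i < x.n; ++i) {
      const std::ptrdiff_t k = static_cast<std::ptrdiff_t>(i);
      sum += x.data[k * x.stride] * y.data[k * y.stride];
    }
  }
  return sum;
}
```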
peter klausler [Tue, 19 Oct 2021 18:34:57 +0000 (11:34 -0700)]
[flang] Fix generic resolution case
Don't try to convert INTEGER argument expressions to the kind of
the dummy argument when performing generic resolution; specific
procedures may be distinguished only by their kinds.
Differential Revision: https://reviews.llvm.org/D112240
peter klausler [Wed, 20 Oct 2021 20:56:47 +0000 (13:56 -0700)]
[flang] Support NAMELIST input of short arrays
NAMELIST array input does not need to fully define an array.
If another input item begins after at least one element,
it ends input into the array and the remaining items are
not modified.
The tricky part of supporting this feature is that it's not
always easy to determine whether the next non-blank thing in
the input is a value or the next item's name, esp. in the case
of logical data where T and F can be names. E.g.,
&group logicalArray = t f f t
= 1 /
should read three elements into "logicalArray" and then read
an integer or real variable named "t".
So the I/O runtime has to do some look-ahead to determine whether
the next thing in the input is a name followed by '=', '(', or '%'.
Since the '=' may be on a later record, possibly with intervening
NAMELIST comments, the runtime has to support a general form of
saving and restoring its current position. The infrastructure
in the I/O runtime already has to support repositioning for
list-directed repetition, even on non-positionable input sources
like terminals and sockets; this patch adds an internal RAII API
to make it easier to save a position and then do arbitrary
look-ahead.
Differential Revision: https://reviews.llvm.org/D112245
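The save-and-restore idea can be sketched with a small RAII guard over an in-memory cursor. Everything here is hypothetical (`Cursor`, `SavedPosition`, `nextIsItemName`) and much simpler than the runtime's real input handling: no records, no NAMELIST comments, no non-positionable sources.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical input cursor over an in-memory buffer.
struct Cursor {
  std::string text;
  std::size_t pos;
};

// RAII guard: remembers the position on construction and restores it on
// destruction unless Cancel() was called.
class SavedPosition {
public:
  explicit SavedPosition(Cursor &c) : cursor_(c), saved_(c.pos) {}
  SavedPosition(const SavedPosition &) = delete;
  SavedPosition &operator=(const SavedPosition &) = delete;
  ~SavedPosition() {
    if (!cancelled_)
      cursor_.pos = saved_;
  }
  void Cancel() { cancelled_ = true; }

private:
  Cursor &cursor_;
  std::size_t saved_;
  bool cancelled_ = false;
};

// Look ahead: is the next non-blank thing a name followed by '=', '(' or '%'?
// The cursor position is restored on every return path.
inline bool nextIsItemName(Cursor &c) {
  SavedPosition guard(c);
  auto at = [&c]() { return static_cast<unsigned char>(c.text[c.pos]); };
  auto skipBlanks = [&]() {
    while (c.pos < c.text.size() && std::isspace(at()))
      ++c.pos;
  };
  skipBlanks();
  if (c.pos >= c.text.size() || !std::isalpha(at()))
    return false;
  while (c.pos < c.text.size() && (std::isalnum(at()) || at() == '_'))
    ++c.pos;
  skipBlanks();
  return c.pos < c.text.size() && (at() == '=' || at() == '(' || at() == '%');
}
```

With the input from the example above, the first `t` is followed by `f` and so is treated as a value, while the last `t` is followed by `=` on the next line and so is treated as a name. A caller that wants to keep the advanced position would call `Cancel()` on its own guard.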
Vy Nguyen [Fri, 22 Oct 2021 02:38:12 +0000 (22:38 -0400)]
[lld-macho] Implement -oso_prefix
https://bugs.llvm.org/show_bug.cgi?id=50229
Differential Revision: https://reviews.llvm.org/D112291
Nicolas Vasilache [Fri, 22 Oct 2021 12:04:32 +0000 (12:04 +0000)]
[mlir][Linalg] Retire CodegenStrategy::transform
Instead, each pass should construct a nested OpPassManager and call runPipeline on that.
Differential Revision: https://reviews.llvm.org/D112308
Jason Molenda [Fri, 22 Oct 2021 20:23:06 +0000 (13:23 -0700)]
Fix locals naming in DNBArchMachARM64::GetGPRState for 32-bit builds
The local variable names used for logging when built on armv7k
weren't unique, resulting in a build error.
rdar://84274006
Matt Arsenault [Thu, 9 Sep 2021 23:57:12 +0000 (19:57 -0400)]
AMDGPU: Use attributor to propagate amdgpu-flat-work-group-size
This can merge the acceptable ranges based on the call graph, rather
than the simple application of the attribute. Remove the handling from
the old pass.
Matt Arsenault [Sat, 11 Sep 2021 02:18:54 +0000 (22:18 -0400)]
AMDGPU: Don't consider whether amdgpu-flat-work-group-size was set
It should be semantically identical if it was set to the same value as
the default. Also improve the documentation.
Craig Topper [Fri, 22 Oct 2021 19:48:17 +0000 (12:48 -0700)]
[X86] Fix bad formatting. NFC
Louis Dionne [Fri, 22 Oct 2021 20:15:45 +0000 (16:15 -0400)]
[libc++][NFC] Remove duplicate Python imports
Duncan P. N. Exon Smith [Thu, 21 Oct 2021 22:57:15 +0000 (15:57 -0700)]
Support: Use Expected<T>::moveInto() in a few places
These are some usage examples for `Expected<T>::moveInto()`.
Differential Revision: https://reviews.llvm.org/D112280
peter klausler [Mon, 18 Oct 2021 17:44:39 +0000 (10:44 -0700)]
[flang] Extension to distinguish specific procedures
Allocatable dummy arguments can be used to distinguish
two specific procedures in a generic interface when
it is the case that exactly one of them is polymorphic
or exactly one of them is unlimited polymorphic. The
standard requires that an actual argument corresponding
to an (unlimited) polymorphic allocatable dummy argument
must also be an (unlimited) polymorphic allocatable, so an
actual argument that's acceptable to one procedure must
necessarily be a bad match for the other.
Differential Revision: https://reviews.llvm.org/D112237
Matt Arsenault [Fri, 22 Oct 2021 17:47:33 +0000 (13:47 -0400)]
AMDGPU: Regenerate MIR test checks
Recently this started using -NEXT checks, so regenerate these to avoid
extra test churn in a future change.
Matt Arsenault [Mon, 16 Aug 2021 14:19:52 +0000 (10:19 -0400)]
AMDGPU: Fix hardcoded registers in tests
Nicolas Vasilache [Fri, 22 Oct 2021 19:11:32 +0000 (19:11 +0000)]
[mlir][Linalg] NFC - Drop Optional in favor of FailureOr
Differential revision: https://reviews.llvm.org/D112332
Jay Foad [Fri, 22 Oct 2021 10:19:29 +0000 (11:19 +0100)]
[AMDGPU] Run SIShrinkInstructions before post-RA scheduling
Run post-RA SIShrinkInstructions just before post-RA scheduling, instead
of afterwards. After the fixes in D112305 and D112317 this seems to make
no difference, but it paves the way for scheduler tweaks that are
sensitive to the e32 vs e64 encoding of VALU instructions.
Differential Revision: https://reviews.llvm.org/D112341
Med Ismail Bennani [Fri, 22 Oct 2021 19:18:11 +0000 (19:18 +0000)]
[lldb/Formatters] Remove space from vector type string summaries (NFCI)
This patch changes the string summaries for vector types by removing the
space between the type and the bracket, conforming to 277623f4d5a6.
This should also fix TestCompactVectors failure.
Differential Revision: https://reviews.llvm.org/D112340
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Jay Foad [Fri, 22 Oct 2021 13:39:39 +0000 (14:39 +0100)]
[AMDGPU] Fix latency for implicit vcc_lo operands on GFX10 wave32
As described in the comment, the way we change vcc to vcc_lo in these
operands confuses addPhysRegDataDeps into treating them as implicit
pseudo operands. Fix this by setting the correct latency from the
SchedModel after addPhysRegDataDeps wrongly set it to 0.
Differential Revision: https://reviews.llvm.org/D112317
Jay Foad [Fri, 22 Oct 2021 11:21:57 +0000 (12:21 +0100)]
[ScheduleDAGInstrs] Call adjustSchedDependency in more cases
This removes a condition and the corresponding FIXME comment, because
the Hexagon assertion it refers to has apparently been fixed, probably
by D76134.
NFCI. This just gives targets the opportunity to adjust latencies that
were set to 0 by the generic code because they involve "implicit pseudo"
operands.
Differential Revision: https://reviews.llvm.org/D112306
Stanislav Mekhanoshin [Fri, 22 Oct 2021 18:59:15 +0000 (11:59 -0700)]
[InstCombine] Precommit new and-xor-or.ll tests. NFC.
Duncan P. N. Exon Smith [Wed, 20 Oct 2021 19:03:31 +0000 (12:03 -0700)]
Support: Add Expected<T>::moveInto() to avoid extra names
Expected<T>::moveInto() takes as an out parameter any `OtherT&` that's
assignable from `T&&`. It moves any stored value before returning
takeError().
Since moveInto() consumes both the Error and the value, it's only
anticipated that we'd call it on temporaries/rvalues, with naming
the Expected first likely to be an anti-pattern of sorts (either you
want to deal with both at the same time, or you don't). As such,
starting it out as `&&`-qualified... but it'd probably be fine to drop
that if there's a good use case for lvalues that appears.
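For illustration only, here is a minimal sketch of the `&&`-qualified
shape described above. It does not use the real Support headers:
`MiniError`, `MiniExpected`, `makePointer(bool)` and `valueOrMinusOne` are
hypothetical stand-ins invented for this example, and the real
Expected<T>/Error have stricter checking semantics than modelled here.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for llvm::Error: an empty message means success.
struct MiniError {
  std::string Msg;
  explicit operator bool() const { return !Msg.empty(); }
};

// Hypothetical stand-in for llvm::Expected<T>: holds a value or an error.
template <typename T> class MiniExpected {
  std::optional<T> Val;
  MiniError Err;

public:
  MiniExpected(T V) : Val(std::move(V)) {}
  MiniExpected(MiniError E) : Err(std::move(E)) {}

  // &&-qualified, so it can only be called on temporaries/rvalues.
  // Moves any stored value into Out, then hands back the error state.
  template <typename OtherT> MiniError moveInto(OtherT &Out) && {
    if (Val)
      Out = std::move(*Val);
    return std::move(Err);
  }
};

// Hypothetical factory that either succeeds or fails on request.
MiniExpected<std::unique_ptr<int>> makePointer(bool Ok) {
  if (!Ok)
    return MiniError{"allocation failed"};
  return std::make_unique<int>(42);
}

// Returns the pointee on success, or -1 when makePointer() reports an error.
int valueOrMinusOne(bool Ok) {
  std::unique_ptr<int> P;
  if (MiniError E = makePointer(Ok).moveInto(P))
    return -1;
  return *P;
}
```

With this shape, calling moveInto() on a named `MiniExpected` lvalue fails
to compile unless the caller writes std::move() explicitly, which is the
restriction the paragraph above argues for.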
There are two common patterns that moveInto() cleans up:
```
// If the variable is new:
Expected<std::unique_ptr<int>> ExpectedP = makePointer();
if (!ExpectedP)
return ExpectedP.takeError();
std::unique_ptr<int> P = std::move(*ExpectedP);
// If the target variable already exists:
if (Expected<T> ExpectedP = makePointer())
P = std::move(*ExpectedP);
else
return ExpectedP.takeError();
```
moveInto() takes less typing and avoids needing to name (or leak into
the scope) an extra variable.
```
// If the variable is new:
std::unique_ptr<int> P;
if (Error E = makePointer().moveInto(P))
return E;
// If the target variable already exists:
if (Error E = makePointer().moveInto(P))
return E;
```
It also seems useful for unit tests, to log errors (but continue) when
there's an unexpected failure. E.g.:
```
// Crash on error, or undefined in non-asserts builds.
std::unique_ptr<MemoryBuffer> MB = cantFail(makeMemoryBuffer());
// Avoid crashing on error without moveInto() :(.
Expected<std::unique_ptr<MemoryBuffer>>
ExpectedMB = makeMemoryBuffer();
ASSERT_THAT_ERROR(ExpectedMB.takeError(), Succeeded());
std::unique_ptr<MemoryBuffer> MB = std::move(*ExpectedMB);
// Avoid crashing on error with moveInto() :).
std::unique_ptr<MemoryBuffer> MB;
ASSERT_THAT_ERROR(makeMemoryBuffer().moveInto(MB), Succeeded());
```
Differential Revision: https://reviews.llvm.org/D112278
Nikita Popov [Fri, 22 Oct 2021 17:15:22 +0000 (19:15 +0200)]
[ConstantFolding] Drop misleading comment (NFC)
As pointed out by Philip, this part of the comment is misleading,
as it describes undef rather than poison behavior. Just mentioning
poison should be sufficient.
Stephen Tozer [Mon, 18 Oct 2021 11:36:25 +0000 (12:36 +0100)]
[Dexter] Add DexFinishTest command to conditionally early-exit a test program
This patch adds a command, DexFinishTest, that allows a Dexter test to
be conditionally finished at a given breakpoint. This command has the
same set of arguments as DexLimitSteps, except that it does not allow a
line range (from_line, to_line), only a single line (on_line).
Reviewed By: Orlando
Differential Revision: https://reviews.llvm.org/D111988
Louis Dionne [Fri, 22 Oct 2021 16:03:00 +0000 (12:03 -0400)]
[libunwind] Fix path to libunwind for per-target-runtime-dir builds
We recently introduced a from-scratch config to run the libunwind tests.
However, that config was always looking for libunwind in <install>/lib,
and never in <install>/<target>/lib, which is necessary for tests to
work when the per-target-runtime-dir configuration is enabled.
This commit fixes that. I believe this is what caused the CI failures we
saw after 5a8ad80b6fa5 and caused it to be reverted.
Differential Revision: https://reviews.llvm.org/D112322
peter klausler [Tue, 19 Oct 2021 20:49:21 +0000 (13:49 -0700)]
[flang] Enforce rest of semantic constraint C919
A reference to an allocatable or pointer component must be applied
to a scalar base object. (This is the second part of constraint C919;
the first part is already checked.)
Differential Revision: https://reviews.llvm.org/D112241
Jeremy Morse [Fri, 22 Oct 2021 18:10:05 +0000 (19:10 +0100)]
[DebugInfo][Instr] Track subregisters across stack spills/restores
Sometimes we generate code that writes to a subregister, then spills /
restores a super-register to the stack, for example:
$eax = MOV32ri 0
MOV64mr $rsp, 1, $noreg, 16, $noreg, $rax
$rcx = MOV64rm $rsp, 1, $noreg, 8, $noreg
This patch takes a different approach: it adds another index to
MLocTracker that identifies a size/offset within a stack slot. A location
on the stack is then a pair of {FrameIndex, SlotNum}. Spilling and
restoring now involves pairing up the src/dest register numbers, and the
dest/src stack position to be transferred to/from. Location coverage
improves as a result; compile-time performance decreases, alas.
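As a rough model of the indexing scheme (not the actual MLocTracker
code), the sketch below assigns a SlotNum to each {size, offset} position
and keys tracked values on {FrameIndex, SlotNum}. All names here
(`StackSlotIndex`, `SpillTracker`, `demoRestoreLow32`) and the integer
"value numbers" are assumptions made up for the example.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical: maps a {size in bits, offset in bits} position within a
// stack slot to a small dense index (the "SlotNum").
struct StackSlotIndex {
  std::map<std::pair<unsigned, unsigned>, unsigned> SlotNums;

  unsigned slotNum(unsigned SizeInBits, unsigned OffsetInBits) {
    auto Key = std::make_pair(SizeInBits, OffsetInBits);
    auto It = SlotNums.find(Key);
    if (It != SlotNums.end())
      return It->second;
    unsigned Num = static_cast<unsigned>(SlotNums.size());
    SlotNums.emplace(Key, Num);
    return Num;
  }
};

// Hypothetical: a stack location is the pair {FrameIndex, SlotNum}.
using StackLoc = std::pair<int, unsigned>;

struct SpillTracker {
  StackSlotIndex Slots;
  std::map<StackLoc, int> Values; // stack location -> value number

  // Record that ValueNum now lives at the given position of the slot.
  void spill(int FrameIndex, unsigned Size, unsigned Offset, int ValueNum) {
    Values[{FrameIndex, Slots.slotNum(Size, Offset)}] = ValueNum;
  }

  // Look up the value at a position; -1 means "nothing tracked there".
  int restore(int FrameIndex, unsigned Size, unsigned Offset) {
    auto It = Values.find({FrameIndex, Slots.slotNum(Size, Offset)});
    return It == Values.end() ? -1 : It->second;
  }
};

// Spill a 64-bit register whose low 32 bits hold a separately tracked
// value, then read the 32-bit position back on restore.
int demoRestoreLow32() {
  SpillTracker T;
  T.spill(0, 64, 0, /*ValueNum=*/7); // the super-register as a whole
  T.spill(0, 32, 0, /*ValueNum=*/3); // its low 32-bit subregister
  return T.restore(0, 32, 0);
}
```

The point of the separate SlotNum is that a restore into a register can
recover the subregister's value as well as the super-register's.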
One limitation is that if a PHI occurs inside a stack slot:
DBG_PHI %stack.0, 1
We don't know how large the resulting value is, and so might have
difficulty picking which value to use. DBG_PHI might need to be augmented
in the future with such a size.
Unit tests added ensure that spills and restores correctly transfer to
positions in the Location => Value map, and that different register classes
written to the stack will correctly clobber all other positions in the
stack slot.
Differential Revision: https://reviews.llvm.org/D112133
peter klausler [Tue, 19 Oct 2021 21:46:23 +0000 (14:46 -0700)]
[flang] Emit unformatted headers & footers even with RECL=
The runtime library was emitting unformatted record headers and
footers only when an external unit had no fixed RECL=. This is wrong
for sequential files, which should have headers & footers even
with RECL. Change to omit headers & footers from unformatted
I/O only for direct access files.
Differential Revision: https://reviews.llvm.org/D112243
Craig Topper [Fri, 22 Oct 2021 17:38:53 +0000 (10:38 -0700)]
[LegalizeTypes] Only expand CTLZ/CTTZ/CTPOP during type promotion if the new type is legal.
We might be promoting a large non-power of 2 type and the new type
may need to be split. Once we split it we may have a ctlz/cttz/ctpop
instruction for the split type.
I'm also concerned that we may create large shifts with shift amounts
that are too small.
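For context, here is a small sketch of the arithmetic that promoting a
bit-counting operation to a wider type involves, using i8 promoted to i32.
This only illustrates the general idea and is not the legalizer code; the
helper names are hypothetical, and ctlz32/cttz32 are plain loops standing
in for whatever the wider type provides.

```cpp
#include <cassert>
#include <cstdint>

// Reference loop: number of leading zero bits in a 32-bit value (32 for 0).
unsigned ctlz32(std::uint32_t X) {
  unsigned N = 0;
  for (std::uint32_t Mask = 0x80000000u; Mask && !(X & Mask); Mask >>= 1)
    ++N;
  return N;
}

// Reference loop: number of trailing zero bits in a 32-bit value (32 for 0).
unsigned cttz32(std::uint32_t X) {
  unsigned N = 0;
  for (std::uint32_t Mask = 1u; Mask && !(X & Mask); Mask <<= 1)
    ++N;
  return N;
}

// ctlz on i8 done in i32: zero-extend, count, then subtract the 24 extra
// leading zeros introduced by the wider type.
unsigned ctlz8ViaPromotion(std::uint8_t X) { return ctlz32(X) - 24; }

// cttz on i8 done in i32: set bit 8 first so that an all-zero input
// yields 8 rather than 32.
unsigned cttz8ViaPromotion(std::uint8_t X) {
  return cttz32(std::uint32_t{X} | 0x100u);
}
```

If the wider type is itself not legal, each of these wide operations would
need further legalization, which is the situation the change above guards
against.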
peter klausler [Wed, 20 Oct 2021 17:37:09 +0000 (10:37 -0700)]
[flang] Fix bogus folding error for ISHFT(x, negative)
Negative shift counts are of course valid for ISHFT when
shifting to the right. This patch decouples the folding of
ISHFT from that of SHIFTA/L/R and adds tests.
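As a reminder of the semantics being folded, a minimal sketch for a 32-bit
operand follows. It is written in C++ rather than Fortran and is not the
folding code itself; `ishft32` and its use of std::nullopt to signal an
out-of-range shift are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Sketch of ISHFT(I, SHIFT) for a 32-bit integer: a logical shift that
// goes left for positive SHIFT and right for negative SHIFT, filling with
// zeros. |SHIFT| must not exceed the bit size; std::nullopt models that
// error case here.
std::optional<std::int32_t> ishft32(std::int32_t Value, int Shift) {
  constexpr int Bits = 32;
  if (Shift > Bits || Shift < -Bits)
    return std::nullopt;
  // Shifting by the full width is valid in Fortran and yields zero, but
  // would be undefined as a C++ shift, so handle it separately.
  if (Shift == Bits || Shift == -Bits)
    return 0;
  std::uint32_t U = static_cast<std::uint32_t>(Value);
  std::uint32_t R = Shift >= 0 ? U << Shift : U >> -Shift;
  return static_cast<std::int32_t>(R);
}
```

A negative count such as ISHFT(8, -2) is therefore an ordinary right
shift and should fold to 2 instead of producing an error.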
Differential Revision: https://reviews.llvm.org/D112244
David Green [Fri, 22 Oct 2021 17:36:08 +0000 (18:36 +0100)]
[InstCombine] Various tests for truncating saturates and related patterns.
Simon Pilgrim [Fri, 22 Oct 2021 17:19:02 +0000 (18:19 +0100)]
[DAG] narrowExtractedVectorLoad - EXTRACT_SUBVECTOR indices are always constant
EXTRACT_SUBVECTOR indices are always constant, so we don't need to check for ConstantSDNode; we should just use getConstantOperandVal, which will assert that the operand is constant.
Philip Reames [Fri, 22 Oct 2021 17:24:27 +0000 (10:24 -0700)]
[indvars] Use fact loop must exit to canonicalize to unsigned conditions
The logic in this patch is that if we find a comparison which would be unsigned except for when the loop is infinite, and we can prove that an infinite loop must be ill-defined, we can still make the predicate unsigned.
The eventual goal (combined with a follow-on patch) is to use the fact that the loop exits to remove the zext (see tests) entirely.
A couple of points worth noting:
* We lose the ability to prove the loop unreachable by committing to the must-exit interpretation. If instead, we later proved that rhs was definitely outside the range required for finiteness, we could have killed the loop entirely. (We don't currently implement this transform, but could, in theory, do so.)
* simplifyAndExtend has a very limited list of users it walks. In particular, in the examples it stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts). D112170 explores fixing that, but - at least so far - appears to be too expensive in terms of compile time.
Differential Revision: https://reviews.llvm.org/D111836