Jason Molenda [Fri, 15 Oct 2021 06:55:37 +0000 (23:55 -0700)]
Use Module's FileSpec for limiting binaries to set dyld breakpoint in
When DynamicLoaderMacOS::SetNotificationBreakpoint sets the breakpoint
for new binaries being loaded/unloaded, it limits the scope of that
breakpoint to just dyld, so we don't re-evaluate the breakpoint for
every new binary loaded. I wrote this to get the module's ObjectFile
FileSpec in an earlier change, but this is not correct. If lldb
is debugging a remote system, and it had to read dyld out of memory
from the remote system, it will have no FileSpec on the lldb debugger
host. We need to grab the Module's FileSpec, which in this case is
actually falling back to the PlatformFileSpec, the binary path on the
target system.
rdar://84199646
Shao-Ce SUN [Fri, 15 Oct 2021 06:51:49 +0000 (14:51 +0800)]
[NFC] fix a typo
Ben Shi [Fri, 15 Oct 2021 06:44:28 +0000 (06:44 +0000)]
[RISCV] Optimize immediate materialisation with SH*ADD
Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3,
int32*5 and int32*9.
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D111484
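As a rough illustration of the idea (not the patch's actual selection logic), the sketch below models SHnADD as `(rs1 << n) + rs2` and looks for a base constant that is assumed to be reachable with LUI+ADDI. The helper names `fits_simm32`, `shnadd` and `plan_materialisation` are made up for this example.

```python
MASK64 = (1 << 64) - 1

def fits_simm32(value):
    """True if value fits in a signed 32-bit range (assumed LUI+ADDI reach)."""
    return -(1 << 31) <= value < (1 << 31)

def shnadd(shift, rs1, rs2):
    """Model of SH1ADD/SH2ADD/SH3ADD: (rs1 << shift) + rs2, wrapped to 64 bits."""
    return ((rs1 << shift) + rs2) & MASK64

def plan_materialisation(imm):
    """Return (base, shift) with shnadd(shift, base, base) == imm, or None.

    SH1ADD x, x gives 3*x, SH2ADD x, x gives 5*x, SH3ADD x, x gives 9*x.
    Only non-negative immediates are considered in this sketch.
    """
    for shift, factor in ((1, 3), (2, 5), (3, 9)):
        if imm % factor == 0 and fits_simm32(imm // factor):
            return imm // factor, shift
    return None
```

For instance, `plan_materialisation(3 * 0x7FFFFFFF)` yields a base that fits in 32 bits plus a shift of 1, while a value such as `1 << 40` has no such decomposition and returns `None`.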
Kazu Hirata [Fri, 15 Oct 2021 05:44:08 +0000 (22:44 -0700)]
[llvm] Use llvm::is_contained (NFC)
Max Kazantsev [Fri, 15 Oct 2021 04:01:28 +0000 (11:01 +0700)]
[SCEV] Prove implication of predicates to their sign-flipped counterparts
This patch teaches SCEV two implication rules:
x <u y && y >=s 0 --> x <s y,
x <s y && y <s 0 --> x <u y.
And all equivalents with signs/parts swapped.
Differential Revision: https://reviews.llvm.org/D110517
Reviewed By: nikic
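The two rules can be sanity-checked by brute force on a narrow bit width. The sketch below is only an illustration and is unrelated to how SCEV proves the implication; the 8-bit width and the helper names are assumptions made for this example.

```python
BITS = 8  # small width chosen so the exhaustive check stays fast

def to_signed(value):
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return value - (1 << BITS) if value >= 1 << (BITS - 1) else value

def check_rules():
    """Exhaustively test both implications over all BITS-wide pairs."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            sx, sy = to_signed(x), to_signed(y)
            # x <u y && y >=s 0  -->  x <s y
            if x < y and sy >= 0 and not sx < sy:
                return False
            # x <s y && y <s 0  -->  x <u y
            if sx < sy and sy < 0 and not x < y:
                return False
    return True
```

Intuitively, in the first rule both values end up in the non-negative half, and in the second both end up in the negative half, so signed and unsigned order agree.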
Qiu Chaofan [Fri, 15 Oct 2021 04:22:44 +0000 (12:22 +0800)]
[PowerPC] Support ppc-asm-full-reg-names for AIX
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D94282
Artem Dergachev [Fri, 15 Oct 2021 03:32:54 +0000 (20:32 -0700)]
[analyzer] Fix property access kind detection inside parentheses.
'(self.prop)' produces a surprising AST where ParenExpr
resides inside `PseudoObjectExpr`.
This breaks ObjCMethodCall::getMessageKind() which in turn causes us
to perform unnecessary dynamic dispatch bifurcation when evaluating
body-farmed property accessors, which in turn causes us
to explore infeasible paths.
Richard Smith [Fri, 15 Oct 2021 02:20:01 +0000 (19:20 -0700)]
PR52183: Don't emit code for a void-typed constant expression.
This is unnecessary in general, and wrong when the expression invokes a
consteval function.
Max Kazantsev [Fri, 15 Oct 2021 03:19:15 +0000 (10:19 +0700)]
[SCEV][NFC] Reduce memory footprint & compile time via DFS refactoring
Current implementations of DFS in SCEV check unique-visited of traversed
values on pop, and not on push. As a result, the same value may be pushed
multiple times just to be thrown away when popped. These operations are
meaningless and only waste time and increase memory footprint of the
worklist.
This patch reworks the DFS strategy to check uniqueness before push.
Should be NFC.
Differential Revision: https://reviews.llvm.org/D111774
Reviewed By: nikic, reames
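The difference between the two strategies can be sketched on a toy graph. This is a generic worklist DFS written for illustration, not SCEV's code; the graph representation (a dict from node to operand list) and the push counters are assumptions of the example.

```python
def dfs_check_on_pop(graph, root):
    """Old strategy: push every operand, discard duplicates when popped."""
    visited, worklist, pushes = set(), [root], 1
    while worklist:
        node = worklist.pop()
        if node in visited:
            continue  # wasted work: this entry was pushed for nothing
        visited.add(node)
        for operand in graph.get(node, ()):
            worklist.append(operand)
            pushes += 1
    return visited, pushes

def dfs_check_on_push(graph, root):
    """New strategy: only push values that have not been seen yet."""
    visited, worklist, pushes = {root}, [root], 1
    while worklist:
        node = worklist.pop()
        for operand in graph.get(node, ()):
            if operand not in visited:
                visited.add(operand)
                worklist.append(operand)
                pushes += 1
    return visited, pushes
```

On a diamond-shaped graph (`a` uses `b` and `c`, both of which use `d`), both functions visit the same four nodes, but the first pushes `d` twice while the second pushes it once.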
Mogball [Fri, 15 Oct 2021 00:00:54 +0000 (00:00 +0000)]
[MLIR][ODS] default-valued strings should be in quotes
`DefaultValuedAttr<StrAttr, "">` and `ConstantAttr<StrAttr, "">`
result in bugs in which TableGen will not recognize that the attribute
has a default value, because `""` is an empty TableGen string.
Strings no longer have special treatment. Instead, string values must be
wrapped in quotes: "\"foo\"". Two helpers, `DefaultValuedStrAttr` and
`ConstantStrAttr` have been added to keep code clean.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D111855
Matthias Springer [Fri, 15 Oct 2021 02:17:23 +0000 (11:17 +0900)]
[mlir][linalg][bufferize] Handle scf::ForOp correctly in bufferizesToMemoryRead
From the perspective of analysis, scf::ForOp is treated as a black box. Basic block arguments do not alias with their respective OpOperands on the ForOp, so they do not participate in conflict analysis with ops defined outside of the loop.
However, bufferizesToMemoryRead and bufferizesToMemoryWrite on the scf::ForOp itself are used to determine how the scf::ForOp interacts with its surrounding ops.
Differential Revision: https://reviews.llvm.org/D111775
Matthias Springer [Fri, 15 Oct 2021 01:19:14 +0000 (10:19 +0900)]
[mlir][linalg][bufferize] Rewrite conflict detection
For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes place after the "most recent write" but before the read.
This CL introduces two main changes:
* There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computed alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc.
* No need to check for clobbering.
Differential Revision: https://reviews.llvm.org/D111287
Artur Pilipenko [Wed, 13 Oct 2021 00:53:59 +0000 (17:53 -0700)]
Fix getInlineCost with ComputeFullInlineCost enabled
Fix a bug where getInlineCost incorrectly returns a
cost/threshold pair instead of an explicit never inline.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D111687
Hongtao Yu [Thu, 14 Oct 2021 18:37:44 +0000 (11:37 -0700)]
[CSSPGO] Turn off PseudoProbeUpdatePass for non-FDO builds.
PseudoProbeUpdatePass is used to distribute sample counts among duplicated probes. It doesn't make sense for it to run without a sample profile. The pass takes 1% of the build time.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D111847
Vitaly Buka [Fri, 15 Oct 2021 00:03:00 +0000 (17:03 -0700)]
[NFC][asan] Speedup uar_signals.cpp test
It was the slowest test:
--------------------------------------------------------------------------
41.77s: AddressSanitizer-x86_64-linux :: TestCases/Linux/uar_signals.cpp
26.64s: AddressSanitizer-i386-linux :: TestCases/Linux/uar_signals.cpp
14.82s: AddressSanitizer-x86_64-linux :: TestCases/Posix/current_allocated_bytes.cpp
14.79s: AddressSanitizer-i386-linux :: TestCases/Posix/current_allocated_bytes.cpp
11.55s: AddressSanitizer-x86_64-linux :: TestCases/scariness_score_test.cpp
10.15s: AddressSanitizer-x86_64-linux :: TestCases/Posix/stack-use-after-return.cpp
Vitaly Buka [Thu, 14 Oct 2021 22:37:04 +0000 (15:37 -0700)]
[NFC][sanitizer] Remove %stdcxx11
-std=c++14 has been the default for a while.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D111848
Vitaly Buka [Thu, 14 Oct 2021 22:27:03 +0000 (15:27 -0700)]
[NFC][asan] Use more common socket type in test
Michael Jones [Tue, 12 Oct 2021 23:05:56 +0000 (23:05 +0000)]
[libc] add memccpy and mempcpy
Add an implementation for memccpy and mempcpy. These functions are
POSIX extensions for the moment.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D111762
peter klausler [Thu, 14 Oct 2021 17:32:59 +0000 (10:32 -0700)]
[flang] Admit NULL() in generic procedure resolution cases
Semantics is rejecting valid programs with NULL() actual arguments
to generic interfaces, including user-defined operators. Subclause
16.9.144(para 6) makes clear that NULL() can be a valid actual
argument to a generic interface so long as it does not produce
ambiguity. This patch handles those cases, revises existing
tests, and adjusts an error message about NULL() operands to
appear less like a blanket prohibition.
Differential Revision: https://reviews.llvm.org/D111850
Jacques Pienaar [Thu, 14 Oct 2021 22:58:44 +0000 (15:58 -0700)]
[mlir][ods] Enable emitting getter/setter prefix
Allow emitting get & set prefix for accessors generated for ops. If
enabled, then the argument/return/region name gets converted from
snake_case to UpperCamel and prefix added. The attribute also allows
generating both the current "raw" method along with the prefix'd one to
make it easier to stage changes.
The option is added on the dialect and currently defaults to existing
raw behavior. The expectation is that the staging where both are
generated would be short lived and so optimized to keeping the changes
local/less invasive (it just generates two functions for each accessor
with the same body - most of these internally again call a helper
function). But generation can be optimized if needed.
I'm unsure about OpAdaptor classes as there it is all get methods (it is
a named view into raw data structures), so prefix doesn't add much.
This starts with emitting raw-only form (as current behavior) as
default, then one can opt-in to raw & prefixed, then just prefixed. The
default in OpBase will switch to prefixed-only to be consistent with
the MLIR style guide. The option may be removed later (we considered
allowing the prefix to be specified, but the current discussion favors
keeping it limited, so this patch sticks with that).
Also add more explicit checking for pruned functions to avoid emitting
where no function was added (and so avoiding dereferencing nullptr)
during op def/decl generation.
See https://bugs.llvm.org/show_bug.cgi?id=51916 for further discussion.
Differential Revision: https://reviews.llvm.org/D111033
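The naming scheme described above can be sketched as follows. This is not the ODS generator; the mode names `raw`, `both` and `prefixed` and both helper functions are hypothetical stand-ins for the three stages of the migration.

```python
def snake_to_upper_camel(name):
    """Convert snake_case to UpperCamel, e.g. 'some_operand' -> 'SomeOperand'."""
    return "".join(part[:1].upper() + part[1:] for part in name.split("_"))

def accessor_names(name, mode):
    """Return the getter names emitted for one argument under a given mode.

    'raw' keeps the current behavior, 'both' is the staging mode that emits
    the raw and the prefixed accessor, 'prefixed' emits only the prefixed one.
    """
    raw = name
    prefixed = "get" + snake_to_upper_camel(name)
    if mode == "raw":
        return [raw]
    if mode == "both":
        return [raw, prefixed]
    if mode == "prefixed":
        return [prefixed]
    raise ValueError(f"unknown mode: {mode}")
```

In the staging mode, an argument called `some_operand` would get both `some_operand` and `getSomeOperand`, which lets callers be migrated incrementally.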
peter klausler [Wed, 13 Oct 2021 21:42:21 +0000 (14:42 -0700)]
[flang] Fold LGE/LGT/LLE/LLT intrinsic functions
Fold the legacy intrinsic functions LGE, LGT, LLE, & LLT
by rewriting them into character relational expressions and
then folding those. Also fix folding of comparisons of
character values of distinct lengths: the shorter value must
be padded with blanks. (This fix exposed some bad test cases,
which are also fixed.)
Differential Revision: https://reviews.llvm.org/D111843
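The padding rule can be illustrated with a small model. This is a sketch of the comparison semantics only, not flang's folding code; it assumes plain ASCII strings so that Python's code point ordering matches the ASCII collating sequence, and the function names are invented for the example.

```python
def pad_compare(lhs, rhs):
    """Compare after blank-padding the shorter operand; returns -1, 0 or 1."""
    width = max(len(lhs), len(rhs))
    a, b = lhs.ljust(width), rhs.ljust(width)
    return (a > b) - (a < b)

def lge(a, b):
    return pad_compare(a, b) >= 0

def lgt(a, b):
    return pad_compare(a, b) > 0

def lle(a, b):
    return pad_compare(a, b) <= 0

def llt(a, b):
    return pad_compare(a, b) < 0
```

Without padding, `"abc"` would compare less than `"abc  "`; with padding the two are equal. The padding also changes the result for `"ab"` against `"ab\t"`, because a blank sorts after a tab.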
Arthur Eubanks [Thu, 14 Oct 2021 22:10:32 +0000 (15:10 -0700)]
[NFC][Interpreter] Remove unused CompilerInvocation
Evgenii Stepanov [Thu, 14 Oct 2021 21:56:29 +0000 (14:56 -0700)]
[scudo] Fix running tests under hwasan.
When built with hwasan, assume that the target architecture does not
support TBI. HWASan uses that byte for its own purpose, and changing it
breaks things.
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D111842
Evgenii Stepanov [Thu, 14 Oct 2021 21:56:38 +0000 (14:56 -0700)]
[hwasan] Fix TestCases/thread-uaf.c.
On newer glibc, this test detects an extra match somewhere under
pthread_getattr_np. This results in Thread: lines getting spread out in
the report and failing to match the CHECKs.
Fix the CHECKs to allow this possibility.
Reviewed By: fmayer
Differential Revision: https://reviews.llvm.org/D111841
Evgenii Stepanov [Thu, 14 Oct 2021 21:49:07 +0000 (14:49 -0700)]
[hwasan] Add default "/" prefix.
Add a default "/" prefix to the symbol search path in the
symbolization script. Without this, the binary itself is not considered
a valid source of symbol info.
Differential Revision: https://reviews.llvm.org/D111840
Peyton, Jonathan L [Thu, 14 Oct 2021 21:41:38 +0000 (16:41 -0500)]
[OpenMP][host runtime] Add initial hybrid CPU support
Detect the different core types through CPUID.1A and show them to the
user through the KMP_AFFINITY=verbose mechanism. Offer
__kmp_is_hybrid_cpu() so that future runtime optimizations can tell
whether they are running on a hybrid system or not.
Differential Revision: https://reviews.llvm.org/D110435
Peyton, Jonathan L [Thu, 14 Oct 2021 21:37:52 +0000 (16:37 -0500)]
[OpenMP][host runtime] small fixup of RTM CPUID bit check
David Blaikie [Thu, 14 Oct 2021 21:48:17 +0000 (14:48 -0700)]
Revert "Compress formatting of array type names (int [4] -> int[4])"
Looks like lldb has some issues with this - somehow it causes lldb to
treat a "char[N]" type as an array of chars (prints them out
individually) but a "char [N]" is printed as a string. (even though the
DWARF doesn't have this string in it - it's something to do with the
string lldb generates for itself using clang)
This reverts commit 277623f4d5a672d707390e2c3eaf30a9eb4b075c.
peter klausler [Tue, 12 Oct 2021 23:20:35 +0000 (16:20 -0700)]
[flang] Expunge bogus semantic check for ELEMENTAL without dummies
Semantics refuses valid ELEMENTAL subprograms without dummy arguments,
but there's no such constraint in the standard; indeed, subclause
15.8.2 discusses the meaning of calls to ELEMENTAL functions with
arguments. Remove the check and its test.
Differential Revision: https://reviews.llvm.org/D111832
Stella Laurenzo [Thu, 14 Oct 2021 21:31:09 +0000 (14:31 -0700)]
Disable add_mlir_aggregate() debug file generation.
* Leaves it as a commented out area with a note on how to debug.
Peyton, Jonathan L [Wed, 15 Sep 2021 17:52:14 +0000 (12:52 -0500)]
[OpenMP][host runtime] Add support for teams affinity
This patch implements teams affinity on the host.
The default is spread. A user can specify either spread, close, or
primary using KMP_TEAMS_PROC_BIND environment variable. Unlike
OMP_PROC_BIND, KMP_TEAMS_PROC_BIND is only a single value and is not a
list of values. The values follow the same semantics under the OpenMP
specification for parallel regions except T is the number of teams in
a league instead of the number of threads in a parallel region.
Differential Revision: https://reviews.llvm.org/D109921
Alexey Bataev [Thu, 14 Oct 2021 13:40:06 +0000 (06:40 -0700)]
[SLP]Fix PR52090: clang crashes: Assertion `Index < Length && "Invalid index!"' failed.
Need to check that either Idx is UndefMaskElem and value is UndefValue
or Idx is valid and value is the same as the scalar value in the node.
Differential Revision: https://reviews.llvm.org/D111802
David Blaikie [Thu, 14 Oct 2021 21:07:51 +0000 (14:07 -0700)]
Compress formatting of array type names (int [4] -> int[4])
Based on post-commit review discussion on
2bd84938470bf2e337801faafb8a67710f46429d with Richard Smith.
Other uses of forcing HasEmptyPlaceHolder to false seem OK to me -
they're all around pointer/reference types where the pointer/reference
token will appear at the rightmost side of the left side of the type
name, so they make nested types (eg: the "int" in "int *") behave as
though there is a non-empty placeholder (because the "*" is essentially
the placeholder as far as the "int" is concerned).
Reid Kleckner [Thu, 14 Oct 2021 20:34:15 +0000 (13:34 -0700)]
[bazel] Move MC header usage from Support to tblgen
After the TargetRegistry.h move, nothing in Support includes headers
from MC. However, files in tablegen use MC headers, so we must add an
entry for them in tblgen srcs.
Differential Revision: https://reviews.llvm.org/D111835
Collin Baker [Thu, 14 Oct 2021 20:47:25 +0000 (16:47 -0400)]
[test] Fix asan dynamic unit tests with per-target runtime dirs
When LLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on
Asan-i386-calls-Dynamic-Test and Asan-i386-inline-Dynamic-Test fail to
run on a x86_64 host. This is because asan's unit test lit files are
configured once, rather than per target arch as with the non-unit
tests. LD_LIBRARY_PATH ends up incorrect, and the tests try linking
against the x86_64 runtime which fails.
This changes the unit test CMake machinery to configure the default
and dynamic unit tests once per target arch, similar to the other asan
tests. Then the fix from https://reviews.llvm.org/D108859 is adapted
to the unit test Lit files with some modifications.
Fixes PR52158.
Differential Revision: https://reviews.llvm.org/D111756
Arthur Eubanks [Wed, 13 Oct 2021 18:51:45 +0000 (11:51 -0700)]
[clang] Support -clear-ast-before-backend without -disable-free
Previously without -disable-free, -clear-ast-before-backend would crash in ~ASTContext() due to various reasons.
This works around that by doing a lot of the cleanup ahead of the destructor so that the destructor doesn't actually do any manual cleanup if we've already cleaned up beforehand.
This actually does save a measurable amount of memory with -clear-ast-before-backend, although at an almost unnoticeable runtime cost:
https://llvm-compile-time-tracker.com/compare.php?from=5d755b32f2775b9219f6d6e2feda5e1417dc993b&to=58ef1c7ad7e2ad45f9c97597905a8cf05a26258c&stat=max-rss
Previously we weren't doing any cleanup with -disable-free, so I tried measuring the impact of always doing the cleanup and didn't measure anything noticeable on llvm-compile-time-tracker.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111767
Rong Xu [Thu, 14 Oct 2021 20:33:37 +0000 (13:33 -0700)]
[TableGen][PGO] Disable profile instrumentation for printInstruction function
We are seeing extremely long build times for AMDGPUInstPrinter.cpp
when profile instrumentation is enabled: it takes more than 5 minutes
(compared to ~8 seconds in a non-instrumented build).
This is caused by the huge statements in printInstruction functions. In
a profile instrumentation build, we need extra control flow to
differentiate each case statement. This in turn adds significant
compile time in block placement and branch folding.
Function printInstruction is not likely to benefit from PGO build
as it's rarely executed in a typical compilation. So here I disable
the profile instrumentation for this function.
Differential Revision: https://reviews.llvm.org/D111682
Mogball [Thu, 14 Oct 2021 16:55:33 +0000 (16:55 +0000)]
[MLIR][arith] fix references to std.constant in comments
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D111820
thomasraoux [Thu, 14 Oct 2021 17:39:15 +0000 (10:39 -0700)]
[mlir][vector] Refactor linalg vectorization for reductions
Emit reduction during op vectorization instead of doing it when creating the
transfer write. This allows us to not broadcast output arguments for reduction
initial value.
Differential Revision: https://reviews.llvm.org/D111825
Philip Reames [Thu, 14 Oct 2021 20:26:59 +0000 (13:26 -0700)]
[tests] Add indvars tests showing missing transforms with small IVs
This shows the transform side of D109457, but also lets us try other approaches to the same problem. The common trend to all is that we need to explicitly reason about UB to rule out the possibility of infinite loops.
David Green [Thu, 14 Oct 2021 20:26:24 +0000 (21:26 +0100)]
[AArch64] Add extra tests for fptosisat vector variants
Roman Lebedev [Thu, 14 Oct 2021 20:07:59 +0000 (23:07 +0300)]
[X86][Costmodel] Improve cost modelling for not-fully-interleaved load
While i've modelled most of the relevant tuples for AVX2,
that only covered fully-interleaved groups.
By definition, interleaving load of stride N means:
load N*VF elements, and shuffle them into N VF-sized vectors,
with 0'th vector containing elements `[0, VF)*stride + 0`,
and 1'th vector containing elements `[0, VF)*stride + 1`.
Example: https://godbolt.org/z/df561Me5E (i64 stride 4 vf 2 => cost 6)
Now, a not-fully-interleaved load is one where not all of these vectors are demanded.
So at worst, we could just pretend that everything is demanded,
and discard the non-demanded vectors. What this means is that the cost
for not-fully-interleaved group should be not greater than the cost
for the same fully-interleaved group, but perhaps somewhat less.
Examples:
https://godbolt.org/z/a78dK5Geq (i64 stride 4 (indices 012u) vf 2 => cost 4)
https://godbolt.org/z/G91ceo8dM (i64 stride 4 (indices 01uu) vf 2 => cost 2)
https://godbolt.org/z/5joYob9rx (i64 stride 4 (indices 0uuu) vf 2 => cost 1)
As we have established over the course of last ~70 patches, (wow)
`BaseT::getInterleavedMemoryOpCost()` is absolutely bogus,
it is usually almost an order of magnitude overestimation,
so i would claim that we should at least use the hardcoded costs
of fully interleaved load groups.
We could go further and adjust them e.g. by the number of demanded indices,
but then i'm somewhat fearful of underestimating the cost.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111174
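The definition of an interleaved load used above can be sketched as a plain de-interleave. This models only the data movement, not any cost; the `demanded` parameter and the use of `None` for non-demanded ("u") vectors are conventions made up for this example.

```python
def interleaved_load(memory, stride, vf, demanded=None):
    """Split stride*vf contiguous elements into `stride` vectors of `vf` lanes.

    Vector i holds memory[k * stride + i] for k in [0, vf). Indices not in
    `demanded` come back as None, mirroring the 'u' notation in the examples.
    """
    assert len(memory) >= stride * vf
    if demanded is None:
        demanded = range(stride)  # fully interleaved: every vector is used
    return [
        [memory[k * stride + i] for k in range(vf)] if i in demanded else None
        for i in range(stride)
    ]
```

With stride 4 and vf 2 over the elements 0..7, the fully interleaved form yields four vectors, and demanding only indices 0 and 1 corresponds to the `01uu` case listed above.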
Philip Reames [Thu, 14 Oct 2021 00:11:45 +0000 (17:11 -0700)]
autogen tests for ease of update
Craig Topper [Thu, 14 Oct 2021 19:56:42 +0000 (12:56 -0700)]
[RISCV] Remove unused member variable. NFC
Nikita Popov [Tue, 12 Oct 2021 21:23:22 +0000 (23:23 +0200)]
[IVUsers] Move preheader check into SCEVExpander
Rather than checking for loop nest preheaders upfront in IVUsers,
move this requirement into isSafeToExpand() from SCEVExpander.
Historically, LSR did not check whether SCEVs are safe to expand
and fully relied on IVUsers to validate this. Later, support for
non-expandable SCEVs was added via rigid formulas.
Checking this in isSafeToExpand() makes it more obvious what
exactly this check is guarding against, and avoids the awkward
loop nest scan.
This is a followup to https://reviews.llvm.org/D111493#3055286.
Differential Revision: https://reviews.llvm.org/D111681
Aaron Ballman [Thu, 14 Oct 2021 19:46:22 +0000 (15:46 -0400)]
Fix a crash on valid consteval code.
Not all constants are emitted within the context of a function, so use
the module's ASTContext instead because 1) that's the same as the
current function ASTContext, and 2) the module can never be null.
Fixes PR50787.
Raphael Isemann [Thu, 14 Oct 2021 19:36:10 +0000 (21:36 +0200)]
[lldb] Move ~Platform to source file
The called destructors of the members require the includes that are only
in the source file.
Frederic Cambus [Thu, 14 Oct 2021 19:30:39 +0000 (21:30 +0200)]
[Driver][Darwin] Use T reference instead of getToolChain().getTriple().
Differential Revision: https://reviews.llvm.org/D111793
Craig Topper [Thu, 14 Oct 2021 18:23:42 +0000 (11:23 -0700)]
[X86] Use CMOVNS for abs instead of CMOVGE.
CMOVGE reads SF and OF. CMOVNS only reads SF. This matches with
other recent changes to use a single flag where possible. It also
matches gcc codegen.
I believe this technically changes whether the conditional move happens
on INT_MIN, but for INT_MIN both registers are the same so it doesn't
matter.
Differential Revision: https://reviews.llvm.org/D111826
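The INT_MIN remark can be checked with a small flag model. This is a sketch that assumes the abs sequence is a NEG followed by a conditional move that selects between the negated and the original value; the exact operand order of the emitted code is an assumption, and the function names are made up.

```python
INT_MIN = -(1 << 31)

def neg32(x):
    """Model 32-bit NEG: return (result, SF, OF) for 0 - x."""
    result = (-x) & 0xFFFFFFFF
    if result >= 1 << 31:
        result -= 1 << 32          # reinterpret as signed
    sf = result < 0                # sign flag: top bit of the result
    of = x == INT_MIN              # NEG overflows only for INT_MIN
    return result, sf, of

def abs_cmovge(x):
    """Select the negated value when GE holds (SF == OF)."""
    neg, sf, of = neg32(x)
    return neg if sf == of else x

def abs_cmovns(x):
    """Select the negated value when NS holds (SF == 0)."""
    neg, sf, _ = neg32(x)
    return neg if not sf else x
```

For INT_MIN the two conditions disagree (GE is true, NS is false), but the negated and the original value are both INT_MIN, so the selected result is the same.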
Michael Kruse [Thu, 14 Oct 2021 19:06:20 +0000 (14:06 -0500)]
[Polly] Remove support for code generated by gfortran+DragonEgg.
DragonEgg is not maintained anymore, hence there is no need for this
functionality.
Fixes llvm.org/PR52173
Michael Kruse [Thu, 14 Oct 2021 18:47:12 +0000 (13:47 -0500)]
[Polly][docs] Fix itemize list for release notes.
Make the changes top-level items, instead of subitems of the
"Changes..." placeholder.
Aaron Ballman [Thu, 14 Oct 2021 18:44:37 +0000 (14:44 -0400)]
Fix a rejects-valid with consteval on overloaded operators
It seems that Clang 11 regressed functionality that was working in
Clang 10 regarding calling a few overloaded operators in an immediate
context. Specifically, we were not checking for immediate invocations
of array subscripting and the arrow operators, but we properly handle
the other overloaded operators.
This fixes the two problematic operators and adds some test coverage to
show they're equivalent to calling the operator directly.
This addresses PR50779.
Raphael Isemann [Thu, 14 Oct 2021 18:41:58 +0000 (20:41 +0200)]
[lldb] Remove logging from Platform::~Platform
Platform instances are stored in a function-local static list. However, the
logging code involves locking a function-local static mutex. This only works on
some implementations where the Log mutex is by accident destroyed *after* the
Platform list is destroyed.
This fixes randomly failing tests due to `recursive_mutex lock failed: Invalid
argument`.
Reviewed By: kastiglione
Differential Revision: https://reviews.llvm.org/D111816
Rob Suderman [Thu, 14 Oct 2021 02:21:09 +0000 (19:21 -0700)]
[mlir][tosa] Fix tosa.cast UiToFp32 for tosa-to-linalg
Part of the arith update broke UiToFp32. Fixed the lowering and included a new
test to detect a regression.
Differential Revision: https://reviews.llvm.org/D111772
Raphael Isemann [Wed, 13 Oct 2021 16:55:32 +0000 (18:55 +0200)]
[lldb] Rewrite TestDiamond and document some bugs.
David Tenty [Wed, 13 Oct 2021 15:41:47 +0000 (11:41 -0400)]
[libc++][AIX] Add scripts and config for building with the libcxx CI infrastructure
This initial change adds the AIX configuration to run-buildbot, an AIX
CMake cache file, and appropriate compiler and linker flags for testing
AIX to the lit "from scratch" configuration files. Either of the 32-bit or 64-bit configurations
can be built by setting `OBJECT_MODE` in the build environment (as is
typical for AIX).
Reviewed By: ldionne, #libc, #libc_abi
Differential Revision: https://reviews.llvm.org/D111244
Nikita Popov [Sun, 26 Sep 2021 20:20:53 +0000 (22:20 +0200)]
[BasicAA] Improve scalable vector handling
Currently, DecomposeGEP() bails out on the whole decomposition if
it encounters a scalable GEP type anywhere. However, it is fine to
still analyze other GEPs that we look through before hitting the
scalable GEP. This does mean that the decomposed GEP base is no
longer required to be the same as the underlying object. However,
I don't believe this property is necessary for correctness anymore.
This allows us to compute slightly more precise aliasing results
for GEP chains containing scalable vectors, though my primary
interest here is simplifying the code.
Differential Revision: https://reviews.llvm.org/D110511
Daniel Sanders [Tue, 12 Oct 2021 22:44:35 +0000 (15:44 -0700)]
[llvm-mca][timeline] Indicate output was stopped due to cycle limit.
It can be a bit confusing to stop with no explanation so we should indicate
when further output was prevented by the cycle limit.
Differential Revision: https://reviews.llvm.org/D111753
Kai Nacke [Thu, 14 Oct 2021 17:50:05 +0000 (13:50 -0400)]
[AIX] Ignore case when comparing output from od
POSIX does not define the exact output from the od tool.
While most implementations use lower case characters in hex output,
the z/OS USS implementation uses upper case characters.
To avoid LIT failures, the FileCheck option to ignore the case must
be used when checking hex bytes.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D111427
Simon Pilgrim [Thu, 14 Oct 2021 17:40:23 +0000 (18:40 +0100)]
[TTI][X86] Merge getInterleavedMemoryOpCostAVX2 into getInterleavedMemoryOpCost. NFC
This is an NFC refactor patch to merge the AVX2 interleaved cost handling back into the getInterleavedMemoryOpCost base method - while getInterleavedMemoryOpCostAVX512 uses instructions and patterns very specific to AVX512+, much of the cost analysis for AVX2 can be reused for all SSE targets.
This is the first step towards improving SSE and AVX1 costs that will reuse the relevant AVX2 costs by splitting some of the tables - for instance AVX1 has very similar costs for most vXi64/vXf64 interleave patterns and many sub-128bit vector costs are the same all the way down to SSE2 (or at least SSSE3).
Differential Revision: https://reviews.llvm.org/D111822
Frederic Cambus [Thu, 14 Oct 2021 17:43:59 +0000 (19:43 +0200)]
[Driver][WebAssembly] Use ToolChain reference instead of getToolChain().
Differential Revision: https://reviews.llvm.org/D111786
Michael Kruse [Thu, 14 Oct 2021 17:24:09 +0000 (12:24 -0500)]
[Polly] Clean up Polly's getting started docs.
This patch removes the broken bash script (polly.sh) and fixes the broken setup
instructions in get_started.html. It also adds instructions for using Ninja and
links to the LLVM getting started page.
Reviewed By: Meinersbur, InnovativeInventor
Differential Revision: https://reviews.llvm.org/D111685
Simon Pilgrim [Thu, 14 Oct 2021 17:09:49 +0000 (18:09 +0100)]
[TTI][X86] Swap getInterleavedMemoryOpCostAVX2/getInterleavedMemoryOpCostAVX512 implementations. NFC.
I have some upcoming refactoring for SSE/AVX1 interleaving cost support, and the diff is a lot nicer if the (unaltered) AVX512 implementation isn't stuck between getInterleavedMemoryOpCost and getInterleavedMemoryOpCostAVX2
Simon Pilgrim [Thu, 14 Oct 2021 12:35:25 +0000 (13:35 +0100)]
[Transforms] eliminateDeadStores - remove unused variable. NFC.
The initial MemoryAccess *Current assignment is never used, and all other uses are initialized/used within the worklist loop (and not across multiple iterations) - so move the variable internal to the loop.
Fixes scan-build unused assignment warning.
Yitzhak Mandelbaum [Wed, 13 Oct 2021 12:30:10 +0000 (12:30 +0000)]
[libTooling] Add "switch"-like Stencil combinator
Adds `selectBound`, a `Stencil` combinator that allows the user to supply multiple alternative cases, discriminated by bound node IDs.
Differential Revision: https://reviews.llvm.org/D111708
Kevin P. Neal [Thu, 14 Oct 2021 16:31:31 +0000 (12:31 -0400)]
[FPEnv][InstSimplify] Fold fadd X, 0 ==> X, when we know X is not -0
Currently the fadd optimizations in InstSimplify don't know how to do this
NoSignedZeros "X + 0.0 ==> X" fold when using the constrained intrinsics.
This adds the support.
This review is derived from D106362 with some improvements from D107285
and is a follow-on to D111085.
Differential Revision: https://reviews.llvm.org/D111450
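Why the fold needs X to be known not to be -0 can be seen with ordinary IEEE doubles. The sketch below uses Python floats under the default round-to-nearest mode and ignores the rounding-mode and exception aspects that the constrained intrinsics also track; the helper names are made up.

```python
import math

def same_value(a, b):
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

def fold_is_exact(x):
    """True if replacing x + 0.0 by x preserves the result, sign of zero included."""
    return same_value(x + 0.0, x)
```

`fold_is_exact(-0.0)` is false because `-0.0 + 0.0` evaluates to `+0.0`, while the fold holds for `+0.0` and for non-zero values such as `1.5`.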
Craig Topper [Thu, 14 Oct 2021 16:08:31 +0000 (09:08 -0700)]
[RISCV] Update Zba, Zbb, Zbc, and Zbs version from 0.93 to 1.0.
I've removed the Zbs W instructions that are not part of the frozen spec.
References to B as an extension name have been removed. Tests are updated or split accordingly.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D110669
Vitaly Buka [Tue, 12 Oct 2021 05:59:58 +0000 (22:59 -0700)]
[sanitizer] Move out stack trace pointer from header StackDepot
Trace pointers are accessed very rarely and don't need to
be in hot data.
Depends on D111613.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D111614
Nikita Popov [Thu, 14 Oct 2021 16:17:28 +0000 (18:17 +0200)]
[ValueTracking] Simplify getKnowledgeValidInContext() call (NFC)
This accepts an ArrayRef, there's no need to create a SmallVector.
Wenlei He [Thu, 14 Oct 2021 05:27:08 +0000 (22:27 -0700)]
[llvm-profgen] Allow generating AutoFDO profile from CSSPGO binary
Add `-use-dwarf-correlation` switch to allow llvm-profgen to generate AutoFDO profile for binaries built with CSSPGO (pseudo-probe).
Differential Revision: https://reviews.llvm.org/D111776
Joe Loser [Thu, 14 Oct 2021 15:53:43 +0000 (11:53 -0400)]
[libc++] LWG3480: make (recursive_)directory_iterator C++20 ranges
Implement LWG3480 which enables `directory_iterator` and
`recursive_directory_iterator` to be both a `borrowed_range` and a
`view`.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D111644
Julien Pages [Thu, 14 Oct 2021 15:44:09 +0000 (11:44 -0400)]
[AMDGPU] Add more tests for build_vector
Differential Revision: https://reviews.llvm.org/D111652
Gabor Marton [Thu, 30 Sep 2021 13:34:06 +0000 (15:34 +0200)]
[analyzer][solver] Handle simplification to ConcreteInt
The solver's symbol simplification mechanism was not able to handle cases
when a symbol is simplified to a concrete integer. This patch adds the
capability.
E.g., in the attached lit test case, the original symbol is `c + 1` and
it has a `[0, 0]` range associated with it. Then, a new condition `c == 0`
is assumed, so a new range constraint `[0, 0]` comes in for `c` and
simplification kicks in. `c + 1` becomes `0 + 1`, but the associated
range is `[0, 0]`, so now we are able to realize the contradiction.
Differential Revision: https://reviews.llvm.org/D110913
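The contradiction in the example can be sketched with a toy constraint table. This is not the analyzer's range constraint solver; the dict-based representation of `c + 1` as the key `("c", 1)` and the function name are assumptions made for illustration.

```python
def simplify_and_check(ranges, symbol, addend, value):
    """Assume `symbol == value`, fold `symbol + addend`, and test feasibility.

    `ranges` maps an expression (symbol, addend) to an inclusive (lo, hi)
    range. Returns False when the folded constant falls outside that range,
    i.e. when the new assumption contradicts the recorded constraint.
    """
    lo, hi = ranges[(symbol, addend)]
    concrete = value + addend       # `c + 1` becomes `0 + 1`
    return lo <= concrete <= hi
```

With `c + 1` constrained to `[0, 0]`, assuming `c == 0` folds the expression to `1`, which lies outside the range, so the state is infeasible.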
Mark de Wever [Thu, 14 Oct 2021 15:40:45 +0000 (17:40 +0200)]
[libc++][NFC] Fixes placement of the return type.
Dave Lee [Tue, 5 Oct 2021 18:10:45 +0000 (11:10 -0700)]
[lldb] Fix 'frame diagnose' docstring typo
Nicolas Vasilache [Thu, 14 Oct 2021 15:20:43 +0000 (15:20 +0000)]
[mlir][Linalg] Tighten canonicalization of InsertSliceOp that triggers infinite loop
I am unsure whether this is reproducible with correct IR, but atm the verifier for InsertSliceOp
is not powerful enough and this triggers an infinite loop that is worth fixing independently.
Differential Revision: https://reviews.llvm.org/D111812
Nicolas Vasilache [Thu, 14 Oct 2021 15:20:31 +0000 (15:20 +0000)]
[mlir][Linalg] Fix insertion point in comprehensive bufferization
luxufan [Thu, 14 Oct 2021 15:06:43 +0000 (23:06 +0800)]
[JITLink][NFC] Add TableManager to replace PerGraph...Builder pass
This patch adds a TableManager which is responsible for fixing edges that need entries to reference the target symbol and for constructing such entries.
In the past, the PerGraphGOTAndPLTStubsBuilder pass was used to build GOT and PLT entries, and the PerGraphTLSInfoEntryBuilder pass was used to build TLSInfo entries. By generalizing the entry-building behavior, I added a TableManager which can be reused when building GOT, PLT and TLSInfo entries.
If this patch makes sense and can be accepted, I will apply the TableManager to other targets (MachO_x86_64, MachO_arm64, ELF_riscv), and delete the file PerGraphGOTAndPLTStubsBuilder.h
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110383
Jinsong Ji [Thu, 14 Oct 2021 14:52:17 +0000 (14:52 +0000)]
[NFC][compiler-rt][profile] Remove non-Posix -h option from test
The gcov-execlp.c test in the Posix folder runs `ls -lh`.
However, `-h` is not a POSIX option, so ls on some POSIX systems (e.g. AIX)
may not support it.
This patch removes the option to avoid breakage.
Reviewed By: anhtuyen
Differential Revision: https://reviews.llvm.org/D111807
Ben Shi [Mon, 4 Oct 2021 15:38:52 +0000 (15:38 +0000)]
[RISCV][test] Add tests of (add (shl r, c0), c1)
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D111116
Tobias Gysi [Thu, 14 Oct 2021 14:08:37 +0000 (14:08 +0000)]
[mlir][linalg] Fix FusionOnTensors header and make local method static (NFC).
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111798
Jeremy Morse [Thu, 14 Oct 2021 13:21:49 +0000 (14:21 +0100)]
[DebugInfo][InstrRef] Place variable-values PHI using LLVM utilities
This patch is very similar to D110173 / a3936a6c19c, but for variable
values rather than machine values. This is for the second instr-ref
problem, calculating the correct variable value on entry to each block.
The previous lattice-based implementation was broken; we now use LLVM's
existing PHI placement utilities to work out where values need to merge,
then eliminate unnecessary ones through value propagation.
Most of the deletions here happen in vlocJoin: it was trying to pick a
location for PHIs to happen in, badly, leading to an infinite loop in the
MIR test added, where it would repeatedly switch between register
locations. The new approach is simpler: either PHIs can be eliminated, or
they can't, and the location of the value is a different problem.
Various bits and pieces move to the header so that they can be tested in
the unit tests. The DbgValue class grows a "VPHI" kind to represent
variable value PHIs that haven't been eliminated yet.
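To make the "either PHIs can be eliminated, or they can't" idea concrete, here is a minimal Python sketch of PHI elimination by value propagation. It is a generic illustration, not the code in this patch: the function name `eliminate_phis` and the dictionary representation are assumptions, and the real pass works per block and per variable.

```python
def eliminate_phis(phis):
    """Resolve PHIs whose incoming values all agree.

    `phis` maps a PHI name to the list of its incoming values. An incoming
    value is either another PHI name or the name of a non-PHI value.
    Returns a dict mapping each eliminated PHI to the value that replaces it.
    """
    resolved = {}

    def find(value):
        # Look through PHIs that have already been eliminated.
        while value in resolved:
            value = resolved[value]
        return value

    changed = True
    while changed:
        changed = False
        for phi, incoming in phis.items():
            if phi in resolved:
                continue
            values = {find(v) for v in incoming}
            values.discard(phi)  # a PHI feeding itself adds no new value
            if len(values) == 1:
                resolved[phi] = values.pop()
                changed = True
    return {phi: find(phi) for phi in resolved}


# Hypothetical example: two PHIs only ever see `x`, one genuinely merges x and y.
example = {
    "phi_loop": ["x", "phi_loop"],
    "phi_join": ["phi_loop", "x"],
    "phi_real": ["x", "y"],
}
eliminated = eliminate_phis(example)
print(eliminated)
```

PHIs left out of the result (here `phi_real`) are the ones that cannot be eliminated; where such a value lives is deliberately not modelled.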
Differential Revision: https://reviews.llvm.org/D110630
Brian Cain [Mon, 22 Feb 2021 15:17:23 +0000 (09:17 -0600)]
[hexagon] Add system register, transfer support
This commit adds the system reg/regpair definitions and the corresponding
register transfer instructions.
Andrew Savonichev [Thu, 14 Oct 2021 13:32:16 +0000 (16:32 +0300)]
Fixup [NVPTX] Add VRFrame and VRFrameLocal to integer register classes
Andrew Savonichev [Thu, 14 Oct 2021 13:18:34 +0000 (16:18 +0300)]
[NVPTX] Add VRFrame and VRFrameLocal to integer register classes
These registers are used as operands for instructions that expect an
integer register, so they should be added to Int32Regs or Int64Regs
register classes. Otherwise the machine verifier emits an error for
the following LIT tests when LLVM_ENABLE_MACHINE_VERIFIER=1
environment variable is set:
*** Bad machine code: Illegal physical register for instruction ***
- function: kernel_func
- basic block: %bb.0 entry (0x55c8903d5438)
- instruction: %3:int64regs = LEA_ADDRi64 $vrframelocal, 0
- operand 1: $vrframelocal
$vrframelocal is not a Int64Regs register.
CodeGen/NVPTX/call-with-alloca-buffer.ll
CodeGen/NVPTX/disable-opt.ll
CodeGen/NVPTX/lower-alloca.ll
CodeGen/NVPTX/lower-args.ll
CodeGen/NVPTX/param-align.ll
CodeGen/NVPTX/reg-types.ll
DebugInfo/NVPTX/dbg-declare-alloca.ll
DebugInfo/NVPTX/dbg-value-const-byref.ll
Differential Revision: https://reviews.llvm.org/D110164
Florian Hahn [Thu, 14 Oct 2021 10:54:04 +0000 (11:54 +0100)]
[VectorCombine] Add test showing issue when running VectorCombine early.
Running -vector-combine early can introduce new vector operations,
blocking loop/SLP vectorization. The added test case could be better
optimized by the SLPVectorizer if no new vector operations are added
early.
Jonas Paulsson [Thu, 14 Oct 2021 12:42:58 +0000 (14:42 +0200)]
[SystemZ] Remove some now unused ISD XXX_LOOP opcodes.
Andrew Savonichev [Wed, 8 Sep 2021 15:19:57 +0000 (18:19 +0300)]
[ARM] Simplify address calculation for NEON load/store
The patch attempts to optimize a sequence of SIMD loads from the same
base pointer:
%0 = gep float*, float* base, i32 4
%1 = bitcast float* %0 to <4 x float>*
%2 = load <4 x float>, <4 x float>* %1
...
%n1 = gep float*, float* base, i32 N
%n2 = bitcast float* %n1 to <4 x float>*
%n3 = load <4 x float>, <4 x float>* %n2
For AArch64 the compiler generates a sequence of LDR Qt, [Xn, #16].
However, 32-bit NEON VLD1/VST1 lack the [Wn, #imm] addressing mode, so
the address is computed before every ld/st instruction:
add r2, r0, #32
add r0, r0, #16
vld1.32 {d18, d19}, [r2]
vld1.32 {d22, d23}, [r0]
This can be improved by computing the address for the first load, and then
using a post-indexed form of VLD1/VST1 to load the rest:
add r0, r0, #16
vld1.32 {d18, d19}, [r0]!
vld1.32 {d22, d23}, [r0]
In order to do that, the patch adds more patterns to DAGCombine:
- (load (add ptr inc1)) and (add ptr inc2) are now folded if inc1
and inc2 are constants.
- (or ptr inc) is now recognized as a pointer increment if ptr is
sufficiently aligned.
In addition to that, we now search for all possible base updates and
then pick the best one.
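The alignment condition on the (or ptr inc) pattern can be checked with a small Python sketch. This is only an arithmetic illustration under the assumption that "sufficiently aligned" means the known alignment of ptr exceeds inc; the helper name `or_is_add` is hypothetical and is not part of the patch.

```python
def or_is_add(ptr, inc, align):
    """Return True when `ptr | inc` is guaranteed to equal `ptr + inc`.

    `align` is a power of two that `ptr` is known to be a multiple of, so the
    low log2(align) bits of `ptr` are zero. If 0 <= inc < align, no set bit of
    `inc` overlaps a set bit of `ptr`, and OR produces the same result as ADD.
    """
    assert align > 0 and align & (align - 1) == 0, "align must be a power of two"
    assert ptr % align == 0, "ptr must honour the stated alignment"
    return 0 <= inc < align


# A 32-byte aligned pointer with a 16-byte increment: OR acts as an increment.
print(or_is_add(0x1000, 16, 32), (0x1000 | 16) == (0x1000 + 16))

# A pointer that is only 16-byte aligned: bit 4 may already be set, so OR
# with 16 cannot be treated as an increment.
print(or_is_add(0x1010, 16, 16), (0x1010 | 16) == (0x1010 + 16))
```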
Differential Revision: https://reviews.llvm.org/D108988
Simon Pilgrim [Thu, 14 Oct 2021 12:08:40 +0000 (13:08 +0100)]
[Codegen] TargetLowering::getCanonicalIndexType - early out scaled MVT::i8 indices. NFCI.
Avoids unused assignment scan-build warning.
Simon Pilgrim [Thu, 14 Oct 2021 11:51:23 +0000 (12:51 +0100)]
[clang][sema] instantiateOMPDeclareVariantAttr - merge repeated VariantFuncRef.get() calls. NFCI.
Fixes scan-build warning about dead initialization
Kirill Bobyrev [Thu, 14 Oct 2021 11:36:25 +0000 (13:36 +0200)]
[clangd] IncludeCleaner: Handle macros coming from ScratchBuffer
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D111698
Nicolas Vasilache [Thu, 14 Oct 2021 11:26:11 +0000 (11:26 +0000)]
[mlir] NFC - Avoid unused symbol in opt mode.
Simon Pilgrim [Thu, 14 Oct 2021 11:17:28 +0000 (12:17 +0100)]
[CostModel][X86] Pre-SSE41 targets can use PMADDWD for sext sub-i16 -> i32
Without SSE41 sext/zext instructions, the extensions will be split, meaning that the MUL->PMADDWD fold will split the sext_i32(x) into zext_i32(sext_i16(x))
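One way to see why the split form still gives the right product is to model a single 32-bit lane of PMADDWD in Python. This is a sketch of the instruction's arithmetic only, not of the cost model; all helper names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sext(value, from_bits, to_bits):
    """Sign-extend a `from_bits` pattern to a `to_bits` bit pattern."""
    return to_signed(value, from_bits) & ((1 << to_bits) - 1)


def pmaddwd_lane(a, b):
    """One 32-bit lane of PMADDWD.

    `a` and `b` are 32-bit patterns, each holding two signed 16-bit words.
    Corresponding words are multiplied and the two products are added.
    """
    lo = to_signed(a, 16) * to_signed(b, 16)
    hi = to_signed(a >> 16, 16) * to_signed(b >> 16, 16)
    return to_signed(lo + hi, 32)


def mul_via_pmaddwd(x, y):
    """Multiply two i8 bit patterns using zext_i32(sext_i16(.)) operands."""
    a = sext(x, 8, 16)  # upper 16 bits of the lane stay zero (the zext step)
    b = sext(y, 8, 16)
    return pmaddwd_lane(a, b)


# Compare against a plain signed multiply of the sign-extended i8 inputs.
matches = all(
    mul_via_pmaddwd(x, y) == to_signed(x, 8) * to_signed(y, 8)
    for x in range(256)
    for y in range(256)
)
print(matches)
```

Because the upper word of each operand is zero, the second product contributes nothing and the lane holds the signed product of the low words.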
Simon Pilgrim [Thu, 14 Oct 2021 11:05:35 +0000 (12:05 +0100)]
[Orc] ELFNixPlatform::setupJITDylib - remove dead return. NFCI.
2 returns, one after the other - reported by coverity
Alex Zinenko [Thu, 14 Oct 2021 09:15:44 +0000 (11:15 +0200)]
[mlir][python] Better support for variadic regions in Python bindings
Improve support for variadic regions in ODS-generated operation view classes.
In particular, make generated constructors take an extra argument that
specifies the number of variadic regions if the operation has them. Previously,
there was no mechanism to specify a non-zero number of variadic regions. Also
generate named accessors to regions.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D111783
Alex Zinenko [Thu, 14 Oct 2021 09:33:28 +0000 (11:33 +0200)]
[mlir][python] Fix MemRefType IsAFunction in Python bindings
MemRefType was using a wrong `isa` function in the bindings code, which
could lead to invalid IR being constructed. Also run the verifier in
memref dialect tests.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111784
Jeremy Morse [Thu, 14 Oct 2021 09:58:33 +0000 (10:58 +0100)]
Follow up to a3936a6c19c, correctly select LiveDebugValues implementation
Some functions get opted out of instruction referencing if they're being
compiled with no optimisations; however, the LiveDebugValues pass picks one
implementation and then sticks with it through the rest of compilation.
This leads to a segfault if we encounter a function that doesn't use
instr-ref (because it's optnone, for example), but we've already decided
to use InstrRefBasedLDV which expects to be passed a DomTree.
Solution: keep both implementations around in the pass, and pick whichever
one is appropriate to the current function.
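The shape of the fix can be sketched in Python as follows. This is an illustrative model only: the classes, the `uses_instr_ref` flag and the `build_dom_tree` helper are hypothetical stand-ins, not the pass's real interfaces.

```python
class VarLocImpl:
    """Stand-in for the implementation that does not need a DomTree."""

    def run(self, function):
        return f"var-loc:{function['name']}"


class InstrRefImpl:
    """Stand-in for the instruction-referencing implementation."""

    def run(self, function, dom_tree):
        if dom_tree is None:
            raise ValueError("instr-ref implementation requires a DomTree")
        return f"instr-ref:{function['name']}"


def build_dom_tree(function):
    # Placeholder: a real dominator tree would be computed from the CFG.
    return {"root": function["name"]}


class LiveDebugValuesPass:
    def __init__(self):
        # Keep both implementations alive for the whole compilation.
        self.instr_ref_impl = InstrRefImpl()
        self.var_loc_impl = VarLocImpl()

    def run_on_function(self, function):
        # Decide per function instead of once for the whole pass.
        if function["uses_instr_ref"]:
            return self.instr_ref_impl.run(function, build_dom_tree(function))
        return self.var_loc_impl.run(function)


ldv = LiveDebugValuesPass()
results = [
    ldv.run_on_function({"name": "optimised", "uses_instr_ref": True}),
    ldv.run_on_function({"name": "optnone", "uses_instr_ref": False}),
]
print(results)
```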
Uday Bondhugula [Tue, 12 Oct 2021 09:34:17 +0000 (15:04 +0530)]
[MLIR] Fix assert crash when an unregistered dialect op is encountered
Fix assert crash when an unregistered dialect op is encountered during
parsing and `-allow-unregistered-dialect` isn't on. Instead, emit an
error.
While at it, clean up "registered" vs "loaded" on `getDialect()` and
local clang-tidy warnings.
https://llvm.discourse.group/t/assert-behavior-on-unregistered-dialect-ops/4402
Differential Revision: https://reviews.llvm.org/D111628
Tobias Gysi [Thu, 14 Oct 2021 09:38:21 +0000 (09:38 +0000)]
[mlir][linalg] Expose flag to control nofold attribute when padding.
Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The patch introduces a callback to control the flag.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111718
Josh Mottley [Wed, 13 Oct 2021 18:19:33 +0000 (19:19 +0100)]
[Flang] flang-omp-report replace std::vector's with llvm::SmallVector
This patch replaces all uses of std::vector with llvm::SmallVector in the flang-omp-report plugin.
This is one of several patches focusing on switching containers from STL to LLVM's ADT library.
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D111709