Matt Arsenault [Thu, 15 Jul 2021 18:44:03 +0000 (14:44 -0400)]
GlobalISel: Remove dead function
Matt Arsenault [Thu, 15 Jul 2021 18:24:00 +0000 (14:24 -0400)]
AMDGPU/GlobalISel: Preserve more memory types
Matt Arsenault [Thu, 15 Jul 2021 18:23:06 +0000 (14:23 -0400)]
AMDGPU/GlobalISel: Redo kernel argument load handling
This avoids relying on G_EXTRACT on unusual types, and also properly
decomposes structs into multiple registers. This also preserves the
LLTs in the memory operands.
Jeremy Morse [Fri, 16 Jul 2021 12:36:27 +0000 (13:36 +0100)]
[InstrRef][FastISel] Support emitting DBG_INSTR_REF from fast-isel
If you attach __attribute__((optnone)) to a function when using
optimisations, that function will use fast-isel instead of the usual
SelectionDAG method. This is a problem for instruction referencing,
because it means DBG_VALUEs of virtual registers will be created,
triggering some safety assertions in LiveDebugVariables. Those assertions
exist to detect exactly this scenario, where an unexpected piece of code is
generating virtual register references in instruction referencing mode.
Fix this by transforming the DBG_VALUEs created by fast-isel into
half-formed DBG_INSTR_REFs, after which they get patched up in
finalizeDebugInstrRefs. The test modified adds a fast-isel mode to the
instruction referencing isel test.
Differential Revision: https://reviews.llvm.org/D105694
Sanjay Patel [Fri, 16 Jul 2021 12:31:28 +0000 (08:31 -0400)]
[SLP] add tests for poison-safe bool logic reductions; NFC
More coverage for D105730
serge-sans-paille [Thu, 15 Jul 2021 19:55:22 +0000 (21:55 +0200)]
SubstTemplateTypeParmType can contain an 'auto' type in its replacement type
This fixes bug 36064
Differential Revision: https://reviews.llvm.org/D106093
Dmitry Preobrazhensky [Fri, 16 Jul 2021 11:42:30 +0000 (14:42 +0300)]
[AMDGPU][MC] Added missing isCall/isBranch flags
Added isCall for S_CALL_B64; added isBranch for S_SUBVECTOR_LOOP_*.
Differential Revision: https://reviews.llvm.org/D106072
Zarko Todorovski [Fri, 16 Jul 2021 11:49:36 +0000 (07:49 -0400)]
[PowerPC][AIX] Add warning when alignment is incompatible with XL
https://reviews.llvm.org/D105659 implements ByVal handling in llc but
some cases are not compatible with existing XL compiler on AIX. Adding
a clang warning for such cases.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D105660
Alexander Belyaev [Fri, 16 Jul 2021 11:31:02 +0000 (13:31 +0200)]
[mlir] Move linalg::Expand/CollapseShapeOp to memref dialect.
RFC: https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310
Differential Revision: https://reviews.llvm.org/D106141
Serge Pavlov [Fri, 16 Jul 2021 11:19:31 +0000 (18:19 +0700)]
Use update_test_checks.py to auto-generate check lines
Alex Zinenko [Thu, 15 Jul 2021 16:16:07 +0000 (18:16 +0200)]
[mlir] add an interface to support custom types in LLVM dialect pointers
This may be necessary in partial multi-stage conversion when a container type
from dialect A containing types from dialect B goes through the conversion
where only dialect A is converted to the LLVM dialect. We will need to keep a
pointer-to-non-LLVM type in the IR until a further conversion can convert
dialect B types to LLVM types.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D106076
Nicholas Guy [Mon, 12 Jul 2021 09:36:35 +0000 (10:36 +0100)]
[AArch64] Update Cortex-A55 SchedModel to improve LDP scheduling
Specifying the latencies of specific LDP variants appears to improve
performance almost universally.
Differential Revision: https://reviews.llvm.org/D105882
Kerry McLaughlin [Fri, 16 Jul 2021 10:04:20 +0000 (11:04 +0100)]
[LV] Avoid scalable vectorization for loops containing alloca
This patch returns an Invalid cost from getInstructionCost() for alloca
instructions if the VF is scalable, as otherwise loops which contain
these instructions will crash when attempting to scalarize the alloca.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D105824
Cullen Rhodes [Fri, 16 Jul 2021 09:14:08 +0000 (09:14 +0000)]
[AArch64][SME] Add load and store instructions
This patch adds support for following contiguous load and store
instructions:
* LD1B, LD1H, LD1W, LD1D, LD1Q
* ST1B, ST1H, ST1W, ST1D, ST1Q
A new register class and operand is added for the 32-bit vector select
register W12-W15. The differences in the following tests which have been
re-generated are caused by the introduction of this register class:
* llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
* llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir
* llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
* llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir
D88663 attempts to resolve the issue with the store pair test
differences in the AArch64 load/store optimizer.
The GlobalISel differences are caused by changes in the enum values of
register classes, tests have been updated with the new values.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D105572
David Spickett [Thu, 8 Jul 2021 12:17:43 +0000 (13:17 +0100)]
[lldb][AArch64] Refactor memory tag range handling
Previously GetMemoryTagManager checked many things in one:
* architecture supports memory tagging
* process supports memory tagging
* memory range isn't inverted
* memory range is all tagged
Since writing follow up patches for tag writing (in review
at the moment) it has become clear that this gets unwieldy
once we add the features needed for that.
It also implies that the memory tag manager is tied to the
range you used to request it with but it is not. It's a per
process object.
Instead:
* GetMemoryTagManager just checks architecture and process.
* Then the MemoryTagManager can later be asked to check a
memory range.
This is better because:
* We don't imply that range and manager are tied together.
* A slightly different range calculation for tag writing
doesn't add more code to Process.
* Range checking code can now be unit tested.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105630
Sander de Smalen [Thu, 15 Jul 2021 14:32:31 +0000 (15:32 +0100)]
Reland "[LV] Print remark when loop cannot be vectorized due to invalid costs."
The original patch was:
https://reviews.llvm.org/D105806
There were some issues with nondeterministic behaviour of the sorting
function, which led to scalable-call.ll passing and/or failing. This
patch fixes the issue by numbering all instructions in the array first,
and using that number as the order, which should provide a consistent
ordering.
This reverts commit a607f64118240f70bf1b14ec121b65f49d63800d.
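The ordering idea described above can be sketched as follows. This is a minimal standalone illustration, not the code from the patch: `Inst`, `InvalidCost` and `sortByFirstAppearance` are hypothetical names, and the pair layout is an assumption. Each instruction is numbered by first appearance, and that number (rather than the pointer value) is used as the sort key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction; only its address is used as identity.
struct Inst {
  int Opcode;
};

// Hypothetical (instruction, vectorization factor) pair that had an invalid cost.
using InvalidCost = std::pair<const Inst *, unsigned>;

// Sorts by (first-appearance number of the instruction, factor) instead of by
// pointer value, so the result does not depend on where objects were allocated.
void sortByFirstAppearance(std::vector<InvalidCost> &Costs) {
  std::unordered_map<const Inst *, std::size_t> Numbering;
  for (const InvalidCost &C : Costs)
    Numbering.emplace(C.first, Numbering.size()); // no-op if already numbered
  assert(Numbering.size() <= Costs.size() && "at most one number per entry");

  std::sort(Costs.begin(), Costs.end(),
            [&Numbering](const InvalidCost &A, const InvalidCost &B) {
              std::size_t NA = Numbering.at(A.first);
              std::size_t NB = Numbering.at(B.first);
              if (NA != NB)
                return NA < NB;
              return A.second < B.second;
            });
}
```

The map is only used for lookups, never iterated, so its internal order cannot leak into the result.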
Fraser Cormack [Thu, 24 Jun 2021 15:32:46 +0000 (16:32 +0100)]
[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID
This patch teaches the compiler to identify a wider variety of
`BUILD_VECTOR`s which form integer arithmetic sequences, and to lower
them to `vid.v` with modifications for non-unit steps and non-zero
addends.
The sequences handled by this optimization must either be monotonically
increasing or decreasing. Consecutive elements holding the same value
indicate a fractional step which, while simple mathematically,
becomes more complex to handle both in the realm of lossy integer
division and in the presence of `undef`s.
For example, a common "interleaving" shuffle index will be lowered by
LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR`
nodes. Either of these would ideally be lowered to `vid.v` shifted right
by 1. Detection of this sequence in presence of general `undef` values
is more complicated, however: `<0,u,u,1,>` could match either
`<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence.
Both are possible, so backtracking or multiple passes is inevitable.
Sticking to monotonic sequences keeps the logic simpler as it can be
done in one pass. Fractional steps will likely be a separate
optimization in a future patch.
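A rough standalone sketch of the one-pass matching described above, assuming C++17. It is not the RISCV lowering code: `IntSequence` and `matchIntSequence` are hypothetical names, `std::nullopt` models an `undef` lane, and overflow is ignored. Fractional steps are rejected, in line with the restriction stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result: lane I holds Addend + Step * I.
struct IntSequence {
  int64_t Step;
  int64_t Addend;
};

// One pass over the lanes. Returns the sequence if every defined lane fits
// Addend + Step * Index with a non-zero integer Step. Conservatively gives up
// when fewer than two lanes are defined.
std::optional<IntSequence>
matchIntSequence(const std::vector<std::optional<int64_t>> &Elts) {
  struct Lane {
    int64_t Idx;
    int64_t Val;
  };
  std::optional<Lane> First; // first defined lane seen so far
  std::optional<int64_t> Step;

  for (std::size_t I = 0; I < Elts.size(); ++I) {
    if (!Elts[I])
      continue; // an undef lane matches anything
    const int64_t Idx = static_cast<int64_t>(I);
    const int64_t Val = *Elts[I];
    if (!First) {
      First = Lane{Idx, Val};
      continue;
    }
    const int64_t IdxDiff = Idx - First->Idx;
    const int64_t ValDiff = Val - First->Val;
    assert(IdxDiff > 0 && "lanes are visited in increasing order");
    if (!Step) {
      // Reject zero steps and fractional steps (ValDiff not a multiple of IdxDiff).
      if (ValDiff == 0 || ValDiff % IdxDiff != 0)
        return std::nullopt;
      Step = ValDiff / IdxDiff;
    } else if (ValDiff != *Step * IdxDiff) {
      return std::nullopt;
    }
  }
  if (!First || !Step)
    return std::nullopt;
  return IntSequence{*Step, First->Val - *Step * First->Idx};
}
```

With this sketch, `<1,3,u,7>` matches with step 2 and addend 1, while the interleaving index `<0,u,1,u,2>` is rejected because its step would be one half.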
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104921
Uday Bondhugula [Fri, 16 Jul 2021 09:29:16 +0000 (14:59 +0530)]
[MLIR][NFC] Improve doc comment and delete stale comment
Remove duplicate and stale doc comment on affineParallelize. NFC.
Timm Bäder [Fri, 16 Jul 2021 08:17:41 +0000 (10:17 +0200)]
[llvm][tools] Hide unrelated llvm-cfi-verify options
Differential Revision: https://reviews.llvm.org/D106055
Vince Bridgers [Wed, 14 Jul 2021 12:00:14 +0000 (07:00 -0500)]
[analyzer] Do not assume that all pointers have the same bitwidth as void*
This change addresses this assertion that occurs in a downstream
compiler with a custom target.
```APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"'```
No covering test case is submitted with this change since this crash
cannot be reproduced using any upstream supported target. The test case
that exposes this issue is as simple as:
```lang=c++
void test(int * p) {
int * q = p-1;
if (q) {}
if (q) {} // crash
(void)q;
}
```
The custom target that exposes this problem supports two address spaces,
16-bit `char`s, and a `_Bool` type that maps to 16-bits. There are no upstream
supported targets with similar attributes.
The assertion appears to be happening as a result of evaluating the
`SymIntExpr` `(reg_$0<int * p>) != 0U` in `VisitSymIntExpr` located in
`SimpleSValBuilder.cpp`. The `LHS` is evaluated to `32b` and the `RHS` is
evaluated to `16b`. This eventually leads to the assertion in `APInt.h`.
While this change addresses the crash and passes LITs, two follow-ups
are required:
1) The remainder of `getZeroWithPtrWidth()` and `getIntWithPtrWidth()`
should be cleaned up following this model to prevent future
confusion.
2) We're not sure why references are found along with the modified
code path, that should not be the case. A more principled
fix may be found after some further comprehension of why this
is the case.
Acks: Thanks to @steakhal and @martong for the discussions leading to this
fix.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105974
Simon Giesecke [Wed, 14 Jul 2021 08:21:16 +0000 (08:21 +0000)]
Reformat files.
Differential Revision: https://reviews.llvm.org/D105982
Mehdi Amini [Thu, 15 Jul 2021 23:52:44 +0000 (23:52 +0000)]
Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer
We can build it with -Werror=global-constructors now. This helps
in situations where libSupport is embedded as a shared library,
potentially with a dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kinds of unexpected behavior.
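To illustrate the general idea of lazy construction, here is a small standalone sketch. It uses a function-local static, which is a different mechanism from ManagedStatic and is shown only as an analogy; `Option`, `RegisteredOptions` and `getVerboseOption` are hypothetical names and not part of libSupport.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Counts constructor side effects, standing in for option registration.
int RegisteredOptions = 0;

// Hypothetical option type; constructing it "registers" the option.
struct Option {
  std::string Name;
  explicit Option(std::string N) : Name(std::move(N)) {
    assert(!Name.empty() && "options need a name");
    ++RegisteredOptions;
  }
};

// A namespace-scope `Option Verbose("verbose");` would be constructed when the
// library is loaded. Here construction is deferred until the first call, and
// later calls reuse the same object.
Option &getVerboseOption() {
  static Option Opt("verbose");
  return Opt;
}
```

Code that never asks for the option never triggers its registration.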
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
Mehdi Amini [Fri, 16 Jul 2021 07:34:41 +0000 (07:34 +0000)]
Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer"
This reverts commit af9321739b20becf170e6bb5060b8d780e1dc8dd.
Still some specific config broken in some way that requires more
investigation.
Timm Bäder [Fri, 16 Jul 2021 07:30:57 +0000 (09:30 +0200)]
Revert "[llvm][tools] Hide unrelated llvm-cfi-verify options"
This reverts commit 7c63726072005cc331bb21694c9022e6d18a3b93.
Timm Bäder [Thu, 15 Jul 2021 11:01:00 +0000 (13:01 +0200)]
[llvm][tools] Hide unrelated llvm-cfi-verify options
Differential Revision: https://reviews.llvm.org/D106055
Marcos Horro [Fri, 16 Jul 2021 07:10:50 +0000 (09:10 +0200)]
[llvm-mca][JSON] Store extra information about driver flags used for the simulation
Added information stored in PipelineOptions and the MCSubtargetInfo.
Bug: https://bugs.llvm.org/show_bug.cgi?id=51041
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D106077
Deep Majumder [Fri, 16 Jul 2021 07:04:30 +0000 (12:34 +0530)]
[analyzer] Handle << operator for std::unique_ptr
This patch handles the `<<` operator defined for `std::unique_ptr` in
the std namespace (ignores custom overloads of the operator).
Differential Revision: https://reviews.llvm.org/D105421
Mehdi Amini [Thu, 15 Jul 2021 23:52:44 +0000 (23:52 +0000)]
Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer
We can build it with -Werror=global-constructors now. This helps
in situations where libSupport is embedded as a shared library,
potentially with a dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kinds of unexpected behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
Mehdi Amini [Fri, 16 Jul 2021 06:49:57 +0000 (06:49 +0000)]
Fix mismatch between the provisioning of asyncExecutors and the actual thread count currently in the context (NFC)
This fixes an assert in some deployment where the threadpool is
customized.
Jonas Devlieghere [Fri, 16 Jul 2021 06:16:43 +0000 (23:16 -0700)]
[debugserver] Un-conditionalize code guarded by macOS 10.10 checks
We've been requiring macOS 10.11 since 2018 so there's no point in
keeping code for 10.10 around.
Petr Hosek [Sun, 21 Feb 2021 06:11:33 +0000 (22:11 -0800)]
[profile] Decommit memory after counter relocation
After we relocate counters, we no longer need to keep the original copy
around so we can return the memory back to the operating system.
Differential Revision: https://reviews.llvm.org/D104839
Serge Pavlov [Fri, 16 Jul 2021 04:57:10 +0000 (11:57 +0700)]
Fix typo in test
Max Kazantsev [Fri, 16 Jul 2021 04:31:15 +0000 (11:31 +0700)]
[LSR] Handle case 1*reg => reg. PR50918
This patch addresses an assertion failure when the only formula found for LSR
is `1*reg => reg`. This was supposed to be an impossible situation, but there
is a test that shows it is possible.
In this case, we can use the scale register with a scale of 1 as the missing base register.
Reviewed By: huihuiz, reames
Differential Revision: https://reviews.llvm.org/D105009
Deep Majumder [Fri, 16 Jul 2021 04:24:05 +0000 (09:54 +0530)]
[analyzer] Model comparison methods of std::unique_ptr
This patch handles all the comparison methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which is modelled by comparing the corresponding inner-pointer
SVal. There is also a special case for comparing the same pointer.
Differential Revision: https://reviews.llvm.org/D104616
Carl Ritson [Fri, 16 Jul 2021 03:13:29 +0000 (12:13 +0900)]
[TableGen] Allow isAllocatable inheritance from any superclass
When setting Allocatable on a generated register class check all
superclasses and set Allocatable true if any superclass is
allocatable.
Without this change generated register classes based on an
allocatable class may end up unallocatable due to the topological
inheritance order.
This change primarily affects the AMDGPU backend; however, there are
a few changes in MIPS GlobalISel register constraints as a result.
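The rule above can be illustrated with a small standalone sketch. It is not the TableGen code: `RegClass` and `inheritAllocatable` are hypothetical names. The point is that the result depends on whether any superclass is allocatable, not on the order in which superclasses are listed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical register class with a list of superclasses.
struct RegClass {
  bool Allocatable = false;
  std::vector<const RegClass *> SuperClasses;
};

// Returns true if any superclass is allocatable, regardless of list order.
bool inheritAllocatable(const RegClass &RC) {
  return std::any_of(RC.SuperClasses.begin(), RC.SuperClasses.end(),
                     [](const RegClass *Super) {
                       assert(Super && "superclass list must not contain null");
                       return Super->Allocatable;
                     });
}
```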
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D105967
Vincent Lee [Fri, 16 Jul 2021 01:29:05 +0000 (18:29 -0700)]
[lld-macho] Optimize bind opcodes with multiple passes
In D105866, we used an intermediate container to store a list of opcodes. Here,
we use that data structure to help us perform optimization passes that allow
a more efficient encoding of bind opcodes. Currently, the functionality mirrors
optimization passes {1,2} done in ld64 for bind opcodes, and it sits behind an
optimization gate to prevent slight regressions.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D105867
Shilei Tian [Fri, 16 Jul 2021 03:51:38 +0000 (23:51 -0400)]
[Attributor] Add support for compound assignment for ChangeStatus
A common use of `ChangeStatus` is as follows:
```
ChangeStatus Changed = ChangeStatus::UNCHANGED;
Changed |= foo();
```
where `foo` returns `ChangeStatus` as well. Currently `ChangeStatus` doesn't
support compound assignment, so we have to write
```
Changed = Changed | foo();
```
which is not that convenient.
This patch adds support for compound assignment for `ChangeStatus`. Compound
assignment is usually implemented as a member function, and the binary arithmetic
operator is then implemented in terms of compound assignment. However, unlike
a regular C++ class, an enum class doesn't support member functions. As a result, the
operators can only be implemented in the way shown in the patch.
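For readers without the patch at hand, the general shape of free-function operators on an enum class looks roughly like this. It is a standalone sketch, not the Attributor code: the enum here is a stand-in, and `accumulate` is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <initializer_list>

// Stand-in for the Attributor's ChangeStatus; not the LLVM definition.
enum class ChangeStatus { UNCHANGED, CHANGED };

// Binary operator as a free function: CHANGED if either side is CHANGED.
constexpr ChangeStatus operator|(ChangeStatus L, ChangeStatus R) {
  return (L == ChangeStatus::CHANGED || R == ChangeStatus::CHANGED)
             ? ChangeStatus::CHANGED
             : ChangeStatus::UNCHANGED;
}

// Compound assignment, also a free function because an enum class cannot
// have member functions. It takes the left operand by reference.
inline ChangeStatus &operator|=(ChangeStatus &L, ChangeStatus R) {
  L = L | R;
  return L;
}

// Hypothetical helper showing the accumulation pattern from the message.
inline ChangeStatus accumulate(std::initializer_list<ChangeStatus> Steps) {
  ChangeStatus Changed = ChangeStatus::UNCHANGED;
  for (ChangeStatus S : Steps)
    Changed |= S;
  return Changed;
}
```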
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106109
Mehdi Amini [Fri, 16 Jul 2021 03:46:22 +0000 (03:46 +0000)]
Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer"
This reverts commit 42f588f39c5ce6f521e3709b8871d1fdd076292f.
Broke some buildbots
Mehdi Amini [Thu, 15 Jul 2021 23:52:44 +0000 (23:52 +0000)]
Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer
We can build it with -Werror=global-constructors now. This helps
in situations where libSupport is embedded as a shared library,
potentially with a dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kinds of unexpected behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
John Demme [Fri, 16 Jul 2021 02:03:48 +0000 (19:03 -0700)]
[MLIR] [Python ODS] Use @builtins.property for cases where 'property' is already defined
In cases where an operation has an argument or result named 'property', the
ODS-generated python fails on import because the `@property` resolves to the
`property` operation argument instead of the builtin `@property` decorator. We
should always use the fully qualified decorator name.
Reviewed By: mikeurbach
Differential Revision: https://reviews.llvm.org/D106106
LLVM GN Syncbot [Fri, 16 Jul 2021 02:23:45 +0000 (02:23 +0000)]
[gn build] Port 766a08df12c1
Nico Weber [Fri, 16 Jul 2021 02:23:14 +0000 (22:23 -0400)]
[gn build] port 766a08df12c1
Shilei Tian [Fri, 16 Jul 2021 02:20:54 +0000 (22:20 -0400)]
[NFC][OpenMP][Offloading] Replaced explicit parallel level computation with function `__kmpc_parallel_level`
There are two places in the current deviceRTLs that compute the parallel level explicitly,
which is basically the functionality of `__kmpc_parallel_level`. Starting from
D105787, we plan to introduce a series of function call foldings based on information
that can be deduced at compile time. Computation of the parallel level is
the next target. This patch prepares for that optimization.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105955
Louis Dionne [Thu, 15 Jul 2021 13:46:36 +0000 (09:46 -0400)]
[libc++] Add a job running GCC with C++11
This configuration is interesting because GCC has a different level of
strictness for some C++ rules. In particular, it implements the older
standards more stringently than Clang, which can help find places where
we are non-conforming (especially in the test suite).
Differential Revision: https://reviews.llvm.org/D105936
Ben Barham [Fri, 16 Jul 2021 01:24:09 +0000 (18:24 -0700)]
[Frontend] Only compile modules if not already finalized
It was possible to re-add a module to a shared in-memory module cache
when search paths are changed. This can eventually cause a crash if the
original module is referenced after this occurs.
1. Module A depends on B
2. B exists in two paths C and D
3. First run only has C on the search path, finds A and B and loads
them
4. Second run adds D to the front of the search path. A is loaded and
contains a reference to the already compiled module from C. But
searching finds the module from D instead, causing a mismatch
5. B and the modules that depend on it are considered out of date and
thus rebuilt
6. The recompiled module A is added to the in-memory cache, freeing
the previously inserted one
This can never occur from a regular clang process, but is very easy to
do through the API - whether through the use of a shared cache or just
running multiple compilations from a single `CompilerInstance`. Update
the compilation to return early if a module is already finalized so that
the pre-condition in the in-memory module cache holds.
Resolves rdar://78180255
Differential Revision: https://reviews.llvm.org/D105328
Daniel Rodríguez Troitiño [Thu, 15 Jul 2021 21:37:33 +0000 (14:37 -0700)]
[test] Use double pound to denote comments.
Use double pound at the start of the line to differentiate comments from
statements for Lit or FileCheck.
I will also use this small commit to check my commit access.
Differential Revision: https://reviews.llvm.org/D106103
Weiwei Li [Thu, 15 Jul 2021 23:25:56 +0000 (07:25 +0800)]
[mlir][spirv] Add support for GLSL FMix
Add spv.GLSL.FMix operation.
Co-authored-by: Alan Liu <alanliu.yf@gmail.com>
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D104153
Vincent Lee [Sat, 10 Jul 2021 00:25:29 +0000 (17:25 -0700)]
[lld-macho] Use intermediate arrays to store opcodes
We want to incorporate some of the optimization passes in bind opcodes from ld64.
This revision makes no functional changes other than starting to store opcodes in intermediate
containers, in preparation for implementing the optimization passes in a follow-up revision.
Differential Revision: https://reviews.llvm.org/D105866
Kirill Stoimenov [Thu, 15 Jul 2021 20:53:56 +0000 (13:53 -0700)]
[asan] Slightly modified the documentation.
The goal of this change is to test if I can commit changes.
Reviewed By: kcc
Differential Revision: https://reviews.llvm.org/D106101
Nico Weber [Thu, 15 Jul 2021 23:29:04 +0000 (19:29 -0400)]
Revert "tsan: make obtaining current PC faster"
This reverts commit e33446ea58b8357dd8b79eb39140a1de2baff1ae.
Doesn't build on mac, and causes other problems. See reports
on https://reviews.llvm.org/D106046 and https://reviews.llvm.org/D106081
Also revert follow-up "tsan: strip top inlined internal frames"
This reverts commit 7b302fc9b04c7991cdb869b65316e0d72e41042e.
Matt Arsenault [Thu, 15 Jul 2021 16:29:50 +0000 (12:29 -0400)]
GlobalISel: Surface offsets parameter from ComputeValueVTs
Matt Arsenault [Thu, 15 Jul 2021 16:16:15 +0000 (12:16 -0400)]
AMDGPU/GlobalISel: Fix incorrect memory types in test
Matt Arsenault [Wed, 14 Jul 2021 18:03:18 +0000 (14:03 -0400)]
GlobalISel: Track argument pointeriness with arg flags
Since we're still building on top of the MVT based infrastructure, we
need to track the pointer type/address space on the side so we can end
up with the correct pointer LLTs when interpreting CCValAssigns.
Peter S. Housel [Thu, 15 Jul 2021 22:42:28 +0000 (00:42 +0200)]
[lldb] Add AllocateMemory/DeallocateMemory to the SBProcess API
This change adds AllocateMemory and DeallocateMemory methods to the SBProcess
API, so that clients can allocate and deallocate memory blocks within the
process being debugged (for storing JIT-compiled code or other uses).
(I am developing a debugger + REPL using the API; it will need to store
JIT-compiled code within the target.)
Reviewed By: clayborg, jingham
Differential Revision: https://reviews.llvm.org/D105389
Vitaly Buka [Thu, 15 Jul 2021 22:16:29 +0000 (15:16 -0700)]
[NFC][hwasan] Remove default arguments in internal class
Victor Huang [Thu, 15 Jul 2021 22:21:54 +0000 (17:21 -0500)]
[PowerPC] Add PowerPC population count, reversed load and store related builtins and intrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and intrinsics for population
count, reversed load and store related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106021
Shilei Tian [Thu, 15 Jul 2021 22:23:12 +0000 (18:23 -0400)]
[AbstractAttributor] Fold function calls to `__kmpc_is_spmd_exec_mode` if possible
In the device runtime there are many function calls to `__kmpc_is_spmd_exec_mode`
to query the execution mode of current kernels. In many cases, user programs
only contain target regions executing in one mode. As a consequence, those runtime
function calls will only return one value. If we can get rid of these function
calls during compilation, it can potentially improve performance.
In this patch, we use `AAKernelInfo` to analyze kernel execution. Basically, for
each kernel (device) function `F`, we collect all kernel entries `K` that can
reach `F`. A new AA, `AAFoldRuntimeCall`, is created for each call site. In each
iteration, it will check all reaching kernel entries, and update the folded value
accordingly.
In the future we will support more functions.
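The folding decision for one call site can be sketched in isolation as follows, assuming C++17. This is not the Attributor implementation: `ExecMode` and `foldIsSPMDQuery` are hypothetical names, and the fixpoint iteration is left out. The call folds to a constant only when all reaching kernel entries agree on the execution mode.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical execution modes of a kernel entry.
enum class ExecMode { Generic, SPMD };

// Given the modes of all kernel entries that can reach a call site, returns
// the constant the "is SPMD mode?" query folds to, or std::nullopt when the
// kernels disagree or nothing is known.
std::optional<bool>
foldIsSPMDQuery(const std::vector<ExecMode> &ReachingKernelModes) {
  if (ReachingKernelModes.empty())
    return std::nullopt; // no reaching kernel known: do not fold
  const ExecMode First = ReachingKernelModes.front();
  for (ExecMode M : ReachingKernelModes)
    if (M != First)
      return std::nullopt; // mixed modes: the call must stay
  return First == ExecMode::SPMD;
}
```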
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105787
Amara Emerson [Fri, 9 Jul 2021 22:48:47 +0000 (15:48 -0700)]
GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI.
This adds some level of type safety, allows helper functions to be added for
specific opcodes for free, and also allows us to succinctly check for class
membership with the usual dyn_cast/isa/cast functions.
To start off with, add variants for the different load/store operations with some
places using it.
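For context, the `classof`-based pattern behind isa/dyn_cast can be sketched without any LLVM headers. Everything below is a simplified toy under stated assumptions: `Instr`, `LoadInstr`, and the local `isa`/`dyn_cast` templates are hypothetical stand-ins, not the classes or casting utilities added by this patch. The opcode acts as the discriminator.

```cpp
#include <cassert>

enum class Opcode { Add, Load, Store };

// Hypothetical base instruction; the opcode doubles as the RTTI discriminator.
class Instr {
public:
  explicit Instr(Opcode Op) : Op(Op) {}
  Opcode getOpcode() const { return Op; }

private:
  Opcode Op;
};

// Typed subclass with a helper that only makes sense for loads.
class LoadInstr : public Instr {
public:
  explicit LoadInstr(unsigned AlignBytes)
      : Instr(Opcode::Load), AlignBytes(AlignBytes) {}
  unsigned getAlignBytes() const { return AlignBytes; }
  // Class membership test used by isa<> and dyn_cast<> below.
  static bool classof(const Instr *I) { return I->getOpcode() == Opcode::Load; }

private:
  unsigned AlignBytes;
};

// Minimal local versions of the casting helpers.
template <typename To> bool isa(const Instr *I) {
  assert(I && "isa<> used on a null pointer");
  return To::classof(I);
}

template <typename To> const To *dyn_cast(const Instr *I) {
  return isa<To>(I) ? static_cast<const To *>(I) : nullptr;
}
```

A caller can then write `if (const LoadInstr *L = dyn_cast<LoadInstr>(I))` and use the load-specific helper only inside that branch.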
Differential Revision: https://reviews.llvm.org/D105751
Roland McGrath [Thu, 15 Jul 2021 22:07:07 +0000 (15:07 -0700)]
[libc] Fix typos in x86_64/FEnv.h
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D106105
Omar Emara [Thu, 15 Jul 2021 21:33:30 +0000 (14:33 -0700)]
[LLDB][GUI] Add Process Attach form
This patch adds a form window to attach a process, either by PID or by
name. This patch also adds support for dynamic field visibility such
that the form delegate can hide or show certain fields based on some
conditions.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D105655
Eli Friedman [Thu, 15 Jul 2021 20:49:13 +0000 (13:49 -0700)]
[DependenceAnalysis] Guard analysis using getPointerBase().
D104806 broke some uses of getMinusSCEV() in DependenceAnalysis:
subtraction with different pointer bases returns a SCEVCouldNotCompute.
Make sure we avoid cases involving such subtractions.
Differential Revision: https://reviews.llvm.org/D106099
Harald van Dijk [Thu, 15 Jul 2021 21:56:08 +0000 (22:56 +0100)]
[X86] Fix handling of maskmovdqu in X32
The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit
variant, the former using EDI, the latter RDI, but the use of the
register is implicit. In 64-bit mode, a 0x67 prefix can be used to get
the version using EDI, but there is no way to express this in
assembly in a single instruction, the only way is with an explicit
addr32.
This change adds support for the instruction. When generating assembly
text, that explicit addr32 will be added. When not generating assembly
text, it will be kept as a single instruction and will be emitted with
that 0x67 prefix. When parsing assembly text, it will be re-parsed as
ADDR32 followed by MASKMOVDQU64, which still results in the correct
bytes when converted to machine code.
The same applies to vmaskmovdqu as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103427
Sanjay Patel [Thu, 15 Jul 2021 20:47:51 +0000 (16:47 -0400)]
[SLP] avoid leaking poison in reduction of safe boolean logic ops
This bug was introduced with D105730 / 25ee55c0baff.
If we are not converting all of the operations of a reduction
into a vector op, we need to preserve the existing select form
of the remaining ops. Otherwise, we are potentially leaking
poison where the original code did not.
Alive2 agrees that the version that freezes some inputs
and then falls back to scalar is correct:
https://alive2.llvm.org/ce/z/erF4K2
Nikita Popov [Thu, 15 Jul 2021 18:34:56 +0000 (20:34 +0200)]
[Verifier] Extend address taken check for unknown intrinsics
Intrinsics can only be called directly, taking their address is not
legal. This is currently only enforced for intrinsics that have an
ID, rather than all intrinsics. Adjust the check to cover all
intrinsics.
This came up in D106013.
Differential Revision: https://reviews.llvm.org/D106095
Victor Huang [Thu, 15 Jul 2021 21:06:59 +0000 (16:06 -0500)]
[PowerPC][NFC] Add the missing 'REQUIRES: powerpc-registered-target.' in the builtins' front end test cases for XL compatibility
Sumesh Udayakumaran [Thu, 15 Jul 2021 01:42:39 +0000 (04:42 +0300)]
[mlir] Enable cleanup of single iteration reduction loops being sibling-fused maximally
Changes include the following:
1. Single iteration reduction loops being sibling fused at innermost insertion level
are skipped from being considered as sequential loops.
Otherwise, the slice bounds of these loops are reset.
2. Promote loops that are skipped in the previous step into outer loops.
3. Two utility functions - buildSliceTripCountMap, getSliceIterationCount - are moved from
mlir/lib/Transforms/Utils/LoopFusionUtils.cpp to mlir/lib/Analysis/Utils.cpp
Reviewed By: bondhugula, vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D104249
Jessica Paquette [Wed, 14 Jul 2021 00:42:00 +0000 (17:42 -0700)]
[AArch64][GlobalISel] Clamp <n x p0> vecs when legalizing G_EXTRACT_VECTOR_ELT
This case was missing from G_EXTRACT_VECTOR_ELT. It's the same as for s64.
https://godbolt.org/z/Tnq4acY8z
Differential Revision: https://reviews.llvm.org/D105952
zhijian [Thu, 15 Jul 2021 20:54:22 +0000 (16:54 -0400)]
[AIX][XCOFF][Bug-Fixed] parse the parameter type of the traceback table
Summary:
The function PPCFunctionInfo::getParmsType() contains the check if (Bits > 31 || (Bits > 30 && (Elt != FixedType || hasVectorParms()))).
When Bits is 31 and Elt is not FixedType (for example, Elt is FloatingType), that bit is not encoded and is left as zero. The original implementation of Expected<SmallString<32>> XCOFF::parseParmsType() was:
unsigned ParmsNum = FixedParmsNum + FloatingParmsNum;
while (Bits < 32 && ParsedNum < ParmsNum) {
...
}
It interprets that zero bit as FixedType, although it should be FloatingType, and produces an error.
Reviewers: Jason Liu,ZarkoCA
Differential Revision: https://reviews.llvm.org/D105023
Louis Dionne [Thu, 15 Jul 2021 17:02:43 +0000 (13:02 -0400)]
[runtimes] Don't try passing --target flags to GCC
When a target triple is specified in CMake via XXX_TARGET_TRIPLE, we tried
passing the --target=<...> flag to the compiler. However, not all compilers
support that flag (e.g. GCC, which is not a cross-compiler). As a result,
setting e.g. LIBCXX_TARGET_TRIPLE=<host-triple> would end up trying to
pass --target=<host-triple> to GCC, which breaks everything because the
flag isn't even supported.
This commit only adds `--target=<...>` & friends to the flags if it is
supported by the compiler.
One could argue that it's confusing to pass LIBCXX_TARGET_TRIPLE=<...>
and have it be ignored. That's correct, and one possibility would be
to assert that the requested triple is the same as the host triple when
we know the compiler is unable to cross-compile. However, note that this
is a pre-existing issue (setting the TARGET_TRIPLE variable never had an
influence on the flags passed to the compiler), and also fixing that is
starting to look like reimplementing a lot of CMake logic that is already
handled with CMAKE_CXX_COMPILER_TARGET.
Differential Revision: https://reviews.llvm.org/D106082
Martin Storsjö [Tue, 13 Jul 2021 12:49:19 +0000 (12:49 +0000)]
[libcxx] [test] Fix mismatches between aligned operator new and std::free
The XFAIL comments about VCRuntime not providing aligned operator new
are outdated; these days VCRuntime does provide them.
However, the tests used to fail on Windows, as the pointers allocated
with an aligned operator new (which is implemented with _aligned_malloc
on Windows) can't be freed using std::free() on Windows (but they need
to be freed with the corresponding function _aligned_free instead).
Instead override the aligned operator new to return a dummy suitably
aligned pointer instead, like other tests that override aligned operator
new.
Also override `operator delete[]` instead of plain `operator delete`
in the array testcase; the fallback from `operator delete[]` to
user defined `operator delete` doesn't work in all DLL build
configurations on Windows.
Also expand the TEST_NOEXCEPT macros, as these tests only are built
in C++17 mode.
By providing the aligned operator new within the tests, this also makes
these test cases pass when testing back deployment on macOS 10.9.
Differential Revision: https://reviews.llvm.org/D105962
Hedin Garca [Thu, 15 Jul 2021 18:13:10 +0000 (18:13 +0000)]
[libc] Relocate the closing directive of #ifdef
Changed where an #endif was placed because previously it
prevented three macro definitions from being enabled on Windows.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D106087
Nikita Popov [Thu, 15 Jul 2021 20:06:39 +0000 (22:06 +0200)]
[ObjCARC] Use objc_msgSend instead of llvm.objc.msgSend in tests
D55348 replaced @objc_msgSend with @llvm.objc.msgSend in tests
together with many other objc intrinsics. However, this is not a
recognized objc intrinsic (https://llvm.org/docs/LangRef.html#objective-c-arc-runtime-intrinsics)
and does not receive special treatment by LLVM. It's likely that
uses of this function were renamed by accident.
This came up in D106013, because the address of @llvm.objc.msgSend
is taken, something which is normally not allowed for intrinsics.
Differential Revision: https://reviews.llvm.org/D106094
George Burgess IV [Thu, 15 Jul 2021 20:03:27 +0000 (13:03 -0700)]
utils: fix broken assertion in revert_checker
`intermediate_commits` is a list of full SHAs, and `across_ref` may or may
not be a full SHA (or a SHA at all). We already have `across_sha`, which
is the resolved form of `across_ref`, so use that instead.
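A minimal sketch of why the unresolved ref is the wrong thing to check. The exact assertion is not shown in this message; the values and the `resolved` mapping below are hypothetical stand-ins for what git would return.

```python
# Hypothetical values, for illustration only.
across_ref = "origin/main~2"            # a ref name, not a SHA
resolved = {"origin/main~2": "a" * 40}  # stand-in for resolving the ref via git
across_sha = resolved[across_ref]

# Full 40-character SHAs only, as described above.
intermediate_commits = ["a" * 40, "b" * 40, "c" * 40]

# A membership check on the raw ref can never match a full SHA,
# so any assertion built on it is vacuous.
ref_found = across_ref in intermediate_commits
sha_found = across_sha in intermediate_commits
```

Only the check on `across_sha` can tell whether the commit is actually in the list.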
Thanks to probinson for catching this in post-commit review of
https://reviews.llvm.org/D105578!
Harald van Dijk [Thu, 15 Jul 2021 19:52:25 +0000 (20:52 +0100)]
[Driver] Fix compiler-rt lookup for x32
x86_64-linux-gnu and x86_64-linux-gnux32 use different ABIs and objects
built for one cannot be used for the other. In order to build and use
compiler-rt for x32, we need to treat x32 as a new arch there. This
updates the driver to search using the new arch name.
Reviewed By: glaubitz
Differential Revision: https://reviews.llvm.org/D100148
Aart Bik [Thu, 15 Jul 2021 18:06:40 +0000 (11:06 -0700)]
[mlir][sparse] add int64 storage type to sparse tensor runtime support library
This format was missing from the support library. Although there are some
subtleties reading in an external format for int64 as double, there is no
good reason to omit support for this data type from the support library.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106016
Artem Belevich [Sat, 3 Jul 2021 00:02:07 +0000 (17:02 -0700)]
[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction.
That should allow clang to compile mma.h from CUDA-11.3.
Differential Revision: https://reviews.llvm.org/D105384
Sushma Unnibhavi [Thu, 15 Jul 2021 19:00:14 +0000 (13:00 -0600)]
[M68k][GlobalISel] LegalizerInfo implementation
Added rules for G_ADD, G_SUB, G_MUL, G_UDIV to be legal.
Differential Revision: https://reviews.llvm.org/D105536
Dmitry Vyukov [Thu, 15 Jul 2021 09:18:53 +0000 (11:18 +0200)]
tsan: lock ScopedErrorReportLock around fork
Currently we don't lock ScopedErrorReportLock around fork
and it mostly works because tsan has its own report_mtx that
is locked around fork and tsan reports.
However, sanitizer_common code prints some reports of its own
which are not protected by tsan's report_mtx. So it's better
to lock ScopedErrorReportLock explicitly.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D106048
Philip Reames [Thu, 15 Jul 2021 17:59:02 +0000 (10:59 -0700)]
[unittest] Exercise SCEV's udiv and udiv ceiling routines
The ceiling variant was recently added (due to the work towards D105216), and we're spending a lot of time trying to find optimizations for the expression. This patch brute forces the space of i8 unsigned divides and checks that we get a correct (well, consistent with APInt) result for both udiv and udiv ceiling.
(This is basically what I've been doing locally in a hand rolled C++ program, and I realized there is no good reason not to check it in as a unit test which directly exercises the logic on constants.)
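The same brute-force idea can be sketched in Python. This is not the unit test itself, which exercises SCEV against APInt; the two ceiling formulations below are illustrative assumptions, chosen to show why an exhaustive check over i8 is useful.

```python
def udiv_ceil_reference(n, d):
    """Exact ceiling of n / d using unbounded integers."""
    return -(-n // d)


def udiv_ceil_safe_i8(n, d):
    """udiv(n, d) + (urem(n, d) != 0), truncated to 8 bits."""
    return ((n // d) + (1 if n % d else 0)) & 0xFF


def udiv_ceil_naive_i8(n, d):
    """udiv(n + d - 1, d), where the addition wraps at 8 bits."""
    return ((n + d - 1) & 0xFF) // d


# Enumerate every i8 unsigned numerator and every non-zero divisor.
pairs = [(n, d) for n in range(256) for d in range(1, 256)]

mismatches_safe = [
    (n, d) for n, d in pairs if udiv_ceil_safe_i8(n, d) != udiv_ceil_reference(n, d)
]
mismatches_naive = [
    (n, d) for n, d in pairs if udiv_ceil_naive_i8(n, d) != udiv_ceil_reference(n, d)
]
```

The formulation that adds before dividing wraps for inputs such as n = 255, d = 2, which is exactly the kind of case an exhaustive i8 sweep catches.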
Differential Revision: https://reviews.llvm.org/D106083
Louis Dionne [Thu, 15 Jul 2021 13:43:47 +0000 (09:43 -0400)]
[libc++/abi] Fix broken Lit feature no-noexcept-function-type
The feature was always defined, which means that the two test cases
guarded by it were never run.
Differential Revision: https://reviews.llvm.org/D106062
Fangrui Song [Thu, 15 Jul 2021 18:31:11 +0000 (11:31 -0700)]
[ELF] Don't define __rela_iplt_start for -pie/-shared
An executable produced by `clang -fuse-ld=lld -static-pie -fpie`
currently crashes; this patch makes it work.
See https://sourceware.org/bugzilla/show_bug.cgi?id=27164
and https://sourceware.org/pipermail/libc-alpha/2021-July/128810.html
While it seems unreasonable to keep csu/libc-start.c ARCH_APPLY_IREL unclear in
static-pie mode and have an unneeded diff -u =(ld.bfd --verbose) =(ld.bfd -pie
--verbose) difference, glibc folks don't want to fix their code.
I feel sad about that but this patch can remove an iffy condition for lld/ELF
as well: `needsInterpSection()`.
Fangrui Song [Thu, 15 Jul 2021 18:16:35 +0000 (11:16 -0700)]
[ELF][test] Rework non-preemptible ifunc tests
Nikita Popov [Thu, 15 Jul 2021 18:27:52 +0000 (20:27 +0200)]
[Verifier] Use isIntrinsic() (NFC)
Call Function::isIntrinsic() instead of manually checking the
function name for an "llvm." prefix.
Simon Pilgrim [Thu, 15 Jul 2021 18:20:27 +0000 (19:20 +0100)]
[InstCombine] Add select(cond,gep(gep(x,y),z),gep(x,y)) tests from PR51069
Sam Tebbs [Mon, 5 Jul 2021 15:08:58 +0000 (16:08 +0100)]
[ARM][LowOverheadLoops] Make some stack spills valid for tail predication
This patch makes vector spills valid for tail predication when all loads
from the same stack slot are within the loop.
Differential Revision: https://reviews.llvm.org/D105443
Quinn Pham [Thu, 15 Jul 2021 13:35:07 +0000 (08:35 -0500)]
[PowerPC] Fix popcntb XL Compat Builtin for 32bit
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D105360
Simon Pilgrim [Thu, 15 Jul 2021 17:50:06 +0000 (18:50 +0100)]
Fix "unknown pragma 'GCC'" MSVC warning. NFCI.
Simon Pilgrim [Thu, 15 Jul 2021 17:48:56 +0000 (18:48 +0100)]
[InstCombine] Add 3-operand gep test with different ptr and same indices
Dmitry Vyukov [Thu, 15 Jul 2021 17:08:35 +0000 (19:08 +0200)]
tsan: strip top inlined internal frames
The new GET_CURRENT_PC() can lead to spurious top inlined internal frames.
Here are 2 examples from bots, in both cases the malloc is supposed to be
the top frame (#0):
WARNING: ThreadSanitizer: signal-unsafe call inside of a signal
#0 __sanitizer::StackTrace::GetNextInstructionPc(unsigned long)
#1 malloc
Location is heap block of size 99 at 0xbe3800003800 allocated by thread T1:
#0 __sanitizer::StackTrace::GetNextInstructionPc(unsigned long)
#1 malloc
Let's strip these internal top frames from reports.
With other code changes I also observed some top frames
from __tsan::ScopedInterceptor, proactively remove these as well.
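A rough sketch of the stripping step, in Python for brevity. How the runtime actually recognizes internal frames is not described here; matching on the name prefixes below is an assumption, and `strip_top_internal_frames` is a hypothetical helper.

```python
# Assumed markers for internal frames, taken from the names mentioned above.
INTERNAL_PREFIXES = ("__sanitizer::", "__tsan::ScopedInterceptor")


def strip_top_internal_frames(frames):
    """Drop leading internal frames, always keeping at least one frame."""
    first = 0
    while first < len(frames) - 1 and frames[first].startswith(INTERNAL_PREFIXES):
        first += 1
    return frames[first:]


report = [
    "__sanitizer::StackTrace::GetNextInstructionPc(unsigned long)",
    "malloc",
]
stripped = strip_top_internal_frames(report)
```

With the example stack from the bots, `malloc` becomes frame #0 again.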
Differential Revision: https://reviews.llvm.org/D106081
Philip Reames [Thu, 15 Jul 2021 17:25:06 +0000 (10:25 -0700)]
[SCEV] Fix unsound reasoning in howManyLessThans
This is split from D105216; it handles only a subset of the cases in that patch.
Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stride) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning.
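The counterexample can be checked directly. The sketch below models the exit test as `i < End` for i = Start, Start + Stride, ... with a positive stride; `backedge_taken_count` is a hypothetical helper, not SCEV code.

```python
def backedge_taken_count(start, stride, end):
    """Count how many times `i < end` holds for i = start, start + stride, ...

    Assumes stride > 0 so the loop terminates.
    """
    count = 0
    i = start
    while i < end:
        count += 1
        i += stride
    return count


start, stride, end = 4, 2, 3
premise = (start - stride) < end                   # 2 < 3 holds
count = backedge_taken_count(start, stride, end)   # but 4 < 3 fails immediately
```

Here the premise holds while the backedge is never taken, so the premise alone does not justify assuming at least one iteration.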
The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride. (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.) Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start. Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases.
Credit for coming up with the approach goes entirely to Eli. I just split it off, tweaked the comments a bit, and did some additional testing.
Differential Revision: https://reviews.llvm.org/D105942
Louis Dionne [Thu, 15 Jul 2021 17:29:47 +0000 (13:29 -0400)]
[libc++] NFC: Reindent the run-buildbot script
Fangrui Song [Thu, 15 Jul 2021 17:26:21 +0000 (10:26 -0700)]
[test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers)
Vy Nguyen [Tue, 13 Jul 2021 17:27:09 +0000 (13:27 -0400)]
[llvm-exegesis] Fix missing-headers build errors.
Details:
Switch all #includes to use <> because that is consistent with what happens in the cmake checks.
Otherwise, we could be in the situation where cmake checks see that headers exist at <perfmon/...>
but in llvm-exegesis code, we use "perfmon/...", which may not exist.
Related PR/revisions: D84076, PR51017+D105615
Differential Revision: https://reviews.llvm.org/D105861
Arthur Eubanks [Thu, 15 Jul 2021 17:15:51 +0000 (10:15 -0700)]
Revert "[SLP]Workaround for InsertSubVector cost."
This reverts commit
2eb50baf059648214cb1c624b5269978a62e86a1.
Causes hangs, see comments on D105827.
Jessica Paquette [Thu, 15 Jul 2021 16:56:14 +0000 (09:56 -0700)]
[GlobalISel] Fix infinite loop in reassociationCanBreakAddressingModePattern
It didn't update the opcode while walking through G_INTTOPTR/G_PTRTOINT.
Differential Revision: https://reviews.llvm.org/D106080
Wouter van Oortmerssen [Tue, 13 Jul 2021 00:18:39 +0000 (17:18 -0700)]
[WebAssembly] Fixed LLD generation of 64-bit __wasm_apply_data_relocs
Differential Revision: https://reviews.llvm.org/D105863
Leonard Grey [Thu, 15 Jul 2021 16:56:13 +0000 (12:56 -0400)]
[lld-macho] Add LTO cache support
This adds support for the lld-only `--thinlto-cache-policy` option, as well as
implementations for ld64's `-cache_path_lto`, `-prune_interval_lto`,
`-prune_after_lto`, and `-max_relative_cache_size_lto`.
Test is adapted from lld/test/ELF/lto/cache.ll
Differential Revision: https://reviews.llvm.org/D105922
Stanislav Mekhanoshin [Wed, 7 Jul 2021 17:57:56 +0000 (10:57 -0700)]
[AMDGPU] Refine -O0 and -O1 passes.
Differential Revision: https://reviews.llvm.org/D105579
Fangrui Song [Thu, 15 Jul 2021 16:50:37 +0000 (09:50 -0700)]
[llvm-nm] Remove one-dash long options except -arch
The documentation and help messages have recommended the double-dash forms for
quite a while. Remove one-dash long options which are not recognized by GNU
style `getopt_long`.
`-arch` is kept as it is in the manpage of classic nm
https://keith.github.io/xcode-man-pages/nm.1.html
Note: the dyldinfo related options don't have a test.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D105948
Fangrui Song [Thu, 15 Jul 2021 16:45:46 +0000 (09:45 -0700)]
[test] Avoid llvm-nm one-dash long options