Craig Topper [Thu, 22 Apr 2021 17:29:36 +0000 (10:29 -0700)]
[RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi.
These instructions don't really exist, but we have ways we can
emulate them.
.vv will swap operands and use vmsle(u).vv. .vi will adjust the
immediate and use vmsgt(u).vi when possible. For .vx we need to
use some of the multiple instruction sequences from the V extension
spec.
For unmasked vmsge(u).vx we use:
vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
For cases where mask and maskedoff are the same value then we have
vmsge{u}.vx v0, va, x, v0.t which is the vd==v0 case that
requires a temporary so we use:
vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
For other masked cases we use this sequence:
vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
We trust that register allocation will prevent vd in vmslt{u}.vx
from being v0 since v0 is still needed by the vmxor.
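The three .vx sequences above can be checked with a small scalar model. This is only a sketch: masks are modelled as std::vector<bool>, the helper names (vmslt, vmsltMasked, maskOp, geUnmasked, geMaskedSame, geMasked) are made up for illustration, and masked-off elements are assumed to keep the maskedoff value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mask = std::vector<bool>;
using Vec = std::vector<int32_t>;

// vmslt.vx vd, va, x (unmasked): vd[i] = va[i] < x.
Mask vmslt(const Vec &va, int32_t x) {
  Mask vd(va.size());
  for (std::size_t i = 0; i < va.size(); ++i)
    vd[i] = va[i] < x;
  return vd;
}

// vmslt.vx vd, va, x, v0.t: inactive elements keep the old vd value
// (assumption: vd starts out holding the maskedoff operand).
Mask vmsltMasked(const Vec &va, int32_t x, const Mask &v0, const Mask &vdOld) {
  assert(va.size() == v0.size() && va.size() == vdOld.size());
  Mask vd(va.size());
  for (std::size_t i = 0; i < va.size(); ++i)
    vd[i] = v0[i] ? (va[i] < x) : vdOld[i];
  return vd;
}

// Element-wise mask-register operation (stands in for the vm*.mm forms).
template <typename Op> Mask maskOp(const Mask &a, const Mask &b, Op op) {
  assert(a.size() == b.size());
  Mask r(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    r[i] = op(a[i], b[i]);
  return r;
}

// vmslt.vx vd, va, x; vmnand.mm vd, vd, vd
Mask geUnmasked(const Vec &va, int32_t x) {
  Mask vd = vmslt(va, x);
  return maskOp(vd, vd, [](bool p, bool q) { return !(p && q); });
}

// mask == maskedoff: vmslt.vx vt, va, x; vmandnot.mm vd, vd, vt (vd is v0)
Mask geMaskedSame(const Vec &va, int32_t x, const Mask &v0) {
  Mask vt = vmslt(va, x);
  return maskOp(v0, vt, [](bool d, bool t) { return d && !t; });
}

// vmslt.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
Mask geMasked(const Vec &va, int32_t x, const Mask &v0, const Mask &maskedoff) {
  Mask vd = vmsltMasked(va, x, v0, maskedoff);
  return maskOp(vd, v0, [](bool p, bool q) { return p != q; });
}
```

In the last sequence the xor flips only the active elements (where v0 is 1), turning "less than" into "greater or equal", and leaves the maskedoff bits of inactive elements untouched.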
Differential Revision: https://reviews.llvm.org/D100925
Craig Topper [Thu, 22 Apr 2021 17:18:33 +0000 (10:18 -0700)]
[RISCV] Add missing tests for vector type for second operand of vmsgt and vmsgtu IR intrinsics.
Refactor to use new multiclass instead of individual patterns.
We already supported this due to SEW=64 on RV32, but we didn't have
test cases for all the types we supported.
Part of D100925
Craig Topper [Thu, 22 Apr 2021 17:07:23 +0000 (10:07 -0700)]
[RISCV] Support vector type for second operand of vmfge and vmfgt IR intrinsics.
We don't have instructions for these, but can swap the operands
to use vmfle/vmflt. This makes the IR interface more consistent and
simplifies the frontend implementation.
Part of D100925
Vitaly Buka [Thu, 22 Apr 2021 17:39:04 +0000 (10:39 -0700)]
[NFC] Remove reference to file deleted by D100981.
Vitaly Buka [Thu, 22 Apr 2021 08:45:36 +0000 (01:45 -0700)]
[scudo] Check if MADV_DONTNEED zeroes memory
QEMU just ignores MADV_DONTNEED
https://github.com/qemu/qemu/blob/b1cffefa1b163bce9aebc3416f562c1d3886eeaa/linux-user/syscall.c#L11941
Depends on D100998.
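A probe for this behaviour might look like the sketch below. It is not the scudo code; the function name madvDontNeedZeroesMemory is made up, and it assumes a Linux-like system with mmap/madvise and MAP_ANONYMOUS.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Returns true if MADV_DONTNEED on a private anonymous page makes it read
// back as zero. Returns false if the old contents survive (as under an
// emulator that ignores the advice) or if the probe could not be set up.
bool madvDontNeedZeroesMemory() {
  const long pageSize = sysconf(_SC_PAGESIZE);
  if (pageSize <= 0)
    return false;
  const std::size_t size = static_cast<std::size_t>(pageSize);

  void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return false;

  // volatile keeps the compiler from folding the read-after-write.
  volatile unsigned char *bytes = static_cast<volatile unsigned char *>(p);
  bytes[0] = 0xAB;

  bool zeroed = false;
  if (madvise(p, size, MADV_DONTNEED) == 0)
    zeroed = (bytes[0] == 0);

  munmap(p, size);
  return zeroed;
}
```

A test that depends on released pages being zeroed could call such a probe first and skip itself when it returns false.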
Differential Revision: https://reviews.llvm.org/D101031
Vitaly Buka [Tue, 20 Apr 2021 20:14:03 +0000 (13:14 -0700)]
[sanitizer] Use COMPILER_RT_EMULATOR with gtests
Differential Revision: https://reviews.llvm.org/D100998
Andrzej Warzynski [Thu, 22 Apr 2021 16:17:26 +0000 (16:17 +0000)]
[flang] Update recently added OpenMP tests to use the new driver
Switching from `%f18` to `%flang_fc1` in LIT tests added in
https://reviews.llvm.org/D91159. This way these tests are run with the
new driver, `flang-new`, when enabled (i.e. when
`FLANG_BUILD_NEW_DRIVER` is set).
Differential Revision: https://reviews.llvm.org/D101078
Fangrui Song [Thu, 22 Apr 2021 17:18:44 +0000 (10:18 -0700)]
Temporarily revert the code part of D100981 "Delete le32/le64 targets"
This partially reverts commit 77ac823fd285973cfb3517932c09d82e6a32f46d.
Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934).
Temporarily brings back the code part to give them some time for migration.
Vitaly Buka [Thu, 22 Apr 2021 16:57:58 +0000 (09:57 -0700)]
[lsan] Temporarily disable new check broken on arm7
Craig Topper [Thu, 22 Apr 2021 16:50:52 +0000 (09:50 -0700)]
[RISCV] Turn splat shuffles of vector loads into strided load with stride of x0.
Implementations are allowed to optimize an x0 stride to perform
fewer memory accesses. This is the case in SiFive cores.
No idea if this is the case in other implementations. We might
need a tuning flag for this.
Reviewed By: frasercrmck, arcbbb
Differential Revision: https://reviews.llvm.org/D100815
Raphael Isemann [Thu, 22 Apr 2021 16:44:58 +0000 (18:44 +0200)]
[lldb] Fix that the expression command's --top-level flag overwrites --allow-jit false
The `--allow-jit` flag allows the user to force the IR interpreter to run the
provided expression.
The `--top-level` flag parses and injects the code as if it's in the top level
scope of a source file.
Both flags just change the ExecutionPolicy of the expression:
* `--allow-jit true` -> doesn't change anything (it's the default)
* `--allow-jit false` -> ExecutionPolicyNever
* `--top-level` -> ExecutionPolicyTopLevel
Passing `--allow-jit false` and `--top-level` currently causes the `--top-level`
to silently overwrite the ExecutionPolicy value that was set by `--allow-jit
false`. There isn't any ExecutionPolicy value that says "top-level but only
interpret", so I would say we reject this combination of flags until someone
finds time to refactor top-level feature out of the ExecutionPolicy enum.
The SBExpressionOptions suffer from a similar symptom as `SetTopLevel` and
`SetAllowJIT` just silently disable each other. But those functions don't have
any error handling, so not a lot we can do about this in the meantime.
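The rejection logic described above could be sketched roughly as follows. The enum, struct and function names here are hypothetical stand-ins that only mirror the names used in this message, not the actual LLDB declarations.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the policy values named in the message.
enum class ExecutionPolicy { Default, Never, TopLevel };

struct PolicyResult {
  bool ok;
  ExecutionPolicy policy;
  std::string error;
};

// allowJit == false models `--allow-jit false`; topLevel models `--top-level`.
PolicyResult resolvePolicy(bool allowJit, bool topLevel) {
  // There is no "top-level but only interpret" policy, so reject the
  // combination instead of letting one flag silently overwrite the other.
  if (!allowJit && topLevel)
    return {false, ExecutionPolicy::Default,
            "--allow-jit false cannot be combined with --top-level"};
  if (topLevel)
    return {true, ExecutionPolicy::TopLevel, ""};
  if (!allowJit)
    return {true, ExecutionPolicy::Never, ""};
  return {true, ExecutionPolicy::Default, ""};
}
```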
Reviewed By: labath, kastiglione
Differential Revision: https://reviews.llvm.org/D91780
Craig Topper [Thu, 22 Apr 2021 16:33:24 +0000 (09:33 -0700)]
[RISCV] Use stack temporary to splat two GPRs into SEW=64 vector on RV32.
Rather than splatting each separately and doing bit manipulation
to merge them in the vector domain, copy the data to the stack
and splat it using a strided load with x0 stride. At least on
some implementations this vector load is optimized to not do
a load for each element.
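As a rough scalar model of the idea (not the generated code): the two 32-bit halves are written to an 8-byte temporary in little-endian order, and every vector element is then read from that same address, which is what a stride of zero means. The function name splatViaStack is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<uint64_t> splatViaStack(uint32_t lo, uint32_t hi, std::size_t vl) {
  // 8-byte stack temporary, written byte by byte in little-endian order so
  // the model does not depend on the host's endianness.
  unsigned char slot[8];
  for (int i = 0; i < 4; ++i) {
    slot[i] = static_cast<unsigned char>((lo >> (8 * i)) & 0xFF);
    slot[4 + i] = static_cast<unsigned char>((hi >> (8 * i)) & 0xFF);
  }

  // Strided load with stride 0: element e reads from slot + e * 0.
  const std::size_t stride = 0;
  std::vector<uint64_t> v(vl);
  for (std::size_t e = 0; e < vl; ++e) {
    const unsigned char *addr = slot + e * stride;
    uint64_t elem = 0;
    for (int i = 7; i >= 0; --i)
      elem = (elem << 8) | addr[i];
    v[e] = elem;
  }
  return v;
}
```

For example, splatViaStack(0x89ABCDEF, 0x01234567, 3) yields three copies of 0x0123456789ABCDEF.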
This is equivalent to how we move i64 to f64 on RV32.
I've only implemented this for the intrinsic fallbacks in this
patch. I think we do similar splatting/shifting/oring in other
places. If this is approved, I'll refactor the others to share
the code.
Differential Revision: https://reviews.llvm.org/D101002
Krzysztof Parzyszek [Thu, 22 Apr 2021 14:05:05 +0000 (09:05 -0500)]
[Hexagon] Add HVX intrinsics for conditional vector loads/stores
Intrinsics for the following instructions are added. The intrinsic
name is "int_hexagon_<inst>[_128B]", e.g.
int_hexagon_V6_vL32b_pred_ai for 64-byte version
int_hexagon_V6_vL32b_pred_ai_128B for 128-byte version
V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2)
V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2)
V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt
V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt
V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32
V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32
V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32
V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32
V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32
V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
Joseph Huber [Wed, 21 Apr 2021 21:31:09 +0000 (17:31 -0400)]
[OpenMP] Add function for setting LIBOMPTARGET_INFO at runtime
Summary:
This patch adds a new runtime function __tgt_set_info_flag that allows the
user to set the information level at runtime without using the environment
variable. Using this will require an extern function, but will eventually be
added into an auxiliary library for OpenMP support functions.
This patch required moving the current InfoLevel to a global variable which must
be instantiated by each plugin.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D100774
Raphael Isemann [Thu, 22 Apr 2021 16:33:46 +0000 (18:33 +0200)]
Fix memory leak in MicrosoftDemangleNodes's Node::toString
The buffer we turn into a std::string here is malloc'd and should be
free'd before we return from this function.
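The general pattern is sketched below with a made-up producer function (renderToMallocBuffer) standing in for the code that fills the malloc'd buffer. It is not the MicrosoftDemangleNodes code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for an API that hands back a malloc'd buffer.
char *renderToMallocBuffer(const char *text, std::size_t *len) {
  *len = std::strlen(text);
  char *buf = static_cast<char *>(std::malloc(*len + 1));
  if (!buf) {
    *len = 0;
    return nullptr;
  }
  std::memcpy(buf, text, *len + 1);
  return buf;
}

std::string toStringNoLeak(const char *text) {
  std::size_t len = 0;
  char *buf = renderToMallocBuffer(text, &len);
  if (!buf)
    return std::string();
  // std::string copies the bytes, so the malloc'd buffer is still ours to
  // release; returning without free() here would leak it.
  std::string result(buf, len);
  std::free(buf);
  return result;
}
```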
Follow up to LLDB leak fixes such as D100806.
Reviewed By: mstorsjo, rupprecht, MaskRay
Differential Revision: https://reviews.llvm.org/D100843
peter klausler [Wed, 21 Apr 2021 22:12:07 +0000 (15:12 -0700)]
[flang] Fix spurious errors from runtime derived type table construction
Andrzej W. @ Arm discovered that the runtime derived type table
building code in semantics was detecting fatal errors in the tests
that the f18 driver wasn't printing. This patch fixes f18 so that
these messages are printed; however, the messages were not valid user
errors, and the rest of this patch fixes them up.
There were two sources of the bogus errors. One was that the runtime
derived type information table builder was calculating the shapes of
allocatable and pointer array components in derived types, and then
complaining that they weren't constant or LEN parameter values, which
of course they couldn't be since they have to have deferred shapes
and those bounds were expressions like LBOUND(component,dim=1).
The second was that f18 was forwarding the actual LEN type parameter
expressions of a type instantiation too far into the uses of those
parameters in various expressions in the declarations of components;
when an actual LEN type parameter is not a constant value, it needs
to remain a "bare" type parameter inquiry so that it will be lowered
to a descriptor inquiry and acquire a captured expression value.
Fixing this up properly involved: moving some code into new utility
function templates in Evaluate/tools.h, tweaking the rewriting of
conversions in expression folding to elide needless integer kind
conversions of type parameter inquiries, making type parameter
inquiry folding *not* replace bare LEN type parameters with
non-constant actual parameter values, and cleaning up some
altered test results.
Differential Revision: https://reviews.llvm.org/D101001
Jianzhou Zhao [Wed, 21 Apr 2021 04:54:29 +0000 (04:54 +0000)]
[dfsan] Track origin at loads
The first version of origin tracking tracks only memory stores. Although
this is sufficient for understanding correct flows, it is hard to figure
out where an undefined value is read from. To find reads of undefined values,
we still have to do a reverse binary search from the last store in the chain,
printing and logging along the possible code paths. This is
quite inefficient.
Tracking memory load instructions can help this case. The main issues of
tracking loads are performance and code size overheads.
With tracking only stores, the code size overhead is 38%,
memory overhead is 1x, and cpu overhead is 3x. In practice #load is much
larger than #store, so both code size and cpu overhead increase. The
first blocker is code size overhead: link fails if we inline tracking
loads. The workaround is using external function calls to propagate
metadata. This is also the workaround ASan uses. The cpu overhead
is ~10x. This is a trade off between debuggability and performance,
and will be used only when debugging cases where tracking only stores
is not enough.
Reviewed By: gbalats
Differential Revision: https://reviews.llvm.org/D100967
Arthur O'Dwyer [Thu, 22 Apr 2021 16:15:09 +0000 (12:15 -0400)]
[libc++] [test] Fix nodiscard_extensions.pass.cpp in _LIBCPP_DEBUG mode.
`std::clamp(2, 1, 3, std::greater<int>())` has UB because (1 > 3) is false.
Swap the operands to fix the _LIBCPP_ASSERT failure in this test.
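For illustration, a well-formed call with a descending comparator looks like the sketch below (it needs C++17 for std::clamp). The wrapper name clampDescending is made up. With std::greater, 3 orders before 1, so 3 is the lower bound and 1 is the upper bound.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Clamps v into the range whose bounds are ordered by std::greater:
// lo = 3 and hi = 1, so comp(hi, lo), i.e. (1 > 3), is false as required.
int clampDescending(int v) {
  return std::clamp(v, 3, 1, std::greater<int>());
}
```

Values already between the bounds come back unchanged; values outside are pulled to the nearer bound.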
Irina Dobrescu [Thu, 22 Apr 2021 15:45:19 +0000 (15:45 +0000)]
[flang][openmp] Add General Semantic Checks for Allocate Directive
This patch adds semantic checks for the General Restrictions of the
Allocate Directive.
Since the requires directive is not yet implemented in Flang, the
restriction:
```
allocate directives that appear in a target region must
specify an allocator clause unless a requires directive with the
dynamic_allocators clause is present in the same compilation unit
```
will need to be updated at a later time.
A different patch will be made with the Fortran specific restrictions of
this directive.
I have used the code from https://reviews.llvm.org/D89395 for the
CheckObjectListStructure function.
Co-authored-by: Isaac Perry <isaac.perry@arm.com>
Reviewed By: clementval, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D91159
Sanjay Patel [Thu, 22 Apr 2021 16:11:47 +0000 (12:11 -0400)]
[x86] remove stale comment from test file; NFC
Arthur O'Dwyer [Mon, 19 Apr 2021 02:17:44 +0000 (22:17 -0400)]
[libc++] Eliminate macro _LIBCPP_UNUSED_VAR. NFCI.
Reviewed as part of https://reviews.llvm.org/D100737
Arthur O'Dwyer [Mon, 19 Apr 2021 02:15:38 +0000 (22:15 -0400)]
[libc++] Fix some typos and remove unused macros. NFCI.
Reviewed as part of https://reviews.llvm.org/D100737
Alexey Bataev [Thu, 22 Apr 2021 13:15:27 +0000 (06:15 -0700)]
[SLP]Skip undefs trying to find perfect/shuffled tree entries matching.
We can skip the check for undefs when trying to find perfect/shuffled tree
entry matches; they can be ignored completely, improving the final
cost/vectorization results.
Differential Revision: https://reviews.llvm.org/D101061
Hongtao Yu [Thu, 22 Apr 2021 01:02:11 +0000 (18:02 -0700)]
[llvm-profgen] A couple tweaks to the testing harness.
1. Remove unnecessary filtering code.
2. Add llvm-profgen to tool substitutions.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D101006
Peter Steinfeld [Wed, 21 Apr 2021 19:12:26 +0000 (12:12 -0700)]
[flang] Fix checking of argument passing for parameterized derived types
We were erroneously not taking into account the constant values of LEN type
parameters of parameterized derived types when checking for argument
compatibility. The required checks are identical to those for assignment
compatibility. Since argument compatibility is checked in .../lib/Evaluate and
assignment compatibility is checked in .../lib/Semantics, I moved the common
code into .../lib/Evaluate/tools.cpp and changed the assignment compatibility
checking code to call it.
After implementing these new checks, tests in resolve53.f90 were failing
because the tests were erroneous. I fixed these tests and added new tests
to call03.f90 to test argument passing of parameterized derived types more
completely.
Differential Revision: https://reviews.llvm.org/D100989
Arnamoy Bhattacharyya [Thu, 22 Apr 2021 15:45:30 +0000 (11:45 -0400)]
[flang][driver][Revert] Reverts f18 to allow options passed to -W
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D100883
Nemanja Ivanovic [Thu, 22 Apr 2021 15:28:50 +0000 (10:28 -0500)]
[PowerPC] Add missing casts for vec_xlds and vec_load_splats
The previous commits just missed some pointer casts and ended up
producing warnings.
Nemanja Ivanovic [Thu, 22 Apr 2021 13:47:31 +0000 (08:47 -0500)]
[PowerPC] Add vec_vclz as an alias for vec_cntlz in altivec.h
Another addition for compatibility with XLC. The functions have the
same overloads so just add it as a preprocessor define.
Nemanja Ivanovic [Thu, 22 Apr 2021 12:27:51 +0000 (07:27 -0500)]
[PowerPC] Add vec_load_splats to altivec.h
Add these overloads for compatibility with XLC. This is a word
load-and-splat.
Nemanja Ivanovic [Thu, 22 Apr 2021 11:43:42 +0000 (06:43 -0500)]
[PowerPC] Add vec_xlds to altivec.h
Add these overloads for compatibility with XLC. This is a doubleword
load-and-splat.
Nemanja Ivanovic [Thu, 22 Apr 2021 10:53:37 +0000 (05:53 -0500)]
[PowerPC] Add vec_roundz as alias for vec_trunc in altivec.h
Add the overloads for compatibility with XLC.
Nemanja Ivanovic [Thu, 22 Apr 2021 10:47:00 +0000 (05:47 -0500)]
[PowerPC] Add vec_roundp as alias for vec_ceil
Add the overloads for compatibility with XLC.
Nemanja Ivanovic [Thu, 22 Apr 2021 10:41:37 +0000 (05:41 -0500)]
[PowerPC] Add missing VSX guard for vec_roundm with vector double
The guard was missed in the previous commit.
Nemanja Ivanovic [Thu, 22 Apr 2021 10:38:11 +0000 (05:38 -0500)]
[PowerPC] Add vec_roundm as alias for vec_floor in altivec.h
Add the overloads for compatibility with XLC.
Louis Dionne [Thu, 22 Apr 2021 15:05:30 +0000 (11:05 -0400)]
[libc++] Re-apply `std::indirectly_readable` and `std::indirectly_writable`
That was originally committed in 04733181b513 and then reverted in
a9f11cc0d965 because it broke several people.
The problem was a missing include of __iterator/concepts.h, which has now
been fixed.
Differential Revision: https://reviews.llvm.org/D100073
Coplin, Jared [Thu, 22 Apr 2021 15:15:46 +0000 (10:15 -0500)]
[Hexagon] Unmasked and masked load pair to same base -> one load and selects
Nico Weber [Thu, 22 Apr 2021 15:14:58 +0000 (11:14 -0400)]
[lld/mac] tweak comment in a test
Joe Ellis [Thu, 22 Apr 2021 15:07:26 +0000 (15:07 +0000)]
[AArch64] Block tryCombineToBSL combines for vectors wider than NEON
There are no patterns for the AArch64ISD::BSP ISD node for anything
other than NEON vectors at the moment. As a result, if we hit these
combines for vectors wider than a NEON vector (such as what we might get
with fixed length SVE) we will fail to lower.
This patch simply prevents us from attempting the combines if the input
vector type is too wide.
Reviewed By: peterwaller-arm
Differential Revision: https://reviews.llvm.org/D100961
Alexey Bataev [Thu, 22 Apr 2021 14:58:56 +0000 (07:58 -0700)]
[OPENMP]Mark test as unsupported to avoid possible unexpected passes, NFC.
Joe Ellis [Wed, 31 Mar 2021 10:41:15 +0000 (10:41 +0000)]
[LoopVectorize] Fix bug where predicated loads/stores were dropped
This commit fixes a bug where the loop vectoriser fails to predicate
loads/stores when interleaving for targets that support masked
loads and stores.
Code such as:
  void foo(int *restrict data1, int *restrict data2)
  {
    int counter = 1024;
    while (counter--)
      if (data1[counter] > data2[counter])
        data1[counter] = data2[counter];
  }
... could previously be transformed in such a way that the predicated
store implied by:
  if (data1[counter] > data2[counter])
    data1[counter] = data2[counter];
... was lost, resulting in miscompiles.
This bug was causing some tests in llvm-test-suite to fail when built
for SVE.
Differential Revision: https://reviews.llvm.org/D99569
Alexey Bataev [Thu, 22 Apr 2021 14:53:20 +0000 (07:53 -0700)]
[SLP]Replace more `TTI` with `TTIRef`, NFC.
To pacify MSVC buildbots.
Alexey Bataev [Thu, 22 Apr 2021 14:49:08 +0000 (07:49 -0700)]
[SLP]Added explicit ref to TargetTransformInfo to try to pacify MSVC buildbots, NFC.
Nico Weber [Thu, 22 Apr 2021 14:44:56 +0000 (10:44 -0400)]
[lld/mac] make a few "named parameter comments" more consistent
Most of LLVM and almost all of lld/MachO uses `/*foo=*/bar` style.
No behavior change.
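As a small illustration of that comment style, using a made-up helper (describeSymbol is not an lld function):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper with two boolean parameters that are easy to mix up
// at the call site.
std::string describeSymbol(const std::string &name, bool isWeak,
                           bool isExternal) {
  return name + (isWeak ? " weak" : " strong") +
         (isExternal ? " external" : " local");
}

std::string example() {
  // `/*foo=*/bar` style: parameter name, then '=', with no spaces, placed
  // directly before the argument.
  return describeSymbol("_main", /*isWeak=*/false, /*isExternal=*/true);
}
```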
Alexey Bataev [Tue, 6 Apr 2021 17:26:25 +0000 (10:26 -0700)]
[SLP]Improve cost model for the vectorized extractelements.
1. No need to call `areAllUsersVectorized` as later the cost is
calculated only if the instruction has one use and gets vectorized.
2. Need to calculate the cost of the dead extractelement more precisely,
taking the vector type of the vector operand, not the resulting
vector type.
Part of D57059.
Differential Revision: https://reviews.llvm.org/D99980
Dávid Bolvanský [Thu, 22 Apr 2021 14:39:07 +0000 (16:39 +0200)]
[LoopIdiom] Added testcase for double memset (fixed in LLVM 12); NFC
Anastasia Stulova [Thu, 22 Apr 2021 12:46:46 +0000 (13:46 +0100)]
[C++4OpenCL] Add extra diagnostics for kernel argument types
Add restrictions on type layout (PR48099):
- Types passed by pointer or reference must be standard layout types.
- Types passed by value must be POD types.
Patch by olestrohm (Ole Strohm)!
Differential Revision: https://reviews.llvm.org/D100471
Alex Zinenko [Thu, 22 Apr 2021 13:52:01 +0000 (15:52 +0200)]
[mlir] Move PyConcreteAttribute to header. NFC.
This allows out-of-tree users to derive PyConcreteAttribute to bind custom
attributes.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D101063
Sven van Haastregt [Thu, 22 Apr 2021 14:08:36 +0000 (15:08 +0100)]
[OpenCL] Add missing C++ legacy atomics with generic
https://reviews.llvm.org/D62335 added some C++ for OpenCL specific
builtins to opencl-c.h, but these were not mirrored to the TableGen
builtin functions yet.
The TableGen builtins machinery does not have dedicated version
handling for C++ for OpenCL at the moment: all builtin versioning is
tied to `LangOpts.OpenCLVersion` (i.e., the OpenCL C version). As a
workaround, to add builtins that are only available in C++ for OpenCL,
we define a function extension guarded by the __cplusplus macro.
Differential Revision: https://reviews.llvm.org/D100935
Fixes PR50041.
Krzysztof Parzyszek [Thu, 22 Apr 2021 14:06:18 +0000 (09:06 -0500)]
Revert "[Hexagon] Masked and unmasked load to same base -> load and two selects"
This reverts commit
96dc8d7e7dee68592e56d69184b92fcb021cdb9c.
It breaks a few builds.
Tim Northover [Tue, 20 Apr 2021 14:05:11 +0000 (15:05 +0100)]
AArch64: support mixed-size fp <-> int conversions in GlobalISel.
Tim Northover [Tue, 20 Apr 2021 09:19:02 +0000 (10:19 +0100)]
AArch64: expand G_DIVREM operations in GlobalISel
We don't have a specific instruction for these, so they should be expanded to
whatever separate division & multiplication is needed.
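In scalar terms the expansion amounts to computing the quotient with a division and recovering the remainder with a multiply and a subtract. A minimal sketch for the signed case follows. The function name is made up, and the caller is assumed to exclude b == 0 and the INT64_MIN / -1 overflow case.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns {quotient, remainder} for a / b using one division plus a
// multiply-and-subtract, rather than a combined div/rem operation.
std::pair<int64_t, int64_t> expandSDivRem(int64_t a, int64_t b) {
  assert(b != 0 && "division by zero is outside this sketch");
  const int64_t q = a / b;     // truncating division
  const int64_t r = a - q * b; // remainder takes the sign of the dividend
  return {q, r};
}
```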
David Zarzycki [Thu, 22 Apr 2021 13:36:49 +0000 (09:36 -0400)]
Revert "[libcxx][iterator] adds `std::indirectly_readable` and `std::indirectly_writable`"
This reverts commit 04733181b5136e2b3df2b37c6bdd9e25f8afecd0, which was
failing for multiple people.
Paula Toth [Thu, 22 Apr 2021 13:43:06 +0000 (06:43 -0700)]
Update shebang for clang-format-diff script to python3.
Different distributions have different strategies migrating the `python` symlink. Debian and its derivatives provide `python-is-python2` and `python-is-python3`. If neither is installed, the user gets no `/usr/bin/python`. The clang-format-diff script and consequently `arc diff` can thus fail with a python not found error. Since we require python greater than 3.6 as part of llvm prerequisites (https://llvm.org/docs/GettingStarted.html#software), let's go ahead and update this shebang.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100968
Coplin, Jared [Thu, 22 Apr 2021 13:44:01 +0000 (08:44 -0500)]
[Hexagon] Masked and unmasked load to same base -> load and two selects
Nico Weber [Thu, 22 Apr 2021 13:09:39 +0000 (09:09 -0400)]
[lld/mac] add a comment pointing to a test that took me a while to find
Simon Pilgrim [Thu, 22 Apr 2021 13:06:57 +0000 (14:06 +0100)]
[X86] Regenerate atomic-eflags-reuse.ll
Simon Pilgrim [Thu, 22 Apr 2021 10:40:54 +0000 (11:40 +0100)]
[LTO] Caching.h - remove unused <string> include. NFCI.
Dawid Jurczak [Thu, 22 Apr 2021 13:06:20 +0000 (15:06 +0200)]
[InstCombine][NFC] Use --check-globals flag in tests.
This patch adds strings content checking to printf-2.ll via --check-globals flag.
Split off from D100724.
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D101034
Dawid Jurczak [Thu, 22 Apr 2021 13:01:44 +0000 (15:01 +0200)]
[SimplifyLibCalls][NFC] Use StringRef::back instead of explicit indexing.
Split off from D100724.
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D101032
Jun Ma [Mon, 19 Apr 2021 09:16:08 +0000 (17:16 +0800)]
[DAGCombiner] Allow operand of step_vector to be negative.
It is proper to relax the non-negative limitation of step_vector.
This patch also adds more combines for step_vector:
(sub X, step_vector(C)) -> (add X, step_vector(-C))
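The combine relies on wrap-around arithmetic: element i of step_vector(C) is i*C, and x - i*C equals x + i*(-C) modulo 2^32. A sketch with 32-bit unsigned elements (the helper names are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<uint32_t>;

// step_vector(step): <0, step, 2*step, ...> with wrap-around arithmetic.
Vec stepVector(std::size_t n, uint32_t step) {
  Vec v(n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = static_cast<uint32_t>(i) * step;
  return v;
}

// (sub X, step_vector(C))
Vec subStep(const Vec &x, uint32_t c) {
  Vec s = stepVector(x.size(), c);
  Vec r(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = x[i] - s[i];
  return r;
}

// (add X, step_vector(-C)); 0u - c is the two's-complement negation of c.
Vec addNegStep(const Vec &x, uint32_t c) {
  Vec s = stepVector(x.size(), 0u - c);
  Vec r(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = x[i] + s[i];
  return r;
}
```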
Differential Revision: https://reviews.llvm.org/D100812
Alexander Belyaev [Thu, 22 Apr 2021 12:50:13 +0000 (14:50 +0200)]
[mlir] Add `tensor.reshape`.
This operation is a counterpart of `memref.reshape`.
RFC [Reshape Ops Restructuring](https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310)
Differential Revision: https://reviews.llvm.org/D100971
Wang, Pengfei [Thu, 22 Apr 2021 09:26:36 +0000 (17:26 +0800)]
[X86][AMX][NFC] Remove assert for comparison between different BBs.
SmallSet may use operator `<` when we insert MIRef elements, so we
cannot limit the comparison between different BBs.
We allow MIRef() to be less than any initialized MIRef object; otherwise,
we always return false when comparing between different BBs.
Differential Revision: https://reviews.llvm.org/D101039
Nico Weber [Thu, 22 Apr 2021 12:40:48 +0000 (08:40 -0400)]
[gn build] (manually) port aee6c86c4d better
"EmptyNodeIntrospection.inc.in" needs to be a source of the action,
so that ninja knows to rerun this action if that input changes.
Stephen Kelly [Thu, 22 Apr 2021 12:40:01 +0000 (13:40 +0100)]
[AST] Make comment a bit more specific
Nico Weber [Thu, 22 Apr 2021 12:36:19 +0000 (08:36 -0400)]
[gn build] (manually) port aee6c86c4d
Pavel Labath [Thu, 22 Apr 2021 12:13:27 +0000 (14:13 +0200)]
[lldb/elf] Avoid side effects in function calls ParseUnwindSymbols
This addresses post-commit feedback to cd64273.
Nathan Sidwell [Wed, 14 Apr 2021 11:18:23 +0000 (04:18 -0700)]
clang: libstdc++ LWM is 4.8.3
Document oldest libstdc++ as 4.8.3, remove a hack for a 4.6 issue.
Differential Revision: https://reviews.llvm.org/D100465
Sander de Smalen [Wed, 7 Apr 2021 16:06:22 +0000 (17:06 +0100)]
[AArch64][SVE] Regression test all ACLE tests with C++
We found issues with a number of intrinsics when building them with
C++, so it makes sense to guard these tests with some extra RUN lines
to build the tests in C++ mode.
Valeriy Savchenko [Wed, 21 Apr 2021 13:49:06 +0000 (16:49 +0300)]
[-Wcalled-once] Do not run analysis on Obj-C++
Objective-C++ is not yet supported.
rdar://76729552
Differential Revision: https://reviews.llvm.org/D100955
Frederik Gossen [Thu, 22 Apr 2021 12:09:14 +0000 (14:09 +0200)]
[MLIR][Shape] Add canonicalizations for `shape.broadcast`
Eliminate empty shapes from the operands, partially fold all constant shape
operands, and fix normal folding.
Differential Revision: https://reviews.llvm.org/D100634
Raphael Isemann [Thu, 22 Apr 2021 10:29:08 +0000 (12:29 +0200)]
[lldb] Don't leak LineSequence in PDB parsers
`InsertSequence` doesn't take ownership of the pointer so releasing this pointer
is just leaking memory.
Follow up to D100806 that was fixing other leak sanitizer test failures
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D100846
Jan Svoboda [Thu, 22 Apr 2021 11:55:23 +0000 (13:55 +0200)]
[clang][deps] Check extra args in tests
These flags are being generated by `clang-scan-deps` and it makes sense to ensure it keeps doing so.
Stephen Kelly [Thu, 22 Apr 2021 11:51:25 +0000 (12:51 +0100)]
[AST] Add clarification comment
Raphael Isemann [Thu, 22 Apr 2021 11:45:15 +0000 (13:45 +0200)]
[lldb] XFAIL TestStoppedInStaticMemberFunction on Windows
It seems we can't find the symbols of static members on Windows? The bug is not
relevant to what this test is actually testing so let's just XFAIL it.
Jay Foad [Thu, 22 Apr 2021 11:29:49 +0000 (12:29 +0100)]
Fix typo "beneficiates" in comments
Stephen Kelly [Wed, 17 Mar 2021 01:48:50 +0000 (01:48 +0000)]
[AST] De-duplicate empty node introspection
This way we can add support for other nodes without duplication.
Differential Revision: https://reviews.llvm.org/D98774
Benjamin Kramer [Thu, 22 Apr 2021 11:01:16 +0000 (13:01 +0200)]
[lldb-vscode] Use a DenseMap to pacify overly aggressive linters
Some linters get rather upset upon seeing
`std::unordered_map<const char*`, because it looks like a map of
strings but isn't. lldb uses interned strings so this is not a problem.
DenseMap is a better data structure for this anyways, so use that
instead.
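To illustrate why a `const char *` key is only safe with interned strings, here is a sketch using standard containers only (DenseMap is left out to keep it self-contained). StringPool is a minimal made-up interning pool, not lldb's ConstString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Minimal interning pool: one canonical pointer per distinct string.
// unordered_set nodes are stable, so the returned pointers stay valid
// for the lifetime of the pool.
class StringPool {
  std::unordered_set<std::string> pool_;

public:
  const char *intern(const std::string &s) {
    return pool_.insert(s).first->c_str();
  }
};

// Pointer keys compare by address: equal text in two buffers is two keys.
std::size_t countRawKeys() {
  char a[] = "stopped";
  char b[] = "stopped";
  std::unordered_map<const char *, int> m;
  m[a] = 1;
  m[b] = 2;
  return m.size();
}

// With interning, equal text always yields the same pointer: one key.
std::size_t countInternedKeys() {
  StringPool pool;
  std::unordered_map<const char *, int> m;
  m[pool.intern("stopped")] = 1;
  m[pool.intern(std::string("stop") + "ped")] = 2;
  return m.size();
}
```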
Stephen Tozer [Wed, 21 Apr 2021 15:56:38 +0000 (16:56 +0100)]
[Bitcode] Ensure DIArgList in bitcode has no null or forward metadata refs
This patch fixes an issue in which ConstantAsMetadata arguments to a
DIArgList, as well as the Constant values referenced by that metadata,
would not always be emitted correctly into bitcode. This patch fixes
this issue firstly by searching for ConstantAsMetadata in DIArgLists
(previously we would only search for them when directly wrapped in
MetadataAsValue), and secondly by enumerating all of a DIArgList's
arguments directly prior to enumerating the DIArgList itself.
This patch also adds a number of asserts, and no longer treats the
arguments to a DIArgList as optional fields when reading/writing to
bitcode.
Differential Revision: https://reviews.llvm.org/D100572
Hamza Mahfooz [Thu, 22 Apr 2021 08:15:48 +0000 (09:15 +0100)]
[Matrix] Support #pragma clang fp
From https://bugs.llvm.org/show_bug.cgi?id=49739:
Currently, `#pragma clang fp` is ignored for matrix types.
For the code below, the `contract` fast-math flag should be added to the generated call to `llvm.matrix.multiply` and `fadd`
```
typedef float fx2x2_t __attribute__((matrix_type(2, 2)));
void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) {
#pragma clang fp contract(fast)
C = A*B + C;
}
```
Reviewed By: fhahn, mibintc
Differential Revision: https://reviews.llvm.org/D100834
Simon Pilgrim [Thu, 22 Apr 2021 10:11:09 +0000 (11:11 +0100)]
MipsSEFrameLowering.h - remove unused headers. NFCI.
Simon Pilgrim [Thu, 22 Apr 2021 09:57:19 +0000 (10:57 +0100)]
[X86][AVX] Add PR49971 test case
This is a llvm12 only bug, and is already avoided in trunk, but we should keep track of it.
Nemanja Ivanovic [Thu, 22 Apr 2021 10:30:50 +0000 (05:30 -0500)]
[PowerPC] Add vec_roundc as alias for vec_rint in altivec.h
For compatibility with XLC, add these overloads.
Stephen Kelly [Thu, 15 Apr 2021 20:02:10 +0000 (21:02 +0100)]
[AST] Add NestedNameSpecifierLoc accessors to node introspection
Differential Revision: https://reviews.llvm.org/D100712
Raphael Isemann [Thu, 22 Apr 2021 10:19:15 +0000 (12:19 +0200)]
[lldb][NFC] Fix unsigned/signed cmp warning in MainLoopTest
The gtest checks compare all against unsigned int constants so this also needs
to be unsigned.
Raphael Isemann [Mon, 19 Apr 2021 13:50:10 +0000 (15:50 +0200)]
[lldb] Add support for evaluating expressions in static member functions
At the moment the expression parser doesn't support evaluating expressions in
static member functions and just pretends the expression is evaluated within a
non-member function. As a result, all static members are inaccessible when
doing unqualified name lookup.
This patch adds support for evaluating in static member functions. It
essentially just does the same setup as what LLDB is already doing for
non-static member functions (i.e., wrapping the expression in a fake member
function) with the difference that we now mark the wrapping function as static
(to prevent access to non-static members).
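The lookup behaviour being matched is the ordinary C++ rule sketched below. The type and function names are made up, and evaluateInStaticContext only stands in for the fake wrapping function; it is not the name LLDB uses.

```cpp
#include <cassert>

struct Counter {
  static int total;
  int local = 0;

  // Stands in for the wrapper marked static: unqualified lookup finds
  // static members, but there is no `this`, so non-static members such as
  // `local` cannot be used here.
  static int evaluateInStaticContext() {
    return total + 1; // unqualified access to a static member
    // return local;  // would not compile in a static member function
  }
};

int Counter::total = 41;
```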
Reviewed By: shafik, jarin
Differential Revision: https://reviews.llvm.org/D81550
Nemanja Ivanovic [Thu, 22 Apr 2021 03:16:35 +0000 (22:16 -0500)]
[PowerPC] Improve codegen for vector fp to int widening conversions
We currently do not utilize instructions that convert single
precision vectors to doubleword integer vectors. These conversions
come up in code occasionally and this improvement allows us to
open code some functions that need to be added to altivec.h.
Thomas Schmeyer [Fri, 16 Apr 2021 15:33:06 +0000 (17:33 +0200)]
[mlir] Move memref-tests from standard to memref folder.
Split memref-test from standard test and move them to the folder MemRef.
Differential Revision: https://reviews.llvm.org/D100950
Martin Storsjö [Tue, 20 Apr 2021 21:05:01 +0000 (00:05 +0300)]
[AArch64] Fix calling windows varargs with floats in fixed args from non-windows functions
When inspecting the calling convention, for calling windows functions
from a non-windows function, inspect the calling convention of
the called function, not the caller.
Also remove an unnecessary parameter to AArch64CallLowering
OutgoingArgHandler.
Differential Revision: https://reviews.llvm.org/D100890
Jan Svoboda [Wed, 21 Apr 2021 11:25:31 +0000 (13:25 +0200)]
[clang][deps] Include "-cc1" in the arguments
To simplify tools consuming dependency scanning results, prepend the "-cc1" argument by default.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D100942
Tobias Gysi [Thu, 22 Apr 2021 08:22:37 +0000 (08:22 +0000)]
[mlir][linalg] remove interchange option on linalg to loop lowering.
The interchange option attached to the linalg to loop lowering affects only the loops and does not update the memory accesses generated in the body of the operation. Instead of performing the interchange during the loop lowering, use the interchange pattern.
Differential Revision: https://reviews.llvm.org/D100758
Martin Probst [Thu, 22 Apr 2021 05:54:11 +0000 (07:54 +0200)]
clang-format: [JS] do not merge side-effect imports.
The if condition was testing the current element, but
forgot to check the previous element (doh), so it
would fail depending on sort order of the imports.
Differential Revision: https://reviews.llvm.org/D101020
Jay Foad [Wed, 21 Apr 2021 16:38:32 +0000 (17:38 +0100)]
[AMDGPU] SIWholeQuadMode: don't add duplicate implicit $exec operands
STRICT_WWM and STRICT_WQM are already defined with Uses = [EXEC], so
there is no need to add another implicit use of $exec when lowering them
to V_MOV_B32 instructions.
Differential Revision: https://reviews.llvm.org/D100969
Serge Pavlov [Tue, 10 Nov 2020 16:51:34 +0000 (23:51 +0700)]
[RISCV] Custom lowering of SET_ROUNDING
Differential Revision: https://reviews.llvm.org/D91242
David Sherwood [Mon, 19 Apr 2021 13:56:35 +0000 (14:56 +0100)]
[LoopVectorize] Don't create unnecessary vscale intrinsic calls
In quite a few cases in LoopVectorize.cpp we call createStepForVF
with a step value of 0, which leads to unnecessary generation of
llvm.vscale intrinsic calls. I've optimised IRBuilder::CreateVScale
and createStepForVF to return 0 when attempting to multiply
vscale by 0.
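The folding idea can be sketched with a toy model. `ToyBuilder` below is a hypothetical stand-in, not the LLVM `IRBuilder` API, and `runtime_vscale` is an assumed value used only so the result can be checked. It counts how many vscale calls would have been emitted and skips the call when the multiplier is 0:

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an IR builder (NOT the LLVM API). It only tracks how many
// "vscale" calls would have been emitted.
struct ToyBuilder {
  int vscale_calls = 0;
  std::uint64_t runtime_vscale = 4;  // assumed value, for illustration only

  // Models "vscale * scaling". When scaling is 0 the result is the constant 0
  // and no vscale call is recorded.
  std::uint64_t create_vscale(std::uint64_t scaling) {
    if (scaling == 0)
      return 0;      // fold: nothing to emit
    ++vscale_calls;  // one vscale call would be emitted here
    return runtime_vscale * scaling;
  }
};

// Small self-check: a zero step emits nothing, a non-zero step emits one call.
void check_vscale_fold() {
  ToyBuilder b;
  assert(b.create_vscale(0) == 0 && b.vscale_calls == 0);
  assert(b.create_vscale(2) == 8 && b.vscale_calls == 1);
}
```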
Differential Revision: https://reviews.llvm.org/D100763
Wenlei He [Thu, 15 Apr 2021 05:53:40 +0000 (22:53 -0700)]
[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles
The change adds support for trimming and merging cold context when merging CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen; however, the flexibility to trim cold context after the profile is generated can be useful.
Differential Revision: https://reviews.llvm.org/D100528
Martin Storsjö [Tue, 23 Mar 2021 08:55:26 +0000 (10:55 +0200)]
[libcxx] [test] Fix testing on windows with c++experimental enabled
The straightforward `AddLinkFlag('-lc++experimental')` approach doesn't
work on e.g. MSVC. For linking to libc++ itself, more convoluted logic
is used (see configure_link_flags_cxx_library).
Differential Revision: https://reviews.llvm.org/D99177
Georgy Komarov [Tue, 6 Apr 2021 04:57:47 +0000 (07:57 +0300)]
[clang-tidy] Avoid bugprone-macro-parentheses warnings after goto argument
clang-tidy should not generate warnings for the goto argument without
parentheses, because it would be a syntax error.
The only valid case where an argument can be enclosed in parentheses is
"Labels as Values" gcc extension: https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html.
This commit adds support for the label-as-values extension as implemented in clang.
Fixes bugzilla issue 49634.
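A minimal sketch of the two cases follows. The macro and function names are hypothetical, and the second half assumes a compiler that implements the GNU "Labels as Values" extension (GCC or Clang), so it is guarded by `__GNUC__`:

```cpp
#include <cassert>

// The macro argument is used as a goto label. `goto (label)` would not
// compile, so asking for parentheses around `label` here would be wrong.
#define JUMP_TO(label) goto label

int skip_increment() {
  int x = 1;
  JUMP_TO(done);
  x += 100;  // skipped by the jump
done:
  return x;
}

#if defined(__GNUC__)
// With "Labels as Values", the operand of `goto *` is an expression, so
// parenthesizing the macro argument is valid in this form.
#define JUMP_TO_ADDR(ptr) goto *(ptr)

int computed_jump() {
  void *target = &&finish;  // address of a label (GNU extension)
  int y = 2;
  JUMP_TO_ADDR(target);
  y += 100;  // skipped by the jump
finish:
  return y;
}
#endif

// Small self-check of the plain goto case.
void check_goto_macro() { assert(skip_increment() == 1); }
```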
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D99924
Arthur Eubanks [Thu, 22 Apr 2021 06:51:51 +0000 (23:51 -0700)]
[NewPM] Mark some more wrapper passes as ignored
We shouldn't print IR when seeing these passes.
Craig Topper [Thu, 22 Apr 2021 06:05:26 +0000 (23:05 -0700)]
[RISCV] Use TargetConstant for condition code of RISCVISD::SELECT_CC.
The value is always an immediate and can never be in a register.
This is the kind of thing TargetConstant is for.
Saves a step in GenDAGISel to convert a Constant to a TargetConstant.
Max Kazantsev [Thu, 22 Apr 2021 05:50:38 +0000 (12:50 +0700)]
[GVN] Introduce loop load PRE
This patch allows PRE of the following type of loads:
```
preheader:
br label %loop
loop:
br i1 ..., label %merge, label %clobber
clobber:
call foo() // Clobbers %p
br label %merge
merge:
...
br i1 ..., label %loop, label %exit
```
Into
```
preheader:
%x0 = load %p
br label %loop
loop:
%x.pre = phi(x0, x2)
br i1 ..., label %merge, label %clobber
clobber:
call foo() // Clobbers %p
%x1 = load %p
br label %merge
merge:
%x2 = phi(x.pre, x1)
...
br i1 ..., label %loop, label %exit
```
So instead of loading from %p on every iteration, we load only when the actual clobber happens.
The typical pattern this addresses is a hot loop with all code inlined and
provably free of side effects, plus some side-effecting calls on a cold path.
The worst-case overhead is that, if we always take the clobber block, we make one more load
overall (in the preheader). This only matters if the loop has very few iterations. If the clobber block is not taken
at least once, the transform is neutral or profitable.
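The load counts can be checked with a source-level sketch. This is only a C++ analogy of the IR transform, not the GVN implementation. `Cell`, `sum_before`, and `sum_after` are hypothetical names, and the sketch assumes the loaded value is used in the merge block:

```cpp
#include <cassert>

// Hypothetical memory cell that counts how often it is read.
struct Cell {
  int value = 0;
  int loads = 0;
  int load() { ++loads; return value; }
};

// Before: the value is loaded on every iteration.
int sum_before(Cell &p, int iterations, int clobber_every) {
  int sum = 0;
  for (int i = 0; i < iterations; ++i) {
    if (clobber_every > 0 && i % clobber_every == 0)
      p.value += 1;   // cold path: clobbers p
    sum += p.load();  // load in the merge block, every iteration
  }
  return sum;
}

// After: load once in the preheader, reload only after a clobber.
int sum_after(Cell &p, int iterations, int clobber_every) {
  int sum = 0;
  int x = p.load();  // plays the role of %x0 in the preheader
  for (int i = 0; i < iterations; ++i) {
    if (clobber_every > 0 && i % clobber_every == 0) {
      p.value += 1;  // cold path: clobbers p
      x = p.load();  // plays the role of %x1: reload only here
    }
    sum += x;        // plays the role of the phi in the merge block
  }
  return sum;
}

// Small self-check: both versions compute the same sum.
void check_loop_load_pre() {
  Cell a, b;
  assert(sum_before(a, 100, 10) == sum_after(b, 100, 10));
}
```

With 100 iterations and a clobber every 10th iteration, the first version performs 100 loads and the second 11. When the clobber path is taken on every iteration, the second version performs exactly one extra load, the one in the preheader.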
This opens up several prospects for improvement:
- We can sometimes be smarter in loop-exiting blocks via split of critical edges;
- If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that
we don't know if their sum is colder than the header.
Differential Revision: https://reviews.llvm.org/D99926
Reviewed By: reames