Nashe Mncube [Tue, 23 Mar 2021 13:26:37 +0000 (13:26 +0000)]
[InstCombine] Generalise regression tests for SVE
The tests, test/Transforms/InstCombine/AArch64/sve-*,
have been shown to not be AArch64 specific. These tests
have been renamed and moved to reflect this.
Differential Revision: https://reviews.llvm.org/D99253
Josh Berdine [Thu, 25 Mar 2021 23:34:04 +0000 (23:34 +0000)]
[OCaml] Fix a possible crash in llvm_struct_name
The implementation of `llvm_struct_name` before this diff calls
`caml_copy_string`, which allocates, while the `result` local variable
points to a block allocated by `caml_alloc_small` that has not yet
been initialized. If the allocation in `caml_copy_string` triggers a
garbage collection, then the GC root `result` contains a pointer to
uninitialized data, which may crash the GC or lead to a memory
corruption.
This diff fixes this by allocating and initializing the string first
and then allocating and initializing the option, thereby leaving no
dangling pointers when allocations are made.
The conversion from a C string to an OCaml string option is refactored
into a function, `cstr_to_string_option`. This function is also used
to simplify the definitions of `llvm_get_mdstring` and
`llvm_string_of_const`.
Differential Revision: https://reviews.llvm.org/D99393
Josh Berdine [Thu, 25 Mar 2021 23:24:16 +0000 (23:24 +0000)]
[NFC][OCaml] Resolve const and unsigned compilation warnings
There are a number of compilation warnings about discarded const
qualifiers and about casts between pointers to integer types of
different signedness.
The incompatible sign warnings are due to treating the result of
`LLVMGetModuleIdentifier` as `const unsigned char *`, but it is
declared as `const char *`.
The dropped const qualifiers are due to the code pattern
`memcpy(String_val(_),_,_)` which ought to be (following the
implementation of the OCaml runtime)
`memcpy((char *)String_val(_),_,_)`. The issue is that `String_val` is
usually used to get the value of an immutable string. But in the
context of the `memcpy` calls, the string is in the process of being
initialized, so is not yet constant.
Differential Revision: https://reviews.llvm.org/D99392
Josh Berdine [Thu, 25 Mar 2021 23:07:46 +0000 (23:07 +0000)]
[NFC][OCaml] Simplify llvm_global_initializer using ptr_to_option
This diff uses ptr_to_option to convert a nullable C pointer to an
OCaml option instead of the redundant implementation in
llvm_global_initializer.
Differential Revision: https://reviews.llvm.org/D99391
David Sherwood [Fri, 26 Mar 2021 11:36:53 +0000 (11:36 +0000)]
Revert "[LoopVectorize] Simplify scalar cost calculation in getInstructionCost"
This reverts commit 240aa96cf25d880dde7a0db5d96918cfaa4b8891.
David Sherwood [Wed, 10 Mar 2021 08:34:19 +0000 (08:34 +0000)]
[LoopVectorize] Simplify scalar cost calculation in getInstructionCost
This patch simplifies the calculation of certain costs in
getInstructionCost when isScalarAfterVectorization() returns a true value.
There are a few places where we multiply a cost by a number N, i.e.
unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1;
return N * TTI.getArithmeticInstrCost(...
After some investigation it seems that there are only these cases that occur
in practice:
1. VF is a scalar, in which case N = 1.
2. VF is a vector. We can only get here if: a) the instruction is a
GEP/bitcast with scalar uses, or b) this is an update to an induction variable
that remains scalar.
I have changed the code so that N is assumed to always be 1. For GEPs
the cost is always 0, since this is calculated later on as part of the
load/store cost. For all other cases I have added an assert that none of the
users needs scalarising, which didn't fire in any unit tests.
Only one test required fixing and I believe the original cost for the scalar
add instruction to have been wrong, since only one copy remains after
vectorisation.
Differential Revision: https://reviews.llvm.org/D98512
Vladislav Vinogradov [Thu, 25 Mar 2021 12:02:41 +0000 (15:02 +0300)]
[mlir][ODS] Fix `VariadicRegion` code generation for `NoTerminator` Ops
The issue was introduced in D98468.
The `{0}Regions` is an array of `std::unique_ptr<Region>` objects,
so it should be processed accordingly.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D99332
Abhina Sreeskantharajan [Fri, 26 Mar 2021 11:12:28 +0000 (07:12 -0400)]
[Windows] Turn off text mode in TableGen and Rewriter to stop CRLF translation
This patch should fix the errors shown on the Windows bots by turning off text mode. I plan to investigate a better fix but this should unblock the buildbots for now.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D99363
Max Kazantsev [Fri, 26 Mar 2021 11:02:31 +0000 (18:02 +0700)]
[Test] Add failing test for pr49730
Muhammad Omair Javaid [Fri, 26 Mar 2021 10:54:39 +0000 (15:54 +0500)]
[LLDB] Skip TestVSCode_disconnect.test_launch arm/linux
TestVSCode_disconnect.test_launch hangs in tear down and times out on
Arm Linux. I am marking it skipped for the buildbot while looking
into the failure.
Jay Foad [Fri, 26 Mar 2021 09:31:42 +0000 (09:31 +0000)]
[AMDGPU] Inline FSHRPattern into its only use. NFC.
Fangrui Song [Fri, 26 Mar 2021 07:45:58 +0000 (00:45 -0700)]
[memprof][test] Make test_terse.cpp robust (sched_getcpu may happen to change)
```
/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/memprof/TestCases/test_terse.cpp:11:11: error: CHECK: expected string not found in input
// CHECK: MIB:[[STACKID:[0-9]+]]/1/40.00/40/40/20.00/20/20/[[AVELIFETIME:[0-9]+]].00/[[AVELIFETIME]]/[[AVELIFETIME]]/0/0/0/0
^
<stdin>:1:1: note: scanning from here
MIB:StackID/AllocCount/AveSize/MinSize/MaxSize/AveAccessCount/MinAccessCount/MaxAccessCount/AveLifetime/MinLifetime/MaxLifetime/NumMigratedCpu/NumLifetimeOverlaps/NumSameAllocCpu/NumSameDeallocCpu
^
<stdin>:4:1: note: possible intended match here
MIB:
134217729/1/40.00/40/40/20.00/20/20/7.00/7/7/1/0/0/0
```
Craig Topper [Fri, 26 Mar 2021 06:29:34 +0000 (23:29 -0700)]
[RISCV] Optimize (and (shl GPR:, uimm5:), 0xffffffff) to use 2 shifts instead of 3.
The and would normally become SLLI+SRLI, giving us SLLI+SLLI+SRLI. We
can detect this and combine the 2 SLLIs into 1.
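The arithmetic behind this combine can be checked on plain 64-bit integers. The sketch below is only an illustration, not the RISCV backend code; the function names `maskedShiftThreeOps` and `maskedShiftTwoOps` are hypothetical, and C++ shifts on `uint64_t` stand in for SLLI/SRLI on RV64.

```cpp
#include <cassert>
#include <cstdint>

// (and (shl x, c), 0xffffffff) lowered naively: one shift for the shl,
// then a shift pair to clear the upper 32 bits.
inline uint64_t maskedShiftThreeOps(uint64_t x, unsigned c) {
  assert(c < 32 && "shift amount must fit in uimm5");
  uint64_t shifted = x << c;     // models SLLI by c
  return (shifted << 32) >> 32;  // models SLLI by 32, then SRLI by 32
}

// Combined form: the two left shifts are folded into one.
inline uint64_t maskedShiftTwoOps(uint64_t x, unsigned c) {
  assert(c < 32 && "shift amount must fit in uimm5");
  return (x << (c + 32)) >> 32;  // models SLLI by c+32, then SRLI by 32
}
```

Because c is at most 31, the folded shift amount c+32 never exceeds 63, so both forms stay within the width of a 64-bit register.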
Craig Topper [Fri, 26 Mar 2021 05:04:24 +0000 (22:04 -0700)]
[RISCV] Don't call CheckAndMask from selectZExti32.
Now that targetShrinkDemandedConstant preserves 0xffffffff masks we
shouldn't need to call computeKnownBits here.
Fangrui Song [Fri, 26 Mar 2021 04:55:27 +0000 (21:55 -0700)]
[sanitizer] Simplify GetTls with dl_iterate_phdr
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS ranges. It finds the one
with `dlpi_tls_modid==1`, which belongs to one of the initially loaded
modules, then finds the consecutive ranges. The boundaries give us addr
and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize` entirely. Use the simplified method with non-Android Linux for
now, but in theory this can be used with *BSD and potentially other ELF OSes.
In the future, we can move `ThreadDescriptorSize` code to lsan (and consider
intercepting `pthread_setspecific`) to avoid hacks in generic code.
See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage
across various sanitizers.
Differential Revision: https://reviews.llvm.org/D98926
Kazu Hirata [Fri, 26 Mar 2021 04:51:38 +0000 (21:51 -0700)]
Reapply [InlineCost] Enable the cost benefit analysis on FDO
This patch enables the cost-benefit-analysis-based inliner by default
if we have an instrumentation profile.
- SPEC CPU 2017 shows a 0.4% improvement.
- An internal large benchmark shows a 0.9% reduction in the cycle
count along with 14.6% reduction in the number of call instructions
executed.
Differential Revision: https://reviews.llvm.org/D98213
Kazu Hirata [Fri, 26 Mar 2021 04:51:36 +0000 (21:51 -0700)]
[InlineCost] Reject a zero entry count
This patch teaches the cost-benefit-analysis-based inliner to reject a
zero entry count so that we don't trigger a divide-by-zero.
Suraj Sudhir [Fri, 26 Mar 2021 04:22:33 +0000 (21:22 -0700)]
[mlir][tosa] TOSA MLIR dialect update to v0.22, part 1
Incremental set of updates to align to TOSA v0.22 spec
- modify gather, resize
- add scatter
- remove aint8 type
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D99390
Wenlei He [Thu, 25 Mar 2021 18:15:35 +0000 (11:15 -0700)]
[CSSPGO] Minor tweak for inline candidate priority tie breaker
When prioritizing call sites to consider for inlining in the sample loader, use the number of samples as the first tie breaker before falling back to name/GUID comparison. This favors smaller functions when hotness is the same (from the same block). We could try to retrieve the accurate function size if this turns out to be more important.
Differential Revision: https://reviews.llvm.org/D99370
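The ordering described in the CSSPGO commit above can be sketched as a three-level comparison. This is a minimal illustration, not the sample loader's code: the `Candidate` struct, its field names, and the direction of the final name comparison are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical candidate record; field names are assumptions.
struct Candidate {
  uint64_t hotness;    // call site hotness, the primary key
  uint64_t numSamples; // samples in the callee, used here as a size proxy
  std::string name;    // deterministic last-resort tie breaker
};

// Returns true if A should be considered for inlining before B.
inline bool hasHigherPriority(const Candidate &A, const Candidate &B) {
  if (A.hotness != B.hotness)
    return A.hotness > B.hotness;       // hotter call sites first
  if (A.numSamples != B.numSamples)
    return A.numSamples < B.numSamples; // first tie breaker: fewer samples
  return A.name < B.name;               // final tie breaker: name order
}
```

With equal hotness, the candidate with fewer samples wins, which is what favors smaller functions in this model.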
Tony [Tue, 23 Mar 2021 22:38:10 +0000 (22:38 +0000)]
[NFC][AMDGPU] Corrections to AMD GPU initial kernel launch documentation
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D99223
Lang Hames [Fri, 26 Mar 2021 00:52:27 +0000 (17:52 -0700)]
[JITLink][MachO] Use full <segment>,<section> names for MachO jitlink::Sections.
JITLink now requires section names to be unique. In MachO section names are only
guaranteed to be unique within their containing segment (e.g. a '__const' section
in the '__DATA' segment does not clash with a '__const' section in the '__TEXT'
segment), so we need to use the fully qualified <segment>,<section> section
names (e.g. '__DATA,__const' or '__TEXT,__const') when constructing
jitlink::Sections for MachO objects.
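As a small illustration of the naming scheme in the JITLink commit above, the fully qualified name is the segment and section joined by a comma. The helper name `qualifiedSectionName` is hypothetical and is not part of JITLink.

```cpp
#include <cassert>
#include <string>

// Builds the "<segment>,<section>" form used to keep MachO section names
// unique across segments.
inline std::string qualifiedSectionName(const std::string &segment,
                                        const std::string &section) {
  assert(!segment.empty() && !section.empty());
  return segment + "," + section;
}
```

Two '__const' sections in different segments then map to distinct names, '__DATA,__const' and '__TEXT,__const'.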
Stella Laurenzo [Thu, 25 Mar 2021 22:52:18 +0000 (15:52 -0700)]
[mlir][python] Add docs for op class extension mechanism.
Differential Revision: https://reviews.llvm.org/D99387
Richard Smith [Fri, 26 Mar 2021 01:22:18 +0000 (18:22 -0700)]
Stop this test from dropping a .s file in the current directory.
Richard Smith [Fri, 26 Mar 2021 01:09:40 +0000 (18:09 -0700)]
Explicitly enable the new pass manager in this test.
Otherwise it fails under -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF.
Craig Topper [Fri, 26 Mar 2021 00:15:10 +0000 (17:15 -0700)]
[RISCV] Add Zbb+Zbt command lines to the signed saturating add/sub tests.
This will enable cmov to be used for select. I improve the codegen
of select_cc in D99021, but that patch doesn't work for cmov.
Amara Emerson [Thu, 25 Mar 2021 06:59:40 +0000 (23:59 -0700)]
[GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates.
Differential Revision: https://reviews.llvm.org/D99383
Jessica Paquette [Thu, 25 Mar 2021 06:45:36 +0000 (23:45 -0700)]
[AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.
This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.
To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.
The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.
This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.
Code size improvements (Darwin):
- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)
Differential Revision: https://reviews.llvm.org/D99358
Richard Smith [Fri, 26 Mar 2021 00:05:36 +0000 (17:05 -0700)]
Add a target triple to fix test failure on targets that don't support
__int128.
Richard Smith [Thu, 25 Mar 2021 23:51:56 +0000 (16:51 -0700)]
Fix a miscompile introduced by 99203f2.
getPointersDiff would previously round down the difference between two
pointers to a multiple of the element size of the pointee, which could
result in a pointer value being decreased a little.
Alexey Bataev has graciously agreed to add a testcase for this;
submitting the bugfix now to unblock.
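The rounding problem in the getPointersDiff commit above can be shown with plain integers. This is a sketch of the general issue under assumed names (`elementDiffRounded`, `elementDiffExact`); it does not reproduce the actual change.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Truncating division: a 5-byte gap with 4-byte elements is reported as
// 1 element, i.e. 4 bytes, so one byte of the real distance is lost.
inline int64_t elementDiffRounded(int64_t byteDiff, int64_t elemSize) {
  assert(elemSize > 0);
  return byteDiff / elemSize;
}

// Only reports a distance when the byte gap is an exact multiple of the
// element size; otherwise the caller learns that no answer is available.
inline std::optional<int64_t> elementDiffExact(int64_t byteDiff,
                                               int64_t elemSize) {
  assert(elemSize > 0);
  if (byteDiff % elemSize != 0)
    return std::nullopt;
  return byteDiff / elemSize;
}
```

Reconstructing a pointer from the rounded result would land one byte short in the 5-byte case, which is the kind of slightly decreased pointer value the commit message describes.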
Rahman Lavaee [Thu, 25 Mar 2021 23:36:54 +0000 (16:36 -0700)]
Add missing 'CHECK' prefix to basic block labels test.
The `CHECK` prefix was dropped in e0bf2349303f. This led to all CHECK
lines having no effect.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D99316
Muhammad Omair Javaid [Thu, 25 Mar 2021 23:37:49 +0000 (04:37 +0500)]
[LLDB] Skip TestVSCode_launch.test_progress_events arm/linux
TestVSCode_launch.test_progress_events is mysteriously failing on Arm
Linux. I am marking it skipped for the buildbot while looking into
the failure.
Fangrui Song [Thu, 25 Mar 2021 23:25:47 +0000 (16:25 -0700)]
[Triple][Driver] Add muslx32 environment and use /lib/ld-musl-x32.so.1 for -dynamic-linker
Differential Revision: https://reviews.llvm.org/D99308
Yonghong Song [Thu, 25 Mar 2021 21:09:19 +0000 (14:09 -0700)]
BPF: add extern func to data sections if specified
This permits an extern function (BTF_KIND_FUNC) to be added
to BTF_KIND_DATASEC if a section name is specified.
For example,
-bash-4.4$ cat t.c
void foo(int) __attribute__((section(".kernel.funcs")));
int test(void) {
foo(5);
return 0;
}
The extern function foo (BTF_KIND_FUNC) will be put into
BTF_KIND_DATASEC with name ".kernel.funcs".
This will help to differentiate two kinds of external functions,
functions in kernel and functions defined in other bpf programs.
Differential Revision: https://reviews.llvm.org/D93563
Jingu Kang [Thu, 11 Mar 2021 13:07:36 +0000 (13:07 +0000)]
[ValueTracking] Handle two PHIs in isKnownNonEqual()
loop:
%cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop ]
%pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ]
...
%inc = add i32 %cmp.0, 1
br label %loop
In the above example, %pos.0 takes the previous iteration's %cmp.0 along the
backedge, per the definition of the PHI instruction. If %inc is not the same
across iterations, we can conclude that the two PHIs are not equal.
Differential Revision: https://reviews.llvm.org/D98422
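The loop from the ValueTracking commit above can be simulated directly to see why the two PHIs never hold the same value. The function name `phisEverEqual` is hypothetical, and this is a brute-force check of the example, not the reasoning isKnownNonEqual() performs.

```cpp
#include <cassert>
#include <cstdint>

// Runs the example loop for `iterations` steps and reports whether %pos.0
// and %cmp.0 ever held the same value at the top of an iteration.
inline bool phisEverEqual(unsigned iterations) {
  uint32_t cmp = 3; // %cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop ]
  uint32_t pos = 1; // %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ]
  for (unsigned i = 0; i < iterations; ++i) {
    if (pos == cmp)
      return true;
    uint32_t inc = cmp + 1; // %inc = add i32 %cmp.0, 1
    pos = cmp;              // backedge: %pos.0 takes the old %cmp.0
    cmp = inc;              // backedge: %cmp.0 takes %inc
  }
  return false;
}
```

After the first iteration %cmp.0 is always %pos.0 plus one (modulo 2^32), so the two values differ on every iteration.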
Jonas Devlieghere [Wed, 24 Mar 2021 16:57:00 +0000 (09:57 -0700)]
[lldb] Add IsFullyInitialized to DynamicLoader
On Darwin-based systems, lldb will get notified by dyld before dyld itself
has finished initializing, at which point it's not safe to call certain APIs
or SPIs. Add a method to the DynamicLoader to query that.
Differential revision: https://reviews.llvm.org/D99314
Leonard Chan [Thu, 25 Mar 2021 21:26:00 +0000 (14:26 -0700)]
[llvm][hwasan] Add Fuchsia shadow mapping configuration
Ensure that Fuchsia shadow memory starts at zero.
Differential Revision: https://reviews.llvm.org/D99380
Stella Laurenzo [Sat, 20 Mar 2021 01:16:45 +0000 (18:16 -0700)]
[mlir][linalg] Add an InitTensorOp python builder.
* This has the API I want but I am not thrilled with the implementation. There are various things that could be improved both about the way that Python builders are mapped and the way the Linalg ops are factored to increase code sharing between C++/Python.
* Landing this as-is since it at least makes the InitTensorOp usable with the right API. Will refactor underneath in follow-ons.
Differential Revision: https://reviews.llvm.org/D99000
Guozhi Wei [Thu, 25 Mar 2021 21:50:18 +0000 (14:50 -0700)]
[DAE] Adjust param/arg attributes when changing parameter to undef
In the DeadArgumentElimination pass, if a function's argument is never used, the corresponding caller's parameter can be changed to undef. If the param/arg has the noundef attribute or other related attributes, the LLVM LangRef (https://llvm.org/docs/LangRef.html#parameter-attributes) says the behavior is undefined. SimplifyCFG (D97244) takes advantage of this behavior and performs a bad transformation on valid code.
To avoid this undefined behavior when changing the caller's parameter to undef, this patch removes the noundef attribute, and other attributes that imply noundef, from the param/arg.
Differential Revision: https://reviews.llvm.org/D98899
Philip Reames [Thu, 25 Mar 2021 21:50:07 +0000 (14:50 -0700)]
Mark gc.relocate and gc.result as readnone (try 2)
As noted in the LangRef, these are semantically readnone projections from the result value of the associated statepoint. However, it turned out we had a few latent bugs being covered up by the fact we were only marking them readonly (see PR49607 for context).
As of this change, all known issues are resolved. This is a deliberately minimal patch to make it easy to test downstream and revert with minimal change if that turns out to be necessary.
Differential Revision: https://reviews.llvm.org/D98729
Philip Reames [Thu, 25 Mar 2021 21:47:31 +0000 (14:47 -0700)]
[deref] Handle byval/byref/sret/inalloca/preallocated arguments for deref-at-point semantics
All of these are scoped allocations which remain dereferenceable during the lifetime of the callee.
Differential Revision: https://reviews.llvm.org/D99310
Philip Reames [Thu, 25 Mar 2021 21:41:08 +0000 (14:41 -0700)]
Autogen test to account for tool output format change
Philip Reames [Thu, 25 Mar 2021 21:08:39 +0000 (14:08 -0700)]
[test] Add test for hoisting to custom allocation function using allocsize
The first is currently demonstrating a miscompile.
David Stone [Thu, 25 Mar 2021 21:27:13 +0000 (17:27 -0400)]
Handle 128-bit IntegerLiterals in StmtPrinter
This fixes PR35677: "int128_t or uint128_t as non-type template
parameter causes crash when considering invalid constructor".
Vedant Kumar [Thu, 25 Mar 2021 21:24:59 +0000 (14:24 -0700)]
[lldb/Commands] Fix spelling of target.move-to-nearest-code in helptext
Matt Morehouse [Thu, 25 Mar 2021 21:20:48 +0000 (14:20 -0700)]
[HWASan] Mention x86_64 aliasing mode in design doc.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D98892
Craig Topper [Thu, 25 Mar 2021 21:20:38 +0000 (14:20 -0700)]
[RISCV] Reorder checks in RISCVTTIImpl::getGatherScatterOpCost to avoid calling getMinRVVVectorSizeInBits() when V extension is not enabled.
getMinRVVVectorSizeInBits() asserts if the V extension isn't
enabled. So check that gather/scatter is legal first since it
already contains a check for V extension being enabled. It
also already checks getMinRVVVectorSizeInBits for fixed length
vectors so we don't need a check in getGatherScatterOpCost.
Andrew Savonichev [Wed, 24 Mar 2021 20:33:21 +0000 (23:33 +0300)]
[MCA] Support carry-over instructions for in-order processors
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.
The patch fixes PR49712.
Differential Revision: https://reviews.llvm.org/D99339
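A rough way to picture the carry-over behavior in the MCA commit above: an instruction with more uops than the issue width occupies several consecutive issue cycles. The sketch assumes the instruction starts issuing at the beginning of a cycle and that nothing else competes for issue slots; `issueCycles` is a hypothetical name, not an llvm-mca API.

```cpp
#include <cassert>

// Number of cycles needed to issue `numUops` micro-ops when at most
// `issueWidth` of them can be issued per cycle (ceiling division).
inline unsigned issueCycles(unsigned numUops, unsigned issueWidth) {
  assert(issueWidth > 0);
  return (numUops + issueWidth - 1) / issueWidth;
}
```

For example, five uops on a dual-issue in-order core take three cycles in this model, with the last cycle carrying the single remaining uop.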
Xun Li [Thu, 25 Mar 2021 20:52:36 +0000 (13:52 -0700)]
[OpenMP][InstrProfiling] Fix a missing instr profiling counter
When emitting a function body there needs to be an instr profiling counter emitted. Otherwise instr profiling won't work for this function.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D98135
Richard Smith [Thu, 25 Mar 2021 20:46:23 +0000 (13:46 -0700)]
PR49724: Fix deduction of null member pointers.
Previously we created an implicit cast of the wrong kind, which we'd
later fail to constant-evaluate, resulting in deduction failure.
Vy Nguyen [Wed, 24 Mar 2021 01:27:58 +0000 (21:27 -0400)]
Reland [lld-macho][nfc] minor clean up, follow up to D98559
This reverts commit 77b4230ed9bea541fd3fb04707e35308c2f34347.
New change: Fixed tests on windows
Differential Revision: https://reviews.llvm.org/D99210
Xun Li [Thu, 25 Mar 2021 20:46:20 +0000 (13:46 -0700)]
[Coroutine][Clang] Force emit lifetime intrinsics for Coroutines
tl;dr Correct implementation of Coroutines requires having lifetime intrinsics available.
Coroutine functions are functions that can be suspended and resumed later. To do so, data that needs to stay alive after suspension must be put on the heap (i.e. the coroutine frame).
The optimizer is responsible for analyzing each AllocaInst and figuring out whether it should be put on the stack or the frame.
In most cases, for data whose lifetime we are unable to analyze accurately, we can just conservatively put it on the heap.
Unfortunately, there exist a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze the lifetime of that data.
In more detail, there exist cases where, at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access is allowed beyond that point.
The following is a common code pattern called "Symmetric Transfer" in coroutines:
```
auto tmp = await_suspend();
__builtin_coro_resume(tmp.address());
return;
```
In the above code example, `await_suspend()` returns a new coroutine handle, whose address we obtain and then use to resume that coroutine. This essentially "transfers" control from the current coroutine to a different coroutine.
During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards.
However, when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on `tmp`. The `address` function is a member function of the coroutine handle, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases the `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler think that `tmp` might escape.
To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived.
I made an attempt in D98638, but it appears to be way too complex and basically does the same thing as inserting lifetime intrinsics in coroutines.
Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893
Differential Revision: https://reviews.llvm.org/D99227
Nico Weber [Thu, 25 Mar 2021 20:41:32 +0000 (16:41 -0400)]
Revert "[InlineCost] Enable the cost benefit analysis on FDO"
This reverts commit ef69aa961d12dee2141a79b05c9637d8cc9c0c74.
Makes clang assert in PGO builds, see repro tgz in
https://bugs.chromium.org/p/chromium/issues/detail?id=1192783#c6
Leonard Chan [Thu, 25 Mar 2021 18:30:44 +0000 (11:30 -0700)]
[clang][driver] Support HWASan in the Fuchsia toolchain
These contain clang driver changes for supporting HWASan on Fuchsia.
This includes hwasan multilibs and the dylib path change.
Differential Revision: https://reviews.llvm.org/D99361
Roman Lebedev [Thu, 25 Mar 2021 19:58:10 +0000 (22:58 +0300)]
[NFCI][SimplifyCFG] Don't pay for a Small{Map,Set}Vector when plain SmallSet will suffice
This *only* changes the cases where we *really* don't care
about the iteration order of the underlying container,
namely when we will use the values from it to form DTU updates.
Nikita Popov [Wed, 24 Mar 2021 16:56:23 +0000 (17:56 +0100)]
[IR] Lift attribute handling for assume bundles into CallBase
Rather than special-casing assume in BasicAA getModRefBehavior(),
do this one level higher, in the attribute handling of CallBase.
For assumes with operand bundles, the inaccessiblememonly attribute
applies regardless of operand bundles.
Sanjay Patel [Thu, 25 Mar 2021 18:58:51 +0000 (14:58 -0400)]
[PowerPC] auto-generate complete testchecks; NFC
The full checks demonstrate a problem that comes up in:
https://llvm.org/PR49610
peter klausler [Thu, 25 Mar 2021 18:03:32 +0000 (11:03 -0700)]
[flang] fix spurious runtime crash on TRIM('')
The standard interoperability routine CFI_establish() does not
accept a zero-length CHARACTER type. Since these can be valid
results of intrinsic function references, work around the design
of CFI_establish() in the wrapper routine that calls it.
Differential Revision: https://reviews.llvm.org/D99296
Markus Böck [Thu, 25 Mar 2021 19:26:20 +0000 (20:26 +0100)]
[Support][Windows] Make sure only executables are found by sys::findProgramByName
The function utilizes Windows' SearchPathW function, which, as I found out today, may also return directories. After looking at the Unix implementation of the file, I found that it contains a check whether the found path is also executable. While fixing the Windows implementation, I also learned that sys::fs::access returns successfully when querying whether directories are executable, which the Unix version does not.
This patch makes both of these functions equivalent to their Unix implementation and ensures that any path returned by sys::findProgramByName on Windows is executable, just like the Unix implementation.
The checks in the Unix implementation that correspond to the additions I have made to the Windows implementation are here:
sys::findProgramByName: https://github.com/llvm/llvm-project/blob/39ecfe614350fa5db7b8f13f81212f8e3831a390/llvm/lib/Support/Unix/Program.inc#L90
sys::fs::access: https://github.com/llvm/llvm-project/blob/c2a84771bb63947695ea50b89160c02b36fb634d/llvm/lib/Support/Unix/Path.inc#L608
I encountered this issue when running the LLVM testsuite. Commands of the form `not test ...` would fail to correctly execute test.exe, which is part of GnuWin32, as it actually tried to execute a folder called test, which happened to be in a directory on my PATH.
Differential Revision: https://reviews.llvm.org/D99357
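The idea in the findProgramByName commit above, that a search hit must be an executable file and not a directory, can be sketched with the C++17 standard library. This is not the LLVM Support code: `looksLikeProgram` is a hypothetical helper, and relying on permission bits to mean "executable" is an assumption that fits POSIX-style systems better than Windows.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Accepts a candidate only if it is a regular file with at least one
// execute permission bit set. Directories and missing paths are rejected.
inline bool looksLikeProgram(const fs::path &candidate) {
  std::error_code ec;
  fs::file_status st = fs::status(candidate, ec);
  if (ec || !fs::is_regular_file(st))
    return false;
  const fs::perms execBits = fs::perms::owner_exec | fs::perms::group_exec |
                             fs::perms::others_exec;
  return (st.permissions() & execBits) != fs::perms::none;
}
```

A directory named test on the search path fails the regular-file check here, so a lookup built on this predicate would keep searching for the real test.exe.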
Mircea Trofin [Thu, 25 Mar 2021 19:28:47 +0000 (12:28 -0700)]
[NFC] Module::getInstructionCount() is const
Yaxun (Sam) Liu [Wed, 24 Mar 2021 21:28:56 +0000 (17:28 -0400)]
[CUDA][HIP] add __builtin_get_device_side_mangled_name
Add builtin function __builtin_get_device_side_mangled_name
to get the device-side mangled name for functions and global
variables, which can be used to get the symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D99301
Stanislav Mekhanoshin [Thu, 25 Mar 2021 19:04:57 +0000 (12:04 -0700)]
[AMDGPU] Refactoring mfma intrinsic definitions. NFC.
Differential Revision: https://reviews.llvm.org/D99366
Vy Nguyen [Thu, 25 Mar 2021 18:59:54 +0000 (14:59 -0400)]
[lld-macho][nfc] Removed unnecessary static_cast
Differential Revision: https://reviews.llvm.org/D99365
Andrzej Warzynski [Thu, 25 Mar 2021 18:59:48 +0000 (18:59 +0000)]
[flang][driver] Fix typos and inconsistent comments (nfc)
Krzysztof Parzyszek [Thu, 25 Mar 2021 18:43:55 +0000 (13:43 -0500)]
[Hexagon] Limit virtual register reuse range in FI elimination
Jez Ng [Thu, 25 Mar 2021 18:39:45 +0000 (14:39 -0400)]
[lld-macho] Add support for --threads
Code and test are largely identical to the LLD-ELF equivalents.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99312
Jez Ng [Thu, 25 Mar 2021 18:39:44 +0000 (14:39 -0400)]
[lld-macho] Add more TimeTraceScopes
I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome://tracing`:
https://gist.githubusercontent.com/int3/236c723cbb4b6fa3b2d340bb6395c797/raw/ef5e8234f3fdf609bf93b50f54f4e0d9bd439403/tracing.png
Reviewed By: oontvoo
Differential Revision: https://reviews.llvm.org/D99311
Jez Ng [Wed, 24 Mar 2021 18:43:09 +0000 (14:43 -0400)]
[lld-macho] Fix typo in diagnostic message
Lang Hames [Thu, 25 Mar 2021 18:45:30 +0000 (11:45 -0700)]
[JITLink][MachO/x86-64] Remove stale commented-out code.
This commented-out code was accidentally left in during the transition from
MachO-specific to generic x86-64 edge kinds (ecf6466f01c).
Mehdi Amini [Thu, 25 Mar 2021 18:36:33 +0000 (18:36 +0000)]
Remove unused function, fix warning (NFC)
The `mayNotHaveTerminator` was initially on Block but moved to the
verifier before landing and wasn't removed from its original place
where it is unused.
Shoaib Meenai [Thu, 25 Mar 2021 07:20:01 +0000 (00:20 -0700)]
[clang] Pass option directly to command. NFC
This code was written back when LLVM's minimum required CMake version
was 2.8.8, and I assume ExternalProject_Add_Step didn't take this option
at that point. It does now though, so we should just use the option.
Setting the _EP_* property is entirely equivalent (and is in fact how
these commands behave internally), but that also feels like an internal
implementation detail we shouldn't be relying on.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D99322
Shoaib Meenai [Thu, 25 Mar 2021 07:16:47 +0000 (00:16 -0700)]
[clang] Always execute multi-stage install steps
We want installs to be executed even if binaries haven't changed, e.g.
so that we can install to multiple places. This is consistent with how
non-multi-stage install targets (e.g. the regular install-distribution
target) behave.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D99321
Tim Keith [Thu, 25 Mar 2021 18:18:39 +0000 (11:18 -0700)]
[flang] Fix error compiling std::min on macos
On macos, `size_t` is `unsigned long` while `size_t - int64_t` is
`unsigned long long` so std::min requires an explicit type to compile.
Differential Revision: https://reviews.llvm.org/D99340
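The std::min issue in the flang commit above comes from template argument deduction: both arguments must have the same type. A minimal sketch of the workaround follows; the helper `clampLength` and its parameters are hypothetical and are not taken from the flang runtime.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Clamps a requested length to what remains of a buffer.
inline std::size_t clampLength(std::size_t requested, std::size_t capacity,
                               std::int64_t used) {
  assert(used >= 0 && static_cast<std::uint64_t>(used) <= capacity);
  // `capacity - used` mixes std::size_t and std::int64_t, so its type can
  // differ from std::size_t on some platforms. Naming the template argument
  // converts both operands to std::size_t and avoids a deduction failure.
  return std::min<std::size_t>(requested, capacity - used);
}
```

Without the explicit `<std::size_t>`, the call compiles only on platforms where the mixed-type subtraction happens to yield exactly `std::size_t`.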
Utkarsh Saxena [Mon, 22 Mar 2021 14:40:37 +0000 (15:40 +0100)]
[clang][Syntax] Optimize expandedTokens for token ranges.
`expandedTokens(SourceRange)` used to do a binary search to get the
expanded tokens belonging to a source range. Each binary search uses
`isBeforeInTranslationUnit` to order two source locations. This is
inherently very slow.
By profiling clangd we found that users like clangd::SelectionTree
spend 95% of their time in `isBeforeInTranslationUnit`. It is also worth
noting that users of `expandedTokens(SourceRange)` mostly query this
function with ranges provided by the AST. The ranges provided by the AST are
token ranges (starting at the beginning of a token and ending at the
beginning of another token).
Therefore we can avoid the binary search in the majority of cases by
maintaining an index of ExpandedToken by their SourceLocations. We still
do a binary search for ranges which are not token ranges, but such
instances are rare.
Performance:
`~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp`
Before: Took 2:10s to complete.
Now: Took 1:13s to complete.
Differential Revision: https://reviews.llvm.org/D99086
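The indexing scheme in the expandedTokens commit above can be modeled with a hash map that serves token ranges directly and a binary search as the fallback. This is a toy model, not the clang::syntax::TokenBuffer code: the `TokenIndex` class is hypothetical, and plain integer offsets with distinct token starts are assumed in place of SourceLocations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Each token is identified by its start offset; starts are sorted and
// assumed to be distinct.
class TokenIndex {
public:
  explicit TokenIndex(std::vector<unsigned> starts)
      : Starts(std::move(starts)) {
    assert(std::is_sorted(Starts.begin(), Starts.end()));
    for (std::size_t i = 0; i < Starts.size(); ++i)
      ByStart.emplace(Starts[i], i);
  }

  // Returns the half-open index range [first, last) of tokens whose start
  // offset lies in the closed range [begin, end].
  std::pair<std::size_t, std::size_t> tokensIn(unsigned begin,
                                               unsigned end) const {
    auto b = ByStart.find(begin);
    auto e = ByStart.find(end);
    // Fast path: both ends are token starts, so no search is needed.
    if (b != ByStart.end() && e != ByStart.end() && b->second <= e->second)
      return {b->second, e->second + 1};
    // Slow path: binary search for ranges that are not token ranges.
    auto lo = std::lower_bound(Starts.begin(), Starts.end(), begin);
    auto hi = std::upper_bound(Starts.begin(), Starts.end(), end);
    if (hi < lo)
      hi = lo; // inverted range: report an empty result
    return {static_cast<std::size_t>(lo - Starts.begin()),
            static_cast<std::size_t>(hi - Starts.begin())};
  }

private:
  std::vector<unsigned> Starts;
  std::unordered_map<unsigned, std::size_t> ByStart;
};
```

In this model the comparisons are cheap integer compares; in the real code each comparison in the binary search is an `isBeforeInTranslationUnit` call, which is why skipping the search matters.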
Jean Perier [Thu, 25 Mar 2021 17:36:06 +0000 (18:36 +0100)]
[flang] fold LOGICAL intrinsic calls
Folding of the LOGICAL intrinsic procedure was missing in the front-end,
causing a crash when using it in parameter expressions.
Simply fold LOGICAL calls to evaluate::Convert<T>.
Differential Revision: https://reviews.llvm.org/D99346
Kadir Cetinkaya [Thu, 25 Mar 2021 10:04:35 +0000 (11:04 +0100)]
[clangd] Fix a use-after-free
Clangd was storing a reference to a possibly-dead string in the compiled
config. This patch fixes the issue by copying suppression strings from
fragments into the compiled Config.
Fixes https://github.com/clangd/clangd/issues/724.
Differential Revision: https://reviews.llvm.org/D99326
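The ownership fix described in the clangd commit above boils down to copying strings into the long-lived object instead of referring to storage owned by a short-lived one. The sketch below uses hypothetical stand-in types (`Fragment`, `CompiledConfig`, `apply`); it is not clangd's config code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Short-lived input, e.g. a parsed piece of a config file.
struct Fragment {
  std::vector<std::string> suppress;
};

// Long-lived result. It owns its strings, so it stays valid after every
// Fragment that contributed to it has been destroyed.
struct CompiledConfig {
  std::vector<std::string> suppress;
};

// Copies the suppression strings out of the fragment. Storing references or
// std::string_view values here instead would dangle once `frag` dies.
inline void apply(const Fragment &frag, CompiledConfig &out) {
  out.suppress.insert(out.suppress.end(), frag.suppress.begin(),
                      frag.suppress.end());
}
```

The copy costs an allocation per string, which is a small price for a config that is compiled rarely and read often.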
Gabor Marton [Thu, 25 Mar 2021 14:29:41 +0000 (15:29 +0100)]
[Analyzer] Infer 0 value when the dividend is 0 (bug fix)
Currently, we infer 0 if the dividend of the modulo op is 0:
int a = x < 0; // a can be 0
int b = a % y; // b is either 1 % sym or 0
However, we don't when the op is / :
int a = x < 0; // a can be 0
int b = a / y; // b is either 1 / sym or 0 / sym
This commit fixes the discrepancy.
Differential Revision: https://reviews.llvm.org/D99343
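The arithmetic fact that the Analyzer commit above relies on is that a zero dividend gives zero for both % and /, whatever the nonzero divisor is. A tiny concrete check, with the hypothetical name `zeroDividendYieldsZero`, is shown below; the analyzer itself reasons about symbolic values, which this does not model.

```cpp
#include <cassert>

// For a zero dividend and any nonzero divisor y, both the remainder and the
// quotient are zero.
inline bool zeroDividendYieldsZero(int y) {
  assert(y != 0 && "division by zero is undefined");
  int a = 0;
  return (a % y == 0) && (a / y == 0);
}
```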
Marek Kurdej [Thu, 25 Mar 2021 17:09:11 +0000 (18:09 +0100)]
[libc++] [C++2b] [P2162] Allow inheritance from std::variant.
This patch changes the variant even in pre-C++2b.
It should not break anything, only allow use cases that didn't work previously.
Notes:
`__as_variant` is used in `__visitation::__variant::__visit_alt`, but I haven't used it in `__visitation::__variant::__visit_alt_at`.
That's because it is used only in `__visit_value_at`, which in turn is always used on variant specializations (that's in comparison operators).
* https://wg21.link/P2162
Reviewed By: ldionne, #libc, Quuxplusone
Differential Revision: https://reviews.llvm.org/D97394
Alexander Belyaev [Thu, 25 Mar 2021 17:08:30 +0000 (18:08 +0100)]
[mlir][linalg] Add output tensor args folding for linalg.tiled_loop.
Folds away TiledLoopOp output tensors when the following conditions are met:
* result of `linalg.tiled_loop` has no uses
* output tensor is the argument of `linalg.yield`
Example:
```
%0 = linalg.tiled_loop ... outs (%out, %out_buf:tensor<...>, memref<...>) {
...
linalg.yield %out : tensor ...
}
```
Becomes
```
linalg.tiled_loop ... outs (%out_buf:memref<...>) {
...
linalg.yield
}
```
Differential Revision: https://reviews.llvm.org/D99333
Arnamoy Bhattacharyya [Thu, 25 Mar 2021 17:02:05 +0000 (13:02 -0400)]
[flang][driver] Add options for -std=f2018
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D97119
Uday Bondhugula [Thu, 25 Mar 2021 11:23:45 +0000 (16:53 +0530)]
Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants."
This reverts commit
361b7d125b438cda13fa45f13790767a62252be9 by Chris
Lattner <clattner@nondot.org> dated Fri Mar 19 21:22:15 2021 -0700.
The change to the greedy rewriter driver picking a different order was
made without adequate analysis of the trade-offs and experimentation. A
change like this has far reaching consequences on transformation
pipelines, and a major impact upstream and downstream. For example, one
can’t be sure that it doesn’t slow down a large number of cases by small
amounts or create other issues. More discussion here:
https://llvm.discourse.group/t/speeding-up-canonicalize/3015/25
Reverting this so that improvements to the traversal order can be made
on a clean slate, in bigger steps, and with a higher bar.
Differential Revision: https://reviews.llvm.org/D99329
David Green [Thu, 25 Mar 2021 16:44:15 +0000 (16:44 +0000)]
[ARM] Revert WhileLoopStartLR to DoLoopStart
If a WhileLoopStartLR is reverted due to calls in the preheader, we may
still be able to instead create a DoLoopStart, preserving the low
overhead loop. This adds code for that, only reverting the
WhileLoopStartLR to a Br/Cmp, leaving the rest of the low overhead loop
in place.
Differential Revision: https://reviews.llvm.org/D98413
Craig Topper [Thu, 25 Mar 2021 06:23:16 +0000 (23:23 -0700)]
[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffffffff).
We look for this pattern frequently in isel patterns so it's a
good idea to try to preserve it.
This also lets us remove our special isel handling for srliw
and use a direct pattern match of (srl (and X, 0xffffffff), C)
since no bits will be removed from the and mask.
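As a rough illustration of why the direct pattern can stand in for srliw, the two operations can be modelled in plain C++. This is only a sketch: `srliwModel` and `srlOfMaskedModel` are hypothetical helper names, and the model of srliw (logical right shift of the low 32 bits, result sign-extended to 64 bits) is written from the RV64 instruction description rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 `srliw`: shift the low 32 bits right logically, then
// sign-extend the 32-bit result to 64 bits.
std::uint64_t srliwModel(std::uint64_t X, unsigned Shamt) {
  assert(Shamt < 32 && "srliw takes a 5-bit shift amount");
  std::uint32_t Low = static_cast<std::uint32_t>(X) >> Shamt;
  // Sign-extend by hand so the sketch does not rely on narrowing casts.
  return (Low & 0x80000000u) ? (0xffffffff00000000ULL | Low)
                             : static_cast<std::uint64_t>(Low);
}

// The DAG pattern (srl (and X, 0xffffffff), C) written as plain C++.
std::uint64_t srlOfMaskedModel(std::uint64_t X, unsigned C) {
  assert(C < 64 && "shift amount must be smaller than the register width");
  return (X & 0xffffffffULL) >> C;
}
```

For shift amounts from 1 to 31, bit 31 of the shifted value is always clear, so the sign extension in the srliw model has no effect and both helpers agree. They differ for a shift amount of 0 when bit 31 of the input is set.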
Differential Revision: https://reviews.llvm.org/D99042
Abhina Sreeskantharajan [Thu, 25 Mar 2021 15:55:30 +0000 (11:55 -0400)]
Fix: Reordering parameters in getFile and getFileOrSTDIN
There was a new getFileOrSTDIN call added recently which was not included in my patch. https://reviews.llvm.org/D99110
I reordered the args to match the new order.
Reviewed By: tunz
Differential Revision: https://reviews.llvm.org/D99349
Yevgeny Rouban [Thu, 25 Mar 2021 14:32:55 +0000 (21:32 +0700)]
[SLP] Fix crash in reduction for integer min/max
The SCEV commit
b46c085d2b6d1 [NFCI] SCEVExpander:
emit intrinsics for integral {u,s}{min,max} SCEV expressions
seems to reveal a new crash in SLPVectorizer.
SLP crashes because it expects a SelectInst as an externally used value
but a umin() call is found.
The patch relaxes the assumption to make the IR flag propagation safe.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D99328
Nathan James [Thu, 25 Mar 2021 14:38:35 +0000 (14:38 +0000)]
[clang-tidy] Fix mpi checks when running multiple TUs per clang-tidy process
Both the mpi-type-mismatch and mpi-buffer-deref checks make use of a static MPIFunctionClassifier object.
This causes an issue, as the classifier is initialized with the first ASTContext that produces a match.
If the check is enabled on multiple translation units in a single clang-tidy process, this classifier won't be reinitialized for each TU. I'm not an expert in the MPIFunctionClassifier but I'd imagine this is a source of UB.
It is suspected that this bug may result in the crash reported here: https://bugs.llvm.org/show_bug.cgi?id=48985. However, even if that is not the case, this should still be addressed.
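The failure mode can be sketched in isolation. Everything below is hypothetical (`Context`, `Classifier`, `Check` and the hook name `onEndOfTranslationUnit` are illustrative stand-ins, not the clang-tidy classes), and the per-TU reset shown is just one possible way to avoid the stale state, not necessarily what the patch does.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an ASTContext: one instance per translation unit.
struct Context {
  int Id;
};

// Hypothetical classifier that is tied to the context it was built from.
class Classifier {
public:
  explicit Classifier(const Context &C) : ContextId(C.Id) {}
  int contextId() const { return ContextId; }

private:
  int ContextId;
};

// Problematic shape: the function-local static is initialized exactly once,
// with whichever context happens to produce the first match.
int classifyWithStatic(const Context &C) {
  static Classifier Cl(C);
  return Cl.contextId();
}

// Possible remedy: keep the classifier as per-check state and drop it when a
// translation unit is finished, so the next TU builds a fresh one.
class Check {
public:
  int classify(const Context &C) {
    if (!Cl)
      Cl.reset(new Classifier(C));
    assert(Cl->contextId() == C.Id &&
           "classifier must belong to the current TU");
    return Cl->contextId();
  }
  void onEndOfTranslationUnit() { Cl.reset(); }

private:
  std::unique_ptr<Classifier> Cl;
};
```

With two contexts processed in sequence, `classifyWithStatic` keeps answering with the first context's identity, while `Check` picks up the second context after the reset.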
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98275
Sven van Haastregt [Thu, 25 Mar 2021 14:38:02 +0000 (14:38 +0000)]
Reuse `os` variable in AllocateTarget; NFC
Jamie Schmeiser [Thu, 25 Mar 2021 14:32:13 +0000 (10:32 -0400)]
add print-change diff modes that do not use colour
Summary:
The colour characters currently added to the output of -print-changed=diff
and -print-changed=diff-quiet cause difficulties when capturing the output
and examining it in an editor. Change the function to not have the colour
characters and add 2 new choices (-print-changed=cdiff and
-print-changed=cdiff-quiet) to retain the existing functionality of adding
the colour characters.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D97398
Arthur O'Dwyer [Wed, 24 Mar 2021 23:14:51 +0000 (19:14 -0400)]
[libc++] Eliminate <compare>'s dependency on <array>.
This refactor is not only a good idea, but is in fact required by the standard,
in the sense that <array> is mandated to include <compare>.
So <compare> shouldn't have a circular dependency on <array>!
Differential Revision: https://reviews.llvm.org/D99307
Arthur O'Dwyer [Wed, 10 Feb 2021 00:12:16 +0000 (19:12 -0500)]
[libc++] [P1032] Misc constexpr bits in <iterator>, <string_view>, <tuple>, <utility>.
This completes the implementation of P1032's changes to <iterator>,
<string_view>, <tuple>, and <utility> in C++20.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1032r1.html
Drive-by fix a couple of unintended rvalues in "*iterators*/*.fail.cpp".
Differential Revision: https://reviews.llvm.org/D96385
Kerry McLaughlin [Thu, 25 Mar 2021 13:37:02 +0000 (13:37 +0000)]
[SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads
D95598 added a cost model for broadcast shuffle, which should enable loops
such as the following to vectorize, where the load of b[42] is invariant
and can be done using a scalar load + splat:
for (int i=0; i<n; ++i)
a[i] = b[i] + b[42];
This patch adds tests to verify that we can vectorize such loops.
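For illustration, the scalar load + splat form of that loop can be written out by hand. This is a sketch of the transformation's shape, not vectorizer output: the fixed width `VF`, the `Vec` type and the helper names `splat` and `addInvariant` are assumptions made for the example, and the two buffers are assumed not to alias.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor
using Vec = std::array<int, VF>;

// Broadcast one scalar into every lane of a vector.
Vec splat(int Scalar) {
  Vec V;
  V.fill(Scalar);
  return V;
}

// Computes A[i] = B[i] + B[Inv] for every element of A.
// Assumes A and B are distinct buffers, so B[Inv] is loop invariant.
void addInvariant(std::vector<int> &A, const std::vector<int> &B,
                  std::size_t Inv) {
  assert(Inv < B.size() && A.size() <= B.size());
  const Vec Splat = splat(B[Inv]); // one scalar load, hoisted out of the loop
  const std::size_t N = A.size();
  std::size_t I = 0;
  // "Vector" body: VF lanes per iteration.
  for (; I + VF <= N; I += VF)
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      A[I + Lane] = B[I + Lane] + Splat[Lane];
  // Scalar remainder loop.
  for (; I < N; ++I)
    A[I] = B[I] + Splat[0];
}
```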
Reviewed By: joechrisellis
Differential Revision: https://reviews.llvm.org/D98506
Matt Morehouse [Thu, 25 Mar 2021 13:34:25 +0000 (06:34 -0700)]
[HWASan] Use page aliasing on x86_64.
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
Abhina Sreeskantharajan [Thu, 25 Mar 2021 13:47:25 +0000 (09:47 -0400)]
[NFC] Reordering parameters in getFile and getFileOrSTDIN
In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used.
```
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFile(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true, bool IsVolatile = false);
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFileOrSTDIN(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true);
static ErrorOr<std::unique_ptr<MB>>
getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset,
bool IsText, bool RequiresNullTerminator, bool IsVolatile);
static ErrorOr<std::unique_ptr<WritableMemoryBuffer>>
getFile(const Twine &Filename, bool IsVolatile = false);
```
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D99182
Alexander Lanin [Thu, 25 Mar 2021 13:44:41 +0000 (09:44 -0400)]
fix readability-braces-around-statements Stmt type dependency
Replaces the token-based approach to identifying the EndLoc of a Stmt with AST traversal.
This also improves handling of macros.
Fixes Bugs 22785, 25970 and 35754.
Abhina Sreeskantharajan [Thu, 25 Mar 2021 13:18:49 +0000 (09:18 -0400)]
[SystemZ][z/OS] csv files should be text files
This patch sets the OF_Text flag correctly for the csv file.
Reviewed By: anirudhp
Differential Revision: https://reviews.llvm.org/D99285
Alexey Bataev [Wed, 24 Mar 2021 14:13:58 +0000 (07:13 -0700)]
[SLP]Improve and simplify extendSchedulingRegion.
We do not need to scan further if the upper end or lower end of the
basic block is reached already and the instruction is not found. It
means that the instruction is definitely in the lower part of the basic
block or in the upper part, respectively.
This should improve compile time for very large basic blocks.
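A simplified model of the early exit, using a `std::list<int>` as a stand-in for a basic block. The names `isAboveRegion` and `nth` are hypothetical, and the sketch assumes the target is in the block but outside the current region; it is not the SLPVectorizer code.

```cpp
#include <cassert>
#include <iterator>
#include <list>

using Iter = std::list<int>::const_iterator;

// Small helper for building iterators in examples.
Iter nth(const std::list<int> &Block, int N) {
  return std::next(Block.begin(), N);
}

// The current region is [RegionBegin, RegionEnd). Returns true if Target lies
// above the region and false if it lies below it.
bool isAboveRegion(const std::list<int> &Block, Iter RegionBegin,
                   Iter RegionEnd, Iter Target) {
  Iter Up = RegionBegin;
  Iter Down = RegionEnd;
  // Scan one step upwards and one step downwards per iteration.
  while (Up != Block.begin() && Down != Block.end()) {
    --Up;
    if (Up == Target)
      return true;
    if (Down == Target)
      return false;
    ++Down;
  }
  // One direction hit the end of the block without finding Target, so there
  // is no need to keep scanning: Target must be on the other side.
  assert(!(Up == Block.begin() && Down == Block.end()) &&
         "Target must be in the block and outside the region");
  return Down == Block.end();
}
```

When the region sits close to one end of a long block, the loop stops after a few steps instead of walking the remainder of the block in the other direction.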
Differential Revision: https://reviews.llvm.org/D99266
Djordje Todorovic [Thu, 11 Mar 2021 14:55:13 +0000 (06:55 -0800)]
[Debugify] Expose original debug info preservation check as CC1 option
In order to test the preservation of the original Debug Info metadata
in your projects, a front end option could be very useful, since users
usually report that a concrete entity (e.g. variable x, or function fn2())
is missing debug info. [0] is an example of running the utility
on the GDB project.
This depends on: D82546 and D82545.
Differential Revision: https://reviews.llvm.org/D82547
Simon Pilgrim [Thu, 25 Mar 2021 12:12:04 +0000 (12:12 +0000)]
[X86][SSE] Add pmulh tests where the source ops are not generated from sign/zero-extends
Simon Pilgrim [Thu, 25 Mar 2021 11:52:28 +0000 (11:52 +0000)]
[X86][SSE] Rename pmulh tests to show they're from sign/zero-extends
I'm intending to add additional coverage based off computeKnownBits/ComputeNumSignBits as suggested by PR45897
Fraser Cormack [Wed, 24 Mar 2021 14:54:20 +0000 (14:54 +0000)]
[RISCV] Optimize select-like vector shuffles
This patch adds a small optimization for vector shuffle lowering,
detecting shuffles which can be re-expressed as vector selects.
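The detection can be sketched in plain C++ as follows. The helper names (`isSelectLikeShuffle`, `applyShuffle`, `applySelect`) are hypothetical, the mask convention is the one used by LLVM IR's shufflevector (an index below N picks from the first operand, an index of N or more picks from the second), and undefined mask lanes are ignored for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every lane I of the shuffle reads lane I of either operand,
// and records in SelectFirst which operand each lane comes from.
bool isSelectLikeShuffle(const std::vector<int> &Mask,
                         std::vector<bool> &SelectFirst) {
  const int N = static_cast<int>(Mask.size());
  SelectFirst.assign(Mask.size(), false);
  for (int I = 0; I < N; ++I) {
    if (Mask[I] == I)
      SelectFirst[I] = true;  // lane I of the first operand
    else if (Mask[I] == I + N)
      SelectFirst[I] = false; // lane I of the second operand
    else
      return false;           // the lane moves, so this is a real shuffle
  }
  return true;
}

// Reference two-operand shuffle.
std::vector<int> applyShuffle(const std::vector<int> &A,
                              const std::vector<int> &B,
                              const std::vector<int> &Mask) {
  assert(A.size() == B.size() && A.size() == Mask.size());
  const int N = static_cast<int>(A.size());
  std::vector<int> R(A.size());
  for (int I = 0; I < N; ++I)
    R[I] = Mask[I] < N ? A[Mask[I]] : B[Mask[I] - N];
  return R;
}

// Lane-wise select: Cond[I] ? A[I] : B[I].
std::vector<int> applySelect(const std::vector<bool> &Cond,
                             const std::vector<int> &A,
                             const std::vector<int> &B) {
  assert(A.size() == B.size() && A.size() == Cond.size());
  std::vector<int> R(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    R[I] = Cond[I] ? A[I] : B[I];
  return R;
}
```

For a four-lane mask such as `{0, 5, 2, 7}` no element changes lanes, so the shuffle and the select built from the recorded condition produce the same vector.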
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99270
Nemanja Ivanovic [Thu, 25 Mar 2021 11:32:12 +0000 (06:32 -0500)]
[PowerPC][NFC] Provide legacy names for VSX loads and stores
Before we unified the names of the builtins across all the
compilers, there were a number of synonyms between them. There
is code out there that uses XL naming for some of these loads and
stores. This just adds those names.