Lang Hames [Thu, 13 Apr 2023 03:14:52 +0000 (20:14 -0700)]
[JITLink] Remove a stale comment.
This comment was probably mangled when the generic ELFLinkGraphBuilder was
written from the original x86-64 specific version. Regardless of its origins,
it doesn't make any sense now.
Craig Topper [Thu, 13 Apr 2023 03:13:02 +0000 (20:13 -0700)]
[TableGen] Store CodeGenInstruction reference in EmitNodeMatcherCommon. NFC
Instead of storing a string containing the instruction name, store a
reference to the instruction. We can use that reference to print the
instruction name when we emit the table.
The only slightly annoying part is that we have to find the
CodeGenInstruction for IMPLICIT_DEF. GlobalISel is doing
a similar thing.
Arthur Eubanks [Mon, 13 Mar 2023 19:32:24 +0000 (12:32 -0700)]
[Pipeline] Remove Annotation2Metadata pass in post-link pipelines
The pre-link pipeline already ran the pass and it only needs to be run once.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D145978
Peiming Liu [Tue, 24 Jan 2023 16:55:29 +0000 (16:55 +0000)]
[mlir][sparse] extend loop emitter to emit slice driven loops
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D142930
Shivam Gupta [Thu, 13 Apr 2023 02:58:57 +0000 (08:28 +0530)]
[Docs] Add GettingStarted.md to index.md
Kai Sasaki [Thu, 13 Apr 2023 01:41:56 +0000 (10:41 +0900)]
[mlir][func] Guard for unranked memref with the bare ptr memref call
Lowering the call op with use-bare-ptr-memref-call crashes due to the unsupported unranked memref type. We can prevent the crash by checking the type of operand in the pass instead of the assertion in the type converter.
Issue: https://github.com/llvm/llvm-project/issues/61872
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D148078
Jason Molenda [Thu, 13 Apr 2023 01:31:11 +0000 (18:31 -0700)]
AArch64 debugserver parse ESR register for watchpoints
Have debugserver parse the watchpoint flags out of the exception
syndrome register when we get a watchpoint mach exception. Relay
those fields up to lldb in the stop reply packet. If the watchpoint
number was reported by the hardware, use the address from that as
the watchpoint address.
Change how watchpoints are reported to lldb from using the mach
exception data, to using the `reason:watchpoint` and `description:asciihex`
method that lldb-server uses, which can relay the actual trap address
as well as the address of a watched memory region responsible for
the trap, so lldb can step past it.
Have debugserver look for the nearest watchpoint that it has set
when it gets a watchpoint trap, so accesses that are reported as
starting before the watched region are associated with the correct
watchpoint to lldb. Add a test case for this specific issue.
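As a rough illustration of the "nearest watchpoint" lookup described above, the following is a minimal sketch, not debugserver's code. The `(start, size)` tuple representation and both function names are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start_address, size_in_bytes).
Watchpoint = Tuple[int, int]


def distance_to_region(addr: int, wp: Watchpoint) -> int:
    """Return 0 if addr is inside the watched region, else the gap in bytes."""
    start, size = wp
    end = start + size  # one past the last watched byte
    if addr < start:
        return start - addr
    if addr >= end:
        return addr - (end - 1)
    return 0


def nearest_watchpoint(addr: int, watchpoints: List[Watchpoint]) -> Optional[int]:
    """Return the index of the watchpoint closest to addr, or None if none are set."""
    if not watchpoints:
        return None
    return min(range(len(watchpoints)),
               key=lambda i: distance_to_region(addr, watchpoints[i]))
```

With watchpoints at `(0x1000, 8)` and `(0x2000, 4)`, an access reported at `0x0ffc` (four bytes before the first region) is attributed to the first watchpoint rather than being dropped.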
Differential Revision: https://reviews.llvm.org/D147820
rdar://83996471
Fangrui Song [Thu, 13 Apr 2023 01:08:58 +0000 (18:08 -0700)]
[docs] Fix --filter typo in SymbolizerMarkupFormat.rst
Jason Molenda [Thu, 13 Apr 2023 00:53:51 +0000 (17:53 -0700)]
Remove AArch64 out of MIPS watchpoint-skip, doc wp description
Watchpoints from lldb-server are sent in the stop info packet
as `reason:watchpoint` and `description:asciihex` keys; the
latter's asciihex has one to three integer values. This patch
documents the purpose of those three different numbers, and
clarifies that on MIPS a third number that is outside the range
of any watched memory range means the watchpoint should be
silently skipped.
lldb was previously using this silently-skip-watchpoint behavior
for AArch64 as well, but in the case of AArch64 we see a watchpoint
address outside of a watched memory range when the write BEGINS
before the watched memory range, but extends into it. We don't
want to silently skip these.
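The `description:asciihex` value mentioned above can be decoded roughly as follows. This is a sketch under assumptions: it treats the payload as ASCII text written as hex digit pairs, holding whitespace-separated decimal integers. The function names are hypothetical and the exact number format should be checked against the lldb documentation.

```python
from typing import List


def encode_watchpoint_description(values: List[int]) -> str:
    """Encode one to three integers as hex-encoded ASCII text (assumed format)."""
    text = " ".join(str(v) for v in values)
    return text.encode("ascii").hex()


def decode_watchpoint_description(asciihex: str) -> List[int]:
    """Decode the hex-encoded ASCII payload back into its integer fields."""
    text = bytes.fromhex(asciihex).decode("ascii")
    values = [int(field) for field in text.split()]
    if not 1 <= len(values) <= 3:
        raise ValueError("expected one to three integers, got %d" % len(values))
    return values
```

For example, the text `4096` is sent as `34303936`, and a three-value payload round-trips through the two helpers unchanged.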
Differential Revision: https://reviews.llvm.org/D147816
rdar://83996471
Philip Reames [Thu, 13 Apr 2023 00:42:15 +0000 (17:42 -0700)]
[RISCV][TTI] Call improveShuffleKindFromMask like all the other backends
No test diff; noticed via inspection.
Han Zhu [Tue, 14 Feb 2023 01:16:25 +0000 (17:16 -0800)]
[X86 isel] Remove lane requirement from lowerShuffleAsUNPCKAndPermute
`lowerShuffleAsUNPCKAndPermute` requires the shuffle mask element to be
in the same lane in both the input and output vectors. This prevents it from
matching certain patterns, for example in
[GHI 61964](https://github.com/llvm/llvm-project/issues/61964). Removing the lane
requirement fixes the issue.
The change I'm targeting is in the test llvm/test/CodeGen/X86/pr61964.ll. The
codegen has improved notably with this patch. Otherwise, looks like some
broadcast instructions are replaced with unpck and perm. To check if there's
any other performance change, I ran llvm-test-suite benchmarks from the
SingleSource, MultiSource, and MicroBenchmarks directories:
```
Tests: 2665
Short Running: 2009 (filtered out)
Same hash: 140 (filtered out)
In Blacklist: 513 (filtered out)
Remaining: 3
Metric: exec_time
Program exec_time
lhs rhs diff
test-suite :: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.test 1.64 1.64 0.1%
test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test 1.06 1.06 0.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test 5.25 5.25 0.0%
Geomean difference nan nan 0.0%
exec_time
l/r lhs rhs diff
count 3.000000 3.000000 3.000000
mean 2.648300 2.649100 0.000462
std 2.269035 2.268849 0.000415
min 1.055500 1.055900 0.000095
25% 1.349300 1.350250 0.000237
50% 1.643100 1.644600 0.000379
75% 3.444700 3.445700 0.000646
max 5.246300 5.246800 0.000913
```
The patch only hits three cases and the result is neutral. (The 513 blacklisted
benchmarks are the ones under MicroBenchmarks, for which `--filter-hash` does
not work; I manually verified that their code did not change.)
Differential Revision: https://reviews.llvm.org/D147668
Noah Goldstein [Wed, 12 Apr 2023 23:37:01 +0000 (18:37 -0500)]
[LIBC] Fix comments / name of __sched_cpu_count tests
Test was incorrectly named/commented after the sched_{set|get}affinity
functions.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D148044
Noah Goldstein [Wed, 12 Apr 2023 23:36:48 +0000 (18:36 -0500)]
[LIBC] Fix `getrandom` success return value
`getrandom` should return the number of bytes successfully set on
success, not `0`.
Reviewed By: sivachandra, michaelrj
Differential Revision: https://reviews.llvm.org/D147981
Noah Goldstein [Wed, 12 Apr 2023 23:41:33 +0000 (18:41 -0500)]
[LIBC] Implement `sched_yield()`
Implements: https://linux.die.net/man/2/sched_yield
Possibly we don't need the return value check / errno, since according to
both the manpage and the current Linux source `sched_yield` cannot fail.
Reviewed By: sivachandra, michaelrj
Differential Revision: https://reviews.llvm.org/D147985
Reagan Bohan [Wed, 12 Apr 2023 23:55:15 +0000 (16:55 -0700)]
[crt] Enable sparc and mips targets
This patch enables sparc and mips in compiler-rt CRT, meaning that now every platform supported by compiler-rt builtins (that runs on Linux, i.e. not WebAssembly) will be supported by compiler-rt CRT.
Reviewed By: phosek, MaskRay
Differential Revision: https://reviews.llvm.org/D147819
Akira Hatanaka [Wed, 12 Apr 2023 23:45:55 +0000 (16:45 -0700)]
Fix an assertion failure in unwrapSugar
An assertion in Qualifiers::addObjCLifetime fails when the ObjC lifetime
bits are already set.
Instead of calling operator+=, call addConsistentQualifiers, which
allows the lifetime bits to be set again as long as the new value doesn't
conflict with the old value.
This fixes https://github.com/llvm/llvm-project/issues/61419.
Differential Revision: https://reviews.llvm.org/D147263
Amara Emerson [Fri, 24 Feb 2023 00:35:39 +0000 (16:35 -0800)]
[GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types.
These reduce the typing and clutter from:
Register Dst = MI.getOperand(0).getReg();
Register Src1 = MI.getOperand(1).getReg();
Register Src2 = MI.getOperand(2).getReg();
Register Src3 = MI.getOperand(3).getReg();
LLT DstTy = MRI.getType(Dst);
... etc etc
To just:
auto [Dst, Src1, Src2, Src3] = MI.getFirst4Regs();
auto [DstTy, Src1Ty, Src2Ty, Src3Ty] = MI.getFirst4LLTs();
Or even more concise:
auto [Dst, DstTy, Src1, Src1Ty, Src2, Src2Ty, Src3, Src3Ty] =
MI.getFirst4RegLLTs();
Differential Revision: https://reviews.llvm.org/D144687
Amara Emerson [Wed, 12 Apr 2023 16:40:59 +0000 (09:40 -0700)]
[GlobalISel] Move the truncstore_merge combine to the LoadStoreOpt pass and add support for an extra case.
If we have a set of mergeable stores of shifts, but the original source value being shifted
is wider than the merged size, we should still be able to merge if we truncate first. To do this
however we need to search for stores speculatively up the block, without knowing exactly how
many stores we should see before we stop. The old algorithm has to match an exact number of
stores to fit the wide type, or it dies. The new one will try to set the wide type to however
many stores we found in the upwards block traversal and use later checks to verify if they're
a valid mergeable set.
The reason I need to move this to LoadStoreOpt is because the combiner works going top down
inside a block, which means that we end up doing partial merges because we haven't seen all
the possible stores before we mutate the MIR. In LoadStoreOpt we can go bottom up.
As a side effect of this change, we also end up doing better on an existing test case (missing_store)
since we manage to do a partial merge there.
Craig Topper [Wed, 12 Apr 2023 23:11:03 +0000 (16:11 -0700)]
[LV] Optimize trip count SCEV.
To calculate the trip count we need to add 1 to the backedge
taken count. If we need to widen the backedge count, it's better
to do the add before the widening if we can guarantee it won't
overflow.
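The ordering question can be seen with plain modular arithmetic. The sketch below models fixed-width integers with bit masks. The function names and the 8-bit/32-bit widths are chosen for illustration only and do not come from the patch.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, upper bits become zero."""
    return value & ((1 << from_bits) - 1)


def trip_count_widen_then_add(btc: int, narrow_bits: int, wide_bits: int) -> int:
    """Always correct: widen the backedge-taken count, then add 1 in the wide type."""
    return (zext(btc, narrow_bits) + 1) & ((1 << wide_bits) - 1)


def trip_count_add_then_widen(btc: int, narrow_bits: int) -> int:
    """Add 1 in the narrow type, then widen. Wraps to 0 if btc is the narrow maximum."""
    narrow_mask = (1 << narrow_bits) - 1
    return zext((btc + 1) & narrow_mask, narrow_bits)
```

For a backedge-taken count of 9 both orders give 10. For an 8-bit count of 255 the add-first form wraps to 0 while the widen-first form gives 256, which is why the add can only be moved before the widening when the no-overflow guarantee holds.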
The code here is based on similar code I found in
LoopIdiomRecognize.
This is the vectorizer version of this InstCombine patch D142783.
Looking at the IR diffs, this does look like it gets more cases
than the InstCombine patch.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D147355
Valentin Clement [Wed, 12 Apr 2023 23:13:29 +0000 (16:13 -0700)]
[mlir][openacc][NFC] Use oilist in assembly format
Use the oilist syntax in assembly format where appropriate.
This makes the dialect format more flexible as an order
is not imposed for the clauses.
Reviewed By: PeteSteinfeld, razvanlupusoru
Differential Revision: https://reviews.llvm.org/D148154
Med Ismail Bennani [Wed, 12 Apr 2023 23:00:07 +0000 (16:00 -0700)]
[lldb] Fix assertion when ScriptedProcess have no pid after launch
This patch should fix an assertion that causes some test failures:
https://ci.swift.org/view/LLDB/job/llvm-org-lldb-release-debuginfo/3587/console
This was caused by the changes introduced in `88f409194d5a` where we
replaced `DidLaunch` by `DidResume` in the `ScriptedProcess` class.
However, by the time we resume the process, the pid should be already
set. To address this, this patch brings back `DidLaunch` which will
initialize the ScriptedProcess pid with a placeholder value. That value
will be updated in `DidResume` to the final pid.
Note, this 2 stage PID initialization is necessary sometimes, when the
scripted process gets stopped at entry (launch) and gets assigned an
object that contains the PID value. In this case, we need to update the
PID when we resume the process after we've stopped at entry.
This also changes the default scripted process id to an arbitrary
number (42) since the current value (0) is considered invalid.
Differential Revision: https://reviews.llvm.org/D148153
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Vitaly Buka [Wed, 12 Apr 2023 22:38:44 +0000 (15:38 -0700)]
[test][asan] Simplify test
FileCheck is not very useful here.
Michael Jones [Wed, 12 Apr 2023 18:26:18 +0000 (11:26 -0700)]
[libc] Fix strtod exponent overflow bug
String to float has a condition to prevent overflowing the exponent with
the E notation. To do this it checks if adding that exponent to the
exponent found by parsing the number is greater than the maximum
exponent for the given size of float. The if statements had a gap on
exactly the maximum exponent value that caused it to be treated as the
minimum exponent value. This patch fixes those conditions.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D148152
Fangrui Song [Wed, 12 Apr 2023 22:01:04 +0000 (15:01 -0700)]
Revert D146987 "[Assignment Tracking] Enable by default"
This reverts commit 3820e9a2b29a2e268319ed6635da0d59e18d736d.
See https://reviews.llvm.org/D146987 for issues.
Ziqing Luo [Wed, 12 Apr 2023 21:40:14 +0000 (14:40 -0700)]
[-Wunsafe-buffer-usage] Add a Fixable for pointer pre-increment
For a pointer type expression `e` of the form `++DRE`, if `e` is under
an Unspecified Pointer Context (UPC) and `DRE` is supposed to be
transformed to have std::span type, we generate fix-its that transform `e` to
`(DRE = DRE.subspan(1)).data()`.
For reference, `e` is in a UPC if `e` is
- an argument of a function call (unless the callee has the [[unsafe_buffer_usage]] attribute), or
- the operand of a cast-to-(Integer or Boolean) operation, or
- the operand of a pointer subtraction operation, or
- the operand of a pointer comparison operation.
We may extend the definition of UPC by adding more cases later.
Reviewed by: NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D144304
varconst [Wed, 12 Apr 2023 21:13:55 +0000 (14:13 -0700)]
[libc++][ranges][NFC] Templatize some of the types in `almost_satisfies_types.h`
Paul Kirth [Mon, 20 Mar 2023 21:16:15 +0000 (21:16 +0000)]
[CodeGen][RISCV] Change Shadow Call Stack Register to X3
ShadowCallStack implementation uses s2 register on RISC-V, but that
choice is problematic for reasons described in:
https://lists.riscv.org/g/sig-toolchains/message/544,
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/370, and
https://github.com/google/android-riscv64/issues/72
The concern over the register choice was also brought up in
https://reviews.llvm.org/D84414.
https://reviews.llvm.org/D84414#2228666 said:
```
"If the register choice is the only concern about this work, then I think
we can probably land it as-is and fixup the register choice if we see
major drawbacks later. Yes, it's an ABI issue, but on the other hand the
shadow call stack is not a standard ABI anyway."
```
Since we have now found a sufficient reason to fixup the register
choice, we should go ahead and update the implementation. We propose
using x3(gp) which is now the platform register in the RISC-V ABI.
Reviewed By: asb, hiraditya, mcgrathr, craig.topper
Differential Revision: https://reviews.llvm.org/D146463
Craig Topper [Wed, 12 Apr 2023 20:42:35 +0000 (13:42 -0700)]
[LoopIdiomRecognize] Replace getNegativeSCEV(getOne()) with getMinusOne. NFC
Craig Topper [Wed, 12 Apr 2023 20:15:59 +0000 (13:15 -0700)]
[RISCV] Support llvm.lround intrinsics with i32 return type on RV64.
It seems that flang uses this for "nint" and expects this i32
to work. On the C side we think lround should only work for "long"
which is i64 on rv64.
It's easy for us to support i32 when we have native FP instructions.
I fell back to i64 and truncated the result otherwise. The
documentation for lround says it returns an unspecified value if
the result doesn't fit in the integer type. I have no idea what flang is
expecting. I really only did the libcall to avoid forking a test.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D147195
Fangrui Song [Wed, 12 Apr 2023 20:13:38 +0000 (13:13 -0700)]
[ELF] Cap parallel::strategy to 16 threads when --threads= is unspecified
When --threads= is unspecified, we set it to
`parallel::strategy.compute_thread_count()`, which uses
sched_getaffinity (Linux)/cpuset_getaffinity (FreeBSD)/std::thread::hardware_concurrency (others).
With extensive testing on many machines (many configurations from
{aarch64,x86-64} x {Linux,FreeBSD,Windows} x allocators(native,mimalloc,rpmalloc) combinations)
with varying workloads, we discovered that when the concurrency is larger than
16, the linking process is slower than using --threads=16 because parallelism
overhead outweighs the optimizations. This is particularly harmful for machines with
many cores or when the link job competes with other jobs.
Cap parallel::strategy when --threads= is unspecified.
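The defaulting rule can be sketched as below. This is an illustration only, not lld's implementation. `default_thread_count`, `available_cpus` and the `detected` parameter are hypothetical names, and `os.sched_getaffinity` is only available on some platforms (such as Linux), so the sketch falls back to `os.cpu_count()`.

```python
import os
from typing import Optional

MAX_DEFAULT_THREADS = 16  # the cap described in the commit message


def available_cpus() -> int:
    """CPUs usable by this process, with a portable fallback."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def default_thread_count(requested: Optional[int] = None,
                         detected: Optional[int] = None) -> int:
    """An explicit request wins; otherwise cap the detected count at 16."""
    if requested is not None:
        return requested
    if detected is None:
        detected = available_cpus()
    return min(detected, MAX_DEFAULT_THREADS)
```

On a 64-core machine this yields 16 by default, while an explicit request (the analogue of passing --threads=32) is left untouched.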
For some workloads changing the concurrency from 8 to 16 has nearly no improvement.
--thinlto-jobs= is unchanged since ThinLTO backend compiles are embarrassingly
parallel.
Link: https://discourse.llvm.org/t/avoidable-overhead-from-threading-by-default/69160
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D147493
Shubham Sandeep Rastogi [Wed, 12 Apr 2023 19:44:13 +0000 (12:44 -0700)]
Revert "Move DBG_VALUE's that depend on loads to after a"
This reverts commit 0aaf634152f25a805563d552e72d89e8202d84f2.
Reverted this because of build failure https://lab.llvm.org/buildbot#builders/245/builds/7035
/home/tcwg-buildbot/worker/clang-armv8-quick/llvm/llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1.ll:28:12: error: DWARF23: expected string not found in input
; DWARF23: DW_OP_lit13{{$}}
^
<stdin>:1:1: note: scanning from here
-: file format elf32-littlearm
^
<stdin>:19:20: note: possible intended match here
DW_AT_frame_base (DW_OP_reg13 SP)
^
Nicolas Vasilache [Wed, 12 Apr 2023 19:12:46 +0000 (12:12 -0700)]
[mlir][SCF] Make the scf.take_assumed_branch transform only read its target handle
Handles are tracked properly and usage composes better if we don't artificially lose handles.
Differential Revision: https://reviews.llvm.org/D148155
Vitaly Buka [Wed, 12 Apr 2023 19:23:30 +0000 (12:23 -0700)]
[test][tsan] Attempt to fix darwin after D147337
Shubham Sandeep Rastogi [Tue, 7 Feb 2023 19:33:45 +0000 (11:33 -0800)]
Move DBG_VALUE's that depend on loads to after a
load if the load is moved due to the pre register allocation ld/st
optimization pass
The issue here is that there can be a scenario where debug information
is lost because of the pre register allocation load store optimization
pass, where a load whose result describes the debug information for a
local variable gets moved below the DBG_VALUE that uses it, and that
causes the debug information for that variable to get lost.
Example:
Before the Pre Register Allocation Load Store Pass
inst_a
%2 = ld ...
inst_b
DBG_VALUE %2, "x", ...
%3 = ld ...
After the Pass:
inst_a
inst_b
DBG_VALUE %2, "x", ...
%2 = ld ...
%3 = ld ...
The load has now been moved to after the DBG_VALUE that uses its result
and the debug info for "x" has been lost. What we want is:
inst_a
inst_b
%2 = ld ...
DBG_VALUE %2, "x", ...
%3 = ld ...
Which is what this patch addresses.
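The reordering in the example can be modelled on a toy instruction list. This is a sketch of the idea only, not the pass itself. The `Inst` record and `sink_dbg_values` are hypothetical names, and real MIR handling involves far more than register names.

```python
from typing import Dict, List, NamedTuple, Optional


class Inst(NamedTuple):
    text: str
    defines: Optional[str] = None  # register this instruction defines, if any
    dbg_use: Optional[str] = None  # register a DBG_VALUE refers to, if any


def sink_dbg_values(block: List[Inst]) -> List[Inst]:
    """Move each DBG_VALUE that precedes its register's definition to just after it."""
    def_index = {inst.defines: i for i, inst in enumerate(block) if inst.defines}
    pending: Dict[str, List[Inst]] = {}
    out: List[Inst] = []
    for i, inst in enumerate(block):
        reg = inst.dbg_use
        if reg is not None and def_index.get(reg, -1) > i:
            # The defining load was moved below this DBG_VALUE: defer it.
            pending.setdefault(reg, []).append(inst)
            continue
        out.append(inst)
        if inst.defines in pending:
            out.extend(pending.pop(inst.defines))
    return out
```

Applied to the "After the Pass" sequence above, this produces inst_a, inst_b, the `%2` load, the DBG_VALUE for "x", and then the `%3` load, which matches the desired order.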
Differential Revision: https://reviews.llvm.org/D145168
Kadir Cetinkaya [Wed, 12 Apr 2023 16:40:43 +0000 (18:40 +0200)]
[clangd] Treat preamble patch as main file for include-cleaner analysis
Since we redefine all macros in preamble-patch, and it's parsed after
consuming the preamble macros, we can get false missing-include diagnostics
while a fresh preamble is being rebuilt.
This patch makes sure preamble-patch is treated the same as the main file for
include-cleaner purposes.
Differential Revision: https://reviews.llvm.org/D148143
David Green [Wed, 12 Apr 2023 18:44:02 +0000 (19:44 +0100)]
[AArch64][SVE] Extend predicated fadd/fsub patterns to negative zero
This adds -0.0 patterns for fadd and fsub, to go with D147723. The fsub pattern
is only added for completeness but with -0.0 being the neutral element the fadd
case comes up from vectorized reductions.
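Why -0.0 rather than +0.0 is the neutral element for floating-point addition can be checked directly with IEEE 754 doubles under the default rounding mode. The helper names below are made up for the example.

```python
import math
from typing import Iterable


def same_float(a: float, b: float) -> bool:
    """Equal values with the same sign bit, so 0.0 and -0.0 are told apart."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


def is_fadd_identity(candidate: float, samples: Iterable[float]) -> bool:
    """True if x + candidate reproduces x exactly for every sample."""
    return all(same_float(x + candidate, x) for x in samples)


SAMPLES = [0.0, -0.0, 1.5, -2.25]
```

`is_fadd_identity(-0.0, SAMPLES)` holds, whereas `is_fadd_identity(0.0, SAMPLES)` fails because `-0.0 + 0.0` yields `+0.0` and loses the sign of the input.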
Differential Revision: https://reviews.llvm.org/D147724
Alexey Bataev [Thu, 6 Apr 2023 21:29:13 +0000 (14:29 -0700)]
[SLP]Improve reduction cost model for scalars.
Instead of abstract cost of the scalar reduction ops, try to use the
cost of actual reduction operation instructions, where possible. Also,
remove the estimation of the vectorized GEPs pointers for reduced loads,
since it is already handled in the tree.
Differential Revision: https://reviews.llvm.org/D148036
Kadir Cetinkaya [Wed, 12 Apr 2023 18:30:13 +0000 (20:30 +0200)]
[include-cleaner] Fix dangling pointers in tests
Alex Langford [Tue, 11 Apr 2023 21:06:06 +0000 (14:06 -0700)]
[lldb] Change formatter helper function parameter list to remove ConstString
All of these functions take a ConstString for the type_name,
but this isn't really needed for two reasons:
1.) This parameter is always constructed from a static c-string
constant.
2.) They are passed along to `AddTypeSummary` as a StringRef anyway.
Differential Revision: https://reviews.llvm.org/D148050
Shivam Gupta [Wed, 12 Apr 2023 06:54:15 +0000 (12:24 +0530)]
[Flang][Docs] Add a GettingStarted.md for build instructions
This patch fixes the first point of https://github.com/llvm/llvm-project/issues/60730.
https://flang.llvm.org/docs/ has no build instructions. This doc page is just
a copy of the README page so that it is accessible from the website.
Differential Revision: https://reviews.llvm.org/D148070
Paul Kirth [Tue, 11 Apr 2023 16:52:37 +0000 (16:52 +0000)]
[clang][driver][NFC] Add hasShadowCallStack to SanitizerArgs
Currently, we can't check if ShadowCallStack is present in Args the same
way we handle other sanitizers. This is preparatory work for planned
driver changes to how we handle ShadowCallStack.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D148031
Paul Kirth [Thu, 23 Mar 2023 16:54:15 +0000 (16:54 +0000)]
[support] Provide overload to PrintNumber that use C++ types
This attempts to address an issue with overload resolution for `PrintNumber`
with `size_t` parameters on Darwin, brought up in
https://reviews.llvm.org/D146492.
On Aarch64 Darwin, `uint64_t` has a different typedef than `size_t`
(e.g., `unsigned long long` vs. `unsigned long`), whereas on Linux and
Windows they are the same.
This commit also reverts the static_cast's added in
064e2497e2ebe9ac30ac96923a26a52484300fdf, since they are no longer
needed.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D146771
lizhiguang [Wed, 12 Apr 2023 17:47:28 +0000 (10:47 -0700)]
[sanitizer] correct prctl scope
The prctl scope is wrong; I think this is a typo. We should use:
REAL(prctl)(args...)
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D147760
Vlad Serebrennikov [Wed, 12 Apr 2023 17:45:32 +0000 (20:45 +0300)]
[clang] Make make_cxx_dr_status script runnable from anywhere
This script has hardcoded relative paths to `clang/test/CXX/drs`, `cwg_index.html`, and `cxx_dr_status.html`, which requires running it with `clang/www` CWD. This patch makes those paths relative to path of the script itself, so that it could be run from anywhere.
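A common way to make such paths independent of the current working directory is to resolve them against the script's own location. The sketch below shows the general technique. The helper name `script_relative` is hypothetical and is not taken from the patch.

```python
import os


def script_relative(script_file: str, *parts: str) -> str:
    """Resolve parts against the directory that contains script_file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.normpath(os.path.join(script_dir, *parts))
```

Inside a script this would typically be called as `script_relative(__file__, "cwg_index.html")`, so the result no longer depends on where the script was launched from.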
Reviewed By: #clang-language-wg, cor3ntin
Differential Revision: https://reviews.llvm.org/D148146
Mitch Phillips [Wed, 12 Apr 2023 17:08:40 +0000 (10:08 -0700)]
[MTE] [llvm-readobj] Add globals section parsing to --memtag
Global variables are described in a metadata table called
SHT_AARCH64_MEMTAG_GLOBALS_DYNAMIC. It's basically a ULEB-encoded skip
list with some other fancy encoding tricks to make it smaller. You can
see the ABI at
https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#83encoding-of-sht_aarch64_memtag_globals_dynamic
This extends readelf/readobj to understand these sections.
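For readers unfamiliar with the ULEB part, a generic ULEB128 decoder looks roughly like this. It shows only the variable-length integer encoding. The skip-list layout and the other encoding tricks of the section are defined by the ABI document linked above and are not modelled here.

```python
from typing import Tuple


def decode_uleb128(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one ULEB128 value; return (value, offset just past the value)."""
    result = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated ULEB128 value")
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:              # high bit clear marks the last byte
            return result, offset
        shift += 7
```

For instance, the byte sequence `E5 8E 26` decodes to 624485 and consumes three bytes.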
Reviewed By: pcc, MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D145761
Timm Bäder [Tue, 11 Apr 2023 05:26:22 +0000 (07:26 +0200)]
[clang][NFC] Use range-for loop in TextDiagnostic.cpp
Simon Pilgrim [Wed, 12 Apr 2023 16:41:26 +0000 (17:41 +0100)]
[SLP][X86] Add SSE4 test coverage to minmax reduction tests
Improve coverage for D148036
Archibald Elliott [Tue, 4 Apr 2023 09:36:12 +0000 (10:36 +0100)]
[DAGCombiner] Fix (shl (ctlz x) n) for non-power-of-two Data
This DAGCombine is not valid for some combinations of the known bits
of x and non-power-of-two widths of x. As shown in the bug:
- The bitwidth of x is 35 (n=5)
- The only unknown bit of x is the least significant bit
- This gives the result of the ctlz two possible values: 34 or 35, both
of which will give 1 when right-shifted 5 bits.
- So the `eor x, 1` that this optimisation would give is not correct.
A similar instcombine optimisation is only applied when the width of x is
a power-of-two. GlobalISel does not have this bug, as shown by the testcase.
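The arithmetic in the list above can be replayed with a small model. It reads the shift as a logical right shift by n, the only reading under which 34 and 35 both produce 1 (an assumption about the intended operation). The function names are made up for the example.

```python
def ctlz(x: int, width: int) -> int:
    """Count leading zeros of x viewed as a width-bit unsigned integer."""
    assert 0 <= x < (1 << width)
    return width - x.bit_length()


def shifted_ctlz(x: int, width: int, n: int) -> int:
    """The original expression: ctlz(x) shifted right by n bits."""
    return ctlz(x, width) >> n


def folded(x: int) -> int:
    """The replacement the combine would produce: x xor 1."""
    return x ^ 1
```

With a 32-bit x limited to 0 or 1, `shifted_ctlz(x, 32, 5)` and `folded(x)` agree for both values. With a 35-bit x, `shifted_ctlz(1, 35, 5)` is 1 while `folded(1)` is 0, which is the miscompile described.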
Fixes #61549
Differential Revision: https://reviews.llvm.org/D147518
Archibald Elliott [Tue, 4 Apr 2023 09:34:24 +0000 (10:34 +0100)]
[NFC][AArch64] Add Test for PR61549
Differential Revision: https://reviews.llvm.org/D147517
Archibald Elliott [Wed, 5 Apr 2023 13:24:43 +0000 (14:24 +0100)]
[AArch64][GISel] Legalize non-power-of-two G_CTLZ
This fixes a crash found in PR61549, and adds test coverage for other
sizes that cannot be selected.
Differential Revision: https://reviews.llvm.org/D147516
Archibald Elliott [Tue, 4 Apr 2023 09:32:17 +0000 (10:32 +0100)]
[NFC][AArch64] Regenerate G_CTLZ Legalizer Test
Differential Revision: https://reviews.llvm.org/D147515
Alexis Engelke [Tue, 11 Apr 2023 15:02:03 +0000 (17:02 +0200)]
[LegacyPM] Call getPassName only when needed
Even when time tracing is disabled, getPassName is currently still
called. This adds an avoidable virtual function call for each pass.
Fetching the pass name only when required slightly improves
compile-time (particularly when LLVM is built without LTO).
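The shape of the change is "only compute the name when something will consume it". The toy model below illustrates that pattern. The `Pass` class and `run_passes` are hypothetical and are not the LegacyPM API.

```python
from typing import List


class Pass:
    def __init__(self, name: str) -> None:
        self._name = name
        self.name_lookups = 0  # counts calls, to make the saving visible

    def get_pass_name(self) -> str:
        self.name_lookups += 1
        return self._name

    def run(self) -> None:
        pass  # the pass body is irrelevant to this sketch


def run_passes(passes: List[Pass], time_tracing_enabled: bool) -> List[str]:
    """Run every pass; look up its name only when time tracing needs it."""
    trace: List[str] = []
    for p in passes:
        if time_tracing_enabled:
            trace.append(p.get_pass_name())
        p.run()
    return trace
```

With tracing disabled, `name_lookups` stays at zero for every pass, which mirrors the avoided virtual call.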
Reviewed By: nikic, MaskRay
Differential Revision: https://reviews.llvm.org/D148022
Thurston Dang [Tue, 11 Apr 2023 00:44:25 +0000 (00:44 +0000)]
ASan: move allocator base to avoid conflict with high-entropy ASLR for x86-64 Linux
Users have discovered [*] that when CONFIG_ARCH_MMAP_RND_BITS == 32,
it will frequently conflict with ASan's allocator on x86-64 Linux, because the
PIE program segment base address of 0x555555555554 plus an ASLR shift of up to
((2**32) * 4K == 0x100000000000) will sometimes exceed ASan's hardcoded
base address of 0x600000000000. We fix this by simply moving the allocator base
to 0x500000000000, which is below the PIE program segment base address. This is
cleaner than trying to move it to another location that is sandwiched between
the PIE program and library segments, because if either of those grow too large,
it will collide with the allocator region.
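The address arithmetic above can be verified with a few lines. The constants are the ones quoted in the message. The collision test is a deliberate simplification (it ignores segment and allocator sizes) and its name is made up for the example.

```python
PAGE_SIZE = 4 * 1024                      # 4K pages
PIE_BASE = 0x555555555554                 # PIE program segment base from the message
MAX_ASLR_SHIFT = (2 ** 32) * PAGE_SIZE    # CONFIG_ARCH_MMAP_RND_BITS == 32
OLD_ALLOCATOR_BASE = 0x600000000000
NEW_ALLOCATOR_BASE = 0x500000000000


def pie_can_reach(allocator_base: int) -> bool:
    """True if some ASLR shift can place the PIE base at or above allocator_base."""
    return PIE_BASE <= allocator_base <= PIE_BASE + MAX_ASLR_SHIFT
```

The maximum shift is 0x100000000000, so the PIE base can land as high as 0x655555555554, which is above the old allocator base. The new base sits below 0x555555555554 and is therefore out of reach in this model.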
Note that we will never need to change this base address again (unless we want to increase
the size of the allocator), because ASLR cannot be set above 32-bits for x86-64 Linux (the
PIE program segment and library segments would collide with each other; see also
ARCH_MMAP_RND_BITS_MAX in https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig).
[*] see https://b.corp.google.com/issues/276925478
and https://groups.google.com/a/google.com/g/chrome-os-gardeners/c/BbfzCP3dEeo/m/h3C_vVUxCQAJ
Differential Revision: https://reviews.llvm.org/D147984
OCHyams [Wed, 12 Apr 2023 16:07:44 +0000 (17:07 +0100)]
[Assignment Tracking] Ignore VLA-backed variables
VLA backed variables currently trip an assertion in SROA with D146987 (enabling
assignment tracking). Disable assignment tracking for VLA variables until that
can be investigated.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D148140
Shao-Ce SUN [Wed, 12 Apr 2023 07:57:20 +0000 (15:57 +0800)]
[nfc][flang] Eliminate the dependency on cctype by using characters.h
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D148076
Nicolas Vasilache [Wed, 12 Apr 2023 13:19:45 +0000 (06:19 -0700)]
[mlir][SCF] Add an scf.take_assumed_branch transform op.
Given an scf.if conditional, using this transformation is akin to injecting
user-specified information that it is always safe to execute only the specified
`if` or `else` branch.
This is achieved by just replacing the scf.if by the content of one of its
branches.
This is particularly useful for user-controlled rewriting of conditionals
that exist solely to guard against out-of-bounds behavior.
At the moment, no assume or assert operation is emitted as it is not always
desirable. In the future, this may be controlled by a dedicated attribute.
Differential Revision: https://reviews.llvm.org/D148125
Kadir Cetinkaya [Wed, 12 Apr 2023 10:05:14 +0000 (12:05 +0200)]
[include-cleaner] Improve handling for templates
The principles here are:
- Making sure each template instantiation implies use of the most specialized
template. As explicit instantiations/specializations are not redeclarations of
the primary template.
- Introducing a use from explicit instantiations/specializations to the primary
template, as they're required but not traversed as part of the RAV.
Differential Revision: https://reviews.llvm.org/D148112
Mark de Wever [Tue, 11 Apr 2023 16:51:18 +0000 (18:51 +0200)]
[NFC][libc++] Use _LIBCPP_HIDE_FROM_ABI.
This updates the new __system_error directory.
Reviewed By: #libc, philnik
Differential Revision: https://reviews.llvm.org/D148028
Valentin Clement [Wed, 12 Apr 2023 15:19:11 +0000 (08:19 -0700)]
[flang][openacc] Keep region when applying data operand conversion
Similar to D148039 but for the FIR to LLVM IR
conversion pass.
The inner part of the acc.loop has been removed since the rest of the
pipeline is not ready and would raise an error here. This was passing
until now because the acc.loop was discarded completely.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D148057
Anna Thomas [Wed, 12 Apr 2023 14:58:06 +0000 (10:58 -0400)]
Revert "[GuardUtils] Add asserts about loop varying widenable conditions"
This reverts commit 5675757f5fc6e27ce01b3b12bdfd04044df53aa3.
The assert may be too strict. Revert and investigate why the assert fires.
David Green [Wed, 12 Apr 2023 14:53:22 +0000 (15:53 +0100)]
[AArch64][SVE] Extend predicated fma patterns to negative zero
This extends the patterns added in D130564 for fma to also handle negative 0.0.
-0.0 is the identity element for fadd so comes up in vectorized loops.
The same basic idea applies to D130564, but nsz should no longer be needed for
the fadd case, and is for fsub (which is really only added for completeness).
Differential Revision: https://reviews.llvm.org/D147723
Nikita Popov [Wed, 12 Apr 2023 14:39:24 +0000 (16:39 +0200)]
[VNCoercion] Drop some redundant functions (NFC)
These load and store APIs now do the same thing, so combine them
into one.
Ivan Butygin [Thu, 16 Mar 2023 18:18:07 +0000 (19:18 +0100)]
[mlir][scf] More WhileOp canonicalizations
Remove duplicated ConditionOp args, remove unused init/yield args.
Differential Revision: https://reviews.llvm.org/D146252
Simon Pilgrim [Wed, 12 Apr 2023 14:28:17 +0000 (15:28 +0100)]
[TTI][X86] getMinMaxCost - use existing integer min/max intrinsic cost values instead of maintaining a duplicate cost table
getMinMaxCost has an alternative set of min/max costs to getIntrinsicInstrCost that are only used by getMinMaxReductionCost, but they are a lot less thorough and fall back to an expansion in most cases, resulting in cost overestimations - we're better off just using getIntrinsicInstrCost.
getIntrinsicInstrCost is still missing complete FMINNUM/FMAXNUM costs, so until then getMinMaxCost will still be used for these, after that we can remove getMinMaxCost and have getMinMaxReductionCost call getIntrinsicInstrCost directly.
Fixes regression noticed in D148036
Simon Pilgrim [Wed, 12 Apr 2023 13:53:16 +0000 (14:53 +0100)]
[X86] combinePTESTCC - attempt to use TESTPS/TESTPD instead of MOVMSKPS/MOVMSKPD for all-of cases with all-sign values.
We can probably be more aggressive with TESTPS/TESTPD (instead of relying on the SimplifyMultipleUseDemandedBits call) - I've updated an existing TODO to suggest this for now.
Part of Issue #60007
Simon Pilgrim [Wed, 12 Apr 2023 13:47:38 +0000 (14:47 +0100)]
[X86] Cleanup reduction cost table names. NFC.
We merged the costs for split/pairwise reductions some time ago.
Nikita Popov [Wed, 12 Apr 2023 14:29:58 +0000 (16:29 +0200)]
[GVN][VNCoercion] Remove load widening leftovers (NFCI)
GVN load widening was disabled in D24096. This removes various
support code that is no longer relevant.
The way this works nowadays is that we return PartialAlias with
an offset from BasicAA and this gets passed on as a clobber by
MDA. However, PartialAlias will only be returned if the load is
properly nested inside the other load.
This just removes the bulk of the code, but some additional
cleanup can be done here now that we don't need to distinguish
between load and store cases.
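
The "properly nested" condition above can be pictured with a byte-level sketch. This is only an illustration under assumptions: `isProperlyNested` and `forwardByte` are hypothetical helpers, not LLVM APIs, and the real analysis works on IR and alias results rather than raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): true when the byte range
// [Offset, Offset + NarrowSize) lies entirely inside [0, WideSize).
inline bool isProperlyNested(std::size_t Offset, std::size_t NarrowSize,
                             std::size_t WideSize) {
  return NarrowSize <= WideSize && Offset <= WideSize - NarrowSize;
}

// Hypothetical helper: forward a one-byte load from the value produced by an
// earlier four-byte load, but only when the narrow load is nested inside it.
inline bool forwardByte(std::uint32_t WideValue, std::size_t Offset,
                        std::uint8_t &Out) {
  if (!isProperlyNested(Offset, sizeof(Out), sizeof(WideValue)))
    return false;
  unsigned char Bytes[sizeof(WideValue)];
  std::memcpy(Bytes, &WideValue, sizeof(WideValue)); // in-memory byte order
  Out = Bytes[Offset];
  return true;
}

// Small usage check: byte 1 of the wide load is forwarded, byte 4 is not.
inline void checkForwarding() {
  const unsigned char Mem[4] = {0x11, 0x22, 0x33, 0x44};
  std::uint32_t Wide;
  std::memcpy(&Wide, Mem, sizeof(Wide));
  std::uint8_t B = 0;
  assert(forwardByte(Wide, 1, B) && B == 0x22);
  assert(!forwardByte(Wide, 4, B));
}
```

Copying through `memcpy` keeps the sketch independent of host endianness, since the offset indexes the in-memory bytes rather than a shifted integer.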
Younan Zhang [Mon, 10 Apr 2023 07:08:45 +0000 (15:08 +0800)]
[clang] Implement CWG 2397
This patch implements CWG 2397, allowing arrays of placeholder type
to be valid.
See also https://wg21.cmeerw.net/cwg/issue2397.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D147909
Nikita Popov [Wed, 12 Apr 2023 14:16:15 +0000 (16:16 +0200)]
[GVN] Regenerate test checks (NFC)
Sebastian Neubauer [Wed, 12 Apr 2023 14:15:09 +0000 (16:15 +0200)]
[AMDGPU] Fix amdgpu_gfx tail-call test
The inreg argument prevented the tail call optimization from kicking in.
Remove the inreg, so this test actually uses a tail call.
Note that it now uses s[4:5] for the return address, which is invalid,
because these registers are supposed to be callee-save.
D147096 tried to fix that problem for the C calling convention.
Differential Revision: https://reviews.llvm.org/D148119
Florian Hahn [Wed, 12 Apr 2023 14:07:48 +0000 (15:07 +0100)]
[Matrix] Fix crash during dot product lowering.
Perform dot-product lowering before instruction fusion to avoid crash in
newly added test. Also update lowerDotProduct to properly mark optimized
matmul as fused.
Sergio Afonso [Mon, 10 Apr 2023 13:19:11 +0000 (14:19 +0100)]
[Flang][Driver][OpenMP] Enable options for selecting offloading phases in flang
This patch unlocks the "--offload-device-only", "--offload-host-only" and
"--offload-host-device" options available in Clang for use by the Flang driver.
These can be used to modify the behavior of the driver to select which
compilation invocations are triggered during OpenMP offloading.
Differential Revision: https://reviews.llvm.org/D147941
Nikita Popov [Wed, 12 Apr 2023 13:46:10 +0000 (15:46 +0200)]
[GVN] Add additional metadata adjustment tests (NFC)
Max Kazantsev [Wed, 12 Apr 2023 13:34:56 +0000 (20:34 +0700)]
[SimpleLoopUnswitch] Do not try to inject pointer conditions. PR62058
As shown in https://github.com/llvm/llvm-project/issues/62058, canonicalization
may fail with pointer types (and basically this transform is not expected to
work with pointers).
Max Kazantsev [Wed, 12 Apr 2023 13:33:39 +0000 (20:33 +0700)]
[Test] Add XFAIL test for PR62058
Details at https://github.com/llvm/llvm-project/issues/62058
Hans Wennborg [Wed, 12 Apr 2023 13:28:13 +0000 (15:28 +0200)]
[profile] Make __llvm_profile_global_timestamp static to unbreak Darwin
See comments on https://reviews.llvm.org/D147287
LLVM GN Syncbot [Wed, 12 Apr 2023 13:18:39 +0000 (13:18 +0000)]
[gn build] Port e2b15ec235fe
LLVM GN Syncbot [Wed, 12 Apr 2023 13:18:38 +0000 (13:18 +0000)]
[gn build] Port a6d9730f403a
Nico Weber [Wed, 12 Apr 2023 13:18:07 +0000 (09:18 -0400)]
Akash Banerjee [Wed, 12 Apr 2023 12:08:37 +0000 (13:08 +0100)]
[MLIR][OpenMP] Update OpenMPIRBuilderTest to use opaque pointers
This patch updates all tests to use opaque pointers.
Differential Revision: https://reviews.llvm.org/D147599
Nicolas Vasilache [Wed, 12 Apr 2023 09:09:17 +0000 (02:09 -0700)]
[mlir][Linalg] Allow linalg.copy to be vectorized with masking
Differential Revision: https://reviews.llvm.org/D148095
Nikolas Klauser [Mon, 10 Apr 2023 17:31:22 +0000 (19:31 +0200)]
[libc++][NFC] rename __is_trivially_equality_comparable to __libcpp_is_trivially_equality_comparable
This is required for D147175.
Reviewed By: ldionne, Mordante, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D147953
Nikolas Klauser [Tue, 4 Apr 2023 11:05:59 +0000 (13:05 +0200)]
[libc++] Rename __tuple_dir back to __tuple
This essentially reverts D139270
Reviewed By: #libc, EricWF
Spies: tahonermann, libcxx-commits, arichardson
Differential Revision: https://reviews.llvm.org/D147519
Muhammad Omair Javaid [Wed, 12 Apr 2023 12:11:58 +0000 (17:11 +0500)]
[BOLT] Fix section-end-sym.s test to only run x86/Linux
section-end-sym.s contains x86_64 assembly instructions that are executed on the target.
I have changed the REQUIRES: field from system-linux to x86_64-linux.
This came up while testing LLVM 16.0.1 release on AArch64 Linux.
Dmitry Makogon [Wed, 12 Apr 2023 11:44:40 +0000 (18:44 +0700)]
[LoopUtils] Add isKnownPositiveInLoop and isKnownNonPositiveInLoop functions
Nicolas Vasilache [Wed, 12 Apr 2023 09:12:31 +0000 (02:12 -0700)]
[mlir][Linalg] Add support for tiling tensor.pad to scf.forall
Also, properly propagate the nofold attribute.
Differential Revision: https://reviews.llvm.org/D148114
Hans Wennborg [Wed, 12 Apr 2023 11:28:25 +0000 (13:28 +0200)]
Revert "Move "auto-init" instructions to the dominator of their users"
This could also move initialization of sret args, causing actually
initialized parts of such return values to be uninitialized. See
discussion on the code review.
> As a result of -ftrivial-auto-var-init, clang generates instructions to
> set alloca'd memory to a given pattern, right after the allocation site.
> In some cases, this (somewhat costly) operation could be delayed, leading
> to conditional execution in some cases.
>
> This is not an uncommon situation: it happens ~500 times on the cPython
> code base, and much more on the LLVM codebase. The benefit greatly
> varies on the execution path, but it should not regress on performance.
>
> This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with
> MemorySSA update fixes.
>
> Differential Revision: https://reviews.llvm.org/D137707
This reverts commit 50b2a113db197a97f60ad2aace8b7382dc9b8c31
and follow-up commit ad9ad3735c4821ff4651fab7537a75b8f0bb60f8.
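
For context, the quoted description refers to code shaped roughly like the sketch below. The function `checksum` is hypothetical and only illustrates a local buffer that is used on one path; where the compiler actually places the pattern fill is described in the comments as an assumption based on the quoted text, not something the example verifies.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical example function. When built with Clang's
// -ftrivial-auto-var-init=pattern, Buffer is filled with a pattern right after
// its allocation, i.e. before the early return below, even though the early
// return path never reads it. The reverted change tried to sink that fill to
// the dominator of the buffer's users, so it would only run when Verbose is true.
inline int checksum(bool Verbose, const char *Msg) {
  char Buffer[256];
  if (!Verbose)
    return 0; // Buffer is never touched on this path
  std::strncpy(Buffer, Msg, sizeof(Buffer) - 1);
  Buffer[sizeof(Buffer) - 1] = '\0';
  int Sum = 0;
  for (const char *P = Buffer; *P; ++P)
    Sum += *P;
  return Sum;
}

// Small usage check covering both paths.
inline void checkChecksum() {
  assert(checksum(false, "ab") == 0);
  assert(checksum(true, "ab") == 'a' + 'b');
}
```

The sret problem mentioned in the revert message is not modeled here; the sketch only shows the conditional-use pattern that motivated the original change.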
OCHyams [Wed, 12 Apr 2023 11:35:17 +0000 (12:35 +0100)]
Reapply (4) "[Assignment Tracking] Enable by default"
Re-land D146987.
This reverts commit 8af575657b1dc1113640286b3649842c2473c2cf,
which reverts D146987.
OCHyams [Wed, 12 Apr 2023 11:28:59 +0000 (12:28 +0100)]
[Assignment Tracking] Fix assertion in AssignmentTrackingPass::runOnFunction
The assertion exists to ensure all variables passed into `trackAssignments` end
up with dbg.assigns associated with their backing allocas. The assertion
compared the passed-in and tracked variables using `DebugVariable`, which
includes the fragment as part of the variable identity.
It is possible for the backing alloca to be smaller than a variable (see test
case). In this case the input variable `(Var X, no fragment, no InlinedAt)`
isn't equal to the dbg.assign variable `(Var X, some fragment, no
InlinedAt)`. To cover this case the assertion now ignores fragments through the
use of `DebugVariableAggregate`.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D148100
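
The difference between the two kinds of variable identity in the assertion fix above can be sketched with simplified stand-in types. `VarWithFragment`, `VarAggregate` and `toAggregate` are hypothetical names for illustration; they only mimic the idea of `DebugVariable` (fragment included) and `DebugVariableAggregate` (fragment ignored) and are not the LLVM classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the LLVM classes.
struct Fragment {
  bool Present;
  unsigned OffsetInBits;
  unsigned SizeInBits;
};

// Models an identity that includes the fragment.
struct VarWithFragment {
  std::string Name;
  Fragment Frag;
  const void *InlinedAt;
};

// Models an identity that ignores the fragment.
struct VarAggregate {
  std::string Name;
  const void *InlinedAt;
};

inline bool operator==(const Fragment &A, const Fragment &B) {
  return A.Present == B.Present && A.OffsetInBits == B.OffsetInBits &&
         A.SizeInBits == B.SizeInBits;
}

inline bool operator==(const VarWithFragment &A, const VarWithFragment &B) {
  return A.Name == B.Name && A.Frag == B.Frag && A.InlinedAt == B.InlinedAt;
}

inline bool operator==(const VarAggregate &A, const VarAggregate &B) {
  return A.Name == B.Name && A.InlinedAt == B.InlinedAt;
}

// Drop the fragment so that all pieces of one variable compare equal.
inline VarAggregate toAggregate(const VarWithFragment &V) {
  return VarAggregate{V.Name, V.InlinedAt};
}

// The case from the commit message: the whole variable versus a fragment of it.
inline void checkIdentity() {
  VarWithFragment Input{"X", {false, 0, 0}, nullptr};   // no fragment
  VarWithFragment Tracked{"X", {true, 0, 32}, nullptr}; // some fragment
  assert(!(Input == Tracked));                          // old comparison fails
  assert(toAggregate(Input) == toAggregate(Tracked));   // new comparison holds
}
```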
Alexander Kornienko [Wed, 12 Apr 2023 10:30:23 +0000 (12:30 +0200)]
[lldb] Reduce chances of spurious failures in some build setups
The test may fail when running from a directory that contains the string used in
CHECK-NOT. We observe a flakiness rate of around 3/100000. Increasing the length
helps reduce the rate of failures.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D148099
Nicolas Vasilache [Wed, 12 Apr 2023 10:51:08 +0000 (03:51 -0700)]
[mlir][Linalg] Add a structured transform to materialize a tensor.insert_slice via a linalg.copy
This is useful to materialize copies explicitly before bufferization and
transform them, avoiding the need to rediscover them after bufferization.
Differential Revision: https://reviews.llvm.org/D148108
Alex Zinenko [Mon, 3 Apr 2023 12:59:49 +0000 (12:59 +0000)]
[mlir] Add transform.foreach_match
Add a new transform op combinator that implements an "if-then-else"
style of mechanism for applying transformations. Its main purpose is to
serve as a higher-level driver when applying multiple transform scripts
to potentially overlapping pieces of the payload IR. This is similar to
how the various rewrite drivers operate in C++, but at a higher level
and with more declarative expressions. This is not intended to replace
existing pattern-based rewrites, but to drive more complex
transformations that are exposed in the transform dialect and are too
complex to be expressed as simple declarative rewrites.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D148013
Dmitry Makogon [Wed, 12 Apr 2023 08:55:08 +0000 (15:55 +0700)]
[Test] Add ranges for some expressions in some SCEV tests (NFC)
Simon Pilgrim [Wed, 12 Apr 2023 10:54:18 +0000 (11:54 +0100)]
[X86] SimplifyDemandedBitsForTargetNode - add TESTPS/TESTPD support
We only need the sign bits from these nodes
Another step towards Issue #60007
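
Why only the sign bits matter can be seen from a scalar model of the instruction's flag results. The sketch below models the documented ZF/CF behaviour of TESTPS on four 32-bit lanes; the names `testpsZF`, `testpsCF`, `movmskps` and `keepSignBits` are made up for this illustration, and it models the instruction semantics only, not the DAG combine itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint32_t, 4>; // bit patterns of four float lanes

constexpr std::uint32_t SignBit = 0x80000000u;

// ZF of TESTPS: set when no lane of (Dest & Src) has its sign bit set.
inline bool testpsZF(const Lanes &Dest, const Lanes &Src) {
  for (std::size_t I = 0; I < 4; ++I)
    if ((Dest[I] & Src[I]) & SignBit)
      return false;
  return true;
}

// CF of TESTPS: set when no lane of (~Dest & Src) has its sign bit set.
inline bool testpsCF(const Lanes &Dest, const Lanes &Src) {
  for (std::size_t I = 0; I < 4; ++I)
    if ((~Dest[I] & Src[I]) & SignBit)
      return false;
  return true;
}

// MOVMSKPS: collect the sign bit of each lane into a 4-bit mask.
inline unsigned movmskps(const Lanes &V) {
  unsigned Mask = 0;
  for (std::size_t I = 0; I < 4; ++I)
    if (V[I] & SignBit)
      Mask |= 1u << I;
  return Mask;
}

// Clear every bit except the sign bit in each lane.
inline Lanes keepSignBits(const Lanes &V) {
  Lanes R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = V[I] & SignBit;
  return R;
}

// Flags are unchanged when all non-sign bits of the operands are dropped.
inline void checkSignBitsOnly() {
  const Lanes A = {0x80000001u, 0x7fffffffu, 0xffffffffu, 0x00000000u};
  const Lanes B = {0x00000001u, 0xffffffffu, 0x7fffffffu, 0x80000000u};
  assert(testpsZF(A, B) == testpsZF(keepSignBits(A), keepSignBits(B)));
  assert(testpsCF(A, B) == testpsCF(keepSignBits(A), keepSignBits(B)));
}
```

In this model, an all-of test written as `movmskps(X) == 0xF` gives the same answer as `testpsCF(X, AllOnes)` with every lane of `AllOnes` set to `0xffffffff`, which is the kind of equivalence the TESTPS/TESTPD work relies on.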
Matt Arsenault [Tue, 11 Apr 2023 13:50:11 +0000 (09:50 -0400)]
InstCombine: Add some additional is.fpclass tests
Add some tests to generalize the clearing of known bits using
computeKnownFPClass instead of isKnownNeverNaN/isKnownNeverInfinity.
Matt Arsenault [Wed, 12 Apr 2023 02:24:33 +0000 (22:24 -0400)]
unittests: Use opaque pointers in a test
Matt Arsenault [Sat, 8 Apr 2023 13:21:31 +0000 (09:21 -0400)]
ValueTracking: Handle no-nan check for computeKnownFPClass for fadd/fsub
Copy the logic from isKnownNeverNaN for fadd/fsub.
Matt Arsenault [Sat, 8 Apr 2023 23:06:36 +0000 (19:06 -0400)]
ValueTracking: Remove outdated todo
Matt Arsenault [Thu, 26 Jan 2023 19:55:42 +0000 (15:55 -0400)]
AMDGPU: Push fneg into bitcast of integer select
Avoids some regressions in the math libraries in a future
patch.
Zahira Ammarguellat [Thu, 6 Apr 2023 19:21:14 +0000 (15:21 -0400)]
Set 'rounding_mode' to 'tonearest' with '#pragma STDC FENV_ACCESS OFF'.
In strict mode the 'rounding_mode' is set to 'dynamic'. Using this pragma to
get out of strict mode doesn't have any effect on the 'rounding_mode'.
See https://godbolt.org/z/zoGTf4j1G
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D147733