Amaury Séchet [Sat, 30 Apr 2022 23:45:28 +0000 (23:45 +0000)]
[DAG] Peek through trunc when combining select into shifts.
This fixes a regression in D127115
Depends on D127115
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151916
Sheng [Fri, 23 Jun 2023 00:30:53 +0000 (08:30 +0800)]
[m68k] Fix incorrect handling of TLS when matching addressing mode.
`TargetGlobalTLSAddress` is not considered and handled correctly when matching addressing mode, which leads to an incorrect result of instruction selection.
Fixes #63162.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D153103
Vitaly Buka [Fri, 23 Jun 2023 00:13:11 +0000 (17:13 -0700)]
[nfc][msan] Clang-format includes
Vitaly Buka [Fri, 23 Jun 2023 00:05:02 +0000 (17:05 -0700)]
[msan] Release origin pages with shadow
Vitaly Buka [Fri, 23 Jun 2023 00:00:55 +0000 (17:00 -0700)]
[test][hwasan] Reformat comments
Vitaly Buka [Thu, 22 Jun 2023 23:58:56 +0000 (16:58 -0700)]
[test][hwasan] Describe why HWASAN does not work
Alex Langford [Thu, 22 Jun 2023 23:04:06 +0000 (16:04 -0700)]
[lldb] Adjust for changes in objc runtime
The Objective-C runtime and the shared cache have changed slightly.
Given a class_ro_t, the baseMethods ivar is now a pointer union and may
either be a method_list_t pointer or a pointer to a relative list of
lists. The entries of this relative list of lists are indexes that refer
to a specific image in the shared cache in addition to a pointer offset
to find the accompanying method_list_t. We have to go over each of these
entries, parse it, and then if the relevant image is loaded in the
process, we add those methods to the relevant clang Decl.
In order to determine if an image is loaded, the Objective-C runtime
exposes a symbol that lets us determine if a particular image is loaded.
We maintain a data structure SharedCacheImageHeaders to keep track of
that information.
There is a known issue where if an image is loaded after we create a
Decl for a class, the Decl will not have the relevant methods from that
image (i.e. for Categories).
rdar://107957209
Differential Revision: https://reviews.llvm.org/D153597
Matt Arsenault [Thu, 22 Jun 2023 22:15:19 +0000 (18:15 -0400)]
Revert "AMDGPU: Use generic helper for skipping over allocas"
This reverts commit aa7e09ebd38c5f23f6d7d6d8394a2aea04715ba9.
Matt Arsenault [Thu, 22 Jun 2023 21:16:43 +0000 (17:16 -0400)]
AMDGPU: Use generic helper for skipping over allocas
Daniel Hoekwater [Thu, 22 Jun 2023 03:56:14 +0000 (03:56 +0000)]
[MC] Detect out of range jumps further than 2^32 bytes
On AArch64, object files may be greater than 2^32 bytes. If an
offset is greater than the max value of a 32-bit unsigned integer,
LLVM silently truncates the offset. Instead, make it return an
error.
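A minimal sketch of the idea, not the actual MC code: the helper name `narrowOffset` is hypothetical, and the real patch reports the failure through LLVM's own error handling rather than `std::optional`.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: narrows a 64-bit offset to 32 bits, returning
// std::nullopt when the value does not fit instead of silently truncating.
std::optional<std::uint32_t> narrowOffset(std::uint64_t Offset) {
  if (Offset > std::numeric_limits<std::uint32_t>::max())
    return std::nullopt; // the caller is expected to report an error here
  return static_cast<std::uint32_t>(Offset);
}
```

A caller would check the result and emit a diagnostic when it is empty, so an offset of 2^32 or more is rejected rather than wrapped.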
Differential Revision: https://reviews.llvm.org/D153494
Fangrui Song [Thu, 22 Jun 2023 21:51:08 +0000 (14:51 -0700)]
[MC] Suppress -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D153096
Philip Reames [Thu, 22 Jun 2023 21:47:03 +0000 (14:47 -0700)]
[RISCV] Document overview of vector pseudos [nfc]
I tried to give a rough overview of our current pseudo structure. I'm mostly focused on the policy handling bits - since that's what I'm in the process of changing - but touched on the other dimensions in the process of framing it.
Differential Revision: https://reviews.llvm.org/D152937
Shatian Wang [Thu, 22 Jun 2023 21:26:21 +0000 (14:26 -0700)]
[BOLT] Fixing relative ordering of cold sections under multi-way function splitting
Order code sections with names in the form of ".text.cold.i" based on the value of i
[Context] SplitFunctions.cpp implements splitting strategies that can potentially split each function into at most N>2 fragments.
When such N-way splitting happens, new code sections with names ".text.cold.1", ..., ".text.cold.i", ... ".text.cold.N-2" will be created.
A section with name ".text.cold.i" contains the (i+2)th fragment of each function.
As an example, if each function is split into N=3 fragments: hot, warm, cold, then code sections will now include
- a section with name ".text" containing hot fragments
- a section with name ".text.cold" containing warm fragments
- a section with name ".text.cold.1" containing cold fragments
The order of these new sections in the output binary currently depends on the order in which they are encountered by the emitter.
For example, under N=3-way splitting, if the first function is 2-way split into hot and cold and the second function is 3-way split into hot, warm, and cold,
then the cold fragment is encountered first, resulting in the final sections being in the following order
.text (hot), .text.cold.1 (cold), .text.cold (warm)
The above is suboptimal because the distance of jumps/calls between the hot and the warm sections will be much bigger than when ordering the sections as follows
.text (hot), .text.cold (warm), .text.cold.1 (cold)
This diff orders the sections with names in the form of ".text.cold" or ".text.cold.i" based on the value of i (assuming the i-value of ".text.cold" is 0).
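The ordering rule can be sketched as follows. This is an illustration only: `coldIndex` and `orderSections` are hypothetical names, not BOLT functions, and names that do not match the pattern are simply given key -1 so they sort first.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: maps ".text.cold" to 0 and ".text.cold.<i>" to i.
// Any other name gets -1 so that it sorts ahead of the cold sections.
long coldIndex(const std::string &Name) {
  const std::string Prefix = ".text.cold";
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return -1;
  if (Name.size() == Prefix.size())
    return 0; // plain ".text.cold" is treated as i = 0
  if (Name[Prefix.size()] != '.')
    return -1;
  const std::string Suffix = Name.substr(Prefix.size() + 1);
  if (Suffix.empty() ||
      !std::all_of(Suffix.begin(), Suffix.end(),
                   [](unsigned char C) { return C >= '0' && C <= '9'; }))
    return -1;
  return std::stol(Suffix);
}

// Orders section names by their i-value. stable_sort keeps the original
// encounter order for names that share a key.
std::vector<std::string> orderSections(std::vector<std::string> Names) {
  std::stable_sort(Names.begin(), Names.end(),
                   [](const std::string &A, const std::string &B) {
                     return coldIndex(A) < coldIndex(B);
                   });
  return Names;
}
```

Applied to the encounter order from the example, `{".text", ".text.cold.1", ".text.cold"}`, this yields hot, warm, cold.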
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D152941
Maksim Panchenko [Tue, 20 Jun 2023 22:27:31 +0000 (15:27 -0700)]
[BOLT] Remove unnecessary diagnostics
When optimization passes do not change anything, skip their diagnostics
output. NFC otherwise.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D153386
Jonas Devlieghere [Thu, 22 Jun 2023 15:57:30 +0000 (08:57 -0700)]
[lldb] Fix variable name mismatch between signature and docs (NFC)
The variable is named `bundle_dir` but the documentation referenced
`directory` which generated a warning.
Tomasz Kuchta [Wed, 21 Jun 2023 21:24:33 +0000 (21:24 +0000)]
[DFSAN] Add support for _tolower
I noticed that in some cases _tolower shows as uninstrumented - I've added it as "functional" in the done_abilist.txt file
Reviewed by: browneee
Differential Revision: https://reviews.llvm.org/D153410
Fangrui Song [Thu, 22 Jun 2023 20:44:15 +0000 (13:44 -0700)]
[MC,x86-32] Remove a gold<2.34 workaround
This workaround appears to apply with gold<2.34 -O2/-O3 (linker -O2, not
compiler driver -O2). This used to be more visible as we used -Wl,-O3 in
CMake, but the option is generally not recommended and has been removed
by d63016a86548e8231002a760bbe9eb817cd1eb00 (Dec 2021).
This finishes a workaround removal work started by D64327 (2019).
Link: https://github.com/llvm/llvm-project/issues/45269
Manna, Soumi [Thu, 22 Jun 2023 20:24:53 +0000 (13:24 -0700)]
[CLANG] Fix Static Code Analyzer Concerns with bad bit right shift operation in getNVPTXLaneID()
In getNVPTXLaneID(CodeGenFunction &), the value of LaneIDBits might be
4294967295, since the call llvm::Log2_32(CGF.getTarget().getGridValue().GV_Warp_Size)
might return 4294967295.
  unsigned LaneIDBits =
      llvm::Log2_32(CGF.getTarget().getGridValue().GV_Warp_Size);
  unsigned LaneIDMask = ~0u >> (32u - LaneIDBits);
The shift amount (32u - LaneIDBits) might then be 33, and right-shifting a 32-bit value by more than 31 bits is undefined behavior.
This patch adds an assert on LaneIDBits to guard the shift used to compute LaneIDMask.
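A self-contained sketch of the guarded computation, under stated assumptions: `log2Floor` is a stand-in written here to mimic `llvm::Log2_32` (including the 4294967295 result for an input of 0), `laneIDMask` is a hypothetical name, and the assert condition is this sketch's own choice rather than the exact one in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::Log2_32: floor(log2(Value)), or 0xFFFFFFFF for 0.
std::uint32_t log2Floor(std::uint32_t Value) {
  std::uint32_t Result = 0xFFFFFFFFu; // wraps to 0 on the first increment
  while (Value != 0) {
    ++Result;
    Value >>= 1;
  }
  return Result;
}

// Hypothetical helper: builds a mask with the low log2(WarpSize) bits set.
std::uint32_t laneIDMask(std::uint32_t WarpSize) {
  std::uint32_t LaneIDBits = log2Floor(WarpSize);
  // Keep the shift amount (32 - LaneIDBits) within [1, 31].
  assert(LaneIDBits >= 1 && LaneIDBits <= 31 && "invalid LaneIDBits");
  return ~0u >> (32u - LaneIDBits);
}
```

With a warp size of 32, `LaneIDBits` is 5 and the mask is 31; a warp size of 0 would trip the assert instead of shifting by 33.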
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D151606
Peter Klausler [Thu, 22 Jun 2023 16:25:03 +0000 (09:25 -0700)]
[flang] Avoid crash in statement function error case
The predicate IsPureProcedure() crashes with infinite
recursion when presented with mutually recursive statement
functions -- an error case that should be recoverable.
Fix by adding a visited set.
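The visited-set technique can be illustrated with a toy model. Everything here (`Proc`, `ProcTable`, `isPure`) is hypothetical and much simpler than flang's symbol tables; it only shows how remembering visited names stops mutual recursion from looping forever.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: a function is pure if it is marked pure and everything it
// references is pure as well.
struct Proc {
  bool Pure;
  std::vector<std::string> Callees;
};
using ProcTable = std::map<std::string, Proc>;

bool isPureImpl(const ProcTable &Table, const std::string &Name,
                std::set<std::string> &Visited) {
  // A name seen before is already being examined further up the call
  // chain, so do not recurse into it again.
  if (!Visited.insert(Name).second)
    return true;
  auto It = Table.find(Name);
  if (It == Table.end() || !It->second.Pure)
    return false;
  for (const std::string &Callee : It->second.Callees)
    if (!isPureImpl(Table, Callee, Visited))
      return false;
  return true;
}

bool isPure(const ProcTable &Table, const std::string &Name) {
  std::set<std::string> Visited;
  return isPureImpl(Table, Name, Visited);
}
```

Without the `Visited` check, two entries that reference each other would recurse until the stack overflowed; with it, the query terminates and the erroneous program can be diagnosed elsewhere.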
Fixes bug https://github.com/llvm/llvm-project/issues/63231
Differential Revision: https://reviews.llvm.org/D153569
Jon Chesterfield [Thu, 22 Jun 2023 20:18:43 +0000 (21:18 +0100)]
[libc] Can build amdgpu libc even if rocm is missing
Clang defaults to failing to build if it can't find rocm device libs
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D153581
Valentin Clement [Thu, 22 Jun 2023 20:10:50 +0000 (13:10 -0700)]
[flang][openacc] Add lowering support for multi-dimensional arrays reduction
Lower multi-dimensional array reductions for the add and mul operators.
Depends on D153448
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D153455
Valentin Clement [Thu, 22 Jun 2023 20:09:15 +0000 (13:09 -0700)]
[flang][openacc] Add lowering support for 1d array reduction for add/mul operator
Lower 1d array reductions for the add and mul operators. Multi-dimensional arrays and
other operators will follow.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D153448
Manna, Soumi [Thu, 22 Jun 2023 19:52:25 +0000 (12:52 -0700)]
[CLANG] Fix potential null pointer dereference bugs
This patch uses castAs instead of getAs, which asserts if the type doesn't match, and adds nullptr checks where needed.
It also simplifies the code by passing I.getData() instead of doing a lookup in dumpVarDefinitionName(),
since we're already iterating over the same map in LocalVariableMap::dumpContext().
Reviewed By: aaron.ballman, aaronpuchert
Differential Revision: https://reviews.llvm.org/D153033
Vitaly Buka [Thu, 22 Jun 2023 18:17:12 +0000 (11:17 -0700)]
[NFC][asan] Add FIXME for a possible optimization
Manna, Soumi [Thu, 22 Jun 2023 19:30:01 +0000 (12:30 -0700)]
[Clang] Fix Static Code Analysis Concerns with copy without assign
This patch adds the missing assignment operator to a class that has a user-defined copy constructor.
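As a generic illustration of the pattern the analyzer flags (not the Clang class touched by the patch), here is a hypothetical `Buffer` type whose user-defined copy constructor is paired with a matching copy assignment operator.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical owning type. Because it defines its own copy constructor,
// it also needs a copy assignment operator with the same deep-copy logic.
class Buffer {
public:
  explicit Buffer(const char *Text)
      : Size(std::strlen(Text)), Data(new char[Size + 1]) {
    std::memcpy(Data, Text, Size + 1);
  }
  Buffer(const Buffer &Other)
      : Size(Other.Size), Data(new char[Other.Size + 1]) {
    std::memcpy(Data, Other.Data, Size + 1);
  }
  Buffer &operator=(const Buffer &Other) {
    if (this != &Other) {
      // Allocate and copy first so *this stays valid if new throws.
      char *Copy = new char[Other.Size + 1];
      std::memcpy(Copy, Other.Data, Other.Size + 1);
      delete[] Data;
      Data = Copy;
      Size = Other.Size;
    }
    return *this;
  }
  ~Buffer() { delete[] Data; }
  const char *c_str() const { return Data; }

private:
  std::size_t Size; // declared before Data so it is initialized first
  char *Data;
};
```

If the assignment operator were left implicit, `B = A` would copy the raw pointer and both objects would later free the same storage.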
Reviewed By: tahonermann, aaronpuchert
Differential Revision: https://reviews.llvm.org/D150931
Vitaly Buka [Thu, 22 Jun 2023 06:18:25 +0000 (23:18 -0700)]
[asan] Don't double poison secondary allocations
Sanitizers allocate shadow and memory as MAP_NORESERVE.
User memory can stay this way and does not increase RSS as long as we
don't store there.
The shadow unpoisoning can also avoid an RSS increase for zeroed pages.
However, as soon as we poison the shadow, we need the page in RSS.
To avoid an unnecessary RSS increase, we should not poison memory just before
unpoisoning it.
Depends on D153497.
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D153500
Fangrui Song [Thu, 22 Jun 2023 19:24:19 +0000 (12:24 -0700)]
[MC] Fold A-B when A's fragment precedes B's fragment
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
when
* A and B are in the same fragment, or
* A's fragment succeeds B's fragment, and they are not separated by non-data fragments (D69411)
This patch allows folding when A's fragment precedes B's fragment so
that `9997b - . == 0` below can be evaluated as true:
```
nop
.arch_extension sec
9997:nop
// old behavior: error: expected absolute expression
.if 9997b - . == 0
.endif
```
Add a case to llvm/test/MC/ARM/directive-if-subtraction.s.
Note: for MCAsmStreamer, we cannot evaluate `.if . - 9997b == 0` at parse
time due to MCAsmStreamer::getAssemblerPtr returning nullptr (D45164).
Some Darwin tests check that this folding does not work. Add `.p2align 2` to
block some label difference folding or adjust the tests.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D153096
Florian Hahn [Thu, 22 Jun 2023 19:15:21 +0000 (20:15 +0100)]
[LV] Add test with reduction start values that are/may be poison/undef.
Test cases for #62565.
Joseph Huber [Thu, 22 Jun 2023 19:03:18 +0000 (14:03 -0500)]
[Clang] Disable `libc` headers for offloading languages
These headers are currently broken when included from the offloading
languages like OpenMP, OpenCL, CUDA, and HIP. Turn this logic off so we
can compile these languages when the GPU libc is installed. I am
currently trying to remedy this and have made an RFC for it in libc,
see https://discourse.llvm.org/t/rfc-implementing-gpu-headers-in-the-llvm-c-library/71523.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D153578
Fangrui Song [Thu, 22 Jun 2023 19:11:52 +0000 (12:11 -0700)]
[MC][test] Clean up MC/ARM/directive-if-subtraction.s
Manna, Soumi [Thu, 22 Jun 2023 19:05:30 +0000 (12:05 -0700)]
[CLANG] Fix uninitialized scalar field issues
Reviewed By: erichkeane, steakhal, tahonermann, shafik
Differential Revision: https://reviews.llvm.org/D150744
Sam McCall [Thu, 22 Jun 2023 02:27:20 +0000 (04:27 +0200)]
[dataflow] Avoid copying environment
This appears to be just an accidental copy rather than move from a scratch
variable.
As well as doing redundant work, these copies introduce extra SAT variables
which make debugging harder (each Environment has a unique FC token).
Example flow condition before:
```
(B0:1 = V15)
(B1:1 = V8)
(B2:1 = V10)
(B3:1 = (V4 & (!V7 => V6)))
(V10 = (B3:1 & !V7))
(V12 = B1:1)
(V13 = B2:1)
(V15 = (V12 | V13))
(V3 = V2)
(V4 = V3)
(V8 = (B3:1 & !!V7))
B0:1
V2
```
after:
```
(B0:1 = (V9 | V10))
(B1:1 = (B3:1 & !!V6))
(B2:1 = (B3:1 & !V6))
(B3:1 = (V3 & (!V6 => V5)))
(V10 = B2:1)
(V3 = V2)
(V9 = B1:1)
B0:1
V2
```
(with labelling from D153488)
There are also some more copies that can be avoided here (when multiple blocks
without terminating statements are joined), but they're less trivial, so I'll
put those in another patch.
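The accidental copy described at the top of this message is easy to reproduce in isolation. The sketch below uses a hypothetical `Env` type with a copy counter; it is not the dataflow framework's Environment class.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in that counts how often it is copied.
struct Env {
  static int Copies;
  std::vector<int> Tokens;

  Env() = default;
  Env(const Env &Other) : Tokens(Other.Tokens) { ++Copies; }
  Env(Env &&) noexcept = default;
  Env &operator=(const Env &Other) {
    Tokens = Other.Tokens;
    ++Copies;
    return *this;
  }
  Env &operator=(Env &&) noexcept = default;
};
int Env::Copies = 0;

// Copies from the scratch variable even though it is never used again.
Env buildByCopy() {
  Env Scratch;
  Scratch.Tokens = {1, 2, 3};
  Env Result;
  Result = Scratch;
  return Result;
}

// Moves from the scratch variable, so no copy is made.
Env buildByMove() {
  Env Scratch;
  Scratch.Tokens = {1, 2, 3};
  Env Result;
  Result = std::move(Scratch);
  return Result;
}
```

Both functions return the same contents; only the first one increments `Env::Copies`.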
Differential Revision: https://reviews.llvm.org/D153491
Amy Huang [Thu, 22 Jun 2023 18:36:30 +0000 (11:36 -0700)]
Revert "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."
Causes a clang crash (see crbug.com/1457256).
This reverts commit 015049338d7e8e0e81f2ad2f94e5a43e2e3f5220.
Joe Nash [Thu, 22 Jun 2023 15:31:50 +0000 (11:31 -0400)]
[AMDGPU] Add _e64_dpp asm suffix to docs
The _e64_dpp suffix can be added to an instruction to force the
AsmParser to encode it as VOP3 with DPP if possible on GFX11+. This has
been the behavior since GFX11 was introduced; this patch only updates
the documentation.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153564
Vitaly Buka [Thu, 22 Jun 2023 18:24:47 +0000 (11:24 -0700)]
Revert "[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication"
ComplexDeinterleavingPass.cpp:1849:3: error: default label in switch which covers all enumeration values
This reverts commit 116953b82130df1ebd817b3587b16154f659c013.
Craig Topper [Thu, 22 Jun 2023 18:01:11 +0000 (11:01 -0700)]
[RISCV] Sort the extensions in SupportedExtensions and SupportedExperimentalExtensions.
As the extension list continues to grow it probably makes sense
to use a binary search rather than linear search. Sorting the strings
will make this possible.
This also avoids any question about where to add new strings in
the tables.
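A small sketch of why sorting matters, assuming a made-up subset of extension names (`SortedExts` and `isSupported` are hypothetical, not the RISCVISAInfo tables): once the table is sorted, `std::binary_search` can replace a linear scan.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// Illustrative subset only; the entries must stay in sorted order.
constexpr std::array<std::string_view, 7> SortedExts = {
    "a", "c", "d", "f", "m", "zba", "zbb"};

// Sanity check that a table like the one above really is sorted.
bool tableIsSorted() {
  return std::is_sorted(SortedExts.begin(), SortedExts.end());
}

// O(log n) lookup, valid only because the table is sorted.
bool isSupported(std::string_view Ext) {
  return std::binary_search(SortedExts.begin(), SortedExts.end(), Ext);
}
```

A check like `tableIsSorted()` is the kind of invariant one might assert in debug builds so that a misplaced new entry is caught early.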
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D153170
Vitaly Buka [Thu, 22 Jun 2023 05:44:56 +0000 (22:44 -0700)]
[asan] Optimize Quarantine of secondary allocations
For the secondary allocation we don't need to poison and fill memory if we
skip quarantine, and we don't need to poison after quarantine. In both
cases the secondary allocator will unmap memory and unpoison the shadow
from get_allocator().Deallocate().
Depends on D153496.
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D153497
Matt Arsenault [Wed, 17 May 2023 09:18:51 +0000 (10:18 +0100)]
InstCombine: Fold select of ldexp to ldexp of select
The select-of-different-exp pattern appears in the device
libraries. I haven't seen the select-of-values case.
Matt Arsenault [Wed, 17 May 2023 08:10:39 +0000 (09:10 +0100)]
InstCombine: Add some baseline tests for ldexp combines
Paul Robinson [Thu, 22 Jun 2023 17:15:16 +0000 (10:15 -0700)]
[Headers] Fix up some conditionals
Florian Hahn [Thu, 22 Jun 2023 18:10:48 +0000 (19:10 +0100)]
[LSR] Return nullptr from getExpr if the result isn't invertible.
getExpr is missing a check to make sure the result is invertible.
This can lead to incorrect results, so return nullptr in those cases
like in other places in IVUsers.
Fixes #62660.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D153202
Yann Girsberger [Thu, 22 Jun 2023 17:57:46 +0000 (10:57 -0700)]
[opt] Exposing the parameters of LoopRotate to the -passes interface
There is a gap between running opt -Oz and running opt -passes="OZ_PASSES" where OZ_PASSES is taken from running opt -Oz -print-pipeline-passes.
One of the reasons for this is that -Oz uses a non-default setting for LoopRotate, but LoopRotate does not expose its settings when printing the pipeline.
This commit fixes this by exposing LoopRotate's parameters.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D153437
Aiden Grossman [Thu, 22 Jun 2023 18:01:57 +0000 (18:01 +0000)]
Revert "[llvm-exegesis] Add ability to assign perf counters to specific PID"
Revert "[llvm-exegesis] Introduce Subprocess Executor Mode"
This reverts commit 5e9173c43a9b97c8614e36d6f754317f731e71e9.
This reverts commit 4d618b52f6e05e41d35f56653cb36bf7d4dc794e.
Reverting the PID commit as it is currently breaking MinGW builds and
the way I'm checking for the presence of pid_t needs to be fixed and I
need to do some testing. The subprocess executor mode patch is a
dependent patch so also needs to be reverted and also needs some work as
it is currently failing tests where libpfm is installed and the kernel
version is less than 5.6.
Kamlesh Kumar [Thu, 22 Jun 2023 17:41:09 +0000 (23:11 +0530)]
[llvm] Refactor BalancedPartitioning for fixing build failure with MSVC
Fix a build failure on Windows with the MSVC toolchain.
Reviewed By: ellis
Differential Revision: https://reviews.llvm.org/D153318
Kazuki Sakamoto [Wed, 21 Jun 2023 00:30:29 +0000 (17:30 -0700)]
[lldb][Windows] Fix ZipFileResolver tests
D152759 introduced support for .so files inside Android .zip files, but it only
considered POSIX paths. The code also runs on Windows, so the path could be a Windows path.
Support both patterns on Windows.
Differential Revision: https://reviews.llvm.org/D153390
Vitaly Buka [Thu, 22 Jun 2023 17:34:19 +0000 (10:34 -0700)]
[NFC][asan] Add const to QuarantineCallback methods
Vitaly Buka [Thu, 22 Jun 2023 17:26:21 +0000 (10:26 -0700)]
[NFC][asan] Extract FillChunk
Vitaly Buka [Thu, 22 Jun 2023 05:26:01 +0000 (22:26 -0700)]
[NFC][asan] Add QuarantineCallback::{PreQuarantine,RecyclePassThrough}
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D153496
Michael Maitland [Wed, 21 Jun 2023 21:49:05 +0000 (14:49 -0700)]
[RISCV] Improve SiFive7 for reductions and ordered reductions
Since the scheduling resources for reductions and ordered reductions now
account for LMUL and SEW, we can modify the Latency and ResourceCycles
for these resources.
* Most reductions take a total of approx `vl*SEW/DLEN + 5*(4 + log2(DLEN/SEW))`
cycles.
* Ordered floating-point reductions take a total of approx `5*vl` cycles.
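The two approximations above can be written out as plain arithmetic. This is only a worked example of the formulas in the message, not the TableGen scheduling model; the function names are hypothetical, integer division is assumed, and DLEN/SEW is assumed to be a power of two.

```cpp
// Exact log2 for powers of two (rounds down otherwise).
unsigned log2Floor(unsigned V) {
  unsigned R = 0;
  while (V > 1) {
    V >>= 1;
    ++R;
  }
  return R;
}

// Approximate cycles for most reductions:
//   vl*SEW/DLEN + 5*(4 + log2(DLEN/SEW))
unsigned approxReductionCycles(unsigned VL, unsigned SEW, unsigned DLEN) {
  return VL * SEW / DLEN + 5 * (4 + log2Floor(DLEN / SEW));
}

// Approximate cycles for ordered floating-point reductions: 5*vl
unsigned approxOrderedFPReductionCycles(unsigned VL) { return 5 * VL; }
```

For example, with arbitrarily chosen values vl = 16, SEW = 32 and DLEN = 256, the first formula gives 2 + 5*(4 + 3) = 37 cycles and the ordered form gives 80.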
This commit re-commits 208fc34c65d648e869d7d3ba0dfcbca90942cda0. It was
failing because it used the wrong version of SchedSEWSet.
Differential Revision: https://reviews.llvm.org/D153474
Zahira Ammarguellat [Thu, 15 Jun 2023 19:44:25 +0000 (15:44 -0400)]
[clang] Add a namespace for interesting identifiers.
Differential Revision: https://reviews.llvm.org/D146148
Michael Maitland [Thu, 22 Jun 2023 17:19:04 +0000 (10:19 -0700)]
Revert "[RISCV] Improve SiFive7 for reductions and ordered reductions"
This reverts commit 208fc34c65d648e869d7d3ba0dfcbca90942cda0.
Reverting because of a build failure.
Michael Maitland [Wed, 21 Jun 2023 21:49:05 +0000 (14:49 -0700)]
[RISCV] Improve SiFive7 for reductions and ordered reductions
Since the scheduling resources for reductions and ordered reductions now
account for LMUL and SEW, we can modify the Latency and ResourceCycles
for these resources.
* Most reductions take a total of approx `vl*SEW/DLEN + 5*(4 + log2(DLEN/SEW))`
cycles.
* Ordered floating-point reductions take a total of approx `5*vl` cycles.
Differential Revision: https://reviews.llvm.org/D153474
Michael Maitland [Wed, 21 Jun 2023 21:40:41 +0000 (14:40 -0700)]
[RISCV] Improve SiFive7 for loads and stores
* Unit-stride loads and stores can operate at the full bandwidth of the
memory pipe. The memory pipe is DLEN bits wide.
* Strided loads and stores operate at one element per cycle and should
be scheduled accordingly.
* Indexed loads and stores operate at one element per cycle, and they
stall the machine until all addresses have been generated, so they
cannot be scheduled.
* Unit stride seg2 load is number of DLEN parts
* seg3-8 are one segment per cycle, unless the segment is larger
than DLEN, in which case each segment takes multiple cycles.
Differential Revision: https://reviews.llvm.org/D153475
Jon Chesterfield [Thu, 22 Jun 2023 17:14:40 +0000 (18:14 +0100)]
[libc] Move fences into outbox/wait-for-ownership test
Also moves the wait-until-inbox-changes test into a shared method.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D153573
Vitaly Buka [Thu, 22 Jun 2023 05:02:43 +0000 (22:02 -0700)]
[asan] Don't quarantine large blocks
Almost NFC, as blocks over max quarantine size will trigger immediate
drain anyway. In followup patches we can optimize passthrough case.
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D153495
Fangrui Song [Thu, 22 Jun 2023 17:03:17 +0000 (10:03 -0700)]
[XRay] Make xray_instr_map compatible with Mach-O
The `__DATA,xray_instr_map` section has label differences like
`.quad Lxray_sled_0-Ltmp0` that are represented as a pair of UNSIGNED and SUBTRACTOR relocations.
The LLVM integrated assembler attempts to rewrite A-B into A-B'+offset where B' can
be included in the symbol table. B' is called an atom and should be a
non-temporary symbol in the same section. However, since `xray_instr_map` does
not define a non-temporary symbol, the SUBTRACTOR relocation will have no
associated symbol, and its `r_extern` value will be 0. Therefore, we will see
linker errors like:
error: SUBTRACTOR relocation must be extern at offset 0 of __DATA,xray_instr_map in a.o
To fix this issue, we need to define a non-temporary symbol in the section. We
can accomplish this by renaming `Lxray_sleds_start0` to `lxray_sleds_start0`
("L" to "l").
`lxray_sleds_start0` serves as the atom for this dead-strippable subsection.
With the `S_ATTR_LIVE_SUPPORT` attribute, `ld -dead_strip` will retain
subsections that reference live functions.
Special thanks to Oleksii Lozovskyi for reporting the issue and providing
initial analysis.
Differential Revision: https://reviews.llvm.org/D153239
Sindhu Chittireddy [Mon, 19 Jun 2023 02:31:53 +0000 (19:31 -0700)]
[NFC] Fix potential dereferencing of nullptr.
Replace getAs with castAs and add assert if needed.
Differential revision: https://reviews.llvm.org/D153236
Igor Kirillov [Fri, 9 Jun 2023 16:03:22 +0000 (16:03 +0000)]
[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication
Adds the capability to recognize SelectInst that appear in the IR.
These instructions are generated during scalable vectorization for reduction
and when the code contains conditions inside the loop body or when
"-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set.
Differential Revision: https://reviews.llvm.org/D152558
Jun Zhang [Thu, 22 Jun 2023 16:38:41 +0000 (00:38 +0800)]
[libc][NFC] Simplify return value logic in set_thread_ptr()
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D153572
Jon Chesterfield [Thu, 22 Jun 2023 16:46:08 +0000 (17:46 +0100)]
[libc] Add memory fences to device-local locking calls
This makes the interface less error prone. The acquire was previously
forgotten. Release is currently missing if recv() is the last operation made
before close.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D153571
Craig Topper [Thu, 22 Jun 2023 16:38:46 +0000 (09:38 -0700)]
[RISCV] Use GPR register class for RV64 ZDInx. Remove GPRF64 register class.
The GPRF64 has the same spill size as GPR and is only used for RV64.
There's no real reason to have it as a separate class other than
for type inference for isel patterns in tablegen.
This patch adds f64 to the GPR register class when XLen=64. I use
f32 when XLen=32 even though we don't make use of it just to avoid
the oddity.
isel patterns have been updated to fix the lack of type inference.
I might do similar for GPRF16 and GPRF32 or I might change them to
use an optimized spill size instead of always using XLen.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D153110
Yuanfang Chen [Wed, 21 Jun 2023 21:45:35 +0000 (14:45 -0700)]
[vscode-mlir] bump vsce version to 2.19.0
For https://bugs.chromium.org/p/llvm/issues/detail?id=46
Differential Revision: https://reviews.llvm.org/D153473
Craig Topper [Thu, 22 Jun 2023 16:22:58 +0000 (09:22 -0700)]
[RISCV] Move Zca/Zcb/Zcd/Zcf/Zcmp/Zcmt out of experimental status.
According to https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions
these were ratified in April 2023.
Reviewed By: VincentWu
Differential Revision: https://reviews.llvm.org/D153161
Christian Ulmann [Thu, 22 Jun 2023 14:53:21 +0000 (14:53 +0000)]
[mlir][LLVM] Fix empty res attr import
This commit ensures that an empty list of result attributes is not
imported as an empty `ArrayAttr`. Instead, the attribute is just not
added to the `LLVMFuncOp`.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D153553
Arthur Eubanks [Tue, 20 Jun 2023 23:38:55 +0000 (16:38 -0700)]
[SimplifyCFG] Add option to not speculate blocks
Required for phase ordering changes to not regress Rust code with D145265.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D153391
Arthur Eubanks [Wed, 21 Jun 2023 21:13:05 +0000 (14:13 -0700)]
[clang][LTO] Add flag to run verifier after every pass
Helps with debugging issues caught by the verifier.
Plumbed through both normal clang compile and ThinLTO.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D153468
Paul Kirth [Wed, 19 Apr 2023 17:13:36 +0000 (17:13 +0000)]
[RISCV] Strengthen atomic ordering for sequentially consistent stores
This is a similar change to one proposed for GCC:
https://inbox.sourceware.org/gcc-patches/20230414170942.1695672-1-patrick@rivosinc.com/
The changes in this patch are based on the proposal by Hans Boehm to more
closely match the intended semantics for sequentially consistent stores
and to allow some platforms to avoid an ABI break when switching to more
performant atomic instructions. Platforms that have already compiled
code using the existing mappings will also have more time to gradually
replace that code in preparation of the switch.
Further details can be found in the psABI proposal:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/378.
This patch implements a mapping that is stronger than the one outlined in table
A.6 of the RISC-V unprivileged spec to be future compatible with table A.7 of
the same document. The related discussion can be found at
https://lists.riscv.org/g/tech-unprivileged/topic/risc_v_memory_model_topics/92916241
The major change to RISC-V code generation is that we will now emit a trailing
fence for sequentially consistent stores.
The new code sequence should have the following form:
```
fence rw,w; s{b|h|w|d}; fence rw,rw;
```
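At the source level, the stores affected are sequentially consistent atomic stores such as the one sketched below. The names `Flag`, `publish` and `observe` are hypothetical, and the instruction sequence in the comment is what the mapping above describes; the actual output depends on the compiler version and target options and was not verified here.

```cpp
#include <atomic>

std::atomic<int> Flag{0};

// A sequentially consistent store. With the mapping described above, a
// RISC-V backend would be expected to emit a leading "fence rw,w", the
// store itself, and a trailing "fence rw,rw".
void publish(int Value) { Flag.store(Value, std::memory_order_seq_cst); }

// Matching sequentially consistent load.
int observe() { return Flag.load(std::memory_order_seq_cst); }
```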
Other changes and optimizations like using amoswap will be handled separately.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D149486
Ties Stuij [Thu, 22 Jun 2023 14:47:42 +0000 (15:47 +0100)]
[ARM] add Thumb-1 8-bit movs/adds relocations to LLVM
This patch adds the LLVM-side plumbing for the following relocations:
- R_ARM_THM_ALU_ABS_G0_NC
- R_ARM_THM_ALU_ABS_G1_NC
- R_ARM_THM_ALU_ABS_G2_NC
- R_ARM_THM_ALU_ABS_G3
(see section 5.6.1.5, Static Thumb16 relocations, of the AArch32 ELF Arm ABI:
https://github.com/ARM-software/abi-aa/blob/844a79fd4c77252a11342709e3b27b2c9f590cf1/aaelf32/aaelf32.rst#5615static-thumb16-relocations)
These can respectively be generated by prefixing assembly symbols with:
- :lower0_7:
- :lower8_15:
- :upper0_7:
- :upper8_15:
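As an illustration of what the four prefixes select, the sketch below assumes each one picks a successive byte of a 32-bit value, as the names suggest. The C++ function names simply mirror the prefixes and are not LLVM APIs; `rebuild` mimics the usual movs/lsls/adds sequence that reassembles the value from its bytes.

```cpp
#include <cstdint>

// Each helper extracts one byte of a 32-bit value.
std::uint8_t lower0_7(std::uint32_t V) {
  return static_cast<std::uint8_t>(V & 0xFFu); // bits 0-7
}
std::uint8_t lower8_15(std::uint32_t V) {
  return static_cast<std::uint8_t>((V >> 8) & 0xFFu); // bits 8-15
}
std::uint8_t upper0_7(std::uint32_t V) {
  return static_cast<std::uint8_t>((V >> 16) & 0xFFu); // bits 16-23
}
std::uint8_t upper8_15(std::uint32_t V) {
  return static_cast<std::uint8_t>((V >> 24) & 0xFFu); // bits 24-31
}

// Rebuilds the value one byte at a time, highest byte first.
std::uint32_t rebuild(std::uint32_t V) {
  std::uint32_t R = upper8_15(V); // movs
  R = (R << 8) + upper0_7(V);     // lsls #8, then adds
  R = (R << 8) + lower8_15(V);    // lsls #8, then adds
  R = (R << 8) + lower0_7(V);     // lsls #8, then adds
  return R;
}
```

For 0x12345678 the four pieces are 0x78, 0x56, 0x34 and 0x12, and `rebuild` returns the original value.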
LLD support for these relocations will be added in a follow-up patch
Reviewed By: john.brawn, MaskRay
Differential Revision: https://reviews.llvm.org/D149443
David Truby [Thu, 22 Jun 2023 12:52:37 +0000 (13:52 +0100)]
[flang] add -flang-experimental-polymorphism flag to flang-new
This flag enables Fortran 2003 polymorphism. It is marked experimental
and not included in --help.
Reviewed By: tblah, awarzynski
Differential Revision: https://reviews.llvm.org/D153281
Simi Pallipurath [Thu, 22 Jun 2023 12:55:59 +0000 (13:55 +0100)]
Revert "Revert "[lld][Arm] Big Endian - Byte invariant support.""
This reverts commit d8851384c6ac2a1cea15e05228dbde5f13654e23.
Reason: Applied the fix for the Asan buildbot failures.
Nikita Popov [Wed, 21 Jun 2023 15:04:41 +0000 (17:04 +0200)]
[SDAGBuilder] Handle multi-part arguments in argument copy elision (PR63430)
When eliding an argument copy, we need to update the chain to ensure
the argument reads are performed before later writes. However, the
code doing this only handled this for the first part of the argument.
If the argument had multiple parts, the chains of the later parts were
dropped. Make sure we preserve all chains.
Fixes https://github.com/llvm/llvm-project/issues/63430.
Nikita Popov [Thu, 22 Jun 2023 14:54:42 +0000 (16:54 +0200)]
[LoopUnroll] Avoid undef indices in test (NFC)
Doesn't really matter for the larger purpose of the test, but
avoid the use of undef indices and instead use the loop induction
variable as index, which is what was likely intended here.
Akash Banerjee [Tue, 20 Jun 2023 14:29:22 +0000 (15:29 +0100)]
[MLIR][OpenMP] Add Flang lowering support for device_ptr and device_addr clauses
Add lowering support for the use_device_ptr and use_device_addr clauses for the Target Data directive.
Depends on D152822
Differential Revision: https://reviews.llvm.org/D152824
Akash Banerjee [Tue, 20 Jun 2023 13:57:27 +0000 (14:57 +0100)]
[MLIR][OpenMP] Minor change to assembly format for Target Data op
Minor reordering of clauses in the assembly format for Target Data op to make it closer to the OpenMP standard.
Differential Revision: https://reviews.llvm.org/D152822
Peter Klausler [Tue, 20 Jun 2023 20:45:22 +0000 (13:45 -0700)]
[flang] Fix bug with generic and homonymous specific module procedure
An unconditional EraseSymbol() call was deleting a generic interface symbol
when the generic had a module procedure of the same name as a specific
procedure, and the module procedure's definition appeared in the same
module. Also clean up some applications of the MODULE attribute to
symbols created along the way.
Differential Revision: https://reviews.llvm.org/D153478
Nikita Popov [Thu, 22 Jun 2023 10:15:58 +0000 (12:15 +0200)]
[InstCombine] Fold assume(false) to non-terminator unreachable
assume(false) is immediate UB, so fold it to (non-terminator)
unreachable.
Florian Hahn [Thu, 22 Jun 2023 14:36:31 +0000 (15:36 +0100)]
[LSR] Adjust test to make sure it keeps testing for the original issue.
Make sure the test keeps testing for the original issue after D153202.
Peter Klausler [Wed, 21 Jun 2023 18:37:53 +0000 (11:37 -0700)]
[flang] Fix USE with homonymous renaming
Fortran requires that a USE with renaming prevent the USE'd symbol
from also being associated into a scope without renaming. The
implementation in name resolution gets confused in the case of
a USE with renaming using the same name ("x => x"). Clean things
up. Fixes LLVM bug https://github.com/llvm/llvm-project/issues/63397.
Differential Revision: https://reviews.llvm.org/D153452
Nikita Popov [Thu, 22 Jun 2023 14:19:47 +0000 (16:19 +0200)]
[InstCombine] Remove instructions before non-terminator unreachable
Treat non-terminator unreachable the same as unreachable, and
remove guaranteed-to-transfer instructions before it.
Nikita Popov [Thu, 22 Jun 2023 14:29:17 +0000 (16:29 +0200)]
[InstCombine] Add additional instructions in non-term unreachable test (NFC)
Nikita Popov [Thu, 22 Jun 2023 14:25:49 +0000 (16:25 +0200)]
[InstCombine] Avoid UB in tests (NFC)
Peter Klausler [Wed, 21 Jun 2023 19:43:55 +0000 (12:43 -0700)]
[flang] Rewrite "1*j" to "(j)", not "j", when j is a variable
Expression folding currently unconditionally rewrites "1*j"
to "j", which is wrong when "j" is a variable, as it transforms
an expression into a variable and can lead to incorrect associations
in contexts like an actual argument or an ASSOCIATE selector.
Transform "1*j" to a parenthesized "(j)" when "j" is a variable.
Fixes LLVM bug https://github.com/llvm/llvm-project/issues/63259.
Differential Revision: https://reviews.llvm.org/D153457
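A minimal sketch of the rewrite described above, using a toy expression tree rather than flang's real expression classes. `Const`, `Var`, `Paren`, `Mul`, and `fold_mul` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Paren:
    operand: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


def fold_mul(expr):
    """Fold 1*x. A bare variable is wrapped in parentheses so that the
    result stays an expression instead of becoming a variable."""
    if isinstance(expr, Mul) and expr.left == Const(1):
        operand = expr.right
        if isinstance(operand, Var):
            return Paren(operand)
        return operand
    return expr


folded_var = fold_mul(Mul(Const(1), Var("j")))
folded_expr = fold_mul(Mul(Const(1), Mul(Const(2), Var("j"))))
```

Only the variable case needs the parentheses: an operand that is already an expression cannot be mistaken for a variable after the multiplication is removed.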
Peter Klausler [Wed, 21 Jun 2023 20:35:53 +0000 (13:35 -0700)]
[flang] Fix looping on LEN type parameter usage
When a LEN type parameter of one PDT is being used as the value
of a LEN type parameter in another PDT, expression rewriting can
loop infinitely due to an incorrect assumption that the same PDT's
parameters are being referenced.
Fixes LLVM bug https://github.com/llvm/llvm-project/issues/63198
Differential Revision: https://reviews.llvm.org/D153465
Nikita Popov [Thu, 22 Jun 2023 13:37:33 +0000 (15:37 +0200)]
[InstCombine] Remove code after non-terminator unreachable
Instructions after a non-terminator unreachable are ... unreachable,
so remove them. Reuse the same logic we use for removing
instructions from dead blocks.
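The effect can be sketched on a toy block represented as a list of strings. This is an illustration under stated assumptions, not the pass's implementation: `remove_after_unreachable` and the `is_marker` predicate are hypothetical, and the sketch assumes the last entry is the block terminator and is kept.

```python
def remove_after_unreachable(block, is_marker):
    """Drop everything between the first unreachable marker and the
    block's final entry, which this toy model treats as the terminator."""
    for idx, inst in enumerate(block[:-1]):
        if is_marker(inst):
            return block[: idx + 1] + block[-1:]
    # No marker before the terminator: leave the block unchanged.
    return list(block)


block = [
    "%a = add i32 %x, 1",
    "<non-terminator unreachable>",
    "%b = mul i32 %a, 2",
    "call void @f()",
    "ret void",
]
trimmed = remove_after_unreachable(block, lambda inst: inst.startswith("<non-term"))
```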
Nikita Popov [Thu, 22 Jun 2023 13:56:41 +0000 (15:56 +0200)]
[InstCombine] Add test for code after non-terminator unreachable (NFC)
Nikita Popov [Thu, 22 Jun 2023 13:50:22 +0000 (15:50 +0200)]
[InstCombine] Don't remove non-terminator unreachable markers
Even if the value happens to be undef, we should preserve these so
they get turned into an unreachable terminator later.
Peter Klausler [Wed, 21 Jun 2023 21:33:48 +0000 (14:33 -0700)]
[flang] Error recovery in bad specific MIN/MAX calls
When a specific MIN/MAX intrinsic function (e.g. MAX1) reference
has an actual argument error, ensure that a later attempt to fold
the call into a constant doesn't crash due to a missing argument.
Fixes https://github.com/llvm/llvm-project/issues/63140
Differential Revision: https://reviews.llvm.org/D153470
Nikita Popov [Thu, 22 Jun 2023 13:42:31 +0000 (15:42 +0200)]
[InstCombine] Avoid UB in tests (NFC)
Peter Klausler [Tue, 20 Jun 2023 22:40:35 +0000 (15:40 -0700)]
[flang] Rework name resolution of Cray pointer declarations
The current code has redundancy with the infrastructure for
declaration checking that can be replaced by better usage of
the parse tree walking framework. This also fixes LLVM flang
bug #58971.
Differential Revision: https://reviews.llvm.org/D153385
Peter Klausler [Tue, 20 Jun 2023 16:36:12 +0000 (09:36 -0700)]
wip
Paul Robinson [Wed, 21 Jun 2023 20:01:32 +0000 (13:01 -0700)]
[Headers][doc] Add various arith/logical intrinsic descriptions to avx2intrin.h
Differential Revision: https://reviews.llvm.org/D153462
Haojian Wu [Thu, 22 Jun 2023 12:35:28 +0000 (14:35 +0200)]
[include-cleaner] No need to overwrite the source file if there are no
cleanups
Mitch Phillips [Thu, 22 Jun 2023 12:28:42 +0000 (14:28 +0200)]
Revert "Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support"""
This reverts commit
9246df7049b0bb83743f860caff4221413c63de2.
Reason: This patch broke the UBSan buildbots. See more information in
the original phabricator review: https://reviews.llvm.org/D139092
Jay Foad [Thu, 22 Jun 2023 11:25:16 +0000 (12:25 +0100)]
[AMDGPU] Regenerate some checks
David Green [Thu, 22 Jun 2023 11:46:54 +0000 (12:46 +0100)]
[AArch64] Remove G_VECREDUCE_FADD from selectReduction
I believe that for fp reductions we can use the imported tablegen patterns for
selection, as opposed to going via selectReduction. Integer reductions are more
difficult, as the return types in selection DAG will be promoted to i32.
Differential Revision: https://reviews.llvm.org/D153244
Pravin Jagtap [Thu, 22 Jun 2023 11:03:47 +0000 (07:03 -0400)]
[AMDGPU] Switch to the new cl option amdgpu-atomic-optimizer-strategy.
The atomic optimizer is turned on by default through D152649. This patch
removes the usage of the old command line option amdgpu-atomic-optimizations
and transfers the responsibility to `amdgpu-atomic-optimizer-strategy`.
We can safely remove the old option once LLPC removes all of its usages.
Reviewed By: foad, arsenm, #amdgpu, cdevadas
Differential Revision: https://reviews.llvm.org/D153007
Nikita Popov [Thu, 22 Jun 2023 11:00:02 +0000 (13:00 +0200)]
[LoopUnroll] Regenerate test checks (NFC)
Nikita Popov [Thu, 22 Jun 2023 10:23:46 +0000 (12:23 +0200)]
[InstCombine] Use CreateNonTerminatorUnreachable() helper
Create the standard non-terminator unreachable, rather than a
slight variation on it.
Jolanta Jensen [Tue, 20 Jun 2023 12:51:41 +0000 (12:51 +0000)]
[NFC SVE ACLE] Remove IR combines that no longer apply.
Remove IR combines that no longer apply now that the SVE merging
intrinsics taking an all-active predicate have been canonicalised
to their equivalent undef (_u) variants.
Differential Revision: https://reviews.llvm.org/D153415
Matt Arsenault [Wed, 1 Feb 2023 00:27:51 +0000 (20:27 -0400)]
DAG: Expand legalization of is.fpclass to fcmp for DAZ
Try to use a compare with 0 if DAZ is assumed.
FPClassTest really needs to be marked as a bitmask enum, but the API
for that is currently broken.