Michael Jones [Tue, 17 May 2022 18:28:16 +0000 (11:28 -0700)]
[libc] add printf base 10 integer conversion
This patch adds support for d, i, and u conversions in printf, as well
as comprehensive unit tests.
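For illustration only, a minimal sketch of what a base 10 conversion has
to do for signed (d, i) and unsigned (u) arguments. The helper names
formatUnsigned and formatSigned are hypothetical and are not taken from
the patch; flags, width, precision and length modifiers are ignored.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: render an unsigned value in base 10 (the %u case).
std::string formatUnsigned(unsigned long long value) {
  std::string digits;
  do {
    // Peel off the least significant digit and prepend it.
    digits.insert(digits.begin(), static_cast<char>('0' + value % 10));
    value /= 10;
  } while (value != 0);
  assert(!digits.empty()); // zero still produces the single digit "0"
  return digits;
}

// Hypothetical helper: render a signed value in base 10 (the %d / %i case).
std::string formatSigned(long long value) {
  // Negate in unsigned arithmetic so the most negative value does not overflow.
  unsigned long long magnitude =
      value < 0 ? 0ULL - static_cast<unsigned long long>(value)
                : static_cast<unsigned long long>(value);
  std::string text = formatUnsigned(magnitude);
  return value < 0 ? "-" + text : text;
}
```

The signed case is reduced to the unsigned one, so only one digit loop is
needed in this sketch.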
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D125929
Siva Chandra Reddy [Thu, 9 Jun 2022 06:40:55 +0000 (06:40 +0000)]
[libc] Add compile options to pthread_create target.
The compile options now match that of thrd_create. Two compile options
are of importance:
1. -O3 - This is required so that stack is not used between the clone
syscall and the start function in the child thread.
2. -fno-omit-frame-pointer - This is required so that we can sniff out
the thread start args from the child thread's stack memory.
Without these two options, pthread_create will exhibit flaky behavior.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D127381
Michael Jones [Wed, 8 Jun 2022 20:11:02 +0000 (13:11 -0700)]
[libc] simplify printf converter tests
Previously, the printf converter tests reused the same string_writer,
which meant that each test depended on the tests before it to succeed.
This makes a new string_writer for each test to simplify and clarify the
tests.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D127341
Alex Brachet [Thu, 9 Jun 2022 16:55:37 +0000 (16:55 +0000)]
[clang] Allow CLANG_MODULE_CACHE_PATH env var to override module caching behavior
CLANG_MODULE_CACHE_PATH can be used to change where clang should
put the module cache, or can be set to "" to disable caching entirely.
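The three cases described above (unset, set to "", set to a path) can be
sketched as follows. This is not the clang implementation:
resolveModuleCachePath and ModuleCacheSetting are hypothetical names, and
how the variable interacts with command line options is not modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: whether caching is enabled and, if so, where.
struct ModuleCacheSetting {
  bool enabled;
  std::string path;
};

// `envValue` is the raw value of CLANG_MODULE_CACHE_PATH, or nullptr when the
// variable is unset (for example, the result of std::getenv).
ModuleCacheSetting resolveModuleCachePath(const char *envValue,
                                          const std::string &defaultPath) {
  if (envValue == nullptr)
    return {true, defaultPath}; // variable unset: keep the default location
  if (*envValue == '\0')
    return {false, ""};         // set to "": disable caching entirely
  return {true, envValue};      // otherwise the variable overrides the path
}
```

A caller would pass std::getenv("CLANG_MODULE_CACHE_PATH") as the first
argument.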
Differential revision: https://reviews.llvm.org/D126678
Matthias Springer [Thu, 9 Jun 2022 16:37:21 +0000 (18:37 +0200)]
[mlir][bufferize][NFC] Decouple dropping of equivalent return values from bufferization
This simplifies the bufferization itself and is in preparation for connecting with the sparse compiler.
Differential Revision: https://reviews.llvm.org/D126814
Matthias Springer [Thu, 9 Jun 2022 16:30:42 +0000 (18:30 +0200)]
[mlir][bufferize] Fix bug in module equivalence analysis
CallOp results are not equivalent to an OpOperand if the OpOperand bufferizes out-of-place.
Differential Revision: https://reviews.llvm.org/D126813
Nico Weber [Thu, 9 Jun 2022 16:29:17 +0000 (12:29 -0400)]
[gn build] (manually) port 4ff5e8184c665
Fixes link of many binaries if RISCV is enabled but most other targets aren't.
Matthias Springer [Thu, 9 Jun 2022 16:24:58 +0000 (18:24 +0200)]
[mlir][bufferize] Decouple promoteBufferResultsToOutParams from One-Shot Bufferize
Users should explicitly run `-buffer-results-to-out-params` instead.
The purpose of this change is to remove `finalizeBuffers`, which made it difficult to extend the bufferization to custom buffer types.
Differential Revision: https://reviews.llvm.org/D126253
Matthias Springer [Thu, 9 Jun 2022 16:19:54 +0000 (18:19 +0200)]
[mlir][bufferization] Decouple buffer-deallocation from One-Shot Bufferize
The buffer deallocation pass must now be run explicitly when `allow-return-alloc` is set.
This results in a few extra buffer copies in unoptimized test cases. The proper way to avoid such copies is to relax the OpOperand/OpResult aliasing contract on ops such as scf.for. Some of these copies can also be avoided by improving the buffer deallocation pass.
Differential Revision: https://reviews.llvm.org/D126252
Kito Cheng [Thu, 9 Jun 2022 16:17:10 +0000 (00:17 +0800)]
[RISCV][NFC] Update testcase for D126861
Jonas Devlieghere [Thu, 9 Jun 2022 16:13:00 +0000 (09:13 -0700)]
[lldb] Add a reference to the "On Demand Symbols" docs.
Include a reference to the documentation for "on demand symbols" in the
documentation index. This will ensure the page shows up in the side bar
on the website.
Jonas Devlieghere [Thu, 9 Jun 2022 16:10:14 +0000 (09:10 -0700)]
[lldb] Add table with custom LLDB asserts to the docs
Add table with custom LLDB asserts to the documentation.
Differential revision: https://reviews.llvm.org/D127410
Jonas Devlieghere [Thu, 9 Jun 2022 15:41:03 +0000 (08:41 -0700)]
[lldb] Fix code blocks in docs/use/intel_pt.rst
Florian Mayer [Thu, 9 Jun 2022 15:49:03 +0000 (08:49 -0700)]
[NFC] change error message wording.
Florian Mayer [Tue, 7 Jun 2022 22:26:05 +0000 (15:26 -0700)]
[libcxx] improve LIBCXX_ABI_NAMESPACE error message
Include the invalid LIBCXX_ABI_NAMESPACE to ease debugging.
Reviewed By: #libc, jloser, ldionne
Differential Revision: https://reviews.llvm.org/D127257
Kito Cheng [Thu, 9 Jun 2022 15:35:57 +0000 (23:35 +0800)]
[RISCV] Fix missing stack pointer recover
To make sure the stack pointer is correct throughout the EH region,
we also need to restore the stack pointer from the frame pointer if we
don't reserve stack space for outgoing arguments in the prologue/epilogue.
Normally it is enough to check whether a variable-sized object is present,
but we also don't reserve that space in the prologue/epilogue when there
are vector objects on the stack.
Example to show what happened:
```
try {
sp adjust for outgoing args. // 1. Sp changed.
func_call // 2. Exception raised
sp restore // Oh, not restored
} catch {
// 3. And now we are here.
}
// 4. Prepare to return!, restore return address from stack, but...sp is wrong.
// 5. Screw up!
```
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D126861
Kito Cheng [Thu, 9 Jun 2022 15:34:13 +0000 (23:34 +0800)]
[RISCV] Pre-commit testcase for PR55442
The testcase shows that the stack pointer isn't recovered when an
exception is thrown from `_Z3fooiiiiiiiiiiPi`, and then things go wrong
because the return address is restored using the wrong stack pointer.
NOTE:
Trigger conditions:
1. Frame pointer is required.
2. Stack has outgoing arguments.
3. Vector extension is enabled.
Another runnable testcase:
$ clang++ -target riscv64-unknown-linux-gnu -march=rv64gcv test.cpp
```
void __attribute__((noinline)) foo(int, int, int, int, int, int, int, int, int, int, int *){
throw int(0);
}
int main(int argc, char **argv) {
int exception_value = 1;
try {
foo(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
} catch (int i) {
exception_value = i;
}
return exception_value;
}
```
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D126860
Yuanqiang Liu [Thu, 9 Jun 2022 15:23:25 +0000 (08:23 -0700)]
[MLIR][Shape] Generalize `shape.concat` to extent tensors
The `shape.concat` operation previously supported only operands of the shape type.
We now enable it for extent tensors.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D127321
Jun Zhang [Thu, 9 Jun 2022 15:12:21 +0000 (23:12 +0800)]
[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder
The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.
This used to be several downstream patches in Cling; it aims to fix
errors in the Clang interpreter when trying to use inline functions.
Before this patch:
clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();
JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }
Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D126781
Johannes Doerfert [Thu, 9 Jun 2022 15:02:57 +0000 (17:02 +0200)]
Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"
This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705.
Patch broke AMD GPU OpenMP offload buildbots.
https://lab.llvm.org/buildbot/#/builders/193/builds/13246
Simon Moll [Thu, 9 Jun 2022 14:51:32 +0000 (16:51 +0200)]
[NFC] Clang-format PatternMatch.h
Johannes Doerfert [Tue, 10 May 2022 22:08:43 +0000 (18:08 -0400)]
[Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.
This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.
`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.
We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good.
Fixes: https://github.com/llvm/llvm-project/issues/54981
Johannes Doerfert [Thu, 19 May 2022 18:51:07 +0000 (13:51 -0500)]
[Attributor] Try to delete stores and simplify stored values
By default we should try to eliminate unused stores and simplify values
stored while we are at it.
Johannes Doerfert [Thu, 19 May 2022 18:35:58 +0000 (13:35 -0500)]
[Attributor] Ensure to use the proper liveness AA
When determining liveness via Attributor::isAssumedDead(...) we might
end up without a liveness AA or with one pointing into another function.
Neither is helpful and we will avoid both from now on.
Jay Foad [Thu, 9 Jun 2022 14:00:49 +0000 (15:00 +0100)]
[AMDGPU] Add GFX11 test coverage for the memory legalizer
Levon [Thu, 9 Jun 2022 14:27:51 +0000 (16:27 +0200)]
Pass plugin_name in SBProcess::SaveCore
This CL makes the minidump save-core functionality (https://reviews.llvm.org/D108233) available via the SBProcess interface.
After support was added for the gdb-remote client (https://reviews.llvm.org/D101329), if the plugin name is empty the plugin manager tries to save the core directly from the process plugin.
See https://github.com/llvm/llvm-project/blob/main/lldb/source/Core/PluginManager.cpp#L696
To make it possible to save the core with the minidump plugin, I added the plugin name as a parameter to SBProcess::SaveCore.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D125325
Philip Reames [Thu, 9 Jun 2022 14:20:33 +0000 (07:20 -0700)]
[RISCV] Add cost model for reverse shuffle
The majority of the cost appears to be forming the indices vector.
Differential Revision: https://reviews.llvm.org/D127141
Florian Hahn [Thu, 9 Jun 2022 14:20:10 +0000 (15:20 +0100)]
Recommit "[SCEV] Look through single value PHIs." (take 3)
This reverts commit 1fbdbb559569641f6d509b569966901c8fb02b63.
All known issues surfaced by this patch should have been fixed now.
The fixes included fixing issues with SCEV expansion in LV and DA's
reliance on LCSSA phis.
Yitzhak Mandelbaum [Tue, 3 May 2022 15:53:35 +0000 (15:53 +0000)]
[clang][dataflow] Track `optional` contents in `optional` model.
This patch adds partial support for tracking (i.e. modeling) the contents of an
optional value. Specifically, it supports tracking the value after it is
accessed. We leave tracking constructed/assigned contents to a future patch.
Differential Revision: https://reviews.llvm.org/D124932
Gabor Marton [Wed, 8 Jun 2022 10:11:21 +0000 (12:11 +0200)]
[analyzer] Fix assertion failure after getKnownValue call
Depends on D126560. `getKnownValue` has been changed by the parent patch
in a way that simplification was removed. This is not correct when the
function is called by the Checkers. Thus, a new internal function is
introduced, `getConstValue`, which simply queries the constraint manager.
This `getConstValue` is used internally in the `SimpleSValBuilder` when a
binop is evaluated, this way we avoid the recursion into the `Simplifier`.
Differential Revision: https://reviews.llvm.org/D127285
Simon Moll [Thu, 9 Jun 2022 14:09:14 +0000 (16:09 +0200)]
[NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName". This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.
This is the alternative to the less invasive clang-format only patch: D126783
Reviewed By: spatel, rengolin
Differential Revision: https://reviews.llvm.org/D126889
Simon Pilgrim [Thu, 9 Jun 2022 13:46:48 +0000 (14:46 +0100)]
[DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one.
This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold an EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.
Johannes Doerfert [Thu, 9 Jun 2022 12:51:49 +0000 (14:51 +0200)]
[Attributor][FIX] Give registered simplification callbacks precedence
We accidentally checked for constants before we looked for registered
simplification callbacks. The latter needs to take precedence though.
Andrew Turner [Wed, 18 May 2022 16:21:36 +0000 (17:21 +0100)]
Fix TableLookupTest on FreeBSD
As on Linux, place the Counters array in the __libfuzzer_extra_counters
section. This fixes the test on FreeBSD.
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D125902
Matthias Springer [Thu, 9 Jun 2022 11:00:08 +0000 (13:00 +0200)]
[mlir][bufferization] Add OneShotBufferize transform op
This commit allows for One-Shot Bufferize to be used through the transform dialect. No op handle is currently returned for the bufferized IR.
Differential Revision: https://reviews.llvm.org/D125098
Yuki Okushi [Sun, 5 Jun 2022 03:18:47 +0000 (12:18 +0900)]
[docs] Update supported language standards list for C++
Differential Revision: https://reviews.llvm.org/D127065
Yuki Okushi [Thu, 2 Jun 2022 11:34:04 +0000 (20:34 +0900)]
[OpenMP] Fix the build on Windows
The code expanded from kmp_barrier.h uses some `KMP_INTERNAL_*`s,
so the definitions have to be placed before it.
Fixes #55815
Differential Revision: https://reviews.llvm.org/D126873
Timm Bäder [Thu, 9 Jun 2022 13:10:29 +0000 (15:10 +0200)]
[clang][tests] Add missing compiler name
The driver strips the first argument. Without the compiler name, the
test depends on whether GCC_INSTALL_PREFIX is set or not.
See https://reviews.llvm.org/D125862
Haojian Wu [Thu, 9 Jun 2022 10:16:14 +0000 (12:16 +0200)]
[pseudo] Move grammar-related headers to a separate dir, NFC.
We did that for .cpp, but forgot the headers.
Differential Revision: https://reviews.llvm.org/D127388
Mark de Wever [Mon, 30 May 2022 16:34:15 +0000 (18:34 +0200)]
[libc++][CI] Updates Docker image.
- Updates the image to use Ubuntu Jammy.
- Installs GCC-12 as preparation to migrate to that GCC version.
NOTE: This is a re-application of f2f0dba818a50, which was reverted
in 2b5e3ef83c3 due to an issue with the CI nodes. The CI nodes have
since then been updated and this appears to be fine.
Differential Revision: https://reviews.llvm.org/D126666
Aaron Ballman [Thu, 9 Jun 2022 12:52:07 +0000 (08:52 -0400)]
Use HTTPS links instead of HTTP ones in the C DR status page
Christian Kandeler [Thu, 9 Jun 2022 12:42:14 +0000 (14:42 +0200)]
[pseudo] Fix unit test build
Analogous to 632545e8ce846ccaeca8df15a3dc5e36d01a1275.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D127397
Alex Zinenko [Thu, 9 Jun 2022 09:10:52 +0000 (11:10 +0200)]
[mlir] add producer fusion to structured transform ops
This relies on the existing TileAndFuse pattern for tensor-based structured
ops. It complements pure tiling, from which some utilities are generalized.
Depends On D127300
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127319
Douglas Yung [Thu, 9 Jun 2022 12:25:43 +0000 (05:25 -0700)]
Revert "[lld-macho] Initial support for EH Frames"
This reverts commit 826be330af9c0a8553a5b32718ecd2d97e10438e.
This was causing a test failure on build bots:
- https://lab.llvm.org/buildbot/#/builders/36/builds/21770
- https://lab.llvm.org/buildbot/#/builders/58/builds/23913
Douglas Yung [Thu, 9 Jun 2022 12:24:28 +0000 (05:24 -0700)]
Revert "[lld-macho] Support EH frames under arm64"
This reverts commit 977d62c33e3343a394777c1754682761eebb66cd.
This change was causing crashes in 2 tests on the buildbots:
- https://lab.llvm.org/buildbot/#/builders/58/builds/23914
- https://lab.llvm.org/buildbot/#/builders/36/builds/21771
Sam McCall [Thu, 9 Jun 2022 12:18:04 +0000 (14:18 +0200)]
[pseudo] Don't clang-format test inputs. NFC
Haojian Wu [Thu, 9 Jun 2022 12:10:36 +0000 (14:10 +0200)]
[pseudo] Fix the missing-field-initializers warning from f1ac00c9b0d1, NFC
Jose Manuel Monsalve Diaz [Wed, 1 Jun 2022 21:49:23 +0000 (21:49 +0000)]
[LIBOMPTARGET] Adding AMD to llvm-omp-device-info
Add device information printing for AMD devices to the
`llvm-omp-device-info` command line tool. The output is inspired by
the rocminfo command line tool.
This commit adds missing HSA functions, enums and structs
needed to query additional information from the HSA agents.
A generic message for the `generic-elf-64bit` plugin is also added.
Example output:
```
llvm-omp-device-info
Device (0):
This is a generic-elf-64bit device
Device (1):
This is a generic-elf-64bit device
Device (2):
This is a generic-elf-64bit device
Device (3):
This is a generic-elf-64bit device
Device (4):
HSA Runtime Version: 1.1
HSA OpenMP Device Number: 0
Device Name: gfx906
Vendor Name: AMD
Device Type: GPU
Max Queues: 128
Queue Min Size: 64
Queue Max Size: 131072
Cache:
L0: 16384 bytes
L1: 8388608 bytes
Cacheline Size: 64
Max Clock Freq(MHz): 1725
Compute Units: 60
SIMD per CU: 4
Fast F16 Operation: TRUE
Wavefront Size: 64
Workgroup Max Size: 1024
Workgroup Max Size per Dimension:
x: 1024
y: 1024
z: 1024
Max Waves Per CU: 40
Max Work-item Per CU: 2560
Grid Max Size:
4294967295
Grid Max Size per Dimension:
x:
4294967295
y:
4294967295
z:
4294967295
Max fbarriers/Workgrp: 32
Memory Pools:
Pool GLOBAL; FLAGS: COARSE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GLOBAL; FLAGS: FINE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GROUP:
Size: 65536 bytes
Allocatable: FALSE
Runtime Alloc Granule: 0 bytes
Runtime Alloc alignment: 0 bytes
Accessable by all: FALSE
Device (5):
HSA Runtime Version: 1.1
HSA OpenMP Device Number: 1
Device Name: gfx906
Vendor Name: AMD
Device Type: GPU
Max Queues: 128
Queue Min Size: 64
Queue Max Size: 131072
Cache:
L0: 16384 bytes
L1: 8388608 bytes
Cacheline Size: 64
Max Clock Freq(MHz): 1725
Compute Units: 60
SIMD per CU: 4
Fast F16 Operation: TRUE
Wavefront Size: 64
Workgroup Max Size: 1024
Workgroup Max Size per Dimension:
x: 1024
y: 1024
z: 1024
Max Waves Per CU: 40
Max Work-item Per CU: 2560
Grid Max Size:
4294967295
Grid Max Size per Dimension:
x:
4294967295
y:
4294967295
z:
4294967295
Max fbarriers/Workgrp: 32
Memory Pools:
Pool GLOBAL; FLAGS: COARSE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GLOBAL; FLAGS: FINE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GROUP:
Size: 65536 bytes
Allocatable: FALSE
Runtime Alloc Granule: 0 bytes
Runtime Alloc alignment: 0 bytes
Accessable by all: FALSE
Device (6):
HSA Runtime Version: 1.1
HSA OpenMP Device Number: 2
Device Name: gfx906
Vendor Name: AMD
Device Type: GPU
Max Queues: 128
Queue Min Size: 64
Queue Max Size: 131072
Cache:
L0: 16384 bytes
L1: 8388608 bytes
Cacheline Size: 64
Max Clock Freq(MHz): 1725
Compute Units: 60
SIMD per CU: 4
Fast F16 Operation: TRUE
Wavefront Size: 64
Workgroup Max Size: 1024
Workgroup Max Size per Dimension:
x: 1024
y: 1024
z: 1024
Max Waves Per CU: 40
Max Work-item Per CU: 2560
Grid Max Size:
4294967295
Grid Max Size per Dimension:
x:
4294967295
y:
4294967295
z:
4294967295
Max fbarriers/Workgrp: 32
Memory Pools:
Pool GLOBAL; FLAGS: COARSE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GLOBAL; FLAGS: FINE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GROUP:
Size: 65536 bytes
Allocatable: FALSE
Runtime Alloc Granule: 0 bytes
Runtime Alloc alignment: 0 bytes
Accessable by all: FALSE
Device (7):
HSA Runtime Version: 1.1
HSA OpenMP Device Number: 3
Device Name: gfx906
Vendor Name: AMD
Device Type: GPU
Max Queues: 128
Queue Min Size: 64
Queue Max Size: 131072
Cache:
L0: 16384 bytes
L1: 8388608 bytes
Cacheline Size: 64
Max Clock Freq(MHz): 1725
Compute Units: 60
SIMD per CU: 4
Fast F16 Operation: TRUE
Wavefront Size: 64
Workgroup Max Size: 1024
Workgroup Max Size per Dimension:
x: 1024
y: 1024
z: 1024
Max Waves Per CU: 40
Max Work-item Per CU: 2560
Grid Max Size:
4294967295
Grid Max Size per Dimension:
x:
4294967295
y:
4294967295
z:
4294967295
Max fbarriers/Workgrp: 32
Memory Pools:
Pool GLOBAL; FLAGS: COARSE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GLOBAL; FLAGS: FINE GRAINED, :
Size:
34342961152 bytes
Allocatable: TRUE
Runtime Alloc Granule: 4096 bytes
Runtime Alloc alignment: 4096 bytes
Accessable by all: FALSE
Pool GROUP:
Size: 65536 bytes
Allocatable: FALSE
Runtime Alloc Granule: 0 bytes
Runtime Alloc alignment: 0 bytes
Accessable by all: FALSE
```
Differential Revision: https://reviews.llvm.org/D126836
Benjamin Kramer [Thu, 9 Jun 2022 11:43:47 +0000 (13:43 +0200)]
AMDGPU/GISel: Remove unused variable. NFC.
Nathan Sidwell [Mon, 6 Jun 2022 12:27:10 +0000 (05:27 -0700)]
[clang][pr55896]:co_yield/co_await thread-safety
co_await and co_yield are represented by (classes derived from)
CoroutineSuspendExpr. That has a number of child nodes, not all of
which are used for code-generation. In particular the operand is
represented multiple times and, as with the co_return problem
(55406), it must be emitted in the CFG exactly once. The operand
also appears inside OpaqueValueExprs, but that's ok.
This adds a visitor for SuspendExprs to emit the required children in
the correct order. Note that this CFG is pre-coro xform. We don't
have initial or final suspend points.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D127236
Johannes Doerfert [Thu, 19 May 2022 18:27:08 +0000 (13:27 -0500)]
[Attributor][NFC] Improve debug code and comments
Johannes Doerfert [Thu, 19 May 2022 18:12:49 +0000 (13:12 -0500)]
[Attributor] Add checks needed as we strengthen value simplify
Johannes Doerfert [Thu, 19 May 2022 18:09:05 +0000 (13:09 -0500)]
[Attributor] Look at base values for align, nonnull, and deref
Stripping bitcasts and 0-geps helps normalization and minimizes the
impact of a follow up change.
Johannes Doerfert [Thu, 19 May 2022 17:53:41 +0000 (12:53 -0500)]
[Attributor] Simplify loads from constant globals
If a global is constant and the initializer is known we can simplify
loads from it as the value has to be the initializer.
Martin Storsjö [Thu, 9 Jun 2022 11:09:40 +0000 (14:09 +0300)]
[lldb] Silence a GCC warning about missing returns after a fully covered switch. NFC.
Alvin Wong [Thu, 9 Jun 2022 08:44:49 +0000 (11:44 +0300)]
[lldb] Add gnu-debuglink support for Windows PE/COFF
The specification of gnu-debuglink can be found at:
https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
The file CRC or the CRC value from the .gnu_debuglink section is now
used to calculate the module UUID as a fallback, to allow verifying that
the debug object does match the executable. Note that if a CodeView
build id exists, it still takes precedence. This works even for MinGW
builds because LLD writes a synthetic CodeView build id which does not
get stripped from the debug object.
The `Minidump/Windows/find-module` test also needs a fix by adding a
CodeView record to the exe to match the one in the minidump, otherwise
it fails due to the new UUID calculated from the file CRC.
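As a rough illustration of the file CRC mentioned above: the GDB
documentation describes the .gnu_debuglink checksum as a 32-bit CRC over
the debug file's contents. The sketch below assumes this is the common
reflected CRC-32 (polynomial 0xEDB88320) and computes it bit by bit. The
function name crc32Bitwise is hypothetical, and how LLDB derives the UUID
from the CRC is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bitwise reflected CRC-32 over a byte buffer.
// Assumption: this matches the checksum stored in .gnu_debuglink.
uint32_t crc32Bitwise(const unsigned char *data, std::size_t len) {
  assert(data != nullptr || len == 0);
  uint32_t crc = 0xFFFFFFFFu; // initial value (inverted zero)
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // Shift right; apply the polynomial when the shifted-out bit was set.
      uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return ~crc; // final inversion
}
```

A table-driven variant would be faster; the bitwise form is used here
only because it is short.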
Fixes https://github.com/llvm/llvm-project/issues/54344
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D126367
Nicolai Hähnle [Fri, 29 Apr 2022 18:21:56 +0000 (13:21 -0500)]
AMDGPU/GISel: Introduce custom legalization of G_MUL
The generic legalizer framework is still used to reduce the problem
to scalar multiplication with the bit size a multiple of 32.
Generating optimal code sequences for big integer multiplication is
somewhat tricky and has a number of target-specific intricacies:
- The target has V_MAD_U64_U32 instructions that multiply two 32-bit
factors and add a 64-bit accumulator. Most partial products should
use this instruction.
- The accumulator is mapped to consecutive 32-bit GPRs, and partial-
product multiply-adds can feed the accumulator into each other
directly. (The register allocator's support for that is somewhat
limited, but that only matters for 128-bit integers and larger.)
- OTOH, on some hardware, V_MAD_U64_U32 requires the accumulator
to be stored in an even-aligned pair of GPRs. To avoid excessive
register copies, it makes sense to compute odd partial products
separately from even partial products (where a partial product
src0[j0] * src1[j1] is "odd" if j0 + j1 is odd) and add both
halves together as a final step.
- We can combine G_MUL+G_ADD into a single cascade of multiply-adds.
- The target can keep many carry-bits in flight simultaneously, so
combining carries using G_UADDE is preferable over G_ZEXT + G_ADD.
- Not addressed by this patch: When the factors are sign-extended,
the V_MAD_I64_I32 instruction (signed version!) can be used.
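To make the role of a 32 x 32 + 64 bit multiply-add concrete, here is a
plain schoolbook sketch over 32-bit limbs. It is not the code sequence
emitted by this patch: madU64U32 is a hypothetical stand-in for
V_MAD_U64_U32 (ignoring any carry-out), and the odd/even split and the
carry handling described above are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 32 x 32 + 64 -> 64 bit multiply-add.
inline uint64_t madU64U32(uint32_t a, uint32_t b, uint64_t acc) {
  return static_cast<uint64_t>(a) * b + acc;
}

// Multiply two little-endian arrays of 32-bit limbs (schoolbook method).
std::vector<uint32_t> mulLimbs(const std::vector<uint32_t> &lhs,
                               const std::vector<uint32_t> &rhs) {
  std::vector<uint32_t> result(lhs.size() + rhs.size(), 0);
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    uint32_t carry = 0;
    for (std::size_t j = 0; j < rhs.size(); ++j) {
      // Limb plus carry stays below 2^33, so the multiply-add cannot
      // overflow 64 bits.
      uint64_t acc = static_cast<uint64_t>(result[i + j]) + carry;
      uint64_t t = madU64U32(lhs[i], rhs[j], acc);
      result[i + j] = static_cast<uint32_t>(t); // low half stays in place
      carry = static_cast<uint32_t>(t >> 32);   // high half moves up
    }
    result[i + rhs.size()] = carry;
  }
  return result;
}
```

Each partial product lhs[i] * rhs[j] lands at limb i + j, which is the
index whose parity the "odd" vs. "even" distinction above refers to.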
It is difficult to address these points generically:
1) Finding matching pairs of G_MUL and G_UMULH to find a wide
multiply is expensive. We could add a G_UMUL_LOHI generic instruction
and conditionally use that in the generic legalizer, but by itself
this wouldn't allow us to use the accumulation capability of
V_MAD_U64_U32. One could attempt to find matching G_ADD + G_UADDE
post-legalization, but this is also expensive.
2) Similarly, making sense of the legalization outcome of a wide
pre-legalization G_MUL+G_ADD pair is extremely expensive.
3) How could the generic legalizer possibly deal with the
particular idiosyncrasy of "odd" vs. "even" partial products?
All this points in the direction of directly emitting an ideal code
sequence during legalization, but the generic legalizer should not
be burdened with such overly target-specific concerns. Hence, a
custom legalization.
Note that the implemented approach is different from that used by
SelectionDAG because narrowing of scalars works differently in
general. SelectionDAG iteratively cuts wide scalars into low and
high halves until a legal size is reached. By contrast, GlobalISel
does the narrowing in a single shot, which should be better for
compile-time and for the quality of the generated code.
This patch leaves three gaps open:
1. When the factors are uniform, we should execute the multiplication on
the SALU. Register bank mapping already ensures this.
However, the resulting code sequence is not optimal because it doesn't
fully use the carry-in capabilities of S_ADDC_U32. (V_MAD_U64_U32
doesn't have a carry-in.) It is very difficult to fix this after the
fact, so we should really use a different legalization sequence in
this case. Unfortunately, we don't have a divergence analysis and so
cannot make that choice.
(This only matters for 128-bit integers and larger.)
2. Avoid unnecessary multiplies when sources are known to be zero- or
sign-extended. The challenge is that the legalizer does not currently
have access to GISelKnownBits.
3. When the G_MUL is followed by a G_ADD, we should consider combining
the two instructions into a single multiply-add sequence, to utilize
the accumulator of V_MAD_U64_U32 fully. (Unless the multiply has
multiple uses and the implied duplication of the multiply is an
overall negative). However, this is also not true when the factors
are uniform: in that case, it is generally better to *not* combine
the two operations, so that the multiply can be done on the SALU.
Again, we don't have a divergence analysis available and so cannot
make an informed choice.
Differential Revision: https://reviews.llvm.org/D124844
Simon Pilgrim [Thu, 9 Jun 2022 11:18:14 +0000 (12:18 +0100)]
[X86] canonicalizeShuffleWithBinOps - add TODO for X86ISD::ANDNP bitwise handling
It's just as safe to move shuffles across X86ISD::ANDNP as any other logical bitop, they just tend to appear too late to matter.
Noticed while triaging D127115 regressions.
Benjamin Kramer [Thu, 9 Jun 2022 11:09:46 +0000 (13:09 +0200)]
Fix complex.conj integration test
- It doesn't actually print the fractional part if the result is a whole number
- One of the expectations was just wrong
Florian Hahn [Thu, 9 Jun 2022 11:05:37 +0000 (12:05 +0100)]
[VPlan] Replace remaining use of needsScalarIV.
All information is already available in VPlan. Note that there are some
test changes, because we now can correctly look through instructions
like truncates to analyze the actual users.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D123541
Jose Manuel Monsalve Diaz [Thu, 9 Jun 2022 10:46:03 +0000 (10:46 +0000)]
Revert "[LIBOMPTARGET] Adding AMD to llvm-omp-device-info"
This reverts commit d16a0877d8ac12a49fc75ae651247f338d46fead.
Matheus Izvekov [Tue, 24 May 2022 16:21:34 +0000 (18:21 +0200)]
cmake: use llvm dir variables for clang/utils/hmaptool
Copy hmaptool using the paths for CURRENT_TOOLS_DIR, so
everything goes in the right place in case llvm is included
from a top level CMakeLists.txt.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D126308
Paul Pluzhnikov [Thu, 9 Jun 2022 10:11:25 +0000 (12:11 +0200)]
[clangd] Minor refactor of CanonicalIncludes::addSystemHeadersMapping.
Before commit b3a991df3cd6a SystemHeaderMap used to be a vector.
Commit b3a991df3cd6a changed it into a map, but neglected to remove
duplicate keys (e.g. "bits/typesizes.h", "include/stdint.h", etc.).
To prevent confusion, remove all duplicates, build HeaderMapping
one pair at a time and assert() that no duplicates are found.
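A minimal sketch of the "one pair at a time, assert on duplicates" idea,
not the actual clangd code. buildHeaderMapping and the HeaderMapping
alias are hypothetical names, and the mapped values used in any example
table are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using HeaderMapping = std::unordered_map<std::string, std::string>;

// Hypothetical helper: insert each pair individually and assert that the key
// was not already present, so a duplicate entry is caught in assert-enabled
// builds instead of being silently dropped.
HeaderMapping buildHeaderMapping(
    const std::vector<std::pair<std::string, std::string>> &pairs) {
  HeaderMapping mapping;
  for (const auto &entry : pairs) {
    bool inserted = mapping.emplace(entry.first, entry.second).second;
    assert(inserted && "duplicate key in system header mapping");
    (void)inserted; // avoid an unused-variable warning when NDEBUG is defined
  }
  return mapping;
}
```

Constructing the map directly from a list would keep only one entry for a
repeated key without any diagnostic, which is why the sketch inserts
pair by pair.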
Change by Paul Pluzhnikov (ppluzhnikov)!
Reviewed By: ilya-biryukov
Differential Revision: https://reviews.llvm.org/D125742
Kiran Chandramohan [Thu, 9 Jun 2022 10:01:42 +0000 (10:01 +0000)]
[Flang] Temporary fix for conversion materialization
Simply add a source and target materialization handler that do nothing
and that override the default handlers that would add illegal
LLVM::DialectCastOp otherwise.
This is the simplest workaround, but not an actual fix; something may be
inconsistent after D82831 (most likely, fir lowering to llvm happens in a
way that the mlir infrastructure does not expect after D82831).
Here is a minimal reproducer of what the issue was:
```
func @foop(%a : !fir.real<4>) -> ()
func @bar(%a : !fir.real<2>) {
%1 = fir.convert %a : (!fir.real<2>) -> !fir.real<4>
call @foop(%1) : (!fir.real<4>) -> ()
return
}
```
tco -o - output was:
```
error: 'llvm.mlir.cast' op type must be non-index integer types, float types, or vector of mentioned types.
llvm.func @foop(!llvm.float)
llvm.func @bar(%arg0: !llvm.half) {
  %0 = llvm.fpext %arg0 : !llvm.half to !llvm.float
  %1 = llvm.mlir.cast %0 : !llvm.float to !fir.real<4>
  llvm.call @foop(%1) : (!fir.real<4>) -> ()
  llvm.return
}
```
This patch disables the introduction of the llvm.mlir.cast and preserves the previous behavior.
Also fixes https://github.com/llvm/llvm-project/issues/55210.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D127212
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Haojian Wu [Thu, 9 Jun 2022 09:42:29 +0000 (11:42 +0200)]
[pseudo] Add grammar annotations support.
Add annotation handling ([key=value]) in the BNF grammar parser, which
will be used in the conditional reduction, and error recovery.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D126536
Denis Antrushin [Thu, 9 Jun 2022 09:35:52 +0000 (16:35 +0700)]
[FixupStatepoints] Precommit test for D127308. NFC
Johannes Doerfert [Thu, 19 May 2022 17:35:52 +0000 (12:35 -0500)]
[Attributor] Generalize interface from ConstantInt to Constant
We can use constant to allow undef and there is no need to force
integers in the API anyway. The user can decide if a non integer
constant is fine or not.
Johannes Doerfert [Thu, 19 May 2022 16:20:21 +0000 (11:20 -0500)]
[Attributor][FIX] Replace call site argument uses, not values
We need to be careful when replacing values, as a call site argument
(IRPosition::IRP_CALL_SITE_ARGUMENT) represents a use and not a
value. This patch changes the interface to take an IR position instead,
making it harder to misuse accidentally. It does not change our tests
right now, but a follow-up exposed the potential footgun.
Johannes Doerfert [Thu, 19 May 2022 16:03:01 +0000 (11:03 -0500)]
[Attributor] Simplify (integer range) state handling
We used to be very conservative when integer states were merged.
Instead of adding the known range (which is large due to uncertainty)
into the assumed range (which is hopefully small), we can merge both
at the same time, each into its respective counterpart. This ensures
we keep the invariant that assumed is part of known.
Johannes Doerfert [Tue, 10 May 2022 22:05:18 +0000 (18:05 -0400)]
[Attributor][NFC] Introduce helper struct
We often use a context associated with a value. For now only one use
case has been changed.
Johannes Doerfert [Wed, 13 Apr 2022 17:35:16 +0000 (12:35 -0500)]
[Attributor][FIX] Avoid metadata and duplicate replication assertion
When we recreate instructions as part of simplification we need to take
care of debug metadata and replacing the value multiple times. For now,
we handle both conservatively.
Biplob Mishra [Thu, 9 Jun 2022 09:58:30 +0000 (10:58 +0100)]
[InstCombine] Combine instructions of type or/and where AND masks can be combined.
The patch simplifies some of the patterns as below
(A | (B & C0)) | (B & C1) -> A | (B & (C0 | C1))
((B & C0) | A) | (B & C1) -> (B & (C0 | C1)) | A
In some scenarios like byte reverse on half word, we can see this pattern multiple times and this conversion can optimize these patterns.
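As a quick sanity check of the first pattern, the two forms can be compared directly in plain C++. This is only a sketch of the algebraic identity, not the InstCombine implementation; the function names are hypothetical.

```cpp
#include <cstdint>

// Form before the transform: B is masked twice and the results are or'ed.
constexpr uint32_t beforeFold(uint32_t A, uint32_t B, uint32_t C0,
                              uint32_t C1) {
  return (A | (B & C0)) | (B & C1);
}

// Form after the transform: the two constant masks are merged first.
constexpr uint32_t afterFold(uint32_t A, uint32_t B, uint32_t C0,
                             uint32_t C1) {
  return A | (B & (C0 | C1));
}

// Exhaustively compare both forms over all 4-bit inputs (16^4 combinations).
constexpr bool formsAgreeOn4BitInputs() {
  for (uint32_t A = 0; A < 16; ++A)
    for (uint32_t B = 0; B < 16; ++B)
      for (uint32_t C0 = 0; C0 < 16; ++C0)
        for (uint32_t C1 = 0; C1 < 16; ++C1)
          if (beforeFold(A, B, C0, C1) != afterFold(A, B, C0, C1))
            return false;
  return true;
}
```

The identity follows from distributing `&` over `|`; the saving is one `and` instruction when C0 and C1 are constants, since `C0 | C1` can be computed at compile time.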
Additionally this commit fixes the issue reported with the test case.
int f(int a, int b) {
  int c = ((unsigned char)(a >> 23) & 925);
  if (a)
    c = (a >> 23 & b) | ((unsigned char)(a >> 23) & 925) | (b >> 23 & 157);
  return c;
}
The previous revision/commit did not check one-use of an intermediate value that this transform re-uses.
When that value has another use, an existing transform will try to invert the transform here.
By adding one-use checks, we avoid the infinite loops seen with the earlier commit.
Differential Revision: https://reviews.llvm.org/D124119
Andrzej Warzynski [Fri, 3 Jun 2022 10:29:30 +0000 (10:29 +0000)]
[flang] Add RUN lines using `fir-opt`
In tests that define a pass pipeline to use, add a RUN line using fir-opt.
Differential Revision: https://reviews.llvm.org/D126955
Kiran Chandramohan [Thu, 9 Jun 2022 09:34:21 +0000 (09:34 +0000)]
[Flang][OpenMP] Lower schedule modifiers for worksharing loop
Add support for lowering the schedule modifiers (simd, monotonic,
non-monotonic) in worksharing loops.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D127311
Co-authored-by: Mats Petersson <mats.petersson@arm.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Alex Zinenko [Thu, 9 Jun 2022 09:10:32 +0000 (11:10 +0200)]
[mlir] Introduce Transform ops for loops
Introduce transform ops for "for" loops, in particular for peeling, software
pipelining and unrolling, along with a couple of "IR navigation" ops. These ops
are intended to be generalized to different kinds of loops when possible and
therefore use the "loop" prefix. They currently live in the SCF dialect as
there is no clear place to put transform ops that may span across several
dialects; this decision is postponed until the ops actually need to handle
non-SCF loops.
Additionally refactor some common utilities for transform ops into trait or
interface methods, and change the loop pipelining to be a returning pattern.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127300
Andrzej Warzynski [Wed, 8 Jun 2022 14:37:12 +0000 (14:37 +0000)]
[flang][driver] Generate run-time type info
This is a small follow-up for https://reviews.llvm.org/D120051. It makes
sure that tables with "run-time type information for derived types" are
generated for code-gen actions. Originally, only non-code-gen actions
were updated (i.e. actions that were fully supported at that time).
Differential Revision: https://reviews.llvm.org/D127307
Haojian Wu [Wed, 8 Jun 2022 09:59:40 +0000 (11:59 +0200)]
[pseudo] Simplify the glrReduce implementation.
glrReduce maintains two priority queues (one for bases, the other for
Sequence). These queues run in parallel with each other, corresponding to a
single family, and can be folded into one.
This patch removes the bases priority queue, which improves glrParse by
10%.
ASTReader.cpp: 2.03M/s (before) vs 2.26M/s (after)
Differential Revision: https://reviews.llvm.org/D127283
owenca [Thu, 9 Jun 2022 00:07:04 +0000 (17:07 -0700)]
[clang-format][NFC] Format lib/Format and unittests/Format in clang
Reformat these directories with InsertBraces and RemoveBracesLLVM.
Differential Revision: https://reviews.llvm.org/D127366
Haojian Wu [Wed, 8 Jun 2022 11:26:53 +0000 (13:26 +0200)]
[pseudo] Remove the explicit Accept actions.
As pointed out in the previous review section, having a dedicated accept
action doesn't seem to be necessary. This patch implements the same behavior
without an accept action, which reduces code complexity.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D125677
Haojian Wu [Thu, 9 Jun 2022 09:18:03 +0000 (11:18 +0200)]
[pseudo] Fix a sign-compare warning in debug build, NFC.
lewuathe [Thu, 9 Jun 2022 09:11:44 +0000 (11:11 +0200)]
[mlir][complex] Correctness check for complex.conj
Add correctness check for complex.conj operation
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D127377
Sam Parker [Thu, 9 Jun 2022 08:41:44 +0000 (08:41 +0000)]
[ARM][ParallelDSP] Fix self reference bug
Ensure we don't generate a smlad intrinsic that takes itself as an
argument.
Differential Revision: https://reviews.llvm.org/D127213
Guillaume Chatelet [Wed, 8 Jun 2022 09:39:49 +0000 (09:39 +0000)]
[SelectionDAG] Handle bzero/memset libcalls globally instead of per target
Differential Revision: https://reviews.llvm.org/D127279
Florian Hahn [Thu, 9 Jun 2022 08:24:06 +0000 (09:24 +0100)]
[cmake] Add missing dependencies to objlib in add_llvm_executable.
After f06abbb393800b0d466c88e283c06f75561c432c I have been seeing build
failures due to the obj.clang target missing a dependency on
tools/clang/clang-tablegen-targets.
This appears to be due to the fact that LLVM_COMMON_DEPENDS are not added
as dependencies to the object library.
This patch uses the same logic as llvm_add_library to register
dependencies for object libraries.
Reviewed By: beanz, abrachet, steven_wu
Differential Revision: https://reviews.llvm.org/D127318
Chenbing Zheng [Thu, 9 Jun 2022 08:15:42 +0000 (16:15 +0800)]
[InstCombine] improve fold for icmp-ugt-ashr
The existing condition for the fold
icmp ugt (ashr X, ShAmtC), C --> icmp ugt X, ((C + 1) << ShAmtC) - 1
missed some boundary cases, so the fold did not apply in some cases.
The cause was signed number overflow.
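To see why the condition needs care at the boundaries, the fold can be brute-forced on 8-bit values. The sketch below is not the InstCombine condition itself; `ashr8` and `foldHoldsFor` are hypothetical helpers that just report whether the two compares agree for every X.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on an 8-bit value, written out explicitly.
inline uint8_t ashr8(uint8_t X, unsigned ShAmt) {
  assert(ShAmt < 8 && "shift amount must be smaller than the bit width");
  uint8_t Result = static_cast<uint8_t>(X >> ShAmt);
  if (X & 0x80) // Negative input: fill the vacated high bits with ones.
    Result |= static_cast<uint8_t>(0xFFu << (8 - ShAmt));
  return Result;
}

// Returns true if, for every 8-bit X,
//   (ashr X, ShAmt) >u C   and   X >u ((C + 1) << ShAmt) - 1
// give the same answer, with the new constant wrapped to 8 bits.
inline bool foldHoldsFor(uint8_t C, unsigned ShAmt) {
  uint8_t NewC =
      static_cast<uint8_t>(((static_cast<unsigned>(C) + 1) << ShAmt) - 1);
  for (unsigned X = 0; X < 256; ++X) {
    bool Before = ashr8(static_cast<uint8_t>(X), ShAmt) > C;
    bool After = static_cast<uint8_t>(X) > NewC;
    if (Before != After)
      return false;
  }
  return true;
}
```

For example, `foldHoldsFor(1, 1)` is true (the new constant is 3), while `foldHoldsFor(0x7F, 1)` is false because `(C + 1) << ShAmtC` wraps around in 8 bits. Any legality check therefore has to account for overflow near the signed boundary.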
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D127188
Lian Wang [Wed, 8 Jun 2022 01:52:29 +0000 (01:52 +0000)]
[RISCV][VP] Add fp test of widen and split for vp.setcc
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D127079
Nikita Popov [Wed, 8 Jun 2022 10:51:29 +0000 (12:51 +0200)]
[IndVarSimplify] Don't assert that terminator is not SCEVable (PR55925)
The IV widening code currently asserts that terminators aren't SCEVable
-- however, this is not the case for invokes with a returned attribute.
As far as I can tell, this assertion is not necessary -- even if we
have a critical edge (the second test case), the trunc gets inserted
in a legal position.
Fixes https://github.com/llvm/llvm-project/issues/55925.
Differential Revision: https://reviews.llvm.org/D127288
Nicolai Hähnle [Fri, 29 Apr 2022 23:59:37 +0000 (18:59 -0500)]
ADT/ArrayRef: Add makeMutableArrayRef overloads
Equivalent overloads already exist for makeArrayRef.
Differential Revision: https://reviews.llvm.org/D126421
Lian Wang [Thu, 9 Jun 2022 07:34:54 +0000 (07:34 +0000)]
[RISCV][test] Add widen STEP_VECTOR tests.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127371
Christian Kandeler [Thu, 9 Jun 2022 07:38:12 +0000 (03:38 -0400)]
[include-cleaner] Fix build error in unit test
Reviewed By: nridge
Differential Revision: https://reviews.llvm.org/D127217
lorenzo chelini [Thu, 9 Jun 2022 07:05:44 +0000 (09:05 +0200)]
[MLIR][Math] Re-order conversions alphabetically (NFC)
Minor follow-up after: D127286 (https://reviews.llvm.org/D127286/new/)
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D127382
Matthias Gehre [Wed, 8 Jun 2022 09:31:52 +0000 (10:31 +0100)]
clang: Introduce -fexperimental-max-bitint-width
This splits off the introduction of -fexperimental-max-bitint-width
from https://reviews.llvm.org/D122234
because that PR is still blocked on discussions on the backend side.
I was asked [0] to upstream at least the flag.
[0] https://github.com/llvm/llvm-project/commit/09854f2af3b914b616f29cb640bede3a27cf7c4e#commitcomment-75116619
Differential Revision: https://reviews.llvm.org/D127287
Bruno Cardoso Lopes [Wed, 8 Jun 2022 07:54:05 +0000 (00:54 -0700)]
[Clang][CoverageMapping] Fix compile time explosions by adjusting only appropriated skipped ranges
D83592 added comments to be part of skipped regions, and as part of that, it
also shrinks a skipped range if it spans a line that contains a non-comment
token. This is done by `adjustSkippedRange`.
`adjustSkippedRange` currently runs on skipped regions that are not
comments, causing a 5min regression while building big C++ files without any
comments.
Fix the compile-time regression introduced in D83592 by tagging SkippedRange
with kind information and using that to decide what needs additional processing.
Differential Revision: https://reviews.llvm.org/D127338
David Carlier [Thu, 9 Jun 2022 05:07:26 +0000 (06:07 +0100)]
[Sanitizers] prctl interception update for the PR_SET_VMA option case.
Supported on Android, and on Linux since 5.17.
Reviewers: vitalybuka, eugenis
Reviewed-By: vitalybuka
Differential Revision: https://reviews.llvm.org/D127326
Fangrui Song [Thu, 9 Jun 2022 04:11:54 +0000 (21:11 -0700)]
[DeadArgElim] Remove dead code after r128810
Jez Ng [Thu, 9 Jun 2022 03:41:29 +0000 (23:41 -0400)]
[lld-macho] Support EH frames under arm64
For arm64, llvm-mc emits relocations for the target function
address like so:
ltmp:
<CIE start>
...
<CIE end>
... multiple FDEs ...
<FDE start>
<target function address - (ltmp + pcrel offset)>
...
If any of the FDEs in `multiple FDEs` get dead-stripped, then `FDE start`
will move to an earlier address, and `ltmp + pcrel offset` will no longer
reflect an accurate pcrel value. To avoid this problem, we "canonicalize"
our relocation by adding an `EH_Frame` symbol at `FDE start`, and updating
the reloc to be `target function address - (EH_Frame + new pcrel offset)`.
Reviewed By: #lld-macho, Roger
Differential Revision: https://reviews.llvm.org/D124561
Jez Ng [Thu, 9 Jun 2022 03:40:52 +0000 (23:40 -0400)]
[lld-macho] Initial support for EH Frames
== Background ==
`llvm-mc` generates unwind info in both compact unwind and DWARF
formats. LLD already handles the compact unwind format; this diff gets
us close to handling the DWARF format properly.
== Caveats ==
It's not quite done yet, but I figure it's worth getting this reviewed
and landed first as it's shaping up to be a fairly large code change.
**Known limitations of the current code:**
* Only works for x86_64, for which `llvm-mc` emits "abs-ified"
  relocations as described in https://github.com/llvm/llvm-project/commit/618def651b59bd42c05bbd91d825af2fb2145683.
  `llvm-mc` emits regular relocations for ARM EH frames, which we do not
  yet handle correctly.
Since the feature is not ready for real use yet, I've gated it behind a
flag that only gets toggled on during test suite runs. With most of the
new code disabled, we see just a hint of perf regression, so I don't
think it'd be remiss to land this as-is:
            base           diff           difference (95% CI)
sys_time    1.926 ± 0.168  1.979 ± 0.117  [ -1.2% .. +6.6%]
user_time   3.590 ± 0.033  3.606 ± 0.028  [ +0.0% .. +0.9%]
wall_time   7.104 ± 0.184  7.179 ± 0.151  [ -0.2% .. +2.3%]
samples     30             31
== Design ==
Like compact unwind entries, EH frames are also represented as regular
ConcatInputSections that get pointed to via `Defined::unwindEntry`. This
allows them to be handled generically by e.g. the MarkLive and ICF
code. (But note that unlike compact unwind subsections, EH frame
subsections do end up in the final binary.)
In order to make EH frames "look like" a regular ConcatInputSection,
some processing is required. First, we need to split the `__eh_frame`
section along EH frame boundaries rather than along symbol boundaries.
We do this by decoding the length field of each EH frame. Second, the
abs-ified relocations need to be turned into regular Relocs.
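Splitting along EH frame boundaries by decoding length fields can be sketched as below. This is not LLD's code: `FrameRange`, `readLE` and `splitEhFrames` are hypothetical names, and the sketch assumes little-endian data and the usual `.eh_frame` length encoding (a 4-byte length that excludes itself, the value 0xffffffff escaping to an 8-byte extended length, and a zero length acting as terminator).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte range of one CIE/FDE record, including its length field(s).
struct FrameRange {
  size_t Offset;
  size_t Size;
};

// Read a little-endian unsigned integer of NumBytes bytes.
static uint64_t readLE(const uint8_t *P, unsigned NumBytes) {
  uint64_t V = 0;
  for (unsigned I = 0; I < NumBytes; ++I)
    V |= static_cast<uint64_t>(P[I]) << (8 * I);
  return V;
}

// Walk the section and return one range per record. Stops at a zero-length
// terminator or at the first truncated record.
std::vector<FrameRange> splitEhFrames(const std::vector<uint8_t> &Data) {
  std::vector<FrameRange> Frames;
  size_t Off = 0;
  while (Data.size() - Off >= 4) {
    uint64_t Length = readLE(Data.data() + Off, 4);
    size_t HeaderSize = 4;
    if (Length == 0)
      break; // Terminator entry.
    if (Length == 0xffffffffULL) {
      // Extended form: the real length follows as an 8-byte field.
      if (Data.size() - Off < 12)
        break;
      Length = readLE(Data.data() + Off + 4, 8);
      HeaderSize = 12;
    }
    if (Length > Data.size() - Off - HeaderSize)
      break; // Record runs past the end of the section.
    size_t Size = HeaderSize + static_cast<size_t>(Length);
    Frames.push_back({Off, Size});
    Off += Size;
  }
  return Frames;
}
```

Each returned range would then correspond to one subsection, independent of where symbols happen to fall inside `__eh_frame`.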
== Next Steps ==
In order to support EH frames on ARM targets, we will either have to
teach LLD how to handle EH frames with explicit relocs, or we can try to
make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do
the latter as I think it will make the LLD implementation both simpler
and faster to execute.
== Misc ==
The `obj-file-with-stabs.s` test had to be updated as the previous
version would trip assertion errors in the code. It appears that in our
attempt to produce a minimal YAML test input, we created a file with
invalid EH frame data. I've fixed this by re-generating the YAML and not
doing any hand-pruning of it.
Reviewed By: #lld-macho, Roger
Differential Revision: https://reviews.llvm.org/D123435
Mogball [Thu, 9 Jun 2022 03:02:21 +0000 (03:02 +0000)]
[mlir][ods] Mark StructAttr as deprecated
chenglin.bi [Thu, 9 Jun 2022 03:13:56 +0000 (11:13 +0800)]
[InstCombine] Add vector tests for shl+lshr+and transforms; NFC
D126617
Fangrui Song [Thu, 9 Jun 2022 03:08:05 +0000 (20:08 -0700)]
[msan][test] Fix cpusetsize for another pthread_getaffinity_np.cpp test
Similar to D127368