Sanjay Patel [Tue, 26 Jul 2022 19:31:12 +0000 (15:31 -0400)]
[AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
This is an alternative to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.
If a sqrt call has "nnan", that implies that the input argument is never
negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).
This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.
This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.
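The reasoning above can be sketched as a small Python model (not LLVM code). The helper names `sqrt_libcall`, `sqrt_intrinsic` and `can_use_intrinsic` are hypothetical; only the "nnan" flag and TTI.haveFastSqrt() come from the description above.

```python
import errno
import math


def sqrt_libcall(x):
    """Model of a C library sqrt: returns (result, errno value it would set)."""
    if x < 0:
        # A negative input yields NaN and is the case where errno gets set.
        return math.nan, errno.EDOM
    return math.sqrt(x), 0


def sqrt_intrinsic(x):
    """Model of the intrinsic: same result, but errno is never touched."""
    if x < 0:
        return math.nan
    return math.sqrt(x)


def can_use_intrinsic(has_nnan, have_fast_sqrt):
    """Hypothetical predicate mirroring the two conditions in the description.

    has_nnan:       the call carries "nnan", so the argument is never negative.
    have_fast_sqrt: stands in for TTI.haveFastSqrt(), i.e. no libcall is needed.
    """
    return has_nnan and have_fast_sqrt


# For non-negative inputs the two models agree and errno stays untouched.
for value in (0.0, 1.0, 4.0, 2.25):
    result, err = sqrt_libcall(value)
    assert err == 0 and result == sqrt_intrinsic(value)
```

Under this model the libcall and the intrinsic differ only for negative inputs, which is exactly the case "nnan" rules out.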
Differential Revision: https://reviews.llvm.org/D129167
Shilei Tian [Tue, 26 Jul 2022 19:39:00 +0000 (15:39 -0400)]
[Clang][Doc] Update the release note for clang
Add the support for `atomic compare` and `atomic compare capture` in the
release note of clang.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D129211
Danny Mösch [Sun, 17 Jul 2022 19:28:36 +0000 (21:28 +0200)]
[clang] Pass FoundDecl to DeclRefExpr creator for operator overloads
Without the "found declaration" it is later not possible to know where the operator declaration
was brought into the scope calling it.
The initial motivation for this fix came from #55095. However, this also has an influence on
`clang -ast-dump` which now prints a `UsingShadow` attribute for operators only visible through
`using` statements. Also, clangd now correctly references the `using` statement instead of the
operator directly.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D129973
Walter Erquinigo [Tue, 26 Jul 2022 18:44:50 +0000 (11:44 -0700)]
Move GetControlFlowKind's logic to DisassemblerLLVMC.cpp
This diff moves the logic of `GetControlFlowKind()` from Disassembler.cpp to DisassemblerLLVMC.cpp.
Details:
- The actual logic of `GetControlFlowKind()` moves to `DisassemblerLLVMC.cpp`, where we can check the underlying architecture using `DisassemblerScope`.
- With this change, passing 'triple' to `GetControlFlowKind()` is no longer required.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D130320
Walter Erquinigo [Mon, 18 Jul 2022 23:56:01 +0000 (16:56 -0700)]
[trace][intel pt] Introduce wall clock time for each trace item
- Decouple TSCs from trace items
- Turn TSCs into events just like CPUs. The new name is HW clock tick, which could be reused by other vendors.
- Add a GetWallTime that returns the wall time that the trace plug-in can infer for each trace item.
- For intel pt, we are doing the following interpolation: if an instruction takes less than 1 TSC, we use that duration, otherwise, we assume the instruction took 1 TSC. This helps us avoid having to handle context switches, changes to kernel, idle times, decoding errors, etc. We are just trying to show some approximation and not the real data. For the real data, TSCs are the way to go. Besides that, we are making sure that no two trace items will give the same interpolation value. Finally, we are using as time 0 the time at which tracing started.
Sample output:
```
(lldb) r
Process 750047 launched: '/home/wallace/a.out' (x86_64)
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x0000000000402479 a.out`main at main.cpp:29:20
26 };
27
28 int main() {
-> 29 std::vector<int> vvv;
30 for (int i = 0; i < 100; i++)
31 vvv.push_back(i);
32
(lldb) process trace start -s 64kb -t --per-cpu
(lldb) b 60
Breakpoint 2: where = a.out`main + 1689 at main.cpp:60:23, address = 0x0000000000402afe
(lldb) c
Process 750047 resuming
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 2.1
frame #0: 0x0000000000402afe a.out`main at main.cpp:60:23
57 map<int, int> m;
58 m[3] = 4;
59
-> 60 map<string, string> m2;
61 m2["5"] = "6";
62
63 std::vector<std::string> vs = {"2", "3"};
(lldb) thread trace dump instructions -t -f -e
thread #1: tid = 750047
0: [379567.000 ns] (event) HW clock tick [48599428476224707]
1: [379569.000 ns] (event) CPU core changed [new CPU=2]
2: [390487.000 ns] (event) HW clock tick [48599428476246495]
3: [1602508.000 ns] (event) HW clock tick [48599428478664855]
4: [1662745.000 ns] (event) HW clock tick [48599428478785046]
libc.so.6`malloc
5: [1662746.995 ns] 0x00007ffff7176660 endbr64
6: [1662748.991 ns] 0x00007ffff7176664 movq 0x32387d(%rip), %rax ; + 408
7: [1662750.986 ns] 0x00007ffff717666b pushq %r12
8: [1662752.981 ns] 0x00007ffff717666d pushq %rbp
9: [1662754.977 ns] 0x00007ffff717666e pushq %rbx
10: [1662756.972 ns] 0x00007ffff717666f movq (%rax), %rax
11: [1662758.967 ns] 0x00007ffff7176672 testq %rax, %rax
12: [1662760.963 ns] 0x00007ffff7176675 jne 0x9c7e0 ; <+384>
13: [1662762.958 ns] 0x00007ffff717667b leaq 0x17(%rdi), %rax
14: [1662764.953 ns] 0x00007ffff717667f cmpq $0x1f, %rax
15: [1662766.949 ns] 0x00007ffff7176683 ja 0x9c730 ; <+208>
16: [1662768.944 ns] 0x00007ffff7176730 andq $-0x10, %rax
17: [1662770.939 ns] 0x00007ffff7176734 cmpq $-0x41, %rax
18: [1662772.935 ns] 0x00007ffff7176738 seta %dl
19: [1662774.930 ns] 0x00007ffff717673b jmp 0x9c690 ; <+48>
20: [1662776.925 ns] 0x00007ffff7176690 cmpq %rdi, %rax
21: [1662778.921 ns] 0x00007ffff7176693 jb 0x9c7b0 ; <+336>
22: [1662780.916 ns] 0x00007ffff7176699 testb %dl, %dl
23: [1662782.911 ns] 0x00007ffff717669b jne 0x9c7b0 ; <+336>
24: [1662784.906 ns] 0x00007ffff71766a1 movq 0x3236c0(%rip), %r12 ; + 24
(lldb) thread trace dump instructions -t -f -e -J -c 4
[
{
"id": 0,
"timestamp_ns": "379567.000000",
"event": "HW clock tick",
"hwClock": 48599428476224707
},
{
"id": 1,
"timestamp_ns": "379569.000000",
"event": "CPU core changed",
"cpuId": 2
},
{
"id": 2,
"timestamp_ns": "390487.000000",
"event": "HW clock tick",
"hwClock": 48599428476246495
},
{
"id": 3,
"timestamp_ns": "1602508.000000",
"event": "HW clock tick",
"hwClock": 48599428478664855
},
{
"id": 4,
"timestamp_ns": "1662745.000000",
"event": "HW clock tick",
"hwClock": 48599428478785046
},
{
"id": 5,
"timestamp_ns": "1662746.995324",
"loadAddress": "0x7ffff7176660",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "endbr64"
},
{
"id": 6,
"timestamp_ns": "1662748.990648",
"loadAddress": "0x7ffff7176664",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "movq"
},
{
"id": 7,
"timestamp_ns": "1662750.985972",
"loadAddress": "0x7ffff717666b",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "pushq"
},
{
"id": 8,
"timestamp_ns": "1662752.981296",
"loadAddress": "0x7ffff717666d",
"module": "libc.so.6",
"symbol": "malloc",
"mnemonic": "pushq"
}
]
```
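One possible reading of the interpolation described above, as a Python sketch rather than the plug-in's actual code: items that follow a HW clock tick share the interval up to the next tick, but each item is charged at most one tick's duration. The function name and parameters are hypothetical.

```python
def interpolate_wall_times(tick_ns, next_tick_ns, ns_per_tick, item_count):
    """Assign an approximate wall time (ns) to each item after a clock tick.

    tick_ns, next_tick_ns: wall times of two consecutive HW clock ticks.
    ns_per_tick:           duration of one tick, used as the per-item cap.
    item_count:            number of trace items seen between the two ticks.
    """
    if item_count == 0:
        return []
    span = next_tick_ns - tick_ns
    # Even share of the interval per item, never more than one tick.
    step = min(span / (item_count + 1), ns_per_tick)
    # A strictly positive step gives every item a distinct timestamp.
    return [tick_ns + step * (i + 1) for i in range(item_count)]
```

In the sample output above, consecutive instructions after item 4 are spaced about 1.995 ns apart, which is the kind of constant step this sketch produces when the cap applies.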
Differential Revision: https://reviews.llvm.org/D130054
Sanjay Patel [Tue, 26 Jul 2022 17:29:48 +0000 (13:29 -0400)]
[InstSimplify] remove redundant calls to 'isImplied'; NFCI
We already call the more general isImpliedCondition() (which calls
isImpliedTrueByMatchingCmp() internally) from simplifyAndInst()
and simplifyOrInst().
There was a difference visible with this change on a vector test
before a925bef70c6c, but I can't find any gaps now.
LLVM GN Syncbot [Tue, 26 Jul 2022 18:27:34 +0000 (18:27 +0000)]
[gn build] Port 4638d7a28f62
Blue Gaston [Tue, 26 Jul 2022 03:47:15 +0000 (20:47 -0700)]
[Sanitizers][Darwin] Allows '-mtargetos' to be used to set minimum deployment target.
Currently, m{platform}-version-min is the default flag used to set the min deployment target within compiler-rt and sanitizers.
However, clang uses flags -target and -mtargetos for setting target triple and minimum deployment targets.
-mtargetos will be the preferred flag to set min version in the future and the
${platform}-version-min flag will not be used for future platforms.
This change allows darwin platforms to use either ${platform}-min-version or -mtargetos
without breaking lit test flags that allow overriding the default min value in lit tests.
Tests using flags: 'darwin_min_target_with_tls_support', 'min_macos_deployment_target'
will no longer fail if they use mtargetos instead of version-min.
rdar://81028225
Differential Revision: https://reviews.llvm.org/D130542
Lambert, Jacob [Tue, 26 Jul 2022 18:22:31 +0000 (11:22 -0700)]
Revert "[clang-offload-bundler] Library-ize ClangOffloadBundler"
This reverts commit 8348c4095600ec2c0beee293267832799d2ebee3.
Francis Visoiu Mistrih [Wed, 20 Jul 2022 09:32:15 +0000 (11:32 +0200)]
[Matrix] Add assert to catch extracted vectors with poison elements
Assert when the extracted vector is wider than the row/column.
Differential Revision: https://reviews.llvm.org/D130173
Craig Topper [Tue, 26 Jul 2022 17:56:37 +0000 (10:56 -0700)]
[RISCV] Add Predicate to c.lw/c.sw/c.lwsp/c.swsp InstAliases with no offset.
These are aliases that allow the immediate offset to be omitted.
We had predicates for the RV64, RV32+F, and D versions, but
not the base versions.
I've also re-ordered them to share Predicate lines to improve
readability.
Francis Visoiu Mistrih [Wed, 20 Jul 2022 09:12:30 +0000 (11:12 +0200)]
[Matrix] Refactor tiled loops in a struct. NFC
The three loops have the same structure: index, header, latch.
Jessica Paquette [Tue, 26 Jul 2022 17:54:30 +0000 (10:54 -0700)]
[GlobalISel] Import patterns for G_FMAXIMUM + G_FMINIMUM
Allows us to select scalar instructions on AArch64.
Differential Revision: https://reviews.llvm.org/D115381
Sam Estep [Tue, 26 Jul 2022 17:54:13 +0000 (17:54 +0000)]
[clang][dataflow] Analyze calls to in-TU functions
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.
The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.
This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D130306
Craig Topper [Tue, 26 Jul 2022 17:39:20 +0000 (10:39 -0700)]
[RISCV] Minor fixes to rv64c-valid.s test.
-Missing CHECK-NO-EXT and CHECK-NO-RV64 on subw.
-Stray CHECK-NO-RV64 on c.slli.
-c.slli used immediate 1 instead of RV64 only immediate like 63.
-Missing CHECK-NO-EXT on c.srli and c.srai
Nico Weber [Tue, 26 Jul 2022 17:30:49 +0000 (13:30 -0400)]
[gn build] Port 8348c4095600
Jon Chesterfield [Tue, 26 Jul 2022 17:04:40 +0000 (18:04 +0100)]
[amdgpu][nfc] Skip operations on padding fields in LDS struct
Sam Estep [Tue, 26 Jul 2022 17:30:09 +0000 (17:30 +0000)]
Revert "[clang][dataflow] Analyze calls to in-TU functions"
This reverts commit fa2b83d07ecab3b24b4c5ee2e7dc4b6bbc895317.
Sam Estep [Tue, 26 Jul 2022 17:26:58 +0000 (17:26 +0000)]
[clang][dataflow] Analyze calls to in-TU functions
Depends On D130305
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.
The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.
This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D130306
Fangrui Song [Tue, 26 Jul 2022 17:16:49 +0000 (10:16 -0700)]
[MachineFunctionPass] Support -print-changed and -print-changed=quiet
-print-changed for new pass manager is handy beside -print-after-all.
Port it to MachineFunctionPass.
Note: lib/Passes/StandardInstrumentations.cpp implements a number of
misc features. If we want to use them for codegen, we may need to lift
some functionality to LLVMIR.
Reviewed By: aeubanks, jamieschmeiser
Differential Revision: https://reviews.llvm.org/D130434
Jim Ingham [Tue, 26 Jul 2022 17:11:16 +0000 (10:11 -0700)]
StackFrame::GetValueObjectForFrameVariable holds the StackFrame lock too long.
This can cause a deadlock if other threads use the common pattern of
"lock the StackFrameList, get a frame, lock the StackFrame."
Differential Revision: https://reviews.llvm.org/D130524
Jacob Lambert [Fri, 15 Jul 2022 00:00:26 +0000 (17:00 -0700)]
[clang-offload-bundler] Library-ize ClangOffloadBundler
Lifting the core functionalities of the clang-offload-bundler into a
user-facing library/API. This will allow online and JIT compilers to
bundle and unbundle files without spawning a new process.
This patch lifts the classes and functions used to implement
the clang-offload-bundler into a separate OffloadBundler.cpp,
and defines three top-level API functions in OffloadBundler.h.
BundleFiles()
UnbundleFiles()
UnbundleArchives()
This patch also introduces a Config class that locally stores the
previously global cl::opt options and arrays to allow users to call
the APIs in a multi-threaded context, and introduces an
OffloadBundler class to encapsulate the top-level API functions.
We also lift the BundlerExecutable variable, which is specific
to the clang-offload-bundler tool, from the API, and replace
its use with an ObjcopyPath variable. This variable must be set
in order to internally call llvm-objcopy.
Finally, we move the API files from
clang/tools/clang-offload-bundler into clang/lib/Driver and
clang/include/clang/Driver.
Differential Revision: https://reviews.llvm.org/D129873
Simon Pilgrim [Tue, 26 Jul 2022 16:58:08 +0000 (17:58 +0100)]
[DAG] matchRotateSub - set demanded bits to the shift amount type size, not the shift result size.
This should fix a report on D130251 of an assert due to a bitwidth mismatch in APInt::isSubsetOf
Fangrui Song [Tue, 26 Jul 2022 16:48:35 +0000 (09:48 -0700)]
[AArch64] Simplify BTI/PAC-RET module flags
These module flags use the Min merge behavior with a default value of
zero, so we don't need to emit them if zero.
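A minimal Python model of why omitting a zero-valued flag is safe, assuming (as stated above) that a missing flag counts as zero under the Min merge behavior. The helper names are hypothetical and this is not the LLVM linker code.

```python
def merge_min_flag(dst_flags, src_flags, name):
    """Merge one 'Min' module flag; an absent flag is treated as zero."""
    return min(dst_flags.get(name, 0), src_flags.get(name, 0))


def flags_to_emit(flags):
    """Drop flags whose value is zero, since zero is already the default."""
    return {key: value for key, value in flags.items() if value != 0}


# Emitting an explicit zero and omitting the flag merge to the same value.
explicit = {"branch-target-enforcement": 0, "sign-return-address": 1}
omitted = flags_to_emit(explicit)
other = {"branch-target-enforcement": 1, "sign-return-address": 1}
for flag in explicit:
    assert merge_min_flag(explicit, other, flag) == merge_min_flag(omitted, other, flag)
```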
Reviewed By: danielkiss
Differential Revision: https://reviews.llvm.org/D130145
Dmitry Preobrazhensky [Tue, 26 Jul 2022 16:32:34 +0000 (19:32 +0300)]
[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description
Summary of changes:
- Update FLAT LDS syntax (see https://reviews.llvm.org/D125126)
David Goldman [Tue, 19 Jul 2022 16:10:28 +0000 (12:10 -0400)]
[clangd] Improve XRefs support for ObjCMethodDecl
- Correct nameLocation to point to the first selector fragment instead
of the - or +
- getDefinition now searches through the proper impl decls to find
the definition of the ObjCMethodDecl if one exists
Differential Revision: https://reviews.llvm.org/D130095
Matthias Springer [Tue, 26 Jul 2022 16:06:57 +0000 (18:06 +0200)]
[mlir][transform] Add ForeachOp to transform dialect
This op "unbatches" an op handle and executes the loop body for each payload op.
Differential Revision: https://reviews.llvm.org/D130257
Chuanqi Xu [Fri, 22 Jul 2022 05:20:22 +0000 (13:20 +0800)]
[C++20] [Modules] Disable preferred_name when writing a C++20 Module interface
Currently, the use of preferred_name would block implementing std
modules in libcxx. See https://github.com/llvm/llvm-project/issues/56490
for example.
The problem is pretty hard and it looks like we can't solve it in a
short time. So we are sending this patch as a workaround to avoid blocking
the modularization of the STL. This is intended to be fixed properly in the future.
Reviewed By: erichkeane, aaron.ballman, tahonermann
Differential Revision: https://reviews.llvm.org/D130331
Austin Kerbow [Thu, 14 Jul 2022 22:59:16 +0000 (15:59 -0700)]
[AMDGPU] Start refactoring GCNSchedStrategy
Tries to make the different scheduling stages a bit more self contained and
modifiable. Intended to be NFC. Preface to other changes.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D130147
Stefan Gränitz [Tue, 26 Jul 2022 09:40:48 +0000 (11:40 +0200)]
[WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms
WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This
affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222
This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs.
* Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand.
* PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers.
* LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites.
Reviewed By: theraven
Differential Revision: https://reviews.llvm.org/D128190
Philip Reames [Tue, 26 Jul 2022 15:29:07 +0000 (08:29 -0700)]
[RISCV] Add codegen coverage for ceil/floor/trunc/round/roundeven within FPR
Currently, all of these go to libcalls. A change to improve lowering is upcoming.
LLVM GN Syncbot [Tue, 26 Jul 2022 15:44:44 +0000 (15:44 +0000)]
[gn build] Port f4fb72e6d4ce
Nikolas Klauser [Tue, 26 Jul 2022 14:13:56 +0000 (16:13 +0200)]
[libc++] Use uninitialized algorithms for vector
Reviewed By: ldionne, #libc
Spies: huixie90, eaeltsin, joanahalili, bgraur, alexfh, hans, avogelsgesang, augusto2112, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D128146
Simon Tatham [Tue, 26 Jul 2022 13:01:33 +0000 (14:01 +0100)]
[bolt,AArch64] Fix one more test failure from D130358.
This one actually makes the test simpler, because lit doesn't have to
reconstitute a 32-bit little-endian value from individual bytes any
more: llvm-objdump is printing the desired 32-bit value in the first
place, so we can move straight on to doing the arithmetic on it.
Paul Walker [Sun, 8 May 2022 20:40:06 +0000 (21:40 +0100)]
[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
This patch starts small, only detecting sequences of the form
<a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes.
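The sequence shape described above can be recognised with a check like the following Python sketch. It works on plain integer lists and is only an illustration; `match_index_sequence` is a hypothetical name, not the SelectionDAG code.

```python
def match_index_sequence(elements):
    """Return (start, step) if elements == [a, a+n, a+2n, ...], else None."""
    if len(elements) < 2:
        return None
    start = elements[0]
    step = elements[1] - start
    for position, value in enumerate(elements):
        # Every element must sit exactly on the line start + position * step.
        if value != start + position * step:
            return None
    return start, step
```

A step of zero also matches here (a splat of one constant); whether the real combine wants to treat that case as an INDEX candidate is outside what the description above states.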
Differential Revision: https://reviews.llvm.org/D125194
Nico Weber [Tue, 26 Jul 2022 15:28:05 +0000 (11:28 -0400)]
[gn build] (manually) port a5640968f2f7
Alexander Yermolovich [Fri, 22 Jul 2022 20:10:13 +0000 (13:10 -0700)]
[DWP][DWARF] Detect and error on debug info offset overflow
Right now we silently overflow uint32_t for debug_info sections. This adds a check
and errors out.
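The difference between silently wrapping and erroring out can be illustrated with a short Python sketch. The function names are hypothetical, and Python integers are masked to imitate a uint32_t.

```python
UINT32_MAX = 0xFFFFFFFF


def add_offset_wrapping(offset, size):
    """Old behaviour in this model: the sum silently wraps around 2**32."""
    return (offset + size) & UINT32_MAX


def add_offset_checked(offset, size):
    """New behaviour in this model: report the overflow instead of wrapping."""
    new_offset = offset + size
    if new_offset > UINT32_MAX:
        raise OverflowError(
            f"debug info offset overflow: {offset} + {size} exceeds {UINT32_MAX}"
        )
    return new_offset
```

With an offset of `UINT32_MAX` and a size of 1, the wrapping version returns 0 while the checked version raises `OverflowError`.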
Differential Revision: https://reviews.llvm.org/D130395
Arthur Eubanks [Thu, 30 Jun 2022 22:18:04 +0000 (15:18 -0700)]
[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes
Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`.
Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`.
To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D128955
Dmitry Preobrazhensky [Tue, 26 Jul 2022 14:48:25 +0000 (17:48 +0300)]
[AMDGPU][MC][GFX11] Correct src0 for VOP3_DPP variants of v_cmp*class* opcodes
Disable SGPRs for src0 of these opcodes.
Differential Revision: https://reviews.llvm.org/D130486
John Ericson [Thu, 25 Mar 2021 00:03:33 +0000 (00:03 +0000)]
[llvm][cmake] Follow up to D117973
1. Slightly document the "mark advanced" variable used to control the
installed CMake package dir.
I would document it more, but I am considering in the future adding
pkg-config support in this manner, after which `_PACKAGE_DIR` is
probably better called `_CMAKE_PACKAGE_DIR` or similar.
2. Convey the custom path to the legacy `llvm-config` binary.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D130539
John Ericson [Sun, 16 Jan 2022 05:52:22 +0000 (05:52 +0000)]
[cmake] Slight fix ups to make robust to the full range of GNUInstallDirs
See https://cmake.org/cmake/help/v3.14/module/GNUInstallDirs.html#result-variables for `CMAKE_INSTALL_FULL_*`
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D130545
Michael Buch [Tue, 26 Jul 2022 10:46:02 +0000 (11:46 +0100)]
[LLDB][ClangExpression] Prevent nullptr namespace map access during logging
Some codepaths lead to `namespace_map == nullptr` when we get to
`ClangASTSource::FindCompleteType`. This occurred while debugging
an lldb session that had `settings set target.import-std-module true`.
In that case, with `LLDBLog::Expressions` logging enabled, we would
dereference a `nullptr` and crash.
This commit moves the logging until after we check for `nullptr`.
**Testing**
* Fixed the specific crash I was seeing while debugging an `lldb`
session with `import-std-module` enabled.
Differential Revision: https://reviews.llvm.org/D130561
Dmitry Preobrazhensky [Tue, 26 Jul 2022 14:34:48 +0000 (17:34 +0300)]
[AMDGPU][MC][GFX11] Correct encoding of VOP3/VOP3_DPP v_cmpx* opcodes
Encode dst=EXEC but allow the disassembler to accept any dst value.
Differential Revision: https://reviews.llvm.org/D130345
Alexander Belyaev [Tue, 26 Jul 2022 14:32:40 +0000 (16:32 +0200)]
[mlir] Sort the libraries in BUILD.bazel.
Alexander Belyaev [Tue, 26 Jul 2022 14:28:29 +0000 (16:28 +0200)]
[mlir] Update bazel build.
Augie Fackler [Tue, 26 Jul 2022 13:59:21 +0000 (09:59 -0400)]
LangRef: note that `allockind("free")` requires void return
Otherwise we have to work pretty hard to ensure a discarded alloc/free
pair doesn't remove a return value that's still useful.
Differential Revision: https://reviews.llvm.org/D130568
Sander de Smalen [Tue, 26 Jul 2022 13:46:17 +0000 (14:46 +0100)]
[AArch64][SVE] Sink ptrue into loop if it is used by PTEST.
This helps fold away the ptest instructions, which requires knowing whether
the general predicate is known to zero the inactive lanes.
This fixes some PTEST regressions introduced by D129282.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D129852
Sander de Smalen [Fri, 15 Jul 2022 12:53:42 +0000 (13:53 +0100)]
[AArch64][SVE] Consider more intrinsics in 'isZeroingInactiveLanes'.
This fixes some PTEST regressions introduced by D129282.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D129851
Sander de Smalen [Fri, 15 Jul 2022 12:46:42 +0000 (13:46 +0100)]
[AArch64][SVE] NFC: Add test-case to sve-ptest-removal-cmp* tests
This also adds new sve-ptest tests for FP compares that will retain
the ptest.
This also includes a few other NFC changes:
* Added type mangling to ptest.any intrinsic.
* Regenerated asm using update_llc_tests script.
Than McIntosh [Thu, 30 Jun 2022 13:31:17 +0000 (09:31 -0400)]
tsan: capture shadow map start/end on init and reuse in reset
Capture the computed shadow begin/end values at the point where the
shadow is first created and reuse those values on reset. Introduce new
windows-specific function "ZeroMmapFixedRegion" for zeroing out an
address space region previously returned by one of the MmapFixed*
routines; call this function (on windows) from DoResetImpl in
tsan_rtl.cpp instead of MmapFixedSuperNoReserve.
See https://github.com/golang/go/issues/53539#issuecomment-1168778740
for context; intended to help with updating the syso for Go's
windows/amd64 race detector.
Differential Revision: https://reviews.llvm.org/D128909
Shraiysh Vaishay [Tue, 26 Jul 2022 13:48:27 +0000 (19:18 +0530)]
Revert "[flang][OpenMP] Lowering support for default clause"
This reverts commit 05e6fce84fd39d150195b8928561f2c90c71e538.
Benjamin Kramer [Tue, 26 Jul 2022 13:36:15 +0000 (15:36 +0200)]
wangpc [Tue, 26 Jul 2022 13:11:39 +0000 (21:11 +0800)]
[DAGCombine] Mask doesn't have to be (EltSize - 1) exactly when combining rotation
I think what we need is that the Log2(EltSize) least significant bits of the mask are known to be ones.
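A small Python check of the claim above, assuming EltSize is a power of two and that a rotate only looks at the amount modulo EltSize. `mask_keeps_rotate_amount` is a hypothetical helper, not the DAGCombine code.

```python
def mask_keeps_rotate_amount(mask, elt_size):
    """True if masking the amount cannot change it modulo elt_size.

    elt_size is assumed to be a power of two, so elt_size - 1 has exactly
    the low Log2(elt_size) bits set.
    """
    low_bits = elt_size - 1
    return (mask & low_bits) == low_bits


# Exhaustive check for 32-bit elements: a wider mask such as 63 is fine,
# because the low five bits survive the AND.
for amount in range(256):
    assert (amount & 63) % 32 == amount % 32
```

A mask of 15 fails the test for 32-bit elements: an amount of 16 would be masked to 0 and the rotate would change.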
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130251
Tue Ly [Mon, 25 Jul 2022 17:44:46 +0000 (13:44 -0400)]
[libc] Use nearest_integer instructions to improve expm1f performance.
Use nearest_integer instructions to improve expm1f performance.
Performance tests with CORE-MATH's perf tool:
Before the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.096
System LIBC reciprocal throughput : 44.036
LIBC reciprocal throughput : 11.575
$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 42.239
System LIBC latency : 122.815
LIBC latency : 50.122
```
After the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.046
System LIBC reciprocal throughput : 43.899
LIBC reciprocal throughput : 9.179
$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 42.078
System LIBC latency : 120.488
LIBC latency : 41.528
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D130502
Tue Ly [Mon, 25 Jul 2022 16:24:31 +0000 (12:24 -0400)]
[libc] Use nearest_integer instructions to improve expf performance.
Use nearest_integer instructions to improve expf performance.
Performance tests with CORE-MATH's perf tool:
Before the patch:
```
$ ./perf.sh expf
LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 9.860
System LIBC reciprocal throughput : 7.728
LIBC reciprocal throughput : 12.363
$ ./perf.sh expf --latency
LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 42.802
System LIBC latency : 35.941
LIBC latency : 49.808
```
After the patch:
```
$ ./perf.sh expf
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 9.441
System LIBC reciprocal throughput : 7.382
LIBC reciprocal throughput : 8.843
$ ./perf.sh expf --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 44.192
System LIBC latency : 37.693
LIBC latency : 44.145
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D130498
wangpc [Tue, 26 Jul 2022 13:06:14 +0000 (21:06 +0800)]
[RISCV] Precommit test for D130251
The added tests don't modify the Log2(EltSize) least significant bits.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130252
Chuanqi Xu [Tue, 26 Jul 2022 13:05:30 +0000 (21:05 +0800)]
[C++20] [Modules] Don't handle no linkage entities when overloading
The original implementation uses `ND->getFormalLinkage() <=
Linkage::InternalLinkage`. It is not right since the spec only says
internal linkage and it doesn't mention 'no linkage'. This matters when
we consider constructors. According to [class.ctor.general]p1,
constructors have no name so constructors have no linkage too.
Alexey Lapshin [Mon, 25 Jul 2022 17:08:46 +0000 (20:08 +0300)]
[Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections.
Current DWARFLinker implementation does not support some debug sections
(mainly DWARF v5 sections). This patch adds diagnostics for such sections.
A warning is displayed for critical sections (those that cannot be removed)
and the source file is skipped. Other unsupported sections
are removed and a warning message is displayed. A zero exit
status is returned in both cases.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D123623
Peixin Qiao [Tue, 26 Jul 2022 12:21:51 +0000 (20:21 +0800)]
[flang] Remove fp128 support for llvm.round and llvm.trunc
The fp128 in llvm.round and llvm.trunc is not supported in X86_64 for
now. Revert the support. To support quad precision for llvm.round and
llvm.trunc, it may need to be supported using the runtime.
Reviewed By: Jean Perier
Differential Revision: https://reviews.llvm.org/D130556
Dmitri Gribenko [Tue, 26 Jul 2022 12:05:53 +0000 (14:05 +0200)]
[clang][dataflow] Add explicit "AST" nodes for implications and iff
Previously we used to desugar implications and biconditionals into
equivalent CNF/DNF as soon as possible. However, this desugaring makes
debug output (Environment::dump()) less readable than it could be.
Therefore, it makes sense to keep the sugared representation of a
boolean formula, and desugar it in the solver.
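The idea can be sketched in Python with formulas as nested tuples. This is only an illustration of keeping the sugared form for printing and desugaring late; the tuple encoding and the names `show`, `desugar` and `evaluate` are hypothetical and unrelated to the actual clang classes.

```python
def show(formula):
    """Readable dump that keeps implications and biconditionals as written."""
    op = formula[0]
    if op == "atom":
        return formula[1]
    if op == "not":
        return "!" + show(formula[1])
    symbol = {"and": "&", "or": "|", "implies": "=>", "iff": "<=>"}[op]
    return "(" + show(formula[1]) + " " + symbol + " " + show(formula[2]) + ")"


def desugar(formula):
    """Rewrite 'implies' and 'iff' into and/or/not, as a solver front end might."""
    op = formula[0]
    if op == "atom":
        return formula
    if op == "not":
        return ("not", desugar(formula[1]))
    lhs, rhs = desugar(formula[1]), desugar(formula[2])
    if op == "implies":
        return ("or", ("not", lhs), rhs)
    if op == "iff":
        return ("and", ("or", ("not", lhs), rhs), ("or", ("not", rhs), lhs))
    return (op, lhs, rhs)


def evaluate(formula, env):
    """Truth value of a formula under an assignment of atoms to booleans."""
    op = formula[0]
    if op == "atom":
        return env[formula[1]]
    if op == "not":
        return not evaluate(formula[1], env)
    lhs, rhs = evaluate(formula[1], env), evaluate(formula[2], env)
    if op == "and":
        return lhs and rhs
    if op == "or":
        return lhs or rhs
    if op == "implies":
        return (not lhs) or rhs
    return lhs == rhs  # iff
```

For `("implies", ("atom", "A"), ("atom", "B"))`, `show` prints `(A => B)` while the desugared form prints `(!A | B)`, which is the readability difference the patch is after.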
Reviewed By: sgatev, xazax.hun, wyt
Differential Revision: https://reviews.llvm.org/D130519
Evgeny Mandrikov [Tue, 26 Jul 2022 12:04:12 +0000 (14:04 +0200)]
[NFC] Fix some C++20 warnings
Without this patch, when using CMAKE_CXX_STANDARD=20 the Microsoft compiler produces the following warnings:
clang\include\clang/Basic/DiagnosticIDs.h(48): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(49): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(50): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(51): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(52): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(53): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(54): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(55): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(56): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(57): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(58): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(59): warning C5054: operator '+': deprecated between enumerations of different types
Patch By: Godin
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/
D130476
Sam McCall [Tue, 26 Jul 2022 07:03:02 +0000 (09:03 +0200)]
[pseudo] Allow opaque nodes to represent terminals
This allows incomplete code such as `namespace foo {` to be modeled as a
normal sequence with the missing } represented by an empty opaque node.
Differential Revision: https://reviews.llvm.org/
D130551
Louis Dionne [Tue, 26 Jul 2022 11:44:26 +0000 (07:44 -0400)]
[libc++][NFC] Add missing SHA in ABI changelog
Louis Dionne [Mon, 25 Jul 2022 17:19:51 +0000 (13:19 -0400)]
[libc++] Generalize the customizeable assertion handler
Instead of taking a fixed set of arguments, use variadics so that
we can pass arbitrary arguments to the handler. This is the first
step towards using the handler to handle other non-assertion-related
failures, like std::unreachable and an exception being thrown in
-fno-exceptions mode, which would improve user experience by including
additional information in crashes (right now, we call abort() without
additional information).
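As a rough illustration of the variadic approach, here is a minimal sketch assuming a printf-style C variadic function. The name `formatFailure` is hypothetical and is not libc++'s handler. A real handler would print the message and abort, whereas this one returns the formatted text so the formatting step can be inspected.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a customizable failure handler. It accepts a
// printf-style format plus arbitrary arguments rather than a fixed argument
// list, so callers other than assertions can supply their own details.
std::string formatFailure(const char *format, ...) {
  assert(format != nullptr);
  char buffer[256];
  va_list args;
  va_start(args, format);
  // vsnprintf truncates instead of overflowing if the message is too long.
  std::vsnprintf(buffer, sizeof(buffer), format, args);
  va_end(args);
  return std::string(buffer);
}
```

A caller could pass, for example, a file name, a line number and the failed expression, and a different caller could pass an entirely different set of arguments with a different format string.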
Differential Revision: https://reviews.llvm.org/
D130507
Louis Dionne [Tue, 26 Jul 2022 11:41:53 +0000 (07:41 -0400)]
[libc++] Remove XFAIL for libcpp_deallocate on AIX, which seems to be passing now
Nico Weber [Tue, 26 Jul 2022 11:28:33 +0000 (07:28 -0400)]
[gn build] Port
7a5cb15ea6fa
Dmitri Gribenko [Tue, 26 Jul 2022 11:11:08 +0000 (13:11 +0200)]
[bazel] Run autoformatter on BUILD.bazel
Roman Rusyaev [Tue, 26 Jul 2022 10:56:04 +0000 (18:56 +0800)]
[Clang] [P2025] Analyze only potential scopes for NRVO
Before the patch we calculated the NRVO candidate looking at the
variable's whole enclosing scope. The research in [P2025] shows that
looking at the variable's potential scope is better and covers more
cases where NRVO would be safe and desirable.
Many thanks to @Izaron for the original implementation.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/
D119792
Benjamin Kramer [Tue, 26 Jul 2022 10:53:23 +0000 (12:53 +0200)]
Chuanqi Xu [Wed, 6 Jul 2022 09:40:23 +0000 (17:40 +0800)]
[clang] [docs] Update the changes of C++20 Modules in clang15
Clang 15 is going to be branched on July 26, and C++ modules still
lack an update in the ReleaseNotes. Although the entry is not complete
yet, I think it is better to add one since we've done a lot of work on
C++20 Modules in clang15.
Differential Revision: https://reviews.llvm.org/
D129138
Simon Tatham [Tue, 26 Jul 2022 10:33:45 +0000 (11:33 +0100)]
[llvm-objdump,ARM] Fix further test failures.
Further test-failure fallout from
D130358. There were a handful of
uses of llvm-objdump in the CodeGen tests as well, which have taken me
longer to get to because more things had to be built.
Balazs Benics [Tue, 26 Jul 2022 10:31:21 +0000 (12:31 +0200)]
[analyzer] Improve loads from reinterpret-cast fields
Consider this example:
```lang=C++
struct header {
unsigned a : 1;
unsigned b : 1;
};
struct parse_t {
unsigned bits0 : 1;
unsigned bits2 : 2; // <-- header
unsigned bits4 : 4;
};
int parse(parse_t *p) {
unsigned copy = p->bits2;
clang_analyzer_dump(copy);
// expected-warning@-1 {{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>}}
header *bits = (header *)&copy;
clang_analyzer_dump(bits->b); // <--- Was UndefinedVal previously.
// expected-warning@-1 {{derived_$2{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>,Element{copy,0 S64b,struct Bug_55934::header}.b}}}
return bits->b; // no-warning: it's not UndefinedVal
}
```
`bits->b` should have the same content as the second bit of `p->bits2`
(assuming that the bitfields are in spelling order).
---
The `Store` has the correct bindings. The problem is with the load of `bits->b`.
It will eventually reach `RegionStoreManager::getBindingForField()` with
`Element{copy,0 S64b,struct header}.b`, which is a `FieldRegion`.
It did not find any direct bindings, so the `getBindingForFieldOrElementCommon()`
gets called. That won't find any bindings, but it sees that the variable
is on the //stack//, thus it must be an uninitialized local variable;
thus it returns `UndefinedVal`.
Instead of doing this, it should have created a //derived symbol//
representing the slice of the region corresponding to the member.
So, if the value of `copy` is `reg1`, then the value of `bits->b` should
be `derived{reg1, elem{copy,0, header}.b}`.
Actually, the `getBindingForElement()` already does exactly this for
reinterpret-casts, so I decided to hoist that and reuse the logic.
Fixes #55934
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/
D128535
Chuanqi Xu [Tue, 26 Jul 2022 10:26:35 +0000 (18:26 +0800)]
[C++20] [Modules] Handle linkage properly for specializations when overloading
Currently, the semantics of linkage in clang is slightly
different from the semantics in C++ spec. In C++ spec, only names
have linkage, so all entities of the same name should share
one linkage. But in clang, different entities of the same name could
have different linkage.
This would break a use case where the template has external linkage and
its specialization has internal linkage because its type argument has
internal linkage. The root cause is that the semantics of internal
linkage in clang is a mix of internal linkage and TU-local in the
C++ spec. The root problem is hard to solve, so I added a
workaround in place.
Zakk Chen [Tue, 26 Jul 2022 10:14:03 +0000 (10:14 +0000)]
[RISCV][Clang] Refactor RISCVVEmitter. (NFC)
Remove MaskedPrototype, add several fields to RVVIntrinsicRecord, and
compute Prototype at runtime.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/
D126741
Simon Pilgrim [Tue, 26 Jul 2022 09:44:00 +0000 (10:44 +0100)]
Fix MSVC "not all control paths return a value" warning. NFC
Zakk Chen [Tue, 26 Jul 2022 09:12:32 +0000 (09:12 +0000)]
[RISCV][Clang] Refactor and rename rvv intrinsic related stuff. (NFC)
This change is based on https://reviews.llvm.org/
D111617
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/
D126740
Benjamin Kramer [Tue, 26 Jul 2022 09:26:52 +0000 (11:26 +0200)]
[analyzer] Fix unused variable warning in release builds. NFC.
Benjamin Kramer [Tue, 26 Jul 2022 09:14:37 +0000 (11:14 +0200)]
[mlir] Fall back to posix_memalign for aligned_alloc on MacOS
aligned_alloc was added in MacOS 10.15, some users want to support older
versions. The runtime functions make this easy, so just put in a call
to posix_memalign, which provides the same functionality.
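A minimal sketch of such a fallback, assuming a POSIX platform. The helper name `allocateAligned` is hypothetical and is not the MLIR runtime function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h> // posix_memalign and free (POSIX)

// Hypothetical helper: returns `size` bytes aligned to `alignment`, or
// nullptr on failure. posix_memalign requires `alignment` to be a power of
// two and a multiple of sizeof(void *).
void *allocateAligned(std::size_t alignment, std::size_t size) {
  void *ptr = nullptr;
  if (posix_memalign(&ptr, alignment, size) != 0)
    return nullptr;
  return ptr; // release with free()
}
```

Memory obtained this way is released with `free()`, just like memory from aligned_alloc.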
Sebastian Neubauer [Tue, 26 Jul 2022 08:40:48 +0000 (10:40 +0200)]
[CMake] Fix add_subdirectory llvm builds
Fixes a regression from
D117973, that used CMAKE_BINARY_DIR instead of
LLVM_BINARY_DIR in some places.
Differential Revision: https://reviews.llvm.org/
D130555
Simon Tatham [Tue, 26 Jul 2022 09:08:56 +0000 (10:08 +0100)]
[llvm-objdump,ARM] Fix a lot more tests.
When I changed the output format of llvm-objdump for Arm and AArch64
in
D130358, I hadn't realised llvm-objdump was used so much in the
plain MC tests as well as tests of itself and lld. Sorry!
David Spickett [Tue, 26 Jul 2022 09:16:06 +0000 (09:16 +0000)]
[clang][analyzer][NFC] Use value_or instead of ValueOr
The latter is deprecated.
David Spickett [Tue, 26 Jul 2022 09:04:32 +0000 (09:04 +0000)]
[lldb][ARM] Use portable printf tokens for 64 bit types
Fixes some warnings in the 32 bit build.
Simon Tatham [Tue, 26 Jul 2022 08:43:26 +0000 (09:43 +0100)]
[llvm-objdump] Fix type mismatch in std::min.
I broke the build just now by trying to do std::min between a size_t
and a uint64_t, which of course worked fine on my 64-bit test platform.
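For illustration, one common way to avoid this kind of mismatch is to name the template argument explicitly. This is a sketch only. The function `clampLength` is hypothetical and the actual fix in llvm-objdump may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// std::min(a, b) deduces a single type T from both arguments, so mixing
// size_t and uint64_t fails to compile wherever they are distinct types.
// Naming the type explicitly converts both operands before the comparison.
std::uint64_t clampLength(std::size_t available, std::uint64_t requested) {
  return std::min<std::uint64_t>(available, requested);
}
```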
David Spickett [Fri, 22 Jul 2022 15:17:28 +0000 (15:17 +0000)]
[lldb][ARM] Add tests for vpush/vpop D registers
Previously we just checked via S regs and were not checking
memory content after pushes.
The vpush test confirms that the fix in https://reviews.llvm.org/
D130307
is working.
Memory will only be checked if an "after" state is provided.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/
D130468
Nimish Mishra [Tue, 26 Jul 2022 08:38:34 +0000 (14:08 +0530)]
[flang][OpenMP] Lowering support for default clause
This patch adds lowering support for default clause.
1. During symbol resolution in semantics, should the enclosing context have
a default data sharing clause defined and a `parser::Name` is not attached
to an explicit data sharing clause, the
`semantics::Symbol::Flag::OmpPrivate` flag (in case of `default(private)`)
and `semantics::Symbol::Flag::OmpFirstprivate` flag (in case of
`default(firstprivate)`) is added to the symbol.
2. During lowering, all symbols having either
`semantics::Symbol::Flag::OmpPrivate` or
`semantics::Symbol::Flag::OmpFirstprivate` flag are collected and
privatised appropriately.
Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/
D123930
Sven van Haastregt [Tue, 26 Jul 2022 08:39:12 +0000 (09:39 +0100)]
Reassoc FMF should not optimize FMA(a, 0, b) to (b)
Optimizing (a * 0 + b) to (b) requires assuming that a is finite and not
NaN. DAGCombiner will do this optimization when the reassoc fast math
flag is set, which is not correct. Change DAGCombiner to only consider
UnsafeMath for this optimization.
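The following sketch shows why the fold is unsafe without that assumption. It uses the standard `std::fma`; the helper name `fmaTimesZero` is only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Computes a * 0 + b as a fused multiply-add would.
// For finite a the result equals b, but infinity * 0 is NaN and NaN
// propagates, so replacing the expression with b changes the result
// for those inputs.
double fmaTimesZero(double a, double b) { return std::fma(a, 0.0, b); }
```

With `a = 2.0` the result is `b`, while with `a` set to infinity or NaN the result is NaN, which is why a flag that only permits reassociation is not enough.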
Differential Revision: https://reviews.llvm.org/
D130232
Co-authored-by: Andrea Faulds <andrea.faulds@arm.com>
Simon Tatham [Tue, 26 Jul 2022 08:20:52 +0000 (09:20 +0100)]
[llvm-objdump,ARM] Make dumpARMELFData line up with instructions.
The whitespace in output lines containing disassembled instructions
was extremely mismatched against that in `.word` lines produced from
dumping literal pools and other data in Arm ELF files. This patch
adjusts `dumpARMELFData` so that it uses the same alignment system as
in the instruction pretty-printers. Now the two classes of line are
aligned sensibly alongside each other.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/
D130359
Simon Tatham [Tue, 26 Jul 2022 08:20:46 +0000 (09:20 +0100)]
[llvm-objdump,ARM] Add PrettyPrinters for Arm and AArch64.
Most Arm disassemblers, including GNU objdump and Arm's own `fromelf`,
emit an instruction's raw encoding as a 32-bit word or (for Thumb)
one or two 16-bit halfwords, in logical order rather than according to
their storage endianness. This is generally easier to read: it matches
the encoding diagrams in the architecture spec, it matches the value
you'd write in a `.inst` directive, and it means that fields within
the instruction encoding that span more than one byte (such as branch
offsets or `SVC` immediates) can be read directly in the encoding
without having to mentally reverse the bytes.
llvm-objdump already has a system of PrettyPrinter subclasses which
makes it easy for a target to drop in its own preferred formatting.
This patch adds pretty-printers for all the Arm targets, so that
llvm-objdump will display Arm instruction encodings in their preferred
layout instead of little-endian and bytewise.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/
D130358
Simon Tatham [Tue, 26 Jul 2022 08:20:41 +0000 (09:20 +0100)]
[MC,llvm-objdump,ARM] Target-dependent disassembly resync policy.
Currently, when llvm-objdump is disassembling a code section and
encounters a point where no instruction can be decoded, it uses the
same policy on all targets: consume one byte of the section, emit it
as "<unknown>", and try disassembling from the next byte position.
On an architecture where instructions are always 4 bytes long and
4-byte aligned, this makes no sense at all. If a 4-byte word cannot be
decoded as an instruction, then the next place that a valid
instruction could //possibly// be found is 4 bytes further on.
Disassembling from a misaligned address can't possibly produce
anything that the code generator intended, or that the CPU would even
attempt to execute.
This patch introduces a new MCDisassembler virtual method called
`suggestBytesToSkip`, which allows each target to choose its own
resynchronization policy. For Arm (as opposed to Thumb) and AArch64,
I've filled in the new method to return a fixed width of 4.
Thumb is a more interesting case, because the criterion for
identifying 2-byte and 4-byte instruction encodings is very simple,
and doesn't require the particular instruction to be recognized. So
`suggestBytesToSkip` is also passed an ArrayRef of the bytes in
question, so that it can take that into account. The new test case
shows Thumb disassembly skipping over two unrecognized instructions,
and identifying one as 2-byte and one as 4-byte.
For targets other than Arm and AArch64, this is NFC: the base class
implementation of `suggestBytesToSkip` still returns 1, so that the
existing behavior is unchanged. Other targets can fill in their own
implementations as they see fit; I haven't attempted to choose a new
behavior for each one myself.
I've updated all the call sites of `MCDisassembler::getInstruction` in
llvm-objdump, and also one in sancov, which was the only other place I
spotted the same idiom of `if (Size == 0) Size = 1` after a call to
`getInstruction`.
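The policy described above can be sketched roughly as follows. This is a simplified model rather than LLVM's real class hierarchy or signatures: the type names and the `std::vector` parameter are hypothetical. The Thumb check assumes little-endian storage and relies on the architectural rule that a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 begins a 4-byte encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical base class: by default skip one byte, as before the patch.
struct Disassembler {
  virtual ~Disassembler() = default;
  virtual std::size_t
  suggestBytesToSkip(const std::vector<std::uint8_t> &) const {
    return 1;
  }
};

// Models Arm (non-Thumb) and AArch64: instructions are always 4 bytes.
struct FixedWidth4 : Disassembler {
  std::size_t
  suggestBytesToSkip(const std::vector<std::uint8_t> &) const override {
    return 4;
  }
};

// Models Thumb: the first halfword alone decides between 2 and 4 bytes.
struct ThumbLike : Disassembler {
  std::size_t
  suggestBytesToSkip(const std::vector<std::uint8_t> &bytes) const override {
    if (bytes.size() < 2)
      return 2; // not enough data to inspect; skip one halfword
    // Assumption: little-endian, so bytes[1] holds bits 15..8 of the halfword.
    unsigned top5 = bytes[1] >> 3;
    return (top5 == 0x1D || top5 == 0x1E || top5 == 0x1F) ? 4 : 2;
  }
};
```

A caller would replace the `if (Size == 0) Size = 1` idiom with a call to the target's method, passing the undecodable bytes.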
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/
D130357
David Spickett [Mon, 25 Jul 2022 08:48:13 +0000 (08:48 +0000)]
[lldb][ARM] Misc improvements to TestEmulations
* Look for files that end with arm/thumb.dat,
meaning we don't try to run, for example, vim swap files.
* Print the name of the test that failed.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/
D130467
Phoebe Wang [Tue, 26 Jul 2022 08:02:30 +0000 (16:02 +0800)]
[ArgPromotion] Transfer metadata nontemporal to promoted loads
Fixes #56703
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/
D130536
David Spickett [Fri, 22 Jul 2022 14:38:03 +0000 (14:38 +0000)]
[lldb][ARM] Print mismatched registers in emulation tests
Also correct the test failed message. It implies that what
it's done is compare the 'before' and 'after' states from the
test input.
Except that that's the whole point of the test, that the state changes.
It should tell you that it compared the result of the emulation to the
'after'.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/
D130464
Dmitri Gribenko [Tue, 26 Jul 2022 08:16:13 +0000 (10:16 +0200)]
[clang][dataflow] Fix SAT solver crashes on `X ^ X` and `X v X`
BooleanFormula::addClause has an invariant that a clause has no duplicated
literals. When the solver was desugaring a formula into CNF clauses, it
could construct a clause with such duplicated literals in two cases.
Reviewed By: sgatev, ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/
D130522
isuckatcs [Wed, 29 Jun 2022 16:42:07 +0000 (18:42 +0200)]
[analyzer] Structured binding to tuple-like types
Introducing support for creating structured binding
to tuple-like types.
Differential Revision: https://reviews.llvm.org/
D128837
David Spickett [Fri, 22 Jul 2022 14:13:32 +0000 (14:13 +0000)]
[LLDB][ARM] Generalise adding register state in emulation tests and add D registers
Since some s and d registers overlap we will error if we find both.
This prevents you overwriting one with the other in a test case.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/
D130462
Cullen Rhodes [Tue, 26 Jul 2022 07:52:24 +0000 (07:52 +0000)]
[AArch64][SVE] Add patterns to select mla/mls
Adds patterns for:
add(a, select(mask, mul(b, c), splat(0))) -> mla(a, mask, b, c)
sub(a, select(mask, mul(b, c), splat(0))) -> mls(a, mask, b, c)
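As a per-lane scalar model of why these rewrites hold, consider the sketch below. The function names are hypothetical, and it assumes merging predication, meaning inactive lanes keep the accumulator `a`. Unsigned arithmetic is used so that wraparound is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// One lane of add(a, select(mask, mul(b, c), splat(0))).
std::uint32_t addSelectMul(std::uint32_t a, bool mask, std::uint32_t b,
                           std::uint32_t c) {
  std::uint32_t product = mask ? b * c : 0u;
  return a + product;
}

// One lane of a predicated multiply-add: inactive lanes return a unchanged.
std::uint32_t mlaLane(std::uint32_t a, bool mask, std::uint32_t b,
                      std::uint32_t c) {
  return mask ? a + b * c : a;
}

// One lane of sub(a, select(mask, mul(b, c), splat(0))).
std::uint32_t subSelectMul(std::uint32_t a, bool mask, std::uint32_t b,
                           std::uint32_t c) {
  std::uint32_t product = mask ? b * c : 0u;
  return a - product;
}

// One lane of a predicated multiply-subtract.
std::uint32_t mlsLane(std::uint32_t a, bool mask, std::uint32_t b,
                      std::uint32_t c) {
  return mask ? a - b * c : a;
}
```

Adding or subtracting zero in an inactive lane leaves `a` unchanged, so each pair of functions agrees for both mask values.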
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/
D130492
Cullen Rhodes [Mon, 25 Jul 2022 12:41:51 +0000 (12:41 +0000)]
[AArch64][SVE] NFC: Add tests for masked mla/mls patterns (
D130492)
Kito Cheng [Wed, 13 Jul 2022 07:52:17 +0000 (15:52 +0800)]
[RISCV] Lazily add RVV C intrinsics.
Leverage the method OpenCL uses that adds C intrinsics when the lookup
failed. There is no need to define C intrinsics in the header file any
more. It could help to avoid the large header file to speed up the
compilation of RVV source code. Besides that, only the C intrinsics used
by the users will be added into the declaration table.
This patch is based on https://reviews.llvm.org/
D103228 and inspired by
OpenCL implementation.
### Experimental Results
#### TL;DR:
- Binary size of clang increases by ~200k, which is +0.07% for the debug build and +0.13% for the release build.
- Single-file compilation speeds up ~33x for the debug build and ~8.5x for the release build.
- Regression test time is reduced by ~10% (`ninja check-all`, all targets enabled).
#### Header size change
```
| size | LoC |
------------------------------
Before | 4,434,725 | 69,749 |
After | 6,140 | 162 |
```
#### Single File Compilation Time
Testcase:
```
#include <riscv_vector.h>
vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2, size_t vl) {
return vadd(op1, op2, vl);
}
```
##### Debug build:
Before:
```
real 0m19.352s
user 0m19.252s
sys 0m0.092s
```
After:
```
real 0m0.576s
user 0m0.552s
sys 0m0.024s
```
~33x speed up for debug build
##### Release build:
Before:
```
real 0m0.773s
user 0m0.741s
sys 0m0.032s
```
After:
```
real 0m0.092s
user 0m0.080s
sys 0m0.012s
```
~8.5x speed up for release build
#### Regression time
Note: the failed case is `tools/llvm-debuginfod-find/debuginfod.test` which is unrelated to this patch.
##### Debug build
Before:
```
Testing Time: 1358.38s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
After
```
Testing Time: 1220.29s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
##### Release build
Before:
```
Testing Time: 381.98s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
After:
```
Testing Time: 346.25s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
#### Binary size of clang
##### Debug build
Before
```
text data bss dec hex filename
335261851 12726004 552812 348540667 14c64efb bin/clang
```
After
```
text data bss dec hex filename
335442803 12798708 552940 348794451 14ca2e53 bin/clang
```
+253K, +0.07% code size
##### Release build
Before
```
text data bss dec hex filename
144123975 8374648 483140 152981763 91e5103 bin/clang
```
After
```
text data bss dec hex filename
144255762 8447296 483268 153186326 9217016 bin/clang
```
+204K, +0.13%
Authored-by: Kito Cheng <kito.cheng@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Reviewed By: khchen, aaron.ballman
Differential Revision: https://reviews.llvm.org/
D111617
David Spickett [Mon, 11 Jul 2022 12:26:55 +0000 (13:26 +0100)]
[lldb][AArch64] Add support for memory tags in core files
This teaches ProcessElfCore to recognise the MTE tag segments.
https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support
These segments contain all the tags for a matching memory segment
which will have the same size in virtual address terms. In real terms
it's 2 tags per byte so the data in the segment is much smaller.
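To make the size relationship concrete, here is a small sketch. It relies on MTE using one 4-bit tag per 16-byte granule. The function names are hypothetical, and the nibble order (tag of the lower-address granule in the low 4 bits) is an assumption made for illustration and should be checked against the kernel documentation linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kGranuleBytes = 16; // one 4-bit MTE tag per granule

// Bytes of packed tag data needed to cover `memoryBytes` of tagged memory,
// with two tags stored per byte.
std::size_t packedTagBytes(std::size_t memoryBytes) {
  std::size_t granules = memoryBytes / kGranuleBytes;
  return (granules + 1) / 2;
}

// Expands packed tags to one tag per element.
// ASSUMPTION: the lower-address granule's tag is in the low nibble.
std::vector<std::uint8_t> unpackTags(const std::vector<std::uint8_t> &packed) {
  std::vector<std::uint8_t> tags;
  tags.reserve(packed.size() * 2);
  for (std::uint8_t byte : packed) {
    tags.push_back(byte & 0x0F);
    tags.push_back(byte >> 4);
  }
  return tags;
}
```

Under these assumptions a 4096-byte page has 256 granules and therefore needs 128 bytes of tag data, a 32x reduction relative to the memory it describes.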
Since MTE is the only tag type supported I have hardcoded some
things to those values. We could and should support more formats
as they appear but doing so now would leave code untested until that
happens.
A few things to note:
* /proc/pid/smaps is not in the core file, only the details you have
in "maps". Meaning we mark a region tagged only if it has a tag segment.
* A core file supports memory tagging if it has at least 1 memory
tag segment, there is no other flag we can check to tell if memory
tagging was enabled. (unlike a live process that can support memory
tagging even if there are currently no tagged memory regions)
Tests have been added at the commands level for a core file with
mte and without.
There is a lot of overlap between the "memory tag read" tests here and the unit tests for
MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's
worth keeping to check ProcessElfCore doesn't cause an assert.
Depends on
D129487
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/
D129489
Saiyedul Islam [Mon, 25 Jul 2022 16:40:06 +0000 (11:40 -0500)]
[Libomptarget] Add checks for AMDGPU TargetID using new image info
This patch extends the is_valid_binary routine to also check if the
binary's target ID matches the one parsed from the system's runtime
environment.
This should allow us to only use the binary whose compute capability
matches, allowing us to support basic multi-architecture binaries for
AMDGPU.
It also handles compatibility testing of target IDs of the image and
the environment.
Depends on
D127432
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/
D127769