Wenlei He [Tue, 24 Mar 2020 06:50:41 +0000 (23:50 -0700)]
[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining
This change adds the context-sensitive sample PGO infrastructure described in the CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduces an abstraction between the input profile and the profile loader that queries the input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with the top-down profile-guided inliner in the profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to the CGSCC inliner in order for it to take advantage of context-sensitive profiles. This change is the consumption part of context-sensitive profile (the generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with Pseudo-probe are coming up soon.
Currently the new infrastructure kicks in when the input profile contains the new context-sensitive profile; otherwise it's a no-op and does not affect existing AutoFDO.
**Interface**
There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile.
- Query base profile (`getBaseSamplesFor`)
Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.
- Query context profile (`getContextSamplesFor`)
Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.
- Track inlined context profile (`markContextSamplesInlined`)
When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function.
- Track not-inlined context profile (`promoteMergeContextSamplesTree`)
When a function is not inlined for a given calling context, we need to promote the context profile tree so the not-inlined context becomes a top-level context. This preserves the sub-context under that function so a later inline decision for that not-inlined function will still have the context profile for its call tree. Note that profiles will be merged if needed when promoting a context profile tree, if any of the nodes already exists at its promoted destination.
**Implementation**
Implementation-wise, `SampleContext` is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each `SampleContext` also has a `ContextState` indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's base profile or context profile, and for context profile what is the context and state.
On top of the above context representation, a custom trie is implemented to track and manage context profiles. Specifically, `SampleContextTracker` encapsulates a trie with `ContextTrieNode` as its node type. Each node of the trie represents a frame in a calling context, thus the path from the root to a node represents a valid calling context. We also track `FunctionSamples` for each node, so this trie can serve efficient queries for context profiles. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of the entire tree, and merging nodes of the subtree if this move encounters existing nodes.
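The trie and its promote-and-merge step can be pictured with a small standalone sketch. Everything in it is hypothetical: `ToyContextTrie` and `Node` are illustrative names, a single sample counter stands in for `FunctionSamples`, and contexts are passed as vectors of frame names rather than context strings. It is not the `SampleContextTracker` code, only an illustration of the idea.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model; not LLVM's SampleContextTracker.
struct Node {
  uint64_t Samples = 0; // stand-in for a FunctionSamples object
  std::map<std::string, std::unique_ptr<Node>> Children;
};

class ToyContextTrie {
  Node Root;

  // Get or create the child of N for the given frame.
  static Node &child(Node &N, const std::string &Frame) {
    std::unique_ptr<Node> &C = N.Children[Frame];
    if (!C)
      C = std::make_unique<Node>();
    return *C;
  }

  // Merge Src and its whole subtree into Dst, summing sample counts.
  static void mergeInto(Node &Dst, Node &Src) {
    Dst.Samples += Src.Samples;
    for (auto &KV : Src.Children)
      mergeInto(child(Dst, KV.first), *KV.second);
  }

public:
  // Record samples for a calling context such as {"main", "foo", "bar"}.
  void addSamples(const std::vector<std::string> &Context, uint64_t N) {
    Node *Cur = &Root;
    for (const std::string &F : Context)
      Cur = &child(*Cur, F);
    Cur->Samples += N;
  }

  // Return the node for a context, or nullptr if the path does not exist.
  Node *find(const std::vector<std::string> &Context) {
    Node *Cur = &Root;
    for (const std::string &F : Context) {
      auto It = Cur->Children.find(F);
      if (It == Cur->Children.end())
        return nullptr;
      Cur = It->second.get();
    }
    return Cur;
  }

  // Promote the subtree at Context to top level, merging with any
  // node that already exists at the promoted destination.
  void promote(const std::vector<std::string> &Context) {
    if (Context.size() < 2)
      return; // already a top-level context
    std::vector<std::string> Parent(Context.begin(), Context.end() - 1);
    Node *P = find(Parent);
    if (!P)
      return;
    auto It = P->Children.find(Context.back());
    if (It == P->Children.end())
      return;
    std::unique_ptr<Node> Sub = std::move(It->second);
    P->Children.erase(It);
    mergeInto(child(Root, Context.back()), *Sub);
  }
};
```

Detaching the subtree before merging keeps the walk simple: the source and destination maps never alias, even for recursive contexts.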
**Integration**
`SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detect that the input profile contains context-sensitive profiles, `SampleContextTracker` will be used to track profiles, and all profile queries will go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`.
Differential Revision: https://reviews.llvm.org/D90125
Fangrui Song [Sun, 6 Dec 2020 19:11:15 +0000 (11:11 -0800)]
[test] Fix asan/TestCases/Linux/globals-gc-sections-lld.cpp with -fsanitize-address-globals-dead-stripping
r302591 dropped -fsanitize-address-globals-dead-stripping for ELF platforms
(to work around a gold<2.27 bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19002)
Upgrade REQUIRES: from lto (COMPILER_RT_TEST_USE_LLD (set by Android, but rarely used elsewhere)) to lto-available.
Fangrui Song [Sun, 6 Dec 2020 18:31:40 +0000 (10:31 -0800)]
[test] Fix asan/TestCases/Posix/lto-constmerge-odr.cpp when 'binutils_lto' is available
If COMPILER_RT_TEST_USE_LLD is not set, config.use_lld will be False.
However, if feature 'binutils_lto' is available, lto_supported can still be True,
but config.target_cflags will not get -fuse-ld=lld from config.lto_flags
As a result, we may use clang -flto with system 'ld' which may not support the bitcode file, e.g.
ld: error: /tmp/lto-constmerge-odr-44a1ee.o: Unknown attribute kind (70) (Producer: 'LLVM12.0.0git' Reader: 'LLVM 12.0.0git')
// The system ld+LLVMgold.so do not support ATTR_KIND_MUSTPROGRESS (70).
Just require lld-available and add -fuse-ld=lld.
Kazu Hirata [Sun, 6 Dec 2020 18:24:08 +0000 (10:24 -0800)]
[InstCombine] Remove replacePointer (NFC)
The declaration was introduced on Feb 10, 2017 in commit
ba01ed00fef32c48d8e2787a6feaf33568a80bfe without a corresponding
definition.
Kazu Hirata [Sun, 6 Dec 2020 18:12:55 +0000 (10:12 -0800)]
[Mips] Use llvm::is_contained (NFC)
Simon Pilgrim [Sun, 6 Dec 2020 17:56:41 +0000 (17:56 +0000)]
[X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))
Noticed while triaging PR37506
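As an illustration of why the fold is sound, here is a scalar model assuming four 32-bit lanes. The helper names (`movmsk4`, `icmpSgtMinusOne`, `foldHolds`) are hypothetical and do not come from the X86 backend; the NOT is restricted to the four lane bits, which is an assumption of this sketch.

```cpp
#include <array>
#include <cstdint>

// Scalar model of a 4-lane MOVMSK: bit I of the result is the sign bit of lane I.
inline uint32_t movmsk4(const std::array<int32_t, 4> &X) {
  uint32_t Mask = 0;
  for (unsigned I = 0; I < 4; ++I)
    if (X[I] < 0)
      Mask |= 1u << I;
  return Mask;
}

// Lane-wise ICMP_SGT(X, -1): all-ones where the lane is > -1, zero otherwise.
inline std::array<int32_t, 4> icmpSgtMinusOne(const std::array<int32_t, 4> &X) {
  std::array<int32_t, 4> R{};
  for (unsigned I = 0; I < 4; ++I)
    R[I] = X[I] > -1 ? -1 : 0;
  return R;
}

// Compare both sides of the fold for one input vector.
inline bool foldHolds(const std::array<int32_t, 4> &X) {
  return movmsk4(icmpSgtMinusOne(X)) == (~movmsk4(X) & 0xFu);
}
```

A lane is greater than -1 exactly when its sign bit is clear, so the compare result has its sign bit set precisely where the input does not.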
Simon Pilgrim [Sun, 6 Dec 2020 17:48:15 +0000 (17:48 +0000)]
[X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) fold
Noticed while triaging PR37506
Layton Kifer [Sun, 6 Dec 2020 16:50:42 +0000 (11:50 -0500)]
[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets.
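The identity behind the fold can be checked on both values of an i1. This is a minimal sketch that models i1 as `bool` and the result as a 32-bit integer; the function names are hypothetical and unrelated to the DAGCombiner code.

```cpp
#include <cstdint>

// Sign-extending an i1 maps 1 to all-ones (-1) and 0 to 0.
inline int32_t sextI1(bool B) { return B ? -1 : 0; }

// Zero-extending an i1 maps 1 to 1 and 0 to 0.
inline int32_t zextI1(bool B) { return B ? 1 : 0; }

// Left-hand side of the fold: (sext (not i1 x)).
inline int32_t beforeFold(bool X) { return sextI1(!X); }

// Right-hand side of the fold: (add (zext i1 x), -1).
inline int32_t afterFold(bool X) { return zextI1(X) + -1; }
```

For x = 0 both sides give -1, and for x = 1 both give 0.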
Differential Revision: https://reviews.llvm.org/D91589
Paul C. Anagnostopoulos [Sat, 5 Dec 2020 15:22:31 +0000 (10:22 -0500)]
[TableGen] [CodeGenTarget] Cache the target's instruction namespace.
Differential Revision: https://reviews.llvm.org/D92722
Marek Kurdej [Sun, 6 Dec 2020 14:36:18 +0000 (15:36 +0100)]
[libc++] [docs] Mark P1865 as complete since 11.0 as it was implemented together with P1135. Fix synopses in <barrier> and <latch>.
It was implemented in commit
54fa9ecd3088508b05b0c5b5cb52da8a3c188655 ([libc++] Implementation of C++20's P1135R6 for libcxx).
Sanjay Patel [Sat, 5 Dec 2020 16:24:19 +0000 (11:24 -0500)]
[InstCombine] avoid crash on phi with unreachable incoming block (PR48369)
Marek Kurdej [Sun, 6 Dec 2020 14:23:46 +0000 (15:23 +0100)]
[libc++] [LWG3374] Mark `to_address(const Ptr& p)` overload `constexpr`.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D92659
Simon Pilgrim [Sun, 6 Dec 2020 14:08:15 +0000 (14:08 +0000)]
[CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.
Jon Chesterfield [Sun, 6 Dec 2020 12:13:56 +0000 (12:13 +0000)]
[libomptarget][amdgpu] Skip device_State allocation when using bss global
Nikita Popov [Sat, 5 Dec 2020 16:26:33 +0000 (17:26 +0100)]
[BasicAA] Migrate "same base pointer" logic to decomposed GEPs
BasicAA has some special bit of logic for "same base pointer" GEPs
that performs a structural comparison: It only looks at two GEPs
with the same base (as opposed to two GEP chains with a MustAlias
base) and compares their indexes in a limited way. I generalized
part of this code in D91027, and this patch merges the remainder
into the normal decomposed GEP logic.
What this code ultimately wants to do is to determine that
gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2,
and the access size fits within the stride.
We can express this in terms of a decomposed GEP expression with
two indexes scale*%idx1 + -scale*%idx2 where %idx1 != %idx2, and
some appropriate checks for sizes and offsets.
This makes the reasoning slightly more powerful, and more
importantly brings all the GEP logic under a common umbrella.
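The stride argument can be sketched as follows. The helpers are hypothetical, offsets are plain 64-bit byte offsets from the shared base, and wrap-around is ignored; the actual BasicAA checks on sizes and offsets are more involved than this.

```cpp
#include <cstdint>

// Do the byte ranges [Off1, Off1 + Size) and [Off2, Off2 + Size) overlap?
inline bool rangesOverlap(int64_t Off1, int64_t Off2, int64_t Size) {
  return Off1 < Off2 + Size && Off2 < Off1 + Size;
}

// The rule from the message: if the two indices are known to differ and the
// access size fits within the stride (|Scale|), the accesses cannot overlap,
// because the offsets are then at least one full stride apart.
inline bool provablyNoAlias(bool IndicesKnownDifferent, int64_t Scale,
                            int64_t Size) {
  int64_t Stride = Scale < 0 ? -Scale : Scale;
  return IndicesKnownDifferent && Size <= Stride;
}
```

With a stride of 8, two 8-byte accesses at different indices are disjoint, while 12-byte accesses at adjacent indices overlap and the rule correctly declines to answer.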
Differential Revision: https://reviews.llvm.org/D92723
Fangrui Song [Sun, 6 Dec 2020 08:33:11 +0000 (00:33 -0800)]
[TargetMachine] Delete asan workaround
687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the commit introduced a
bug that ppc64 should be excluded.
Fangrui Song [Sun, 6 Dec 2020 07:13:28 +0000 (23:13 -0800)]
[X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model
This fixes the bug referenced by
5582a7987662a92eda5d883b88fc4586e755acf5
which was exposed by
961f31d8ad14c66829991522d73e14b5a96ff6d4.
With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`
Fangrui Song [Sun, 6 Dec 2020 05:39:02 +0000 (21:39 -0800)]
[TargetMachine] Don't imply dso_local for memprof in static relocation model
The workaround is no longer needed with my previous commit to MemProfiler.cpp
Fangrui Song [Sun, 6 Dec 2020 05:36:31 +0000 (21:36 -0800)]
[MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model
The x86-64 backend currently has a bug which uses a wrong register for the GOTPCREL reference.
The program will crash without the dso_local specifier.
Vitaly Buka [Sat, 5 Dec 2020 05:32:11 +0000 (21:32 -0800)]
[NFC][CodeGen] Simplify SanitizeDtorMembers::Emit
Vitaly Buka [Sun, 6 Dec 2020 04:42:16 +0000 (20:42 -0800)]
[TargetMachine] Set dso_local for memprof
Similar to
5582a7987662a92eda5d883b88fc4586e755acf5
Lang Hames [Sun, 6 Dec 2020 04:41:53 +0000 (15:41 +1100)]
[ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator.
Craig Topper [Sun, 6 Dec 2020 04:10:25 +0000 (20:10 -0800)]
[RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here.
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
Fangrui Song [Sun, 6 Dec 2020 03:30:41 +0000 (19:30 -0800)]
[asan][test] Fix odr-vtable.cpp
Fangrui Song [Sun, 6 Dec 2020 01:51:10 +0000 (17:51 -0800)]
[TargetMachine] Set dso_local if asan is detected
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.
Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
Jonas Devlieghere [Sun, 6 Dec 2020 01:35:32 +0000 (17:35 -0800)]
[debugserver] Call posix_spawnattr_setarchpref_np through the fn ptr.
Fourth time is the charm? Of course all of these issues don't show up
when the function is available...
Vitaly Buka [Sat, 5 Dec 2020 08:54:14 +0000 (00:54 -0800)]
[NFC][CodeGen] Add sanitize-dtor-zero-size-field test
The test demonstrates invalid behaviour which will be fixed soon.
Kazu Hirata [Sun, 6 Dec 2020 00:22:12 +0000 (16:22 -0800)]
[ConstantHoisting] Remove unused declaration optimizeConstants (NFC)
The function was renamed to runImpl on Jul 2, 2016 in commit
071d8306b0d9d1345c1da84ae3e1c1b231ffd29d, but the old declaration has
remained since.
Philip Reames [Sat, 5 Dec 2020 23:57:09 +0000 (15:57 -0800)]
Add recursive decomposition reasoning to isKnownNonEqual
The basic idea is that, by looking through operand instructions which don't change the equality result, we can push the existing known bits comparison down past instructions which would obscure them.
We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.) The BasicAA change will be posted separately for review.
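A toy version of the recursion, assuming a tiny expression type instead of `llvm::Value`, might look like the sketch below. `Expr`, `sameLeaf` and `knownNonEqual` are hypothetical, only wrapping add is handled, and a constant comparison stands in for the known bits comparison at the leaves.

```cpp
#include <cstdint>

// Hypothetical toy IR; not llvm::Value.
struct Expr {
  enum Kind { Const, Var, Add } K;
  int64_t C;          // constant value for Const, variable id for Var
  const Expr *L, *R;  // operands for Add, otherwise null
};

// Two leaves denote the same value if they are the same node, the same
// constant, or the same variable.
inline bool sameLeaf(const Expr *A, const Expr *B) {
  return A == B || (A->K == B->K && A->K != Expr::Add && A->C == B->C);
}

inline bool knownNonEqual(const Expr *A, const Expr *B, unsigned Depth = 0) {
  if (Depth > 4)
    return false; // give up conservatively on deep chains
  if (A->K == Expr::Const && B->K == Expr::Const)
    return A->C != B->C; // stand-in for the known bits comparison
  // x + a != y + a exactly when x != y, so look through the add.
  if (A->K == Expr::Add && B->K == Expr::Add) {
    if (sameLeaf(A->R, B->R))
      return knownNonEqual(A->L, B->L, Depth + 1);
    if (sameLeaf(A->L, B->L))
      return knownNonEqual(A->R, B->R, Depth + 1);
  }
  return false; // unknown, which is always a safe answer
}
```

Returning false means "not proven", so every bail-out path stays conservative.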
Differential Revision: https://reviews.llvm.org/D92698
Fangrui Song [Sat, 5 Dec 2020 23:52:33 +0000 (15:52 -0800)]
[TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden)
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
Kazu Hirata [Sat, 5 Dec 2020 23:44:40 +0000 (15:44 -0800)]
[CodeGen] llvm::erase_if (NFC)
Aditya Kumar [Sat, 5 Dec 2020 19:43:45 +0000 (11:43 -0800)]
Remove memory allocation with string
Differential Revision: https://reviews.llvm.org/D92506
Fangrui Song [Sat, 5 Dec 2020 23:13:41 +0000 (15:13 -0800)]
[TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.
This is NFC in terms of Clang emitted assembly.
Fangrui Song [Sat, 5 Dec 2020 22:54:37 +0000 (14:54 -0800)]
[TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)
By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.
This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
Fangrui Song [Sat, 5 Dec 2020 21:55:48 +0000 (13:55 -0800)]
[test] Add explicit dso_local to function declarations in static relocation model tests
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.
For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implementing an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).
Philip Reames [Sat, 5 Dec 2020 22:05:48 +0000 (14:05 -0800)]
[BasicAA] Fix a bug with relational reasoning across iterations
Due to the recursion through phis that BasicAA does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrably incorrect.
Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.
Differential Revision: https://reviews.llvm.org/D92694
Jonas Devlieghere [Sat, 5 Dec 2020 22:05:54 +0000 (14:05 -0800)]
[debugserver] Use dlsym for posix_spawnattr_setarchpref_np
The @available check did not work as I thought it did. Use good old
dlsym instead.
Fangrui Song [Sat, 5 Dec 2020 21:17:47 +0000 (13:17 -0800)]
[X86] Emit @PLT for x86-64 and keep unadorned symbols for x86-32
This essentially reverts the x86-64 side effect of r327198.
For x86-32, @PLT (R_386_PLT32) is not suitable in -fno-pic mode so the
code forces MO_NO_FLAG (like a forced dso_local) (https://bugs.llvm.org//show_bug.cgi?id=36674#c6).
For x86-64, both `call/jmp foo` and `call/jmp foo@PLT` emit R_X86_64_PLT32
(https://sourceware.org/bugzilla/show_bug.cgi?id=22791) so there is no
difference using @PLT. Using @PLT is actually favorable because this drops
a difference with -fpie/-fpic code and makes it possible to avoid a canonical
PLT entry when taking the address of an undefined function symbol.
Chris Sears [Sat, 5 Dec 2020 20:59:06 +0000 (21:59 +0100)]
[llvmbuildectomy] removed vestigial LLVMBuild.txt files
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.
llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt
Differential Revision: https://reviews.llvm.org/D92693
Fangrui Song [Sat, 5 Dec 2020 20:32:50 +0000 (12:32 -0800)]
[TargetMachine] Move X86 specific shouldAssumeDSOLocal logic to X86Subtarget::classifyGlobalFunctionReference
Nikita Popov [Sat, 5 Dec 2020 20:21:01 +0000 (21:21 +0100)]
[BasicAA] Add more tests for non-equal index (NFC)
Fangrui Song [Sat, 5 Dec 2020 19:40:18 +0000 (11:40 -0800)]
[TargetMachine] Simplify shouldAssumeDSOLocal by processing ExternalSymbolSDNode early
The function accrues many `GV` nullness checks. Process `!GV`
(ExternalSymbolSDNode) early to simplify code.
Also improve a comment added in r327198 (intrinsics are a subset of
ExternalSymbolSDNode).
Intended to be NFC.
Jonas Devlieghere [Sat, 5 Dec 2020 18:17:48 +0000 (10:17 -0800)]
[debugserver] Remove bridgeos availability
I didn't realize that the 'bridgeos' is not part of the public SDK.
Benjamin Kramer [Sat, 5 Dec 2020 18:07:09 +0000 (19:07 +0100)]
[X86] Autodetect znver3
Joachim Protze [Thu, 26 Nov 2020 10:55:56 +0000 (11:55 +0100)]
[OpenMP][OMPT] Fix OMPT return address guard for gomp interface
D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.
This patch fixes the scoping issue, so that all OMPT tests pass, when the
tests are built with gcc.
Differential Revision: https://reviews.llvm.org/D92121
Zbigniew Sarbinowski [Sat, 5 Dec 2020 00:13:23 +0000 (00:13 +0000)]
[SystemZ][ZOS] Fix the usage of pthread_t within libc++
This is the minimal change introduced in [[ https://reviews.llvm.org/D88599 | D88599 ]] to unblock the controversial change and discussion of proper separation between thread and thread id, which will continue in D88599.
This patch addresses the differences in the definition of pthread_t on z/OS vs. Linux and other OSes. The main trick to make the code work on z/OS relies on redefining the libcpp_thread_id type and the _LIBCPP_NULL_THREAD macro. This is necessary to separate the initialization of libcxx_thread_id from that of __libcxx_thread_t.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D91875
mydeveloperday [Sat, 5 Dec 2020 16:32:37 +0000 (16:32 +0000)]
[clang-format] Add option for case sensitive regexes for sorted includes
I think the title says everything.
Reviewed By: MyDeveloperDay
Patch By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D91507
Mark de Wever [Sat, 5 Dec 2020 15:36:19 +0000 (16:36 +0100)]
[NFC][libc++] Update C++20 issues status.
Properly mark LWG1203 as completed and move the version number to the
version column.
Mark de Wever [Sat, 5 Dec 2020 15:31:16 +0000 (16:31 +0100)]
[NFC][clang-tidy] Fixes comment typos.
Florian Hahn [Sat, 5 Dec 2020 12:55:27 +0000 (12:55 +0000)]
[ConstraintElimination] Wrap dump() call in LLVM_DEBUG (NFC).
ConstraintSystem::dump only generates output with -debug, but there's no
need to call it without -debug.
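The pattern can be sketched with stand-ins for the debug flag and macro. `ToyDebugFlag`, `TOY_DEBUG`, `ToyConstraintSystem` and `DumpCalls` are hypothetical names used only for this illustration; they mimic the shape of `LLVM_DEBUG` but are not its definition.

```cpp
#include <iostream>

// Hypothetical stand-in for the -debug switch.
static bool ToyDebugFlag = false;
// Counts how often dump() actually runs, to make the effect observable.
static int DumpCalls = 0;

// Evaluate X only when debugging is enabled.
#define TOY_DEBUG(X)                                                           \
  do {                                                                         \
    if (ToyDebugFlag) {                                                        \
      X;                                                                       \
    }                                                                          \
  } while (false)

struct ToyConstraintSystem {
  void dump() const {
    ++DumpCalls;
    std::cerr << "constraint system dump\n";
  }
};

// With the wrapper, dump() is not even called unless debugging is on.
inline void process(const ToyConstraintSystem &CS) { TOY_DEBUG(CS.dump()); }
```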
Florian Hahn [Sat, 5 Dec 2020 10:52:50 +0000 (10:52 +0000)]
[ConstraintElimination] Handle constraints with all zero var coeffs.
Constraints where all variable coefficients are 0 do not add any useful
information. When checking, we can check if they are always true/false.
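A minimal sketch of the check, assuming a row layout of {c0, c1, ..., cn} that encodes c1*x1 + ... + cn*xn <= c0. The layout, the `Trivial` enum and `classify` are assumptions of this sketch, not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Trivial { AlwaysTrue, AlwaysFalse, NotTrivial };

// Assumed row layout: {c0, c1, ..., cn} meaning c1*x1 + ... + cn*xn <= c0.
inline Trivial classify(const std::vector<int64_t> &Row) {
  if (Row.empty())
    return Trivial::NotTrivial;
  for (std::size_t I = 1; I < Row.size(); ++I)
    if (Row[I] != 0)
      return Trivial::NotTrivial; // some variable still matters
  // All variable coefficients are zero, so the row reduces to 0 <= c0.
  return Row[0] >= 0 ? Trivial::AlwaysTrue : Trivial::AlwaysFalse;
}
```

Such a row says nothing about any variable, so it can be answered directly instead of being added to the system.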
Dmitry Preobrazhensky [Sat, 5 Dec 2020 10:41:27 +0000 (13:41 +0300)]
[AMDGPU][MC] Improved diagnostics message for sym/expr operands
See bug 48295 (https://bugs.llvm.org/show_bug.cgi?id=48295)
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D92088
Nikita Popov [Sat, 5 Dec 2020 10:35:58 +0000 (11:35 +0100)]
[AA] Initialize Depth member
Fix mistake introduced in
f8afba5f7a25a69c12191d979d78d40fa6e5b684:
I failed to initialize the Depth member to zero.
Dmitry Preobrazhensky [Sat, 5 Dec 2020 10:21:28 +0000 (13:21 +0300)]
[AMDGPU][MC] Corrected error position for invalid MOVREL src
See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D92084
mydeveloperday [Sat, 5 Dec 2020 10:14:51 +0000 (10:14 +0000)]
[clang-format] [NFC] keep clang-format tests clang-format clean
I use several of the clang-format clean directories as a test suite; this one had got slightly out of whack in a prior commit
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D92666
Nikita Popov [Sat, 5 Dec 2020 09:39:31 +0000 (10:39 +0100)]
[AA] Add statistics for alias results (NFC)
Count how many NoAlias/MustAlias/MayAlias we get from top-level
queries.
Nikita Popov [Fri, 4 Dec 2020 18:05:16 +0000 (19:05 +0100)]
[BasicAA] Add recphi tests with nested loops (NFC)
Fangrui Song [Sat, 5 Dec 2020 08:42:07 +0000 (00:42 -0800)]
[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead.
Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.
This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).
Fangrui Song [Sat, 5 Dec 2020 07:22:47 +0000 (23:22 -0800)]
[TargetMachine] Delete wasm special case from shouldAssumeDSOLocal
Francis Visoiu Mistrih [Sat, 5 Dec 2020 04:10:06 +0000 (20:10 -0800)]
[llvm-nm][MachO] Don't call getFlags on redacted symbols
Avoid calling getFlags on a non-existent symbol.
The way this is triggered is by calling strip -N on a binary, which sets
the MH_NLIST_OUTOFSYNC_WITH_DYLDINFO header flag. Then, in the
LC_FUNCTION_STARTS command, nm is trying to print the stripped symbols
and needs the proper checks.
Kazu Hirata [Sat, 5 Dec 2020 05:42:54 +0000 (21:42 -0800)]
[AMDGPU] Use llvm::is_contained (NFC)
Kazu Hirata [Sat, 5 Dec 2020 05:26:12 +0000 (21:26 -0800)]
[IRCE] Remove unused IsSigned and its accessor (NFC)
IsSigned and its accessor, isSigned, were introduced on Oct 25, 2017
in commit
9ac7021a2563d433549a21990f96184d413e69e2. The last use was
removed on Nov 20, 2017 in commit
268467869b99b15a15f81bf009d31e11536bef39.
Hsiangkai Wang [Fri, 4 Dec 2020 07:34:11 +0000 (15:34 +0800)]
[RISCV] Formatting for easier reading (NFC)
Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
River Riddle [Sat, 5 Dec 2020 05:01:26 +0000 (21:01 -0800)]
[mlir][IR] Move the storage for results to before the Operation instead of after.
Trailing objects are really nice for storing additional data inline with the main class, and are something that we heavily take advantage of for Operation (and many other classes). To get the address of the inline data you need to compute the address by doing some pointer arithmetic taking into account any objects stored before the object you want to access. Most classes keep the count of the number of objects, so this is relatively cheap to compute. This is not the case for results though, which have two different types (inline and trailing) that are not necessarily as cheap to compute as the count for other objects. This revision moves the storage for results to before the operation and stores them in reverse order. This allows for getting results to still be very fast given that they are never iterated directly in order, and also greatly improves the speed when accessing the other trailing objects of an operation (operands/regions/blocks/etc.).
This reduced compile time when compiling a decently sized mlir module by about 400ms, or 2.17s -> 1.76s.
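The prefix layout can be illustrated with a simplified sketch. `ToyOp` and `ToyResult` are hypothetical types with a single result kind; the real Operation distinguishes inline and trailing results and is considerably more involved.

```cpp
#include <cstdlib>
#include <new>

struct ToyResult {
  unsigned Index;
};

struct ToyOp {
  unsigned NumResults;

  // Results are stored in reverse order directly before the operation, so
  // result I sits (I + 1) slots before the header and needs no size lookup.
  ToyResult *getResult(unsigned I) {
    return reinterpret_cast<ToyResult *>(this) - (I + 1);
  }

  static ToyOp *create(unsigned NumResults) {
    static_assert(sizeof(ToyResult) % alignof(ToyOp) == 0,
                  "operation header must stay suitably aligned");
    void *Raw = std::malloc(sizeof(ToyResult) * NumResults + sizeof(ToyOp));
    if (!Raw)
      return nullptr;
    // The header starts right after the block reserved for results.
    char *OpAddr = static_cast<char *>(Raw) + sizeof(ToyResult) * NumResults;
    ToyOp *Op = new (OpAddr) ToyOp{NumResults};
    for (unsigned I = 0; I < NumResults; ++I)
      new (Op->getResult(I)) ToyResult{I};
    return Op;
  }

  void destroy() {
    // The allocation begins at the last result, not at the header.
    std::free(reinterpret_cast<char *>(this) -
              sizeof(ToyResult) * NumResults);
  }
};
```

Anything stored after the header can then be addressed from `this` without first skipping over a variable number of results.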
Differential Revision: https://reviews.llvm.org/D92687
River Riddle [Sat, 5 Dec 2020 04:54:23 +0000 (20:54 -0800)]
[mlir][OpFormatGen] Add support for optional enum attributes
The check for formatting enum attributes was missing a call to get the base attribute, which is necessary to strip off the top-level OptionalAttr<> wrapper.
Differential Revision: https://reviews.llvm.org/D92713
Zhuojia Shen [Sat, 5 Dec 2020 04:53:23 +0000 (20:53 -0800)]
[builtins][ARM] Check __ARM_FP instead of __VFP_FP__.
This patch fixes builtins' CMakeLists.txt and their VFP tests to check
the standard macro defined in the ACLE for VFP support. It also enables
the tests to be built and run for single-precision-only targets while
builtins were built with double-precision support.
Differential revision: https://reviews.llvm.org/D92497
Jonas Devlieghere [Sat, 5 Dec 2020 04:21:50 +0000 (20:21 -0800)]
[debugserver] Honor the cpu sub type if specified
Use the newly added spawnattr API, posix_spawnattr_setarchpref_np, to
select slice preferences per cpu and subcpu type, instead of just per cpu
with posix_spawnattr_setbinpref_np.
rdar://16094957
Differential revision: https://reviews.llvm.org/D92712
Jonas Devlieghere [Fri, 4 Dec 2020 17:45:43 +0000 (09:45 -0800)]
[lldb] Remove unused argument to expectedFailure
Fangrui Song [Sat, 5 Dec 2020 03:33:19 +0000 (19:33 -0800)]
[ELF] Fix relocation-model.ll
Fangrui Song [Sat, 5 Dec 2020 03:03:40 +0000 (19:03 -0800)]
[TargetMachine] Don't imply dso_local on global variable declarations in Reloc::Static model
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable global variables,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to remove as much
additional implied dso_local in TargetMachine::shouldAssumeDSOLocal as possible.)
By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent copy relocations.
This patch should be NFC in terms of the Clang behavior because the case we
don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some `external global` and expose some differences. Most tests
have been fixed to be more robust in previous commits.
Fangrui Song [Sat, 5 Dec 2020 02:35:45 +0000 (18:35 -0800)]
[test] Split some tests which test both static and pic relocation models
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for
Static. Split some tests so that these `external dso_local global`
will align with the Clang behavior.
Craig Topper [Sat, 5 Dec 2020 02:36:14 +0000 (18:36 -0800)]
[RISCV] Use fcvt.h/d/f.w if the input is an assertsexti32 not just when the input is sext_inreg.
Tony [Sat, 5 Dec 2020 00:57:21 +0000 (00:57 +0000)]
[NFC][AMDGPU] AMDGPUUsage updates
- Document code object V2 gfx800.
- Document amdpal is supported by Linux Pro.
Differential Revision: https://reviews.llvm.org/D92708
Fangrui Song [Sat, 5 Dec 2020 02:11:35 +0000 (18:11 -0800)]
[test] Split some tests which test both static and pic relocation models
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for
Static. Split some tests so that these `external dso_local global` will
align with the Clang behavior.
Sam Clegg [Fri, 4 Dec 2020 00:51:56 +0000 (16:51 -0800)]
[lld][WebAssembly] Add support for PIC + passive data initialization
This change improves our support for shared memory to include
PIC executables (and shared libraries).
To handle this case the linker-generated `__wasm_init_memory`
function (that only exists in shared memory builds) must be
capable of loading memory segments at non-const offsets based
on the runtime value of `__memory_base`.
Differential Revision: https://reviews.llvm.org/D92620
Fangrui Song [Sat, 5 Dec 2020 00:57:45 +0000 (16:57 -0800)]
Make __stack_chk_guard dso_local if Reloc::Static
This is currently implied by TargetMachine::shouldAssumeDSOLocal
but will be changed in the future.
Nathan Lanza [Fri, 4 Dec 2020 23:52:10 +0000 (15:52 -0800)]
[llvm] Update WinMsvc.cmake's fms-compatibility to match llvm's prereqs
llvm's minimum fms-compatibility-version was just bumped to 19.14 and
thus the WinMsvc.cmake file needs to be adjusted accordingly.
Hsiangkai Wang [Fri, 4 Dec 2020 12:45:41 +0000 (20:45 +0800)]
[RISCV] Define preprocessor definitions for 'V' extension.
Differential Revision: https://reviews.llvm.org/D92650
Alex Lorenz [Fri, 4 Dec 2020 23:06:13 +0000 (15:06 -0800)]
[objc] diagnose protocol conformance in categories with direct members
in their corresponding class interfaces
Categories that add protocol conformances to classes with direct members should prohibit protocol
conformances when the methods/properties that the protocol expects are actually declared as 'direct' in the class.
Differential Revision: https://reviews.llvm.org/D92602
Alex Lorenz [Fri, 4 Dec 2020 22:45:27 +0000 (14:45 -0800)]
[clang] add a `swift_async_name` attribute
The swift_async_name attribute provides a name for a function/method that can be used
to call the async overload of this method from Swift. The name specified in this attribute
assumes that the last parameter in the function/method it's applied to is removed when
Swift invokes it, as Swift's await/async transformation implicitly constructs the callback.
Differential Revision: https://reviews.llvm.org/D92355
Alex Lorenz [Fri, 4 Dec 2020 17:29:45 +0000 (09:29 -0800)]
[clang] add a new `swift_attr` attribute
The swift_attr attribute is a generic annotation attribute that's not used by clang,
but is used by the Swift compiler. The Swift compiler can use these annotations to provide
various syntactic and semantic sugars for the imported Objective-C API declarations.
Differential Revision: https://reviews.llvm.org/D92354
Philip Reames [Fri, 4 Dec 2020 23:12:16 +0000 (15:12 -0800)]
[test] precommit test for D92698
Duncan P. N. Exon Smith [Fri, 4 Dec 2020 23:04:31 +0000 (15:04 -0800)]
Index: Remove unused internal header SimpleFormatContext.h, NFC
Looks like nothing has included this header since
d21485d2f5ffacf7b726c741ee409b3682045255 / r286279 in 2016. Delete the
dead code.
shafik [Fri, 4 Dec 2020 22:47:36 +0000 (14:47 -0800)]
Add diagnostic for for-range-declaration being specified with thread_local
Currently we have a diagnostic that catches the other storage class specifiers for the range-based for loop declaration, but we miss the thread_local case. This change adds a diagnostic for that case as well.
Differential Revision: https://reviews.llvm.org/D92671
Fangrui Song [Fri, 4 Dec 2020 23:05:59 +0000 (15:05 -0800)]
[asan][test] Improve -asan-use-private-alias tests
In preparation for D92078
Arthur O'Dwyer [Thu, 3 Dec 2020 01:02:18 +0000 (20:02 -0500)]
[libc++] Update the commented "synopsis" in <algorithm> to match current reality.
The synopsis now reflects what's implemented. It does NOT reflect
all of what's specified in C++20. The "constexpr in C++20" markings
are still missing from these 12 algorithms, because they are still
unimplemented by libc++:
reverse partition sort nth_element next_permutation prev_permutation
push_heap pop_heap make_heap sort_heap partial_sort partial_sort_copy
All of the above algorithms were excluded from [P0202].
All of the above algorithms were made constexpr in [P0879] (along with
swap_ranges, iter_swap, and rotate — we've already implemented those three).
Differential Revision: https://reviews.llvm.org/D92255
Arthur O'Dwyer [Fri, 4 Dec 2020 18:47:12 +0000 (13:47 -0500)]
[libc++] [P0202] constexpr set_union, set_difference, set_symmetric_difference, merge
These had been waiting on the ability to use `std::copy` from
constexpr code (which in turn had been waiting on the ability to
use `is_constant_evaluated()` to switch between `memmove` and non-`memmove`
implementations of `std::copy`). That work landed a while ago,
so these algorithms can all be constexpr in C++20 now.
Simultaneously, update the tests for the set algorithms.
- Use an element type with "equivalent but not identical" values.
- The custom-comparator tests now pass something different from `operator<`.
- Make the constexpr coverage match the non-constexpr coverage.
Differential Revision: https://reviews.llvm.org/D92255
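The test changes described above can be sketched roughly as follows. This is not the actual libc++ test code: `Tagged`, `ByKeyDescending`, `demo_union`, and the `DEMO_CONSTEXPR` macro are hypothetical names. The element type compares by `key` only, so two values can be equivalent without being identical, and the comparator is deliberately not `operator<`. The compile-time check is only enabled when the library advertises constexpr algorithms.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical element type: `tag` does not take part in the ordering.
struct Tagged {
  int key;
  char tag;
};

// Custom comparator that differs from operator<: sorts by key, descending.
struct ByKeyDescending {
  constexpr bool operator()(const Tagged &a, const Tagged &b) const {
    return a.key > b.key;
  }
};

#if defined(__cpp_lib_constexpr_algorithms) && __cpp_lib_constexpr_algorithms >= 201806L
#  define DEMO_CONSTEXPR constexpr
#else
#  define DEMO_CONSTEXPR inline
#endif

DEMO_CONSTEXPR std::array<Tagged, 4> demo_union() {
  std::array<Tagged, 3> a{{{5, 'a'}, {3, 'a'}, {1, 'a'}}};
  std::array<Tagged, 2> b{{{3, 'b'}, {2, 'b'}}};
  std::array<Tagged, 4> out{};
  std::set_union(a.begin(), a.end(), b.begin(), b.end(), out.begin(),
                 ByKeyDescending{});
  return out;
}

#if defined(__cpp_lib_constexpr_algorithms) && __cpp_lib_constexpr_algorithms >= 201806L
// For equivalent elements, set_union takes the one from the first range.
static_assert(demo_union()[1].tag == 'a', "expected the element from the first range");
#endif
```

Checking `tag` rather than `key` is what distinguishes "took the element from the first range" from "took an equivalent element from the second range".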
Arthur O'Dwyer [Fri, 4 Dec 2020 18:38:51 +0000 (13:38 -0500)]
[libc++] Slightly improve constexpr test coverage for std::includes.
Differential Revision: https://reviews.llvm.org/D92255
Kazushi (Jam) Marukawa [Fri, 4 Dec 2020 11:15:13 +0000 (20:15 +0900)]
[VE] Add vfsqrt, vfcmp, vfmax, and vfmin intrinsic instructions
Add vfsqrt, vfcmp, vfmax, and vfmin intrinsic instructions and
regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92651
Duncan P. N. Exon Smith [Thu, 3 Dec 2020 01:25:46 +0000 (17:25 -0800)]
ASTImporter: Migrate to the FileEntryRef overload of SourceManager::createFileID, NFC
Migrate `ASTImporter::Import` over to using the `FileEntryRef` overload
of `SourceManager::createFileID`. No functionality change here.
Differential Revision: https://reviews.llvm.org/D92529
Duncan P. N. Exon Smith [Thu, 3 Dec 2020 01:32:08 +0000 (17:32 -0800)]
ARCMigrate: Initialize fields in EditEntry inline, NFC
Initialize the fields inline instead of having to manually write out a
default constructor.
Differential Revision: https://reviews.llvm.org/D92597
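The general pattern looks roughly like this. The structs below are hypothetical stand-ins, not the real `EditEntry` definition.

```cpp
#include <cassert>
#include <string>

// Before: a hand-written default constructor initializes each field.
struct DemoEntryBefore {
  unsigned Offset;
  unsigned RemoveLen;
  std::string Text;
  DemoEntryBefore() : Offset(0), RemoveLen(0) {}
};

// After: default member initializers; the implicit default constructor suffices.
struct DemoEntryAfter {
  unsigned Offset = 0;
  unsigned RemoveLen = 0;
  std::string Text;
};
```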
Duncan P. N. Exon Smith [Fri, 4 Dec 2020 22:34:22 +0000 (14:34 -0800)]
Frontend: Use translateLineCol instead of translateFileLineCol, NFC
`ParseDirective` in VerifyDiagnosticConsumer.cpp is already calling
`translateFile`, so use the `FileID` returned by that to call
`translateLineCol` instead of using the more heavyweight
`translateFileLineCol`.
No functionality change here.
Scott Linder [Fri, 4 Dec 2020 22:14:37 +0000 (22:14 +0000)]
[MC] Consume EndOfStatement in .cfi_{sections,endproc}
Previously these directives were always interpreted as having an extra
blank line after them.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92612
LLVM GN Syncbot [Fri, 4 Dec 2020 22:16:56 +0000 (22:16 +0000)]
[gn build] Port
4d8bf870a82
Duncan P. N. Exon Smith [Wed, 2 Dec 2020 23:41:36 +0000 (15:41 -0800)]
ADT: Remove AlignedCharArrayUnion, NFC
Prep commit already migrated users over to std::aligned_union_t; this
just deletes the type / header / test.
Differential Revision: https://reviews.llvm.org/D92517
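For readers unfamiliar with the replacement, here is a minimal sketch of how `std::aligned_union_t` provides raw storage suitable for any of several types. `DemoStorage` and `demo_roundtrip_length` are illustrative names, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>

// Raw storage that is large enough and suitably aligned for either type.
using DemoStorage = std::aligned_union_t<0, int, std::string>;

static_assert(sizeof(DemoStorage) >= sizeof(std::string), "storage too small");
static_assert(alignof(DemoStorage) >= alignof(std::string), "storage under-aligned");

// Constructs a std::string in the storage, reads it, and destroys it again.
std::size_t demo_roundtrip_length(const char *text) {
  using String = std::string;
  DemoStorage storage;
  String *s = new (&storage) String(text); // placement new into the raw buffer
  std::size_t n = s->size();
  s->~String();                            // manual destruction, no delete
  return n;
}
```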
Aart Bik [Fri, 4 Dec 2020 19:48:48 +0000 (11:48 -0800)]
[mlir][vector] rephrased description
More carefully worded description. Added constructor to options.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D92664
Krzysztof Parzyszek [Fri, 4 Dec 2020 21:54:29 +0000 (15:54 -0600)]
Include BuiltinAttributes.h in llvm-prettyprinters/gdb/mlir-support.cpp
This header was introduced in
c7cae0e4fa4e1ed4bdca186096a408578225fc2b.
Fangrui Song [Fri, 4 Dec 2020 21:51:01 +0000 (13:51 -0800)]
[test] Add explicit dso_local to constant/global variable declarations
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.
For external data, clang -fno-pic emits the dso_local specifier for ELF and
non-MinGW COFF. Adding explicit dso_local aligns these tests with the
clang behavior and helps implement an option to use GOT indirection for
external data access in -fno-pic mode (to avoid copy relocations).
Jianzhou Zhao [Fri, 4 Dec 2020 02:50:56 +0000 (02:50 +0000)]
[dfsan] Add empty APIs for field-level shadow
This is a child diff of D92261.
This diff adds APIs that return shadow type/value/zero from origin
objects. For the time being these APIs simply return primitive
shadow type/value/zero. The following diff will be implementing the
conversion.
As D92261 explains, some cases still use primitive shadow during
the incremental changes. The cases include
1) alloca/load/store
2) custom function IO
3) vectors
In these cases this diff does not use the new APIs, but uses primitive
shadow objects explicitly.
Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D92629
Alexey Bataev [Fri, 4 Dec 2020 20:56:54 +0000 (12:56 -0800)]
[OPENMP]Fix PR48394: need to capture variables used in atomic constructs.
Variables used in an atomic construct should be captured in outer
task-based regions implicitly. Otherwise, the compiler will crash trying
to find the address of the local variable.
Differential Revision: https://reviews.llvm.org/D92682
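The general shape of code this concerns might look like the sketch below: a local variable referenced by an atomic construct nested inside a task region. This is an assumed illustration, not the reproducer from PR48394, and `demo_atomic_in_task` is a made-up name. When OpenMP is not enabled, the pragmas are ignored and the function runs serially.

```cpp
#include <cassert>

int demo_atomic_in_task() {
  int counter = 0; // local variable used inside the atomic construct
#pragma omp parallel num_threads(2)
#pragma omp single
  {
    // No data-sharing clause: `counter` must be captured implicitly by the task.
#pragma omp task
    {
#pragma omp atomic
      counter += 1;
    }
#pragma omp taskwait
  }
  return counter;
}
```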