3 years ago[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining
Wenlei He [Tue, 24 Mar 2020 06:50:41 +0000 (23:50 -0700)]
[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining

This change adds the context-sensitive sample PGO infrastructure described in the CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduces an abstraction between the input profile and the profile loader that queries the input profile for functions. Specifically, there is now the notion of base profile and context profile, and they are managed by the new SampleContextTracker, which adjusts and merges profiles based on inline decisions. It works with the top-down profile-guided inliner in the profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to the CGSCC inliner so that it can take advantage of context-sensitive profiles. This change is the consumption part of context-sensitive profile (the generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with pseudo-probe are coming up soon.

Currently the new infrastructure kicks in when the input profile contains the new context-sensitive profile; otherwise it's a no-op and does not affect existing AutoFDO.

**Interface**

There are two sets of interfaces exposed from SampleContextTracker, for query and tracking respectively. For query, instead of simply getting a profile from the input for a function, we can now explicitly query the base profile or the context profile for a given call path of a function. For tracking, there are separate APIs for marking a context profile as inlined, and for promoting and merging a not-inlined context profile.

- Query base profile (`getBaseSamplesFor`)
Base profile is the synthetic profile obtained by merging a function's CFG profiles from all outstanding (not inlined) contexts. We can query base profile by function.

- Query context profile (`getContextSamplesFor`)
Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.

- Track inlined context profile (`markContextSamplesInlined`)
When a function is inlined for a given calling context, we need to mark the context profile for that context as inlined. This makes sure we don't include the inlined context profile when synthesizing the base profile for that inlined function.

- Track not-inlined context profile (`promoteMergeContextSamplesTree`)
When a function is not inlined for a given calling context, we need to promote the context profile tree so that the not-inlined context becomes a top-level context. This preserves the sub-contexts under that function, so a later inline decision for that not-inlined function will still have context profiles for its call tree. Note that profiles will be merged when promoting a context profile tree if any of the nodes already exists at its promoted destination.
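
The relationship between base profile and context profile can be illustrated with a small C++ sketch. This is not the actual `SampleContextTracker` code: `ToyContextTracker`, `ToyContextProfile`, the `"caller @ callee"` context strings, and the use of a single sample count in place of a full CFG profile are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a context profile: one total sample
// count per calling context instead of a full CFG profile.
struct ToyContextProfile {
  std::string Function;  // leaf function the profile belongs to
  uint64_t Samples = 0;  // total samples attributed to this context
  bool Inlined = false;  // set once this context has been inlined
};

// Toy tracker keyed by context string (the string format is an assumption).
class ToyContextTracker {
  std::map<std::string, ToyContextProfile> Profiles;

public:
  void addContext(const std::string &Context, const std::string &Function,
                  uint64_t Samples) {
    ToyContextProfile P;
    P.Function = Function;
    P.Samples = Samples;
    Profiles[Context] = P;
  }

  // Query the profile for one exact calling context; nullptr if unknown.
  const ToyContextProfile *
  getContextSamplesFor(const std::string &Context) const {
    auto It = Profiles.find(Context);
    return It == Profiles.end() ? nullptr : &It->second;
  }

  // Mark a context as inlined so it no longer feeds the base profile.
  void markContextSamplesInlined(const std::string &Context) {
    auto It = Profiles.find(Context);
    if (It != Profiles.end())
      It->second.Inlined = true;
  }

  // Base profile: merge every outstanding (not inlined) context of Function.
  uint64_t getBaseSamplesFor(const std::string &Function) const {
    uint64_t Total = 0;
    for (const auto &Entry : Profiles)
      if (Entry.second.Function == Function && !Entry.second.Inlined)
        Total += Entry.second.Samples;
    return Total;
  }
};
```

In this sketch, marking one context as inlined removes its samples from the base profile of the callee, which mirrors the reason given above for tracking inlined contexts.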

**Implementation**

Implementation-wise, `SampleContext` is created as an abstraction for context. Currently it's a string for the call path, and we can later optimize it to something more efficient, e.g. a context id. Each `SampleContext` also has a `ContextState` indicating whether it's a raw context profile from input, whether it's inlined or merged, and whether it's a synthetic profile created by the compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's a base profile or a context profile, and for a context profile, what the context and state are.

On top of the above context representation, a custom trie is implemented to track and manage context profiles. Specifically, `SampleContextTracker` encapsulates a trie with `ContextTrieNode` as its node type. Each node of the trie represents a frame in a calling context, so the path from the root to a node represents a valid calling context. We also track `FunctionSamples` for each node, so the trie can serve context profile queries efficiently. Accordingly, context profile tree promotion becomes moving a subtree to be directly under the root of the entire tree, merging nodes of the subtree if the move encounters existing nodes.
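
The promotion-with-merge step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch's implementation: `ToyTrieNode`, `getOrCreatePath`, `mergeSubtree`, and `promoteToTopLevel` are hypothetical names, and a plain sample count stands in for the per-node `FunctionSamples`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trie node: one frame of a calling context.
struct ToyTrieNode {
  uint64_t Samples = 0;  // stands in for the node's FunctionSamples
  std::map<std::string, std::unique_ptr<ToyTrieNode>> Children;
};

// Walk (creating nodes as needed) the path Root -> Frames[0] -> ... and
// return the node for the last frame.
ToyTrieNode &getOrCreatePath(ToyTrieNode &Root,
                             const std::vector<std::string> &Frames) {
  ToyTrieNode *Node = &Root;
  for (const std::string &Frame : Frames) {
    std::unique_ptr<ToyTrieNode> &Child = Node->Children[Frame];
    if (!Child)
      Child = std::make_unique<ToyTrieNode>();
    Node = Child.get();
  }
  return *Node;
}

// Merge subtree From into To: add sample counts, then merge children by name.
void mergeSubtree(ToyTrieNode &To, std::unique_ptr<ToyTrieNode> From) {
  To.Samples += From->Samples;
  for (auto &Entry : From->Children) {
    std::unique_ptr<ToyTrieNode> &Dest = To.Children[Entry.first];
    if (!Dest)
      Dest = std::move(Entry.second);
    else
      mergeSubtree(*Dest, std::move(Entry.second));
  }
}

// Promote the subtree for Callee under Parent so that it hangs directly off
// Root, merging with any node that already exists at the destination.
void promoteToTopLevel(ToyTrieNode &Root, ToyTrieNode &Parent,
                       const std::string &Callee) {
  if (&Parent == &Root)
    return;  // already top-level
  auto It = Parent.Children.find(Callee);
  if (It == Parent.Children.end())
    return;
  std::unique_ptr<ToyTrieNode> Subtree = std::move(It->second);
  Parent.Children.erase(It);
  std::unique_ptr<ToyTrieNode> &Dest = Root.Children[Callee];
  if (!Dest)
    Dest = std::move(Subtree);
  else
    mergeSubtree(*Dest, std::move(Subtree));
}
```

For example, promoting `foo` out of `main` moves the whole `main -> foo -> ...` subtree to `foo -> ...`, adding its counts to any `foo` nodes that were already top-level.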

**Integration**

`SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detect that the input profile contains a context-sensitive profile, `SampleContextTracker` is used to track profiles, and all profile queries go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`.

Differential Revision: https://reviews.llvm.org/D90125

3 years ago[test] Fix asan/TestCases/Linux/globals-gc-sections-lld.cpp with -fsanitize-address...
Fangrui Song [Sun, 6 Dec 2020 19:11:15 +0000 (11:11 -0800)]
[test] Fix asan/TestCases/Linux/globals-gc-sections-lld.cpp with -fsanitize-address-globals-dead-stripping

r302591 dropped -fsanitize-address-globals-dead-stripping for ELF platforms
(to work around a gold<2.27 bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19002)

Upgrade REQUIRES: from lto (COMPILER_RT_TEST_USE_LLD (set by Android, but rarely used elsewhere)) to lto-available.

3 years ago[test] Fix asan/TestCases/Posix/lto-constmerge-odr.cpp when 'binutils_lto' is available
Fangrui Song [Sun, 6 Dec 2020 18:31:40 +0000 (10:31 -0800)]
[test] Fix asan/TestCases/Posix/lto-constmerge-odr.cpp when 'binutils_lto' is available

If COMPILER_RT_TEST_USE_LLD is not set, config.use_lld will be False.
However, if feature 'binutils_lto' is available, lto_supported can still be True,
but config.target_cflags will not get -fuse-ld=lld from config.lto_flags

As a result, we may use clang -flto with system 'ld' which may not support the bitcode file, e.g.

  ld: error: /tmp/lto-constmerge-odr-44a1ee.o: Unknown attribute kind (70) (Producer: 'LLVM12.0.0git' Reader: 'LLVM 12.0.0git')
  // The system ld+LLVMgold.so do not support ATTR_KIND_MUSTPROGRESS (70).

Just require lld-available and add -fuse-ld=lld.

3 years ago[InstCombine] Remove replacePointer (NFC)
Kazu Hirata [Sun, 6 Dec 2020 18:24:08 +0000 (10:24 -0800)]
[InstCombine] Remove replacePointer (NFC)

The declaration was introduced on Feb 10, 2017 in commit
ba01ed00fef32c48d8e2787a6feaf33568a80bfe without a corresponding
definition.

3 years ago[Mips] Use llvm::is_contained (NFC)
Kazu Hirata [Sun, 6 Dec 2020 18:12:55 +0000 (10:12 -0800)]
[Mips] Use llvm::is_contained (NFC)

3 years ago[X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)))
Simon Pilgrim [Sun, 6 Dec 2020 17:56:41 +0000 (17:56 +0000)]
[X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)))

Noticed while triaging PR37506
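
A scalar model can show why the fold holds. The sketch below assumes a 4-lane vector of 32-bit integers; `Vec4`, `toyMovmsk`, `toyIcmpSgtMinusOne`, and `toyFoldedMovmsk` are hypothetical helpers, not X86 backend code. A lane is greater than -1 exactly when its sign bit is clear, so the mask of the comparison is the inverted mask of the input (restricted to the bits that correspond to lanes).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<int32_t, 4>;

// Scalar model of MOVMSK: collect the sign bit of lane I into bit I.
unsigned toyMovmsk(const Vec4 &V) {
  unsigned Mask = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    if (V[I] < 0)
      Mask |= 1u << I;
  return Mask;
}

// Scalar model of a vector ICMP_SGT against splat(-1): a lane becomes
// all-ones (-1) if the comparison is true and 0 otherwise.
Vec4 toyIcmpSgtMinusOne(const Vec4 &V) {
  Vec4 R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R[I] = V[I] > -1 ? -1 : 0;
  return R;
}

// Folded form: invert the mask, keeping only the bits that map to lanes.
unsigned toyFoldedMovmsk(const Vec4 &V) { return ~toyMovmsk(V) & 0xFu; }
```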

3 years ago[X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))) fold
Simon Pilgrim [Sun, 6 Dec 2020 17:48:15 +0000 (17:48 +0000)]
[X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))) fold

Noticed while triaging PR37506

3 years ago[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Layton Kifer [Sun, 6 Dec 2020 16:50:42 +0000 (11:50 -0500)]
[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)

Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets.
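
The identity behind this fold can be checked exhaustively, since an i1 has only two values. The sketch below models i1 as `bool` and the result type as `int32_t`; the helper names are hypothetical and the code is only a check of the arithmetic, not DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an i1 (modeled as bool) to i32: true -> -1, false -> 0.
int32_t toySextI1(bool X) { return X ? -1 : 0; }

// Zero-extend an i1 (modeled as bool) to i32: true -> 1, false -> 0.
int32_t toyZextI1(bool X) { return X ? 1 : 0; }

// Original pattern: (sext (not i1 x)).
int32_t toySextOfNot(bool X) { return toySextI1(!X); }

// Folded pattern: (add (zext i1 x), -1).
int32_t toyZextPlusMinusOne(bool X) { return toyZextI1(X) + -1; }
```

Both forms map `false` to -1 and `true` to 0.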

Differential Revision: https://reviews.llvm.org/D91589

3 years ago[TableGen] [CodeGenTarget] Cache the target's instruction namespace.
Paul C. Anagnostopoulos [Sat, 5 Dec 2020 15:22:31 +0000 (10:22 -0500)]
[TableGen] [CodeGenTarget] Cache the target's instruction namespace.

Differential Revision: https://reviews.llvm.org/D92722

3 years ago[libc++] [docs] Mark P1865 as complete since 11.0 as it was implemented together...
Marek Kurdej [Sun, 6 Dec 2020 14:36:18 +0000 (15:36 +0100)]
[libc++] [docs] Mark P1865 as complete since 11.0 as it was implemented together with P1135. Fix synopses in <barrier> and <latch>.

It was implemented in commit 54fa9ecd3088508b05b0c5b5cb52da8a3c188655 ([libc++] Implementation of C++20's P1135R6 for libcxx).

3 years ago[InstCombine] avoid crash on phi with unreachable incoming block (PR48369)
Sanjay Patel [Sat, 5 Dec 2020 16:24:19 +0000 (11:24 -0500)]
[InstCombine] avoid crash on phi with unreachable incoming block (PR48369)

3 years ago[libc++] [LWG3374] Mark `to_address(const Ptr& p)` overload `constexpr`.
Marek Kurdej [Sun, 6 Dec 2020 14:23:46 +0000 (15:23 +0100)]
[libc++] [LWG3374] Mark `to_address(const Ptr& p)` overload `constexpr`.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D92659

3 years ago[CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Simon Pilgrim [Sun, 6 Dec 2020 14:08:15 +0000 (14:08 +0000)]
[CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds

Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.

3 years ago[libomptarget][amdgpu] Skip device_State allocation when using bss global
Jon Chesterfield [Sun, 6 Dec 2020 12:13:56 +0000 (12:13 +0000)]
[libomptarget][amdgpu] Skip device_State allocation when using bss global

3 years ago[BasicAA] Migrate "same base pointer" logic to decomposed GEPs
Nikita Popov [Sat, 5 Dec 2020 16:26:33 +0000 (17:26 +0100)]
[BasicAA] Migrate "same base pointer" logic to decomposed GEPs

BasicAA has some special bit of logic for "same base pointer" GEPs
that performs a structural comparison: It only looks at two GEPs
with the same base (as opposed to two GEP chains with a MustAlias
base) and compares their indexes in a limited way. I generalized
part of this code in D91027, and this patch merges the remainder
into the normal decomposed GEP logic.

What this code ultimately wants to do is to determine that
gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2,
and the access size fits within the stride.

We can express this in terms of a decomposed GEP expression with
two indexes scale*%idx1 + -scale*%idx2 where %idx1 != %idx2, and
some appropriate checks for sizes and offsets.

This makes the reasoning slightly more powerful, and more
importantly brings all the GEP logic under a common umbrella.
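
As a rough illustration of the stride argument, the sketch below models the difference between the two addresses as `Offset + Scale * (idx1 - idx2)`. It is a conservative simplification, not BasicAA's actual check: `ToyGepDiff` and `toyNoAlias` are hypothetical, and wraparound is ignored. If the indices are known to differ, the addresses are at least `|Scale| - |Offset|` bytes apart, so the accesses cannot overlap when both access sizes fit in that distance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the decomposed difference between two GEPs on the
// same base: Offset + Scale * Idx1 - Scale * Idx2.
struct ToyGepDiff {
  int64_t Offset;             // constant part of the difference, in bytes
  int64_t Scale;              // stride shared by both indices, in bytes
  bool IndicesKnownNonEqual;  // true if Idx1 != Idx2 has been proven
};

// Absolute value as unsigned, well-defined even for INT64_MIN.
static uint64_t toyAbs(int64_t V) {
  return V < 0 ? 0 - static_cast<uint64_t>(V) : static_cast<uint64_t>(V);
}

// Conservative check: with Idx1 != Idx2 the two addresses are at least
// |Scale| - |Offset| bytes apart, so neither access can reach the other
// if the larger access size fits in that distance.
bool toyNoAlias(const ToyGepDiff &D, uint64_t Size1, uint64_t Size2) {
  if (!D.IndicesKnownNonEqual)
    return false;
  uint64_t AbsScale = toyAbs(D.Scale);
  uint64_t AbsOffset = toyAbs(D.Offset);
  uint64_t MaxSize = std::max(Size1, Size2);
  return AbsScale >= AbsOffset && AbsScale - AbsOffset >= MaxSize;
}
```

For two 4-byte accesses into an array of 4-byte elements (`Scale == 4`, `Offset == 0`), the check succeeds once the indices are known to differ, and fails if either access is wider than the stride.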

Differential Revision: https://reviews.llvm.org/D92723

3 years ago[TargetMachine] Delete asan workaround
Fangrui Song [Sun, 6 Dec 2020 08:33:11 +0000 (00:33 -0800)]
[TargetMachine] Delete asan workaround

687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the commit introduced a
bug: ppc64 should have been excluded.

3 years ago[X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model
Fangrui Song [Sun, 6 Dec 2020 07:13:28 +0000 (23:13 -0800)]
[X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model

This fixes the bug referenced by 5582a7987662a92eda5d883b88fc4586e755acf5
which was exposed by 961f31d8ad14c66829991522d73e14b5a96ff6d4.

With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`

3 years ago[TargetMachine] Don't imply dso_local for memprof in static relocation model
Fangrui Song [Sun, 6 Dec 2020 05:39:02 +0000 (21:39 -0800)]
[TargetMachine] Don't imply dso_local for memprof in static relocation model

The workaround is no longer needed with my previous commit to MemProfiler.cpp

3 years ago[MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation...
Fangrui Song [Sun, 6 Dec 2020 05:36:31 +0000 (21:36 -0800)]
[MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model

The x86-64 backend currently has a bug which uses the wrong register for the GOTPCREL reference.
The program will crash without the dso_local specifier.

3 years ago[NFC][CodeGen] Simplify SanitizeDtorMembers::Emit
Vitaly Buka [Sat, 5 Dec 2020 05:32:11 +0000 (21:32 -0800)]
[NFC][CodeGen] Simplify SanitizeDtorMembers::Emit

3 years ago[TargetMachine] Set dso_local for memprof
Vitaly Buka [Sun, 6 Dec 2020 04:42:16 +0000 (20:42 -0800)]
[TargetMachine] Set dso_local for memprof

Similar to 5582a7987662a92eda5d883b88fc4586e755acf5

3 years ago[ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator.
Lang Hames [Sun, 6 Dec 2020 04:41:53 +0000 (15:41 +1100)]
[ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator.

3 years ago[RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient...
Craig Topper [Sun, 6 Dec 2020 04:10:25 +0000 (20:10 -0800)]
[RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here.

On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.

3 years ago[asan][test] Fix odr-vtable.cpp
Fangrui Song [Sun, 6 Dec 2020 03:30:41 +0000 (19:30 -0800)]
[asan][test] Fix odr-vtable.cpp

3 years ago[TargetMachine] Set dso_local if asan is detected
Fangrui Song [Sun, 6 Dec 2020 01:51:10 +0000 (17:51 -0800)]
[TargetMachine] Set dso_local if asan is detected

AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.

Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.

3 years ago[debugserver] Call posix_spawnattr_setarchpref_np through the fn ptr.
Jonas Devlieghere [Sun, 6 Dec 2020 01:35:32 +0000 (17:35 -0800)]
[debugserver] Call posix_spawnattr_setarchpref_np through the fn ptr.

Fourth time is the charm? Of course all of these issues don't show up
when the function is available...

3 years ago[NFC][CodeGen] Add sanitize-dtor-zero-size-field test
Vitaly Buka [Sat, 5 Dec 2020 08:54:14 +0000 (00:54 -0800)]
[NFC][CodeGen] Add sanitize-dtor-zero-size-field test

The test demonstrates invalid behaviour which will be fixed soon.

3 years ago[ConstantHoisting] Remove unused declaration optimizeConstants (NFC)
Kazu Hirata [Sun, 6 Dec 2020 00:22:12 +0000 (16:22 -0800)]
[ConstantHoisting] Remove unused declaration optimizeConstants (NFC)

The function was renamed to runImpl on Jul 2, 2016 in commit
071d8306b0d9d1345c1da84ae3e1c1b231ffd29d, but the old declaration has
remained since.

3 years agoAdd recursive decomposition reasoning to isKnownNonEqual
Philip Reames [Sat, 5 Dec 2020 23:57:09 +0000 (15:57 -0800)]
Add recursive decomposition reasoning to isKnownNonEqual

The basic idea is that, by looking through operand instructions which don't change the equality result, we can push the existing known bits comparison down past instructions which would otherwise obscure the known bits.

We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.)  The BasicAA change will be posted separately for review.
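
The decomposition idea can be sketched on a toy expression type. Everything here is an assumption for illustration: `ToyValue`, the single known low bit standing in for full known-bits information, the `MaxDepth` limit, and the choice of "add the same constant" as the only instruction looked through. Adding the same constant to both sides is a bijection on wrapping integers, so the two sums differ exactly when their operands differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy value: either a leaf with an optionally known low bit,
// or "Op + C" for a constant C.
struct ToyValue {
  enum Kind { Leaf, AddConst } K;
  int KnownLowBit;     // Leaf only: 0 or 1 if known, -1 if unknown
  const ToyValue *Op;  // AddConst only: the non-constant operand
  int64_t C;           // AddConst only: the constant addend
};

// Returns true only if A and B are provably different.
bool toyIsKnownNonEqual(const ToyValue &A, const ToyValue &B,
                        unsigned Depth = 0) {
  const unsigned MaxDepth = 6;  // recursion limit (value is an assumption)
  if (Depth >= MaxDepth)
    return false;

  // Look through "X + C" vs "Y + C": the sums differ iff X and Y differ.
  if (A.K == ToyValue::AddConst && B.K == ToyValue::AddConst && A.C == B.C)
    return toyIsKnownNonEqual(*A.Op, *B.Op, Depth + 1);

  // Base case, a stand-in for the known-bits comparison: two leaves differ
  // if a bit known in both has different values.
  if (A.K == ToyValue::Leaf && B.K == ToyValue::Leaf)
    return A.KnownLowBit >= 0 && B.KnownLowBit >= 0 &&
           A.KnownLowBit != B.KnownLowBit;

  return false;
}
```

Without the look-through step, the two sums would reach the base case as opaque values and the query would conservatively answer "unknown".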

Differential Revision: https://reviews.llvm.org/D92698

3 years ago[TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic +...
Fangrui Song [Sat, 5 Dec 2020 23:52:33 +0000 (15:52 -0800)]
[TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden)

This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.

3 years ago[CodeGen] llvm::erase_if (NFC)
Kazu Hirata [Sat, 5 Dec 2020 23:44:40 +0000 (15:44 -0800)]
[CodeGen] llvm::erase_if (NFC)

3 years agoRemove memory allocation with string
Aditya Kumar [Sat, 5 Dec 2020 19:43:45 +0000 (11:43 -0800)]
Remove memory allocation with string

Differential Revision: https://reviews.llvm.org/D92506

3 years ago[TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific...
Fangrui Song [Sat, 5 Dec 2020 23:13:41 +0000 (15:13 -0800)]
[TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget

With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.

This is NFC in terms of Clang emitted assembly.

3 years ago[TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model...
Fangrui Song [Sat, 5 Dec 2020 22:54:37 +0000 (14:54 -0800)]
[TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm

clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)

By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.

This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.

3 years ago[test] Add explicit dso_local to function declarations in static relocation model...
Fangrui Song [Sat, 5 Dec 2020 21:55:48 +0000 (13:55 -0800)]
[test] Add explicit dso_local to function declarations in static relocation model tests

They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.

For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implementing an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).

3 years ago[BasicAA] Fix a bug with relational reasoning across iterations
Philip Reames [Sat, 5 Dec 2020 22:05:48 +0000 (14:05 -0800)]
[BasicAA] Fix a bug with relational reasoning across iterations

Due to the recursion through phis that BasicAA does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this patch fixes one particular instance which is demonstrably incorrect.

Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.

Differential Revision: https://reviews.llvm.org/D92694

3 years ago[debugserver] Use dlsym for posix_spawnattr_setarchpref_np
Jonas Devlieghere [Sat, 5 Dec 2020 22:05:54 +0000 (14:05 -0800)]
[debugserver] Use dlsym for posix_spawnattr_setarchpref_np

The @available check did not work as I thought it did. Use good old
dlsym instead.

3 years ago[X86] Emit @PLT for x86-64 and keep unadorned symbols for x86-32
Fangrui Song [Sat, 5 Dec 2020 21:17:47 +0000 (13:17 -0800)]
[X86] Emit @PLT for x86-64 and keep unadorned symbols for x86-32

This essentially reverts the x86-64 side effect of r327198.

For x86-32, @PLT (R_386_PLT32) is not suitable in -fno-pic mode so the
code forces MO_NO_FLAG (like a forced dso_local) (https://bugs.llvm.org//show_bug.cgi?id=36674#c6).

For x86-64, both `call/jmp foo` and `call/jmp foo@PLT` emit R_X86_64_PLT32
(https://sourceware.org/bugzilla/show_bug.cgi?id=22791) so there is no
difference using @PLT. Using @PLT is actually favorable because this drops
a difference with -fpie/-fpic code and makes it possible to avoid a canonical
PLT entry when taking the address of an undefined function symbol.

3 years ago[llvmbuildectomy] removed vestigial LLVMBuild.txt files
Chris Sears [Sat, 5 Dec 2020 20:59:06 +0000 (21:59 +0100)]
[llvmbuildectomy] removed vestigial LLVMBuild.txt files

LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.

llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt

Differential Revision: https://reviews.llvm.org/D92693

3 years ago[TargetMachine] Move X86 specific shouldAssumeDSOLocal logic to X86Subtarget::classif...
Fangrui Song [Sat, 5 Dec 2020 20:32:50 +0000 (12:32 -0800)]
[TargetMachine] Move X86 specific shouldAssumeDSOLocal logic to X86Subtarget::classifyGlobalFunctionReference

3 years ago[BasicAA] Add more tests for non-equal index (NFC)
Nikita Popov [Sat, 5 Dec 2020 20:21:01 +0000 (21:21 +0100)]
[BasicAA] Add more tests for non-equal index (NFC)

3 years ago[TargetMachine] Simplify shouldAssumeDSOLocal by processing ExternalSymbolSDNode...
Fangrui Song [Sat, 5 Dec 2020 19:40:18 +0000 (11:40 -0800)]
[TargetMachine] Simplify shouldAssumeDSOLocal by processing ExternalSymbolSDNode early

The function accrues many `GV` nullness checks. Process `!GV`
(ExternalSymbolSDNode) early to simplify code.

Also improve a comment added in r327198 (intrinsics is a subset of
ExternalSymbolSDNode).

Intended to be NFC.

3 years ago[debugserver] Remove bridgeos availability
Jonas Devlieghere [Sat, 5 Dec 2020 18:17:48 +0000 (10:17 -0800)]
[debugserver] Remove bridgeos availability

I didn't realize that the 'bridgeos' is not part of the public SDK.

3 years ago[X86] Autodetect znver3
Benjamin Kramer [Sat, 5 Dec 2020 18:07:09 +0000 (19:07 +0100)]
[X86] Autodetect znver3

3 years ago[OpenMP][OMPT] Fix OMPT return address guard for gomp interface
Joachim Protze [Thu, 26 Nov 2020 10:55:56 +0000 (11:55 +0100)]
[OpenMP][OMPT] Fix OMPT return address guard for gomp interface

D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.

This patch fixes the scoping issue, so that all OMPT tests pass, when the
tests are built with gcc.

Differential Revision: https://reviews.llvm.org/D92121

3 years ago[SystemZ][ZOS] Fix the usage of pthread_t within libc++
Zbigniew Sarbinowski [Sat, 5 Dec 2020 00:13:23 +0000 (00:13 +0000)]
[SystemZ][ZOS] Fix the usage of pthread_t within libc++

This is the minimal change introduced in D88599 (https://reviews.llvm.org/D88599) to unblock the controversial change and the discussion of proper separation between thread and thread id, which will continue in D88599.

This patch addresses the differences in the definition of pthread_t on z/OS vs. Linux and other OSes. The main trick to make the code work on z/OS relies on redefining the libcpp_thread_id type and the _LIBCPP_NULL_THREAD macro. This is necessary to separate the initialization of libcxx_thread_id from that of __libcxx_thread_t.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D91875

3 years ago[clang-format] Add option for case sensitive regexes for sorted includes
mydeveloperday [Sat, 5 Dec 2020 16:32:37 +0000 (16:32 +0000)]
[clang-format] Add option for case sensitive regexes for sorted includes

I think the title says everything.

Reviewed By: MyDeveloperDay

Patch By:  HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D91507

3 years ago[NFC][libc++] Update C++20 issues status.
Mark de Wever [Sat, 5 Dec 2020 15:36:19 +0000 (16:36 +0100)]
[NFC][libc++] Update C++20 issues status.

Properly mark LWG1203 as completed and move the version number to the
version column.

3 years ago[NFC][clang-tidy] Fixes comment typos.
Mark de Wever [Sat, 5 Dec 2020 15:31:16 +0000 (16:31 +0100)]
[NFC][clang-tidy] Fixes comment typos.

3 years ago[ConstraintElimination] Wrap dump() call in LLVM_DEBUG (NFC).
Florian Hahn [Sat, 5 Dec 2020 12:55:27 +0000 (12:55 +0000)]
[ConstraintElimination] Wrap dump() call in LLVM_DEBUG (NFC).

ConstraintSystem::dump only generates output with -debug, but there's no
need to call it without -debug.

3 years ago[ConstraintElimination] Handle constraints with all zero var coeffs.
Florian Hahn [Sat, 5 Dec 2020 10:52:50 +0000 (10:52 +0000)]
[ConstraintElimination] Handle constraints with all zero var coeffs.

Constraints where all variable coefficients are 0 do not add any useful
information. When checking, we can check if they are always true/false.
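
A minimal sketch of that check follows. The row layout is an assumption made for illustration: `Row[0]` holds the constant `C0` and `Row[1..]` hold the variable coefficients of a constraint `a1*x1 + ... + an*xn <= C0`. `ToyTrivial` and `toyClassifyConstraint` are hypothetical names, not the ConstraintSystem API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ToyTrivial { NotTrivial, AlwaysTrue, AlwaysFalse };

// Assumed row layout: Row[0] is the constant C0, Row[1..] are the variable
// coefficients of the constraint a1*x1 + ... + an*xn <= C0.
ToyTrivial toyClassifyConstraint(const std::vector<int64_t> &Row) {
  if (Row.empty())
    return ToyTrivial::NotTrivial;
  for (std::size_t I = 1; I < Row.size(); ++I)
    if (Row[I] != 0)
      return ToyTrivial::NotTrivial;
  // With every coefficient zero the constraint reduces to 0 <= C0, which no
  // longer depends on the variables.
  return Row[0] >= 0 ? ToyTrivial::AlwaysTrue : ToyTrivial::AlwaysFalse;
}
```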

3 years ago[AMDGPU][MC] Improved diagnostics message for sym/expr operands
Dmitry Preobrazhensky [Sat, 5 Dec 2020 10:41:27 +0000 (13:41 +0300)]
[AMDGPU][MC] Improved diagnostics message for sym/expr operands

See bug 48295 (https://bugs.llvm.org/show_bug.cgi?id=48295)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92088

3 years ago[AA] Initialize Depth member
Nikita Popov [Sat, 5 Dec 2020 10:35:58 +0000 (11:35 +0100)]
[AA] Initialize Depth member

Fix mistake introduced in f8afba5f7a25a69c12191d979d78d40fa6e5b684:
I failed to initialize the Depth member to zero.

3 years ago[AMDGPU][MC] Corrected error position for invalid MOVREL src
Dmitry Preobrazhensky [Sat, 5 Dec 2020 10:21:28 +0000 (13:21 +0300)]
[AMDGPU][MC] Corrected error position for invalid MOVREL src

See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92084

3 years ago[clang-format] [NFC] keep clang-format tests clang-format clean
mydeveloperday [Sat, 5 Dec 2020 10:14:51 +0000 (10:14 +0000)]
[clang-format] [NFC] keep clang-format tests clang-format clean

I use several of the clang-format clean directories as a test suite; this one had got slightly out of whack in a prior commit.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D92666

3 years ago[AA] Add statistics for alias results (NFC)
Nikita Popov [Sat, 5 Dec 2020 09:39:31 +0000 (10:39 +0100)]
[AA] Add statistics for alias results (NFC)

Count how many NoAlias/MustAlias/MayAlias we get from top-level
queries.

3 years ago[BasicAA] Add recphi tests with nested loops (NFC)
Nikita Popov [Fri, 4 Dec 2020 18:05:16 +0000 (19:05 +0100)]
[BasicAA] Add recphi tests with nested loops (NFC)

3 years ago[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDS...
Fangrui Song [Sat, 5 Dec 2020 08:42:07 +0000 (00:42 -0800)]
[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal

PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead.
Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.

This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).

3 years ago[TargetMachine] Delete wasm special case from shouldAssumeDSOLocal
Fangrui Song [Sat, 5 Dec 2020 07:22:47 +0000 (23:22 -0800)]
[TargetMachine] Delete wasm special case from shouldAssumeDSOLocal

3 years ago[llvm-nm][MachO] Don't call getFlags on redacted symbols
Francis Visoiu Mistrih [Sat, 5 Dec 2020 04:10:06 +0000 (20:10 -0800)]
[llvm-nm][MachO] Don't call getFlags on redacted symbols

Avoid calling getFlags on a non-existent symbol.

The way this is triggered is by calling strip -N on a binary, which sets
the MH_NLIST_OUTOFSYNC_WITH_DYLDINFO header flag. Then, in the
LC_FUNCTION_STARTS command, nm is trying to print the stripped symbols
and needs the proper checks.

3 years ago[AMDGPU] Use llvm::is_contained (NFC)
Kazu Hirata [Sat, 5 Dec 2020 05:42:54 +0000 (21:42 -0800)]
[AMDGPU] Use llvm::is_contained (NFC)

3 years ago[IRCE] Remove unused IsSigned and its accessor (NFC)
Kazu Hirata [Sat, 5 Dec 2020 05:26:12 +0000 (21:26 -0800)]
[IRCE] Remove unused IsSigned and its accessor (NFC)

IsSigned and its accessor, isSigned, were introduced on Oct 25, 2017
in commit 9ac7021a2563d433549a21990f96184d413e69e2.  The last use was
removed on Nov 20, 2017 in commit
268467869b99b15a15f81bf009d31e11536bef39.

3 years ago[RISCV] Formatting for easier reading (NFC)
Hsiangkai Wang [Fri, 4 Dec 2020 07:34:11 +0000 (15:34 +0800)]
[RISCV] Formatting for easier reading (NFC)

Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
3 years ago[mlir][IR] Move the storage for results to before the Operation instead of after.
River Riddle [Sat, 5 Dec 2020 05:01:26 +0000 (21:01 -0800)]
[mlir][IR] Move the storage for results to before the Operation instead of after.

Trailing objects are really nice for storing additional data inline with the main class, and are something that we heavily take advantage of for Operation (and many other classes). To get the address of the inline data you need to do some pointer arithmetic, taking into account any objects stored before the object you want to access. Most classes keep a count of the number of objects, so this is relatively cheap to compute. This is not the case for results though, which have two different types (inline and trailing) whose counts are not necessarily as cheap to compute as the counts for other objects. This revision moves the storage for results to before the operation and stores them in reverse order. This allows getting results to still be very fast, given that they are never iterated directly in order, and also greatly improves the speed when accessing the other trailing objects of an operation (operands/regions/blocks/etc.).

This reduced compile time when compiling a decently sized mlir module by about 400ms (2.17s -> 1.76s).
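
The layout can be illustrated with a small sketch. It is a simplification under stated assumptions, not MLIR's implementation: `ToyOp`, `ToyResult`, and the helper functions are hypothetical, and all results have a single fixed-size type here, whereas the commit distinguishes inline and trailing results. Result `I` lives `I + 1` slots before the operation, so finding it needs no counts, and anything stored after the operation no longer has to skip over the results.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct ToyResult {
  unsigned Index;  // which result of the operation this is
};

struct ToyOp {
  unsigned NumResults;
  unsigned NumOperands;
};

// The prefix layout below relies on both types sharing one alignment.
static_assert(alignof(ToyResult) == alignof(ToyOp),
              "sketch assumes identical alignment");

// Memory layout: [ result N-1 | ... | result 1 | result 0 | ToyOp ]
ToyOp *createToyOp(unsigned NumResults) {
  std::size_t Prefix = sizeof(ToyResult) * NumResults;
  char *Raw = static_cast<char *>(std::malloc(Prefix + sizeof(ToyOp)));
  if (!Raw)
    return nullptr;
  ToyOp *Op = new (Raw + Prefix) ToyOp{NumResults, 0};
  // Results are stored in reverse order, immediately before the operation.
  for (unsigned I = 0; I < NumResults; ++I)
    new (reinterpret_cast<ToyResult *>(Op) - (I + 1)) ToyResult{I};
  return Op;
}

// Result I is found by stepping backwards from the operation itself.
ToyResult *getToyResult(ToyOp *Op, unsigned I) {
  assert(I < Op->NumResults && "result index out of range");
  return reinterpret_cast<ToyResult *>(Op) - (I + 1);
}

// Free the whole allocation, which starts at the last result.
void destroyToyOp(ToyOp *Op) {
  std::free(reinterpret_cast<char *>(Op) -
            sizeof(ToyResult) * Op->NumResults);
}
```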

Differential Revision: https://reviews.llvm.org/D92687

3 years ago[mlir][OpFormatGen] Add support for optional enum attributes
River Riddle [Sat, 5 Dec 2020 04:54:23 +0000 (20:54 -0800)]
[mlir][OpFormatGen] Add support for optional enum attributes

The check for formatting enum attributes was missing a call to get the base attribute, which is necessary to strip off the top-level OptionalAttr<> wrapper.

Differential Revision: https://reviews.llvm.org/D92713

3 years ago[builtins][ARM] Check __ARM_FP instead of __VFP_FP__.
Zhuojia Shen [Sat, 5 Dec 2020 04:53:23 +0000 (20:53 -0800)]
[builtins][ARM] Check __ARM_FP instead of __VFP_FP__.

This patch fixes builtins' CMakeLists.txt and their VFP tests to check
the standard macro defined in the ACLE for VFP support. It also enables
the tests to be built and run for single-precision-only targets while
builtins were built with double-precision support.

Differential revision: https://reviews.llvm.org/D92497

3 years ago[debugserver] Honor the cpu sub type if specified
Jonas Devlieghere [Sat, 5 Dec 2020 04:21:50 +0000 (20:21 -0800)]
[debugserver] Honor the cpu sub type if specified

Use the newly added spawnattr API, posix_spawnattr_setarchpref_np, to
select slice preferences per cpu and subcpu type, instead of just per cpu
type with posix_spawnattr_setbinpref_np.

rdar://16094957

Differential revision: https://reviews.llvm.org/D92712

3 years ago[lldb] Remove unused argument to expectedFailure
Jonas Devlieghere [Fri, 4 Dec 2020 17:45:43 +0000 (09:45 -0800)]
[lldb] Remove unused argument to expectedFailure

3 years ago[ELF] Fix relocation-model.ll
Fangrui Song [Sat, 5 Dec 2020 03:33:19 +0000 (19:33 -0800)]
[ELF] Fix relocation-model.ll

3 years ago[TargetMachine] Don't imply dso_local on global variable declarations in Reloc::Stati...
Fangrui Song [Sat, 5 Dec 2020 03:03:40 +0000 (19:03 -0800)]
[TargetMachine] Don't imply dso_local on global variable declarations in Reloc::Static model

clang/lib/CodeGen/CodeGenModule sets dso_local on applicable global variables,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to remove as much
additional implied dso_local in TargetMachine::shouldAssumeDSOLocal as possible.)

By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent copy relocations.

This patch should be NFC in terms of the Clang behavior because the case where
we no longer imply dso_local is one where Clang already sets dso_local. However,
some tests don't set dso_local on some `external global` declarations and so
expose some differences. Most tests have been fixed to be more robust in
previous commits.

3 years ago[test] Split some tests which test both static and pic relocation models
Fangrui Song [Sat, 5 Dec 2020 02:35:45 +0000 (18:35 -0800)]
[test] Split some tests which test both static and pic relocation models

TargetMachine::shouldAssumeDSOLocal currently implies dso_local for
Static. Split some tests so that these `external dso_local global`
will align with the Clang behavior.

3 years ago[RISCV] Use fcvt.h/d/f.w if the input is an assertsexti32 not just when the input...
Craig Topper [Sat, 5 Dec 2020 02:36:14 +0000 (18:36 -0800)]
[RISCV] Use fcvt.h/d/f.w if the input is an assertsexti32 not just when the input is sext_inreg.

3 years ago[NFC][AMDGPU] AMDGPUUsage updates
Tony [Sat, 5 Dec 2020 00:57:21 +0000 (00:57 +0000)]
[NFC][AMDGPU] AMDGPUUsage updates

- Document code object V2 gfx800.
- Document amdpal is supported by Linux Pro.

Differential Revision: https://reviews.llvm.org/D92708

3 years ago[test] Split some tests which test both static and pic relocation models
Fangrui Song [Sat, 5 Dec 2020 02:11:35 +0000 (18:11 -0800)]
[test] Split some tests which test both static and pic relocation models

TargetMachine::shouldAssumeDSOLocal currently implies dso_local for
Static. Split some tests so that these `external dso_local global` will
align with the Clang behavior.

3 years ago[lld][WebAssembly] Add support for PIC + passive data initialization
Sam Clegg [Fri, 4 Dec 2020 00:51:56 +0000 (16:51 -0800)]
[lld][WebAssembly] Add support for PIC + passive data initialization

This change improves our support for shared memory to include
PIC executables (and shared libraries).

To handle this case, the linker-generated `__wasm_init_memory`
function (which only exists in shared memory builds) must be
capable of loading memory segments at non-const offsets based
on the runtime value of `__memory_base`.

Differential Revision: https://reviews.llvm.org/D92620

3 years agoMake __stack_chk_guard dso_local if Reloc::Static
Fangrui Song [Sat, 5 Dec 2020 00:57:45 +0000 (16:57 -0800)]
Make __stack_chk_guard dso_local if Reloc::Static

This is currently implied by TargetMachine::shouldAssumeDSOLocal
but will be changed in the future.

3 years ago[llvm] Update WinMsvc.cmake's fms-compatibility to match llvm's prereqs
Nathan Lanza [Fri, 4 Dec 2020 23:52:10 +0000 (15:52 -0800)]
[llvm] Update WinMsvc.cmake's fms-compatibility to match llvm's prereqs

llvm's minimum fms-compatibility-version was just bumped to 19.14 and
thus the WinMsvc.cmake file needs to be adjusted accordingly.

3 years ago[RISCV] Define preprocessor definitions for 'V' extension.
Hsiangkai Wang [Fri, 4 Dec 2020 12:45:41 +0000 (20:45 +0800)]
[RISCV] Define preprocessor definitions for 'V' extension.

Differential Revision: https://reviews.llvm.org/D92650

3 years ago[objc] diagnose protocol conformance in categories with direct members
Alex Lorenz [Fri, 4 Dec 2020 23:06:13 +0000 (15:06 -0800)]
[objc] diagnose protocol conformance in categories with direct members
in their corresponding class interfaces

Categories that add protocol conformances to classes with direct members should prohibit protocol
conformances when the methods/properties that the protocol expects are actually declared as 'direct' in the class.

Differential Revision: https://reviews.llvm.org/D92602

3 years ago[clang] add a `swift_async_name` attribute
Alex Lorenz [Fri, 4 Dec 2020 22:45:27 +0000 (14:45 -0800)]
[clang] add a `swift_async_name` attribute

The swift_async_name attribute provides a name for a function/method that can be used
to call the async overload of this method from Swift. The name specified in this attribute
assumes that the last parameter of the function/method it's applied to is removed when
Swift invokes it, as Swift's await/async transformation implicitly constructs the callback.

Differential Revision: https://reviews.llvm.org/D92355

3 years ago[clang] add a new `swift_attr` attribute
Alex Lorenz [Fri, 4 Dec 2020 17:29:45 +0000 (09:29 -0800)]
[clang] add a new `swift_attr` attribute

The swift_attr attribute is a generic annotation attribute that's not used by clang,
but is used by the Swift compiler. The Swift compiler can use these annotations to provide
various syntactic and semantic sugars for the imported Objective-C API declarations.

Differential Revision: https://reviews.llvm.org/D92354

3 years ago[test] precommit test for D92698
Philip Reames [Fri, 4 Dec 2020 23:12:16 +0000 (15:12 -0800)]
[test] precommit test for D92698

3 years agoIndex: Remove unused internal header SimpleFormatContext.h, NFC
Duncan P. N. Exon Smith [Fri, 4 Dec 2020 23:04:31 +0000 (15:04 -0800)]
Index: Remove unused internal header SimpleFormatContext.h, NFC

Looks like nothing has included this header since
d21485d2f5ffacf7b726c741ee409b3682045255 / r286279 in 2016. Delete the
dead code.

3 years agoAdd diagnostic for for-range-declaration being specified with thread_local
shafik [Fri, 4 Dec 2020 22:47:36 +0000 (14:47 -0800)]
Add diagnostic for for-range-declaration being specified with thread_local

Currently we have a diagnostic that catches the other storage class specifiers for the range-based for loop declaration, but we miss the thread_local case. This change adds a diagnostic for that case as well.
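
A minimal sketch of the kind of code this concerns. The function name `sum` is hypothetical and the rejected loop is left commented out so the snippet still compiles; the exact diagnostic text is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper used only to host the two loop forms.
inline int sum(const std::vector<int>& values) {
    int total = 0;
    // Ill-formed: a for-range-declaration may not carry a storage class
    // specifier, so a conforming compiler should reject the next line.
    // for (thread_local int v : values) total += v;
    for (int v : values) total += v;  // OK: plain declaration
    return total;
}
```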

Differential Revision: https://reviews.llvm.org/D92671

3 years ago[asan][test] Improve -asan-use-private-alias tests
Fangrui Song [Fri, 4 Dec 2020 23:05:59 +0000 (15:05 -0800)]
[asan][test] Improve -asan-use-private-alias tests

In preparation for D92078

3 years ago[libc++] Update the commented "synopsis" in <algorithm> to match current reality.
Arthur O'Dwyer [Thu, 3 Dec 2020 01:02:18 +0000 (20:02 -0500)]
[libc++] Update the commented "synopsis" in <algorithm> to match current reality.

The synopsis now reflects what's implemented. It does NOT reflect
all of what's specified in C++20. The "constexpr in C++20" markings
are still missing from these 12 algorithms, because they are still
unimplemented by libc++:

    reverse partition sort nth_element next_permutation prev_permutation
    push_heap pop_heap make_heap sort_heap partial_sort partial_sort_copy

All of the above algorithms were excluded from [P0202].

All of the above algorithms were made constexpr in [P0879] (along with
swap_ranges, iter_swap, and rotate — we've already implemented those three).

Differential Revision: https://reviews.llvm.org/D92255

3 years ago[libc++] [P0202] constexpr set_union, set_difference, set_symmetric_difference, merge
Arthur O'Dwyer [Fri, 4 Dec 2020 18:47:12 +0000 (13:47 -0500)]
[libc++] [P0202] constexpr set_union, set_difference, set_symmetric_difference, merge

These had been waiting on the ability to use `std::copy` from
constexpr code (which in turn had been waiting on the ability to
use `is_constant_evaluated()` to switch between `memmove` and non-`memmove`
implementations of `std::copy`). That work landed a while ago,
so these algorithms can all be constexpr in C++20 now.
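
The dispatch described above can be sketched roughly as follows. This is an illustration under stated assumptions, not libc++'s actual `std::copy`: `copy_ints`, `Pair`, and the two driver functions are hypothetical names, and `__builtin_is_constant_evaluated()` (the GCC/Clang builtin behind `std::is_constant_evaluated()`) is used directly so the sketch also builds before C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: picks an element-wise loop during constant
// evaluation and memmove otherwise.
constexpr int* copy_ints(const int* first, const int* last, int* out) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (__builtin_is_constant_evaluated()) {
        // memmove is not usable in a constant expression, so copy by hand.
        for (std::size_t i = 0; i < n; ++i) out[i] = first[i];
    } else if (n != 0) {
        std::memmove(out, first, n * sizeof(int));
    }
    return out + n;
}

struct Pair { int a; int b; };

constexpr Pair copied_at_compile_time() {
    int src[2] = {7, 9};
    int dst[2] = {0, 0};
    copy_ints(src, src + 2, dst);
    return Pair{dst[0], dst[1]};
}

// Forces constant evaluation, so the loop branch is the one exercised.
static_assert(copied_at_compile_time().a == 7, "first element copied");
static_assert(copied_at_compile_time().b == 9, "second element copied");

inline int copied_at_run_time() {
    int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};
    int* end = copy_ints(src, src + 3, dst);
    assert(end == dst + 3);
    return dst[0] + dst[1] + dst[2];
}
```

Once a copy primitive like this is usable in constant expressions, algorithms built on top of it can be marked constexpr without a separate code path.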

Simultaneously, update the tests for the set algorithms.

- Use an element type with "equivalent but not identical" values.
- The custom-comparator tests now pass something different from `operator<`.
- Make the constexpr coverage match the non-constexpr coverage.

Differential Revision: https://reviews.llvm.org/D92255

3 years ago[libc++] Slightly improve constexpr test coverage for std::includes.
Arthur O'Dwyer [Fri, 4 Dec 2020 18:38:51 +0000 (13:38 -0500)]
[libc++] Slightly improve constexpr test coverage for std::includes.

Differential Revision: https://reviews.llvm.org/D92255

3 years ago[VE] Add vfsqrt, vfcmp, vfmax, and vfmin intrinsic instructions
Kazushi (Jam) Marukawa [Fri, 4 Dec 2020 11:15:13 +0000 (20:15 +0900)]
[VE] Add vfsqrt, vfcmp, vfmax, and vfmin intrinsic instructions

Add vfsqrt, vfcmp, vfmax, and vfmin intrinsic instructions and
regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92651

3 years agoASTImporter: Migrate to the FileEntryRef overload of SourceManager::createFileID...
Duncan P. N. Exon Smith [Thu, 3 Dec 2020 01:25:46 +0000 (17:25 -0800)]
ASTImporter: Migrate to the FileEntryRef overload of SourceManager::createFileID, NFC

Migrate `ASTImporter::Import` over to using the `FileEntryRef` overload
of `SourceManager::createFileID`. No functionality change here.

Differential Revision: https://reviews.llvm.org/D92529

3 years agoARCMigrate: Initialize fields in EditEntry inline, NFC
Duncan P. N. Exon Smith [Thu, 3 Dec 2020 01:32:08 +0000 (17:32 -0800)]
ARCMigrate: Initialize fields in EditEntry inline, NFC

Initialize the fields inline instead of having to manually write out a
default constructor.
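
For illustration, the pattern looks roughly like this. The struct and its fields are hypothetical stand-ins, not the real EditEntry members.

```cpp
#include <cassert>
#include <string>

// Hypothetical struct: default member initializers replace a hand-written
// default constructor such as `Entry() : offset(0), length(0) {}`.
struct Entry {
    std::string text;     // std::string default-constructs to empty
    unsigned offset = 0;  // initialized inline
    unsigned length = 0;  // initialized inline
};

inline unsigned default_entry_span() {
    Entry e;  // uses the implicitly generated default constructor
    return e.offset + e.length;
}
```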

Differential Revision: https://reviews.llvm.org/D92597

3 years agoFrontend: Use translateLineCol instead of translateFileLineCol, NFC
Duncan P. N. Exon Smith [Fri, 4 Dec 2020 22:34:22 +0000 (14:34 -0800)]
Frontend: Use translateLineCol instead of translateFileLineCol, NFC

`ParseDirective` in VerifyDiagnosticConsumer.cpp is already calling
`translateFile`, so use the `FileID` returned by that to call
`translateLineCol` instead of using the more heavyweight
`translateFileLineCol`.

No functionality change here.

3 years ago[MC] Consume EndOfStatement in .cfi_{sections,endproc}
Scott Linder [Fri, 4 Dec 2020 22:14:37 +0000 (22:14 +0000)]
[MC] Consume EndOfStatement in .cfi_{sections,endproc}

Previously these directives were always interpreted as having an extra
blank line after them.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92612

3 years ago[gn build] Port 4d8bf870a82
LLVM GN Syncbot [Fri, 4 Dec 2020 22:16:56 +0000 (22:16 +0000)]
[gn build] Port 4d8bf870a82

3 years agoADT: Remove AlignedCharArrayUnion, NFC
Duncan P. N. Exon Smith [Wed, 2 Dec 2020 23:41:36 +0000 (15:41 -0800)]
ADT: Remove AlignedCharArrayUnion, NFC

Prep commit already migrated users over to std::aligned_union_t; this
just deletes the type / header / test.
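
A minimal sketch of what using `std::aligned_union_t` as raw storage looks like. The alias `Storage` and the helper `roundtrip_double` are hypothetical names chosen for this example, not code from the migrated users.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Raw storage large and aligned enough for either an int or a double.
// The first template argument is the minimum size in bytes.
using Storage = std::aligned_union_t<1, int, double>;

static_assert(sizeof(Storage) >= sizeof(double), "fits the largest member");
static_assert(alignof(Storage) >= alignof(double), "meets the strictest alignment");

inline double roundtrip_double(double value) {
    Storage buf;
    double* p = new (&buf) double(value);  // placement-new into the raw storage
    double result = *p;
    // double is trivially destructible, so no explicit destructor call is needed.
    return result;
}
```

Note that `std::aligned_union_t` is deprecated as of C++23, so newer toolchains may warn on this sketch.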

Differential Revision: https://reviews.llvm.org/D92517

3 years ago[mlir][vector] rephrased description
Aart Bik [Fri, 4 Dec 2020 19:48:48 +0000 (11:48 -0800)]
[mlir][vector] rephrased description

More carefully worded description. Added constructor to options.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D92664

3 years agoInclude BuiltinAttributes.h in llvm-prettyprinters/gdb/mlir-support.cpp
Krzysztof Parzyszek [Fri, 4 Dec 2020 21:54:29 +0000 (15:54 -0600)]
Include BuiltinAttributes.h in llvm-prettyprinters/gdb/mlir-support.cpp

This header was introduced in c7cae0e4fa4e1ed4bdca186096a408578225fc2b.

3 years ago[test] Add explicit dso_local to constant/global variable declarations
Fangrui Song [Fri, 4 Dec 2020 21:51:01 +0000 (13:51 -0800)]
[test] Add explicit dso_local to constant/global variable declarations

They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.

For external data, clang -fno-pic emits the dso_local specifier for ELF and
non-MinGW COFF. Adding explicit dso_local aligns these tests with the
clang behavior and helps implement an option to use GOT indirection for
external data access in -fno-pic mode (to avoid copy relocations).

3 years ago[dfsan] Add empty APIs for field-level shadow
Jianzhou Zhao [Fri, 4 Dec 2020 02:50:56 +0000 (02:50 +0000)]
[dfsan] Add empty APIs for field-level shadow

This is a child diff of D92261.

This diff adds APIs that return shadow type/value/zero from origin
objects. For the time being these APIs simply return primitive
shadow type/value/zero. The following diff will implement the
conversion.

As D92261 explains, some cases still use primitive shadow during
the incremental changes. The cases include
1) alloca/load/store
2) custom function IO
3) vectors
In these cases this diff does not use the new APIs, but uses primitive
shadow objects explicitly.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D92629

3 years ago[OPENMP]Fix PR48394: need to capture variables used in atomic constructs.
Alexey Bataev [Fri, 4 Dec 2020 20:56:54 +0000 (12:56 -0800)]
[OPENMP]Fix PR48394: need to capture variables used in atomic constructs.

The variables used in an atomic construct should be captured implicitly in
outer task-based regions. Otherwise, the compiler will crash trying
to find the address of the local variable.

Differential Revision: https://reviews.llvm.org/D92682