Teresa Johnson [Thu, 30 Sep 2021 02:23:08 +0000 (19:23 -0700)]
Second attempt to fix Windows failures from test changes
Try to address Windows flakes from d87bdc272ba47b7d9109ff5c7191454ab2ae6fcb
by adding "|| true" as suggested in D110276 so the whole test doesn't
fail when Windows thinks it can't remove the binary.
Ruiling Song [Thu, 16 Sep 2021 15:04:39 +0000 (23:04 +0800)]
AMDGPU: Broadcast scalar boolean to vector boolean explicitly
This is used to fix wrong code generation of s_add_co_select_user in
test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll
s_addc_u32 s4, s6, 0
s_cselect_b64 vcc, 1, 0 <-- vcc set as 0x1 if SCC==1
v_mov_b32_e32 v1, s4
s_cmp_gt_u32 s6, 31
v_cndmask_b32_e32 v1, 0, v1, vcc
If the s_addc_u32 sets SCC, then we will get the value 0x1 in VCC.
The v_cndmask will do per thread selection with VCC as condition
register. As VCC only gets the first bit being set, only the first
thread/lane in destination register can get correct result if the
very first lane is active. In fact, we should broadcast the value to all
active lanes of the final register.
The idea here is doing this broadcast to vector boolean explicitly
instead of lowering it into a COPY from SCC which would be interpreted as
selecting between 0/1.
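The difference between the two lowerings can be modelled in a few lines of C++. This is only a sketch under stated assumptions: the 64-lane wave, the all-ones broadcast mask, and every name below (Wave, selectPerLane, scalarBoolAsInteger, broadcastScalarBool) are hypothetical and are not taken from the patch or from the AMDGPU backend.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-lane wave; none of these names are LLVM or ISA APIs.
constexpr unsigned kLanes = 64;
using Wave = std::array<uint32_t, kLanes>;

// Returns bit `lane` of a 64-bit lane mask.
bool laneBit(uint64_t mask, unsigned lane) {
  assert(lane < kLanes && "lane index out of range");
  return (mask >> lane) & 1;
}

// Per-lane select in the spirit of v_cndmask: an active lane takes ifSet[i]
// when its condition bit is 1 and ifClear[i] otherwise; inactive lanes keep
// their old destination value.
Wave selectPerLane(const Wave &dst, const Wave &ifClear, const Wave &ifSet,
                   uint64_t cond, uint64_t exec) {
  Wave out = dst;
  for (unsigned i = 0; i < kLanes; ++i)
    if (laneBit(exec, i))
      out[i] = laneBit(cond, i) ? ifSet[i] : ifClear[i];
  return out;
}

// The problematic lowering: the scalar boolean becomes the integer 0 or 1,
// so only bit 0 of the condition mask can ever be set.
uint64_t scalarBoolAsInteger(bool scc) { return scc ? 1 : 0; }

// Explicit broadcast: every lane's condition bit mirrors the scalar boolean.
uint64_t broadcastScalarBool(bool scc) { return scc ? ~uint64_t{0} : 0; }
```

With all lanes active and the scalar condition true, the integer mask updates lane 0 only, while the broadcast mask updates every lane.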
This is used to replace D109754.
Reviewed-by: foad, alex-t
Differential Revision: https://reviews.llvm.org/D109889
Frederic Cambus [Thu, 30 Sep 2021 01:56:01 +0000 (07:26 +0530)]
[clang] Fix sentence in the usage section of ThinLTO docs.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D110750
Amy Huang [Thu, 30 Sep 2021 01:45:08 +0000 (18:45 -0700)]
Revert "[clang-cl] Accept `#pragma warning(disable : N)` for some N"
because it causes `error: error reading '/wd4091'` errors in
compiler-rt builds.
Matthias Springer [Thu, 30 Sep 2021 00:25:40 +0000 (09:25 +0900)]
[mlir][vector] Fold transfer ops and tensor.extract/insert_slice.
* Fold vector.transfer_read and tensor.extract_slice.
* Fold vector.transfer_write and tensor.insert_slice.
Differential Revision: https://reviews.llvm.org/D110627
Fangrui Song [Wed, 29 Sep 2021 23:56:52 +0000 (16:56 -0700)]
[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC
STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling
convention or the vector calling convention.
See https://github.com/riscv/riscv-elf-psabi-doc/pull/190
Differential Revision: https://reviews.llvm.org/D107949
Andy Kaylor [Wed, 29 Sep 2021 19:25:16 +0000 (12:25 -0700)]
[IntelJITListener] Fix order in JitListener/multiple.ll
As reported in Bugzilla 51859, the JitListener/multiple.ll test had
become stale. The function order in the emitted image was changed by an
update to the MC/ElfObjectWriter code and because this test is disabled
by default, it wasn't updated.
Arthur O'Dwyer [Tue, 28 Sep 2021 16:19:35 +0000 (12:19 -0400)]
[libc++] Simplify the _LIBCPP_CONSTEXPR markings on starts_with() etc.
This came out of review comments on D110598.
Differential Revision: https://reviews.llvm.org/D110637
Rob Suderman [Wed, 29 Sep 2021 01:48:35 +0000 (18:48 -0700)]
[mlir][tosa] Ranked check for transpose was wrong.
Should have verified the perm length and input rank were the same before
inferring shape. Caused a crash with invalid IR.
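As a rough illustration of the ordering described above (validate first, infer second), here is a standalone sketch. It does not use the MLIR API; inferTransposeShape and its signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the MLIR API. Writes the transposed shape to `out`
// and returns true, or returns false when `perms` cannot be applied.
bool inferTransposeShape(const std::vector<int64_t> &inputShape,
                         const std::vector<int64_t> &perms,
                         std::vector<int64_t> &out) {
  // Check before inferring: one permutation entry per input dimension.
  if (perms.size() != inputShape.size())
    return false;
  std::vector<int64_t> result;
  result.reserve(perms.size());
  for (int64_t p : perms) {
    // Reject out-of-range indices instead of reading past the shape.
    if (p < 0 || static_cast<std::size_t>(p) >= inputShape.size())
      return false;
    result.push_back(inputShape[static_cast<std::size_t>(p)]);
  }
  out = result;
  return true;
}
```

Invalid input is reported through the return value, so the caller can emit a diagnostic rather than crash.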
Differential Revision: https://reviews.llvm.org/D110674
Amara Emerson [Wed, 29 Sep 2021 20:40:48 +0000 (13:40 -0700)]
[AArch64][GlobalISel] Widen G_BUILD_VECTOR source & dest element types to s8.
Louis Dionne [Wed, 29 Sep 2021 22:06:37 +0000 (18:06 -0400)]
[libc++] Fix missed rename of libcxx-trunk-shared.cfg.in
There was a race condition between the application of 565d45541f86
and the application of 0c874382b981, which led to the latter missing
some occurrences.
Nikita Popov [Wed, 29 Sep 2021 21:40:43 +0000 (23:40 +0200)]
[BasicAA] Move DecomposedGEP out of header (NFC)
A forward declaration in the header is sufficient, so we can move
the definition of the struct (and VariableGEPIndex) into the
source file.
Leonard Chan [Wed, 29 Sep 2021 21:40:28 +0000 (14:40 -0700)]
[runtimes] Ensure required deps for tests targets are actually built
When building compiler-rt via runtimes, many tests fail because tools like
FileCheck and count aren't built yet. This is because the RUNTIME_TEST_DEPENDENCIES
haven't been added to any of the compiler-rt targets. The fix is to explicitly
add any runtimes as test_targets.
Differential Revision: https://reviews.llvm.org/D109625
Nikita Popov [Wed, 29 Sep 2021 20:58:30 +0000 (22:58 +0200)]
[BasicAA] Pass whole DecomposedGEP to subtraction API (NFC)
Rather than separately handling subtraction of offset and variable
indices, make this one operation. Also rewrite the implementation
to use range-based for loops.
Louis Dionne [Fri, 24 Sep 2021 16:31:45 +0000 (12:31 -0400)]
[libc++] Add the std::views::common range adaptor
Differential Revision: https://reviews.llvm.org/D110433
Sam McCall [Wed, 29 Sep 2021 13:25:15 +0000 (15:25 +0200)]
[VFS] InMemoryFilesystem's UniqueIDs are a function of path and content.
This ensures that re-creating "the same" FS results in the same UIDs for files.
In turn, this means that creating a clang module (preamble) using one in-memory
filesystem and consuming it using another doesn't create duplicate FileEntrys
for files that are the same in both FSes.
It's tempting to give the creator control over the UIDs instead. However that
requires fiddly API changes, e.g. what should the UIDs of intermediate
directories be?
This change is more "magic" but seems safe given:
- InMemoryFilesystem is used in testing more than production
- comparing UIDs across filesystems is unusual
- files with the same path and content are usually logically equivalent
(The usual reason for re-creating virtual filesystems rather than reusing them
is that typical use involves mutating their CWD and so is not threadsafe).
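A minimal sketch of the idea, assuming an FNV-1a hash chained over path and content. The hash choice, the separator byte, and the names fnv1a and contentBasedID are assumptions for illustration; the actual implementation in the patch may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 64-bit FNV-1a over the bytes of `s`, continuing from state `h`.
uint64_t fnv1a(const std::string &s, uint64_t h = 14695981039346656037ULL) {
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

// Hypothetical ID derivation: a pure function of path and content, so two
// independently constructed in-memory filesystems assign the same ID to the
// same file.
uint64_t contentBasedID(const std::string &path, const std::string &content) {
  uint64_t h = fnv1a(path);
  h = fnv1a(std::string(1, '\0'), h); // separator between path and content
  return fnv1a(content, h);
}
```

Because the ID depends on nothing but the two strings, it stays the same regardless of which filesystem instance produced it or in what order files were added.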
Differential Revision: https://reviews.llvm.org/D110711
Louis Dionne [Wed, 29 Sep 2021 18:47:19 +0000 (14:47 -0400)]
[libc++] Rename testing configurations to match Lit stdlib= parameter
To reduce confusion, this commit makes sure that the name of the testing
configurations match the convention used for the stdlib= Lit parameter,
since those effectively correspond to each other.
Louis Dionne [Tue, 28 Sep 2021 19:54:41 +0000 (15:54 -0400)]
[libc++][libc++abi] Add tests for vendor-specific properties
Vendors take libc++ and ship it in various ways. Some vendors might
ship it differently from what upstream LLVM does, i.e. the install
location might be different, some ABI properties might differ, etc.
In the past few years, I've come across several instances where
having a place to test some of these properties would have been
incredibly useful. I also just got bitten by the lack of tests
of that kind, so I'm adding some now.
The tests added by this commit for Apple platforms have numerous
TODOs that capture discrepancies between the upstream LLVM CMake
and the slightly-modified build we perform internally to produce
Apple's system libc++. In the future, the goal would be to upstream
all those differences so that it's possible to build a faithful
Apple system libc++ with the upstream LLVM sources only.
But this isn't only useful for Apple - this lays out the path for
any vendor being able to add their own checks (either upstream or
downstream) to libc++.
Differential Revision: https://reviews.llvm.org/D110736
Matheus Izvekov [Wed, 29 Sep 2021 13:23:30 +0000 (15:23 +0200)]
[clang] don't instantiate templates with injected arguments
There is a special situation with templates in local classes,
as can be seen in this example with generic lambdas in function scope:
```
template<class T1> void foo() {
(void)[]<class T2>() {
struct S {
void bar() { (void)[]<class T3>(T2) {}; }
};
};
};
template void foo<int>();
```
As a consequence of the resolution of DR1484, bar is instantiated during the
substitution of foo, and in this context we would substitute the lambda within
it with its own parameters "injected" (turned into arguments).
This can't be properly dealt with for at least a couple of reasons:
* The 'TemplateTypeParm' type itself can only deal with canonical replacement
types, which the injected arguments are not.
* If T3 were constrained in the example above, our (non-conforming) eager
substitution of type constraints would just leave that parameter dangling.
Instead of substituting with injected parameters, this patch just leaves those
inner levels unreplaced.
Since injected arguments appear to be unused within the users of
`getTemplateInstantiationArgs`, this patch just removes that support there and
leaves a couple of asserts in place.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110727
Louis Dionne [Wed, 29 Sep 2021 21:16:30 +0000 (17:16 -0400)]
[libc++][ci] Run alternative builds earlier to reduce latency
The Runtimes build is by far our longest CI configuration, so it makes
sense to run it earlier during CI. For consistency, move all the other
jobs from that "section" too.
Ricky Taylor [Wed, 29 Sep 2021 20:05:54 +0000 (21:05 +0100)]
[M68k] Avoid UB in disassembler
When reading 32 bits a 32-bit shift would be executed.
This is undefined behaviour, but in this case we can just replace the
entire scratch value to avoid it.
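The pattern can be sketched as follows. This is not the M68k disassembler code; shiftInBits is a hypothetical helper, and the shift direction is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: shift `numBits` new bits into the low end of a
// 32-bit scratch value.
uint32_t shiftInBits(uint32_t scratch, uint32_t newBits, unsigned numBits) {
  assert(numBits >= 1 && numBits <= 32 && "bit count out of range");
  // Shifting a 32-bit value by 32 is undefined behaviour in C++, so the
  // full-width case replaces the scratch value outright.
  if (numBits == 32)
    return newBits;
  const uint32_t mask = (uint32_t{1} << numBits) - 1;
  return (scratch << numBits) | (newBits & mask);
}
```

For narrower reads the usual shift-and-or path is taken; only the 32-bit read needs the special case.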
Differential Revision: https://reviews.llvm.org/D110769
Matheus Izvekov [Wed, 29 Sep 2021 11:24:28 +0000 (13:24 +0200)]
[clang] NFC: remove duplicated code around type constraint and templ arg subst
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110726
Nikita Popov [Wed, 29 Sep 2021 20:22:47 +0000 (22:22 +0200)]
[BasicAA] Pass DecomposedGEP to constantOffsetHeuristic() (NFC)
Rather than separately passing VarIndices and BaseOffset, pass
the whole DecomposedGEP.
LLVM GN Syncbot [Wed, 29 Sep 2021 20:09:08 +0000 (20:09 +0000)]
[gn build] Port 969359e3b86b
Joseph Huber [Tue, 28 Sep 2021 19:53:55 +0000 (15:53 -0400)]
[OpenMP] Apply OpenMP assumptions to applicable call sites
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
Joseph Huber [Tue, 28 Sep 2021 03:10:27 +0000 (23:10 -0400)]
[OpenMP][NFC] Fix linting messages in OpenMPOpt
Summary:
This patch addresses some linting messages I keep getting in my editor
when working on OpenMPOpt.
Joseph Huber [Wed, 29 Sep 2021 19:47:59 +0000 (15:47 -0400)]
[OpenMP] Add missing distribute definitions to AAKernelInfo
Summary:
The RTL functions added in https://reviews.llvm.org/D110429 were
mistakenly left out from the list of safe runtime calls in AAKernelInfo.
This patch adds them in.
peter klausler [Tue, 21 Sep 2021 23:06:30 +0000 (16:06 -0700)]
[flang] Make builtin types more easily accessible; use them
Rearrange the contents of __builtin_* module files a little and
make sure that semantics implicitly USEs the module __Fortran_builtins
before processing each source file. This ensures that the special derived
types for TEAM_TYPE, EVENT_TYPE, LOCK_TYPE, &c. exist in the symbol table
where they will be available for use in coarray intrinsic function
processing.
Update IsTeamType() to exploit access to the __Fortran_builtins
module rather than applying ad hoc name tests. Move it and some
other utilities from Semantics/tools.* to Evaluate/tools.* to make
them available to intrinsics processing.
Add/correct the intrinsic table definitions for GET_TEAM, TEAM_NUMBER,
and THIS_IMAGE to exercise the built-in TEAM_TYPE as an argument and
as a result.
Add/correct/extend tests accordingly.
Differential Revision: https://reviews.llvm.org/D110356
Arthur O'Dwyer [Mon, 27 Sep 2021 04:48:39 +0000 (00:48 -0400)]
[libc++] [compare] Named comparison functions, is_eq etc.
Some of these were previously half-implemented in "ordering.h";
now they're all implemented, and tested.
Note that `constexpr` functions are implicitly `inline`, so the
standard wording omits `inline` on these; but Louis and I agree
that that's surprising and it's better to be explicit about it.
Differential Revision: https://reviews.llvm.org/D110515
Bjorn Pettersson [Mon, 13 Sep 2021 08:43:38 +0000 (10:43 +0200)]
[test] Update some test cases to use -passes when specifying the pipeline
This updates transform test cases for
ADCE
AddDiscriminators
AggressiveInstCombine
AlignmentFromAssumptions
ArgumentPromotion
BDCE
CalledValuePropagation
DCE
Reg2Mem
WholeProgramDevirt
to use the -passes syntax when specifying the pipeline.
Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is
a deprecated feature) the updated test cases already used the new
pass manager, but they were using the legacy syntax when specifying
the passes to run. This patch can be seen as a step toward deprecating
that interface.
This patch also removes some redundant RUN lines. Here I am
referring to test cases that had multiple RUN lines verifying both
the legacy "-passname" syntax and the new "-passes=passname" syntax.
Since we switched the default pass manager to "new PM" both RUN lines
have verified the new PM version of the pass (more or less wasting
time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER
is set to "off". It is assumed that it is enough to run these tests
with the new pass manager now.
Differential Revision: https://reviews.llvm.org/D108472
Jessica Clarke [Wed, 29 Sep 2021 19:47:31 +0000 (20:47 +0100)]
[NFC][clang] Add newline to end of 2005-01-02-ConstantInits.c
This was removed in a18181931f99.
Eric Schweitz [Wed, 29 Sep 2021 19:43:38 +0000 (21:43 +0200)]
[fir] Move parser/printer/verifier of fir.string_lit and add builders
Move the parser, printer and verifier to the .cpp file. Add builders
needed for lowering.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz, mehdi_amini
Differential Revision: https://reviews.llvm.org/D110686
Co-authored-by: Valentin Clement <clementval@gmail.com>
Wael Yehia [Wed, 29 Sep 2021 19:42:43 +0000 (19:42 +0000)]
Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace."
This reverts commit a60405cf035dc114e7ee090139bed2577f4ea7ef.
Stefan Pintilie [Tue, 28 Sep 2021 19:44:30 +0000 (14:44 -0500)]
[PowerPC] The builtins load8r and store8r are Power 7 plus.
This patch makes sure that the builtins __builtin_ppc_load8r and
__builtin_ppc_store8r are only available for Power 7 and up.
Currently the builtins seem to produce incorrect code if used for
Power 6 or before.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D110653
Roman Lebedev [Wed, 29 Sep 2021 19:16:08 +0000 (22:16 +0300)]
[NFC][X86][Codegen] Add test coverage for interleaved i32 load/store stride=2
Roman Lebedev [Wed, 29 Sep 2021 19:06:33 +0000 (22:06 +0300)]
[NFC][X86][LV] Add costmodel test coverage for interleaved i32 load/store stride=2
Daniil Fukalov [Wed, 29 Sep 2021 18:55:54 +0000 (21:55 +0300)]
[NFC][AMDGPU] Add missing gfx90a test cases to fsub.ll.
Sjoerd Meijer [Wed, 29 Sep 2021 18:32:46 +0000 (19:32 +0100)]
[LoopFlatten] Bail if we can't perform flattening after IV widening
It can happen that after widening of the IV, flattening may not be possible,
e.g. when it is deemed unprofitable. We were not properly checking this, which
resulted in flattening being applied when it shouldn't, also leading to
incorrect results (miscompilation).
This should fix PR51980 (https://bugs.llvm.org/show_bug.cgi?id=51980)
Differential Revision: https://reviews.llvm.org/D110712
Roman Lebedev [Wed, 29 Sep 2021 18:42:01 +0000 (21:42 +0300)]
[X86][Costmodel] Load/store i8 Stride=2 VF=32 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/xz6x7c35P - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=2.5`
So pick cost of `6`.
For store we have:
https://godbolt.org/z/xz6x7c35P - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110709
Roman Lebedev [Wed, 29 Sep 2021 18:42:01 +0000 (21:42 +0300)]
[X86][Costmodel] Load/store i8 Stride=2 VF=16 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/a9hv4z47v - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: =2.0`
So pick cost of `4`.
For store we have:
https://godbolt.org/z/6GfPn1b79 - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `3`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110708
Roman Lebedev [Wed, 29 Sep 2021 18:41:56 +0000 (21:41 +0300)]
[X86][Costmodel] Load/store i8 Stride=2 VF=8 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
Identical to VF=2.
For load we have:
https://godbolt.org/z/4TEbdzbMM - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0`
So pick cost of `2`.
For store we have:
https://godbolt.org/z/MYfzGPf3Y - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5`
So pick cost of `1`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110705
Roman Lebedev [Wed, 29 Sep 2021 18:41:51 +0000 (21:41 +0300)]
[X86][Costmodel] Load/store i8 Stride=2 VF=4 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
Identical to VF=2.
For load we have:
https://godbolt.org/z/sGE41GYo7 - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0`
So pick cost of `2`.
For store we have:
https://godbolt.org/z/ba5r3s9xa - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5`
So pick cost of `1`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110704
Roman Lebedev [Wed, 29 Sep 2021 18:41:46 +0000 (21:41 +0300)]
[X86][Costmodel] Load/store i8 Stride=2 VF=2 interleaving costs
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/caKqjr9hb - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0`
So pick cost of `2`.
For store we have:
https://godbolt.org/z/6TTn3eKj8 - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5`
So pick cost of `1`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110702
Muiez Ahmed [Wed, 29 Sep 2021 18:48:32 +0000 (14:48 -0400)]
[NFC] Add contributor name to CREDITS.TXT
Differential Revision: https://reviews.llvm.org/D110650
Wesley Wiser [Wed, 29 Sep 2021 18:36:13 +0000 (11:36 -0700)]
[Mangler] Calculate the argument list byte count suffix correctly when returning large values
`__stdcall`, `__fastcall` and `__vectorcall` return large values via a
hidden pointer argument. However, the size of that argument should not
be included in the argument list byte count suffix added to the
function's decorated name.
This patch fixes that issue so that LLVM generates the same decorated
name as MSVC does.
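To make the rule concrete, here is a sketch for 32-bit x86 `__stdcall` only, where the decorated name has the form `_name@N` and each argument occupies a multiple of 4 bytes. The Param struct and decorateStdcall are hypothetical and are not LLVM's Mangler interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one lowered argument.
struct Param {
  unsigned sizeInBytes;
  bool isHiddenReturnPointer; // pointer used to return a large value
};

// Hypothetical sketch, not LLVM's Mangler code: builds `_name@N`, where N is
// the argument list byte count.
std::string decorateStdcall(const std::string &name,
                            const std::vector<Param> &params) {
  assert(!name.empty() && "function name must not be empty");
  unsigned bytes = 0;
  for (const Param &p : params) {
    if (p.isHiddenReturnPointer)
      continue; // the hidden pointer does not count towards the suffix
    bytes += (p.sizeInBytes + 3) & ~3u; // round each argument up to 4 bytes
  }
  return "_" + name + "@" + std::to_string(bytes);
}
```

A function taking an `int` and a `double` and returning a large struct therefore gets the suffix `@12`, not `@16`.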
MSVC example: https://godbolt.org/z/nc35MKPhr
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110719
Alex Langford [Mon, 20 Sep 2021 21:39:13 +0000 (14:39 -0700)]
[lldb] Remove Expression's dependency on CPlusPlusLanguagePlugin
This change accomplishes the following:
- Moves `IRExecutionUnit::FindBestAlternateMangledName` to `Language`.
- Renames `FindBestAlternateMangledName` to
`FindBestAlternateFunctionMangledName`
- Changes the first parameter of said method from a `ConstString`
representing a demangled name to a `Mangled`.
- Remove the use of CPlusPlusLanguage from Expression
Martin Storsjö [Fri, 10 Sep 2021 11:36:13 +0000 (14:36 +0300)]
[libcxx] Add a CI configuration for standalone building in llvm-project/runtimes
Generate the llvm-lit script in runtimes/CmakeFiles.txt unless invoked
from llvm/runtimes.
Differential Revision: https://reviews.llvm.org/D109593
Nico Weber [Wed, 29 Sep 2021 18:28:13 +0000 (14:28 -0400)]
[clang] Minor cleanups after b2de52bec
Jay Foad [Wed, 29 Sep 2021 10:42:04 +0000 (11:42 +0100)]
[AMDGPU] Enable machine verification after AMDGPUISelDAGToDAG
This was introduced in D32628 but it does not seem to be required any
more. At least it does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110692
Louis Dionne [Wed, 29 Sep 2021 17:42:55 +0000 (13:42 -0400)]
[libc++][NFC] Reorganize CI jobs into commented sections
Dan Liew [Wed, 29 Sep 2021 17:28:03 +0000 (10:28 -0700)]
Adapt `tsan/flush_memory.cpp` to run on non-local platforms.
ad890aa2327feb6b6aee676fe85b2352fba2403e landed a test without
using the `%run` prefix which means the test fails to run for
platforms that need it (e.g. iOS simulators).
This patch adds the `%run` prefix. While we're here also split
the single `RUN` line into two to make debugging easier.
rdar://83637296
Differential Revision: https://reviews.llvm.org/D110734
Petr Hosek [Wed, 29 Sep 2021 08:24:43 +0000 (01:24 -0700)]
[Driver] Check that short triples are supported for Fuchsia
{x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia should
behave identically as targets, update the test to make sure that's the
case.
Differential Revision: https://reviews.llvm.org/D110687
Nico Weber [Tue, 28 Sep 2021 23:33:59 +0000 (19:33 -0400)]
[clang-cl] Accept `#pragma warning(disable : N)` for some N
clang-cl maps /wdNNNN to -Wno-flags for a few warnings that map
cleanly from cl.exe concepts to clang concepts.
This patch adds support for the same numbers to
`#pragma warning(disable : NNNN)`. It also lets
`#pragma warning(push)` and `#pragma warning(pop)` have an effect,
since these are used together with `warning(disable)`.
The optional numeric argument to `warning(push)` is ignored,
as are the other non-`disable` `pragma warning()` arguments.
(Supporting `error` would be easy, but we also don't support
`/we`, and those should probably be added together.)
The motivating example is that a bunch of code (including in LLVM)
uses this idiom to locally disable warnings about calls to deprecated
functions in Windows-only code, and 4996 maps nicely to
-Wno-deprecated-declarations:
#pragma warning(push)
#pragma warning(disable: 4996)
f();
#pragma warning(pop)
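The push/disable/pop behaviour in this idiom can be modelled with a small state stack. This is a hypothetical sketch, not clang's PragmaWarningHandler; the WarningState class and its methods are invented for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical model of the pragma state.
class WarningState {
  std::set<int> Disabled;               // warning IDs currently disabled
  std::vector<std::set<int>> Saved;     // snapshots taken by push()

public:
  // #pragma warning(push): remember the current set of disabled warnings.
  void push() { Saved.push_back(Disabled); }

  // #pragma warning(disable : N)
  void disable(int ID) { Disabled.insert(ID); }

  // #pragma warning(pop): restore the most recent snapshot. Returns false
  // for an unmatched pop and leaves the state unchanged.
  bool pop() {
    if (Saved.empty())
      return false;
    Disabled = Saved.back();
    Saved.pop_back();
    return true;
  }

  bool isDisabled(int ID) const { return Disabled.count(ID) != 0; }
};
```

In this model, warning 4996 is disabled for the call to f() and enabled again after the pop.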
Implementation-wise:
- Move `/wd` flag handling from Options.td to actual Driver-level code
- Extract the function mapping cl.exe IDs to warning groups to the
new file clang/lib/Basic/CLWarnings.cpp
- Create a diag::Group enum so that CLWarnings.cpp can refer to
existing groups by ID (and give DllexportExplicitInstantiationDecl
a named group), and add a function to map a diag::Group to the
spelling of its associated commandline flag
- Call that new function from PragmaWarningHandler
Differential Revision: https://reviews.llvm.org/D110668
Louis Dionne [Wed, 29 Sep 2021 17:11:52 +0000 (13:11 -0400)]
[libc++] Move libc++ specific tests to `libcxx/test/libcxx`
This is consistent with what we've been doing forever.
Sanjay Patel [Wed, 29 Sep 2021 15:59:50 +0000 (11:59 -0400)]
[InstSimplify] (-1 << x) s>> x --> -1
This was noticed in:
https://llvm.org/PR51351
https://alive2.llvm.org/ce/z/aLxunD
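The fold can be checked exhaustively for an 8-bit type. The sketch below is an illustration only and does not replace the alive2 proof linked above; ashr8 and foldHoldsForI8 are hypothetical names, and the arithmetic shift is written with unsigned operations to avoid relying on signed-shift behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on an 8-bit value, using unsigned arithmetic only.
uint8_t ashr8(uint8_t v, unsigned x) {
  assert(x < 8 && "shift amount out of range");
  uint8_t shifted = static_cast<uint8_t>(v >> x);
  if (v & 0x80) // replicate the sign bit into the vacated high bits
    shifted |= static_cast<uint8_t>(0xFFu << (8 - x));
  return shifted;
}

// Checks (-1 << x) s>> x == -1 for every in-range shift amount of an i8.
bool foldHoldsForI8() {
  for (unsigned x = 0; x < 8; ++x) {
    const uint8_t shl = static_cast<uint8_t>(0xFFu << x); // -1 << x
    if (ashr8(shl, x) != 0xFF)                            // 0xFF is -1 as i8
      return false;
  }
  return true;
}
```

The left shift always leaves the sign bit set, so the arithmetic right shift refills the low bits with ones.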
Aart Bik [Wed, 29 Sep 2021 05:48:32 +0000 (22:48 -0700)]
[mlir][sparse] simplify negi code generation with subi
The lack of negi details leaked from merger class into codegen part.
Also, special case for vector code was not needed, the type can be used directly!
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110677
Walter Erquinigo [Wed, 29 Sep 2021 16:40:53 +0000 (09:40 -0700)]
Fix LLDB build on old Linux kernels
Usage of aux_size is guarded against elsewhere in this file, but is missing here.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D110269
Original Author: calebzulawski
Eric Schweitz [Wed, 29 Sep 2021 16:29:33 +0000 (18:29 +0200)]
[fir] Update fir.call op
Move builders to .cpp file and update accordingly.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110698
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Quinn Pham [Thu, 22 Jul 2021 13:47:57 +0000 (08:47 -0500)]
[PowerPC] swdiv builtins for XL compatibility
This patch is in a series of patches to provide builtins for compatibility with
the XL compiler. This patch implements the software divide builtin as
wrappers for a floating point divide. XL provided these builtins because it
didn't produce software estimates by default at `-Ofast`. When compiled
with `-Ofast` these builtins will produce the software estimate for divide.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106959
Michael Kruse [Wed, 29 Sep 2021 15:51:13 +0000 (10:51 -0500)]
[llvm-reduce] Reduce metadata references.
The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNode. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference, which makes it impossible to reduce by keeping metadata only on those memory accesses for which it is interesting.
Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle.
This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes.
Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind).
Reviewed By: aeubanks, swamulism
Differential Revision: https://reviews.llvm.org/D110534
Stuart Brady [Tue, 28 Sep 2021 16:11:04 +0000 (17:11 +0100)]
[OpenCL][NFC] Refactor vloada_half and vstorea_half decls
Group them together with the vload_half and vstore_half decls for
simplicity.
Reviewed By: svenvh
Differential Revision: https://reviews.llvm.org/D110636
Dhruva Chakrabarti [Wed, 29 Sep 2021 05:44:14 +0000 (22:44 -0700)]
[libomptarget] [amdgpu] After a kernel dispatch packet is published, its contents must not be accessed.
Fixes: SWDEV-275232 (With contributions from Ammar Elwazir, Laurent Morichetti, and Tony Tye)
The current code is racy. After the packet is submitted, the GPU will increment the read index. If this wraps around before the memory is read from, it will refer to a signal from an unrelated packet. This change avoids reading from the packet post-submission.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D110679
Eric Schweitz [Wed, 29 Sep 2021 16:12:17 +0000 (18:12 +0200)]
[fir][NFC] Update fir.iterate_while op
Add getFinalValueAttrName() and remove specified number of
inlined elements for SmallVector. This patch is mainly motivated
to help the upstreaming effort.
This patch is part of the upstreaming effort from fir-dev branch.
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110710
Kazu Hirata [Wed, 29 Sep 2021 16:08:41 +0000 (09:08 -0700)]
[AArch64] Remove redundant declaration createAArch64ObjectTargetStreamer (NFC)
Note that createAArch64ObjectTargetStreamer is declared in
AArch64TargetStreamer.h and defined in AArch64TargetStreamer.cpp.
Identified with readability-redundant-declaration.
David Green [Wed, 29 Sep 2021 15:55:31 +0000 (16:55 +0100)]
[AArch64] Model Cortex-A55 Q register NEON instructions
Cortex-A55 has 2 64bit NEON vector units, meaning a 128bit instruction
requires taking both units (and can only be issued as the first
instruction in a dual issue pair). This patch models that by splitting
the WriteV SchedWrite into two - the WriteVd that reads/writes only
64bit operands, and the WriteVq that reads/writes 128bit registers. The
A55 schedule then uses this distinction to model the WriteVq as taking
both resource units, and starting a Schedule Group and WriteVd as taking
one as before.
I believe this is more correct, even if it does not lead to much better
performance.
Differential Revision: https://reviews.llvm.org/D108766
Sean Fertile [Wed, 29 Sep 2021 15:53:46 +0000 (11:53 -0400)]
[PowerPC][AIX] Warn when using pragma align(packed) on AIX.
With xlc and xlC pragma align(packed) will pack bitfields the same way
as pragma align(bit_packed). xlclang, xlclang++ and clang will
pack bitfields the same way as pragma pack(1). Issue a warning when
source code using pragma align(packed) is used to alert the user it
may not be compatible with xlc/xlC.
Differential Revision: https://reviews.llvm.org/D107506
Sanjay Patel [Wed, 29 Sep 2021 15:38:48 +0000 (11:38 -0400)]
[InstCombine] fix miscompile from dropRedundantMaskingOfLeftShiftInput()
The test is from https://llvm.org/PR51351.
There are 2 related logic bugs from over-generalizing "lshr" to "any shr",
but I'm not sure how to expose the difference for "MaskC" because instsimplify
already folds ashr of -1.
I'll extend instsimplify to catch the MaskD pattern as a follow-up, but this
patch should be enough to avoid the miscompile.
Sanjay Patel [Wed, 29 Sep 2021 14:56:56 +0000 (10:56 -0400)]
[InstSimplify] add tests for (-1 << x) s>> x; NFC
Sanjay Patel [Wed, 29 Sep 2021 14:44:46 +0000 (10:44 -0400)]
[InstCombine] add test for miscompile in dropRedundantMaskingOfLeftShiftInput(); NFC (PR51351)
Jay Foad [Wed, 29 Sep 2021 09:52:00 +0000 (10:52 +0100)]
[MSP430] Recognize Bi as an indirect branch in analyzeBranch. NFC.
Recognize Bi as an unconditional branch, just like JMP. This allows
machine verification to run after MSP430BranchSelector without failing
this assertion:
virtual bool llvm::MSP430InstrInfo::analyzeBranch(llvm::MachineBasicBlock &, llvm::MachineBasicBlock *&, llvm::MachineBasicBlock *&, SmallVectorImpl<llvm::MachineOperand> &, bool) const: Assertion `I->getOpcode() == MSP430::JCC && "Invalid conditional branch"' failed.
Note that machine verification is currently disabled after
addPreEmitPass passes because of problems on other targets, so this is
currently NFC.
Differential Revision: https://reviews.llvm.org/D110691
Simon Pilgrim [Wed, 29 Sep 2021 15:41:53 +0000 (16:41 +0100)]
[TTI] BasicTTI::getInterleavedMemoryOpCost(): use getScalarizationOverhead()
getScalarizationOverhead() results in a somewhat better cost estimation than counting the insertion/extraction costs directly. Notably, this is still overestimating the costs.
Original Patch by: @lebedev.ri (Roman Lebedev)
Differential Revision: https://reviews.llvm.org/D110713
Simon Pilgrim [Wed, 29 Sep 2021 11:54:52 +0000 (12:54 +0100)]
[clang-tidy] Merges separate isa<>/assert/unreachable/dyn_cast<>/cast<> calls
We can directly use cast<> instead of separate dyn_cast<> with assertions as cast<> will perform this for us.
Similarly, we can replace an if(isa<>)+cast<>/dyn_cast<> with if(dyn_cast<>).
Simon Pilgrim [Wed, 29 Sep 2021 11:28:38 +0000 (12:28 +0100)]
[CostModel][AArch64] Don't dereference CostTblEntry before null check.
Fix static analysis warning that we check for null Entry after dereferencing it.
I don't think this can actually happen as i8/i16 should legalize to use the i32 path which should return a cost - but I'd rather play it safe than rely on an implicit type legalization.
Sam Clegg [Wed, 29 Sep 2021 14:26:46 +0000 (07:26 -0700)]
[WebAssembly][Object] Fix dead code in WasmObjectFile.cpp
I introduced this by mistake in https://reviews.llvm.org/D109595.
Differential Revision: https://reviews.llvm.org/D110717
Raphael Isemann [Wed, 29 Sep 2021 14:47:12 +0000 (16:47 +0200)]
[lldb] Fix TestImportStdModule on some setups by testing minmax instead of abs
Some downstream forks of LLDB change parts of the test setup in a way that
causes lldb to somehow resolve `std::abs` (probably to `::abs`). This patch
changes the tested function here to be `std::minmax` which (hopefully) doesn't
have any identically named functions that LLDB could find and call. Just to be
extra safe this also explicitly specifies the template arguments so that in
case there is a `minmax` non-template function we still don't end up calling it
from this test.
Teresa Johnson [Wed, 29 Sep 2021 14:58:49 +0000 (07:58 -0700)]
Use rm -f to fix Windows failures from test changes
Try to address Windows flakes from
d87bdc272ba47b7d9109ff5c7191454ab2ae6fcb
by using 'rm -f' instead of just 'rm' as discussed in D110276. For example:
http://45.33.8.238/win/46115/step_7.txt
Frederic Cambus [Wed, 29 Sep 2021 14:14:47 +0000 (19:44 +0530)]
[clang] Fix library name (libsupc++) in the admonition note.
Differential Revision: https://reviews.llvm.org/D110715
David Green [Wed, 29 Sep 2021 14:13:12 +0000 (15:13 +0100)]
[AArch64] Enable type promotion for AArch64
This enables the type promotion pass for AArch64, which acts as a
CodeGenPrepare pass to promote illegal integers to legal ones,
especially useful for removing extends that would otherwise require
cross-basic-block analysis.
I have enabled this generally, for both ISel and GlobalISel. In some
quick experiments it appeared to help GlobalISel remove extra extends in
places too, but that might just be missing optimizations that are better
left for later. We can disable it again if required.
In my experiments, this can improve performance in some cases, and
codesize was a small improvement. SPEC was a very small improvement,
within the noise. Some of the test cases show extends being moved out of
loops, often when the extend would be part of a cmp operand, but that
should reduce the latency of the instruction in the loop on many cpus.
The signed-truncation-check tests are increasing as they are no longer
matching specific DAG combines.
We also hope to add some additional improvements to the pass in the near
future, to capture more cases of promoting extends through phis that
have come up in a few places lately.
Differential Revision: https://reviews.llvm.org/D110239
Marcel Koester [Tue, 7 Sep 2021 10:21:35 +0000 (12:21 +0200)]
Introduced AllocationOpInterface to create deallocation operations on-the-fly that are compatible with the allocation operation implementing this interface.
Added interface implementations for AllocOp and CloneOp defined in the MemRef dialect.
Adapted the BufferDeallocation pass to be compatible with the interface introduced in this CL.
Differential Revision: https://reviews.llvm.org/D109350
Florian Hahn [Wed, 29 Sep 2021 13:38:35 +0000 (14:38 +0100)]
[IndVarSimplify] Forget phi value after changing incoming value.
This fixes an issue exposed by D71539, where IndVarSimplify tries
to access an invalid cached SCEV expression after making changes to the
underlying PHI instruction earlier.
When changing the incoming value of a PHI, forget the cached SCEV for
the PHI.
Nicolas Vasilache [Wed, 29 Sep 2021 09:36:32 +0000 (09:36 +0000)]
[mlir][Linalg] Rewrite CodegenStrategy to populate a pass pipeline.
This revision retires a good portion of the complexity of the codegen strategy and puts the logic behind pass logic.
Differential revision: https://reviews.llvm.org/D110678
David Green [Wed, 29 Sep 2021 13:35:09 +0000 (14:35 +0100)]
[AArch64] Add TypePromotion tests and regenerate atomic test check lines
This adds some extra tests for TypePromotion as per D110239, and
regenerates the check lines in atomic-ops.ll and cmpxchg-idioms.ll to be
easier to maintain with changing codegen (hopefully in a way that
does not reduce what is tested).
Roman Lebedev [Wed, 29 Sep 2021 12:27:26 +0000 (15:27 +0300)]
[NFC][X86] Add codegen test coverage for interleaved load/store i8 stride=2
Roman Lebedev [Wed, 29 Sep 2021 12:19:56 +0000 (15:19 +0300)]
[NFC][X86][LV] Add costmodel test coverage for interleaved i8 load/store stride=2
Nico Weber [Wed, 29 Sep 2021 00:57:40 +0000 (20:57 -0400)]
[lld/mac] Don't warn on both --icf=all and -no_deduplicate
Instead, just make the later flag win, like usual.
Implement this by making -no_deduplicate an actual alias for --icf=none
at the Options.td level.
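The "later flag wins" behaviour can be sketched as follows. This is only an
illustration of alias handling; the function name resolve_icf is hypothetical,
and the real change is made in Options.td rather than in code like this.

```python
def resolve_icf(args):
    """Return the ICF level selected by the last relevant flag, or None."""
    level = None
    for arg in args:
        # Treat -no_deduplicate as a plain alias for --icf=none.
        if arg == "-no_deduplicate":
            arg = "--icf=none"
        if arg.startswith("--icf="):
            level = arg[len("--icf="):]   # a later flag overrides earlier ones
    return level

assert resolve_icf(["--icf=all", "-no_deduplicate"]) == "none"
assert resolve_icf(["-no_deduplicate", "--icf=all"]) == "all"
```

Because the alias is rewritten before the flag is interpreted, no special
conflict check between the two spellings is needed.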
Differential Revision: https://reviews.llvm.org/D110672
Wael Yehia [Mon, 20 Sep 2021 14:59:26 +0000 (14:59 +0000)]
[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace.
Reviewed by: steven_wu, fhahn, tejohnson
Differential Revision: https://reviews.llvm.org/D110075
Salman Javed [Wed, 29 Sep 2021 11:59:04 +0000 (07:59 -0400)]
Revert
9b944c184396ce55a3ad608779cc326ba12c9ee3 with fixes
This reintroduces
c0687e1984a82925918c874b7bb68ad34c32aed0 (Add support
for `NOLINTBEGIN` ... `NOLINTEND` comments) but with fixes to the tests.
Michał Górny [Wed, 29 Sep 2021 11:13:54 +0000 (13:13 +0200)]
[lldb] [Host] Remove TerminalStateSwitcher
Remove TerminalStateSwitcher class. It is not used anywhere and its API
is really weird. This is the first step towards cleaning up Terminal.h.
Differential Revision: https://reviews.llvm.org/D110693
Djordje Todorovic [Wed, 29 Sep 2021 11:20:15 +0000 (04:20 -0700)]
NFC: [Debugify] Fix a typo when checking variables in the original mode
Nemanja Ivanovic [Wed, 29 Sep 2021 11:33:46 +0000 (06:33 -0500)]
[PowerPC] Implement builtin for vbpermd
The instruction has similar semantics to vbpermq but for doublewords.
It was added in Power9 and the ABI documents the builtin.
Differential revision: https://reviews.llvm.org/D107899
Roman Lebedev [Wed, 29 Sep 2021 09:49:28 +0000 (12:49 +0300)]
[NFC][X86][LV] Add some test coverage for [un]masked gather/scatter
While we did have test coverage for the intrinsics,
I don't believe there was LV-based test coverage.
Nemanja Ivanovic [Wed, 29 Sep 2021 11:14:12 +0000 (06:14 -0500)]
[PowerPC] Define XL-compatible macros only for AIX and Linux
Since XLC only ever shipped on PowerPC AIX and Linux, it is not reasonable to
provide the compatibility macros on any target other than those two. This patch
restricts those macros to AIX/Linux.
Differential revision: https://reviews.llvm.org/D110213
Sander de Smalen [Wed, 29 Sep 2021 10:33:40 +0000 (11:33 +0100)]
[SelectionDAG] Make WidenVecRes_EXTRACT_SUBVECTOR work for scalable vectors.
The legalizer handles this by breaking up an EXTRACT_SUBVECTOR into
smaller parts, and combines those together, padding the result with
UNDEF vectors, e.g.
nxv6i64 extract_subvector(nxv12i64, 6)
<->
nxv8i64 concat(
nxv2i64 extract_subvector(nxv16i64, 6)
nxv2i64 extract_subvector(nxv16i64, 8)
nxv2i64 extract_subvector(nxv16i64, 10)
nxv2i64 undef)
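The splitting and padding above can be modelled on plain lists. This is a
rough sketch that assumes vscale = 1 and uses hypothetical helper names; the
part size and widened length are passed in explicitly rather than derived the
way the legalizer would derive them.

```python
UNDEF = None   # stand-in for an undefined lane

def extract_subvector(vec, index, length):
    """Return `length` consecutive lanes of `vec` starting at `index`."""
    return vec[index:index + length]

def widen_extract_subvector(vec, index, result_len, widened_len, part_len):
    """Extract `result_len` lanes in `part_len` pieces, pad to `widened_len`."""
    lanes = []
    for offset in range(0, result_len, part_len):
        lanes.extend(extract_subvector(vec, index + offset, part_len))
    # Pad the concatenation with undefined lanes up to the widened type.
    lanes.extend([UNDEF] * (widened_len - len(lanes)))
    return lanes

source = list(range(16))   # lane i holds the value i
widened = widen_extract_subvector(source, 6, result_len=6, widened_len=8, part_len=2)
assert widened == [6, 7, 8, 9, 10, 11, UNDEF, UNDEF]
```

The three two-lane extracts at indices 6, 8 and 10 supply the six requested
lanes, and the final two lanes are left undefined.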
Reviewed By: frasercrmck, david-arm
Differential Revision: https://reviews.llvm.org/D110253
Sander de Smalen [Tue, 28 Sep 2021 18:44:10 +0000 (19:44 +0100)]
[AArch64][SVE] Fix extract_subvector patterns for unpacked fp types.
The patterns added in D110163 were incorrect, since it used the wrong
element widths for its shuffles.
Example for nxv2f16 extract_subvector(nxv8f16 %in, 6):
<a|b|c|d|e|f|g|h>
^^^
extract g and h.
=> UUNPKHI .h -> .s results in:
<e |f |g |h >
=> UUNPKHI .s -> .d results in:
<g |h >
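The two unpack steps can be traced with a small model. This sketch only
tracks which lanes survive and how wide each lane becomes; the function name
mirrors the instruction, but the list-based representation is an assumption
made for illustration.

```python
def uunpkhi(lanes, lane_bits):
    """Keep the high half of `lanes`, each now held in a lane twice as wide."""
    half = len(lanes) // 2
    return lanes[half:], lane_bits * 2

lanes, bits = list("abcdefgh"), 16   # <a|b|c|d|e|f|g|h> as .h lanes
lanes, bits = uunpkhi(lanes, bits)   # .h -> .s
assert (lanes, bits) == (["e", "f", "g", "h"], 32)
lanes, bits = uunpkhi(lanes, bits)   # .s -> .d
assert (lanes, bits) == (["g", "h"], 64)
```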
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D110523
Martin Storsjö [Thu, 23 Sep 2021 10:38:36 +0000 (13:38 +0300)]
[X86] Fix handling of i128<->fp on Windows
On Windows, i128 arguments are passed as indirect arguments, and
they are returned in xmm0.
This is mostly fixed up by `WinX86_64ABIInfo::classify` in Clang, making
the IR functions return v2i64 instead of i128, and making the arguments
indirect. However for cases where libcalls are generated in the target
lowering, the lowering uses the default x86_64 calling convention for
i128, where they are passed/returned as a register pair.
Add custom lowering logic, similar to the existing logic for i128
div/mod (added in
4a406d32e97b1748c4eed6674a2c1819b9cf98ea),
manually making the libcall (while overriding the return type to
v2i64 or passing the arguments as pointers to arguments on the stack).
X86CallingConv.td doesn't seem to handle i128 at all, otherwise
the Windows-specific behaviours would ideally be implemented as
overrides there, in generic code, handling these cases automatically.
This fixes https://bugs.llvm.org/show_bug.cgi?id=48940.
Differential Revision: https://reviews.llvm.org/D110413
Amara Emerson [Wed, 29 Sep 2021 09:54:18 +0000 (02:54 -0700)]
[AArch64][GlobalISel] Add selection tests for vector G_UMULH/G_SMULH.
We already import these patterns from SelectionDAG.
David Spickett [Wed, 29 Sep 2021 09:52:27 +0000 (10:52 +0100)]
[AMDGPU] Require AMDGPU target for ASAN instrumentation tests
Should fix test failure on Arm/AArch64 quick bots which
only build those targets.
https://lab.llvm.org/buildbot/#/builders/171/builds/4077
Jay Foad [Wed, 29 Sep 2021 09:11:57 +0000 (10:11 +0100)]
[RemoveRedundantDebugValues] Enable machine verification after this pass
Machine verification after RemoveRedundantDebugValues has been disabled
since the pass was first added in D105279, but I guess this was just due
to copy-and-paste. Enabling it does not show any problems in check-llvm
in an LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110688
Igor Kudrin [Wed, 29 Sep 2021 09:36:37 +0000 (16:36 +0700)]
[llvm-objcopy] Rename relocation sections together with their targets.
Currently, llvm-objcopy renames only sections that are specified
explicitly in --rename-section, while GNU objcopy keeps names of
relocation sections in sync with their targets. For example:
> readelf -S test.o
...
[ 1] .foo PROGBITS
[ 2] .rela.foo RELA
> objcopy --rename-section .foo=.bar test.o gnu.o
> readelf -S gnu.o
...
[ 1] .bar PROGBITS
[ 2] .rela.bar RELA
> llvm-objcopy --rename-section .foo=.bar test.o llvm.o
> readelf -S llvm.o
...
[ 1] .bar PROGBITS
[ 2] .rela.foo RELA
This patch makes llvm-objcopy match the behavior of GNU objcopy more closely.
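The GNU-style behaviour shown in the listing can be sketched as below. The
Section class, the PREFIX table and rename_sections are hypothetical names
for a simplified model; this is not the llvm-objcopy implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Name prefix conventionally used for each relocation section type.
PREFIX = {"RELA": ".rela", "REL": ".rel"}

@dataclass
class Section:
    name: str
    type: str
    target: Optional[int] = None   # list index of the section being relocated

def rename_sections(sections, renames):
    """Apply explicit renames, then keep relocation section names in sync."""
    renamed = [Section(renames.get(s.name, s.name), s.type, s.target)
               for s in sections]
    for i, s in enumerate(sections):
        if s.type not in PREFIX or s.target is None:
            continue
        target_renamed = sections[s.target].name in renames
        # Follow the target unless the user renamed this section explicitly.
        if target_renamed and s.name not in renames:
            renamed[i].name = PREFIX[s.type] + renamed[s.target].name
    return renamed

sections = [Section(".foo", "PROGBITS"), Section(".rela.foo", "RELA", target=0)]
result = rename_sections(sections, {".foo": ".bar"})
assert [s.name for s in result] == [".bar", ".rela.bar"]
```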
Differential Revision: https://reviews.llvm.org/D110352
Amara Emerson [Wed, 29 Sep 2021 09:09:21 +0000 (02:09 -0700)]
[AArch64][GlobalISel] Make some vector G_SMULH/G_UMULH legal.