Paweł Bylica [Thu, 20 Jan 2022 20:16:46 +0000 (21:16 +0100)]
[test] Add tests for bswap combining. NFC
CJ Johnson [Thu, 20 Jan 2022 23:05:07 +0000 (18:05 -0500)]
[clang-tidy] Update bugprone-stringview-nullptr to consistently prefer the empty string when passing arguments to constructors/functions
Previously, function(nullptr) would have been fixed with function({}). This unfortunately can change overload resolution and even become ambiguous. T(nullptr) was already being fixed with T(""), so this change just brings function calls in line with that.
Differential Revision: https://reviews.llvm.org/D117840
Craig Topper [Thu, 20 Jan 2022 22:57:31 +0000 (14:57 -0800)]
[RISCV] Remove RISCVSubtarget::hasStdExtV() and hasStdExtZve*(). NFC
All code should use one of the cleaner named hasVInstructions*
functions. Fix the two uses that weren't and delete the methods
so no new uses can be created.
Siva Chandra Reddy [Thu, 20 Jan 2022 08:11:54 +0000 (08:11 +0000)]
[libc] Move the remaining public types to their own type headers.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D117838
Craig Topper [Thu, 20 Jan 2022 22:16:37 +0000 (14:16 -0800)]
[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors.
RISCV only has a unary shuffle that requires placing indices in a
register. For interleaving two vectors this means we need at least
two vrgathers and a vmerge to do a shuffle of two vectors.
This patch teaches shuffle lowering to use a widening addu followed
by a widening vmaccu to implement the interleave. First we extract
the low half of both V1 and V2. Then we implement
(zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which
simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further
simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the
result back to the original type splitting the wide elements in half.
We can only do this if we have a type with wider elements available.
Because we're using extends we also have to be careful with fractional
lmuls. Floating point types are supported by bitcasting to/from integer.
The tests test a varied combination of LMULs split across VLEN>=128 and
VLEN>=512 tests. There are a few tests with shuffle indices commuted as well
as tests for undef indices. There's one test for a vXi64/vXf64 vector which
we can't optimize, but verifies we don't crash.
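The arithmetic above can be checked on a single pair of lanes. The following is a minimal scalar sketch, assuming 8-bit elements widened to 16 bits; the function names are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>

// Scalar model of one lane pair with eltbits = 8 (an assumption for this sketch).
constexpr std::uint16_t interleaveLane(std::uint8_t v1, std::uint8_t v2) {
  constexpr unsigned EltBits = 8;
  // Widening unsigned add: zext(V1) + zext(V2).
  std::uint32_t sum = std::uint32_t{v1} + std::uint32_t{v2};
  // Widening unsigned multiply-accumulate: += zext(V2) * (2^eltbits - 1).
  std::uint32_t acc = sum + std::uint32_t{v2} * ((1u << EltBits) - 1u);
  // acc now equals zext(V1) + (zext(V2) << eltbits), which fits in 16 bits.
  return static_cast<std::uint16_t>(acc);
}

// Reference result: V1 in the low half, V2 in the high half of the wide element.
constexpr std::uint16_t interleaveReference(std::uint8_t v1, std::uint8_t v2) {
  return static_cast<std::uint16_t>(v1 | (std::uint16_t{v2} << 8));
}
```

Splitting the wide element back into two narrow ones, low half first, then gives V1 followed by V2, which is the interleaved order.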
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D117743
Rob Suderman [Thu, 20 Jan 2022 22:32:19 +0000 (14:32 -0800)]
[mlir][tosa] Limit right-shift to 31 bits
A right shift by 32 bits can occur, which is undefined behavior for a 32-bit value.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117732
owenca [Thu, 20 Jan 2022 09:59:52 +0000 (01:59 -0800)]
[clang-format][NFC] Clean up tryMergeLessLess()
Differential Revision: https://reviews.llvm.org/D117759
Nathan James [Thu, 20 Jan 2022 22:20:10 +0000 (22:20 +0000)]
[clang-tidy][NFC] Remove redundant string creation for comparison
Michael Kruse [Thu, 20 Jan 2022 16:42:17 +0000 (10:42 -0600)]
[OpenMPIRBuilder] Detect ambiguous InsertPoints for apply*WorkshareLoop. NFC.
Follow-up on D117226 for applyStaticWorkshareLoop and
applyDynamicWorkshareLoop checking for conflicting InsertPoints via an
assert. There is no in-tree code that violates this assertion, hence
nothing changes.
Philip Reames [Thu, 20 Jan 2022 22:07:46 +0000 (14:07 -0800)]
[SLP] Remove stray semicolon to make bots happy
Certain bots (e.g. sanitizer-x86_64-linux-android) appear to be running with strict c++98 flags which disallow ; at global scope.
Stanislav Mekhanoshin [Wed, 1 Dec 2021 21:44:42 +0000 (13:44 -0800)]
[AMDGPU] Do not ignore exec use where exec is read as data
Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC
in a way which modifies the result. This implicit EXEC use
shall not be ignored for the purposes of instruction moves.
Differential Revision: https://reviews.llvm.org/D117814
Philip Reames [Thu, 20 Jan 2022 21:58:13 +0000 (13:58 -0800)]
[SLP] Kill an unused param and use a for-loop in calculateDependencies [NFC]
Adrian Prantl [Thu, 20 Jan 2022 21:36:55 +0000 (13:36 -0800)]
Work around a module build failure on the bots.
This patch works around what looks like a bug in Clang itself.
The error on the bot is:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/40466/consoleText
In module 'LLVM_Utils' imported from /Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h:18:
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Support/Error.h:720:3: error: 'llvm::Expected<bool>::(anonymous)' from module 'LLVM_Utils.Support.Error' is not present in definition of 'llvm::Expected<bool>' in module 'LLVM_Utils.Support.Error'
union {
^
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Support/Error.h:720:3: note: declaration of '' does not match
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Support/Error.h:720:3: note: declaration of '' does not match
1 error generated.
The intention is to revert this as soon as a proper fix has been identified!
rdar://87845391
John Ericson [Tue, 18 Jan 2022 23:34:54 +0000 (23:34 +0000)]
[cmake] Duplicate `{llvm,compiler_rt}_check_linker_flag` for runtime libs and llvm
We previously had a few varied definitions of this floating around. I made the one installed with LLVM handle all the cases, and then made the others use it.
This issue was reported to me in https://reviews.llvm.org/D116521#3248117 as
D116521 made clang and llvm use the common cmake utils.
Reviewed By: sebastian-ne, phosek, #libunwind, #libc, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D117537
John Ericson [Thu, 20 Jan 2022 19:04:15 +0000 (19:04 +0000)]
[compiler-rt][cmake] Use HandleOutOfTreeLLVM like libcxx and friends
This gives us the option of using CMake modules from LLVM, and other
things. We will use that to deduplicate code later.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D117815
Tue Ly [Thu, 20 Jan 2022 18:51:04 +0000 (13:51 -0500)]
[libc] Make log2f correctly rounded for all rounding modes when FMA is not available.
Add two more exceptional cases to log2f that arise when polyeval does not use FMA.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117812
River Riddle [Thu, 20 Jan 2022 20:54:03 +0000 (12:54 -0800)]
[mlir:TilingInterface] Remove unnecessary include of Tensor.h
Interfaces in Interfaces/ should not depend on any dialects, and this include
is unnecessary anyways.
Philip Reames [Thu, 20 Jan 2022 21:06:55 +0000 (13:06 -0800)]
[SLP] Extract formBundle helper for readability [NFC]
Sanjay Patel [Thu, 20 Jan 2022 19:51:45 +0000 (14:51 -0500)]
[InstCombine] convert mul with sexted bool and constant to select
We already have the related folds for zext-of-bool, so it
should make things more consistent to have this transform
to select for sext-of-bool too:
https://alive2.llvm.org/ce/z/YikdfA
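As a minimal sketch of why the transform is sound, assuming the fold has the shape `mul (sext i1 X), C` to `select X, -C, 0` and using wrapping 32-bit arithmetic (the helper names are hypothetical):

```cpp
#include <cstdint>

// sext(i1 b) to i32: true becomes all-ones (-1), false becomes 0.
constexpr std::uint32_t sextBool(bool b) { return b ? 0xFFFFFFFFu : 0u; }

// Original form: mul (sext b), C, with wrapping arithmetic.
constexpr std::uint32_t mulForm(bool b, std::uint32_t c) {
  return sextBool(b) * c;
}

// Folded form: select b, -C, 0.
constexpr std::uint32_t selectForm(bool b, std::uint32_t c) {
  return b ? (0u - c) : 0u;
}
```

Both forms agree for either value of the bool, including when C is the minimum signed value, because negation wraps the same way multiplication by -1 does.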
Fixes #53319
Sanjay Patel [Thu, 20 Jan 2022 19:41:01 +0000 (14:41 -0500)]
[InstCombine] add/adjust tests for multiply with extended bool; NFC
Craig Topper [Thu, 20 Jan 2022 20:53:12 +0000 (12:53 -0800)]
[RISCV] Remove HasStdExtV and HasStdZve* Predicates from tablegen.
No instructions should be using these. Everything should use
HasVInstructions* Predicates. Remove them so that they can't be
used by accident.
Krzysztof Drewniak [Mon, 10 Jan 2022 23:53:58 +0000 (23:53 +0000)]
[MLIR][GPU] Add debug output to enable dumping GPU assembly
- Set the DEBUG_TYPE of SerializeToBlob to serialize-to-blob
- Add debug output to print the assembly or PTX for GPU modules before
they are assembled and linked
Note that, as SerializeToBlob is a superclass of SerializeToCubin and
SerializeToHsaco, --debug-only=serialize-to-blob will dump the
intermediate compiler result for both of these passes.
In addition, if LLVM options such as --stop-after are used to control
the GPU kernel compilation process, the debug output will contain the
appropriate intermediate IR.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D117519
Philip Reames [Thu, 20 Jan 2022 20:44:20 +0000 (12:44 -0800)]
[SLP] Use for loops for walking bundle elements
Craig Topper [Thu, 20 Jan 2022 19:49:35 +0000 (11:49 -0800)]
[RISCV] Remove Zvlsseg extension.
This string no longer appears in the Vector Extension specification.
The segment load/store instructions are just part of the vector
instruction set.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D117724
Mogball [Thu, 20 Jan 2022 20:17:40 +0000 (20:17 +0000)]
[mlir][pdl] Make `pdl` the default dialect when parsing/printing
PDLDialect is a somewhat user-facing dialect whose ops contain exclusively other PDL ops in their regions, so it can take advantage of `OpAsmOpInterface` to provide nicer IR.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117828
Mogball [Thu, 20 Jan 2022 20:17:26 +0000 (20:17 +0000)]
[mlir][pdl] OperationOp should not be side-effect free
Unbound OperationOp in the matcher (i.e. one with no uses) is already disallowed by the verifier. However, an OperationOp in the rewriter is not side-effect free -- it's creating an op!
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117825
Mogball [Thu, 20 Jan 2022 20:17:14 +0000 (20:17 +0000)]
[mlir][pdl] Some ops are missing `NoSideEffect`
Querying or building constraints on types, operands, results, and attributes is side-effect free in both the matcher and rewriter. The ops should be marked as such.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117826
Casey Carter [Wed, 29 Dec 2021 23:58:25 +0000 (15:58 -0800)]
[libcxx][test] view_interface need not derive from view_base
... after LWG-3549.
Differential Revision: https://reviews.llvm.org/D117608
Roger Kim [Thu, 20 Jan 2022 20:13:04 +0000 (12:13 -0800)]
[lld][macho] Stop grouping symbols by sections in mapfile.
As per [Bug 50689](https://bugs.llvm.org/show_bug.cgi?id=50689),
```
2. getSectionSyms() puts all the symbols into a map of section -> symbols, but this seems unnecessary. This was likely copied from the ELF port, which prints a section header before the list of symbols it contains. But the Mach-O map file doesn't print these headers.
```
This diff removes `getSectionSyms()` and keeps all symbols in a flat vector.
What does ld64's mapfile look like?
```
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin test.s -o test.o
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin foo.s -o foo.o
$ ld -map map test.o foo.o -o out -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
```
```
[ 0] linker synthesized
[ 1] test.o
[ 2] foo.o
0x100003FB7 0x00000001 __TEXT __text
0x100003FB8 0x00000000 __TEXT obj
0x100003FB8 0x00000048 __TEXT __unwind_info
0x100004000 0x00000001 __DATA __common
0x100003FB7 0x00000001 [ 1] _main
0x100003FB8 0x00000000 [ 2] _foo
0x100003FB8 0x00000048 [ 0] compact unwind info
0x100004000 0x00000001 [ 1] _number
```
Perf numbers when linking chromium framework on a 16-Core Intel Xeon W Mac Pro:
```
base diff difference (95% CI)
sys_time 1.406 ± 0.020 1.388 ± 0.019 [ -1.9% .. -0.6%]
user_time 5.557 ± 0.023 5.914 ± 0.020 [ +6.2% .. +6.6%]
wall_time 4.455 ± 0.041 4.436 ± 0.035 [ -0.8% .. -0.0%]
samples 35 35
```
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D114735
Marek Kurdej [Thu, 20 Jan 2022 20:05:54 +0000 (21:05 +0100)]
[clang-format] Refactor: add FormatToken::hasWhitespaceBefore(). NFC.
This factors out a pattern that comes up from time to time.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D117769
Philip Reames [Thu, 20 Jan 2022 20:13:22 +0000 (12:13 -0800)]
[SLP] Rename a couple lambdas to be more clearly separate from method names
LLVM GN Syncbot [Thu, 20 Jan 2022 20:02:50 +0000 (20:02 +0000)]
[gn build] Port 83d59e05b201
LLVM GN Syncbot [Thu, 20 Jan 2022 20:02:49 +0000 (20:02 +0000)]
[gn build] Port 63a991d03589
Nico Weber [Thu, 20 Jan 2022 20:02:35 +0000 (15:02 -0500)]
[llvm] Remove an old bot cleanup command
Nico Weber [Thu, 20 Jan 2022 19:59:30 +0000 (14:59 -0500)]
clang: Auto-cleanup left-over file from before 3da69fb5a26c7b on bots
Tue Ly [Thu, 20 Jan 2022 19:43:09 +0000 (14:43 -0500)]
[libc] Use get_round() instead of floating point tricks in generic hypot implementation.
The floating point tricks used to get rounding mode require -frounding-math flag, which behaves differently on aarch64. Reverting back to use get_round instead.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117824
Alexandre Ganea [Thu, 20 Jan 2022 19:53:18 +0000 (14:53 -0500)]
Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.
See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html
The previous land f860fe362282ed69b9d4503a20e5d20b9a041189 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac9440a74b2e5b3fe3ff13ccdbf55af3.
Differential Revision: https://reviews.llvm.org/D108850
Pavel Labath [Thu, 20 Jan 2022 19:36:14 +0000 (20:36 +0100)]
[lldb] Surround LLDB_API-defining code with #ifndef LLDB_API
This enables power-users to annotate lldb api functions with arbitrary
attributes. The motivation for this is being able to build liblldb as a
static library on windows (see discussion on D117564).
This should not be interpreted to mean that building liblldb is
supported in any way, but this does not cause any problems for us, and
can help users who really know what they are doing (or have no other
choice).
Casey Carter [Wed, 19 Jan 2022 06:50:15 +0000 (22:50 -0800)]
[libcxx] chrono::month_weekday should not be default constructible
It was not in P0355R7, nor has it ever been so in a working draft.
Drive-by:
* tests should test something: fix loop bounds so initial value is not >= final value
* calendar type streaming tests are useless - let's remove them
* don't declare printf, especially if you don't intend to use it
Differential Revision: https://reviews.llvm.org/D117638
Nathan Sidwell [Thu, 20 Jan 2022 15:40:12 +0000 (07:40 -0800)]
[demangler][NFC] Small cleanups and sync
Some precursor work to adding module demangling.
* some mismatched comment and code in the demangler
* a const fn was not marked thusly
* we use std::islower. A direct range check is smaller code (no function call),
and we know we're in ASCII-land and later in that same function make the same
assumption about upper-case contiguity. Heck, maybe just drop the switch's
precondition and rely on the optimizer to do its thing?
* the directory is cloned in two places, which had gotten out of sync.
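The direct range check mentioned in the list above could look like the following sketch. The helper name is hypothetical, and the check assumes ASCII input, where 'a' through 'z' are contiguous.

```cpp
// Direct range check, usable in place of std::islower for ASCII input.
// No function call and no locale dependence.
constexpr bool isLowerAscii(char c) { return c >= 'a' && c <= 'z'; }
```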
Differential Revision: https://reviews.llvm.org/D117800
Roman Lebedev [Thu, 20 Jan 2022 19:41:31 +0000 (22:41 +0300)]
[InstCombine] Instruction sinking: fix check for function terminating block
Checking for specific function terminating opcodes
means we don't handle other non-hardcoded ones :)
This should probably be generalized to something
similar to the `IsBlockFollowedByDeoptOrUnreachable()`.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D117810
Arthur O'Dwyer [Tue, 4 Jan 2022 01:28:00 +0000 (20:28 -0500)]
[libc++] Eliminate the `__function_like` helper.
As prefigured in the comments on D115315.
This gives us one unified style for all niebloids,
and also simplifies the modulemap.
Differential Revision: https://reviews.llvm.org/D116570
Craig Topper [Thu, 20 Jan 2022 19:32:26 +0000 (11:32 -0800)]
[RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn))
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.
This is similar to D116771, but for the saturating conversions.
This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.
I'm only handling saturating to i64 or i32. This could be extended
to other sizes in the future.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116864
Daniel Thornburgh [Fri, 14 Jan 2022 23:10:37 +0000 (23:10 +0000)]
[Support] [DebugInfo] Lazily create cache dir.
This change defers creating Support/Caching.cpp's cache directory until
it actually writes to the cache.
This allows using the Caching library in a read-only fashion. If read-only,
the cache is guaranteed not to write to disk. This keeps tools using
DebugInfod (currently llvm-symbolizer) hermetic when not configured to
perform remote lookups.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D117589
Nathan Sidwell [Thu, 20 Jan 2022 15:39:22 +0000 (07:39 -0800)]
[clang][NFC] Small mangler cleanups
In working on a module mangling problem I noticed a few cleanups to the mangler.
1) Use 'if (auto x = ...' idiom in a couple of places.
2) I noticed both 'isFileContext' and 'isNamespace || isTranslationUnit'
synonyms. Let's use the former.
3) The control flow in the seqId mangling was misordered. Let's channel Count
von Count. Also fix the inconsistent bracing.
Differential Revision: https://reviews.llvm.org/D117799
Stanislav Mekhanoshin [Thu, 20 Jan 2022 19:05:01 +0000 (11:05 -0800)]
[AMDGPU] Regenerate remat-vop.mir. NFC.
Alexandre Ganea [Thu, 20 Jan 2022 18:38:32 +0000 (13:38 -0500)]
[Clang] Separate the 'debug-info-hotpatch' test in two parts: one for ARM and another for AArch64
After 5af2433e1794ebf7e58e848aa612c7912d71dc78, this shall fix: https://lab.llvm.org/buildbot/#/builders/188/builds/8400 - if not I'll revert this patch and 5af2433e1794ebf7e58e848aa612c7912d71dc78.
Jonas Paulsson [Tue, 18 Jan 2022 23:40:26 +0000 (17:40 -0600)]
[SystemZ] Remove the ManipulatesSP flag from backend (NFC).
This flag was set in the presence of stacksave/stackrestore in order to force
a frame pointer.
This should however not be needed per the comment in MachineFrameInfo.h
stating that a variable sized object "...is the sole condition which
prevents frame pointer elimination", and experiments have also shown that
there seems to be no effect whatsoever on code generation with ManipulatesSP.
Review: Ulrich Weigand
John Ericson [Wed, 19 Jan 2022 06:45:07 +0000 (06:45 +0000)]
[cmake] Make include(GNUInstallDirs) always below project(..)
Its defaulting logic must go after `project(..)` to work correctly, but `project(..)` is often in a standalone condition making this
awkward, since the rest of the condition code may also need GNUInstallDirs.
The good thing is there are the various standalone booleans, which I had missed before. This makes splitting the conditional blocks less awkward.
Reviewed By: arichardson, phosek, beanz, ldionne, #libunwind, #libc, #libc_abi
Differential Revision: https://reviews.llvm.org/D117639
Marco Elver [Thu, 20 Jan 2022 18:36:16 +0000 (19:36 +0100)]
[clang] Improve -Wdeclaration-after-statement
With 118f966b46cf, Clang matches GCC's behaviour and allows enabling
-Wdeclaration-after-statement with C99 and later.
However, the check for mixing declarations and code is not a constant time
algorithm, and therefore should be guarded with Diags.isIgnored().
Furthermore, improve test coverage with: non-pedantic C89 with the
warning; C11 with the warning; and when using -Wall.
Finally, mention the changed behaviour in ReleaseNotes.rst.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D117232
Sanjay Patel [Thu, 20 Jan 2022 18:35:58 +0000 (13:35 -0500)]
[InstCombine] add one-use check to opposite shift folds
Test comments say this might be intentional, but I don't
see any hard evidence to support it. The extra instruction
shows up as a potential regression in D117680.
One test does show a missed fold that might be recovered
with better demanded bits analysis.
Sanjay Patel [Thu, 20 Jan 2022 17:29:12 +0000 (12:29 -0500)]
[InstCombine] avoid 'tmp' usage in test files; NFC
The update script ( utils/update_test_checks.py ) warns against this
because it can conflict with the default FileCheck names given to
anonymous values in the IR.
eopXD [Thu, 20 Jan 2022 18:36:23 +0000 (10:36 -0800)]
[NFC][RISCV] Add end-of-line symbol in target-feature testcases
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117808
Craig Topper [Thu, 20 Jan 2022 18:36:21 +0000 (10:36 -0800)]
[TargetLowering][InstCombine] Simplify BSwap demanded bits code a little. NFC
Use alignDown instead of &= ~7.
Replace ResultBit with NLZ. (BitWidth - NLZ - NTZ == 8) so
(BitWidth - NTZ - 8 == NLZ).
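The two identities can be illustrated with a small sketch. `alignDownTo` is a local stand-in written for this example rather than the LLVM helper itself, and the byte-position numbers in the comments are assumptions chosen for illustration.

```cpp
// Round value down to a multiple of align (align must be non-zero).
constexpr unsigned alignDownTo(unsigned value, unsigned align) {
  return value - value % align;
}

// The older spelling for align == 8: clear the low three bits.
constexpr unsigned maskLow3(unsigned value) { return value & ~7u; }

// When exactly one byte is demanded, BitWidth - NLZ - NTZ == 8,
// so NLZ can be derived as BitWidth - NTZ - 8.
constexpr unsigned nlzFromNtz(unsigned bitWidth, unsigned ntz) {
  return bitWidth - ntz - 8;
}
```

For a 32-bit value whose demanded byte occupies bits 8 through 15, NTZ is 8 and the derived NLZ is 16.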
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D117804
Roman Lebedev [Thu, 20 Jan 2022 18:37:26 +0000 (21:37 +0300)]
[NFC][InstCombine] Add test showing failure to sink into `resume` block
Evgeny Shulgin [Thu, 20 Jan 2022 18:34:28 +0000 (13:34 -0500)]
Add `isConsteval` matcher
Support C++20 consteval functions and C++2b if consteval for AST Matchers.
Tue Ly [Tue, 18 Jan 2022 18:46:18 +0000 (13:46 -0500)]
[libc] Implement correct rounding with all rounding modes for hypot functions.
Update the rounding logic for generic hypot function so that it will round correctly with all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D117590
Joseph Huber [Thu, 20 Jan 2022 13:29:16 +0000 (08:29 -0500)]
[OpenMP] Don't pass empty files to nvlink
This patch adds an exception to the nvlink wrapper tool to not pass
empty cubin files to the nvlink job. If an empty file is passed to
nvlink it will cause an error indicating that the file could not be
opened. This would occur if the user tried to link object files that
contained offloading code with a file that didn't. This will act as a
workaround until the new OpenMP offloading driver becomes the default.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117777
Sergei Grechanik [Thu, 20 Jan 2022 16:54:38 +0000 (08:54 -0800)]
[mlir][vector] Allow values outside of [0; dim-size] in create_mask
This commit explicitly states that negative values and values exceeding
vector dimensions are allowed in vector.create_mask (but not in
vector.constant_mask). These values are now truncated when
canonicalizing vector.create_mask to vector.constant_mask.
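A minimal sketch of the truncation, assuming it amounts to clamping each operand into the range from 0 to the dimension size (the helper name is hypothetical and this is not the MLIR implementation):

```cpp
#include <cstdint>

// Clamp a create_mask operand into [0, dimSize] so that the result is a
// valid bound for constant_mask.
constexpr std::int64_t clampMaskBound(std::int64_t value, std::int64_t dimSize) {
  return value < 0 ? 0 : (value > dimSize ? dimSize : value);
}
```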
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D116069
Alexandre Ganea [Thu, 20 Jan 2022 16:04:46 +0000 (11:04 -0500)]
[clang-cl] Support the /HOTPATCH flag
This patch adds support for the MSVC /HOTPATCH flag: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=msvc-170&viewFallbackFrom=vs-2019
The flag is translated to a new -fms-hotpatch flag, which in turn adds a 'patchable-function' attribute for each function in the TU. This is then picked up by the PatchableFunction pass which would generate a TargetOpcode::PATCHABLE_OP of minsize = 2 (which means the target instruction must resolve to at least two bytes). TargetOpcode::PATCHABLE_OP is only implemented for x86/x64. When targeting ARM/ARM64, /HOTPATCH isn't required (instructions are always 2/4 bytes and suitable for hotpatching).
Additionally, when using /Z7, we generate a 'hot patchable' flag in the CodeView debug stream, in the S_COMPILE3 record. This flag is then picked up by LLD (or link.exe) and is used in conjunction with the linker /FUNCTIONPADMIN flag to generate extra space before each function, to accommodate for live patching long jumps. Please see: https://github.com/llvm/llvm-project/blob/d703b922961e0d02a5effdd4bfbb23ad50a3cc9f/lld/COFF/Writer.cpp#L1298
The outcome is that we can finally use Live++ or Recode along with clang-cl.
NOTE: It seems that MSVC cl.exe always enables /HOTPATCH on x64 by default, although if we did the same I thought we might generate sub-optimal code (if this flag was active by default). Additionally, MSVC always generates a .debug$S section and a S_COMPILE3 record, which Clang doesn't do without /Z7. Therefore, the following MSVC command-line "cl /c file.cpp" would have to be written with Clang such as "clang-cl /c file.cpp /HOTPATCH /Z7" in order to obtain the same result.
Depends on D43002, D80833 and D81301 for the full feature.
Differential Revision: https://reviews.llvm.org/D116511
Matt Arsenault [Wed, 19 Jan 2022 22:39:27 +0000 (17:39 -0500)]
AMDGPU: Fix asm in test using wrong IR type for physical register
Matt Arsenault [Wed, 19 Jan 2022 20:20:39 +0000 (15:20 -0500)]
AMDGPU/GlobalISel: Try to use s_and_b64 in ptrmask selection
Avoids a test diff with SDAG.
Matt Arsenault [Wed, 19 Jan 2022 21:14:56 +0000 (16:14 -0500)]
AMDGPU/GlobalISel: Regenerate test checks with -NEXT
Matt Arsenault [Wed, 19 Jan 2022 21:46:49 +0000 (16:46 -0500)]
AMDGPU/GlobalISel: Explicitly set -global-isel-abort in failure tests
If the default mode is the fallback, this would fail since it would
end up seeing the DAG failure message instead.
zijunzhao [Thu, 20 Jan 2022 09:30:51 +0000 (09:30 +0000)]
add tsan shared library
Add tsan shared library on Android. Only build tsan when minSdkVersion is above 23.
Reviewed By: danalbert, vitalybuka
Differential Revision: https://reviews.llvm.org/D108394
Matt Arsenault [Wed, 19 Jan 2022 16:06:25 +0000 (11:06 -0500)]
AMDGPU/GlobalISel: Directly diagnose return value use for FP atomics
Emit an error if the return value is used on subtargets that do not
support them. Previously we were falling back to the DAG on selection
failure, where it would emit this error and then fail again.
Nikolas Klauser [Thu, 20 Jan 2022 12:53:59 +0000 (13:53 +0100)]
[libc++] basic_string::resize_and_overwrite: Adopt LWG3645 (Not voted in yet)
Adopt LWG3645, which fixes the value categories of basic_string::resize_and_overwrite
https://timsong-cpp.github.io/lwg-issues/3645
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D116815
Roman Lebedev [Thu, 20 Jan 2022 17:31:47 +0000 (20:31 +0300)]
[NFC][SimplifyCFG] Add some tests for `invoke` merging
Sam Clegg [Thu, 20 Jan 2022 02:41:39 +0000 (18:41 -0800)]
[lld][WebAssembly] Convert test to check disassembly output. NFC
Differential Revision: https://reviews.llvm.org/D117739
Nadav Rotem [Thu, 20 Jan 2022 17:25:45 +0000 (09:25 -0800)]
optimize icmp-ugt-ashr
This diff optimizes the sequence icmp-ugt(ashr,C_1) C_2. InstCombine
already implements this optimization for sgt, and this patch adds
support for ugt.
@craig.topper came up with the idea and proof:
define i1 @src(i8 %x, i8 %y, i8 %c) {
%cp1 = add i8 %c, 1
%i = shl i8 %cp1, %y
%i.2 = ashr i8 %i, %y
%cmp = icmp eq i8 %cp1, %i.2
;Assume: C + 1 == (((C + 1) << y) >> y)
call void @llvm.assume(i1 %cmp)
; uncomment for the sgt case
%j = shl i8 %cp1, %y
%j.2 = sub i8 %j, 1
%cmp2 = icmp ne i8 %j.2, 127
;Assume (((c + 1 ) << y) - 1) != 127
call void @llvm.assume(i1 %cmp2)
%s = ashr i8 %x, %y
%r = icmp sgt i8 %s, %c
ret i1 %r
}
define i1 @tgt(i8 %x, i8 %y, i8 %c) {
%cp1 = add i8 %c, 1
%j = shl i8 %cp1, %y
%j.2 = sub i8 %j, 1
%r = icmp sgt i8 %x, %j.2
ret i1 %r
}
declare void @llvm.assume(i1)
This change is related to the optimizations in D117252.
Differential Revision: https://reviews.llvm.org/D117365
Valentin Clement [Thu, 20 Jan 2022 17:30:09 +0000 (18:30 +0100)]
[flang][NFC] Remove unused/duplicated kStridePosInDim
kStridePosInDim is a duplicate of kDimStridePos and is not used. Just
remove it.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D117784
Fraser Cormack [Thu, 20 Jan 2022 17:07:11 +0000 (17:07 +0000)]
[RISCV] Add tests for commuted vector/scalar VP patterns
This patch adds a variety of tests checking that we can match
vector/scalar instructions against masked VP intrinsics when the splat
is on the LHS. At this stage, we can't, despite us having
ostensibly-commutable ISel patterns for them. The use of V0 as the mask
operand interferes with the auto-generated ISel table.
Matt Arsenault [Wed, 19 Jan 2022 15:45:37 +0000 (10:45 -0500)]
AMDGPU/GlobalISel: Stop handling llvm.amdgcn.buffer.atomic.fadd
This code is not structured to handle the legacy buffer intrinsics and
was miscompiling them.
Matt Arsenault [Wed, 19 Jan 2022 02:08:01 +0000 (21:08 -0500)]
AMDGPU/GlobalISel: Fix selection of gfx90a FP atomics
The struct/raw forms for the buffer atomics now work as
expected. However, we're incorrectly handling the legacy form (which
we probably shouldn't handle at all). We also are not diagnosing the
use of the return value on gfx908. These will be addressed separately.
Matt Arsenault [Sun, 16 Jan 2022 00:16:03 +0000 (19:16 -0500)]
AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal
This was inheriting the mesa behavior, and as far as I know nobody is
using opencl kernels with amdpal. The isMesaKernel check was
irrelevant because this property needs to be held for all functions.
Random06457 [Thu, 20 Jan 2022 17:03:49 +0000 (20:03 +0300)]
[mips] Improve vr4300 mulmul bugfix pass
When compiling with dwarf info, the mfix4300 flag introduced in
https://reviews.llvm.org/D116238 can miss some occurrences of the vr4300
mulmul bug if a debug instruction happens to be between two `muls`
instructions. This change skips debug instructions in order to fix
the mulmul bug detection.
Fixes https://github.com/llvm/llvm-project/issues/53094
Differential Revision: https://reviews.llvm.org/D117615
Lucas Prates [Mon, 10 Jan 2022 10:19:27 +0000 (10:19 +0000)]
[GlobalISel] Fix incorrect sign extension when combining G_INTTOPTR and G_PTR_ADD
The GlobalISel combiner currently uses sign extension when manipulating
the LHS constant when combining the following sequence of
machine instructions into a single constant:
```
%0:_(s32) = G_CONSTANT i32 <CONSTANT>
%1:_(p0) = G_INTTOPTR %0:_(s32)
%2:_(s64) = G_CONSTANT i64 <CONSTANT>
%3:_(p0) = G_PTR_ADD %1:_, %2:_(s64)
```
This causes an issue when the bit width of the first constant and the
target pointer size are different, as G_INTTOPTR has no sign extension
semantics.
This patch fixes this by capturing an arbitrary-precision integer when matching
the constant, allowing the matching function to correctly zero extend
it.
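The difference shows up as soon as the 32-bit constant has its top bit set. The sketch below models it with plain integers, assuming a 64-bit pointer; the helper names are hypothetical and do not correspond to GlobalISel functions.

```cpp
#include <cstdint>

// Incorrect widening for G_INTTOPTR: replicates bit 31 into the upper half.
constexpr std::uint64_t signExtend32(std::uint32_t c) {
  return (c & 0x80000000u) ? (0xFFFFFFFF00000000ull | c) : std::uint64_t{c};
}

// Correct widening: the upper half is filled with zeros.
constexpr std::uint64_t zeroExtend32(std::uint32_t c) { return c; }

// Folding G_INTTOPTR + G_PTR_ADD into one constant address.
constexpr std::uint64_t foldPtrAdd(std::uint64_t base, std::uint64_t offset) {
  return base + offset;
}
```

With a base of 0x80000000 and an offset of 16, zero extension folds to 0x80000010, while sign extension would fold to 0xFFFFFFFF80000010.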
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D116941
Sjoerd Meijer [Thu, 20 Jan 2022 16:56:31 +0000 (16:56 +0000)]
[FuncSpec] Add a reference, and some other clarifying comments. NFC.
Philip Reames [Thu, 20 Jan 2022 16:53:01 +0000 (08:53 -0800)]
[SLP] Delete dead code in favor of proper assert [NFC]
Philip Reames [Thu, 20 Jan 2022 16:46:06 +0000 (08:46 -0800)]
[SLP] Reduce nesting depth in calculateDependencies via for loop and early continue [NFC]
Sander de Smalen [Thu, 20 Jan 2022 15:13:32 +0000 (15:13 +0000)]
[ScalableVectors] Warn instead of error for invalid size requests.
This was intended to be fixed by D98856, but that only seemed to have
the desired behaviour when compiling to assembly using `-S`, not when
compiling into an object file or executable. Given that this was not
the intention of D98856, this patch fixes the behaviour.
Adrian Prantl [Thu, 20 Jan 2022 16:35:33 +0000 (08:35 -0800)]
Add missing include to fix modular build
Adrian Prantl [Thu, 20 Jan 2022 16:33:08 +0000 (08:33 -0800)]
Add missing include to fix modular build
Philip Reames [Thu, 20 Jan 2022 16:23:51 +0000 (08:23 -0800)]
[SLP] Add an assert to make a non-obvious precondition clear [NFC]
Michael Kruse [Thu, 20 Jan 2022 14:18:08 +0000 (08:18 -0600)]
[OpenMPIRBuilder] Detect and fix ambiguous InsertPoints for createParallel.
When a Builder method accepts multiple InsertPoints and both point to
the same position, inserting instructions at one position will "move" the
other after the inserted position since the InsertPoint is pegged to the
instruction following the intended InsertPoint. For instance, when
creating a parallel region at Loc and passing the same position as AllocaIP,
creating instructions at Loc will "move" the AllocIP behind the Loc
position.
To avoid this ambiguity, add an assertion checking this condition and
fix the unittests.
In case of AllocaIP, an alternative solution could be to implicitly
split BasicBlock at InsertPoint, using the first as AllocaIP, the second
for inserting the instructions themselves. However, this solution is
specific to AllocaIP since AllocaIP will always have to be first. Hence,
this is an argument for generally treating ambiguous InsertPoints as an API
usage error.
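The "pegged to the following instruction" behaviour can be reproduced with a
plain std::list, as in the sketch below. This is only an analogy, not
OpenMPIRBuilder code: `Block`, `InsertPoint`, `emit`, `emitAmbiguous` and
`createRegion` are hypothetical names, and instructions are modelled as
strings.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

using Block = std::list<std::string>;
// An insert point is an iterator to the instruction that follows it.
using InsertPoint = Block::iterator;

// Insert `inst` immediately before the instruction `ip` is pegged to.
void emit(Block &block, InsertPoint ip, const std::string &inst) {
  block.insert(ip, inst);
}

// Both insert points name the same position, so code emitted at `loc`
// ends up in front of anything later emitted at `allocaIP`.
Block emitAmbiguous() {
  Block block{"ret"};
  InsertPoint loc = block.begin();
  InsertPoint allocaIP = block.begin();
  emit(block, loc, "call");
  emit(block, allocaIP, "alloca");
  return block; // the alloca now follows the call
}

// Rejecting identical insert points up front, in the spirit of the
// assertion described above.
void createRegion(Block &block, InsertPoint loc, InsertPoint allocaIP) {
  assert(loc != allocaIP && "ambiguous insert points");
  emit(block, loc, "region.body");
  emit(block, allocaIP, "alloca");
}
```

With distinct insert points, the alloca stays ahead of the region body
regardless of the order in which the two positions are used.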
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117226
Nico Weber [Thu, 20 Jan 2022 16:02:06 +0000 (11:02 -0500)]
[gn build] (manually) port f29256a64a
Nikita Popov [Thu, 20 Jan 2022 15:48:19 +0000 (16:48 +0100)]
[InstSimplify] Add test for load of non-integral pointer (NFC)
Mubashar Ahmad [Thu, 20 Jan 2022 15:20:57 +0000 (15:20 +0000)]
[Clang][AArch64][ARM] Unaligned Access Test Fix
Test fixed for the unaligned access warning.
Nikita Popov [Thu, 20 Jan 2022 15:25:22 +0000 (16:25 +0100)]
[InstSimplify] Add test for reinterpret load of pointer type (NFC)
Simon Pilgrim [Thu, 20 Jan 2022 15:15:46 +0000 (15:15 +0000)]
[X86] lowerToAddSubOrFMAddSub - lower 512-bit ADDSUB patterns to blend(fsub,fadd)
AVX512 doesn't provide an ADDSUB instruction, but if we've built this from a build vector of scalar fsub/fadd elements we can still lower to blend(fsub,fadd).
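A scalar model of the blend(fsub,fadd) form, for illustration only. The
helper name `addSubViaBlend` is hypothetical, and the lane convention
(subtract in even lanes, add in odd lanes) is the usual ADDSUB one rather
than something taken from this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compute a full fsub and a full fadd, then blend them lane by lane:
// even lanes take the difference, odd lanes take the sum.
std::vector<float> addSubViaBlend(const std::vector<float> &a,
                                  const std::vector<float> &b) {
  assert(a.size() == b.size() && "operands must have the same lane count");
  std::vector<float> sub(a.size()), add(a.size()), out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    sub[i] = a[i] - b[i];
    add[i] = a[i] + b[i];
  }
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = (i % 2 == 0) ? sub[i] : add[i];
  return out;
}
```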
Mircea Trofin [Thu, 20 Jan 2022 05:19:53 +0000 (21:19 -0800)]
[MLGO] Improved support for AOT cross-targeting scenarios
The tensorflow AOT compiler can cross-target, but it can't run on (for
example) arm64. We added earlier support where the AOT-ed header and object
would be built on a separate builder and then passed at build time to
a build host where the AOT compiler can't run, but clang can be otherwise
built.
To simplify such scenarios given we now support more than one AOT-able
case (regalloc and inliner), we make the AOT scenario centered on whether
files are generated, case by case (this includes the "passed from a
different builder" scenario).
This means we shouldn't need an 'umbrella' LLVM_HAVE_TF_AOT, in favor of
case by case control. A builder can opt out of an AOT case by passing that case's
model path as `none`. Note that the overrides still take precedence.
This patch controls conditional compilation with case-specific flags,
which can be enabled locally, for the component where those are
available. We still keep an overall flag for some tests.
The 'development/training' mode is unchanged, because there the model is
passed from the command line and interpreted.
Differential Revision: https://reviews.llvm.org/D117752
Nikita Popov [Tue, 18 Jan 2022 16:38:08 +0000 (17:38 +0100)]
[DebugInstrRef] Memoize variable order during sorting (NFC)
Instead of constructing DebugVariables and looking up the order
in the comparison function, compute the order upfront and then sort
a vector of (order, instr).
This improves compile-time by -0.4% geomean on CTMark ReleaseLTO-g.
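The general pattern is sketched below under simplifying assumptions:
instructions are plain strings, the order comes from a hypothetical lookup
table, and `sortByMemoizedOrder` is an illustrative name rather than a
function in the pass.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Look up each instruction's order once, then sort (order, instr) pairs,
// so the comparison function only compares precomputed integers.
std::vector<std::string>
sortByMemoizedOrder(const std::vector<std::string> &instrs,
                    const std::unordered_map<std::string, unsigned> &orderOf) {
  std::vector<std::pair<unsigned, std::string>> keyed;
  keyed.reserve(instrs.size());
  for (const std::string &instr : instrs) {
    auto it = orderOf.find(instr);
    assert(it != orderOf.end() && "every instruction needs an order");
    keyed.emplace_back(it->second, instr);
  }
  std::stable_sort(keyed.begin(), keyed.end(),
                   [](const std::pair<unsigned, std::string> &lhs,
                      const std::pair<unsigned, std::string> &rhs) {
                     return lhs.first < rhs.first;
                   });
  std::vector<std::string> sorted;
  sorted.reserve(keyed.size());
  for (auto &entry : keyed)
    sorted.push_back(std::move(entry.second));
  return sorted;
}
```

The lookup cost is paid once per element instead of once per comparison,
which is where the saving comes from in a sort with O(n log n) comparisons.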
Differential Revision: https://reviews.llvm.org/D117575
Simon Pilgrim [Thu, 20 Jan 2022 14:58:23 +0000 (14:58 +0000)]
[X86] Fix v16f32 ADDSUB test
This was supposed to ensure we're not generating 512-bit ADDSUB nodes, but cut+paste typos meant we weren't generating a full 512-bit pattern
eopXD [Sat, 6 Nov 2021 14:54:58 +0000 (07:54 -0700)]
[Clang][RISCV] Change TARGET_BUILTIN to require zve32x for vector instruction
According to v-spec v1.0, `zve32x` is the new minimum extension required
to have vector instructions.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D112613
Jan Svoboda [Thu, 20 Jan 2022 14:07:39 +0000 (15:07 +0100)]
[llvm][vfs] Abstract in-memory node creation
The creation of in-memory VFS nodes happens in a single function that deduces what kind of node to create from the arguments. This leads to complicated if-then-else logic that's difficult to cleanly extend.
This patch abstracts away in-memory node creation via a type-erased factory function that's passed instead.
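A minimal sketch of the idea, not the VFS implementation: all names here
(`Node`, `FileNode`, `DirNode`, `MakeNode`, `addNode`) are hypothetical, and
`std::function` stands in for whatever type-erased callable the real code
uses.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

struct Node {
  explicit Node(std::string name) : name(std::move(name)) {}
  virtual ~Node() = default;
  virtual std::string kind() const = 0;
  std::string name;
};

struct FileNode : Node {
  using Node::Node;
  std::string kind() const override { return "file"; }
};

struct DirNode : Node {
  using Node::Node;
  std::string kind() const override { return "directory"; }
};

// Type-erased factory: the caller decides which node type to build.
using MakeNode = std::function<std::unique_ptr<Node>(const std::string &)>;

// The insertion routine no longer deduces the node kind from its
// arguments; it just invokes the factory it was given.
std::unique_ptr<Node> addNode(const std::string &name,
                              const MakeNode &makeNode) {
  assert(makeNode && "a factory is required");
  return makeNode(name);
}
```

Adding a new node kind then means passing a different factory, without
touching any branching logic inside `addNode`.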
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D117648
Jan Svoboda [Thu, 20 Jan 2022 14:05:38 +0000 (15:05 +0100)]
[llvm][vfs] NFC: Virtualize in-memory `getStatus`
This patch virtualizes the `getStatus` function on `InMemoryNode` in LLVM VFS. Currently, this is implemented via top-level function `getNodeStatus` that tries to cast `InMemoryNode *` into each subtype. Virtual functions seem to be the simpler solution here.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D117649
Stephan Herhut [Thu, 20 Jan 2022 13:02:04 +0000 (14:02 +0100)]
[mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer
When computing the new type of a collapse_shape operation, we need to at least
take into account whether the type has an identity layout, in which case we can
easily support dynamic strides. Otherwise, the canonicalizer creates invalid
IR.
Longer term, both the verifier and the canonicalizer need to be extended to
support the general case.
Differential Revision: https://reviews.llvm.org/D117772
Stanislav Gatev [Thu, 20 Jan 2022 09:28:25 +0000 (09:28 +0000)]
[clang][dataflow] Intersect ExprToLoc when joining environments
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117754
Valentin Clement [Thu, 20 Jan 2022 14:18:47 +0000 (15:18 +0100)]
[flang][NFC] Remove extra braces
Noticed during the upstreaming process.
Mubashar Ahmad [Thu, 23 Dec 2021 16:37:44 +0000 (16:37 +0000)]
[Clang][AArch64][ARM] Unaligned Access Warning Added
Added a warning for potential cases of unaligned access when the option
-mno-unaligned-access has been specified.
Differential Revision: https://reviews.llvm.org/D116221