Nick Desaulniers [Tue, 12 Apr 2022 18:37:42 +0000 (11:37 -0700)]
[Bitcode] materialize Functions early when BlockAddress taken
IRLinker builds a work list of functions to materialize, then moves them
from a source module to a destination module one at a time.
This is a problem for blockaddress Constants, since they need not refer
to the function they are used in; IPSCCP is quite good at sinking these
constants deep into other functions when passed as arguments.
This would lead to curious errors during LTO:
ld.lld: error: Never resolved function from blockaddress ...
based on the ordering of function definitions in IR.
The problem was that IRLinker would basically do:
  for function f in worklist:
    materialize f
    splice f from source module to destination module
in one pass, with Functions being lazily added to the running worklist.
This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.
This causes BitcodeReader to insert Functions into its BasicBlockFwdRefs
list incorrectly, as it will never re-materialize an already
materialized (but spliced out) function.
Because of the possibility that blockaddress Constants may appear in
Functions other than the ones they reference, this patch adds a new
bitcode function code FUNC_CODE_BLOCKADDR_USERS that is a simple list of
Functions that contain BlockAddress Constants that refer back to this
Function, rather than the Function they are scoped in. We then
materialize those functions when materializing `f` from the example loop
above. This might over-materialize Functions should the user of
BitcodeReader ultimately decide not to link those Functions, but at
least we can now avoid this ordering-related issue with blockaddresses.
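The idea can be pictured with a toy model. The sketch below is not the BitcodeReader or IRLinker code: `ToyModule`, `BlockAddrUsers` and `materializedAfter` are hypothetical names, and the map only stands in for the role FUNC_CODE_BLOCKADDR_USERS plays in bitcode. It assumes that materializing a function also materializes every recorded blockaddress user, transitively.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; none of these names exist in LLVM.
struct ToyModule {
  // For each function, the functions that hold blockaddress constants
  // pointing into it (the information the new record is meant to carry).
  std::map<std::string, std::vector<std::string>> BlockAddrUsers;
  std::set<std::string> Materialized;

  // Materialize F together with every recorded blockaddress user of F, so
  // that no user is parsed only after F's body has been spliced away.
  void materialize(const std::string &F) {
    if (!Materialized.insert(F).second)
      return; // already materialized
    auto It = BlockAddrUsers.find(F);
    if (It == BlockAddrUsers.end())
      return;
    for (const std::string &User : It->second)
      materialize(User);
  }
};

// Returns the set of functions materialized once Root has been requested.
std::set<std::string> materializedAfter(const std::string &Root) {
  ToyModule M;
  M.BlockAddrUsers["f"].push_back("g"); // g contains blockaddress(@f, %bb)
  M.materialize(Root);
  assert(M.Materialized.count(Root) == 1 && "the root is always materialized");
  return M.Materialized;
}
```

In this model, requesting `f` also pulls in `g`, while requesting `g` alone leaves `f` untouched, which matches the over-materialization trade-off mentioned above.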
Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1215
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D120781
Shraiysh Vaishay [Tue, 12 Apr 2022 18:20:27 +0000 (23:50 +0530)]
[mlir][OpenMP] Added omp.task
This patch adds the tasking construct according to Section 2.10.1 of OpenMP 5.0.
Reviewed By: peixin, kiranchandramohan, abidmalikwaterloo
Differential Revision: https://reviews.llvm.org/D123575
Fangrui Song [Tue, 12 Apr 2022 18:24:19 +0000 (11:24 -0700)]
[ubsan] Fix print_stacktrace=1:fast_unwind_on_fatal=0 to correctly fallback to fast unwinder
ubsan_GetStackTrace (from 52b751088b11547e0f4ef0589ebbe5e57752c68c) called by
~ScopeReport leaves top/bottom zeroes in the
`!WillUseFastUnwind(request_fast_unwind)` code path.
When BufferedStackTrace::Unwind falls back to UnwindFast,
`if (stack_top < 4096) return;` will return early, leaving just one frame in the stack trace.
Fix this by always initializing top/bottom like 261d6e05d5574bec753ea6b7e9a7f99229927753.
Reviewed By: eugenis, yln
Differential Revision: https://reviews.llvm.org/D123562
Martin Sebor [Tue, 12 Apr 2022 17:10:42 +0000 (11:10 -0600)]
[InstCombine] Add more memrchr tests (NFC).
Jonathan Peyton [Tue, 12 Apr 2022 16:15:54 +0000 (11:15 -0500)]
[OpenMP][libomp] Replace global variable references with local object
Remove references to global __kmp_topology within a kmp_topology_t
object method. There should just be implicit references to the
private object.
Arthur Eubanks [Mon, 11 Apr 2022 21:13:53 +0000 (14:13 -0700)]
[docs] Mention that we are in the process of removing the legacy PM for the optimization pipeline
And remove references to flags to turn it off.
Reviewed By: nikic, MaskRay
Differential Revision: https://reviews.llvm.org/D123547
Louis Dionne [Mon, 11 Apr 2022 16:32:40 +0000 (12:32 -0400)]
[libc++] Define legacy symbols for inline functions at a finer-grained level
When we build the library with the stable ABI, we need to include some
functions in the dylib that were made inline in later versions of the
library (to avoid breaking code that might be relying on those symbols).
However, those methods were made non-inline whenever we'd be building
the library, which means that all translation units would end up using
the old out-of-line definition of these methods, as opposed to the new
inlined version. This patch makes it so that only the translation units
that actually define the out-of-line methods use the old definition,
opening up potential optimization opportunities in other translation
units.
This should solve some of the issues encountered in D65667.
Differential Revision: https://reviews.llvm.org/D123519
Ahmed Bougacha [Tue, 12 Apr 2022 16:23:11 +0000 (09:23 -0700)]
[AArch64][LOH] Don't ignore regmasks in bundles by iterating over instrs.
The LOH pass iterates over instructions to build its custom register
state machine, but it uses the top-level bundle iterator.
This should be okay, because when the wrapper BUNDLE MI is built,
it aggregates the register defs/uses in its instructions into MOs.
However, that doesn't apply to regmasks, and accumulating regmasks
across multiple instructions would be messy business.
There are a couple AnalyzePhysRegInBundle (/Virt) helpers that
do look at regmasks, but those don't fit in very well here.
AArch64 has started to use a few bundle instructions, specifically
as glorified pseudos for variant call instructions, which have regmasks.
So the LOH pass ends up ignoring regmasks.
Concretely, this has been wrong for a while, but, on aarch64, the
most common bundle (rv_marker call) was always followed by the
attached call instruction, a plain BL with a regmask, which
was properly detected by the pass.
However, we recently started keeping the attached call in the bundle,
so the regmask is now ignored, and the pass happily combines ADRPs of,
say, x8, across the bundle, resulting in corrupt pointers later.
Ahmed Bougacha [Tue, 12 Apr 2022 16:39:03 +0000 (09:39 -0700)]
[AArch64] Cleanup call-rv-marker.ll test. NFC.
This was doing -iphoneos instead of -ios. While there,
remove an old TODO and cleanup some alignment.
Harald van Dijk [Tue, 12 Apr 2022 17:32:14 +0000 (18:32 +0100)]
[X86] Fix handling of maskmovdqu in x32 differently
This reverts the functional changes of D103427 but keeps its tests, and
reimplements the functionality by reusing the existing 32-bit
MASKMOVDQU and VMASKMOVDQU instructions as suggested by skan in review.
These instructions were previously predicated on Not64BitMode. This
reimplementation restores the disassembly of a class of instructions,
which will see a test added in followup patch D122449.
In 64-bit mode, these instructions are special-cased in
X86MCInstLower::Lower, because we use flags with one meaning for subtly
different things: we have an AdSize32 class which indicates both that
the instruction needs a 0x67 prefix and that the text form of the
instruction implies a 0x67 prefix. These instructions are special in
needing a 0x67 prefix but having a text form that does *not* imply a
0x67 prefix, so we encode this in MCInst as an instruction that has an
explicit address size override.
Note that originally VMASKMOVDQU64 was special cased to be excluded from
disassembly, as we cannot distinguish between VMASKMOVDQU and
VMASKMOVDQU64 and rely on the fact that these are indistinguishable, or
close enough to it, at the MCInst level that it does not matter which we
use. Because VMASKMOVDQU now receives special casing, even though it
does not make a difference in the current implementation, as a
precaution VMASKMOVDQU is excluded from disassembly rather than
VMASKMOVDQU64.
Reviewed By: RKSimon, skan
Differential Revision: https://reviews.llvm.org/D122540
Groverkss [Tue, 12 Apr 2022 17:14:25 +0000 (22:44 +0530)]
[MLIR][Presburger] Remove inheritance from PresburgerSpace in IntegerRelation, PresburgerRelation and PWMAFunction
This patch removes inheritance from PresburgerSpace in IntegerRelation,
PresburgerRelation and PWMAFunction, and instead makes it a member of these classes.
This is required for three reasons:
- It prevents implicit casting to PresburgerSpace.
- Not all functions of PresburgerSpace need to be exposed by the deriving classes.
- IntegerRelation and IntegerPolyhedron are defined in a PresburgerSpace. It
makes more sense for the space to be a member instead of them inheriting from
a space.
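The first point can be illustrated with a small, self-contained sketch. The types below (`ToySpace`, `RelationByInheritance`, `RelationByMember`) are hypothetical stand-ins, not the MLIR classes; they only show how holding the space as a member removes the implicit conversion that public inheritance allows.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a space description.
struct ToySpace {
  unsigned NumDims = 0;
  unsigned getNumDims() const { return NumDims; }
};

// Before: the relation *is a* space, so it converts to one implicitly.
struct RelationByInheritance : ToySpace {};

// After: the relation *has a* space and forwards only what it wants to expose.
class RelationByMember {
  ToySpace Space;

public:
  explicit RelationByMember(unsigned NumDims) { Space.NumDims = NumDims; }
  const ToySpace &getSpace() const { return Space; }
  unsigned getNumDims() const { return Space.getNumDims(); }
};

static_assert(std::is_convertible<RelationByInheritance, ToySpace>::value,
              "inheritance allows implicit conversion to the space");
static_assert(!std::is_convertible<RelationByMember, ToySpace>::value,
              "a member does not");

// Builds a relation over three dimensions and returns its dimension count.
unsigned dimsOfExample() {
  RelationByMember R(3);
  assert(R.getSpace().getNumDims() == R.getNumDims());
  return R.getNumDims();
}
```

Callers that really need the space have to ask for it explicitly (here through the assumed `getSpace()` accessor), which makes such uses easy to find.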
Reviewed By: arjunp, ftynse
Differential Revision: https://reviews.llvm.org/D123585
Zixu Wang [Mon, 11 Apr 2022 17:52:36 +0000 (10:52 -0700)]
[clang][ExtractAPI][NFC] Fix sed delimiter in test
Fix path replacement in sed (properly this time) using lit
regex_replacement.
Differential Revision: https://reviews.llvm.org/D123526
Co-authored-by: Michele Scandale <michele.scandale@gmail.com>
Co-authored-by: Zixu Wang <9819235+zixu-w@users.noreply.github.com>
Shao-Ce SUN [Tue, 12 Apr 2022 15:22:41 +0000 (23:22 +0800)]
[NFC][CodeGen] Use ArrayRef in TargetLowering functions
This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123467
Anshil Gandhi [Tue, 12 Apr 2022 15:46:41 +0000 (09:46 -0600)]
[AMDGPU][Codegen] Unsupported image sample texture map instructions
Disables image_sample_*_g16 instructions on architectures lacking g16 support. This patch fixes the issue 54672.
Differential Revision: https://reviews.llvm.org/D123461
Sanjay Patel [Tue, 12 Apr 2022 16:06:27 +0000 (12:06 -0400)]
[SimplifyCFG] cleanup code for converting switch to select (NFC)
This renames functions for more general usage (and current capitalization style)
before a proposed logic change in D122485.
Differential Revision: https://reviews.llvm.org/D123614
Jonathan Peyton [Tue, 12 Apr 2022 16:00:36 +0000 (11:00 -0500)]
[OpenMP][libomp] Fix some Doxygen issues
Fix spelling of variable names and remove accidental references (#)
in Doxygen comments.
Momchil Velikov [Tue, 12 Apr 2022 15:30:46 +0000 (16:30 +0100)]
[AArch64] Async unwind - function epilogues
Reviewed By: MaskRay, chill
Differential Revision: https://reviews.llvm.org/D112330
Mark de Wever [Wed, 30 Mar 2022 16:52:31 +0000 (18:52 +0200)]
[NFC][libc++][test] Move time tests.
In the C++20 Standard time is no longer a section under utilities, but
became its own chapter. This moves the time tests accordingly so their
location matches the current Standard.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D122745
Jay Foad [Tue, 12 Apr 2022 15:19:11 +0000 (16:19 +0100)]
[AMDGPU] Use default member initializers in Subtarget classes
Use default member initializers in AMDGPUSubtarget and subclasses. This
is to guard against adding a new feature boolean in AMDGPUSubtarget.h
but forgetting to initialize it to false in AMDGPUSubtarget.cpp.
This was mostly autogenerated by:
clang-tidy -checks=-*,cppcoreguidelines-prefer-member-initializer,modernize-use-default-member-init -header-filter=Subtarget -fix lib/Target/AMDGPU/*Subtarget.cpp
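As a minimal illustration of the pattern (the struct and field names below are made up and are not taken from AMDGPUSubtarget.h), a default member initializer keeps the initial value next to the declaration, so a newly added flag cannot be forgotten in a constructor:

```cpp
#include <cassert>

// Hypothetical subtarget-like class; the fields are illustrative only.
struct ToySubtarget {
  bool HasFastFMA = false;    // initialized where it is declared
  bool HasNewFeature = false; // a newly added flag starts out false as well
  unsigned WavefrontSizeLog2 = 5;

  ToySubtarget() = default; // no member initializer list to keep in sync
};

// Checks that a default-constructed object has well-defined feature flags.
void checkDefaults() {
  ToySubtarget ST;
  assert(!ST.HasFastFMA);
  assert(!ST.HasNewFeature);
  assert(ST.WavefrontSizeLog2 == 5);
}
```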
Differential Revision: https://reviews.llvm.org/D123613
Nico Weber [Tue, 12 Apr 2022 15:37:57 +0000 (11:37 -0400)]
[gn build] Fix a URL in a comment
Nikita Popov [Tue, 12 Apr 2022 15:31:29 +0000 (17:31 +0200)]
[InstSimplify] Don't fold phi of poison and trapping const expr (PR49839)
Folding this case would result in the constant expression being
executed unconditionally, which may introduce a new trap.
Fixes https://github.com/llvm/llvm-project/issues/49839.
Nikita Popov [Tue, 12 Apr 2022 15:25:28 +0000 (17:25 +0200)]
[InstSimplify] Add test for PR49839 (NFC)
Stanislav Mekhanoshin [Mon, 11 Apr 2022 17:13:31 +0000 (10:13 -0700)]
[AMDGPU] Split unaligned 3 DWORD DS operations
I have written a minitest to check the performance. Overall
the benefit of aligned b96 operations on data which is not
known but happens to be aligned is small, while performance
hit of using b96 operations on really unaligned memory is
high.
The only exception is when data is not aligned even by 4: it
is better to use b96 in this case.
Here is the test output on Vega and Navi:
```
Using platform: AMD Accelerated Parallel Processing
Using device: gfx900:xnack-
ds_write_b96 aligned: 3.4 sec
ds_write_b32 + ds_write_b64 aligned: 4.5 sec
ds_write_b32 * 3 aligned: 4.8 sec
ds_write_b96 misaligned by 1: 4.8 sec
ds_write_b32 + ds_write_b64 misaligned by 1: 7.2 sec
ds_write_b32 * 3 misaligned by 1: 10.0 sec
ds_write_b96 misaligned by 2: 4.8 sec
ds_write_b32 + ds_write_b64 misaligned by 2: 7.2 sec
ds_write_b32 * 3 misaligned by 2: 10.1 sec
ds_write_b96 misaligned by 4: 4.8 sec
ds_write_b32 + ds_write_b64 misaligned by 4: 4.2 sec
ds_write_b32 * 3 misaligned by 4: 4.9 sec
ds_write_b96 misaligned by 8: 4.8 sec
ds_write_b32 + ds_write_b64 misaligned by 8: 4.6 sec
ds_write_b32 * 3 misaligned by 8: 4.9 sec
ds_read_b96 aligned: 3.3 sec
ds_read_b32 + ds_read_b64 aligned: 4.9 sec
ds_read_b32 * 3 aligned: 2.6 sec
ds_read_b96 misaligned by 1: 4.1 sec
ds_read_b32 + ds_read_b64 misaligned by 1: 7.2 sec
ds_read_b32 * 3 misaligned by 1: 10.1 sec
ds_read_b96 misaligned by 2: 4.1 sec
ds_read_b32 + ds_read_b64 misaligned by 2: 7.2 sec
ds_read_b32 * 3 misaligned by 2: 10.1 sec
ds_read_b96 misaligned by 4: 4.1 sec
ds_read_b32 + ds_read_b64 misaligned by 4: 2.6 sec
ds_read_b32 * 3 misaligned by 4: 2.6 sec
ds_read_b96 misaligned by 8: 4.1 sec
ds_read_b32 + ds_read_b64 misaligned by 8: 4.9 sec
ds_read_b32 * 3 misaligned by 8: 2.6 sec
Using platform: AMD Accelerated Parallel Processing
Using device: gfx1030
ds_write_b96 aligned: 4.1 sec
ds_write_b32 + ds_write_b64 aligned: 13.0 sec
ds_write_b32 * 3 aligned: 4.5 sec
ds_write_b96 misaligned by 1: 12.5 sec
ds_write_b32 + ds_write_b64 misaligned by 1: 22.0 sec
ds_write_b32 * 3 misaligned by 1: 31.5 sec
ds_write_b96 misaligned by 2: 12.4 sec
ds_write_b32 + ds_write_b64 misaligned by 2: 22.0 sec
ds_write_b32 * 3 misaligned by 2: 31.5 sec
ds_write_b96 misaligned by 4: 12.4 sec
ds_write_b32 + ds_write_b64 misaligned by 4: 4.0 sec
ds_write_b32 * 3 misaligned by 4: 4.5 sec
ds_write_b96 misaligned by 8: 12.4 sec
ds_write_b32 + ds_write_b64 misaligned by 8: 13.0 sec
ds_write_b32 * 3 misaligned by 8: 4.5 sec
ds_read_b96 aligned: 3.8 sec
ds_read_b32 + ds_read_b64 aligned: 12.8 sec
ds_read_b32 * 3 aligned: 4.4 sec
ds_read_b96 misaligned by 1: 10.9 sec
ds_read_b32 + ds_read_b64 misaligned by 1: 21.8 sec
ds_read_b32 * 3 misaligned by 1: 31.5 sec
ds_read_b96 misaligned by 2: 10.9 sec
ds_read_b32 + ds_read_b64 misaligned by 2: 21.9 sec
ds_read_b32 * 3 misaligned by 2: 31.5 sec
ds_read_b96 misaligned by 4: 10.9 sec
ds_read_b32 + ds_read_b64 misaligned by 4: 3.8 sec
ds_read_b32 * 3 misaligned by 4: 4.5 sec
ds_read_b96 misaligned by 8: 10.9 sec
ds_read_b32 + ds_read_b64 misaligned by 8: 12.8 sec
ds_read_b32 * 3 misaligned by 8: 4.5 sec
```
Fixes: SWDEV-330802
Differential Revision: https://reviews.llvm.org/D123524
Stanislav Mekhanoshin [Thu, 7 Apr 2022 00:41:25 +0000 (17:41 -0700)]
[AMDGPU] Refactor LDS alignment checks.
Move features/bugs checks into the single place
allowsMisalignedMemoryAccessesImpl.
This is mostly NFCI except for the order of selection in a couple of places.
A separate change may be needed to stop lying about Fast.
Differential Revision: https://reviews.llvm.org/D123343
Simon Pilgrim [Tue, 12 Apr 2022 14:36:20 +0000 (15:36 +0100)]
[X86] getFauxShuffleMask - remove use DemandedElts TODO
Most of the getTargetShuffleInputs recursive calls have now gone and the remaining uses aren't likely to benefit from a DemandedElts mask
Sam McCall [Tue, 12 Apr 2022 14:17:32 +0000 (16:17 +0200)]
[pseudo] Remove unused clangTesting dep. NFC
Fabian Wolff [Tue, 12 Apr 2022 14:03:14 +0000 (16:03 +0200)]
[clang-tidy] Never consider assignments as equivalent in `misc-redundant-expression` check
Fixes https://github.com/llvm/llvm-project/issues/35853.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122535
Pavel Labath [Tue, 12 Apr 2022 13:41:45 +0000 (15:41 +0200)]
[lldb] Adjust libc++ string formatter for changes in D122598
The __size_ member is now in a slightly different location.
Jun Zhang [Tue, 12 Apr 2022 13:11:51 +0000 (21:11 +0800)]
[Clang] Fix unknown type attributes diagnosed twice with [[]] spelling
Don't warn on unknown type attributes in Parser::ProhibitCXX11Attributes
for most cases, but leave the diagnostic to the later checks.
Module declarations and module import declarations are special cases.
Fixes https://github.com/llvm/llvm-project/issues/54817
Differential Revision: https://reviews.llvm.org/D123447
serge-sans-paille [Mon, 11 Apr 2022 11:38:31 +0000 (13:38 +0200)]
[ValueTracking] Make getStringLength aware of strdup
During strlen compile-time evaluation, make it possible to track size of
strduped strings.
Differential Revision: https://reviews.llvm.org/D123497
David Spickett [Thu, 7 Apr 2022 15:12:21 +0000 (15:12 +0000)]
[lldb][AArch64] Automatically add all extensions to disassembler
This means we don't have to remember to update this code as much.
This is all tested in lldb/test/Shell/Commands/command-disassemble-aarch64-extensions.s
which I added previously.
We don't have a way to get the latest base architecture yet
so that remains manual. Having all the extensions specified
will probably be equivalent to the latest architecture version
in any case.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D123582
Dmitry Preobrazhensky [Tue, 12 Apr 2022 12:16:20 +0000 (15:16 +0300)]
[AMDGPU][DOC][NFC] Updated GFX10 assembler syntax description
The description has been updated to reflect AMDGPU MC changes:
- enabled literals for src0 of v_fmaak_f*, v_fmamk_f*, v_madak_f32, v_madmk_f32;
- enabled global_atomic_fcmpswap and global_atomic_fcmpswap_x2;
- enabled dlc with flat_atomic* and global_atomic_*.
Bug fixing and improvements:
- enabled s_wait_idle;
- enabled s_waitcnt_depctr;
- added description of s_waitcnt_depctr syntactic sugar;
- disabled SYSMSG_OP_HOST_TRAP_ACK (it is not supported on GFX10);
- corrected description of lgkmcnt (accept values from 0 to 63).
Arjun P [Tue, 12 Apr 2022 12:04:56 +0000 (13:04 +0100)]
[MLIR][Presburger] normalizeDiv: add assert that denom > 0
Dmitry Preobrazhensky [Tue, 12 Apr 2022 11:55:46 +0000 (14:55 +0300)]
[AMDGPU][DOC][NFC] Updated GFX1030 assembler syntax description
Summary of changes:
- enabled null for VOP operands;
- added description of s_waitcnt_depctr syntactic sugar.
Simon Pilgrim [Tue, 12 Apr 2022 11:57:48 +0000 (12:57 +0100)]
[DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds
Dmitri Gribenko [Tue, 12 Apr 2022 11:47:51 +0000 (13:47 +0200)]
Update the Bazel build files for "[mlir][Math] Replace some constant ..."
jacquesguan [Mon, 11 Apr 2022 07:22:32 +0000 (07:22 +0000)]
[mlir][Math] Replace some constant folder functions with common folder functions.
Differential Revision: https://reviews.llvm.org/D123485
Arjun P [Mon, 11 Apr 2022 20:21:34 +0000 (21:21 +0100)]
[MLIR][Presburger][Simplex] addSymbolicCut: don't add symbol div if denom is 1
This is unnecessary, so we remove it as an optimization.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D123540
Simon Pilgrim [Tue, 12 Apr 2022 11:21:45 +0000 (12:21 +0100)]
[X86] Fix extact -> exact typo in test names
LLVM GN Syncbot [Tue, 12 Apr 2022 09:55:37 +0000 (09:55 +0000)]
[gn build] Port 95f0f69f1ff8
Haojian Wu [Tue, 12 Apr 2022 09:51:00 +0000 (11:51 +0200)]
Revert "[AST] Add a new TemplateKind for template decls found via a using decl."
It breaks the arm build: there is no free bit for the extra
UsingShadowDecl in TemplateName::StorageType.
Reverting it to bring the buildbot back until we come up with a fix.
This reverts commit 5a5be4044f0bceb71bb6a81f6955704691b389ed.
Andrzej Warzynski [Mon, 11 Apr 2022 11:03:29 +0000 (11:03 +0000)]
[mlir] Prefix pass manager options with `mlir-`
With this change, there's going to be a clear distinction between LLVM
and MLIR pass manager options (e.g. `-mlir-print-after-all` vs
`-print-after-all`). This change is desirable from the point of view of
projects that depend on both LLVM and MLIR, e.g. Flang.
For consistency, all pass manager options in MLIR are prefixed with
`mlir-`, even options that don't have equivalents in LLVM.
Differential Revision: https://reviews.llvm.org/D123495
Matthias Springer [Tue, 12 Apr 2022 09:08:11 +0000 (18:08 +0900)]
[mlir][scf][bufferize][NFC] Lookup buffer using helper function
Lookup iter_arg buffers using `lookupBuffer` instead of always creating a new `ToMemrefOp`. Also cast all yielded buffers (if necessary), regardless of whether they are an equivalent buffer or a new allocation.
Note: This should have been part of D123369.
Differential Revision: https://reviews.llvm.org/D123383
Nikita Popov [Tue, 12 Apr 2022 09:03:42 +0000 (11:03 +0200)]
[InlineCost] Check that function types match
Retain the behavior we get without opaque pointers: A call to a
known function with different function type is considered an
indirect call.
This fixes the crash reported in https://reviews.llvm.org/D123300#3444772.
LLVM GN Syncbot [Tue, 12 Apr 2022 08:49:06 +0000 (08:49 +0000)]
[gn build] Port 5a5be4044f0b
Haojian Wu [Mon, 11 Apr 2022 12:44:46 +0000 (14:44 +0200)]
[AST] Add a new TemplateKind for template decls found via a using decl.
This is the template version of https://reviews.llvm.org/D114251.
This patch introduces a new template name kind (UsingTemplateName). The
UsingTemplateName stores the found using-shadow decl (and the underlying
template can be retrieved from the using-shadow decl). With the new
template name, we are able to find the using decl through which a template
typeloc (e.g. TemplateSpecializationTypeLoc) found its underlying template,
which is useful for tooling use cases (include cleaner etc).
This patch merely focuses on adding the node to the AST.
Next steps:
- support using-decl in qualified template name;
- update the clangd and other tools to use this new node;
- add ast matchers for matching different kinds of template names;
Differential Revision: https://reviews.llvm.org/D123127
Yi Kong [Mon, 11 Apr 2022 13:56:12 +0000 (21:56 +0800)]
[BOLT] Compact legacy profiles
Merging multiple legacy profiles (produced by instrumentation BOLT) can
easily reach GiBs. Let merge-fdata compact the profiles during merge to
significantly reduce space usage.
Differential Revision: https://reviews.llvm.org/D123513
Balázs Kéri [Tue, 12 Apr 2022 07:07:28 +0000 (09:07 +0200)]
[clang][ASTImporter] Add import of attribute 'enable_if'.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D123397
Mehdi Amini [Sun, 3 Apr 2022 23:07:43 +0000 (23:07 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in LLVMDialect.cpp (NFC)
Mehdi Amini [Sun, 3 Apr 2022 23:03:41 +0000 (23:03 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in SplitReduction.cpp (NFC)
Mehdi Amini [Tue, 12 Apr 2022 07:43:12 +0000 (07:43 +0000)]
Guard copy of std::function to llvm::function_ref (fix crash)
This is a footgun: assigning a null std::function to a function_ref
does not yield a null function_ref...
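The footgun can be reproduced with a simplified imitation of a non-owning callable reference. `ToyFunctionRef` below is a hypothetical sketch written for this note, not llvm::function_ref itself; it assumes that the converting constructor records a callback for any callable object, which is why an empty std::function still produces a reference that tests as non-null.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

template <typename Fn> class ToyFunctionRef;

// Hypothetical, simplified non-owning reference to a callable.
template <typename Ret, typename... Params>
class ToyFunctionRef<Ret(Params...)> {
  Ret (*Callback)(void *, Params...) = nullptr;
  void *Callable = nullptr;

  template <typename C> static Ret invoke(void *Obj, Params... Ps) {
    return (*static_cast<C *>(Obj))(std::forward<Params>(Ps)...);
  }

public:
  ToyFunctionRef() = default;

  // Wrapping any callable object sets Callback, even when that object is an
  // empty std::function; its emptiness is not inspected here.
  template <typename C,
            typename = typename std::enable_if<!std::is_same<
                typename std::remove_cv<C>::type, ToyFunctionRef>::value>::type>
  ToyFunctionRef(C &Obj)
      : Callback(&invoke<C>),
        Callable(const_cast<void *>(static_cast<const void *>(&Obj))) {}

  explicit operator bool() const { return Callback != nullptr; }

  Ret operator()(Params... Ps) const {
    assert(Callback && "calling a null ToyFunctionRef");
    return Callback(Callable, std::forward<Params>(Ps)...);
  }
};

// Unguarded: the reference claims to be callable although Fn may be empty.
bool unguardedLooksCallable(std::function<int(int)> &Fn) {
  ToyFunctionRef<int(int)> Ref = Fn;
  return static_cast<bool>(Ref);
}

// Guarded: only wrap Fn when it actually holds a target.
bool guardedLooksCallable(std::function<int(int)> &Fn) {
  ToyFunctionRef<int(int)> Ref;
  if (Fn)
    Ref = Fn;
  return static_cast<bool>(Ref);
}
```

With an empty std::function, the unguarded version reports a usable reference and calling through it would throw, while the guarded version keeps the reference null.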
Vitaly Buka [Tue, 12 Apr 2022 07:33:08 +0000 (00:33 -0700)]
[sanitizer] Fix internal_mmap in internal symbolizer
Mehdi Amini [Tue, 12 Apr 2022 07:28:19 +0000 (07:28 +0000)]
Use std::function instead of function_ref in MLIR JitRunner
This fixes an ASAN failure.
Mehdi Amini [Tue, 12 Apr 2022 06:50:27 +0000 (06:50 +0000)]
Revert "Fix CUDA runtime wrapper for GPU mem alloc/free to async"
This reverts commit b4117fede20b8c649320ad37364ae208baa0d0e7.
This broke one of the MLIR bots; a test is failing.
Tobias Hieta [Fri, 8 Apr 2022 07:28:22 +0000 (09:28 +0200)]
workflow: When updating the issueXX branch, use force push
Otherwise if you try to update the branch with a new /cherry-pick
from the same issue you will run into problems similar to the
one shown in this workflow:
https://github.com/llvm/llvm-project/runs/5864672298?check_suite_focus=true
Reviewed By: tstellar
Differential Revision: https://reviews.llvm.org/D123365
Carlos Alberto Enciso [Tue, 12 Apr 2022 04:31:26 +0000 (05:31 +0100)]
[llvm-pdbutil] Fix broken '-modi' option after change D122226.
The change described by:
https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses broken '-modi' argument handling, which
causes an assertion if its value is other than '0' or '1'.
In addition, it moves the assertion for the number of occurrences
of the '-modi' argument from the PDB library into the llvm-pdbutil
driver.
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D123483
Mehdi Amini [Sun, 3 Apr 2022 22:55:32 +0000 (22:55 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in LinalgOps.cpp (NFC)
Mehdi Amini [Sun, 3 Apr 2022 22:54:19 +0000 (22:54 +0000)]
Apply clang-tidy fixes for performance-for-range-copy in LinalgOps.cpp (NFC)
Fangrui Song [Tue, 12 Apr 2022 05:27:39 +0000 (22:27 -0700)]
[CodeGen][test] Fix disable-tail-calls.c if CLANG_ENABLE_OPAQUE_POINTERS_INTERNAL is off
Fangrui Song [Tue, 12 Apr 2022 05:21:23 +0000 (22:21 -0700)]
[Driver] -fno-optimize-sibling-calls: use the same spelling for its -cc1 counterpart
And remove a -no-opaque-pointers
Carl Ritson [Tue, 12 Apr 2022 04:58:42 +0000 (13:58 +0900)]
[AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange
If the CFG structure of a waterfall loop is not the expected shape
then gracefully abort traversing the IR for the given loop.
This applies to nested waterfall loops which are not supported by
the VGPR live range optimizer.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D123480
rdzhabarov [Tue, 12 Apr 2022 04:47:42 +0000 (04:47 +0000)]
Fix BUILD dependency for ExecutionEngineUtils
Differential Revision: https://reviews.llvm.org/D123570
Carl Ritson [Tue, 12 Apr 2022 04:28:47 +0000 (13:28 +0900)]
[AMDGPU] Pre-commit test for D123569. NFC.
Mehdi Amini [Sun, 3 Apr 2022 22:53:08 +0000 (22:53 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in LinalgOps.cpp (NFC)
Mehdi Amini [Sun, 3 Apr 2022 22:42:23 +0000 (22:42 +0000)]
Apply clang-tidy fixes for performance-move-const-arg in ArithmeticOps.cpp (NFC)
Uday Bondhugula [Tue, 12 Apr 2022 04:23:49 +0000 (09:53 +0530)]
[MLIR] NFC. Address clang-tidy warning in AffineOps.cpp
NFC. Address clang-tidy warning in AffineOps.cpp.
Vitaly Buka [Tue, 12 Apr 2022 04:10:49 +0000 (21:10 -0700)]
[sanitizer] Fix typo in test
Uday Bondhugula [Tue, 12 Apr 2022 03:33:53 +0000 (09:03 +0530)]
Fix CUDA runtime wrapper for GPU mem alloc/free to async
Switch CUDA runtime wrapper for GPU mem alloc/free to async. The
semantics of the GPU dialect ops (gpu.alloc/dealloc) and the wrappers they
are lowered to (gpu-to-llvm) were for the async versions -- however, this was
being incorrectly mapped to cuMemAlloc/cuMemFree instead of
cuMemAllocAsync/cuMemFreeAsync.
Reviewed By: csigg
Differential Revision: https://reviews.llvm.org/D123482
PoYao Chang [Fri, 8 Apr 2022 18:13:42 +0000 (02:13 +0800)]
[Clang] CWG 1394: Incomplete types as parameters of deleted functions
According to CWG 1394 and C++20 [dcl.fct.def.general]p2,
Clang should not diagnose incomplete types if function body is "= delete;".
For example:
```
struct Incomplete;
Incomplete f(Incomplete) = delete; // well-formed
```
Also close https://github.com/llvm/llvm-project/issues/52802
Differential Revision: https://reviews.llvm.org/D122981
PoYao Chang [Fri, 8 Apr 2022 18:10:43 +0000 (02:10 +0800)]
[NFC][Clang] Use previously declared variable instead of calling function redundantly
Brad Smith [Tue, 12 Apr 2022 02:34:44 +0000 (22:34 -0400)]
[CSKY] Remove redundant enabling of IAS for Clang, NFC
Generic_GCC::IsIntegratedAssemblerDefault() already takes care of CSKY.
Reviewed By: zixuan-wu
Differential Revision: https://reviews.llvm.org/D123431
Peixin-Qiao [Tue, 12 Apr 2022 02:15:15 +0000 (10:15 +0800)]
[MLIR][OpenMP] Add support for threadprivate directive
This supports the threadprivate directive in OpenMP dialect following
the OpenMP 5.1 [2.21.2] standard. It also adds lowering to LLVM IR using the
OpenMP IRBuilder.
Reviewed By: kiranchandramohan, shraiysh, arnamoy10
Differential Revision: https://reviews.llvm.org/D123350
jacquesguan [Mon, 11 Apr 2022 07:56:46 +0000 (07:56 +0000)]
[mlir][NFC] Remove some redundant code.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D123487
Vitaly Buka [Tue, 12 Apr 2022 01:58:26 +0000 (18:58 -0700)]
[sanitizer] Update undefined symbols of symbolizer
Eugene Zhulenev [Tue, 12 Apr 2022 00:28:51 +0000 (17:28 -0700)]
[mlir] Add msan memory unpoisoning macros to mlir ExecutionEngine
Adding annotations on an as-needed basis, currently only for memrefCopy, but in general all C API functions that take pointers to memory allocated/initialized inside the jit-compiled code must be annotated, to be able to run with msan.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D123557
Liqin Weng [Tue, 12 Apr 2022 01:19:39 +0000 (09:19 +0800)]
[InstCombine] fold more constant remainder to select-of-constants remainder
Reviewed By: xbolva00, spatel, Chenbing.Zheng
Differential Revision: https://reviews.llvm.org/D123486
Alexander Shaposhnikov [Tue, 12 Apr 2022 01:25:29 +0000 (01:25 +0000)]
[InstCombine] Fold icmp(X) ? f(X) : C
This diff extends foldSelectInstWithICmp to handle the case icmp(X) ? f(X) : C
when f(X) is guaranteed to be equal to C for all X in the exact range of the inverse predicate.
This addresses the issue https://github.com/llvm/llvm-project/issues/54089.
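A small worked instance may help; it is an illustrative example chosen for this note, not one of the tests from the patch. Take the predicate `X != 0`, `f(X) = X * 3` and `C = 0` on 8-bit values: the inverse predicate `X == 0` holds only for `X = 0`, and `f(0) == 0 == C`, so the select is redundant and the whole expression equals `f(X)`. The sketch below checks this exhaustively.

```cpp
#include <cassert>
#include <cstdint>

// f(X) = X * 3 with 8-bit wraparound.
std::uint8_t f(std::uint8_t X) { return static_cast<std::uint8_t>(X * 3u); }

// The original shape: icmp(X) ? f(X) : C with icmp = (X != 0) and C = 0.
std::uint8_t withSelect(std::uint8_t X) {
  return X != 0 ? f(X) : std::uint8_t{0};
}

// Exhaustively checks that the select can be replaced by f(X).
bool selectIsRedundant() {
  assert(f(0) == 0 && "f must equal C on the range of the inverse predicate");
  for (unsigned V = 0; V <= 255; ++V) {
    std::uint8_t X = static_cast<std::uint8_t>(V);
    if (withSelect(X) != f(X))
      return false;
  }
  return true;
}
```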
Differential revision: https://reviews.llvm.org/D123159
Test plan: make check-all
rdzhabarov [Tue, 12 Apr 2022 00:29:23 +0000 (00:29 +0000)]
Fixing BUILD dependency on the DialectBase.
Differential Revision: https://reviews.llvm.org/D123558
Alexander Shaposhnikov [Tue, 12 Apr 2022 01:07:30 +0000 (01:07 +0000)]
[InstCombine][NFC] Add baseline tests for folds icmp(X) ? f(X) : C
Differential revision: https://reviews.llvm.org/D123430
Test plan: make check-all
Craig Topper [Tue, 12 Apr 2022 01:03:43 +0000 (18:03 -0700)]
[SelectionDAG] Remove unnecessary null check after call to getNode. NFC
As far as I know getNode will never return a null SDValue.
I'm guessing this was modeled after the FoldConstantArithmetic
call earlier.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D123550
Vitaly Buka [Tue, 12 Apr 2022 00:26:21 +0000 (17:26 -0700)]
[sanitizer] Make test pass with InternalSymbolizer
Vitaly Buka [Tue, 12 Apr 2022 00:25:08 +0000 (17:25 -0700)]
[sanitizer] Fix arg types of internal functions
They didn't match sanitizer_common for 32bit.
Matt Arsenault [Sun, 10 Apr 2022 14:47:12 +0000 (10:47 -0400)]
GlobalISel: Verify atomic load/store ordering restriction
Reject acquire stores and release loads. This matches the restriction
imposed by the LLParser and IR verifier.
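The restriction can be written down as two small predicates. This is a sketch with a hypothetical `Ordering` enum, not the MachineVerifier code; also rejecting acq_rel on both loads and stores is an assumption carried over from the LLVM IR rules, which the message above does not spell out.

```cpp
#include <cassert>

// Hypothetical ordering enum for illustration.
enum class Ordering {
  Unordered,
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// A load has nothing to release, so release-flavoured orderings are rejected.
constexpr bool isValidLoadOrdering(Ordering O) {
  return O != Ordering::Release && O != Ordering::AcquireRelease;
}

// A store has nothing to acquire, so acquire-flavoured orderings are rejected.
constexpr bool isValidStoreOrdering(Ordering O) {
  return O != Ordering::Acquire && O != Ordering::AcquireRelease;
}

void checkOrderingExamples() {
  assert(isValidLoadOrdering(Ordering::Acquire));
  assert(!isValidLoadOrdering(Ordering::Release));
  assert(isValidStoreOrdering(Ordering::Release));
  assert(!isValidStoreOrdering(Ordering::Acquire));
}
```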
Matt Arsenault [Sat, 9 Apr 2022 18:16:18 +0000 (14:16 -0400)]
AArch64/GlobalISel: Regenerate mir test checks
Minimizes the test diffs in future changes from introduction of -NEXT.
Arthur Eubanks [Fri, 8 Apr 2022 22:18:16 +0000 (15:18 -0700)]
Reland [mlir] Remove uses of LLVM's legacy pass manager
Use the new pass manager.
This also removes the ability to run arbitrary sets of passes. Not sure if this functionality is used, but it doesn't seem to be tested.
No need to initialize passes outside of constructing the PassBuilder with the new pass manager.
Reland: Fixed custom calls to `-lower-matrix-intrinsics` in integration tests by replacing them with `-O0 -enable-matrix`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D123425
LLVM GN Syncbot [Mon, 11 Apr 2022 23:49:30 +0000 (23:49 +0000)]
[gn build] Port 203a1e36ed75
Arthur Eubanks [Mon, 11 Apr 2022 23:45:19 +0000 (16:45 -0700)]
Revert "[mlir] Remove uses of LLVM's legacy pass manager"
This reverts commit b0f7f6f78d050cc89b31c87fb48744989145af60.
Causes test failures: https://lab.llvm.org/buildbot#builders/61/builds/24879
Matt Arsenault [Sun, 10 Apr 2022 23:50:47 +0000 (19:50 -0400)]
GlobalISel: Add memSizeNotByteSizePow2 legality helper
This is really a replacement for memSizeInBytesNotPow2 that actually
does what most every target wants. In particular, since s1 rounds to 1
byte, it wasn't lowered by this predicate. This results in targets
needing to think harder and add more matchers to catch all the
degenerate cases.
Also small bug fix that prevented the correct insertion of
G_ASSERT_ZEXT in the AArch64 use case.
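The s1 case can be made concrete with a sketch on raw bit counts. The two functions below reuse the helper names from this message but are hypothetical reimplementations for illustration; the real predicates operate on LLT memory types, and the exact definitions here are an assumption based on the description above.

```cpp
#include <cassert>

// Hypothetical helper: true if V is a nonzero power of two.
constexpr bool isPowerOf2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

// Older-style check as described: the size is rounded up to whole bytes first.
constexpr bool memSizeInBytesNotPow2(unsigned Bits) {
  return !isPowerOf2((Bits + 7) / 8);
}

// Sketch of the stricter check: also flag sizes that are not a whole number
// of bytes.
constexpr bool memSizeNotByteSizePow2(unsigned Bits) {
  return Bits % 8 != 0 || !isPowerOf2(Bits / 8);
}

void checkMemSizeExamples() {
  assert(!memSizeInBytesNotPow2(1)); // s1 rounds up to 1 byte and slips through
  assert(memSizeNotByteSizePow2(1)); // the stricter check flags s1
  assert(memSizeInBytesNotPow2(24) && memSizeNotByteSizePow2(24)); // 3 bytes
  assert(!memSizeNotByteSizePow2(32)); // 4 bytes, a power of two
}
```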
Matt Arsenault [Mon, 11 Apr 2022 17:24:57 +0000 (13:24 -0400)]
GlobalISel: Implement computeKnownBits for overflow bool results
Matt Arsenault [Sun, 10 Apr 2022 22:20:10 +0000 (18:20 -0400)]
AMDGPU/GlobalISel: Add some additional IR tests for zextload
Matt Arsenault [Sun, 10 Apr 2022 17:27:56 +0000 (13:27 -0400)]
AMDGPU/GlobalISel: Add more tests for inreg extend + load combine
Matt Arsenault [Sun, 10 Apr 2022 15:23:06 +0000 (11:23 -0400)]
Mips/GlobalISel: Remove test IR sections and regenerate checks
Matt Arsenault [Sun, 10 Apr 2022 12:37:44 +0000 (08:37 -0400)]
AArch64/GlobalISel: Remove IR section from a test
Matt Arsenault [Sat, 9 Apr 2022 12:37:44 +0000 (08:37 -0400)]
AMDGPU/GlobalISel: Remove unused parameter
Matt Arsenault [Fri, 17 Dec 2021 15:23:03 +0000 (10:23 -0500)]
Reapply "AMDGPU: Remove AMDGPUFixFunctionBitcasts pass"
This reverts commit 8a85be807bd453eb9c88d0126c75fd5ea393f60d.
The unrelated failure this exposed was fixed.
Mahesh Ravishankar [Mon, 11 Apr 2022 23:34:43 +0000 (23:34 +0000)]
[mlir][Linalg] Split `populateElementwiseOpsFusionPatterns`.
The method to add elementwise ops fusion patterns pulls in many other
patterns by default. The patterns to pull in along with the
elementwise op fusion should be up to the caller. Split the method to
pull in just the elementwise ops fusion pattern. Other cleanup changes
include
- Move the pattern for constant folding of generic ops (currently only
constant folds transpose) into a separate file, because it is not
related to fusion
- Drop the uber LinalgElementwiseFusionOptions. With the
populateElementwiseOpsFusionPatterns being split, this has no
utility now.
- Drop defaults for the control function.
- Fusion of splat constants with generic ops doesn't need a control
function. It is always good to do.
Differential Revision: https://reviews.llvm.org/D123236
Arthur Eubanks [Fri, 8 Apr 2022 22:18:16 +0000 (15:18 -0700)]
[mlir] Remove uses of LLVM's legacy pass manager
Use the new pass manager.
This also removes the ability to run arbitrary sets of passes. Not sure if this functionality is used, but it doesn't seem to be tested.
No need to initialize passes outside of constructing the PassBuilder with the new pass manager.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D123425
Mehdi Amini [Sun, 3 Apr 2022 22:36:30 +0000 (22:36 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in AffineOps.cpp (NFC)
Mehdi Amini [Sun, 3 Apr 2022 22:28:45 +0000 (22:28 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in ConvertShapeConstraints.cpp (NFC)
Changpeng Fang [Mon, 11 Apr 2022 23:12:39 +0000 (16:12 -0700)]
AMDGPU: Align the implicit kernel argument segment to 8 bytes for v5
Summary:
In emitting metadata for implicit kernel arguments, we need to be in sync with the actual loads
to align the implicit kernel argument segment to an 8-byte boundary. In this work, we simply force
this alignment through the first implicit argument.
In addition, we don't emit metadata for any implicit kernel argument if none of them is actually used.
Reviewers: arsenm, b-sumner
Differential Revision: https://reviews.llvm.org/D123346