Fangrui Song [Sat, 26 Mar 2022 07:57:06 +0000 (00:57 -0700)]
[Option] Avoid using the default argument for the 3-argument hasFlag. NFC
The default argument true is error-prone: I think many would think the
default is false.
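A minimal sketch of the pitfall, written in Python rather than the C++ of the real code. The helper `has_flag` is hypothetical and only mimics a "last occurrence wins" lookup; the point is that the fallback value is a required parameter, so every call site has to state it.

```python
def has_flag(args, pos, neg, default):
    """Return True if `pos` is the last of the two flags seen, False if `neg` is.

    Hypothetical helper, not the LLVM API. `default` has no default value
    on purpose: the caller must say what happens when neither flag is present.
    """
    for arg in reversed(args):
        if arg == pos:
            return True
        if arg == neg:
            return False
    return default


# The explicit fourth argument makes the fallback visible at the call site.
enabled = has_flag(["-O2"], "-ffoo", "-fno-foo", False)
```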
Fangrui Song [Sat, 26 Mar 2022 06:59:31 +0000 (23:59 -0700)]
[Driver][test] Clean up riscv* tests
See `D119309` for the guideline (-target, -no-canonical-prefixes, unneeded -o
with -###).
Ben Shi [Sat, 26 Mar 2022 03:24:18 +0000 (03:24 +0000)]
[AVR] Optimize int16 arithmetic right shift for shift amount 7/14/15
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D115618
Fangrui Song [Sat, 26 Mar 2022 05:15:35 +0000 (22:15 -0700)]
[LoongArch] Fix several Clang warnings. NFC
Shengchen Kan [Sat, 26 Mar 2022 05:00:53 +0000 (13:00 +0800)]
[X86][tablegen] Add class RecognizableInstrBase to simplify X86 code, NFCI
Joseph Huber [Sat, 26 Mar 2022 03:05:29 +0000 (23:05 -0400)]
[OpenMP] Fix AMDGPU globals test
Shilei Tian [Sat, 26 Mar 2022 02:49:25 +0000 (22:49 -0400)]
[OpenMP][CUDA] Fix potential program crash caused by double free resources
As we mentioned in the code comments for function `ResourcePoolTy::release`,
at some point there could be two identical resources on the two sides of `Next`
mark. This is usually not an issue, except when both of the following hold:
1. Some resources are not returned.
2. We need to iterate the pool and free the element.
That will cause double free, which is the case for event pool. Since we don't release
events held by the data map, it can happen that the `Next` mark is not reset, and
we have two identical items in the pool. When the pool is destroyed, we will call
`cuEventDestroy` twice on the same event. In the best case, we can only observe
CUDA errors. In the worst case, it can cause internal failures in CUDART and further
crash.
This patch fixes the issue by tracking all resources that have been given using
an `unordered_set`. We don't remove it when a resource is returned. When the pool
is destroyed, we merge the pool (a `vector`) and the set. In this way, we can make
sure that the set contains all resources allocated from the device. We just need
to iterate the set and free the resource accordingly.
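The idea in the paragraph above can be sketched as follows. This is a toy model in Python under stated assumptions, not the plugin's `ResourcePoolTy`: the class name, method names, and the `create`/`destroy` callbacks are all hypothetical.

```python
class ResourcePool:
    """Toy pool that destroys every resource exactly once (hypothetical)."""

    def __init__(self, create, destroy):
        self._create = create
        self._destroy = destroy
        self._free = []            # resources currently available (the `vector`)
        self._handed_out = set()   # every resource ever given out; never shrinks

    def acquire(self):
        res = self._free.pop() if self._free else self._create()
        self._handed_out.add(res)
        return res

    def release(self, res):
        # The resource goes back to the pool but stays in `_handed_out`.
        self._free.append(res)

    def destroy_all(self):
        # Merge the pool and the set: duplicates collapse, and resources
        # that were never returned are still covered.
        everything = self._handed_out | set(self._free)
        for res in everything:
            self._destroy(res)
        self._free.clear()
        self._handed_out.clear()
        return len(everything)
```

Because destruction iterates a set, a resource that appears twice in the pool, or that was never returned, is still freed once.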
For now, only event pool is set to use it. Stream pool is not because we can make
sure all streams are returned when the plugin is destroyed.
One might wonder why we don't release all events held in the data map.
That is because, plugins are determined to be destroyed *before* `libomptarget`.
If we can somehow make the plugin outlast `libomptarget`, life will be much
easier.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122014
Joseph Huber [Fri, 25 Mar 2022 22:42:12 +0000 (18:42 -0400)]
[OpenMP] Add AMDGPU calling convention to ctor / dtor functions
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kernels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.
Depends on D122504
Fixes #54091
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122515
Joseph Huber [Fri, 25 Mar 2022 20:36:57 +0000 (16:36 -0400)]
[OpenMP] Make Ctor / Dtor functions have external visibility
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D122504
Shengchen Kan [Fri, 25 Mar 2022 12:54:19 +0000 (20:54 +0800)]
[X86][tablegen] Add interface getMnemonic to namespace X86Disassembler, NFCI
Address comments in D122477 b/c `getMnemonic` is common to X86 and may be
used in more than one place.
Maksim Panchenko [Mon, 21 Mar 2022 22:45:48 +0000 (15:45 -0700)]
[Disassembler][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D122245
Peter Klausler [Thu, 24 Mar 2022 19:59:04 +0000 (12:59 -0700)]
[flang] Catch bad OPEN(STATUS=) cases
STATUS='NEW' and 'REPLACE' require FILE= to be present.
STATUS='SCRATCH' may not appear with FILE=.
These errors are caught at compilation time when constant character
strings are used in an OPEN statement, but the runtime needs
to enforce them as well to catch errors in OPEN statements
with character variables and expressions.
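The two rules above amount to a small runtime check. A minimal sketch in Python, assuming a hypothetical `check_open_status` helper (the real runtime is C++ and its interface is not shown in this message); only case-insensitivity of the keyword value is modeled.

```python
def check_open_status(status, file=None):
    """Return an error message for a bad STATUS=/FILE= pairing, else None.

    Hypothetical helper for illustration only.
    """
    status = status.upper()  # keyword values are compared without regard to case
    if status in ("NEW", "REPLACE") and file is None:
        return f"STATUS='{status}' requires FILE="
    if status == "SCRATCH" and file is not None:
        return "STATUS='SCRATCH' may not appear with FILE="
    return None
```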
Differential Revision: https://reviews.llvm.org/D122509
Uday Bondhugula [Wed, 23 Mar 2022 06:46:27 +0000 (12:16 +0530)]
Update affine.load folding hook to fold global splat constant loads
Enhance affine.load folding hook to fold loads on global splat constant
memrefs.
Differential Revision: https://reviews.llvm.org/D122292
Fred Riss [Mon, 1 Feb 2021 19:44:27 +0000 (11:44 -0800)]
Adopt new dyld SPIs to introspect the shared cache.
With the shared cache getting split into multiple files, the current
way we created ObjectFileMachO objects for shared cache dylib images
will break.
This patch conditionally adopts new SPIs which will do the right
thing in the new world of multi-file caches.
Gulfem Savrun Yeniceri [Sat, 26 Mar 2022 00:22:38 +0000 (00:22 +0000)]
[InstrProfiling] Add comments for no runtime hook
This patch adds comments about c7f91e227a79, and
follows the LLVM style guideline about nested if statements.
David Blaikie [Wed, 9 Mar 2022 21:13:31 +0000 (21:13 +0000)]
Disable -Wmissing-prototypes for internal linkage functions that aren't explicitly marked "static"
Some functions can end up non-externally visible despite not being
declared "static" or in an unnamed namespace in C++ - such as by having
parameters that are of non-external types.
Such functions aren't mistakenly intended to be defining some function
that needs a declaration. They could perhaps be more legible (except for
the operator new example) with an explicit static, but that's a
stylistic thing outside what should be addressed by a warning.
This reapplies 275c56226d7fbd6a4d554807374f78d323aa0c1c - once we figure
out what to do about the change in behavior for -Wnon-c-typedef-for-linkage
(this reverts the revert commit 85ee1d3ca1d06b6bd3477515b8d0c72c8df7c069)
Differential Revision: https://reviews.llvm.org/D121328
David Blaikie [Fri, 25 Mar 2022 22:42:41 +0000 (22:42 +0000)]
DebugInfo: Don't allow type units to reference types in the CU
We could only do this in limited ways (since we emit the TUs first, we
can't use ref_addr (& we can't use that in Split DWARF either) - so we
had to synthesize declarations into the TUs) and they were ambiguous in
some cases (if the CU type had internal linkage, parsing the TU would
require knowing which CU was referencing the TU to know which type the
declaration was for, which seems not-ideal). So to avoid all that, let's
just not reference types defined in the CU from TUs - instead moving the
TU type into the CU (recursively).
This does increase debug info size (by pulling more things out of type
units, into the compile unit) - about 2% of uncompressed dwp file size
for clang -O0 -g -gsplit-dwarf. (5% .debug_info.dwo section size
increase in the .dwp)
Haojian Wu [Fri, 25 Mar 2022 23:04:41 +0000 (00:04 +0100)]
[pseudo] Fix the wrong rule ids in ForestTest.
Haojian Wu [Fri, 25 Mar 2022 22:51:19 +0000 (23:51 +0100)]
[pseudo] Add missing header guard for Forest.h
Peter Klausler [Mon, 21 Mar 2022 23:01:06 +0000 (16:01 -0700)]
[flang] Mark C_ASSOCIATED specific procedures as PURE
The interfaces to C_ASSOCIATED()'s specific procedures must be
PURE so that they are accepted for use in specification expressions.
Differential Revision: https://reviews.llvm.org/D122438
Med Ismail Bennani [Fri, 25 Mar 2022 21:25:23 +0000 (14:25 -0700)]
[lldb/Plugin] Sort the ScriptedProcess' thread list before creating threads
With Scripted Processes, in order to create scripted threads, the blueprint
provides a dictionary that has each thread index as the key with the respective
thread instance as the pair value.
In Python, this is fine because a dictionary key can be of any type including
integer types:
```
>>> {1: "one", 2: "two", 10: "ten"}
{1: 'one', 2: 'two', 10: 'ten'}
```
However, when the python dictionary gets bridged to C++ we convert it to a
`StructuredData::Dictionary` that uses a `std::map<ConstString, ObjectSP>`
for storage.
Because `std::map` is an ordered container and ours uses the `ConstString`
type for keys, the thread indices get converted to strings which makes the
dictionary sorted alphabetically, instead of numerically.
If the ScriptedProcess has 10 threads or more, it causes thread “10”
(and higher) to be after thread “1”, but before thread “2”.
In order to solve this, this sorts the thread info dictionary keys
numerically, before iterating over them to create ScriptedThreads.
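A short illustration of the ordering problem and the fix, in Python. The variable names are made up for the example; the string keys stand in for thread indices after conversion.

```python
# Thread indices after they have been converted to strings.
keys = [str(i) for i in range(12)]

# Order produced by a container sorted on string keys.
alphabetical = sorted(keys)

# Order wanted when creating threads: compare the keys as integers.
numerical = sorted(keys, key=int)
```

With twelve threads, `alphabetical` starts with "0", "1", "10", "11", "2", while `numerical` keeps the indices in counting order.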
rdar://90327854
Differential Revision: https://reviews.llvm.org/D122429
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Med Ismail Bennani [Thu, 24 Mar 2022 19:21:21 +0000 (12:21 -0700)]
[lldb/Utility] Make StructuredData::Dictionary::GetKeys return an Array
This patch changes `StructuredData::Dictionary::GetKeys` return type
from an `StructuredData::ObjectSP` to a `StructuredData::ArraySP`.
The function already stored the keys in an array but implicitly upcast
it to an `ObjectSP`, which required the user to convert it again to an
Array object to access each element.
Since we know the keys should be held by an iterable container, it makes
more sense to return the allocated ArraySP as-is.
Differential Revision: https://reviews.llvm.org/D122426
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Med Ismail Bennani [Fri, 25 Mar 2022 00:19:33 +0000 (17:19 -0700)]
[lldb/crashlog] Parse thread fields and pass it to crashlog scripted process
Previously, the ScriptedThread used the thread index as the thread id.
This patch parses the crashlog json to extract the actual thread "id" value,
and passes this information to the Crashlog ScriptedProcess blueprint,
to create a higher fidelity ScriptedThread.
It also updates the blueprint to show the thread name and thread queue.
Finally, this patch updates the interactive crashlog test to reflect
these changes.
rdar://90327854
Differential Revision: https://reviews.llvm.org/D122422
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Fangrui Song [Fri, 25 Mar 2022 21:56:18 +0000 (14:56 -0700)]
[Driver][Linux] Remove D.Dir+"/../lib" from default search paths for LLVM_ENABLE_RUNTIMES builds
The rule was added in 2014 to support -stdlib=libc++ and -lc++ without
specifying -L, when D.Dir is not a well-known system library directory like
/usr/lib /usr/lib64. This rule turns out to get in the way of (-m32 for
64-bit clang) or (-m64 for 32-bit clang) for Gentoo:
https://github.com/llvm/llvm-project/issues/54515
Nowadays LLVM_ENABLE_RUNTIMES is the only recommended way of building libc++ and
LLVM_ENABLE_PROJECTS=libc++ is deprecated. LLVM_ENABLE_RUNTIMES builds libc++
in D.Dir+"/../lib/${triple}/". The rule is unneeded. Also reverts D108286.
Gentoo uses a modified LLVM_ENABLE_RUNTIMES that installs libc++.so in
well-known paths like /usr/lib64 and /usr/lib which are already covered by
nearby search paths.
Implication: if a downstream package needs something like -lLLVM-15git and uses
libLLVM-15git.so not in a well-known path, it needs to supply -L
D.Dir+"/../lib" explicitly (e.g. via LLVMConfig.cmake), instead of relying on
the previous default search path.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D122444
Johannes Doerfert [Fri, 25 Mar 2022 21:06:46 +0000 (16:06 -0500)]
Revert "[OpenMP][NFC] Add missing virtual destructor to silence warning"
This reverts commit b9fd8f34ae547674ac0b5f5fbc5bb66d2bc0fedb as it
accidentally contained a unit test change that is not finished (and
unrelated).
Florian Hahn [Fri, 25 Mar 2022 21:05:58 +0000 (21:05 +0000)]
[Clang] Use pattern to match profile metadata in test.
Make the test more robust to slightly different metadata numbering by
using a pattern instead of hard coding the ids.
Johannes Doerfert [Fri, 25 Mar 2022 21:00:06 +0000 (16:00 -0500)]
[OpenMP][FIX] Repair ExclusiveAccess move semantic snafu
Johannes Doerfert [Fri, 25 Mar 2022 19:53:38 +0000 (14:53 -0500)]
[OpenMP][NFC] Add missing virtual destructor to silence warning
William S. Moses [Fri, 25 Mar 2022 19:39:43 +0000 (15:39 -0400)]
[Clang] Add helper method to determine if a nonvirtual base has an entry in the LLVM struct
This patch adds a helper method to determine if a nonvirtual base has an entry in the LLVM struct. Such a base may not have an entry
if the base does not have any fields/bases itself that would change the size of the struct. This utility method is useful for other frontends (Polygeist) that use Clang as an API to generate code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122502
Paul Robinson [Fri, 25 Mar 2022 20:13:00 +0000 (13:13 -0700)]
Remove dead code in driver parsing -gsimple-template-names= options
While -g[no-]simple-template-names is a driver option, the fancier
-gsimple-template-names={simple,mangled} option is cc1-only, so code
to handle it in the driver is dead.
Differential Revision: https://reviews.llvm.org/D122503
Peter Klausler [Wed, 23 Mar 2022 21:05:50 +0000 (14:05 -0700)]
[flang] Add & use a better visit()
Adds flang/include/flang/Common/visit.h, which defines
a Fortran::common::visit() template function that is a drop-in
replacement for std::visit(). Modifies most use sites in
the front-end and runtime to use common::visit().
The C++ standard mandates that std::visit() have O(1) execution
time, which forces implementations to build dispatch tables.
This new common::visit() is O(log2 N) in the number of alternatives
in a variant<>, but that N tends to be small and so this change
produces a fairly significant improvement in compiler build
memory requirements, a 5-10% improvement in compiler build time,
and a small improvement in compiler execution time.
Building with -DFLANG_USE_STD_VISIT causes common::visit()
to be an alias for std::visit().
Calls to common::visit() with multiple variant arguments
are referred to std::visit(), pending further work.
Differential Revision: https://reviews.llvm.org/D122441
Hongtao Yu [Thu, 24 Mar 2022 18:15:34 +0000 (11:15 -0700)]
[PseudoProbe] Do not emit pseudo probes when module is not probed.
There is a case when a function has pseudo probe intrinsics but the module it resides in does not have the probe desc. This could happen when the current module is not built with `-fpseudo-probe-for-profiling` while a function in it calls some other function from a probed module. In ThinLTO mode, the callee function could be imported and inlined into the current function.
While this is undefined behavior, I'm fixing the asm printer to not ICE and warn user about this.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D121737
Emilio Cota [Fri, 25 Mar 2022 19:40:48 +0000 (15:40 -0400)]
[bazel] add missing targets since 3be7c28917
Adrian Prantl [Fri, 25 Mar 2022 19:36:32 +0000 (12:36 -0700)]
Add missing include diagnosed in modules build. (NFC)
Martin Storsjö [Fri, 25 Mar 2022 19:22:34 +0000 (21:22 +0200)]
[clang-tidy] Fix the condition for building CTTestTidyModule
This is the correct intended condition; the problematic case where
we don't want to try to build the plugin is "WIN32 AND LLVM_LINK_LLVM_DYLIB"
and thus the negation is "NOT WIN32 OR NOT LLVM_LINK_LLVM_DYLIB".
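The negation above is De Morgan's law, and it can be checked exhaustively. A small Python sketch, with function names invented for the example:

```python
from itertools import product


def problematic(win32, link_dylib):
    # The case where the plugin must not be built.
    return win32 and link_dylib


def should_build(win32, link_dylib):
    # The condition as written in the fix.
    return (not win32) or (not link_dylib)


# Compare the two over all four combinations of the inputs.
all_cases = list(product([False, True], repeat=2))
agree = all(should_build(w, d) == (not problematic(w, d)) for w, d in all_cases)
```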
Differential Revision: https://reviews.llvm.org/D121687
Sam McCall [Wed, 16 Mar 2022 02:13:44 +0000 (03:13 +0100)]
[cmake] Provide CURRENT_TOOLS_DIR centrally, replacing CLANG_TOOLS_DIR
CLANG_TOOLS_DIR holds the current bin/ directory, maybe with a %(build_mode)
placeholder. It is used to add the just-built binaries to $PATH for lit tests.
In most cases it equals LLVM_TOOLS_DIR, which is used for the same purpose.
But for a standalone build of clang, CLANG_TOOLS_DIR points at the build tree
and LLVM_TOOLS_DIR points at the provided LLVM binaries.
Currently CLANG_TOOLS_DIR is set in clang/test/, clang-tools-extra/test/, and
other things always built with clang. This is a few cryptic lines of CMake in
each place. Meanwhile LLVM_TOOLS_DIR is provided by configure_site_lit_cfg().
This patch moves CLANG_TOOLS_DIR to configure_site_lit_cfg() and renames it:
- there's nothing clang-specific about the value
- it will also replace LLD_TOOLS_DIR, LLDB_TOOLS_DIR etc (not in this patch)
It also defines CURRENT_LIBS_DIR. While I removed the last usage of
CLANG_LIBS_DIR in e4cab4e24d1, there are LLD_LIBS_DIR usages etc that
may be live, and I'd like to mechanically update them in a followup patch.
Differential Revision: https://reviews.llvm.org/D121763
Chia-hung Duan [Fri, 25 Mar 2022 18:09:53 +0000 (18:09 +0000)]
[mlir] Add InferTensorType without supporting reifyReturnTypeShapes
This is useful for the case that we don't need to implement
reifyReturnTypeShapes.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D121403
Philip Reames [Fri, 25 Mar 2022 18:34:31 +0000 (11:34 -0700)]
[SLP] Simplify eraseInstruction [NFC]
This simplifies the implementation of eraseInstruction by moving the odd-replace-users-with-undef handling back to the only caller which uses it. This handling was not obviously correct, so add the asserts which make it clear why this is safe to do at all. The result is simpler code and stronger assertions.
LLVM GN Syncbot [Fri, 25 Mar 2022 18:54:35 +0000 (18:54 +0000)]
[gn build] Port cef52105bd4b
Douglas Yung [Fri, 25 Mar 2022 18:53:42 +0000 (11:53 -0700)]
Revert "[clang-tidy] Add modernize-macro-to-enum check"
This reverts commit 39b80c8380c86539de391600efaa17184b5a52b4.
This change was causing build failures on several build bots:
- https://lab.llvm.org/buildbot/#/builders/139/builds/19210
- https://lab.llvm.org/buildbot/#/builders/93/builds/7956
Corentin Jabot [Fri, 25 Mar 2022 18:34:16 +0000 (19:34 +0100)]
[Clang] Fix error in Documentation introduced by 3784e8cc [nfc].
The documentation contained an extra space.
Also remove https://github.com/llvm/llvm-project/issues/54296
from the list of issues fixed by 3784e8cc as this commit did not
fix it (nor was it supposed to).
Peter Klausler [Wed, 23 Mar 2022 23:02:59 +0000 (16:02 -0700)]
[flang] Fix bogus error from assignment to CLASS(*)
Assignment semantics was coughing up bad errors and crashes for
intrinsic assignments to unlimited polymorphic entities while
looking for any (impossible) user defined ASSIGNMENT(=) generic
or intrinsic type conversion.
Differential Revision: https://reviews.llvm.org/D122440
Corentin Jabot [Sat, 12 Mar 2022 19:49:01 +0000 (20:49 +0100)]
[Clang] Fix Unevaluated Lambdas
Unlike other types, when lambdas are instantiated,
they are recreated from scratch.
When an unevaluated lambda appears in the type of a function
parameter, it is instantiated in the wrong declaration context,
as parameters are transformed before the function.
To support lambdas in function parameters, we try to
compute whether they are dependent without looking at the
declaration context.
This is a short term stopgap solution to avoid clang
ICEing. A better fix might be to inject some kind of
transparent declaration with correctly computed dependency
for function parameters, variable templates, etc.
Fixes https://github.com/llvm/llvm-project/issues/50376
Fixes https://github.com/llvm/llvm-project/issues/51414
Fixes https://github.com/llvm/llvm-project/issues/51416
Fixes https://github.com/llvm/llvm-project/issues/51641
Fixes https://github.com/llvm/llvm-project/issues/54296
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D121532
Florian Hahn [Fri, 25 Mar 2022 18:12:39 +0000 (18:12 +0000)]
[Clang,TBAA] Use pattern for metadata reference in test.
Update the single check line that still had a hard-coded metadata
reference. This makes it more robust to slight changes in the metadata
numbering.
Florian Hahn [Fri, 25 Mar 2022 18:08:24 +0000 (18:08 +0000)]
[ConstraintElimination] Use AddOverflow for offset summation.
Fixes an incorrect transformation due to values overflowing
https://alive2.llvm.org/ce/z/uizoea
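A minimal sketch of overflow-checked summation, in Python. Python integers do not overflow, so signed 64-bit wraparound is simulated; `add_overflow` and `sum_offsets` are hypothetical names for this example and only imitate the idea of reporting overflow instead of silently wrapping.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def add_overflow(x, y):
    """Return (overflowed, wrapped_sum) for signed 64-bit addition."""
    exact = x + y                               # exact mathematical sum
    wrapped = (exact + 2**63) % 2**64 - 2**63   # two's-complement wraparound
    return exact != wrapped, wrapped


def sum_offsets(offsets):
    """Sum the offsets, or return None if any partial sum overflows."""
    total = 0
    for off in offsets:
        overflowed, total = add_overflow(total, off)
        if overflowed:
            return None  # give up rather than reason about a wrapped value
    return total
```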
Florian Hahn [Fri, 25 Mar 2022 18:08:17 +0000 (18:08 +0000)]
[ConstraintElimination] Add test where offset additions overflow.
Emil Kieri [Fri, 25 Mar 2022 15:33:25 +0000 (16:33 +0100)]
[clang][driver] Disable non-functional --version option for clang -cc1
This patch removes --version as a clang -cc1 option.
clang --version
and
clang -cc1 -version
remain valid. This behaviour is consistent with clang -cc1as.
Previously, clang -cc1 accepted both --version and -version, but
only -version was acted upon. The call
clang -cc1 --version
stalled without any message: --version was an accepted option but
triggered no action, and the driver waited for standard input.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D122344
Nathan James [Fri, 25 Mar 2022 17:53:24 +0000 (17:53 +0000)]
Reland "[ASTMatchers] Output currently processing match and nodes on crash"
This reverts commit cff34ccb605aa78030cd51cfe44362ed1c1fb80b.
This relands commit d89f9e963e4979466193dc6a15fe091bf7ca5c47
Philip Reames [Fri, 18 Mar 2022 22:33:43 +0000 (15:33 -0700)]
Reapply "[SLP] Schedule only sub-graph of vectorizable instructions" (try 3)
The original commit exposed several missing dependencies (e.g. latent bugs in SLP scheduling). Most of these were fixed over the weekend and have had several days to bake. The last was fixed this morning after being noticed in manual review of test changes yesterday. See the review thread for links to each change.
Original commit message follows:
SLP currently schedules all instructions within a scheduling window which stretches from the first instruction potentially vectorized to the last. This window can include a very large number of unrelated instructions which are not being considered for vectorization. This change switches the code to only schedule the sub-graph consisting of the instructions being vectorized and their transitive users.
This has the effect of greatly reducing the amount of work performed in large basic blocks, and thus greatly improves compile time on degenerate examples. To understand the effects, I added some statistics (not planned for upstream contribution). Here's an illustration from my motivating example:
Before this patch:
704357 SLP - Number of calcDeps actions
699021 SLP - Number of schedule calls
5598 SLP - Number of ReSchedule actions
59 SLP - Number of ReScheduleOnFail actions
10084 SLP - Number of schedule resets
8523 SLP - Number of vector instructions generated
After this patch:
102895 SLP - Number of calcDeps actions
161916 SLP - Number of schedule calls
5637 SLP - Number of ReSchedule actions
55 SLP - Number of ReScheduleOnFail actions
10083 SLP - Number of schedule resets
8403 SLP - Number of vector instructions generated
I do want to highlight that there is a small difference in number of generated vector instructions. This example is hitting the bailout due to maximum window size, and the change in scheduling is slightly perturbing when and how we hit it. This can be seen in the ReScheduleOnFail counter change. Given that, I think we can safely ignore it.
The downside of this change can be seen in the large test diff. We group all vectorizable instructions together at the bottom of the scheduling region. This means that vector instructions can move quite far from their original point in code. While maybe undesirable, I don't see this as being a major problem as this pass is not intended to be a general scheduling pass.
For context, it's worth noting that the pre-scheduling that SLP does while building the vector tree is exactly the sub-graph scheduling implemented by this patch.
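The sub-graph described in the original commit message (the vectorized instructions plus their transitive users) is a plain reachability computation. A toy sketch in Python, assuming a hypothetical `users` mapping from each instruction name to the instructions that use its result; this is not the SLP vectorizer's data structure.

```python
from collections import deque


def schedulable_subgraph(users, seeds):
    """Return the seeds plus everything reachable through `users` edges.

    users: dict mapping an instruction name to the names that use its result.
    seeds: instructions selected for vectorization.
    """
    seen = set(seeds)
    work = deque(seeds)
    while work:
        inst = work.popleft()
        for user in users.get(inst, ()):
            if user not in seen:
                seen.add(user)
                work.append(user)
    return seen
```

Instructions that are unrelated to the seeds never enter the worklist, which is where the reduction in scheduling work comes from.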
Differential Revision: https://reviews.llvm.org/D118538
Christopher Bate [Fri, 25 Mar 2022 17:20:07 +0000 (17:20 +0000)]
[mlir][NVVM] Add support for nvvm mma.sync ops
This patch adds MLIR NVVM support for the various NVPTX `mma.sync`
operations. There are a number of possible data type, shape,
and other attribute combinations supported by the operation, so a
custom assembly format is added and attributes are inferred where
possible.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D122410
Jean Perier [Fri, 25 Mar 2022 17:01:50 +0000 (18:01 +0100)]
[flang][lowering] Handle zero extent case in LBOUND
Follow up of https://reviews.llvm.org/D121488. Ensure lower bounds
are `1` when the related dimension extent is zero. Note that lower
bounds from descriptors are now guaranteed to fulfill this property
after the runtime/codegen patches.
Also fixes explicit shape array extent lowering when instantiating
variables to deal with negative extent cases (issue found while testing
LBOUND edge case). This notably caused allocation crashes when dealing
with automatic arrays with reversed bounds or negative size
specification expression. The standard specifies that the extent of such
arrays is zero. This change has some ripple effect in the current lit
tests.
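The two rules above (reversed bounds give a zero extent, and a zero extent gives a lower bound of 1) can be written down directly. A minimal Python sketch for a single explicit-shape dimension; the function names are chosen for the example and do not correspond to the lowering code.

```python
def extent(lower, upper):
    """Extent of one explicit-shape dimension; reversed bounds give zero."""
    return max(0, upper - lower + 1)


def lbound(lower, upper):
    """Lower bound reported for that dimension: 1 whenever the extent is zero."""
    return lower if extent(lower, upper) > 0 else 1
```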
Add and move two helpers as part of this change:
- Add a helper to tell if a fir::ExtendedValue describes an assumed size
array (last dimension extent is unknown to the compiler, both at compile
time and runtime).
- Move and share getIntIfConstant from Character.cpp so that it can be
used elsewhere (NFC).
Differential Revision: https://reviews.llvm.org/D122467
Philip Reames [Fri, 25 Mar 2022 17:01:39 +0000 (10:01 -0700)]
[SLP] Optimize stacksave dependence handling [NFC]
After writing the commit message for 4b1bace28, I realized that the mentioned optimization was rather straightforward. We already have the code for scanning a block during region initialization, so we can simply keep track of whether we've seen a stacksave or stackrestore. If we haven't, none of these dependencies are relevant and we can avoid the relatively expensive scans entirely.
Philip Reames [Fri, 25 Mar 2022 16:09:02 +0000 (09:09 -0700)]
[SLP] Explicit track required stacksave/alloca dependency (try 3)
This is an extension of commit b7806c to handle one last case noticed in test changes for D118538. Again, this is thought to be a latent bug in the existing code, though this time I have not managed to reduce tests for the original algorithm.
The prior attempt had failed to account for this case:
%a = alloca i8
stacksave
stackrestore
store i8 0, i8* %a
If we allow '%a' to reorder into the stacksave/restore region, then the alloca will be deallocated before the use. We will have taken a well defined program, and introduced a use-after-free bug.
There's also an inverse case where the alloca originally follows the stackrestore, and we need to prevent reordering it above the restore.
Compile time wise, we potentially do an extra scan of the block for each alloca seen in a bundle. This is significantly more expensive than the stacksave rooted version and is why I'd tried to avoid this in the initial patch. There is room to optimize this (by essentially caching a "has stacksave" bit per block), but I'm leaving that to future work if it actually shows up in practice. Since allocas in bundles should be rare in practice, I suspect we can defer the complexity for a long while.
Gulfem Savrun Yeniceri [Wed, 23 Mar 2022 16:22:32 +0000 (16:22 +0000)]
[InstrProfiling] No runtime hook for unused funcs
CoverageMappingModuleGen generates a coverage mapping record
even for unused functions with internal linkage, e.g.
static int foo() { return 100; }
Clang frontend eliminates such functions, but InstrProfiling pass
still pulls in profile runtime since there is a coverage record.
Fuchsia uses runtime counter relocation, and pulling in profile
runtime for unused functions causes a linker error:
undefined hidden symbol: __llvm_profile_counter_bias.
Since 389dc94d4be7, we do not hook the profile runtime in Fuchsia for binaries
in which no translation unit has been instrumented.
This patch extends that for the instrumented binaries that
consist of only unused functions.
Differential Revision: https://reviews.llvm.org/D122336
Argyrios Kyrtzidis [Fri, 25 Mar 2022 16:47:41 +0000 (09:47 -0700)]
[Support/BLAKE3] Do manual instrumentation of `llvm_blake3_hasher_finalize` for memory sanitizer
This is to avoid false positives when using the uninstrumented assembly code implementation.
Florian Hahn [Fri, 25 Mar 2022 16:57:12 +0000 (16:57 +0000)]
[LV] Use getVectorLoopRegion to retrieve header. (NFC)
Update all places that currently assume the entry block to the plan is
also the vector loop header to use getVectorLoopRegion instead.
getVectorLoopRegion will keep doing the right thing when the pre-header
is modeled explicitly (and becomes the new entry block in the plan).
Jonas Devlieghere [Fri, 25 Mar 2022 16:47:08 +0000 (09:47 -0700)]
[lldb] Conditionalize target_link_libraries on the target
Fixes "Cannot specify link libraries for target "lldb-target-fuzzer"
which is not built by this project." Normally that's taken care of by
add_llvm_fuzzer but we need target_link_libraries for liblldb and our
utility library.
lipracer [Fri, 25 Mar 2022 16:49:14 +0000 (16:49 +0000)]
[mlir][tosa] : adding folder and canonicalizer for select
define canonicalizer and folder for tosa::select
Reviewed By: mehdi_amini, Mogball
Differential Revision: https://reviews.llvm.org/D121513
Peter Klausler [Mon, 21 Mar 2022 23:18:03 +0000 (16:18 -0700)]
[flang] Fix cycle-catcher in procedure characterization
The "seenProcs" sets passed as arguments to the procedure and dummy
procedure characterization routines need to be passed by value so that
local updates to those sets do not become permanent. They are
presently passed by reference and that has led to bogus errors about
recursively defined procedures in testing.
(It might be faster to pass the sets by reference and undo those local
updates in these functions, but that's error-prone, and the performance
difference is not expected to be detectable in practice.)
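The difference between the two passing conventions can be shown with a toy cycle check in Python. Everything here is hypothetical (the graph, the function names); it only illustrates why a shared, mutated "seen" set produces bogus recursion reports.

```python
def has_cycle(graph, proc, seen=frozenset()):
    """True if `proc` reaches itself through `graph` (dict: name -> callees)."""
    if proc in seen:
        return True
    seen = seen | {proc}  # builds a new set: the caller's `seen` is untouched
    return any(has_cycle(graph, callee, seen) for callee in graph.get(proc, ()))


def has_cycle_shared(graph, proc, seen):
    """Buggy variant: one set is mutated in place and shared by all branches."""
    if proc in seen:
        return True
    seen.add(proc)  # this update outlives the call and leaks into siblings
    return any(has_cycle_shared(graph, c, seen) for c in graph.get(proc, ()))


# A diamond: "c" is reached twice, but there is no recursion.
diamond = {"main": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
```

On `diamond`, the by-value version reports no cycle, while the shared-set version wrongly reports one when it meets "c" for the second time.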
Differential Revision: https://reviews.llvm.org/D122439
Yitzhak Mandelbaum [Mon, 21 Mar 2022 15:03:52 +0000 (15:03 +0000)]
[clang][dataflow] Add support for disabling warnings on smart pointers.
This patch provides the user with the ability to disable all checks of accesses
to optionals that are the pointees of smart pointers. Since smart pointers are
not modeled (yet), the system cannot distinguish safe from unsafe accesses to
optionals through smart pointers. This results in false positives whenever
optionals are used through smart pointers. The patch gives the user the choice
of ignoring all positives in these cases.
Differential Revision: https://reviews.llvm.org/D122143
Johannes Doerfert [Sat, 5 Mar 2022 21:14:20 +0000 (15:14 -0600)]
[OpenMP][FIX] Ensure exclusive access to the HDTT map
This patch solves two problems with the `HostDataToTargetMap` (HDTT
map) which caused races and crashes before:
1) Any access to the HDTT map needs to be exclusive access. This was not
the case for the "dump table" traversals that could collide with
updates by other threads. The new `Accessor` and `ProtectedObject`
wrappers will ensure we have a hard time introducing similar races in
the future. Note that we could allow multiple concurrent
read-accesses but that feature can be added to the `Accessor` API
later.
2) The elements of the HDTT map were `HostDataToTargetTy` objects which
meant that they could be copied/moved/deleted as the map was changed.
However, we sometimes kept pointers to these elements around after we
gave up the map lock which caused potential races again. The new
indirection through `HostDataToTargetMapKeyTy` will allow us to
modify the map while keeping the (interesting part of the) entries
valid. To offset potential cost we duplicate the ordering key of the
entry which avoids an additional indirect lookup.
We should replace more objects with "protected objects" as we go.
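The exclusive-access idea in 1) can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the class layout, `Table`, `insert_entry`, `value_of` and `table_size` are hypothetical, and only the names `ProtectedObject` and `Accessor` come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical wrapper: the object is private, so the only way to reach it
// is through an Accessor, which holds the lock for its whole lifetime.
template <typename T>
class ProtectedObject {
public:
  class Accessor {
  public:
    explicit Accessor(ProtectedObject &owner)
        : lock_(owner.mutex_), object_(owner.object_) {}
    T &operator*() { return object_; }
    T *operator->() { return &object_; }

  private:
    std::lock_guard<std::mutex> lock_; // acquired first, released last
    T &object_;
  };

private:
  std::mutex mutex_;
  T object_{};
};

using Table = std::map<std::string, int>; // stand-in for the real map type

void insert_entry(ProtectedObject<Table> &table, const std::string &key,
                  int value) {
  ProtectedObject<Table>::Accessor access(table);
  (*access)[key] = value;
}

int value_of(ProtectedObject<Table> &table, const std::string &key) {
  ProtectedObject<Table>::Accessor access(table);
  const auto it = access->find(key);
  assert(it != access->end() && "key must be present");
  return it->second; // copied out while the lock is still held
}

std::size_t table_size(ProtectedObject<Table> &table) {
  ProtectedObject<Table>::Accessor access(table);
  return access->size();
}
```

Note that `value_of` returns a copy rather than a pointer into the map, which sidesteps the dangling-pointer problem described in 2) without modeling the extra indirection.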
Differential Revision: https://reviews.llvm.org/D121057
Jonas Devlieghere [Fri, 25 Mar 2022 16:03:52 +0000 (09:03 -0700)]
[lldb] Add a fuzzer for target creation
This patch adds a generic fuzzer that interprets inputs as object files
and uses them to create a target in lldb. It is very similar to the
llvm-dwarfdump fuzzer which found a bunch of issues in libObject.
Differential revision: https://reviews.llvm.org/D122461
Tue Ly [Thu, 24 Mar 2022 22:07:46 +0000 (18:07 -0400)]
[libc] Improve the performance of expf.
Reduce the polynomial's degree from 7 down to 4.
Currently we use a degree-7 minimax polynomial on an interval of length 2^-7
around 0 to compute `expf`. Based on the suggestion of @santoshn and the RLIBM
project (https://github.com/rutgers-apl/rlibm-all/blob/main/source/float/exp.c)
and the improvement we made with `exp2f` in https://reviews.llvm.org/D122346,
it is possible to have a good polynomial of degree-4 on a subinterval of length
2^(-7) to approximate e^x.
We did try to either reduce the degree of the polynomial down to 3 or increase
the interval size to 2^(-6), but in both cases the number of exceptional values
exploded. So we settled on a degree-4 polynomial on an interval of
size 2^(-7) around 0.
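For illustration only, a sketch of how a degree-4 polynomial on an interval of length 2^(-7) can be used. The range reduction, the use of `std::exp` in place of lookup tables, and the Taylor coefficients (instead of the minimax coefficients) are all assumptions of this sketch. The name `expf_sketch` is hypothetical and the function is not correctly rounded.

```cpp
#include <cassert>
#include <cmath>

float expf_sketch(float x) {
  // The sketch skips overflow, underflow, and NaN handling.
  assert(std::fabs(x) < 80.0f && "input outside the range this sketch handles");
  const double ln2 = 0.6931471805599453;
  const double xd = x;
  // x = k*ln2 + r with |r| <= ln2/2.
  const int k = static_cast<int>(std::nearbyint(xd / ln2));
  const double r = xd - k * ln2;
  // r = j/128 + lo with |lo| <= 2^-8: an interval of length 2^-7 around 0.
  const int j = static_cast<int>(std::nearbyint(r * 128.0));
  const double lo = r - j / 128.0;
  // Degree-4 polynomial for e^lo (Taylor coefficients, not minimax ones).
  const double p =
      1.0 + lo * (1.0 + lo * (0.5 + lo * (1.0 / 6.0 + lo * (1.0 / 24.0))));
  // e^x = 2^k * e^(j/128) * e^lo; std::exp stands in for a lookup table.
  return static_cast<float>(std::ldexp(std::exp(j / 128.0) * p, k));
}
```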
Reviewed By: sivachandra, zimmermann6, santoshn
Differential Revision: https://reviews.llvm.org/D122418
Dávid Bolvanský [Fri, 25 Mar 2022 16:12:53 +0000 (17:12 +0100)]
[NFCI] Fix set-but-unused warning in DenseMap.h in some configurations
Dávid Bolvanský [Fri, 25 Mar 2022 16:10:25 +0000 (17:10 +0100)]
[NFCI] Fix set-but-unused warning in AArch64AsmParser.cpp
Hongtao Yu [Thu, 24 Mar 2022 23:30:20 +0000 (16:30 -0700)]
[CSSPGO] Turn on profi and ext-tsp when using probe-based profile.
Probe-based profile leads to a better performance when combined with profi and ext-tsp block layout. I'm turning them on by default.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D122442
Philip Reames [Fri, 25 Mar 2022 16:02:23 +0000 (09:02 -0700)]
[slp] Factor out a lambda to avoid duplicating code a third time in upcoming patch [nfc]
Ben Shi [Thu, 27 Jan 2022 12:51:52 +0000 (12:51 +0000)]
[AVR][NFC] Fix incorrect register states in expanding pseudo instructions
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D118354
LLVM GN Syncbot [Fri, 25 Mar 2022 15:50:53 +0000 (15:50 +0000)]
[gn build] Port 39b80c8380c8
Philip Reames [Fri, 25 Mar 2022 15:48:08 +0000 (08:48 -0700)]
[test,slp] Add another stacksave related dependence test
Richard [Mon, 3 Jan 2022 16:24:12 +0000 (09:24 -0700)]
[clang-tidy] Add modernize-macro-to-enum check
This check performs basic analysis of macros and replaces them
with an anonymous unscoped enum. Using an unscoped anonymous enum
ensures that everywhere the macro token was used previously, the
enumerator name may be safely used.
Potential macros for replacement must meet the following constraints:
- Macros must expand only to integral literal tokens. The unary
operators plus, minus and tilde are recognized to allow for positive,
negative and bitwise negated integers.
- Macros must be defined on sequential source file lines, or with
only comment lines in between macro definitions.
- Macros must all be defined in the same source file.
- Macros must not be defined within a conditional compilation block.
- Macros must not be defined adjacent to other preprocessor directives.
- Macros must not be used in preprocessor conditions.
Each cluster of macros meeting the above constraints is presumed to
be a set of values suitable for replacement by an anonymous enum.
From there, a developer can give the anonymous enum a name and
continue refactoring to a scoped enum if desired. Comments on the
same line as a macro definition or between subsequent macro definitions
are preserved in the output. No formatting is assumed in the provided
replacements.
The check cppcoreguidelines-macro-to-enum is an alias for this check.
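As a sketch of the transformation described above: the macro names and the helper `red_channel` are illustrative assumptions, and the exact text of the replacement the check emits may differ.

```cpp
// Before the check runs (hypothetical input, shown as comments):
//   #define RED   0xFF0000
//   #define GREEN 0x00FF00
//   #define BLUE  0x0000FF

// After: an anonymous unscoped enum, so existing uses of RED, GREEN and
// BLUE keep compiling unchanged.
enum {
  RED = 0xFF0000,
  GREEN = 0x00FF00,
  BLUE = 0x0000FF
};

// Code that used the macro names as plain integers still works.
constexpr int red_channel(int rgb) { return (rgb & RED) >> 16; }
```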
Fixes #27408
Differential Revision: https://reviews.llvm.org/D117522
Simon Pilgrim [Fri, 25 Mar 2022 15:39:08 +0000 (15:39 +0000)]
[InstCombine] SimplifyDemandedUseBits - remove ashr node if we only demand known sign bits
We already do this for SelectionDAG, but we're missing it here.
Noticed while re-triaging PR21929
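A small sketch of why the shift can be dropped. The condition used here (the operand has at least as many sign bits as there are bits from the lowest demanded bit upward) is my reading of the title, not a quote of the patch, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Count of leading bits that equal the sign bit, including the sign bit.
unsigned num_sign_bits(std::uint32_t bits) {
  const std::uint32_t sign = bits >> 31;
  unsigned n = 1;
  while (n < 32 && ((bits >> (31 - n)) & 1u) == sign)
    ++n;
  return n;
}

unsigned count_trailing_zeros(std::uint32_t mask) {
  assert(mask != 0 && "demanded mask must not be empty");
  unsigned n = 0;
  while ((mask & 1u) == 0) {
    mask >>= 1;
    ++n;
  }
  return n;
}

// Arithmetic shift right on a 32-bit pattern, for shift amounts in [0, 31].
std::uint32_t ashr_bits(std::uint32_t bits, unsigned shift) {
  std::uint32_t result = bits >> shift;
  if ((bits >> 31) != 0 && shift != 0)
    result |= ~std::uint32_t{0} << (32 - shift);
  return result;
}

// True when every demanded bit of the operand is already a copy of its sign
// bit, so an arithmetic shift right cannot change any demanded bit.
bool ashr_is_redundant(std::uint32_t bits, std::uint32_t demanded) {
  const unsigned high_demanded = 32 - count_trailing_zeros(demanded);
  return num_sign_bits(bits) >= high_demanded;
}
```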
Differential Revision: https://reviews.llvm.org/D122340
Joseph Huber [Thu, 24 Mar 2022 23:32:05 +0000 (19:32 -0400)]
[OpenMP] Replace device kernel linkage with weak_odr
Currently the device kernels all have weak linkage to prevent linkage
errors on multiple definitions. However, this prevents some optimizations
from adequately analyzing them because of the nature of weak linkage.
This patch replaces the weak linkage with weak_odr linkage so we can
statically assert that multiple declarations of the same kernel will
have the same definition.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122443
Dmitry Preobrazhensky [Fri, 25 Mar 2022 14:52:13 +0000 (17:52 +0300)]
[AMDGPU][DOC][NFC] Added GFX1030 assembler syntax description
Iain Sandoe [Thu, 24 Mar 2022 13:10:38 +0000 (13:10 +0000)]
[C++20][Modules] Correct an assert for modules-ts.
When adding the support for modules partitions we added an assert that the
actual status of Global Module Fragments matches the state machine that is
driven by the module; keyword.
That does not apply to the modules-ts case, where there is an implicit GMF.
Differential Revision: https://reviews.llvm.org/D122394
Adam Czachorowski [Wed, 16 Mar 2022 13:27:27 +0000 (14:27 +0100)]
[clang] Do not crash on arrow operator on dependent type.
There seems to be more than one way to get to that state. I included two
example cases in the test; both were noticed recently.
There is room for improvement, for example by creating RecoveryExpr in
place of the bad initializer, but for now let's stop the crashes.
Differential Revision: https://reviews.llvm.org/D121824
Johannes Doerfert [Thu, 24 Mar 2022 20:08:53 +0000 (15:08 -0500)]
Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and
reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test
changes.
Kiran Chandramohan [Fri, 25 Mar 2022 14:22:49 +0000 (14:22 +0000)]
[Flang] Lower achar intrinsic
The intrinsic returns the character located at the position requested
in the ASCII sequence. The intrinsic is lowered to inline FIR code.
This is part of the upstreaming effort from the fir-dev branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D122480
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Jay Foad [Fri, 25 Mar 2022 09:57:28 +0000 (09:57 +0000)]
[AMDGPU] Move VOP3 classes into VOPInstructions.td. NFC.
These classes are also used by VOP1/2/C instructions.
Differential Revision: https://reviews.llvm.org/D122470
Simon Pilgrim [Fri, 25 Mar 2022 13:41:59 +0000 (13:41 +0000)]
[X86] Add test showing failure to fold multiple constant args in ADC
As noticed on Issue #35256
Javier Setoain [Thu, 3 Jun 2021 09:34:44 +0000 (10:34 +0100)]
[mlir][Vector] Add integration tests for ArmSVE
Running these integration tests requires access to an
SVE-enabled CPU or an emulator with SVE support. When using
an emulator, aarch64 versions of lli and the MLIR C Runner Utils Library
are also required.
Differential Revision: https://reviews.llvm.org/D104517
Sam McCall [Fri, 25 Mar 2022 13:17:31 +0000 (14:17 +0100)]
[pseudo] Use box-drawing chars to prettify debug dumps. NFC
Roman Lebedev [Fri, 25 Mar 2022 12:11:49 +0000 (15:11 +0300)]
[SimplifyCFG] `FoldBranchToCommonDest()`: allow branch-on-select
This whole check is bogus, it's some kind of a profitability check.
For now, simply extend it to not only allow branch-on-binary-ops,
but also on poison-safe logic ops.
Refs. https://github.com/llvm/llvm-project/issues/53861
Refs. https://github.com/llvm/llvm-project/issues/54553
Roman Lebedev [Fri, 25 Mar 2022 12:09:50 +0000 (15:09 +0300)]
[NFC][SimplifyCFG] Add test from https://github.com/llvm/llvm-project/issues/53861
Simon Pilgrim [Fri, 25 Mar 2022 12:52:53 +0000 (12:52 +0000)]
[X86] combineADC - pull out repeated dyn_cast<ConstantSDNode> calls. NFC.
Louis Dionne [Wed, 23 Mar 2022 21:02:07 +0000 (17:02 -0400)]
[libc++] Remove the _LIBCPP_BOOL_CONSTANT macro
I suspect this is a remnant of the times when we were not comfortable
using Clang's C++11/14 extensions everywhere, but now we do, so we can
use _BoolConstant instead and get rid of the macro.
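Roughly, the difference between the two spellings looks like this. The names below are hypothetical stand-ins (the library uses reserved identifiers), and the macro body is an assumption about what such a macro would expand to.

```cpp
#include <type_traits>

// Macro form: works without alias templates.
#define BOOL_CONSTANT_MACRO(b) std::integral_constant<bool, (b)>

// Alias-template form: needs C++11 alias templates, but is a real type name
// that obeys scoping and shows up properly in diagnostics.
template <bool Value>
using BoolConstant = std::integral_constant<bool, Value>;

// Both spellings name the same type.
static_assert(
    std::is_same<BOOL_CONSTANT_MACRO(true), BoolConstant<true>>::value,
    "macro and alias template agree");
```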
Differential Revision: https://reviews.llvm.org/D122351
Sergei Lebedev [Fri, 25 Mar 2022 12:38:13 +0000 (13:38 +0100)]
Updated MLIR type stubs to work with pytype
The diff is big, but there are in fact only three kinds of changes:
* ir.py had a syntax error -- unterminated [
* forward references are unnecessary in .pyi files (see https://github.com/python/typeshed/blob/9a76b13127ffa8365431dcc105fc111cdd267e7e/CONTRIBUTING.md?plain=1#L450-L454)
* methods defined via .def_static() are now decorated with @staticmethod
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122300
Aakanksha [Fri, 25 Mar 2022 11:33:25 +0000 (11:33 +0000)]
Prevent comparison with wider type in loop condition
This change fixes the code violations flagged in AMD compute CodeQL scan - "comparison-with-wider-type"
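A hypothetical illustration of the pattern this rule flags. The code is not taken from the change, and widening the counter is only one possible fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

long long sum_first(const std::vector<int> &values, std::size_t count) {
  assert(count <= values.size() && "count exceeds the vector size");
  long long total = 0;
  // Flagged form: for (unsigned short i = 0; i < count; ++i)
  // With a 16-bit unsigned short, i wraps around after 65535, so the loop
  // never terminates once count exceeds 65535.
  for (std::size_t i = 0; i < count; ++i) // counter as wide as the bound
    total += values[i];
  return total;
}
```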
Differential Revision: https://reviews.llvm.org/D122447
Aaron Ballman [Fri, 25 Mar 2022 11:13:26 +0000 (07:13 -0400)]
Fix clang Sphinx build bot
Simon Pilgrim [Fri, 25 Mar 2022 11:06:41 +0000 (11:06 +0000)]
[SDAG] enable binop identity constant folds for multiplies
Add mul to the list of ops that we canonicalize with a select to expose an identity merge
Differential Revision: https://reviews.llvm.org/D122071
Benjamin Kramer [Fri, 25 Mar 2022 11:04:43 +0000 (12:04 +0100)]
[bazel] Add missing dependency after a75a46db89
Benjamin Kramer [Fri, 25 Mar 2022 11:02:36 +0000 (12:02 +0100)]
[bazel] Add missing dependency after a75a46db89
Javier Setoain [Thu, 2 Dec 2021 15:09:33 +0000 (15:09 +0000)]
[mlir][Sparse] Add option for VLA sparsification
Use "enable-vla-vectorization=vla" to generate vector-length-agnostic
loops during vectorization. This option works for vectorization strategy 2.
Differential Revision: https://reviews.llvm.org/D118379
Simon Pilgrim [Fri, 25 Mar 2022 10:49:04 +0000 (10:49 +0000)]
[X86] combineAdd - fold ADD(ADC(Y,0,W),X) -> ADC(X,Y,W)
This also exposed a missed ADC canonicalization of constant ops to the RHS
Javier Setoain [Wed, 26 Jan 2022 15:01:39 +0000 (15:01 +0000)]
[mlir][Vector] Enable create_mask for scalable vectors
The way vector.create_mask is currently lowered is
vector-length-dependent, and therefore incompatible with scalable vector
types. This patch adds an alternative lowering path for create_mask
operations that return a scalable vector mask.
Differential Revision: https://reviews.llvm.org/D118248
Thomas Symalla [Tue, 1 Feb 2022 09:28:18 +0000 (10:28 +0100)]
[AMDGPU] Improve v_cmpx usage on GFX10.3.
On GFX10.3 targets, the following instruction sequence
v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the
following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.
Same as D119696 including a buildbot and MIR test fix.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D122332
Simon Pilgrim [Fri, 25 Mar 2022 10:27:16 +0000 (10:27 +0000)]
[AArch64] isProfitableToHoist - remove nullptr test
User is dereferenced on the main codepath so the null test is likely superfluous
Simon Pilgrim [Fri, 25 Mar 2022 10:25:04 +0000 (10:25 +0000)]
[Utils] stripDebugifyMetadata - use cast<> instead of dyn_cast_or_null<> to avoid dereference of nullptr
The pointer is dereferenced immediately, so assert the cast is correct instead of returning nullptr
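The reasoning can be sketched without LLVM headers. `checked_cast`, `Node` and `ConstantNode` are hypothetical stand-ins built on `dynamic_cast`; they are not LLVM's `cast<>` machinery. The point is that a cast which asserts on mismatch fails loudly at the cast, instead of handing back a nullptr that is dereferenced on the next line.

```cpp
#include <cassert>

struct Node {
  virtual ~Node() = default;
};

struct ConstantNode : Node {
  int value = 0;
};

// Hypothetical checked cast: asserts that the cast is valid rather than
// returning nullptr for the caller to (forget to) test.
template <typename To, typename From>
To *checked_cast(From *ptr) {
  To *result = dynamic_cast<To *>(ptr);
  assert(result && "checked_cast to an incompatible type");
  return result;
}

// The result is dereferenced immediately, so a null-returning cast would
// only turn a type mismatch into a null dereference.
int read_value(Node *node) {
  return checked_cast<ConstantNode>(node)->value;
}
```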
Simon Pilgrim [Fri, 25 Mar 2022 10:23:22 +0000 (10:23 +0000)]
[AsmPrinter] AIXException::endFunction - use cast<> instead of dyn_cast<> to avoid dereference of nullptr
The pointer is used immediately inside the getSymbol() call, so assert the cast is correct instead of returning nullptr
Simon Pilgrim [Fri, 25 Mar 2022 10:22:05 +0000 (10:22 +0000)]
[clang] CheckSizelessVectorOperands - use castAs<> instead of getAs<> to avoid dereference of nullptr
Move the only uses of the cast to where they are dereferenced.