Slava Gurevich [Wed, 27 Jul 2022 19:30:19 +0000 (12:30 -0700)]
[LLDB] Fix missing return value in SBBreakpointLocation::GetQueueName()
- Fix a typo in the function that never returns a significant value
- Add unit tests for the getters/setters in SBBreakpointLocation
- Verified the newly added unit test succeeds after the fix:
llvm-lit -sv lldb/test/API/functionalities/breakpoint/breakpoint_locations/TestBreakpointLocations.py
Differential Revision: https://reviews.llvm.org/D130660
Shafik Yaghmour [Thu, 28 Jul 2022 22:26:15 +0000 (15:26 -0700)]
[Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values
DR2338 clarified that it was undefined behavior to set the value outside the
range of the enumeration values for an enum without a fixed underlying type.
We should diagnose this in a constant expression context.
Differential Revision: https://reviews.llvm.org/D130058
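A minimal sketch of the rule described above. The enums `Level` and `Fixed` and the helper `toInt` are hypothetical names chosen for illustration; the out-of-range line is left commented out because it is the kind of code the new diagnostic is expected to reject.

```cpp
#include <cassert>

// Hypothetical enum without a fixed underlying type. Its enumerators are 0
// and 3, so its values are those of the smallest bit-field that can hold
// them: 0 through 3.
enum Level { Low = 0, High = 3 };

// In range: 2 is not a named enumerator, but it is still a valid value.
constexpr Level Mid = static_cast<Level>(2);

// Out of range: this conversion has undefined behavior, so it is not a
// constant expression and should be diagnosed in this context.
// constexpr Level Bad = static_cast<Level>(4);

// With a fixed underlying type, every value of that type is valid.
enum Fixed : int { FLow = 0, FHigh = 3 };
constexpr Fixed Wide = static_cast<Fixed>(4);

// Converts a Level to int, checking the range at run time.
int toInt(Level L) {
  assert(L >= Low && L <= High);
  return L;
}
```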
Felipe de Azevedo Piovezan [Thu, 28 Jul 2022 21:56:52 +0000 (14:56 -0700)]
[GlobalISel] Handle nullptr constants in dbg.value
Currently, the LLVM IR -> MIR translator fails to translate dbg.values
whose first argument is a null pointer. However, in other portions of
the code, such pointers are always lowered to the constant zero, for
example see IRTranslator::Translate(Constant, Register).
This patch addresses the limitation by following the same approach of
lowering null pointers to zero.
A prior test was checking that null pointers were always lowered to
$noreg; this test is changed to check for zero, and the previous
behavior is now checked by introducing a dbg.value whose first argument
is the address of a global variable.
Differential Revision: https://reviews.llvm.org/D130721
Jez Ng [Thu, 28 Jul 2022 21:55:12 +0000 (17:55 -0400)]
[lld-macho] `-exported_symbols` should hide symbols before LTO runs
We were previously doing it after LTO, which did have the desired effect
of having the un-exported symbols marked as private extern in the final
output binary, but doing it before LTO creates more optimization
opportunities.
One observable difference is that LTO can now elide un-exported symbols
entirely, so they may not even be present as private externs in the
output.
This is also what ld64 implements.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D130429
Felipe de Azevedo Piovezan [Thu, 28 Jul 2022 21:53:47 +0000 (14:53 -0700)]
[GlobalISel][nfc] Remove unnecessary cast
The getOperand method already returns a Constant when it is called on
a ConstantExpression, as such the cast is not needed. To prevent a type
mismatch between the different return statements of the lambda, the
lambda return type is explicitly provided.
Differential Revision: https://reviews.llvm.org/D130719
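The return-type point can be sketched in isolation. `Node`, `ExprNode` and `pickOperand` below are hypothetical stand-ins, not the LLVM classes touched by the patch; the sketch only shows why a lambda whose return statements yield different pointer types needs an explicit return type.

```cpp
#include <cassert>

// Hypothetical base/derived pair standing in for a class hierarchy.
struct Node {};
struct ExprNode : Node {};

const Node *pickOperand(bool UseExpr, const Node *N, const ExprNode *E) {
  assert(N && E);
  // Without "-> const Node *" the two return statements would deduce
  // different types (const ExprNode * and const Node *), and return type
  // deduction for the lambda would fail.
  auto Pick = [&]() -> const Node * {
    if (UseExpr)
      return E; // implicit derived-to-base conversion
    return N;
  };
  return Pick();
}
```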
Jacques Pienaar [Thu, 28 Jul 2022 21:43:13 +0000 (14:43 -0700)]
[mlir] Introduce DefaultValuedOptionalAttr
Currently DefaultValuedAttr is, confusingly, both default valued and
optional. That was an artifact of development and is a longstanding TODO
to address. Add a new attribute that matches this behavior for cases where
that is actually the desired behavior before addressing the TODO (i.e., this
is an incremental step towards fixing DefaultValuedAttr).
Differential Revision: https://reviews.llvm.org/D130679
Anshil Gandhi [Thu, 28 Jul 2022 19:27:42 +0000 (13:27 -0600)]
[AMDGPU][Scheduler] Avoid initializing Register pressure tracker when tracking is disabled
When register pressure tracking is disabled, the scheduler attempts to load
pressures at SReg_32 and VGPR_32. This causes an index out of bounds error.
This patch fixes this issue by disabling the initialization of RPTracker
when not needed. NFC
Reviewed By: rampitec, kerbowa, arsenm
Differential Revision: https://reviews.llvm.org/D129322
Denis Fatkulin [Thu, 28 Jul 2022 21:28:46 +0000 (00:28 +0300)]
[clang-format] Missing space between trailing return type 'auto' and left brace
There is no space between the trailing return type `auto` and the left brace `{`.
The simplest examples of code to reproduce the issue:
```
[]() -> auto {}
```
and
```
auto foo() -> auto {}
```
Depends on D130299
Reviewed By: HazardyKnusperkeks, curdeius, owenpan
Differential Revision: https://reviews.llvm.org/D130417
Markus Böck [Thu, 28 Jul 2022 20:41:08 +0000 (22:41 +0200)]
[mlir] Add Type::isa_and_nonnull
Greg Clayton [Thu, 28 Jul 2022 20:31:41 +0000 (13:31 -0700)]
Cache the value for absolute path in FileSpec.
Checking if a path is absolute can be expensive and currently the result is not cached in the FileSpec object. This patch adds caching and also code to clear the cache if the file is modified.
Differential Revision: https://reviews.llvm.org/D130396
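The caching idea can be sketched as follows. `PathSpec` is a hypothetical stand-in and not the real lldb FileSpec; the leading-slash test and the compute counter are assumptions made only so the example is observable.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a path object with a cached "is absolute" flag.
class PathSpec {
public:
  explicit PathSpec(std::string P) : Path(std::move(P)) {}

  // All mutation goes through a setter so the cache can be invalidated.
  void SetPath(std::string NewPath) {
    Path = std::move(NewPath);
    CacheValid = false;
  }

  bool IsAbsolute() const {
    if (!CacheValid) {
      ++ComputeCount; // counts how often the check really runs
      CachedAbsolute = !Path.empty() && Path.front() == '/';
      CacheValid = true;
    }
    return CachedAbsolute;
  }

  int GetComputeCount() const { return ComputeCount; }

private:
  std::string Path;
  mutable bool CacheValid = false;
  mutable bool CachedAbsolute = false;
  mutable int ComputeCount = 0;
};
```

Repeated queries reuse the cached answer, and only a call to the setter forces the check to run again.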
Greg Clayton [Tue, 26 Jul 2022 06:29:30 +0000 (23:29 -0700)]
[NFC] Improve FileSpec internal APIs and usage in preparation for adding caching of resolved/absolute.
Resubmission of https://reviews.llvm.org/D130309 with the 2 patches that fixed the linux buildbot, and new windows fixes.
The FileSpec APIs allow users to modify instance variables directly by getting a non const reference to the directory and filename instance variables. This makes it impossible to control all of the times the FileSpec object is modified so that we can clear cached member variables such as m_resolved and, with an upcoming patch, the cached result of whether the file is relative or absolute. This patch modifies the APIs of FileSpec so no one can modify the directory or filename instance variables directly, by adding set accessors and by removing the get accessors that are non const.
Many clients were using FileSpec::GetCString(...) which returned a unique C string from a ConstString'ified version of the result of GetPath() which returned a std::string. This caused many locations to use this convenient function incorrectly and could cause many strings to be added to the constant string pool that didn't need to be. Most clients were converted to using FileSpec::GetPath().c_str() when possible. Other clients were modified to use the newly renamed version of this function which returns an actual ConstString:
ConstString FileSpec::GetPathAsConstString(bool denormalize = true) const;
This avoids the issue where people were getting an already uniqued "const char *" that came from a ConstString only to put the "const char *" back into a "ConstString" object. By returning the ConstString instead of a "const char *" clients can be more efficient with the result.
The patch:
- Removes the non const GetDirectory() and GetFilename() get accessors
- Adds set accessors to replace the above functions: SetDirectory() and SetFilename().
- Adds ClearDirectory() and ClearFilename() to replace usage of the FileSpec::GetDirectory().Clear()/FileSpec::GetFilename().Clear() call sites
- Fixed all incorrect usage of FileSpec::GetCString() to use FileSpec::GetPath().c_str() where appropriate, and updated other call sites that wanted a ConstString to use the newly returned ConstString appropriately and efficiently.
Differential Revision: https://reviews.llvm.org/D130549
David Blaikie [Thu, 28 Jul 2022 20:21:55 +0000 (20:21 +0000)]
llvm-dwp: Include dwo name even when the input is a dwo
This still only includes the dwo name if it's in the DW_AT_dwo_name
attribute in the split unit - though it could be improved/modified to
use the dwo name from the command line (if linking raw dwo files) or
retrieved from the DW_AT_dwo_name in the executable (when using -e).
It's useful in any case because you might have a large command line with
many files and knowing exactly which dwo files are relevant will
simplify debugging, but especially with '-e' when you didn't pass the
dwo files explicitly in the first place it would be quite non-obvious
where the duplicate units are coming from.
Mats Petersson [Wed, 6 Jul 2022 12:38:47 +0000 (13:38 +0100)]
[flang]Fix incorrect array type transformation
When an array is defined with "unknown" size, such as fir.array<2x?x5xi32>,
it should be converted to llvm.array<10 x i32>. The code so far has
been converting it to llvm.ptr<i32>.
By using a different function that checks whether there are starting
constant dimensions, rather than whether ALL dimensions are constant, it
now produces the correct array form.
Some tests have been updated, so they are now checking the new behaviour
rather than the old behaviour - so there's no need to add further tests
for this particular scenario.
This was originally found when compiling Spec 17 code, where an assert
in a GepOP was hit. That is bug #56141, which this change fixes.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D129196
Daniele Vettorel [Thu, 28 Jul 2022 19:53:37 +0000 (19:53 +0000)]
Add `llvm-dwarfutil` to Bazel targets
Adds support for building the `llvm-dwarfutil` tool with Bazel
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D130720
Adrian Kuegel [Thu, 28 Jul 2022 19:14:57 +0000 (21:14 +0200)]
[mlir][Complex] Change complex.number attribute type to ComplexType.
It is more useful to use ComplexType as type of the attribute than to
use the element type as attribute type. This means when using this
attribute in complex::ConstantOp, we just need to check whether
the types match.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D130703
Ben Langmuir [Fri, 15 Jul 2022 17:26:19 +0000 (10:26 -0700)]
[clang][deps] Include canonical invocation in ContextHash
The "strict context hash" is insufficient to identify module
dependencies during scanning, leading to different module build commands
being produced for a single module, and non-deterministically choosing
between them. This commit switches to hashing the canonicalized
`CompilerInvocation` of the module. By hashing the invocation we are
converting these from correctness issues to performance issues, and we
can then incrementally improve our ability to canonicalize
command-lines.
This change can cause a regression in the number of modules needed. Of
the 4 projects I tested, 3 had no regression, but 1, which was
clang+llvm itself, had a 66% regression in number of modules (4%
regression in total invocations). This is almost entirely due to
differences between -W options across targets. Of this, 25% of the
additional modules are system modules, which we could avoid if we
canonicalized -W options when -Wsystem-headers is not present --
unfortunately this is non-trivial due to some warnings being enabled in
system headers by default. The rest of the additional modules are mostly
real differences in potential warnings, reflecting incorrect behaviour
in the current scanner.
There were also a couple of differences due to `-DFOO`
`-fmodule-ignore-macro=FOO`, which I fixed here.
Since the output paths for the module depend on its context hash, we
hash the invocation before filling in outputs, and rely on the build
system to always return the same output paths for a given module.
Note: since the scanner itself uses an implicit modules build, there can
still be non-determinism, but it will now present as different
module+hashes rather than different command-lines for the same
module+hash.
Differential Revision: https://reviews.llvm.org/D129884
Chris Bieneman [Fri, 15 Jul 2022 21:03:28 +0000 (16:03 -0500)]
[HLSL] Add RWBuffer default constructor
This fills out the default constructor for RWBuffer to assign the
handle with the result of __builtin_hlsl_create_handle which we can
then treat as a pointer to the resource data through the mid-level of
the compiler.
Depends on D130016
Differential Revision: https://reviews.llvm.org/D130017
Jon Chesterfield [Thu, 28 Jul 2022 19:00:01 +0000 (20:00 +0100)]
[openmp][amdgpu] Tear down amdgpu plugin accurately
Moves DeviceInfo global to heap to accurately control lifetime.
Moves calls from libomptarget to deinit_plugin later, plugins need to stay
alive until very shortly before libomptarget is destructed.
Leaving the deinit_plugin calls where initially inserted hits a use after
free in the dynamic_module.c offloading test (verified with valgrind
that the new location is sound with respect to this).
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D130714
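A rough sketch of the lifetime pattern described above, under stated assumptions: `DeviceState`, `initPlugin`, `deinitPlugin`, `pluginIsAlive` and `deviceState` are hypothetical names, and the real DeviceInfo type and plugin entry points are not reproduced here.

```cpp
#include <cassert>

// Hypothetical plugin state; stands in for the real device bookkeeping.
struct DeviceState {
  int NumDevices = 0;
};

// A pointer instead of a global object: nothing is destroyed implicitly at
// static-destruction time, so teardown order is under the plugin's control.
static DeviceState *State = nullptr;

void initPlugin(int NumDevices) {
  assert(State == nullptr && "plugin initialized twice");
  State = new DeviceState();
  State->NumDevices = NumDevices;
}

void deinitPlugin() {
  delete State;
  State = nullptr;
}

bool pluginIsAlive() { return State != nullptr; }

// Reaching the state through a function gives one place to check liveness.
DeviceState &deviceState() {
  assert(State && "plugin used after deinit");
  return *State;
}
```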
Alexey Lapshin [Thu, 28 Jul 2022 16:20:58 +0000 (19:20 +0300)]
[Reland][Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections.
Current DWARFLinker implementation does not support some debug sections
(mainly DWARF v5 sections). This patch adds diagnostic for such sections.
The warning would be displayed for critical (such that could not be removed)
sections and the source file would be skipped. Other unsupported sections
would be removed and a warning message would be displayed. The zero exit
status would be returned for both cases.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D123623
Austin Kerbow [Thu, 28 Jul 2022 17:53:43 +0000 (10:53 -0700)]
[AMDGPU] Add isMeta flag to SCHED_GROUP_BARRIER
Fangrui Song [Thu, 28 Jul 2022 17:57:56 +0000 (10:57 -0700)]
[MC][test] Rename two --compress-debug-sections=zlib tests
To be clearer when zstd support is added.
River Riddle [Thu, 28 Jul 2022 09:40:08 +0000 (02:40 -0700)]
[mlir:SubElementsInterface] Add support for "skipping" when replacing attributes/types
This is used to fix a bug in SymbolTable::replaceAllSymbolUses where we replace symbols that
we shouldn't.
Differential Revision: https://reviews.llvm.org/D130693
Fangrui Song [Thu, 28 Jul 2022 17:45:53 +0000 (10:45 -0700)]
[llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")
Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")
Differential Revision: https://reviews.llvm.org/D130458
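For reference, the two ch_type values can be sketched as constants. The value 2 for ELFCOMPRESS_ZSTD comes from the message above and 1 is the existing zlib value; the helper `compressionName` is a hypothetical name and not part of llvm-objcopy.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// ch_type values of a compressed section header in the generic ABI.
enum : std::uint32_t { ELFCOMPRESS_ZLIB = 1, ELFCOMPRESS_ZSTD = 2 };

// Hypothetical helper mapping a ch_type value to a printable name.
const char *compressionName(std::uint32_t ChType) {
  switch (ChType) {
  case ELFCOMPRESS_ZLIB:
    return "zlib";
  case ELFCOMPRESS_ZSTD:
    return "zstd";
  default:
    return "unknown";
  }
}
```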
Austin Kerbow [Mon, 13 Jun 2022 15:00:19 +0000 (08:00 -0700)]
[AMDGPU] Add amdgcn_sched_group_barrier builtin
This builtin allows the creation of custom scheduling pipelines on a per-region
basis. Like the sched_barrier builtin this is intended to be used either for
testing, in situations where the default scheduler heuristics cannot be
improved, or in critical kernels where users are trying to get performance that
is close to handwritten assembly. Obviously using these builtins will require
extra work from the kernel writer to maintain the desired behavior.
The builtin can be used to create groups of instructions called "scheduling
groups" where ordering between the groups is enforced by the scheduler.
__builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter
is a mask that determines the types of instructions that you would like to
synchronize around and add to a scheduling group. These instructions will be
selected from the bottom up starting from the sched_group_barrier's location
during instruction scheduling. The second parameter is the number of matching
instructions that will be associated with this sched_group_barrier. The third
parameter is an identifier which is used to describe what other
sched_group_barriers should be synchronized with. Note that multiple
sched_group_barriers must be added in order for them to be useful since they
only synchronize with other sched_group_barriers. Only "scheduling groups" with
a matching third parameter will have any enforced ordering between them.
As an example, the code below tries to create a pipeline of 1 VMEM_READ
instruction followed by 1 VALU instruction followed by 5 MFMA instructions...
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0);
// 1 VALU
__builtin_amdgcn_sched_group_barrier(2, 1, 0);
// 5 MFMA
__builtin_amdgcn_sched_group_barrier(8, 5, 0);
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0);
// 3 VALU
__builtin_amdgcn_sched_group_barrier(2, 3, 0);
// 2 VMEM_WRITE
__builtin_amdgcn_sched_group_barrier(64, 2, 0);
Reviewed By: jrbyrnes
Differential Revision: https://reviews.llvm.org/D128158
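The three parameters can be modeled on the host to make the grouping rule concrete. This is only a sketch: `SchedGroup`, `maskName` and `describePipeline` are hypothetical names, the mask values are limited to the four that appear in the example above, and nothing here calls the actual builtin.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One sched_group_barrier call: (mask, number of instructions, sync id).
struct SchedGroup {
  unsigned Mask;
  unsigned Size;
  unsigned SyncID;
};

// Only the mask values used in the commit message's example are modeled.
const char *maskName(unsigned Mask) {
  switch (Mask) {
  case 2:
    return "VALU";
  case 8:
    return "MFMA";
  case 32:
    return "VMEM_READ";
  case 64:
    return "VMEM_WRITE";
  default:
    return "OTHER";
  }
}

// Renders the requested pipeline for one sync id, in program order.
std::string describePipeline(const std::vector<SchedGroup> &Groups,
                             unsigned SyncID) {
  std::string Out;
  for (const SchedGroup &G : Groups) {
    if (G.SyncID != SyncID)
      continue; // groups are only ordered against groups with the same id
    if (!Out.empty())
      Out += " -> ";
    Out += std::to_string(G.Size) + " " + maskName(G.Mask);
  }
  return Out;
}
```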
Sunho Kim [Thu, 28 Jul 2022 17:37:16 +0000 (02:37 +0900)]
[clang-repl] Support destructors of global objects.
Supports destructors of global objects by properly calling jitdylib deinitialize which calls the global dtors of ir modules.
This supersedes https://reviews.llvm.org/D127945. There was an issue when calling deinitialize on windows but it got fixed by https://reviews.llvm.org/D128037.
Reviewed By: v.g.vassilev
Differential Revision: https://reviews.llvm.org/D128589
Xing Xue [Thu, 28 Jul 2022 17:17:12 +0000 (13:17 -0400)]
[libc++][AIX] Use non-unique implementation for typeinfo comparison
Summary:
The AIX linker does not merge typeinfos when shared libraries are involved, which causes address comparison to fail although the types are the same. This patch changes to use the non-unique implementation for typeinfo comparison for AIX.
Reviewed by: hubert.reinterpretcast, philnik, libc++
Differential Revision: https://reviews.llvm.org/D130715
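The difference between the two comparison strategies can be sketched like this. `TypeRecord`, `equalUnique` and `equalNonUnique` are hypothetical stand-ins, not the libc++ implementation; the two separate name buffers simulate typeinfos that the linker did not merge.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a typeinfo object: just a pointer to its name.
struct TypeRecord {
  const char *Name;
};

// "Unique" strategy: assumes one copy per type, so addresses can be compared.
bool equalUnique(const TypeRecord &A, const TypeRecord &B) {
  return A.Name == B.Name;
}

// "Non-unique" strategy: falls back to comparing the name strings when the
// addresses differ.
bool equalNonUnique(const TypeRecord &A, const TypeRecord &B) {
  return A.Name == B.Name || std::strcmp(A.Name, B.Name) == 0;
}
```

With two unmerged copies of the same name, the address comparison reports a mismatch while the string comparison still reports equality.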
Craig Topper [Thu, 28 Jul 2022 16:11:56 +0000 (09:11 -0700)]
[RISCV] Update lowerFROUND to use masked instructions.
This avoids a vmerge at the end and avoids spurious fflags updates.
This isn't used for constrained intrinsics so we technically don't have
to worry about fflags, but it doesn't cost much to support it.
To support this I've extended our FCOPYSIGN_VL node to support a passthru
operand, similar to what was done for VRGATHER*_VL nodes.
I plan to do a similar update for trunc, floor, and ceil.
Reviewed By: reames, frasercrmck
Differential Revision: https://reviews.llvm.org/D130659
Craig Topper [Thu, 28 Jul 2022 16:10:55 +0000 (09:10 -0700)]
[RISCV] Remove duplicate code. NFC
The same operations are part of `FloatingPointVecReduceOps` a little
bit earlier.
Louis Dionne [Thu, 28 Jul 2022 14:25:30 +0000 (10:25 -0400)]
[libc++] Properly log crashes with the assertion handler on older Androids
This reintroduces the same workaround we have in libc++abi for older
Androids based on https://reviews.llvm.org/D130507#inline-1255914.
Differential Revision: https://reviews.llvm.org/D130708
Mahesh Ravishankar [Mon, 25 Jul 2022 22:54:15 +0000 (22:54 +0000)]
[mlir][Linalg] Allow decompose to handle ops when value of `outs` operand is used in payload.
The current implementation of decomposition of Linalg operations wouldn't
work if the `outs` operand values were used within the body of the
operation. Relax this restriction. This potentially sets the stage for
decomposing ops with reduction iterator types (but is not done here
since it requires more study).
Differential Revision: https://reviews.llvm.org/D130527
Mahesh Ravishankar [Fri, 22 Jul 2022 05:35:00 +0000 (05:35 +0000)]
[mlir][TilingInterface] Add a method to generate scalar implementation of the op.
While the tiling interface provides a mechanism for operations to be
tiled into tiled versions of the op (or another op at the same level of
abstraction), the `generateScalarImplementation` method added here is
the "exit point" after all transformations have been done. Ops that
implement this method are expected to generate IR that is directly
lowerable to backend dialects like the LLVM or SPIR-V dialects.
Differential Revision: https://reviews.llvm.org/D130612
Amaury Séchet [Thu, 28 Jul 2022 15:58:05 +0000 (15:58 +0000)]
[NFC] Autogenerate CodeGen/PowerPC/pzero-fp-xored.ll
Simon Pilgrim [Thu, 28 Jul 2022 16:03:35 +0000 (17:03 +0100)]
[DAG] Remove SelectionDAG::GetDemandedBits and use SimplifyMultipleUseDemandedBits directly.
GetDemandedBits is mainly a wrapper around SimplifyMultipleUseDemandedBits now, and is only used by DAGCombiner::visitSTORE so I've moved all remaining functionality there.
visitSTORE was making use of this to 'simplify' constants for a trunc-store. Just removing this code led to a mixture of regressions and gains - it came down to whether a target preferred a sign or zero extended constant for materialization/truncation. I've just moved the code over for now, but a next step would be to move this to targetShrinkDemandedConstant, but some targets that override the method expect a basic binop, and might react badly to a store node.
Philip Reames [Thu, 28 Jul 2022 15:22:36 +0000 (08:22 -0700)]
[LV] Don't predicate uniform mem op stores unnecessarily
We already had the reasoning about uniform mem op loads; if the address is accessed at least once, we know the instruction doesn't need to be predicated to ensure fault safety. For stores, we do need to ensure that the values visible in memory are the same with and without predication. The easiest sub-case to check for is that all the values being stored are the same. Since we know that at least one lane is active, this tells us that the value must be visible.
Warning on confusing terminology: "uniform" vs "uniform mem op" mean two different things here, and this patch is specific to the latter. It would *not* be legal to make this same change for merely "uniform" operations.
Differential Revision: https://reviews.llvm.org/D130637
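The store argument above can be checked with a small scalar model. This is a sketch under assumptions: `storePredicated` and `storeUnpredicated` are hypothetical helpers that simulate lanes writing to one uniform address, and they are not how the vectorizer itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Predicated form: each active lane stores its value to the same address,
// in lane order. Returns the final memory contents.
int storePredicated(int Initial, const std::vector<bool> &Mask,
                    const std::vector<int> &Values) {
  assert(Mask.size() == Values.size());
  int Mem = Initial;
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      Mem = Values[Lane];
  return Mem;
}

// Unpredicated form: a single unconditional store of the first lane's value.
int storeUnpredicated(const std::vector<int> &Values) {
  assert(!Values.empty());
  return Values.front();
}
```

When at least one lane is active and every lane stores the same value, both forms leave the same value in memory; with differing values they can disagree, which is why the patch restricts itself to the all-same case.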
Jon Chesterfield [Thu, 28 Jul 2022 15:49:36 +0000 (16:49 +0100)]
[amdgpu][openmp][nfc] Restore stb_local on DeviceInfo symbol
Prabhdeep Singh Soni [Thu, 28 Jul 2022 15:49:04 +0000 (23:49 +0800)]
[Flang][MLIR][OpenMP] Add support for simdlen clause
This supports lowering from parse-tree to MLIR and translation from
MLIR to LLVM IR using OMPIRBuilder for OpenMP simdlen clause in SIMD
construct.
Reviewed By: shraiysh, peixin, arnamoy10
Differential Revision: https://reviews.llvm.org/D130195
Jon Chesterfield [Thu, 28 Jul 2022 15:32:56 +0000 (16:32 +0100)]
[openmp][amdgpu] Move global DeviceInfo behind call syntax prior to using D130712
Jon Chesterfield [Thu, 28 Jul 2022 15:21:36 +0000 (16:21 +0100)]
[openmp] Introduce optional plugin init/deinit functions
Will allow plugins to migrate away from using global variables to
manage lifetime, which will fix a segfault discovered in relation to D127432
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D130712
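One way to picture optional entry points is a table of nullable function pointers. Everything named below (`PluginHooks`, `callOptional`, `sampleInit`, `sampleDeinit`, `LiveCount`) is hypothetical, and treating a missing hook as success is an assumption of this sketch, not a statement about how libomptarget handles it.

```cpp
#include <cassert>

// Hypothetical table of plugin entry points. A null pointer means the
// plugin does not provide that optional function.
struct PluginHooks {
  int (*Init)() = nullptr;
  int (*Deinit)() = nullptr;
};

// Calls an optional hook if present; a missing hook counts as success (0).
int callOptional(int (*Hook)()) { return Hook ? Hook() : 0; }

// Sample hooks that track whether the plugin state is currently live.
static int LiveCount = 0;
int sampleInit() {
  ++LiveCount;
  return 0;
}
int sampleDeinit() {
  --LiveCount;
  return 0;
}
```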
LLVM GN Syncbot [Thu, 28 Jul 2022 14:44:36 +0000 (14:44 +0000)]
[gn build] Port d52e775b05a4
Liqiang Tao [Mon, 18 Jul 2022 14:49:13 +0000 (22:49 +0800)]
[llvm][ModuleInliner] Add inline cost priority for module inliner
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D130012
LLVM GN Syncbot [Thu, 28 Jul 2022 14:37:35 +0000 (14:37 +0000)]
[gn build] Port c113594378a0
Liqiang Tao [Thu, 28 Jul 2022 14:36:17 +0000 (22:36 +0800)]
Revert "[llvm][ModuleInliner] Add inline cost priority for module inliner"
This reverts commit bb7f62bbbd35840006a1d202228e835909f591cf.
Florian Hahn [Thu, 28 Jul 2022 14:26:42 +0000 (15:26 +0100)]
Revert "[X86][DAGISel] Don't widen shuffle element with AVX512"
This reverts commit 5fb41342105700949c81f68aefc85d9c46e9a1a6.
This patch is causing crashes when building llvm-test-suite when
optimizing for CPUs with AVX512.
Reproducer crashing with llc:
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx"
define i32 @test(<32 x i32> %0) #0 {
entry:
%1 = mul <32 x i32> %0, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
%2 = tail call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> %1)
ret i32 %2
}
; Function Attrs: nocallback nofree nosync nounwind readnone willreturn
declare i32 @llvm.vector.reduce.add.v32i32(<32 x i32>) #1
attributes #0 = { "min-legal-vector-width"="0" "target-cpu"="skylake-avx512" }
attributes #1 = { nocallback nofree nosync nounwind readnone willreturn }
Simon Pilgrim [Thu, 28 Jul 2022 14:23:04 +0000 (15:23 +0100)]
[DAG] DAGCombiner::visitTRUNCATE - remove GetDemandedBits call
This should now all be handled by SimplifyDemandedBits.
Chris Bieneman [Fri, 15 Jul 2022 20:49:55 +0000 (15:49 -0500)]
[HLSL] Add __builtin_hlsl_create_handle
This is pretty straightforward, it just adds a builtin to return a
pointer to a resource handle. This maps to a dx intrinsic.
The shape of this builtin and the underlying intrinsic will likely
shift a bit as this implementation becomes more feature complete, but
this is a good basis to get started.
Depends on D128569.
Differential Revision: https://reviews.llvm.org/D130016
Chris Bieneman [Wed, 6 Jul 2022 18:29:48 +0000 (13:29 -0500)]
Start support for HLSL `RWBuffer`
Most of the change here is fleshing out the HLSLExternalSemaSource with
builder implementations to build the builtin types. Eventually, I may
move some of this code into tablegen or a more manageable declarative
file but I want to get the AST generation logic ready first.
This code adds two new types into the HLSL AST, `hlsl::Resource` and
`hlsl::RWBuffer`. The `Resource` type is just a wrapper around a handle
identifier, and is largely unused in source. It will morph a bit over
time as I work on getting the source compatibility correct, but for now
it is a reasonable stand-in. The `RWBuffer` type is not ready for use.
I'm posting this change for review because it adds a lot of
infrastructure code and is testable.
There is one change to clang code outside the HLSL-specific logic here,
which addresses a behavior change introduced a long time ago in
967d438439ac. That change resulted in unintentionally breaking
situations where an incomplete template declaration was provided from
an AST source, and needed to be completed later by the external AST.
That situation doesn't happen in the normal AST importer flow, but can
happen when an AST source provides incomplete declarations of
templates. The solution is to annotate template specializations of
incomplete types with the HasExternalLexicalSource bit from the base
template.
Depends on D128012.
Differential Revision: https://reviews.llvm.org/D128569
Sunho Kim [Thu, 28 Jul 2022 13:40:33 +0000 (22:40 +0900)]
[clang-repl] Disable exception unittest on AIX.
The AIX platform was not supported, but this was not explicitly checked in the exception test because AIX was excluded by the isPPC() check.
Simon Pilgrim [Thu, 28 Jul 2022 13:46:50 +0000 (14:46 +0100)]
[DAG] SelectionDAG::GetDemandedBits - don't simplify opaque constants
I'm actually trying to get rid of GetDemandedBits - but while dismantling it I noticed that we were altering opaque constants. Fixing that causes a FP_TO_INT_SAT regression that should be addressed separately - I'll raise a bug.
LLVM GN Syncbot [Thu, 28 Jul 2022 13:30:20 +0000 (13:30 +0000)]
[gn build] Port bb7f62bbbd35
Liqiang Tao [Thu, 28 Jul 2022 13:27:04 +0000 (21:27 +0800)]
[llvm][ModuleInliner] Add inline cost priority for module inliner
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D130012
David Green [Thu, 28 Jul 2022 13:26:17 +0000 (14:26 +0100)]
[ARM] Remove duplicate fp16 intrinsics
These vdup and vmov float16 intrinsics are being defined in both the
general section and then again in fp16 under a !aarch64 flag. The
vdup_lane intrinsics were being defined in both aarch64 and !aarch64
sections, so have been commoned. They are defined as macros, so do not
give duplicate warnings, but removing the duplicates shouldn't alter the
available intrinsics.
Simon Pilgrim [Thu, 28 Jul 2022 13:10:44 +0000 (14:10 +0100)]
[DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.
There are a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.
Differential Revision: https://reviews.llvm.org/D77804
Kevin P. Neal [Wed, 27 Jul 2022 19:21:17 +0000 (15:21 -0400)]
Precommit tests for D112256 "[FPEnv][EarlyCSE] Add support for CSE of constrained FP intrinsics, take 2"
Amaury Séchet [Mon, 25 Jul 2022 01:15:06 +0000 (01:15 +0000)]
[DAG] Use recursivelyDeleteUnusedNodes in PromoteLoad
It simplifies the code overall and removes the need for manual bookkeeping.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130447
Sebastian Neubauer [Thu, 28 Jul 2022 12:34:59 +0000 (14:34 +0200)]
[CMake][OpenMP] Remove wrong backslash
outdir is defined in the line above, it will not exist in the install
command, so it should not be escaped.
Amaury Séchet [Mon, 25 Jul 2022 00:36:14 +0000 (00:36 +0000)]
[DAG] Use recursivelyDeleteUnusedNodes in ReplaceLoadWithPromotedLoad
It simplifies the code overall and removes the need for manual bookkeeping.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130444
Alexander Timofeev [Tue, 26 Jul 2022 10:47:09 +0000 (12:47 +0200)]
[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs
In 2e29b0138ca243 we introduced a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
At the same time, we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPR input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
the copy to v_readfirstlane_b32, further lowering them to VALU leads to several kinds of issues.
First, we have a v_readfirstlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D130367
Sunho Kim [Thu, 28 Jul 2022 12:14:58 +0000 (21:14 +0900)]
[clang-repl] Add host exception support check utility flag.
Add host exception support check utility flag. This is needed to avoid running tests that require exception support on a few buildbots that lack the related symbols for some reason.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D129242
Sunho Kim [Thu, 28 Jul 2022 12:12:25 +0000 (21:12 +0900)]
[ORC] Fix weak hidden symbols failure on PPC with runtimedyld
Fix "JIT session error: Symbols not found: [ DW.ref.__gxx_personality_v0 ] error" which happens when trying to use exceptions on ppc linux. To do this, it expands the AutoClaimSymbols option in RTDyldObjectLinkingLayer to also claim weak symbols before resolution is attempted. On ppc linux, DW.ref symbols are emitted as weak hidden symbols in a later stage of the MC pipeline. This means that when using IRLayer (i.e. LLJIT), IRLayer will not claim responsibility for such symbols, and RuntimeDyld will skip defining them even though it could not resolve the corresponding external symbol.
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D129175
Muhammad Usman Shahid [Thu, 28 Jul 2022 11:45:28 +0000 (07:45 -0400)]
Missing tautological compare warnings due to unary operators
The patch mainly focuses on the lack of warnings for
-Wtautological-compare. It works fine for positive numbers but doesn't
for negative numbers. This is because the warning explicitly checks for
an IntegerLiteral AST node, but -1 is represented by a UnaryOperator
with an IntegerLiteral sub-Expr.
For the below code we have warnings:
if (0 == (5 | x)) {}
but not for
if (0 == (-5 | x)) {}
This patch changes the analysis to not look at the AST node directly to
see if it is an IntegerLiteral, but instead attempts to evaluate the
expression to see if it is an integer constant expression. This handles
unary negation signs, but also handles all the other possible operators
as well.
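As a minimal sketch of why both comparisons above are tautological (the helper name `orNeverZero` is hypothetical and not part of Clang): OR-ing with any non-zero constant leaves at least one bit set, so the result can never equal 0, whether the constant is written as 5 or as -5. The check below only samples a small range of x and assumes C++14 or later.

```cpp
// Hypothetical helper, for illustration only: returns true when
// (constant | x) is non-zero for every x in a small sample range.
constexpr bool orNeverZero(int constant) {
  for (int x = -1000; x <= 1000; ++x)
    if ((constant | x) == 0)
      return false;
  return true;
}

// A positive literal: 0 == (5 | x) can never hold.
static_assert(orNeverZero(5), "0 == (5 | x) is always false");
// A negated literal behaves the same way: 0 == (-5 | x) can never hold.
static_assert(orNeverZero(-5), "0 == (-5 | x) is always false");
// With a zero constant the comparison is not tautological (x == 0 matches).
static_assert(!orNeverZero(0), "0 == (0 | x) holds for x == 0");
```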
Fixes #42918
Differential Revision: https://reviews.llvm.org/D130510
Dmitry Preobrazhensky [Thu, 28 Jul 2022 11:36:53 +0000 (14:36 +0300)]
[AMDGPU][GFX1030][DOC][NFC] Update assembler syntax description
Summary of changes:
- Update FLAT LDS syntax (see https://reviews.llvm.org/D125126)
Dmitry Preobrazhensky [Thu, 28 Jul 2022 11:27:13 +0000 (14:27 +0300)]
[AMDGPU][MC][GFX90A] Correct MIMG dst size validation
Correct validator to enable MIMG dst size checks.
Differential Revision: https://reviews.llvm.org/D130512
Adrian Kuegel [Thu, 28 Jul 2022 10:50:40 +0000 (12:50 +0200)]
[mlir] Add getters for DenseArrayAttr.
This change adds convenience getters to builders.
Differential Revision: https://reviews.llvm.org/D130696
Sanjay Patel [Wed, 27 Jul 2022 21:40:35 +0000 (17:40 -0400)]
[InstCombine] try harder to narrow bitwise logic with cast operands
This works with any logic + extend:
https://alive2.llvm.org/ce/z/vzsqQD
The motivating case is from issue #56294, but that's still not optimal
(it should simplify completely).
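The identity underlying this kind of narrowing can be sketched in C++ as follows. This is only an illustration of why logic ops and zero-extension commute, not the exact pattern matched by the patch; the helper names `zext` and `logicCommutesWithZext` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend an 8-bit value to 32 bits.
constexpr std::uint32_t zext(std::uint8_t v) { return v; }

// Exhaustively checks, for all 8-bit inputs, that doing the logic op in the
// wide type equals doing it in the narrow type and extending once.
inline bool logicCommutesWithZext() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const auto x = static_cast<std::uint8_t>(a);
      const auto y = static_cast<std::uint8_t>(b);
      if ((zext(x) & zext(y)) != zext(static_cast<std::uint8_t>(x & y)))
        return false;
      if ((zext(x) | zext(y)) != zext(static_cast<std::uint8_t>(x | y)))
        return false;
      if ((zext(x) ^ zext(y)) != zext(static_cast<std::uint8_t>(x ^ y)))
        return false;
    }
  }
  return true;
}
```

The narrow form needs one extension instead of two, which is the kind of saving the transform is after.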
Sanjay Patel [Wed, 27 Jul 2022 21:26:26 +0000 (17:26 -0400)]
[InstCombine] add tests for bitwise logic with cast operands; NFC
Dmitry Preobrazhensky [Thu, 28 Jul 2022 11:20:05 +0000 (14:20 +0300)]
[AMDGPU][MC][GFX11] Disable SGPRs for src1 of v_fma_mix*_dpp opcodes
Differential Revision: https://reviews.llvm.org/D130634
Nico Weber [Thu, 28 Jul 2022 11:14:43 +0000 (07:14 -0400)]
[gn build] (manually) port 18b4a8bcf35 more
chendewen [Thu, 28 Jul 2022 09:15:25 +0000 (17:15 +0800)]
[Aarch64] Add cost for missing extensions.
This patch adds a cost estimate for some missing sign extensions.
ref: https://reviews.llvm.org/D14730
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D130565
Kirill Okhotnikov [Wed, 6 Jul 2022 16:11:40 +0000 (18:11 +0200)]
[libc][math] Universal exp function for cosh/sinh calculation.
Added a function and test, which can be used later for cosh/sinh
and possibly for expf/expm1f.
Differential Revision: https://reviews.llvm.org/D129215
Konstantin Varlamov [Thu, 28 Jul 2022 09:06:44 +0000 (02:06 -0700)]
[libc++] Make `_IterOps::__iter_move` more similar to `std::ranges::iter_move`.
Avoid relying on `iterator_traits` and instead deduce the return type of
dereferencing the iterator. Additionally, add a static check to reject
iterators with incorrect `iterator_traits` at compile time.
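A rough sketch of the idea, assuming C++14 or later. The names `deref_t`, `move_result_t` and `iterMove` are hypothetical stand-ins and this is not libc++'s actual code: the result type is deduced from `*it` rather than read from `iterator_traits`, and a static check rejects iterators whose traits disagree with what dereferencing really yields.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// The type produced by dereferencing an lvalue iterator.
template <class Iter>
using deref_t = decltype(*std::declval<Iter&>());

// An lvalue reference becomes an rvalue reference; anything else
// (a prvalue or an rvalue reference) is passed through unchanged.
template <class Iter>
using move_result_t =
    std::conditional_t<std::is_lvalue_reference<deref_t<Iter>>::value,
                       std::remove_reference_t<deref_t<Iter>>&&,
                       deref_t<Iter>>;

template <class Iter>
constexpr move_result_t<Iter> iterMove(Iter& it) {
  // Reject iterators whose traits do not match the real dereference type.
  static_assert(
      std::is_same<deref_t<Iter>,
                   typename std::iterator_traits<Iter>::reference>::value,
      "iterator_traits<Iter>::reference must match decltype(*it)");
  return static_cast<move_result_t<Iter>>(*it);
}
```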
Differential Revision: https://reviews.llvm.org/D130538
Florian Hahn [Thu, 28 Jul 2022 09:02:19 +0000 (10:02 +0100)]
[SCEV] Avoid repeated proveNoUnsignedWrapViaInduction calls.
At the moment, proveNoUnsignedWrapViaInduction may be called for the
same AddRec a large number of times via getZeroExtendExpr. This can have
a severe compile-time impact for very loop-heavy code. On one
particular workload, LSR takes ~51s without this patch, almost
exclusively in proveNoUnsignedWrapViaInduction. With this patch, the time
in LSR drops to ~0.4s.
If proveNoUnsignedWrapViaInduction failed to prove NUW the first time,
it is unlikely to succeed on subsequent tries and the cost doesn't seem
to be justified.
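The general technique of not retrying a failed proof can be sketched as below. This is a simplified illustration with hypothetical names (`FailureCachingProver`, `tryProve`, `expensiveProof`), not ScalarEvolution's actual data structures; in this sketch only failures are remembered and successes are simply recomputed.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical sketch: remembers keys whose proof already failed so the
// expensive work is not repeated for them.
class FailureCachingProver {
public:
  bool tryProve(int key) {
    if (failed.count(key))
      return false;          // failed before; do not retry
    ++expensiveCalls;
    if (expensiveProof(key))
      return true;
    failed.insert(key);      // remember the failure
    return false;
  }

  int expensiveCalls = 0;    // how often the costly path actually ran

private:
  // Stand-in for the costly analysis: "provable" only for even keys.
  static bool expensiveProof(int key) { return key % 2 == 0; }

  std::unordered_set<int> failed;
};
```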
Besides drastically improving compile-time in some excessive cases, this
also has a slightly positive compile-time impact on CTMark:
NewPM-O3: -0.07%
NewPM-ReleaseThinLTO: -0.08%
NewPM-ReleaseLTO-g: -0.06%
https://llvm-compile-time-tracker.com/compare.php?from=b435da027d7774c24cdb8c88d09f6b771e07fb14&to=f2729e33e8284b502f6c35a43345272252f35d12&stat=instructions
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D130648
Haojian Wu [Tue, 26 Jul 2022 20:27:09 +0000 (22:27 +0200)]
[pseudo] Eliminate the false `::` nested-name-specifier ambiguity
The solution is to favor the longest possible nested-name-specifier, and
drop other alternatives by using the guard, per C++ [basic.lookup.qual.general].
Motivated cases:
```
Foo::Foo() {};
// the constructor can be parsed as:
// - Foo ::Foo(); // where the first Foo is return-type, and ::Foo is the function declarator
// + Foo::Foo(); // where Foo::Foo is the function declarator
```
```
void test() {
// a very slow parsing case when there are many qualifiers!
X::Y::Z;
// The statement can be parsed as:
// - X ::Y::Z; // ::Y::Z is the declarator
// - X::Y ::Z; // ::Z is the declarator
// + X::Y::Z; // a declaration without declarator (X::Y::Z is decl-specifier-seq)
// + X::Y::Z; // a qualified-id expression
}
```
Differential Revision: https://reviews.llvm.org/D130511
Martin Storsjö [Thu, 14 Jul 2022 19:46:04 +0000 (22:46 +0300)]
[clang-tidy] Add CLANG_TIDY_CONFUSABLE_CHARS_GEN cmake cache variable to avoid building when cross compiling
This is similar to the LLVM_TABLEGEN, CLANG_TABLEGEN and
CLANG_PSEUDO_GEN cmake cache variables.
Differential Revision: https://reviews.llvm.org/D129799
Martin Storsjö [Thu, 14 Jul 2022 19:39:55 +0000 (22:39 +0300)]
[clang-tidy] Rename the make-confusable-table executable
Rename it to clang-tidy-confusable-chars-gen, to make its role
clearer in a wider context.
In cross builds, the caller might want to provide this tool
externally (to avoid needing to rebuild it in the cross build).
In such a case, having the tool properly namespaced makes its role
clearer.
This matches how the clang-pseudo-gen tool was renamed in
a43fef05d4fae32f02365c7b8fef2aa631d23628 / D126725.
Differential Revision: https://reviews.llvm.org/D129798
Alexander Belyaev [Wed, 27 Jul 2022 13:26:01 +0000 (15:26 +0200)]
[mlir] Small stylistic changes to Complex_NumberAttr
Differential Revision: https://reviews.llvm.org/D130632
Kirill Okhotnikov [Fri, 1 Jul 2022 12:42:22 +0000 (14:42 +0200)]
[libc][math] Improved performance of exp2f function.
New exp2 function algorithm:
1) Improved performance: 8.176 vs 15.270 by core-math perf tool.
2) Improved accuracy. Only two special values left.
3) Lookup table size reduced by a factor of two.
Differential Revision: https://reviews.llvm.org/D129005
David Spickett [Wed, 20 Jul 2022 09:37:24 +0000 (09:37 +0000)]
[llvm] Fix some test failures with EXPENSIVE_CHECKS and libstdc++
DebugLocEntry assumes that it either contains 1 item that has no fragment
or many items that all have fragments (see the assert in addValues).
When EXPENSIVE_CHECKS is enabled, _GLIBCXX_DEBUG is defined. On a few machines
I've checked, this causes std::sort to call the comparator even
if there is only 1 item to sort. Perhaps this is to check that the
comparator implements a proper ordering; I didn't find out exactly why.
operator< for a DbgValueLoc will crash if this happens because the
optional Fragment is empty.
Whether this happens seems to depend on the compiler, linker and
optimisation level. I've seen it happen on x86 Ubuntu, but the buildbot
for release EXPENSIVE_CHECKS did not have this issue.
Add an explicit check whether we have 1 item.
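A minimal sketch of such a guard, using hypothetical names (`Loc`, `lessByFragment`, `sortLocs`) rather than the real DebugLocEntry code: the comparator is only valid when both sides carry a fragment, so the sort is skipped when there is at most one item.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified stand-in for a location value with an optional fragment.
struct Loc {
  bool hasFragment;
  unsigned fragmentOffset;
};

// Only valid when both operands have a fragment, like the real operator<.
inline bool lessByFragment(const Loc& a, const Loc& b) {
  assert(a.hasFragment && b.hasFragment && "comparison requires fragments");
  return a.fragmentOffset < b.fragmentOffset;
}

inline void sortLocs(std::vector<Loc>& locs) {
  // A single item may have no fragment, and a debug-mode std::sort may still
  // invoke the comparator on it, so skip sorting in that case.
  if (locs.size() <= 1)
    return;
  std::sort(locs.begin(), locs.end(), lessByFragment);
}
```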
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D130156
Daniel Bertalan [Mon, 25 Jul 2022 18:05:40 +0000 (20:05 +0200)]
[lld-macho] Add LOH_ARM64_ADRP_ADD_LDR optimization hint support
This hint instructs the linker to optimize an adrp+add+ldr sequence used
for loading from a local symbol's address by loading directly if it's
close enough, or with an adrp(p)+ldr sequence if it's not.
This transformation is the same as what's done for ADRP_LDR_GOT_LDR when
the symbol is local. The logic for acting on this hint is therefore
moved to a new function which will be called from the existing
applyAdrpLdrGotLdr() function.
Differential Revision: https://reviews.llvm.org/D130505
Matthias Springer [Wed, 27 Jul 2022 15:58:24 +0000 (17:58 +0200)]
[mlir][transform] Support results on ForeachOp
Handles can be yielded from the ForeachOp.
Differential Revision: https://reviews.llvm.org/D130640
Nikolas Klauser [Thu, 28 Jul 2022 08:32:02 +0000 (10:32 +0200)]
[libc++] Fix merge-conflict in .clang-format
LLVM GN Syncbot [Thu, 28 Jul 2022 08:23:10 +0000 (08:23 +0000)]
[gn build] Port e01b4fe956dd
Nikolas Klauser [Wed, 27 Jul 2022 21:52:45 +0000 (23:52 +0200)]
[libc++] Fix unwrapping ranges with different iterators and sentinels
Reviewed By: ldionne, huixie90, #libc
Spies: arichardson, sstefan1, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D129040
Daniel Bertalan [Tue, 26 Jul 2022 10:06:39 +0000 (12:06 +0200)]
[lld-macho] Support creating N_SO stab for DWARF5 compile units
In DWARF5, the `DW_AT_name` and `DW_AT_comp_dir` attributes are encoded
using the `strx*` forms, which specify an index into `__debug_str_offs`.
This commit adds that section to DwarfObject, so the debug info parser
can resolve these references.
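The indirection can be sketched as follows. This is a simplified model with a hypothetical helper `resolveStrx`, not lld's parser: `offsets` stands for the 32-bit entries of the string offsets table after its header, and `strings` stands for the contents of the string section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: a strx-style index selects an entry in the offsets
// table, and that entry is the byte offset of a NUL-terminated string.
inline std::string resolveStrx(const std::vector<std::uint32_t>& offsets,
                               const std::string& strings,
                               std::size_t index) {
  if (index >= offsets.size())
    return {};                         // index out of range
  const std::size_t start = offsets[index];
  if (start >= strings.size())
    return {};                         // offset out of range
  std::size_t end = strings.find('\0', start);
  if (end == std::string::npos)
    end = strings.size();              // tolerate a missing terminator
  return strings.substr(start, end - start);
}
```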
The test case was manually adapted from stabs-icf.s.
Fixes #51668
Differential Revision: https://reviews.llvm.org/D130559
LLVM GN Syncbot [Thu, 28 Jul 2022 07:43:55 +0000 (07:43 +0000)]
[gn build] Port 8a61749f767e
Gaurav Shukla [Thu, 28 Jul 2022 07:41:05 +0000 (13:11 +0530)]
[mlir][tensor] Fold `tensor.cast` into `tensor.collapse_shape` op
This commit folds a `tensor.cast` op into a `tensor.collapse_shape` op
when the following two conditions are met:
1. the `tensor.collapse_shape` op consumes the result of the `tensor.cast` op.
2. the `tensor.cast` op casts to a more dynamic version of the source tensor.
This is added as a canonicalization pattern in `tensor.collapse_shape` op.
Signed-Off-By: Gaurav Shukla <gaurav@nod-labs.com>
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D130650
Hui Xie [Wed, 27 Jul 2022 12:20:16 +0000 (13:20 +0100)]
[libc++][ranges] implement `std::ranges::inplace_merge`
Differential Revision: https://reviews.llvm.org/D130627
Fangrui Song [Thu, 28 Jul 2022 07:34:04 +0000 (00:34 -0700)]
[Driver][PowerPC] Support -mtune=
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D130526
Max Kazantsev [Thu, 28 Jul 2022 06:42:14 +0000 (13:42 +0700)]
[Test] Fix block name in test
Max Kazantsev [Thu, 28 Jul 2022 06:27:19 +0000 (13:27 +0700)]
[LAA] Remove block order sensitivity in LAA algorithm. PR56672
As the test in PR56672 shows, LAA produces different results, which lead to either
positive or negative vectorization decisions, depending on the order of blocks
in the loop. The exact reason for this is not clear to me; however, it makes investigation
of related bugs extremely complex.
The current order of blocks in the loop is arbitrary. It may change, for example, if loop
info analysis is dropped and recomputed. It seems that this interferes with LAA's logic.
This patch chooses fixed traversal order of blocks in loops, making it RPOT.
Note: this is *not* a fix for bug with incorrect analysis result. It just makes
the answer more robust to make the investigation easier.
Differential Revision: https://reviews.llvm.org/D130482
Reviewed By: aeubanks, fhahn
Tom Stellard [Wed, 27 Jul 2022 22:23:24 +0000 (15:23 -0700)]
workflows: Use macos-11 runners
macos-10.15 is deprecated and will be removed.
Christian Sigg [Thu, 28 Jul 2022 06:14:18 +0000 (08:14 +0200)]
Argyrios Kyrtzidis [Thu, 28 Jul 2022 06:00:56 +0000 (23:00 -0700)]
[ASTWriter] Replace `const std::string &OutputFile` with `StringRef OutputFile` in some of `ASTWriter` functions, NFC
This is to make it consistent with LLVM's string parameter passing convention.
Phoebe Wang [Thu, 28 Jul 2022 05:40:32 +0000 (13:40 +0800)]
[X86][MC] Avoid emitting incorrect warning for complex FMUL
We will insert a new operand which is identical to the Dest for complex
FMUL with a mask. https://godbolt.org/z/eTEdnYv3q
Complex FMA and FMUL with maskz don't have this problem.
Reviewed By: LuoYuanke, skan
Differential Revision: https://reviews.llvm.org/D130638
Austin Kerbow [Wed, 20 Jul 2022 05:55:42 +0000 (22:55 -0700)]
[AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions
By not clustering loads and by adjusting heuristics to reduce register pressure
more aggressively, we may be able to increase occupancy for the function if it
was dropped in the first scheduling pass.
Similarly, try to reduce spilling if register usage exceeds lower bound
occupancy.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D130329
Amara Emerson [Thu, 28 Jul 2022 05:10:42 +0000 (22:10 -0700)]
[AArch64][GlobalISel] Fix custom legalization of rotates using sext for shift vs zext.
Rotates are defined according to DAG documentation as having unsigned shifts,
so we need to zero-extend instead of sign-extend here.
Fixes issue 56664
Amara Emerson [Thu, 28 Jul 2022 05:09:31 +0000 (22:09 -0700)]
GlobalISel: update legalize-rotr-rotl.mir checks before change.
Sridhar Gopinath [Thu, 28 Jul 2022 04:00:37 +0000 (21:00 -0700)]
[clang-format] Fix the return code of git-clang-format
In diff and diffstat modes, the return code is != 0 even when there are no
changes between commits. This issue can be fixed by passing --exit-code to
the git-diff command, which returns 0 when there are no changes, and using that as
the return code for git-clang-format.
Fixes #56736.
Differential Revision: https://reviews.llvm.org/D129311
Utkarsh Saxena [Mon, 18 Jul 2022 14:23:28 +0000 (16:23 +0200)]
Use pseudoparser-based folding ranges in ClangdServer.
Differential Revision: https://reviews.llvm.org/D130011
Chuanqi Xu [Thu, 28 Jul 2022 03:13:00 +0000 (11:13 +0800)]
[NFC] [C++20] [Modules] Add tests for merging redefinitions in modules
Add tests for detecting redefinitions in C++20 modules. Some of these
may be covered by other tests, but more tests should always be good.
Tom Stellard [Thu, 28 Jul 2022 03:13:21 +0000 (20:13 -0700)]
workflows: Use correct access token when pushing to llvm-project-release-prs repo
The checkout action will hard-code the default github actions token in
the git config so that all pushes use it. We need to set
persist-credentials=false so we can use a token that has permission
to push to the llvm-project-release-prs repo.