Tina Jung [Wed, 5 Jul 2023 07:37:47 +0000 (08:37 +0100)]
[mlir][tosa] Constant folding for reciprocal
Add constant fold for tosa.reciprocal, which can be applied if the input is a dense constant tensor. The reciprocal is computed for every element and the result is a tensor with the same dimensions as the input tensor.
As the input tensor might require a lot of memory and the folding might double the required memory, a heuristic decides when to actually apply the folding. Currently, the operation will be replaced only if the input constant is a splat (i.e. requires little memory) or has a single user (similar to the already existing fold for constant transposes). This keeps the additionally required space low.
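The heuristic described above can be sketched as follows. This is only an illustration under stated assumptions: `ConstTensor`, `shouldFoldReciprocal` and `foldReciprocal` are hypothetical names and do not correspond to the MLIR API used in the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense constant tensor; not an MLIR type.
struct ConstTensor {
  std::vector<float> values; // element data (a splat repeats one value)
  bool isSplat;
  std::size_t numUsers;
};

// Heuristic from the commit message: fold only a splat or a constant with a
// single user, so the additionally required memory stays low.
bool shouldFoldReciprocal(const ConstTensor &input) {
  return input.isSplat || input.numUsers == 1;
}

// Writes the element-wise reciprocal to `result` and returns true when the
// heuristic allows folding; otherwise leaves `result` untouched.
bool foldReciprocal(const ConstTensor &input, ConstTensor &result) {
  if (!shouldFoldReciprocal(input))
    return false;
  result = input; // same element count as the input
  for (float &v : result.values)
    v = 1.0f / v;
  return true;
}
```

A constant with several users is left alone in this sketch, since folding it would keep both the original and the folded data alive.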
Differential Revision: https://reviews.llvm.org/D150578
Serge Pavlov [Wed, 5 Jul 2023 09:31:23 +0000 (16:31 +0700)]
[Clang] Reset FP options before function instantiations
This is a recommit of 98390ccb80569e8fbb20e6c996b4b8cff87fbec6, reverted in
82a3969d710f5fb7a2ee4c9afadb648653923fef because it caused
https://github.com/llvm/llvm-project/issues/63542. Although the problem
described in the issue is independent of the reverted patch, a failure of
PCH/late-parsed-instantiations.cpp was indeed observed on PowerPC and is
likely caused by incorrect serialization of `LateParsedTemplate`
objects. This patch fixes the serialization.
Original commit message is below.
Previously, function template instantiations occurred with the FP options
that were in effect at the end of the translation unit. This was a problem
for late template parsing, as these FP options were used as attributes of
AST nodes and could result in a crash. To fix this, FP options are set to
their state at the point of the template definition.
Differential Revision: https://reviews.llvm.org/D143241
Freddy Ye [Wed, 5 Jul 2023 09:31:11 +0000 (17:31 +0800)]
[X86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling
This refactoring patch removes the CPU_SPECIFIC* MACROs in X86TargetParser.def
and moves their information into ProcInfo in X86TargetParser.cpp, since the
two files both maintain a table with redundant information such as the CPU name
and its supported features. The CPU_SPECIFIC* MACROs also define some additional
information. This patch handles it as follows when moving:
1.mangling
This is now moved to Mangling in ProcInfo and directly initialized in the array
of Processors. CPUs that don't support cpu_dispatch/specific are assigned '\0'
as mangling.
2.CPU alias
The alias CPU is also initialized in the array of Processors. Its attributes
are the same as those of its alias target CPU: same feature list, same mangling.
3.TUNE_NAME
Before my change, some CPU names supported by cpu_dispatch/specific were not
supported in X86.td, which means the optimizer/backend didn't recognize them. So
they used a different TUNE_NAME when generating IR. In this patch, I added the
missing CPU support to X86.td by utilizing existing Features and XXXTunings, so
that each CPU name can directly use its own name as TUNE_NAME and be supported
by the optimizer/backend.
4.Feature list
The feature list of a CPU maintained in X86TargetParser.def is not the same as
the one in X86TargetParser.cpp. It only contains part of the CPU's features
(those defined by X86_FEATURE_COMPAT), while X86TargetParser.cpp maintains
a complete list. This patch abandons the feature list maintained by CPU_SPECIFIC*
MACROs because assigning a CPU the complete list doesn't affect the
functionality of cpu_dispatch/specific.
Besides these four kinds of information, some of the CPUs supported by
cpu_dispatch/specific did not previously support clang options like -march and
-mtune. This patch keeps that behavior by adding another member,
OnlyForCPUDispatchSpecific, to ProcInfo.
Reviewed By: pengfei, RKSimon
Differential Revision: https://reviews.llvm.org/D151696
Alex Bradbury [Wed, 5 Jul 2023 09:28:32 +0000 (10:28 +0100)]
Revert "[RISCV][test] Add test coverage for llvm.frexp.*.* intrinsics"
Reverting due to weird failure.
This reverts commit 4b8162fe9ccb68b5b42f683df8df42ed43bfd5e7.
Ivan Kosarev [Wed, 5 Jul 2023 09:25:22 +0000 (10:25 +0100)]
[AMDGPU] Fix expensive-checks build.
Completes <https://reviews.llvm.org/D154337>.
Alex Bradbury [Wed, 5 Jul 2023 09:21:26 +0000 (10:21 +0100)]
[RISCV][test] Add test coverage for llvm.frexp.*.* intrinsics
The test file is copied from X86 (which is also mostly shared with Arm,
PowerPC) rather than integrated into float-intrinsics.ll and
double-intrinsics.ll.
There's currently a compiler crash for the soft float cases (I expect this
is the issue in <https://github.com/llvm/llvm-project/issues/63661>),
which will be addressed in a follow-on patch posted for review.
Ivan Kosarev [Wed, 5 Jul 2023 09:15:12 +0000 (10:15 +0100)]
[AMDGPU][NFC] Rename SIMCCodeEmitter.cpp to match the new emitter class name.
The class was renamed in <https://reviews.llvm.org/D154337>.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D154426
LLVM GN Syncbot [Wed, 5 Jul 2023 09:13:40 +0000 (09:13 +0000)]
[gn build] Port ee165cdb1b8b
Ivan Kosarev [Wed, 5 Jul 2023 09:07:43 +0000 (10:07 +0100)]
[AMDGPU] Eliminate SIMCCodeEmitter and de-virtualise encoding methods.
Simplifies some future changes needed for
<https://github.com/llvm/llvm-project/issues/62629>.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154337
David Spickett [Wed, 5 Jul 2023 09:05:04 +0000 (10:05 +0100)]
[llvm][TableGen][Jupyter] Note needed restart when using an IDE
Certainly with VSCode, this makes it repopulate the kernel list.
Andrzej Warzynski [Sun, 2 Jul 2023 13:43:14 +0000 (14:43 +0100)]
[mlir][transform] Allow arbitrary indices to be scalable
This change lifts the limitation that only the trailing dimensions/sizes
in dynamic index lists can be scalable. It allows us to extend
`MaskedVectorizeOp` and `TileOp` from the Transform dialect so that the
following is allowed:
%1, %loops:3 = transform.structured.tile %0 [4, [4], [4]]
This is also a follow up for https://reviews.llvm.org/D153372
that will enable the following (middle vector dimension is scalable):
transform.structured.masked_vectorize %0 vector_sizes [2, [4], 8]
To facilitate this change, the hooks for parsing and printing dynamic
index lists are updated accordingly (`printDynamicIndexList` and
`parseDynamicIndexList`, respectively). `MaskedVectorizeOp` and `TileOp`
are updated to include an array attribute of bools that captures
whether the corresponding vector dimension/tile size, respectively, is
scalable or not.
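As a rough illustration of the bracket notation used above, the following sketch formats a list of sizes together with a parallel list of "scalable" flags. `formatIndexList` is a hypothetical helper written for this note; it is not the `printDynamicIndexList` hook and ignores dynamic (SSA value) entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Formats sizes so that scalable entries are wrapped in brackets, e.g.
// sizes {2, 4, 8} with flags {false, true, false} gives "[2, [4], 8]".
// Entries without a corresponding flag are treated as non-scalable.
std::string formatIndexList(const std::vector<std::int64_t> &sizes,
                            const std::vector<bool> &scalable) {
  std::string out = "[";
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    if (i != 0)
      out += ", ";
    const bool isScalable = i < scalable.size() && scalable[i];
    const std::string value = std::to_string(sizes[i]);
    out += isScalable ? "[" + value + "]" : value;
  }
  return out + "]";
}
```

Keeping the flags in a separate array, rather than only allowing a trailing run of scalable entries, is what lets any position be marked scalable.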
NOTE 1: I am re-landing this after the initial version was reverted. To
fix the regression and in addition to the original patch, this revision
updates the Python bindings for the transform dialect.
NOTE 2: This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
* https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/
This relands 048764f23a380fd6f8cc562a0008dcc6095fb594 with fixes.
Differential Revision: https://reviews.llvm.org/D154336
David Spickett [Tue, 4 Jul 2023 09:16:44 +0000 (10:16 +0100)]
[llvm][TableGen][Jupyter] Note an easily encountered error
When using VSCode it'll default to the Python kernel the first
time you open the notebook. Mention this in the readme, as the fix
is simple but only if you know what to look for.
David Spickett [Mon, 3 Jul 2023 14:44:53 +0000 (15:44 +0100)]
[llvm][TableGen][Jupyter] Record current python when kernel is installed
Previously the kernel.json would always point to `python3` even if you
installed using a python from a virtualenv. This meant that tools like VSCode
would try to run the kernel against the system python and fail.
Added a note to the readme about it. I've also removed the need to
add to PYTHONPATH; it turns out it wasn't needed.
This fixes an issue reported in https://discourse.llvm.org/t/tablegen-the-playground-ipynb-file-is-not-working-as-expected/71745.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D154351
Nikita Popov [Wed, 14 Jun 2023 08:34:14 +0000 (10:34 +0200)]
[LoopUnroll] Fold add chains during unrolling
Loop unrolling tends to produce chains of
`%x1 = add %x0, 1; %x2 = add %x1, 1; ...` with one add per unrolled
iteration. This patch simplifies these adds to `%xN = add %x0, N`
directly during unrolling, rather than waiting for InstCombine to do so.
The motivation for this is that having a single add (rather than
an add chain) on the induction variable makes it a simple recurrence,
which we specially recognize in a number of places. This allows
InstCombine to directly perform folds with that knowledge, instead
of first folding the add chains, and then doing other folds in another
InstCombine iteration.
Due to the reduced number of InstCombine iterations, this also
results in a small compile-time improvement.
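The fold can be illustrated with a small model of the add chain. This is a sketch only: `AddInst` and `foldAddChain` are hypothetical, the model ignores overflow flags, and it assumes each add refers either to the base value or to an earlier add in the list.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one unrolled add: result = operand + constant, where
// `operand` is the index of an earlier add, or -1 for the base value %x0.
struct AddInst {
  int operand;
  std::int64_t constant;
};

// Rewrites each add so that it refers to the base value directly:
//   %xN = add %x(N-1), c   becomes   %xN = add %x0, (c1 + ... + cN).
void foldAddChain(std::vector<AddInst> &adds) {
  for (AddInst &add : adds) {
    if (add.operand < 0)
      continue; // already based on %x0
    // The referenced add comes earlier in the list, so it is already folded.
    const AddInst &prev = adds[static_cast<std::size_t>(add.operand)];
    add.constant += prev.constant;
    add.operand = -1;
  }
}
```

After folding, every add in the model has the form `%x0 + N`, which is the simple-recurrence shape the commit message refers to.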
Differential Revision: https://reviews.llvm.org/D153540
Lorenzo Chelini [Tue, 4 Jul 2023 12:42:28 +0000 (14:42 +0200)]
[MLIR] Fix compiler warnings (NFC)
In `TestTensorTransforms.cpp`, `replaced` is nullptr; I assumed the intent
was to emit the error for the `rootOp`.
In `TransformInterfaces.cpp` there were some uninitialized variables.
In `NVGPUTransformOps.cpp` `matmulOp` was never used.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D154439
Aleksandr Popov [Mon, 3 Jul 2023 21:04:23 +0000 (23:04 +0200)]
[IRCE] Support inverted range check's predicate
IRCE expects the true edge of a range check's branch to lead into the loop.
If it encounters the reverse case, it now inverts the branch.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D148244
Nikita Popov [Wed, 5 Jul 2023 07:45:58 +0000 (09:45 +0200)]
[LoopUnroll] Add test for early add folding (NFC)
Test for D153540, with adds that have different overflow flags.
Kadir Cetinkaya [Wed, 5 Jul 2023 06:21:28 +0000 (08:21 +0200)]
[clang][Tooling] Add mapping for make_error_code
Differential Revision: https://reviews.llvm.org/D154473
Siva Chandra [Fri, 26 May 2023 08:08:46 +0000 (08:08 +0000)]
[libc] Initialize the global pointer in riscv startup code.
Reviewed By: mikhail.ramalho
Differential Revision: https://reviews.llvm.org/D151539
Nikita Popov [Wed, 5 Jul 2023 07:30:24 +0000 (09:30 +0200)]
[GVNHoist] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 4 Jul 2023 15:00:17 +0000 (17:00 +0200)]
[Attributor] Convert test to opaque pointers (NFC)
Differential Revision: https://reviews.llvm.org/D153949
Nikita Popov [Fri, 30 Jun 2023 10:25:09 +0000 (12:25 +0200)]
[cmake] Add LLVM_UNITTEST_LINK_FLAGS option
Add an option to specify additional linker flags for unit tests only.
For example, this allows using something like
-DLLVM_UNITTEST_LINK_FLAGS="-Wl,-plugin-opt=O0" if you're doing LTO
builds, or -DLLVM_UNITTEST_LINK_FLAGS="-fno-lto" if you're using
fat LTO objects.
The build system already does this itself if the LLVM_ENABLE_LTO
flag is used, but this does not cover all possible LTO configurations.
Differential Revision: https://reviews.llvm.org/D154212
Timm Bäder [Wed, 5 Jul 2023 03:37:34 +0000 (05:37 +0200)]
[clang][Interp][NFC] Remove unnecessary else blocks
Timm Bäder [Tue, 4 Jul 2023 18:33:21 +0000 (20:33 +0200)]
[clang][Interp][NFC] Move a declaration into an if statement
Timm Bäder [Wed, 5 Jul 2023 06:52:08 +0000 (08:52 +0200)]
[clang][Interp][NFC] Fix a doc comment
Timm Bäder [Tue, 4 Jul 2023 18:32:17 +0000 (20:32 +0200)]
[clang][Interp][NFC] Add Descriptor::isCompositeArray()
Unused for now, but will be used in later commits.
Balazs Benics [Wed, 5 Jul 2023 06:56:13 +0000 (08:56 +0200)]
[analyzer][NFC] Move away from using raw-for loops inside StaticAnalyzer
I'm involved with the Static Analyzer for the most part.
I think we should embrace newer language standard features and gradually
move forward.
Differential Revision: https://reviews.llvm.org/D154325
Timm Bäder [Tue, 4 Jul 2023 18:31:07 +0000 (20:31 +0200)]
[clang][NFC] Move two declarations closer to their point of use
Matthias Springer [Wed, 5 Jul 2023 06:34:03 +0000 (08:34 +0200)]
[mlir][linalg] Do not emit FillOp for tensor.pad with zero padding
No need to fill the buffer if no padding is added. I.e., the tensor.pad is packing only.
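The condition can be sketched as a check over the static padding amounts. `padAddsElements` is a hypothetical helper for illustration; it assumes all padding sizes are known constants and does not model dynamic padding.

```cpp
#include <cstdint>
#include <vector>

// Returns true if any low or high padding amount is non-zero, i.e. the padded
// result contains elements that do not come from the source tensor and
// therefore need to be filled with the padding value.
bool padAddsElements(const std::vector<std::int64_t> &low,
                     const std::vector<std::int64_t> &high) {
  for (std::int64_t amount : low)
    if (amount != 0)
      return true;
  for (std::int64_t amount : high)
    if (amount != 0)
      return true;
  return false;
}
```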
Differential Revision: https://reviews.llvm.org/D153874
esmeyi [Wed, 5 Jul 2023 05:58:18 +0000 (01:58 -0400)]
[XCOFF] Force recording a relocation for weak symbol label.
Summary: Currently, if there are multiple definitions of the same symbol declared with weak linkage, the linker may choose the wrong one when they are compiled with integrated-as. This patch fixes the issue. If the target symbol is a weak label, we must not attempt to resolve the fixup directly. Instead, emit a relocation and leave resolution of the final target address to the linker.
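The decision described above can be sketched as follows. `SymbolInfo`, `FixupAction` and `chooseFixupAction` are hypothetical names for illustration and are not the XCOFF object writer's API; the rule for non-weak symbols is a simplifying assumption.

```cpp
// What the assembler does with a fixup that targets a symbol.
enum class FixupAction { ResolveDirectly, EmitRelocation };

// Hypothetical summary of the properties relevant to the decision.
struct SymbolInfo {
  bool definedInThisObject;
  bool isWeak;
};

FixupAction chooseFixupAction(const SymbolInfo &target) {
  // A weak definition may be overridden by another definition at link time,
  // so the final address must be left to the linker.
  if (target.isWeak)
    return FixupAction::EmitRelocation;
  // Assumption for this sketch: other locally defined symbols can be resolved
  // by the assembler, and everything else needs a relocation.
  return target.definedInThisObject ? FixupAction::ResolveDirectly
                                    : FixupAction::EmitRelocation;
}
```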
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D153839
Tomasz Kamiński [Tue, 4 Jul 2023 14:54:54 +0000 (16:54 +0200)]
[analyzer] Differentiate lifetime extended temporaries
This patch introduces a new `CXXLifetimeExtendedObjectRegion` as a representation
of the memory for a temporary object that is lifetime extended by the reference
to which it is bound.
This separation provides an ability to detect the use of dangling pointers
(either binding or dereference) in a robust manner.
For example, the `ref` is conditionally dangling in the following example:
```
template<typename T>
T const& select(bool cond, T const& t, T const& u) { return cond ? t : u; }
int const& le = Composite{}.x;
auto&& ref = select(cond, le, 10);
```
Before the change, regardless of the value of `cond`, the `select()` call would
have returned a `temp_object` region.
With the proposed change we would produce a (non-dangling) `lifetime_extended_object`
region with lifetime bound to `le` or a `temp_object` region for the dangling case.
We believe that such separation is desired, as such lifetime extended temporaries
are closer to the variables. For example, they may have a static storage duration
(this patch removes a static temporary region, which was an abomination).
We also think that alternative approaches are not viable.
While for some cases it may be possible to determine if the region is lifetime
extended by searching the parents of the initializer expr, this quickly becomes
complex in the presence of conditional operators like these:
```
Composite cc;
// Ternary produces prvalue 'int' which is extended, as branches differ in value category
auto&& x = cond ? Composite{}.x : cc.x;
// Ternary produces xvalue, and extends the Composite object
auto&& y = cond ? Composite{}.x : std::move(cc).x;
```
Finally, the lifetime of the `CXXLifetimeExtendedObjectRegion` is tied to the lifetime of
the corresponding variables, however, the "liveness" (or reachability) of the extending
variable does not imply the reachability of all symbols in the region.
In conclusion `CXXLifetimeExtendedObjectRegion`, in contrast to `VarRegions`, does not
need any special handling in `SymReaper`.
RFC: https://discourse.llvm.org/t/rfc-detecting-uses-of-dangling-references/70731
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D151325
Rahul Kayaith [Sun, 7 May 2023 23:28:46 +0000 (19:28 -0400)]
[mlir][TestDialect] Fix invalid custom op printers
This fixes a few custom printers which were printing IR that couldn't be
round-tripped.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D150080
Joseph Huber [Wed, 5 Jul 2023 03:01:03 +0000 (22:01 -0500)]
[Libomptarget][Obvious] Missing comma on enum
Joseph Huber [Wed, 5 Jul 2023 02:55:49 +0000 (21:55 -0500)]
[Libomptarget] Add missing HSA agent info enumeration
Summary:
This was not added to dynamic_hsa.h
Joseph Huber [Tue, 4 Jul 2023 17:31:28 +0000 (12:31 -0500)]
[Libomptarget] Correctly implement `getWTime` on AMDGPU
AMDGPU provides a fixed frequency clock since some generations back.
However, the frequency is variable by card and must be looked up at
runtime. This patch adds a new device environment line for the clock
frequency so that we can use it in the same way as NVPTX. This is the
correct implementation and the version in ASO should be replaced.
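The conversion this relies on is simply ticks divided by frequency. A minimal sketch, assuming the tick count and the clock frequency (in Hz) have already been obtained; `ticksToSeconds` is a hypothetical helper, not the device runtime's function:

```cpp
#include <cstdint>

// Converts a tick count from a fixed-frequency clock into seconds.
// `frequencyHz` is the value looked up at runtime for the specific card.
double ticksToSeconds(std::uint64_t ticks, std::uint64_t frequencyHz) {
  return static_cast<double>(ticks) / static_cast<double>(frequencyHz);
}
```

For example, 200,000,000 ticks of a 100 MHz clock correspond to 2 seconds.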
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D154456
Craig Topper [Wed, 5 Jul 2023 01:58:32 +0000 (18:58 -0700)]
[RISCV] Fix 80 column violations in RISCVInstrInfoXCV.td. NFC
Craig Topper [Wed, 5 Jul 2023 01:52:31 +0000 (18:52 -0700)]
[RISCV] Rename RVInstBitManipRII->CVInstBitManipRII since it belongs to XVendorCV. NFC
This is consistent with the other classes in this file.
It avoids a possible name conflict with standard extensions or
other vendors in the future.
Yeting Kuo [Mon, 12 Jun 2023 13:32:18 +0000 (21:32 +0800)]
[ASAN] Support memory check for masked.gather/scatter.
The patch handles masked.gather/scatter in the same way that D149245 handles
vp.gather/scatter.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151545
Blue Gaston [Mon, 26 Jun 2023 17:28:16 +0000 (10:28 -0700)]
[ASanAbi][Darwin] Build ios stable ABI library
In the initial commit, we limited the static archive to osx.
This patch removes that limitation.
Differential Revision: https://reviews.llvm.org/D153789
Sergei Barannikov [Tue, 4 Jul 2023 21:00:07 +0000 (00:00 +0300)]
[MC/AsmParser] Remove no-op overrides of parseDirective (NFC)
Remove overrides of parseDirective that unconditionally return NoMatch.
This is what the base implementation does.
This is a follow-up to D154101 based on post-commit review feedback.
Thomas Preud'homme [Thu, 29 Jun 2023 21:44:24 +0000 (22:44 +0100)]
[FileCheck, 2/4] NFC: Switch to APInt getter for ExpressionValue
Use an APInt getter as the only interface to getting the value out of an
ExpressionValue. This paves the way to switch ExpressionValue to handle
any integer without causing too big of a patch.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D154429
Thomas Preud'homme [Wed, 28 Jun 2023 21:53:56 +0000 (22:53 +0100)]
[FileCheck, 1/4] NFC: Switch ExpressionValue to APInt
Use APInt internally to store values represented by ExpressionValue.
This will allow FileCheck numeric expressions to support any integer
value in a subsequent commit.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D154428
Florian Hahn [Tue, 4 Jul 2023 20:28:02 +0000 (21:28 +0100)]
[LV] Forget SCEVs for exit phis after vectorization.
After vectorization, the exit blocks of the original loop will have additional
predecessors. Invalidate SCEVs for the exit phis in case SE looked through
single-entry phis.
Fixes https://github.com/llvm/llvm-project/issues/63368
Fixes https://github.com/llvm/llvm-project/issues/63669
Sergei Barannikov [Sat, 1 Jul 2023 22:51:17 +0000 (01:51 +0300)]
[RISCV] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
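The convenience comes from a converting constructor. The following is a simplified model of the idea, not the LLVM class itself; `reportError` and `parseRegister` are hypothetical, and `reportError` is assumed to return true after reporting, in the same way `Error()` does in the snippets above.

```cpp
// Simplified model of a parse result; not the LLVM class itself.
class ParseStatus {
public:
  enum StatusTy { Success, Failure, NoMatch };

  ParseStatus(StatusTy s) : status(s) {}
  // Implicit conversion from bool: true means "an error was reported".
  ParseStatus(bool error) : status(error ? Failure : Success) {}

  bool isSuccess() const { return status == Success; }
  bool isFailure() const { return status == Failure; }
  bool isNoMatch() const { return status == NoMatch; }

private:
  StatusTy status;
};

// Hypothetical error reporter; assumed to return true after reporting.
bool reportError(const char *msg) {
  (void)msg; // a real parser would emit a diagnostic here
  return true;
}

ParseStatus parseRegister(bool looksLikeRegister, bool wellFormed) {
  if (!looksLikeRegister)
    return ParseStatus::NoMatch;
  if (!wellFormed)
    return reportError("msg"); // the bool result converts to Failure
  return ParseStatus::Success;
}
```

With the older enum-only result type, the error path needs two statements because a bool cannot be returned where the enum is expected.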
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D154291
Sergei Barannikov [Sun, 2 Jul 2023 01:58:52 +0000 (04:58 +0300)]
[AMDGPU] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
Reviewed By: kosarev
Differential Revision: https://reviews.llvm.org/D154293
Sergei Barannikov [Mon, 3 Jul 2023 03:45:13 +0000 (06:45 +0300)]
[LoongArch] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D154318
Sergei Barannikov [Mon, 3 Jul 2023 03:06:35 +0000 (06:06 +0300)]
[CSKY] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
Reviewed By: zixuan-wu
Differential Revision: https://reviews.llvm.org/D154315
Sergei Barannikov [Mon, 3 Jul 2023 04:21:53 +0000 (07:21 +0300)]
[M68k] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D154320
Lei Huang [Tue, 4 Jul 2023 17:43:52 +0000 (13:43 -0400)]
[PowerPC] Add DFP quantum adjustment instruction definitions and MC tests
Add td definitions and asm/disasm tests for the quantum adjustment
instructions in ISA 3.1 section 5.6.4
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D154369
Sergei Barannikov [Sun, 2 Jul 2023 15:58:26 +0000 (18:58 +0300)]
[MC] Use ParseStatus in generated AsmParser methods
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
Error(L, "msg");
return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.
Reviewed By: kosarev
Differential Revision: https://reviews.llvm.org/D154303
Dmitry Ilvokhin [Tue, 4 Jul 2023 18:58:40 +0000 (20:58 +0200)]
Use hash value checks optimizations consistently
There are a couple of optimizations in `__hash_table::find` which are applicable
to other places like `__hash_table::__node_insert_unique_prepare` and
`__hash_table::__emplace_unique_key_args`.
```
for (__nd = __nd->__next_; __nd != nullptr &&
(__nd->__hash() == __hash
// ^^^^^^^^^^^^^^^^^^^^^^
// (1)
|| std::__constrain_hash(__nd->__hash(), __bc) == __chash);
__nd = __nd->__next_)
{
if ((__nd->__hash() == __hash)
// ^^^^^^^^^^^^^^^^^^^^^^^^^^
// (2)
&& key_eq()(__nd->__upcast()->__value_, __k))
return iterator(__nd, this);
}
```
(1): We can avoid the expensive modulo operation in `std::__constrain_hash` if
the hashes match. This one is from commit 6a411472e3c4.
(2): We can avoid `key_eq` calls if the hashes don't match. Commit:
318d35a7bca6c4e5.
Both of them are applicable for insert and emplace methods.
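A standalone model of the two checks, for illustration only: `Node`, `constrainHash` and `findInBucket` are hypothetical and much simpler than libc++'s `__hash_table` (for instance, `constrainHash` here is always a plain modulo). The `keyCompares` counter exists only to make the effect of check (2) observable.

```cpp
#include <cstddef>
#include <string>

// Hypothetical singly linked hash node that caches its full hash value.
struct Node {
  std::size_t hash;
  std::string key;
  Node *next;
};

// Maps a full hash to a bucket index (the "expensive modulo" in the text).
std::size_t constrainHash(std::size_t hash, std::size_t bucketCount) {
  return hash % bucketCount;
}

// Walks the chain starting at `first` while nodes still belong to the bucket
// of `hash`, and returns the node with an equal key, or nullptr.
const Node *findInBucket(const Node *first, std::size_t hash,
                         std::size_t bucketCount, const std::string &key,
                         std::size_t &keyCompares) {
  const std::size_t bucket = constrainHash(hash, bucketCount);
  for (const Node *n = first;
       n != nullptr &&
       (n->hash == hash                                     // (1) skip modulo
        || constrainHash(n->hash, bucketCount) == bucket);
       n = n->next) {
    if (n->hash == hash) { // (2) compare keys only when the hashes match
      ++keyCompares;
      if (n->key == key)
        return n;
    }
  }
  return nullptr;
}
```

Check (2) matters most when key comparison is costly, as with strings, which is consistent with the string benchmarks showing the larger changes in the results.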
Results of unordered_set_operations benchmark:
```
Comparing /tmp/main to /tmp/hashtable-hash-value-optimization
Benchmark Time CPU Time Old Time New CPU Old CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------
BM_Hash/uint32_random_std_hash/1024 -0.0127 -0.0127 1511 1492 1508 1489
BM_Hash/uint32_random_custom_hash/1024 +0.0012 +0.0013 1370 1371 1367 1369
BM_Hash/uint32_top_std_hash/1024 -0.0027 -0.0028 1502 1497 1498 1494
BM_Hash/uint32_top_custom_hash/1024 +0.0033 +0.0032 1368 1373 1365 1370
BM_InsertValue/unordered_set_uint32/1024 +0.0267 +0.0266 36421 37392 36350 37318
BM_InsertValue/unordered_set_uint32_sorted/1024 +0.0230 +0.0229 28247 28897 28193 28837
BM_InsertValue/unordered_set_top_bits_uint32/1024 +0.0492 +0.0491 31012 32539 30952 32472
BM_InsertValueRehash/unordered_set_top_bits_uint32/1024 +0.0523 +0.0520 62905 66197 62780 66043
BM_InsertValue/unordered_set_string/1024 -0.0252 -0.0253 300762 293189 299805 292221
BM_InsertValueRehash/unordered_set_string/1024 -0.0932 -0.0920 332924 301882 331276 300810
BM_InsertValue/unordered_set_prefixed_string/1024 -0.0578 -0.0577 314239 296072 313222 295137
BM_InsertValueRehash/unordered_set_prefixed_string/1024 -0.0986 -0.0985 336091 302950 334982 301995
BM_Find/unordered_set_random_uint64/1024 -0.1416 -0.1417 16075 13798 16041 13769
BM_FindRehash/unordered_set_random_uint64/1024 -0.0105 -0.0105 5900 5838 5889 5827
BM_Find/unordered_set_sorted_uint64/1024 +0.0014 +0.0014 2813 2817 2807 2811
BM_FindRehash/unordered_set_sorted_uint64/1024 -0.0247 -0.0249 5863 5718 5851 5706
BM_Find/unordered_set_sorted_uint128/1024 +0.0113 +0.0112 15570 15746 15539 15713
BM_FindRehash/unordered_set_sorted_uint128/1024 +0.0438 +0.0441 6917 7220 6902 7206
BM_Find/unordered_set_sorted_uint32/1024 -0.0020 -0.0020 3098 3091 3092 3085
BM_FindRehash/unordered_set_sorted_uint32/1024 +0.0570 +0.0569 5377 5684 5368 5673
BM_Find/unordered_set_sorted_large_uint64/1024 +0.0081 +0.0081 3594 3623 3587 3616
BM_FindRehash/unordered_set_sorted_large_uint64/1024 -0.0542 -0.0540 6154 5820 6140 5808
BM_Find/unordered_set_top_bits_uint64/1024 -0.0061 -0.0061 10440 10377 10417 10353
BM_FindRehash/unordered_set_top_bits_uint64/1024 +0.0131 +0.0128 5852 5928 5840 5914
BM_Find/unordered_set_string/1024 -0.0352 -0.0349 189037 182384 188389 181809
BM_FindRehash/unordered_set_string/1024 -0.0309 -0.0311 180718 175142 180141 174532
BM_Find/unordered_set_prefixed_string/1024 -0.0559 -0.0557 190853 180177 190251 179659
BM_FindRehash/unordered_set_prefixed_string/1024 -0.0563 -0.0561 182396 172136 181797 171602
BM_Rehash/unordered_set_string_arg/1024 -0.0244 -0.0241 27052 26393 26989 26339
BM_Rehash/unordered_set_int_arg/1024 -0.0410 -0.0410 19582 18779 19539 18738
BM_InsertDuplicate/unordered_set_int/1024 +0.0023 +0.0025 12168 12196 12142 12173
BM_InsertDuplicate/unordered_set_string/1024 -0.0505 -0.0504 189238 179683 188648 179133
BM_InsertDuplicate/unordered_set_prefixed_string/1024 -0.0989 -0.0987 198893 179222 198263 178702
BM_EmplaceDuplicate/unordered_set_int/1024 -0.0175 -0.0173 12674 12452 12646 12427
BM_EmplaceDuplicate/unordered_set_string/1024 -0.0559 -0.0557 190104 179481 189492 178934
BM_EmplaceDuplicate/unordered_set_prefixed_string/1024 -0.1111 -0.1110 201233 178870 200608 178341
BM_InsertDuplicate/unordered_set_int_insert_arg/1024 -0.0747 -0.0745 12993 12022 12964 11997
BM_InsertDuplicate/unordered_set_string_insert_arg/1024 -0.0584 -0.0583 191489 180311 190864 179731
BM_EmplaceDuplicate/unordered_set_int_insert_arg/1024 -0.0807 -0.0804 35946 33047 35866 32982
BM_EmplaceDuplicate/unordered_set_string_arg/1024 -0.0312 -0.0310 321623 311601 320559 310637
OVERALL_GEOMEAN -0.0276 -0.0275 0 0 0 0
```
The time differences look more like noise to me. But if we want to have these
optimizations in `find`, we probably want them in `insert` and `emplace` as
well.
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D140779
Edoardo Sanguineti [Tue, 4 Jul 2023 18:42:44 +0000 (20:42 +0200)]
[libc++] Use this in lambda capture in <latch>
"&" seemed to be used in a situation where perhaps it's not the best option.
Other libc++ modules make use of [this] when calling functions from the same class.
[this] would be the appropriate lambda capture specifier to use in this situation.
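A small illustration of the capture style, using a hypothetical `Counter` class rather than the actual `<latch>` code:

```cpp
#include <functional>

// Hypothetical class with a member that a predicate lambda needs to read.
class Counter {
public:
  explicit Counter(int start) : count(start) {}

  void countDown() { --count; }

  // [this] states exactly what the lambda uses: members of this object.
  // [&] would also compile, but it additionally permits capturing any local
  // variable by reference, which is broader than needed here.
  std::function<bool()> makeIsZeroPredicate() {
    return [this] { return count == 0; };
  }

private:
  int count;
};
```

The returned predicate is only valid while the `Counter` it was created from is alive, which holds for both capture forms.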
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D154358
Piotr Fusik [Tue, 4 Jul 2023 17:37:32 +0000 (19:37 +0200)]
[NFC][libc++] Fix whitespace in sstream
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D154455
Marius Brehler [Tue, 4 Jul 2023 16:25:05 +0000 (18:25 +0200)]
[mlir][emitc][nfc] Update summary of opaque type
With this patch error messages are improved. So far, error messages like
`operand #0 must An opaque type` can be generated.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D154453
Louis Dionne [Mon, 3 Jul 2023 20:55:30 +0000 (16:55 -0400)]
[libc++] Avoid including things that require a filesystem in filesystem_clock.cpp
The filesystem clock implementation should be available regardless of
whether a proper filesystem is available on the platform, so it makes
sense to try and avoid including things that are inherently filesystem-y
in the implementation of filesystem clock.
Differential Revision: https://reviews.llvm.org/D154390
David Green [Tue, 4 Jul 2023 17:13:10 +0000 (18:13 +0100)]
[TTI][AArch64] Add basic vector_reduce_fmaximum/vector_reduce_fminimum costmodelling
This adds some basic handling in TargetTransformInfo to treat
vector_reduce_fminimum/vector_reduce_fmaximum similarly to
vector_reduce_fmin/vector_reduce_fmax, getting better costs via
getMinMaxReductionCost.
Differential Revision: https://reviews.llvm.org/D153548
Joseph Huber [Tue, 4 Jul 2023 16:05:17 +0000 (11:05 -0500)]
[libc] Remove flaky static assert from RPC interface
Summary:
This function is intended to only be used on the GPU as a shorthand. The
static assert should only fire if it's called, but it seems that its
presence can sometimes cause issues and other times not. Simply remove
it as it's causing build problems.
Timm Bäder [Mon, 3 Jul 2023 10:43:20 +0000 (12:43 +0200)]
[clang][Interp][NFC] Move CastFP to Interp.h
It's not a Check* function, so try to stay consistent and move this to
the header.
Timm Bäder [Fri, 30 Jun 2023 06:26:36 +0000 (08:26 +0200)]
[clang][Interp][NFC] Return integer from Boolean::bitWidth()
Timm Bäder [Thu, 11 May 2023 12:44:39 +0000 (14:44 +0200)]
[clang][Interp][NFC] Fix GetFnPtr signature
Louis Dionne [Tue, 4 Jul 2023 15:25:15 +0000 (11:25 -0400)]
[libc++][NFC] Sort list of attribute macros in the .clang-format file
Timm Bäder [Tue, 2 May 2023 09:21:11 +0000 (11:21 +0200)]
[clang][Interp][NFC] Return a const pointer from Pointer::getRecord()
Louis Dionne [Tue, 4 Jul 2023 15:21:21 +0000 (11:21 -0400)]
Add clang-format commit to git blame ignore revs
Louis Dionne [Tue, 4 Jul 2023 15:19:10 +0000 (11:19 -0400)]
[libc++][NFC] clang-format <shared_mutex>
I am about to touch several lines in that file for a patch anyway, so
I might as well clang-format it upfront to avoid mixing styles after
my patch.
Markus Böck [Tue, 4 Jul 2023 15:04:47 +0000 (17:04 +0200)]
[mlir][LLVM] drop `llvm.intr.dbg.value` when promoting in `SROA` or `mem2reg`
This has previously been done for `llvm.intr.dbg.declare`, which is a common occurrence when the debug info points to the variable through the pointer, but may also occur when the `alloca` itself is a local variable in debug info.
Not doing so prevents `SROA` and `mem2reg` from promoting e.g. an `alloca`. We simply drop the value completely, since there is no meaningful debug info that can be constructed instead once the pointer value is removed.
Differential Revision: https://reviews.llvm.org/D154451
Anna Thomas [Tue, 4 Jul 2023 14:59:02 +0000 (10:59 -0400)]
[MetaRenamer] Rename only unnamed instructions in mode renaming instructions
6f9e743b91ad6ac1f333c introduced a mode which renames only instructions in
the function. This change updates that mode to skip instructions that are already named.
This serves the original purpose of the mode (rename-only-inst) which is:
1. Modify IR without failing verifier with serially ordered number
requirement (%1, %2, %3 required in order).
2. Give meaningful names to instructions.
Amaury Séchet [Tue, 4 Jul 2023 14:22:32 +0000 (14:22 +0000)]
[NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other.
Matthias Springer [Tue, 4 Jul 2023 14:45:59 +0000 (16:45 +0200)]
[mlir][linalg] Add test case: vectorize tensor.pad and bufferize to allocation
Add a test case that first vectorizes a `tensor.pad` op, then bufferizes it to a new allocation with a specified memory space.
Differential Revision: https://reviews.llvm.org/D154082
Lei Huang [Tue, 4 Jul 2023 14:36:32 +0000 (10:36 -0400)]
[PowerPC] add testcase for vector add and shift
Matthias Springer [Tue, 4 Jul 2023 14:36:00 +0000 (16:36 +0200)]
[mlir][linalg] BufferizeToAllocationOp: Do not copy uninitialized buffers
Tensors/buffers that do not have any defined contents (e.g., `tensor.empty`) are no longer copied.
Differential Revision: https://reviews.llvm.org/D154081
Matthias Springer [Tue, 4 Jul 2023 12:56:40 +0000 (14:56 +0200)]
[mlir][transform] Improve transform.get_closest_isolated_parent
* Rename op to `transform.get_parent_op`
* Match parents by "is isolated from above" and/or op name, or just the direct parent.
* Deduplication of result payload ops is optional.
Differential Revision: https://reviews.llvm.org/D154071
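The matching rules listed above can be sketched with a small model. This is an illustration in Python, not the MLIR implementation: the `Op` class, the function signature, and the choice to raise `LookupError` when no parent matches are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare ops by identity, as distinct payload ops
class Op:
    name: str
    isolated_from_above: bool = False
    parent: Optional["Op"] = None


def get_parent_op(targets, isolated_from_above=False, op_name=None,
                  deduplicate=False):
    """Find, for each target, the closest parent that passes all given filters.

    With no filter set, the first parent visited matches, i.e. the direct parent.
    """
    results = []
    for op in targets:
        parent = op.parent
        while parent is not None:
            iso_ok = not isolated_from_above or parent.isolated_from_above
            name_ok = op_name is None or parent.name == op_name
            if iso_ok and name_ok:
                break
            parent = parent.parent
        if parent is None:
            # Assumption of this sketch: a missing match is an error.
            raise LookupError(f"no matching parent for {op.name}")
        if deduplicate and any(parent is seen for seen in results):
            continue
        results.append(parent)
    return results
```

Without `deduplicate`, two targets nested in the same function yield that function twice; with it, the function appears once.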
Timm Bäder [Mon, 19 Jun 2023 07:15:04 +0000 (09:15 +0200)]
[clang][Interp][NFC] Return std::nullopt explicitly
Timm Bäder [Mon, 19 Jun 2023 06:57:50 +0000 (08:57 +0200)]
[clang][Interp][NFC] Add some missing const qualifiers
Timm Bäder [Mon, 19 Jun 2023 06:45:07 +0000 (08:45 +0200)]
[clang][Interp][NFC] Merge two if statements
Kadir Cetinkaya [Tue, 4 Jul 2023 13:50:47 +0000 (15:50 +0200)]
[clangd] Downgrade deprecated warnings to hints
This tries to improve adoption of noisy warnings in existing codebases.
Hints have a lot less visual clutter in most of the editors, and DiagnosticTags
already imply custom decorations per LSP.
Differential Revision: https://reviews.llvm.org/D154443
Christian Ulmann [Tue, 4 Jul 2023 12:45:35 +0000 (12:45 +0000)]
[mlir][LLVM] Stop importing module location for all unknown locs
This commit changes the LLVM IR import to use UnknownLoc for missing
debug locations. This change ensures that we do not accidentally
introduce faulty locations that can influence debugging post export.
This behavior change is not applied to locations of global metadata
operations, as their location will not be exported.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D154416
David Green [Tue, 4 Jul 2023 14:02:30 +0000 (15:02 +0100)]
[CostModel] Use min/max intrinsics for vecreduce.min/max costs
This changes the costmodelling of the vecreduce.min/max nodes to use the costs
of the relevant min/max intrinsics instead of expanding them to compare and
selects. The getMinMaxReductionCost functions have changed to take an Opcode for the
relevant intrinsic, dropping the IsUnsigned and CondTy parameters as they are
no longer needed.
A follow up patch will add some basic fminimum/fmaximum costmodelling.
Differential Revision: https://reviews.llvm.org/D153547
Stephen Thomas [Tue, 4 Jul 2023 10:43:09 +0000 (11:43 +0100)]
[AMDGPU] Fix incorrect hazard mitigation
GCNHazardRecognizer::fixVcmpxExecWARHazard() mitigates a specific hazard
by inserting a wait on sa_sdst==0 if such a wait isn't already present.
Unfortunately, the check for an existing wait incorrectly checks for one
that doesn't actually care about sa_sdst itself, but requires that no
other counters are waited for.
Once the check is performed correctly, a lit test needs to be updated,
since it is currently testing for the incorrect behaviour.
Differential Revision: https://reviews.llvm.org/D154438
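The difference between the two checks can be illustrated with a toy bitfield model. The layout below is hypothetical and is not taken from the AMDGPU backend: it assumes a packed immediate in which bit 0 holds sa_sdst, the remaining bits hold the other counters, and all-ones in a field means "do not wait on this counter". The function names are also made up for this sketch.

```python
SA_SDST_MASK = 0x0001  # hypothetical: bit 0 holds the sa_sdst field
OTHER_MASK = 0xFFFE    # hypothetical: the remaining bits hold other counters


def has_sa_sdst_wait_buggy(imm):
    # Ignores sa_sdst entirely and only accepts immediates in which
    # no other counter is waited for.
    return (imm & OTHER_MASK) == OTHER_MASK


def has_sa_sdst_wait(imm):
    # Decodes the sa_sdst field and checks that it is 0, regardless of
    # what the other counters are set to.
    return (imm & SA_SDST_MASK) == 0
```

Under these assumptions, an immediate of `0xFFFF` (no wait at all) passes the buggy check, while `0x0000` (wait on everything, including sa_sdst) fails it; the decoded check gets both cases right.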
Hans Wennborg [Tue, 4 Jul 2023 08:46:07 +0000 (10:46 +0200)]
[libc++] Disable tree invariant check in asserts mode
This is a follow-up to D153672 which removed the old debug mode and
moved many of those checks to the regular asserts mode.
The tree invariant check is too expensive for the regular asserts mode,
making element removal O(n) instead of O(log n), so disable it until
there is a new debug assert category it can be put in.
Differential revision: https://reviews.llvm.org/D154417
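The cost difference can be seen in a small model that counts node visits. This is a plain binary search tree in Python, not libc++'s tree implementation; `Node`, `build_balanced`, `search_visits`, and `invariant_visits` are names invented for this sketch.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def build_balanced(keys):
    """Build a balanced search tree from a sorted list of keys."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build_balanced(keys[:mid]),
                build_balanced(keys[mid + 1:]))


def search_visits(root, key):
    """Count the nodes touched while walking down to `key`: O(log n) if balanced."""
    visits = 0
    node = root
    while node is not None:
        visits += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return visits


def invariant_visits(root, lo=None, hi=None):
    """Verify the ordering invariant for the whole tree; returns nodes visited: O(n)."""
    if root is None:
        return 0
    assert (lo is None or lo < root.key) and (hi is None or root.key < hi)
    return (1 + invariant_visits(root.left, lo, root.key)
            + invariant_visits(root.right, root.key, hi))
```

For a balanced tree with 1023 keys, locating one key touches at most 10 nodes, while a full invariant check touches all 1023, so running the check on every removal dominates the cost of the removal itself.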
Aleksandr Popov [Mon, 3 Jul 2023 08:07:57 +0000 (10:07 +0200)]
[NFC][IRCE] Extract 'IV vs Limit' parsing to a separate method
Next step of the preparatory refactoring for the upcoming support of a
new range check form to parse.
This change isolates the logic of 'IV vs Limit' range check parsing to
simplify adding parsers for new range check forms.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D154160
LLVM GN Syncbot [Tue, 4 Jul 2023 13:10:45 +0000 (13:10 +0000)]
[gn build] Port
7a72ce98224b
Tom Weaver [Tue, 4 Jul 2023 13:05:54 +0000 (14:05 +0100)]
Revert "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit
2fd614efc1bb9c27f1bc6c3096c60a7fe121e274.
Commit caused failures on the following two build bots:
http://45.33.8.238/win/80815/step_7.txt
https://lab.llvm.org/buildbot/#/builders/139/builds/44269
Aleksandr Popov [Mon, 3 Jul 2023 04:48:13 +0000 (06:48 +0200)]
[NFC][IRCE] Check that Index is AddRec in the parseRangeCheckICmp
Next step of the preparatory refactoring for the upcoming support of
a new range check form to parse.
Previous one: https://reviews.llvm.org/D154156
With this change we avoid meaningless parsing after realizing that Index
is not an AddRec.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D154158
Florian Hahn [Tue, 4 Jul 2023 13:01:05 +0000 (14:01 +0100)]
[LV] Regenerate check lines to reduce diff.
Regenerate checks to avoid unnecessary changes in D154264.
Joseph Huber [Tue, 4 Jul 2023 12:59:11 +0000 (07:59 -0500)]
[Libomptarget] Fix misused macro name preventing printing of library name
Summary:
This code used `LIBOMPTARGET_DEBUG` which is not the macro name, but the
environment variable. This caused this portion to always be disabled. In
the long run we should aim for this to always be available as it's
useful for other diagnostic messages.
Matthias Springer [Tue, 4 Jul 2023 12:48:56 +0000 (14:48 +0200)]
[mlir][linalg] BufferizeToAllocationOp: Support vector.mask
This op needs special handling because the allocation for the masked op must be placed outside of the mask op.
Differential Revision: https://reviews.llvm.org/D154058
Matthias Springer [Tue, 4 Jul 2023 12:40:49 +0000 (14:40 +0200)]
[mlir][linalg] BufferizeToAllocation: Bufferize non-allocating ops
Until now, only `tensor.pad` ops could be bufferized to an allocation. This revision adds support for all bufferizable ops that do not already bufferize to an allocation. (Those still need special handling.)
Differential Revision: https://reviews.llvm.org/D153971
Martin Braenne [Tue, 4 Jul 2023 12:14:17 +0000 (12:14 +0000)]
[clang][dataflow] Make `runDataflowReturnError()` a non-template function.
It turns out this didn't need to be a template at all.
Likewise, change callers so they're non-template functions.
Also, correct / clarify some comments in RecordOps.h.
This is in response to post-commit comments on https://reviews.llvm.org/D153006.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D154339
Matthias Springer [Tue, 4 Jul 2023 12:34:00 +0000 (14:34 +0200)]
[mlir][linalg] BufferizeToAllocationOp: Bufferize ops, not values
The `bufferize_to_allocation` transform op now operates on payload ops, not payload values. Only ops can be bufferized, not values.
Also remove the `replacement` result from the transform op.
Differential Revision: https://reviews.llvm.org/D153970
Haojian Wu [Tue, 4 Jul 2023 11:16:05 +0000 (13:16 +0200)]
[clang-tidy] Don't emit the whole spelling include header in include-cleaner diagnostic message
This keeps the message short and consistent with clangd. Since the diagnostics are
attached to the #include line, users have enough context to understand the whole #include.
Differential Revision: https://reviews.llvm.org/D154434
Alex Bradbury [Tue, 4 Jul 2023 12:28:12 +0000 (13:28 +0100)]
[RISCV][NFC] Fix doc comment for RISCVDAGToDAGISel::selectSETCC
The doc comment referred to a boolean parameter that has since been
replaced with an ISD::CondCode.
Matthias Springer [Tue, 4 Jul 2023 12:18:32 +0000 (14:18 +0200)]
[mlir][linalg] Return tensor::PadOp handle from transform op
"transform.structured.pad" now returns all `tensor::PadOp` in addition to the padded ops.
Also add a test case that shows how to force an allocation for "tensor.pad" ops with a custom memory space.
Differential Revision: https://reviews.llvm.org/D153555
Matthias Springer [Tue, 4 Jul 2023 12:03:02 +0000 (14:03 +0200)]
[mlir][NFC] Use `getConstantIntValue` instead of casting to `ConstantIndexOp`
`getConstantIntValue` extracts constant values from all constant-like ops, not just `arith::ConstantIndexOp`.
Differential Revision: https://reviews.llvm.org/D154356
Martin Braenne [Tue, 4 Jul 2023 10:41:05 +0000 (10:41 +0000)]
[clang][dataflow] Add a test for a struct that is directly self-referential through a reference.
The ongoing migration to strict handling of value
categories (see https://discourse.llvm.org/t/70086) will change the way we
handle fields of reference type, and I want to put a test in place that makes
sure we continue to handle this special case correctly.
Depends On D154420
Reviewed By: gribozavr2, xazax.hun
Differential Revision: https://reviews.llvm.org/D154421
Martin Braenne [Tue, 4 Jul 2023 10:40:19 +0000 (10:40 +0000)]
[clang][dataflow] Model variables / fields / funcs used in default initializers.
The newly added test fails without the other changes in this patch.
Reviewed By: sammccall, gribozavr2
Differential Revision: https://reviews.llvm.org/D154420
Quentin Colombet [Mon, 3 Jul 2023 16:19:00 +0000 (18:19 +0200)]
[mlir][arith] Move getNeutralElement from Linalg utils to arith
This consolidates where this kind of implementation lives and
refactors the code to share more of it.
NFC
Differential Revision: https://reviews.llvm.org/D154362
Benjamin Kramer [Tue, 4 Jul 2023 11:34:03 +0000 (13:34 +0200)]
Jay Foad [Wed, 21 Jun 2023 20:16:08 +0000 (21:16 +0100)]
[AMDGPU] Do not wait for vscnt on function entry and return
SIInsertWaitcnts inserts waitcnt instructions to resolve data
dependencies. The GFX10+ vscnt (VMEM store count) counter is never used
in this way. It is only used to resolve memory dependencies, and that is
handled by SIMemoryLegalizer. Hence there is no need to conservatively
wait for vscnt to be 0 on function entry and before returns.
Differential Revision: https://reviews.llvm.org/D153537
Alexey Lapshin [Sat, 1 Jul 2023 10:20:50 +0000 (12:20 +0200)]
[DWARFLinker][NFC] Remove RangesTy &getValidAddressRanges().
This patch simplifies line table generation. It removes the global
array of all unit ranges (RangesTy &getValidAddressRanges()).
The comment says that the global array of all unit ranges is necessary
to handle corner cases inside line table rows. Removing that
special handling shows that its current usage is handling of the
"end of range" case, which is already handled correctly
(without special handling). .debug_line tables for a clang binary
built with and without this patch are equal.
Differential Revision: https://reviews.llvm.org/D154288
Jan Svoboda [Tue, 4 Jul 2023 10:33:56 +0000 (12:33 +0200)]
[clang][modules] Mark fewer identifiers as out-of-date
In `clang-scan-deps` contexts, the number of interesting identifiers in PCM files is fairly low (only macros), while the number of identifiers in the importing instance is high (builtins). Marking the whole identifier table out-of-date triggers lots of benign and expensive calls to `ASTReader::updateOutOfDateIdentifiers()`. (That unfortunately happens even for unused identifiers due to the `SemaRef.IdResolver.begin(II)` line in `ASTWriter::WriteASTCore()`.)
This patch makes the main code path more similar to C++ modules, where the PCM files have an `INTERESTING_IDENTIFIERS` section which lists identifiers that get created in the identifier table of the importing instance and marked as out-of-date. The only difference is that the main code path doesn't *create* identifiers in the table and relies on the importing instance calling `ASTReader::get()` when creating a new identifier on demand. It only marks existing identifiers as out-of-date.
This speeds up `clang-scan-deps` by 5-10%.
Reviewed By: Bigcheese, benlangmuir
Differential Revision: https://reviews.llvm.org/D151277
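The change in marking strategy can be sketched with a toy identifier table. This is a Python model, not Clang code: the table is a dict from identifier name to an out-of-date flag, and `mark_all_out_of_date` and `mark_interesting_out_of_date` are hypothetical names for the old and new behaviour.

```python
def mark_all_out_of_date(table):
    """Old behaviour in this model: every known identifier becomes out-of-date."""
    for name in table:
        table[name] = True


def mark_interesting_out_of_date(table, interesting):
    """New behaviour in this model: only mark identifiers that the PCM lists
    as interesting and that already exist in the table; create nothing."""
    for name in interesting:
        if name in table:
            table[name] = True
```

With a table full of builtins and a PCM that only lists a couple of macros, the second function marks a handful of entries instead of the whole table, so far fewer identifiers need updating later.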