Mircea Trofin [Fri, 16 Sep 2022 04:01:56 +0000 (21:01 -0700)]
[lld][thinlto] Include -mllvm options in the thinlto cache key
They may modify thinlto optimization.
This patch only extends support for `-mllvm`. There is another way to
pass llvm flags, `-plugin-opt`, but its processing is different and will
be provided in a subsequent patch.
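To illustrate why the options belong in the key, here is a minimal sketch of a cache key that hashes the module contents together with the option list. It is not lld's actual key computation; `cache_key` and its parameters are hypothetical names chosen for this example.

```python
import hashlib
from typing import Sequence


def cache_key(module_bytes: bytes, opt_level: int,
              mllvm_options: Sequence[str]) -> str:
    """Toy cache key (hypothetical): hash every input that can change codegen."""
    h = hashlib.sha1()
    h.update(module_bytes)
    h.update(str(opt_level).encode())
    # Each option is fed into the hash, so changing an option changes the key
    # and a stale cached object is not reused.
    for opt in mllvm_options:
        h.update(b"\0" + opt.encode())
    return h.hexdigest()
```

With a key built this way, two links that differ only in their `-mllvm` options no longer share cache entries.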
Differential Revision: https://reviews.llvm.org/D134013
Haojian Wu [Mon, 19 Sep 2022 18:56:39 +0000 (20:56 +0200)]
Fix one more unused warning in release build, NFC
Haojian Wu [Mon, 19 Sep 2022 18:44:59 +0000 (20:44 +0200)]
Fix an unused warning in release build, NFC
Fangrui Song [Mon, 19 Sep 2022 18:41:16 +0000 (11:41 -0700)]
[Object] Add zstd decompression support to Decompressor
llvm::object::Decompressor is used by many DWARF consumers like llvm-dwarfdump,
llvm-dwp, llvm-symbolizer. Add tests to them. The lldb test can be left to
D133530.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D134116
Wei Yi Tee [Mon, 19 Sep 2022 18:13:50 +0000 (18:13 +0000)]
[clang][dataflow] Modify `transfer` in `DataflowModel` to take `CFGElement` as input instead of `Stmt`.
To keep API of transfer functions consistent.
The single use of this transfer function in `ChromiumCheckModel` is also updated.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D133933
jeff [Mon, 19 Sep 2022 18:08:55 +0000 (18:08 +0000)]
[AMDGPU] [DAGCombiner] Precommit test for D133584
Change-Id: I488ac9b23718f8d0b28db034c4cc455ae736e785
Nuno Lopes [Mon, 19 Sep 2022 18:25:14 +0000 (19:25 +0100)]
add test for -enable-global-analyses=0 [NFC]
Krzysztof Parzyszek [Fri, 2 Sep 2022 19:04:49 +0000 (12:04 -0700)]
[Hexagon] Implement [SU]INT_TO_FP and FP_TO_[SU]INT for HVX
Nirvedh Meshram [Fri, 16 Sep 2022 19:18:43 +0000 (12:18 -0700)]
[mlir][spirv] fix ordering in Intel joint matrix ops
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D134069
Wei Yi Tee [Mon, 19 Sep 2022 17:36:50 +0000 (17:36 +0000)]
[clang][dataflow] Replace `transfer(const Stmt *, ...)` with `transfer(const CFGElement *, ...)` in `clang/Analysis/FlowSensitive`.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D133931
Vitaly Buka [Thu, 15 Sep 2022 17:39:16 +0000 (10:39 -0700)]
[IRBuilder] Use PoisonValue in CreateMasked*
Followup to
72b776168c7c80d2035c7226488462dcffc97e75
Reviewed By: nlopes
Differential Revision: https://reviews.llvm.org/D133967
Kazu Hirata [Mon, 19 Sep 2022 17:42:49 +0000 (10:42 -0700)]
Fix unused variable warnings:
This patch fixes warnings during a release build:
mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp:198:52: error:
lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
bolt/lib/Rewrite/RewriteInstance.cpp:5318:18: error: unused variable
'HasNoAddress' [-Werror,-Wunused-variable]
Stanley Winata [Mon, 19 Sep 2022 17:28:55 +0000 (13:28 -0400)]
[mlir][spirv] Lower arith max/min ops to OpenCL ones
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D132881
Wei Yi Tee [Mon, 19 Sep 2022 16:56:35 +0000 (16:56 +0000)]
[clang][dataflow] Replace usage of deprecated functions with the optional check
- Update `transfer` and `diagnose` to take `const CFGElement *` as input in `Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel`.
- Update `clang-tools-extra/clang-tidy/bugprone/UncheckedOptionalAccessCheck.cpp` accordingly.
- Rename `runDataflowAnalysisOnCFG` to `runDataflowAnalysis` and remove the deprecated `runDataflowAnalysis` (this was only used by the now updated optional check).
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D133930
Stanley Winata [Mon, 19 Sep 2022 17:14:58 +0000 (13:14 -0400)]
[mlir][spirv] Support OpenCL when lowering memref load/store
- Add awareness of Kernel vs Shader capability for memref to SPIR-V
lowering.
- Add lowering using spv.PtrAccessChain for Kernel capability.
- Enable lowering from scalar pointee types for kernel capabilities.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D132714
Haojian Wu [Mon, 19 Sep 2022 13:04:42 +0000 (15:04 +0200)]
[clang] Fix a nullptr-access crash in CheckTemplateArgument.
It is possible to pass a null ParamType to
CheckNonTypeTemplateParameter -- the ParamType var can be reset to a null
type on Line 6940, and the subsequent bailout `if` is not entered.
Differential Revision: https://reviews.llvm.org/D134180
Florian Hahn [Mon, 19 Sep 2022 17:14:34 +0000 (18:14 +0100)]
[LV] Keep track of cost-based ScalarAfterVec in VPWidenPointerInd.
Epilogue vectorization uses isScalarAfterVectorization to check if
widened versions for inductions need to be generated and bails out in
those cases.
At the moment, there are scenarios where isScalarAfterVectorization
returns true but VPWidenPointerInduction::onlyScalarsGenerated would
return false, causing widening.
This can lead to widened phis with incorrect start values being created
in the epilogue vector body.
This patch addresses the issue by storing the cost-model decision in
VPWidenPointerInductionRecipe and restoring the behavior before 151c144.
This effectively reverts 151c144, but the long-term fix is to properly
support widened inductions during epilogue vectorization.
Fixes #57712.
Craig Topper [Mon, 19 Sep 2022 17:10:57 +0000 (10:10 -0700)]
[LangRef] Clarify that noimplicitfloat disables all implicit vectors not just floating point.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D134086
Simon Pilgrim [Mon, 19 Sep 2022 17:13:05 +0000 (18:13 +0100)]
[LoopIdiom] Add non-LZCNT target test coverage
Krzysztof Parzyszek [Mon, 19 Sep 2022 17:05:03 +0000 (10:05 -0700)]
[Hexagon] Add HVX patterns for ISD::ABS
Vitaly Buka [Mon, 19 Sep 2022 17:10:42 +0000 (10:10 -0700)]
[test][clangd] Join back -Xclang and -undef
Craig Topper [Mon, 19 Sep 2022 16:45:27 +0000 (09:45 -0700)]
[LV] Remove FIXME about NoImplicitFloat. NFC
My understanding is that NoImplicitFloat, despite its name, is
supposed to disable all vectors not just float vectors.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D134084
zhijian [Mon, 19 Sep 2022 17:00:25 +0000 (13:00 -0400)]
Fixed llvm-nm.rst:145:Block quote ends without a blank line; unexpected unindent.
ninja: build stopped: subcommand failed.
Lei Zhang [Mon, 19 Sep 2022 16:48:43 +0000 (12:48 -0400)]
[mlir][tensor] Fold round-tripping extract/insert slice ops
Reviewed By: ThomasRaoux, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D133909
Mathieu Fehr [Mon, 19 Sep 2022 16:47:37 +0000 (09:47 -0700)]
[mlir] Add Dynamic Dialects
Dynamic dialects are dialects that can be defined at runtime.
Dynamic dialects are extensible by new operations, types, and
attributes at runtime.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125201
Michał Górny [Mon, 19 Sep 2022 16:00:47 +0000 (18:00 +0200)]
[clang] Make config-related options CoreOptions
Make `--config`, `--no-default-config` and `--config-*-dir` CoreOptions
to enable their availability to all clang driver modes. This improves
consistency given that the default set of configuration files is
processed independently of mode anyway.
Differential Revision: https://reviews.llvm.org/D134191
Sebastian Peryt [Mon, 19 Sep 2022 16:28:24 +0000 (09:28 -0700)]
[NFC][1/n] Remove -enable-new-pm=0 flags from lit tests
This is the first patch in a series intended for removing flag
-enable-new-pm=0 from lit tests. This is part of a bigger
effort of completely removing legacy code related to legacy
pass manager in favor of currently default new pass manager.
In this patch the flag has been removed only from tests where no significant
change was required because the checks had been duplicated for
both PMs.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D134150
Felipe de Azevedo Piovezan [Tue, 13 Sep 2022 15:30:07 +0000 (11:30 -0400)]
[lldb] Reset breakpoint hit count before new runs
A common debugging pattern is to set a breakpoint that only stops after
a number of hits is recorded. The current implementation never resets
the hit count of breakpoints; as such, if a user re-`run`s their
program, the debugger will never stop on such a breakpoint again.
This behavior is arguably undesirable, as it renders such breakpoints
ineffective on all but the first run. This commit changes the
implementation of the `Will{Launch, Attach}` methods so that they reset
the _target's_ breakpoint hitcounts.
Differential Revision: https://reviews.llvm.org/D133858
mydeveloperday [Mon, 19 Sep 2022 16:48:58 +0000 (17:48 +0100)]
[clang-format] JSON formatting add new option for controlling newlines in json arrays
When working in a mixed vscode/vim environment with a team-configured prettier configuration, clang-format and prettier can end up fighting each other over the formatting of arrays, including simple arrays of elements.
This review aims to add some "control knobs" to the JSON formatting in clang-format to help align the two tools so they can be used interchangeably.
This will allow simple arrays such as `[1, 2, 3]` to remain on a single line, but will break arrays based on the context within that array.
Happy to change the name of the option (this is the third name I tried)
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D133589
Keith Smiley [Thu, 21 Jul 2022 03:09:03 +0000 (20:09 -0700)]
[ORC] Fix macho section name typo
I don't think __obj_selrefs is a thing, but __objc_selrefs definitely
is.
Differential Revision: https://reviews.llvm.org/D130221
Rahul Joshi [Fri, 16 Sep 2022 23:02:35 +0000 (16:02 -0700)]
Use isa<> instead of dyn_cast
Differential Revision: https://reviews.llvm.org/D134092
Krzysztof Parzyszek [Mon, 19 Sep 2022 16:35:23 +0000 (09:35 -0700)]
[Hexagon] Rework SplitHvxPairOp to be a general vector splitting utility
Enable creating an idiom: V -> opJoin(SplitVectorOp(V))
Simon Pilgrim [Mon, 19 Sep 2022 16:37:54 +0000 (17:37 +0100)]
[CostModel][X86] Add CostKinds handling for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF instructions
This was achieved with the 'cost-tables vs llvm-mca' script D103695
Xiang Li [Sat, 10 Sep 2022 05:54:34 +0000 (22:54 -0700)]
[clang] Allow vector of BitInt
Remove the check which disables BitInt as an element type for ext_vector.
Enabling it for HLSL to use _BitInt(16) as 16bit int at https://reviews.llvm.org/D133668
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D133634
zhijian [Mon, 19 Sep 2022 16:14:02 +0000 (12:14 -0400)]
Fixed a compiler error as described in
https://lab.llvm.org/buildbot/#/builders/174/builds/13432
XCOFFObjectFile.cpp:805:12: error: reinterpret_cast from 'unsigned long' to 'uintptr_t' (aka 'unsigned int') is not allowed
return reinterpret_cast<uintptr_t>(0ul);
Florian Hahn [Mon, 19 Sep 2022 16:12:31 +0000 (17:12 +0100)]
[LV] Move new epilog-vectorization-widen-inductions.ll to AArch64 dir.
The test requires the AArch64 backend, so move it to the right subdir.
Florian Hahn [Mon, 19 Sep 2022 16:10:40 +0000 (17:10 +0100)]
[LV] Add tests for epilogue vectorization with widened inductions.
Includes a test for the miscompile in #57712.
Krzysztof Parzyszek [Mon, 19 Sep 2022 16:03:12 +0000 (09:03 -0700)]
[Hexagon] Use proper output chain when widening HVX loads
Mingming Liu [Mon, 19 Sep 2022 05:59:24 +0000 (22:59 -0700)]
[NFC] Use opaqueptr in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
Use opaqueptr for test case
llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll.
- Adjust variable numbers accordingly, since bitcasts between different pointer
types are no longer necessary.
Differential Revision: https://reviews.llvm.org/D134159
zhijian [Mon, 19 Sep 2022 15:57:45 +0000 (11:57 -0400)]
Fixed a compiler error as described in
https://lab.llvm.org/buildbot/#/builders/216/builds/9977
XCOFFObjectFile.cpp: error C3487: 'unsigned long': all return expressions must deduce to the same type: previously it was 'uintptr_t'
Katherine Rasmussen [Mon, 12 Sep 2022 15:18:30 +0000 (08:18 -0700)]
[flang] Write semantics test for atomic_and
Write a semantics test for the atomic intrinsic subroutine,
atomic_and.
Reviewed By: rouson
Differential Revision: https://reviews.llvm.org/D133727
Simon Pilgrim [Mon, 19 Sep 2022 15:44:03 +0000 (16:44 +0100)]
[CostModel][X86] Add CostKinds handling for vector ctlz instructions
This was achieved with the 'cost-tables vs llvm-mca' script D103695
spupyrev [Fri, 15 Jul 2022 19:26:40 +0000 (12:26 -0700)]
[BOLT] Unifying implementations of ext-tsp
After BOLT's merge to LLVM, there are two (almost identical) versions of the
code layout algorithm. The diff unifies the implementations by keeping the one
in LLVM.
There are mild changes in the resulting block orders. I tested the changes
extensively both on the clang binary and on prod services, and did not see
statistically significant differences on average.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D129895
zhijian [Mon, 19 Sep 2022 15:27:19 +0000 (11:27 -0400)]
[AIX] llvm-nm support environment "OBJECT_MODE" for option -X on AIX OS
Summary:
According to the AIX nm documentation, https://www.ibm.com/docs/en/aix/7.2?topic=n-nm-command
On AIX, the default is to process 32-bit object files (ignoring 64-bit objects). The mode can also be set with the OBJECT_MODE environment variable. For example, OBJECT_MODE=64 causes nm to process any 64-bit objects and ignore 32-bit objects. The -X flag overrides the OBJECT_MODE variable.
On non-AIX systems, the default is to process all supported object files, and the OBJECT_MODE environment variable is not supported.
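The precedence described above can be sketched as follows. This is an illustration only, not llvm-nm's code: `resolve_object_mode` is a hypothetical helper, and the mode strings ("32", "64", "any") are assumptions modelled on the values the -X option accepts.

```python
from typing import Mapping, Optional


def resolve_object_mode(x_flag: Optional[str],
                        env: Mapping[str, str],
                        is_aix: bool) -> str:
    """Pick the object mode (hypothetical helper mirroring the summary)."""
    # An explicit -X flag always wins over the environment.
    if x_flag is not None:
        return x_flag
    if is_aix:
        # On AIX, OBJECT_MODE is honoured; the default is 32-bit objects.
        return env.get("OBJECT_MODE", "32")
    # Elsewhere, OBJECT_MODE is ignored and all object files are processed.
    return "any"
```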
Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D132494
Tue Ly [Mon, 19 Sep 2022 15:20:41 +0000 (11:20 -0400)]
[libc][Obvious] Fix exp10f spec.
Louis Dionne [Tue, 6 Sep 2022 21:07:18 +0000 (17:07 -0400)]
[libc++] Always query the compiler to find whether a type is always lockfree
In https://llvm.org/D56913, we added an emulation for the __atomic_always_lock_free
compiler builtin when compiling in Freestanding mode. However, the emulation
did not (and could not) give exactly the same answer as the compiler builtin,
which led to a potential ABI break for e.g. enum classes.
After speaking to the original author of D56913, we agree that the correct
behavior is to instead always use the compiler builtin, since that provides
a more accurate answer, and __atomic_always_lock_free is a purely front-end
builtin which doesn't require any runtime support. Furthermore, it is
available regardless of the Standard mode (see https://godbolt.org/z/cazf3ssYY).
However, this patch does constitute an ABI break. As shown by https://godbolt.org/z/1eoex6zdK:
- In LLVM <= 11.0.1, an atomic<enum class with 1 byte> would not contain a lock byte.
- In LLVM >= 12.0.0, an atomic<enum class with 1 byte> would contain a lock byte.
This patch breaks the ABI again to bring it back to 1 byte, which seems
like the correct thing to do.
Fixes #57440
Differential Revision: https://reviews.llvm.org/D133377
Simon Pilgrim [Mon, 19 Sep 2022 15:00:36 +0000 (16:00 +0100)]
Fix MSVC warning "all return expressions must deduce to the same type"
Sam McCall [Wed, 7 Sep 2022 18:03:44 +0000 (20:03 +0200)]
[clangd] Allow programmatically disabling rename of virtual method hierarchies.
This feature relies on Relations in the index being complete.
An out-of-tree index implementation is missing some override relations, so
such renames end up breaking the code.
We plan to fix it, but this flag is a cheap band-aid for now.
Differential Revision: https://reviews.llvm.org/D133440
Simon Pilgrim [Mon, 19 Sep 2022 14:56:55 +0000 (15:56 +0100)]
[CostModel][X86] Add CostKinds handling for cttz
This was achieved with the 'cost-tables vs llvm-mca' script D103695
zhijian [Mon, 19 Sep 2022 14:55:48 +0000 (10:55 -0400)]
[AIX] llvm-readobj support a new option --exception-section for xcoff object file.
Summary:
llvm-readobj support a new option --exception-section for xcoff object file.
https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format#XCOFF__iua3i23ajbau
Reviewers: James Henderson,Paul Scoropan
Differential Revision: https://reviews.llvm.org/D133030
Sam McCall [Thu, 15 Sep 2022 22:41:32 +0000 (00:41 +0200)]
[clangd] Improve inlay hints of things expanded from macros
When we aim a hint at some expanded tokens, we're only willing to attach it
to spelled tokens that exactly correspond.
e.g.
int zoom(int x, int y, int z);
int dummy = zoom(NUMBERS);
Here (assuming NUMBERS is a macro that expands to all three arguments,
starting with 1) we want to place a hint "x:" on the expanded "1", but we
shouldn't be willing to place it on NUMBERS, because it doesn't *exactly*
correspond (it has more tokens).
Fortunately we don't even have to implement this algorithm from scratch,
TokenBuffer has it.
Fixes https://github.com/clangd/clangd/issues/1289
Fixes https://github.com/clangd/clangd/issues/1118
Fixes https://github.com/clangd/clangd/issues/1018
Differential Revision: https://reviews.llvm.org/D133982
Benjamin Kramer [Mon, 19 Sep 2022 14:38:20 +0000 (16:38 +0200)]
[bazel] Port
233de4e808b3
Guray Ozen [Mon, 19 Sep 2022 10:19:21 +0000 (12:19 +0200)]
[mlir] Add map_nested_foreach_thread_to_gpu_threads op to transform dialect
This revision adds a new op `map_nested_foreach_thread_to_gpu_threads` to the transform dialect. The op searches `scf.foreach_threads` inside the `gpu_launch` and distributes them with `gpu.thread_id` attribute.
Loop mapping is explicit and given by the `map_nested_foreach_thread_to_gpu_threads` op. Mapping is done one-to-one, therefore the loops disappear.
Trip counts that are dynamic or larger than the thread sizes are not supported for the time being. However, in the future the compiler could support these cases by generating an inner loop with static cyclic scheduling.
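As a rough illustration of what cyclic scheduling means here (not the op's implementation, which does not support this yet), each thread would take iterations `t, t + T, t + 2T, ...` for `T` threads. The function name below is hypothetical.

```python
from typing import List


def cyclic_schedule(trip_count: int, num_threads: int) -> List[List[int]]:
    """Return, per thread, the loop iterations it would execute."""
    # Thread t starts at iteration t and strides by the number of threads,
    # so trip counts larger than the thread count are still fully covered.
    return [list(range(t, trip_count, num_threads))
            for t in range(num_threads)]
```

When the trip count equals the number of threads, every thread gets exactly one iteration, which is the one-to-one mapping the op currently performs.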
The current mechanism allows `scf.foreach_threads` to be siblings or nested. There cannot be interleaving code between the loops when they are nested.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D133950
Tue Ly [Mon, 19 Sep 2022 14:13:29 +0000 (10:13 -0400)]
[libc][Obvious] Remove constexpr qualifier from Exp10Base::powb_lo.
Tue Ly [Sat, 17 Sep 2022 05:59:54 +0000 (01:59 -0400)]
[libc][math] Implement exp10f function correctly rounded to all rounding modes.
Implement exp10f function correctly rounded to all rounding modes.
Algorithm: perform range reduction to reduce
```
10^x = 2^(hi + mid) * 10^lo
```
where:
```
hi is an integer,
0 <= mid * 2^5 < 2^5
-log10(2) / 2^6 <= lo <= log10(2) / 2^6
```
Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is
performed by adding `hi` into the exponent field of `2^mid`.
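A minimal sketch of that exponent-field trick, assuming IEEE-754 binary64 values and no overflow or underflow into subnormals; `scale_by_pow2` is a hypothetical helper, not a function from the patch.

```python
import struct


def scale_by_pow2(value: float, hi: int) -> float:
    """Multiply a normal double by 2**hi by editing its exponent bits."""
    # Reinterpret the double as a 64-bit unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    # The biased exponent occupies bits 52..62, so adding hi << 52 adds hi
    # to the exponent and leaves the mantissa untouched.
    bits += hi << 52
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```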
`10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with:
```
> P = fpminimax((10^x - 1)/x, 4, [|D...|], [-log10(2)/64, log10(2)/64]);
```
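The range reduction above can be sketched in double precision as follows. This is only an illustration under stated assumptions: it uses a degree-5 Taylor polynomial in place of the Sollya minimax polynomial, it makes no attempt at correct rounding, and `exp10_sketch` and `EXP2_MID` are hypothetical names.

```python
import math

# 2^(i/32) for i = 0..31, standing in for the 32-entry table of 2^mid.
EXP2_MID = [2.0 ** (i / 32.0) for i in range(32)]


def exp10_sketch(x: float) -> float:
    """Approximate 10**x as 2^(hi + mid) * 10^lo (not correctly rounded)."""
    # k = round(x * log2(10) * 32), so that hi + mid = k / 32.
    k = round(x * math.log2(10.0) * 32.0)
    hi = k >> 5            # integer part (floor, also for negative k)
    mid_index = k & 31     # mid * 2^5, an index into the table
    # lo = x - (hi + mid) * log10(2), which satisfies |lo| <= log10(2) / 2^6.
    lo = x - k * (math.log10(2.0) / 32.0)
    # 10^lo = e^t with t = lo * ln(10); degree-5 Taylor polynomial in Horner
    # form (an assumption swapped in for the minimax polynomial).
    t = lo * math.log(10.0)
    poly = 1.0 + t * (1.0 + t / 2.0 * (1.0 + t / 3.0 *
                      (1.0 + t / 4.0 * (1.0 + t / 5.0))))
    # Scale 2^mid * 10^lo by 2^hi.
    return math.ldexp(EXP2_MID[mid_index] * poly, hi)
```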
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.215
System LIBC reciprocal throughput : 7.944
LIBC reciprocal throughput : 38.538
LIBC reciprocal throughput : 12.175 (with `-msse4.2` flag)
LIBC reciprocal throughput : 9.862 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 40.744
System LIBC latency : 37.546
BEFORE
LIBC latency : 48.989
LIBC latency : 44.486 (with `-msse4.2` flag)
LIBC latency : 40.221 (with `-mfma` flag)
```
This patch relies on https://reviews.llvm.org/D134002
Reviewed By: orex, zimmermann6
Differential Revision: https://reviews.llvm.org/D134104
Tue Ly [Mon, 19 Sep 2022 13:29:37 +0000 (09:29 -0400)]
[libc][Obvious] Remove constexpr qualifier from ExpBase::powb_lo.
Nikita Popov [Mon, 19 Sep 2022 13:07:03 +0000 (15:07 +0200)]
[SCEV] Don't verify dispositions of invalid loops
This should fix the expensive checks build. Ideally we would not
have invalid loops in LoopDispositions.
Simon Pilgrim [Mon, 19 Sep 2022 13:06:27 +0000 (14:06 +0100)]
[CostModel][X86] Add CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF cost handling
Without LZCNT/BMI, the *_ZERO_UNDEF costs are cheaper as they can avoid the zero handling.
Nikita Popov [Mon, 19 Sep 2022 12:44:39 +0000 (14:44 +0200)]
Revert "[SimplifyCFG] accumulate bonus insts cost"
This reverts commit
e5581df60a35fffb0c69589777e4e126c849405f.
This causes major compile-time regressions, about 2-3% end-to-end
on CTMark.
Tue Ly [Fri, 16 Sep 2022 00:48:50 +0000 (20:48 -0400)]
[libc][math] Improve tanhf performance.
Optimize the core part of the `tanhf` implementation, the computation of `e^x`,
in a way similar to https://reviews.llvm.org/D133870. Factor out the constants
and the polynomial approximation so that they can be reused for `exp10f`.
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 13.377
System LIBC reciprocal throughput : 55.046
BEFORE:
LIBC reciprocal throughput : 75.674
LIBC reciprocal throughput : 33.242 (with `-msse4.2` flag)
LIBC reciprocal throughput : 25.927 (with `-mfma` flag)
AFTER:
LIBC reciprocal throughput : 26.359
LIBC reciprocal throughput : 18.888 (with `-msse4.2` flag)
LIBC reciprocal throughput : 14.243 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 43.365
System LIBC latency : 123.499
BEFORE
LIBC latency : 112.968
LIBC latency : 104.908 (with `-msse4.2` flag)
LIBC latency : 92.310 (with `-mfma` flag)
AFTER
LIBC latency : 69.828
LIBC latency : 63.874 (with `-msse4.2` flag)
LIBC latency : 57.427 (with `-mfma` flag)
```
Reviewed By: orex, zimmermann6
Differential Revision: https://reviews.llvm.org/D134002
Simon Pilgrim [Fri, 16 Sep 2022 17:29:56 +0000 (18:29 +0100)]
[SLP][X86] Add AVX512 test coverage to CTLZ/CTTZ tests
Only AVX512 has decent CTTZ/CTLZ vector ops, add tests to ensure we definitely vectorize these
Aaron Ballman [Mon, 19 Sep 2022 11:50:53 +0000 (07:50 -0400)]
Add additional test coverage for C2x N2508
This spotted a mistake with the original patch, so it puts the status
back to "partial" in the C status tracking page.
This amends
510383626fe146e49ae5fa036638e543ce71e5d9.
Simon Pilgrim [Mon, 19 Sep 2022 11:44:36 +0000 (12:44 +0100)]
[DAG] SimplifyDemandedVectorElts - add MULHS/MULHU handling to existing MUL/AND handling
Allows us to determine known zero elements, which particularly helps simplification of DIV/REM by constant patterns
Aaron Ballman [Mon, 19 Sep 2022 11:37:41 +0000 (07:37 -0400)]
Fix a typo in the release notes; NFC
Nicolas Vasilache [Mon, 19 Sep 2022 09:04:39 +0000 (02:04 -0700)]
[mlir][Transform] Add a new navigation op to retrieve the producer of an operand
Given an opOperand uniquely determined by the operation `%op` and the operand number `num`,
the `transform.get_producer_of_operand %op[num]` returns the handle to the unique operation
that produced the SSA value used as opOperand.
The transform fails if the operand is a block argument.
Differential Revision: https://reviews.llvm.org/D134171
Nicolas Vasilache [Mon, 19 Sep 2022 09:03:48 +0000 (02:03 -0700)]
[mlir][Linalg] NFC - Cleanup internal transform APIs and produce better messages on failure to apply.
Max Kazantsev [Mon, 19 Sep 2022 10:52:55 +0000 (17:52 +0700)]
[LoopRotate] Drop loop dispositions when rotating loops. PR56260
This is required because if there is a pure loop-invariant instruction, Loop Rotation
may decide to not clone it and just hoist it instead. If SCEV has previously cached
that it was loop-variant (not being smart enough to prove invariance), we may end
up with inconsistent cache state (which may later trigger false-negative assertion
failures checking that something was invariant).
This is a conservative fix that unconditionally drops the dispositions. We could
drop them only if the hoisting has actually happened, but it would take some time
to understand whether that is safe given everything else this function does.
Differential Revision: https://reviews.llvm.org/D134167
Reviewed By: fhahn
Nuno Lopes [Mon, 19 Sep 2022 10:59:35 +0000 (11:59 +0100)]
Introduce -enable-global-analyses to allow users to disable inter-procedural analyses
Alive2 doesn't support verification of optimizations that use inter-procedural analyses.
Right now, clang uses GlobalsAA by default and there's no way to disable it.
This leads to Alive2 producing false positives.
The added flag allows us to skip global analyses altogether.
Differential Revision: https://reviews.llvm.org/D134139
LLVM GN Syncbot [Mon, 19 Sep 2022 10:55:29 +0000 (10:55 +0000)]
[gn build] Port
1146d40d9ab2
Simon Pilgrim [Mon, 19 Sep 2022 10:53:25 +0000 (11:53 +0100)]
[UnitTests] Add ShuffleVectorInst unit test coverage for shuffle mask kind matchers
Add tests for the core static shuffle pattern match helpers
Lorenzo Chelini [Mon, 19 Sep 2022 10:34:49 +0000 (12:34 +0200)]
[MLIR][Linalg] introduce batch-reduce GEMM
The batch-reduce GEMM kernel essentially multiplies a sequence of input tensor
blocks (which form a batch) and the partial multiplication results are reduced
into a single output tensor block.
See: https://ieeexplore.ieee.org/document/9139809 for more details.
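As a plain reference for the semantics described above (not the Linalg op definition), a batch-reduce GEMM accumulates the products of all block pairs into one output block. The sketch uses nested Python lists and a hypothetical function name.

```python
from typing import List

Matrix = List[List[float]]


def batch_reduce_gemm(a_blocks: List[Matrix], b_blocks: List[Matrix],
                      c: Matrix) -> Matrix:
    """Accumulate sum over the batch of A[b] x B[b] into the single block c."""
    for a, b in zip(a_blocks, b_blocks):      # reduction over the batch
        for m in range(len(c)):
            for n in range(len(c[0])):
                for k in range(len(b)):       # usual GEMM contraction
                    c[m][n] += a[m][k] * b[k][n]
    return c
```

Unlike a batched matmul, the batch dimension is reduced away, so the output has no batch dimension.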
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134163
Max Kazantsev [Mon, 19 Sep 2022 10:39:13 +0000 (17:39 +0700)]
[LoopFuse] Drop loop dispositions before reassigning blocks to other loop
This bug was found by recent improvement in SCEV verifier. The code in LoopFuse
directly reassigns blocks to be a part of a different loop, which should automatically
invalidate all related cached loop dispositions.
Differential Revision: https://reviews.llvm.org/D134173
Reviewed By: nikic
Max Kazantsev [Mon, 19 Sep 2022 10:37:17 +0000 (17:37 +0700)]
[SCEV] Verify contents of loop disposition cache
It seems that it is sometimes broken. Initial motivation for this was
investigation of https://github.com/llvm/llvm-project/issues/56260, but
it also seems that we have found an unrelated bug in LoopFusion that leaves
broken caches.
Differential Revision: https://reviews.llvm.org/D134158
Reviewed By: nikic
David Green [Mon, 19 Sep 2022 10:34:00 +0000 (11:34 +0100)]
[AArch64] Use fast-math-flags in isAssociativeAndCommutative
Previously only using the UnsafeFPMath option, this now looks for the
fast-math flags on the instructions, using the same flags as other
backends.
Lorenzo Chelini [Mon, 19 Sep 2022 10:17:30 +0000 (12:17 +0200)]
Revert "[MLIR][Linalg] introduce batch-reduce GEMM"
This reverts commit
f381768a8da6bd6bde8bdff34f080bf12bf20064.
lorenzo chelini [Mon, 19 Sep 2022 10:11:04 +0000 (12:11 +0200)]
[MLIR][Linalg] introduce batch-reduce GEMM
The batch-reduce GEMM kernel essentially multiplies a sequence of input tensor
blocks (which form a batch) and the partial multiplication results are reduced
into a single output tensor block.
See: https://ieeexplore.ieee.org/document/9139809 for more details.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134163
Simon Pilgrim [Mon, 19 Sep 2022 09:25:48 +0000 (10:25 +0100)]
[LoopVectorize] Regenerate runtime-check.ll
Simon Pilgrim [Mon, 19 Sep 2022 09:22:48 +0000 (10:22 +0100)]
[LoopVectorize][X86] Use quotes around the pass list to appease DOS cmd evaluation
DOS can't handle -passes='default<O3>' correctly
Nuno Lopes [Mon, 19 Sep 2022 09:18:45 +0000 (10:18 +0100)]
[LangRef] Change masked-off lanes from undef to poison for llvm.vp.* intrinsics
As discussed in https://reviews.llvm.org/D133967
Valentin Clement [Mon, 19 Sep 2022 07:51:11 +0000 (09:51 +0200)]
[flang][NFC] Remove not polymorphic from assumed type
Max Kazantsev [Mon, 19 Sep 2022 07:05:42 +0000 (14:05 +0700)]
[SCEV][NFC] Remove unused parameter from forgetLoopDispositions
Let's be honest about it, we don't drop loop dispositions for
particular loops. Remove the parameter that misleadingly makes
it apparent that we do.
Fangrui Song [Mon, 19 Sep 2022 06:25:58 +0000 (23:25 -0700)]
[TableGen] Optimize APInt |= with setBit. NFC
Zi Xuan Wu (Zeson) [Wed, 14 Sep 2022 07:52:27 +0000 (15:52 +0800)]
[llvm-tblgen] CodeGenSchedModels::hasReadOfWrite gets wrong predication result
CodeGenSchedModels::hasReadOfWrite tries to determine whether the WriteDef is contained in the ValidWrites list of some ProcReadAdvance,
so that the WriteID of WriteDef can be compressed and reused.
It tries to iterate over all ProcReadAdvance entries, but not all ProcReadAdvance defs also inherit from SchedRead.
Some ProcReadAdvances are defined by ReadAdvance, so iterating over only the SchedReads does not enumerate all ProcReadAdvances.
Differential Revision: https://reviews.llvm.org/D132205
LiaoChunyu [Mon, 19 Sep 2022 06:07:39 +0000 (14:07 +0800)]
[RISCV]Preserve (and X, 0xffff) in targetShrinkDemandedConstant
ShrinkDemandedConstant performs some optimizations that are not very friendly to RISC-V; use targetShrinkDemandedConstant to limit the damage.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D134155
Kazu Hirata [Mon, 19 Sep 2022 06:09:40 +0000 (23:09 -0700)]
[mlir] Don't include StringSwitch.h (NFC)
These files don't seem to use StringSwitch.
Mingming Liu [Fri, 16 Sep 2022 04:26:43 +0000 (21:26 -0700)]
[NFC][SimplifyCFG]Precommit test case to show inner-loop metadata may not be preserved
- There is an outer while-loop and an inner for-loop in the test case.
Inner-loop has `llvm.loop.unroll.enable` metadata that is not
preserved. This happens around [1], when the loop metadata of outer loop
overrides the inner loop metadata directly, without looking at whether inner-loop
itself has loop metadata.
[1] https://github.com/llvm/llvm-project/blob/
ab755e65629ea098cb6faa77b13ac087849ffc67/llvm/lib/Transforms/Utils/Local.cpp#L1146
Differential Revision: https://reviews.llvm.org/D134014
Christian Sigg [Mon, 19 Sep 2022 05:46:21 +0000 (07:46 +0200)]
[MLIR] NFC: improve comment about MLIR_CMAKE_DIR.
Kazu Hirata [Mon, 19 Sep 2022 05:21:32 +0000 (22:21 -0700)]
[clang] Don't include StringSwitch.h (NFC)
These files don't seem to use StringSwitch.
Kazu Hirata [Mon, 19 Sep 2022 05:01:32 +0000 (22:01 -0700)]
[llvm] Deprecate llvm::empty (NFC)
This patch deprecates llvm::empty as I've migrated all known uses of
llvm::empty(x) to x.empty().
Differential Revision: https://reviews.llvm.org/D134141
bixia1 [Fri, 16 Sep 2022 18:30:53 +0000 (11:30 -0700)]
[mlir][sparse] Add push_back op to support code generation.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D134062
Weining Lu [Mon, 19 Sep 2022 02:23:21 +0000 (10:23 +0800)]
[Clang][LoongArch] Implement ABI lowering
Reuse most of RISCV's implementation with several exceptions:
1. Assign signext/zeroext attribute to args passed in stack.
On RISCV, integer scalars passed in registers have signext/zeroext
when promoted, but are anyext if passed on the stack. This is defined
in early RISCV ABI specification. But after this change [1], integers
should also be signext/zeroext if passed on the stack. So I think
RISCV's ABI lowering should be updated [2].
While in LoongArch ABI spec, we can see that integer scalars narrower
than GRLEN bits are zero/sign-extended no matter passed in registers
or on the stack.
2. Zero-width bit fields are ignored.
This matches GCC's behavior but it hasn't been documented in the ABI spec.
See https://gcc.gnu.org/r12-8294.
3. `char` is signed by default.
Another difference worth mentioning is that `char` is signed
by default on LoongArch while it is unsigned on RISCV.
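To make item 1 concrete, here is a small sketch of what sign- or zero-extending a narrow integer to GRLEN bits means for the register or stack slot contents. It only models the bit pattern, not Clang's ABI lowering; `extend_to_grlen` is a hypothetical name and GRLEN = 64 is an assumption.

```python
def extend_to_grlen(value: int, width: int, signed: bool,
                    grlen: int = 64) -> int:
    """Return the GRLEN-bit pattern holding a width-bit integer argument."""
    value &= (1 << width) - 1                 # keep only the low width bits
    if signed and value >> (width - 1):       # sign bit set: value is negative
        value -= 1 << width
    # Two's-complement pattern in a GRLEN-bit slot.
    return value & ((1 << grlen) - 1)
```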
This patch also adds `_BitInt` type support to LoongArch and handle it
in LoongArchABIInfo::classifyArgumentType.
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/
cec39a064ee0e5b0129973fffab7e3ad1710498f
[2] https://github.com/llvm/llvm-project/issues/57261
Differential Revision: https://reviews.llvm.org/D132285
Chuanqi Xu [Mon, 19 Sep 2022 03:03:46 +0000 (11:03 +0800)]
[C++] [Modules] Generate the initializer for modules if we compile a
module unit directly
Previously we lacked a test which ensures that the module unit will
generate an initializer if it is compiled directly (instead of from a pcm
file). Now we add the test back.
jacquesguan [Fri, 16 Sep 2022 06:57:55 +0000 (14:57 +0800)]
[mlir][Math] Add constant folder for ErfOp.
This patch adds constant folder for ErfOp by using erf/erff of libm.
Reviewed By: ftynse, Mogball
Differential Revision: https://reviews.llvm.org/D134017
Kazu Hirata [Mon, 19 Sep 2022 02:45:34 +0000 (19:45 -0700)]
[llvm] Use has_value instead of hasValue (NFC)
wanglian [Mon, 19 Sep 2022 02:28:30 +0000 (10:28 +0800)]
[CUDA][NFC] Rename 'addDeviceDepences' to 'addDeviceDependences' in DeviceActionBuilder.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D134007
Chuanqi Xu [Mon, 19 Sep 2022 02:35:00 +0000 (10:35 +0800)]
[NFC] Move the position of CodeGen/module-initializer*.cpp
Previously the module-initializer*.cpp tests lived in the CodeGen dir instead
of the CodeGenCXX dir, which is not consistent with other tests since
modules are a C++ feature.
LiaoChunyu [Mon, 19 Sep 2022 02:22:31 +0000 (10:22 +0800)]
[RISCV][NFC]Remove outdated comment from targetShrinkDemandedConstant
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D134154
Kazu Hirata [Mon, 19 Sep 2022 01:41:02 +0000 (18:41 -0700)]
Use std::make_unsigned_t (NFC)
Emilia Dreamer [Sun, 18 Sep 2022 23:49:58 +0000 (02:49 +0300)]
[clang-format] Disallow requires clauses to become function declarations
There already exists logic to disallow requires *expressions* to be
treated as function declarations, but this expands it to include
requires *clauses*, when they happen to also be parenthesized.
Previously, in the following case:
```
template <typename T>
requires(Foo<T>)
T foo();
```
The line with the requires clause was actually being considered as the
line with the function declaration due to the parentheses, and the
*real* function declaration on the next line became a trailing
annotation.
(Together with https://reviews.llvm.org/D134049) Fixes https://github.com/llvm/llvm-project/issues/56213
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D134052
Emilia Dreamer [Sun, 18 Sep 2022 23:47:42 +0000 (02:47 +0300)]
[clang-format] Disallow trailing return arrows to be operators
In the following construction:
`template <typename T> requires Foo<T> || Bar<T> auto func() -> int;`
The `->` of the trailing return type was actually considered as an
operator as part of the binary operation in the requires clause, with
the precedence level of `PrecedenceArrowAndPeriod`, leading to fake
parens being inserted in strange locations, that would never be closed.
Fixes one part of https://github.com/llvm/llvm-project/issues/56213
(the rest will probably be in a separate patch)
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D134049