Mark de Wever [Fri, 19 Nov 2021 15:43:27 +0000 (16:43 +0100)]
[libc++] Improve CMake include directory search.
This patch has been tested in D70631, but it should be reviewed
separately.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D114248
Gabor Marton [Thu, 11 Nov 2021 13:43:03 +0000 (14:43 +0100)]
[Analyzer][Core] Simplify IntSym in SValBuilder
Make the SimpleSValBuilder capable of simplifying existing IntSym
expressions based on a newly added constraint on the sub-expression.
Differential Revision: https://reviews.llvm.org/D113754
Kazu Hirata [Mon, 22 Nov 2021 16:21:09 +0000 (08:21 -0800)]
Use std::string::substr (NFC)
Kazu Hirata [Mon, 22 Nov 2021 16:21:07 +0000 (08:21 -0800)]
[Target] Use range-based for loops (NFC)
Alexey Bataev [Mon, 22 Nov 2021 15:41:07 +0000 (07:41 -0800)]
[SLP][NFC] Add a test that reveals the problem in the emission of
vector int division with undefs.
Zarko Todorovski [Mon, 22 Nov 2021 14:39:21 +0000 (09:39 -0500)]
[NFC][llvm][Hexagon] Inclusive Terms remove uses of sanity in Hexagon target
Most changes are rewording comments but there are some assertions that I rephrased.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D114132
Hsiangkai Wang [Tue, 16 Nov 2021 08:01:37 +0000 (16:01 +0800)]
[RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret` usually. It is
a use of return address register. In some microarchitectures, there is
load-to-use data hazard. To avoid the load-to-use data hazard, we could
separate the load instruction from its use as far as possible. In this
patch, we reverse the order of restoring callee-saved registers to
increase the distance of `load ra` and `ret` in the epilog.
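A rough model of the reordering described above, not the actual frame-lowering code. The helper names `reversedRestoreOrder` and `instructionsBetweenRaAndRet` and the register list are hypothetical; the sketch only shows how reversing the restore sequence moves the load of `ra` away from `ret`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: an epilog is a list of registers in the order they
// are restored, followed by an implicit `ret`.
std::vector<std::string> reversedRestoreOrder(std::vector<std::string> Restores) {
  std::reverse(Restores.begin(), Restores.end());
  return Restores;
}

// Number of restore instructions that sit between the load of `ra` and `ret`.
std::size_t instructionsBetweenRaAndRet(const std::vector<std::string> &Restores) {
  auto It = std::find(Restores.begin(), Restores.end(), std::string("ra"));
  if (It == Restores.end())
    return 0; // no `ra` restore in this epilog
  return static_cast<std::size_t>(Restores.end() - It) - 1;
}
```

With an old sequence of `s1, s0, ra` the load of `ra` is immediately followed by `ret`; after reversal two other restores separate them.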
Differential Revision: https://reviews.llvm.org/D113967
Dmitry Vyukov [Tue, 27 Apr 2021 11:55:41 +0000 (13:55 +0200)]
tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Depends on D112602.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112603
Dmitry Vyukov [Mon, 22 Nov 2021 07:22:01 +0000 (08:22 +0100)]
tsan: disable instrumentation in runtime callbacks in tests
All runtime callbacks must be non-instrumented with the new tsan runtime
(it's now more picky with respect to recursion into runtime).
Disable instrumentation in Darwin tests as we do in all other tests now.
Differential Revision: https://reviews.llvm.org/D114348
Nikita Popov [Mon, 22 Nov 2021 14:46:46 +0000 (15:46 +0100)]
Revert "[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency"
This reverts commit d633db8f9dd4a361e60a9030c82adc490d5797e3.
Causes bootstrap assertion failures:
https://lab.llvm.org/buildbot/#/builders/168/builds/3459/steps/9/logs/stdio
Guillaume Chatelet [Mon, 22 Nov 2021 14:31:56 +0000 (14:31 +0000)]
[libc] add memmove basic building blocks
Differential Revision: https://reviews.llvm.org/D113321
Arjun P [Mon, 22 Nov 2021 14:22:54 +0000 (19:52 +0530)]
[MLIR] PresburgerSetTest: fix comment and add a test case
Nikita Popov [Sat, 30 Oct 2021 19:40:14 +0000 (21:40 +0200)]
[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency
This adds validation for consistency of ValueExprMap and
ExprValueMap, and fixes identified issues:
* Addrec construction directly wrote to ValueExprMap in a few places,
without updating ExprValueMap. Add a helper to ensure they stay
consistent. The adjustment in forgetSymbolicName() explicitly
drops the old value from the map, so that we don't rely on it
being overwritten.
* forgetMemoizedResultsImpl() was dropping the SCEV from
ExprValueMap, but not dropping the corresponding entries from
ValueExprMap.
Differential Revision: https://reviews.llvm.org/D113349
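The idea of keeping two maps consistent through a single helper can be sketched as follows. This is a toy model, assuming string stand-ins for values and expressions; `TwoWayCache` and its members are hypothetical names and do not mirror the ScalarEvolution implementation.

```cpp
#include <map>
#include <set>
#include <string>

// Stand-ins for the real key types (assumption for illustration).
using ValueId = std::string;
using ExprId = std::string;

struct TwoWayCache {
  std::map<ValueId, ExprId> ValueToExpr;            // role of ValueExprMap
  std::map<ExprId, std::set<ValueId>> ExprToValues; // role of ExprValueMap

  // Single entry point for writes, so both directions change together.
  void insert(const ValueId &V, const ExprId &E) {
    erase(V); // drop a stale mapping explicitly instead of overwriting it
    ValueToExpr[V] = E;
    ExprToValues[E].insert(V);
  }

  // Remove one value from both maps.
  void erase(const ValueId &V) {
    auto It = ValueToExpr.find(V);
    if (It == ValueToExpr.end())
      return;
    auto RevIt = ExprToValues.find(It->second);
    if (RevIt != ExprToValues.end()) {
      RevIt->second.erase(V);
      if (RevIt->second.empty())
        ExprToValues.erase(RevIt);
    }
    ValueToExpr.erase(It);
  }

  // Forget an expression together with every value that maps to it.
  void forgetExpr(const ExprId &E) {
    auto RevIt = ExprToValues.find(E);
    if (RevIt == ExprToValues.end())
      return;
    for (const ValueId &V : RevIt->second)
      ValueToExpr.erase(V);
    ExprToValues.erase(RevIt);
  }

  // Validation: each forward entry has a reverse entry and vice versa.
  bool isConsistent() const {
    for (const auto &KV : ValueToExpr) {
      auto RevIt = ExprToValues.find(KV.second);
      if (RevIt == ExprToValues.end() || !RevIt->second.count(KV.first))
        return false;
    }
    for (const auto &KV : ExprToValues)
      for (const ValueId &V : KV.second) {
        auto It = ValueToExpr.find(V);
        if (It == ValueToExpr.end() || It->second != KV.first)
          return false;
      }
    return true;
  }
};
```

Routing every write through `insert`, `erase`, or `forgetExpr` is what keeps `isConsistent()` true; writing to only one of the maps directly is the kind of bug the validation is meant to catch.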
Pavel Labath [Thu, 18 Nov 2021 20:27:27 +0000 (21:27 +0100)]
[lldb] Fix [some] leaks in python bindings
Using an lldb_private object in the bindings involves three steps
- wrapping the object in its lldb::SB variant
- using swig to convert/wrap that to a PyObject
- wrapping *that* in a lldb_private::python::PythonObject
Our SBTypeToSWIGWrapper was only handling the middle part. This doesn't
just result in increased boilerplate in the callers, but is also a
functionality problem, as it's very hard to get the lifetime of all
of these objects right. Most of the callers are creating the SB object
(step 1) on the stack, which means that we end up with dangling python
objects after the function terminates. Most of the time this isn't a
problem, because the python code does not need to persist the objects.
However, there are legitimate cases where they can do it (and even if
the use case is not completely legitimate, crashing is not the best
response to that).
For this reason, some of our code creates the SB object on the heap, but
it has another problem -- it never gets cleaned up.
This patch begins to add a new function (ToSWIGWrapper), which does all
of the three steps, while properly taking care of ownership. In the
first step, I have converted most of the leaky code (except for
SBStructuredData, which needs a bit more work).
Differential Revision: https://reviews.llvm.org/D114259
Pavel Labath [Thu, 18 Nov 2021 12:52:44 +0000 (13:52 +0100)]
[lldb/test] Make it possible to run the mock gdb server on a single thread
This is a preparatory commit to enable mocking of qemu startup. That
will involve running the mock server in a separate process, so there's
no need for multithreading.
Initialization is moved from the start function into the constructor
(which can then take an actual socket instead of a class), and the run
method is made public.
Depends on D114156.
Differential Revision: https://reviews.llvm.org/D114157
Tobias Gysi [Mon, 22 Nov 2021 13:15:06 +0000 (13:15 +0000)]
[mlir][linalg] Use getAsOpFoldResult in padding (NFC).
After padding, we introduce an ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to clean up the types after padding.
Depends On D114085
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114153
Tobias Gysi [Mon, 22 Nov 2021 12:49:23 +0000 (12:49 +0000)]
[mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors.
Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit.
Depends On D114067
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114085
Tres Popp [Mon, 22 Nov 2021 09:37:42 +0000 (10:37 +0100)]
Rename MlirExecutionEngine lookup to lookupPacked
The purpose of the change is to make clear whether the user is
retrieving the original function or the wrapper function, in line with
the invoke commands. This new functionality is useful for users that
already have defined their own packed interface, so they do not want the
extra layer of indirection, or for users wanting to look at the
resulting primary function rather than the wrapper function.
All locations except the Python bindings now have a `lookupPacked`
method that matches the original `lookup` functionality. `lookup`
still exists, but with new semantics.
- `lookup` returns the function with a given name. If `bool f(int,int)`
is compiled, `lookup` will return a reference to `bool(*f)(int,int)`.
- `lookupPacked` returns the packed wrapper of the function with the
given name. If `bool f(int,int)` is compiled, `lookupPacked` will return
`void(*mlir_f)(void**)`.
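The difference between the two function shapes can be sketched in plain C++. The packing convention used here (each slot points at one argument, and the last slot points at storage for the result) is an assumption made for illustration, and `callPacked` is a hypothetical helper, not part of the ExecutionEngine API.

```cpp
// The function as written by the user; `lookup` would refer to this shape.
bool f(int a, int b) { return a < b; }

// Packed wrapper; `lookupPacked` would refer to this shape.
// Assumed convention: args[0] and args[1] point at the arguments,
// args[2] points at storage for the result.
void mlir_f(void **args) {
  int a = *static_cast<int *>(args[0]);
  int b = *static_cast<int *>(args[1]);
  *static_cast<bool *>(args[2]) = f(a, b);
}

// Hypothetical helper showing how a caller drives the packed form.
bool callPacked(void (*packed)(void **), int a, int b) {
  bool result = false;
  void *args[] = {&a, &b, &result};
  packed(args);
  return result;
}
```

A caller that already builds its own `void **` argument array can skip the wrapper and call the original function directly, which is the use case the commit message mentions.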
Differential Revision: https://reviews.llvm.org/D114352
Tobias Gysi [Mon, 22 Nov 2021 12:31:40 +0000 (12:31 +0000)]
[mlir][linalg] Remove tile and fuse test pass (NFC).
Remove the tile and fuse test pass that has been replaced by codegen strategy.
Depends On D114067
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114068
Bradley Smith [Thu, 18 Nov 2021 17:03:05 +0000 (17:03 +0000)]
[AArch64][ARM] Add missing SVE/SVE2 features from Cortex-A710
Differential Revision: https://reviews.llvm.org/D114169
Simon Moll [Mon, 22 Nov 2021 11:58:12 +0000 (12:58 +0100)]
[DA][NFC] Update publication - add remarks
Update the reference publication for the SyncDependenceAnalysis and Divergence Analysis. Fix phrasing, formatting. Add comments on reducible loop limitation.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D114146
Roman Lebedev [Mon, 22 Nov 2021 11:31:25 +0000 (14:31 +0300)]
[X86][TTI] Finish costmodel for AVX512BW's VPMOVM2[BW] / VPMOV[BW]2M instructions
Apparently my methodology was suboptimal, and not only did I miss all the +VL tuples,
I also missed some plain tuples. I believe this adds everything missing.
Indeed, these manual costmodels are just not okay long-term.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114334
Roman Lebedev [Mon, 22 Nov 2021 11:31:18 +0000 (14:31 +0300)]
[X86][TTI] Costmodel for AVX512DQ's VPMOVM2[DQ] / VPMOV[DQ]2M instructions
Much like the VPMOVM2[BW] / VPMOV[BW]2M from AVX512BW,
these either sign-extend the mask register into a vector,
or pack the mask from a vector register.
Apparently, we didn't even have MCA tests for these,
added in rG2f364f6f0d3a2420ca78cbd80abb186657180e05,
so I'm just guessing that their perf characteristics
are optimal.
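A scalar model of the two directions, assuming eight 32-bit lanes. The function names `maskToVector` and `vectorToMask` are hypothetical; the sketch models only the data movement and says nothing about instruction cost.

```cpp
#include <array>
#include <cstdint>

// Mask to vector: lane I becomes all-ones (-1) when mask bit I is set,
// otherwise zero. This is the "sign-extend the mask" direction.
std::array<std::int32_t, 8> maskToVector(std::uint8_t Mask) {
  std::array<std::int32_t, 8> Lanes{};
  for (int I = 0; I < 8; ++I)
    Lanes[I] = ((Mask >> I) & 1) ? -1 : 0;
  return Lanes;
}

// Vector to mask: mask bit I is the sign bit of lane I.
std::uint8_t vectorToMask(const std::array<std::int32_t, 8> &Lanes) {
  std::uint8_t Mask = 0;
  for (int I = 0; I < 8; ++I)
    if (Lanes[I] < 0)
      Mask |= static_cast<std::uint8_t>(1u << I);
  return Mask;
}
```

The two functions round-trip: packing the result of `maskToVector(M)` yields `M` again.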
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114314
Nicolas Vasilache [Mon, 22 Nov 2021 10:57:33 +0000 (10:57 +0000)]
[mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine.
This is required to allow python to work with lowerings that use inline_asm.
Differential Revision: https://reviews.llvm.org/D114338
Tobias Gysi [Mon, 22 Nov 2021 10:56:34 +0000 (10:56 +0000)]
[mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.
Depends On D114012
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114067
Diego Caballero [Mon, 22 Nov 2021 10:18:45 +0000 (10:18 +0000)]
[LV] Drop integer poison-generating flags from instructions that need predication
This patch fixes PR52111. The problem is that LV propagates poison-generating flags (`nuw`/`nsw`, `exact`
and `inbounds`) in instructions that contribute to the address computation of widen loads/stores that are
guarded by a condition. It may happen that when the code is vectorized and the control flow within the loop
is linearized, these flags may lead to generating a poison value that is effectively used as the base address
of the widen load/store. The fix drops all the integer poison-generating flags from instructions that
contribute to the address computation of a widen load/store whose original instruction was in a basic block
that needed predication and is not predicated after vectorization.
Reviewed By: fhahn, spatel, nlopes
Differential Revision: https://reviews.llvm.org/D111846
Nicolas Vasilache [Mon, 22 Nov 2021 08:52:40 +0000 (08:52 +0000)]
[mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
Tobias Gysi [Mon, 22 Nov 2021 10:17:53 +0000 (10:17 +0000)]
[mlir][linalg] Fix tile and fuse for outermost reduction.
Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114012
Nicolas Vasilache [Mon, 22 Nov 2021 10:22:37 +0000 (10:22 +0000)]
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:
```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```
The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.
This results in roughly 20% fewer cycles as reported by llvm-mca:
After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations: 100
Instructions: 5900
Total Cycles: 2415
Total uOps: 7300
Dispatch Width: 6
uOps Per Cycle: 3.02
IPC: 2.44
Block RThroughput: 24.0
Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
Resource Pressure [ 89.65% ]
- SKXPort1 [ 0.04% ]
- SKXPort2 [ 12.42% ]
- SKXPort3 [ 12.42% ]
- SKXPort5 [ 89.52% ]
Data Dependencies: [ 37.06% ]
- Register Dependencies [ 37.06% ]
- Memory Dependencies [ 0.00% ]
```
After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations: 100
Instructions: 6300
Total Cycles: 2015
Total uOps: 7700
Dispatch Width: 6
uOps Per Cycle: 3.82
IPC: 3.13
Block RThroughput: 20.0
Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
Resource Pressure [ 83.18% ]
- SKXPort0 [ 14.49% ]
- SKXPort1 [ 14.54% ]
- SKXPort2 [ 19.70% ]
- SKXPort3 [ 19.70% ]
- SKXPort5 [ 83.03% ]
- SKXPort6 [ 14.49% ]
Data Dependencies: [ 39.75% ]
- Register Dependencies [ 39.75% ]
- Memory Dependencies [ 0.00% ]
```
An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).
Reviewed By: ftynse, dcaballe
Differential Revision: https://reviews.llvm.org/D114335
Sjoerd Meijer [Thu, 18 Nov 2021 14:08:37 +0000 (14:08 +0000)]
[BPI] Look-up tables for non-loop branches. NFC.
This adds and uses look-up tables for non-loop branch probabilities, which
have probabilities directly encoded into the tables for the different condition
codes. Compared to having this logic inlined in different functions, as used
to be the case, I think this is more compact and thus also easier to check and
cross-reference. This also adds a test for pointer heuristics that was missing.
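The table-driven style can be sketched as below. Everything here is hypothetical: the `Cond` enum, the `Weights` struct, and the numbers are illustrative only and are not the values or types used by BranchProbabilityInfo.

```cpp
#include <map>

// Hypothetical condition codes for a pointer comparison.
enum class Cond { PtrEq, PtrNe };

// Branch weights for the "taken" and "not taken" successors.
struct Weights {
  unsigned Taken;
  unsigned NotTaken;
};

// Illustrative numbers only: the probabilities live in the table instead
// of being spread over if/else chains in several functions.
const std::map<Cond, Weights> &pointerTable() {
  static const std::map<Cond, Weights> Table = {
      {Cond::PtrEq, {12, 20}}, // assume pointer equality is unlikely
      {Cond::PtrNe, {20, 12}}, // assume pointer inequality is likely
  };
  return Table;
}

// Returns true and fills Out when the table has an entry for C; otherwise
// a caller would fall through to its next heuristic.
bool lookupWeights(Cond C, Weights &Out) {
  auto It = pointerTable().find(C);
  if (It == pointerTable().end())
    return false;
  Out = It->second;
  return true;
}
```

Keeping the numbers in one table makes it easy to cross-check that opposite condition codes carry mirrored weights.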
Differential Revision: https://reviews.llvm.org/D114009
Arjun P [Sun, 21 Nov 2021 19:55:25 +0000 (01:25 +0530)]
[MLIR][NFC] Simplex: remove repeated words in comment
Diego Caballero [Mon, 22 Nov 2021 10:12:25 +0000 (10:12 +0000)]
[LV] Pre-commit test for D111846
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D112054
Guillaume Chatelet [Mon, 22 Nov 2021 10:12:46 +0000 (10:12 +0000)]
[libc] Remove unused variable
Manuel Klimek [Mon, 22 Nov 2021 08:07:57 +0000 (09:07 +0100)]
Fix various problems found by fuzzing.
1. IndexTokenSource::getNextToken cannot return nullptr; some code was
still written assuming it can; make getNextToken more resilient against
incorrect input and fix its call-sites.
2. Change various asserts that can happen due to user provided input to
conditionals in the code.
Salman Javed [Mon, 22 Nov 2021 09:49:49 +0000 (22:49 +1300)]
Add missing clang-tidy args in index.rst (NFC)
The RST docs have gone out of sync with the command-line args that the
clang-tidy program actually supports.
Kirill Bobyrev [Mon, 22 Nov 2021 09:44:21 +0000 (10:44 +0100)]
[clangd] IncludeCleaner: Mark possible expr resolutions as used
Fixes: https://github.com/clangd/clangd/issues/934
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D114287
David Green [Mon, 22 Nov 2021 08:11:35 +0000 (08:11 +0000)]
[AArch64] Sink splat shuffles to lane index intrinsics
This teaches AArch64TargetLowering::shouldSinkOperands to sink splat
shuffles to certain neon intrinsics, so that they can make use of the
lane variants of the instructions that are available.
Differential Revision: https://reviews.llvm.org/D112994
Salman Javed [Mon, 22 Nov 2021 08:06:08 +0000 (21:06 +1300)]
Fix nits in clang-tidy's documentation (NFC)
Add commas, articles, and conjunctions where missing.
Chuanqi Xu [Mon, 22 Nov 2021 07:53:51 +0000 (15:53 +0800)]
[C++20] [Coroutines] Warn for deprecated form 'for co_await'
The form 'for co_await' is part of the Coroutines TS rather than C++20.
So if we detect the use of 'for co_await' in C++20, we should at least
emit a warning.
Dmitry Vyukov [Fri, 19 Nov 2021 15:51:30 +0000 (16:51 +0100)]
tsan: add another fork test
Add a fork test that models what happens on Mac
where fork calls malloc/free inside of our atfork
callbacks.
Reviewed By: vitalybuka, yln
Differential Revision: https://reviews.llvm.org/D114250
Igor Kudrin [Mon, 22 Nov 2021 07:19:07 +0000 (14:19 +0700)]
[ELF][NFC] Do not pass region name to expandMemoryRegion()
The name can be easily got on-site.
Differential Revision: https://reviews.llvm.org/D114228
wangpc [Mon, 22 Nov 2021 06:01:37 +0000 (14:01 +0800)]
[RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly much more readable.
For existing tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.
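A toy printer illustrating the alias, not the real RISC-V instruction printer. The function name `printAddi` is hypothetical, and the sketch ignores any other aliases that may apply to `addi`.

```cpp
#include <string>

// When the source register of `addi` is `zero`, print the `li` pseudo
// instruction instead; otherwise print the plain `addi`.
std::string printAddi(const std::string &Rd, const std::string &Rs1, int Imm) {
  if (Rs1 == "zero")
    return "li " + Rd + ", " + std::to_string(Imm);
  return "addi " + Rd + ", " + Rs1 + ", " + std::to_string(Imm);
}
```

For example, `printAddi("a0", "zero", 42)` yields `li a0, 42`, while a non-zero source register keeps the `addi` spelling.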
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D112692
Kazu Hirata [Mon, 22 Nov 2021 03:24:17 +0000 (19:24 -0800)]
[llvm] Use make_early_inc_range (NFC)
Kazu Hirata [Mon, 22 Nov 2021 03:24:15 +0000 (19:24 -0800)]
[llvm] Use range-based for loops (NFC)
Roland McGrath [Mon, 22 Nov 2021 02:14:30 +0000 (18:14 -0800)]
NFC: clang-format lib/Transforms/Instrumentation/InstrProfiling.cpp
Differential Revision: https://reviews.llvm.org/D114343
Joe Loser [Sun, 21 Nov 2021 00:13:18 +0000 (19:13 -0500)]
[libc++][NFC] Sort includes in __ranges/concepts.h
Differential Revision: https://reviews.llvm.org/D114328
LLVM GN Syncbot [Mon, 22 Nov 2021 00:29:19 +0000 (00:29 +0000)]
[gn build] Port 1dc62f2653f8
Nikolas Klauser [Sun, 21 Nov 2021 23:22:55 +0000 (00:22 +0100)]
[libc++] Implement P1272R4 (std::byteswap)
Implement P1272R4
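What a byte swap does can be sketched portably as follows. The name `byteswapSketch` is ours, the sketch handles unsigned types only, and it is not libc++'s implementation of `std::byteswap`.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverses the byte order of an unsigned integer, one byte per iteration.
template <class T>
T byteswapSketch(T Value) {
  static_assert(std::is_unsigned<T>::value,
                "this sketch handles unsigned types only");
  T Result = 0;
  for (std::size_t I = 0; I < sizeof(T); ++I) {
    // Shift the bytes collected so far up and append the lowest byte.
    Result = static_cast<T>((Result << 8) | (Value & 0xFF));
    Value = static_cast<T>(Value >> 8);
  }
  return Result;
}
```

For a 32-bit input of `0x11223344` the loop produces `0x44332211`; a single-byte input is returned unchanged.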
Reviewed By: Quuxplusone, Mordante, #libc
Spies: jloser, lebedev.ri, mgorny, libcxx-commits, arichardson
Differential Revision: https://reviews.llvm.org/D114074
Jacques Pienaar [Sun, 21 Nov 2021 23:06:08 +0000 (15:06 -0800)]
[mlir] Fix unused function warning (NFC)
Delete function no longer needed as all derived classes override
printer.
Jacques Pienaar [Sun, 21 Nov 2021 22:41:11 +0000 (14:41 -0800)]
[mlir] Move trait to InferTypeOpInterface
Step towards removing the hard coded behavior for this trait and to instead use common interface.
Differential Revision: https://reviews.llvm.org/D114208
Kazu Hirata [Sun, 21 Nov 2021 18:36:20 +0000 (10:36 -0800)]
[CodeGen] Use llvm::is_contained (NFC)
Kazu Hirata [Sun, 21 Nov 2021 18:36:18 +0000 (10:36 -0800)]
[llvm] Use range-based for loops (NFC)
Simon Pilgrim [Sun, 21 Nov 2021 18:33:05 +0000 (18:33 +0000)]
[ARM] Regenerate sxt_rot.ll tests
Simon Pilgrim [Sun, 21 Nov 2021 18:32:10 +0000 (18:32 +0000)]
[Thumb2] Regenerate ext + rot tests
Simon Pilgrim [Sun, 21 Nov 2021 18:30:58 +0000 (18:30 +0000)]
[PowerPC] Regenerate rlwinm2.ll test
Philip Reames [Sun, 21 Nov 2021 16:00:34 +0000 (08:00 -0800)]
Add a best practice section on how to configure a fast builder
This is based on conversations with a couple of folks currently running buildbots. There are a couple of pieces which didn't make it in, but this tries to cover the common themes.
Differential Revision: https://reviews.llvm.org/D114325
Arjun P [Sun, 21 Nov 2021 13:53:15 +0000 (19:23 +0530)]
[MLIR][NFC] Simplex::restoreRow: improve documentation
Simon Pilgrim [Sun, 21 Nov 2021 12:01:44 +0000 (12:01 +0000)]
[ARM][ParallelDSP] Regenerate complex_dot_prod.ll test
David Green [Sun, 21 Nov 2021 11:46:34 +0000 (11:46 +0000)]
[AArch64] Extra testing for sinking splats to various instructions. NFC
Fangrui Song [Sun, 21 Nov 2021 06:18:09 +0000 (22:18 -0800)]
[ELF] Move getOutputSectionName from Writer.cpp to LinkerScript.cpp. NFC
and internalize it.
Kazu Hirata [Sun, 21 Nov 2021 02:42:10 +0000 (18:42 -0800)]
[llvm] Use range-based for loops (NFC)
Phoebe Wang [Sun, 21 Nov 2021 01:12:46 +0000 (09:12 +0800)]
[X86][FP16] Relax the pattern condition for VZEXT_MOVL to match more cases
Fixes pr52560
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114313
Joe Loser [Sun, 21 Nov 2021 00:15:00 +0000 (19:15 -0500)]
[libc++][NFC] Fix typo in ranges::iterator_t synopsis
The `iterator_t` alias template is on `T` not a `R` like the other
neighboring alias templates. Fix the typo.
Arthur O'Dwyer [Thu, 14 Oct 2021 20:49:58 +0000 (16:49 -0400)]
[libc++] [doc] Mark some spaceship-related LWG issues as "Complete."
LWG3330 has been "Completed" since D99309, which was in the 13.x timeframe.
Reviewed as part of D110738.
Roman Lebedev [Sat, 20 Nov 2021 22:11:05 +0000 (01:11 +0300)]
[NFC][X86][Costmodel] Actually test +prefer-256-bit in replication-shuffle-related tests :(
While -prefer-256-bit indeed becomes complete with D114314,
the real-world (the one with +prefer-256-bit) coverage is lacking.
Hilarious.
Nikita Popov [Sat, 20 Nov 2021 22:17:41 +0000 (23:17 +0100)]
[DSE] Drop hasAnalyzableMemoryWrite() (NFCI)
The functionality of hasAnalyzableMemoryWrite() is effectively
subsumed by getLocForWriteEx(), which will return None if the
instruction is not analyzable. The implementations don't match
exactly (e.g. getLocForWriteEx() does not limit non-calls to
stores), but in conjunction with the isRemovable() check, it ends
up being the same.
Felix Berger [Fri, 19 Nov 2021 01:33:22 +0000 (20:33 -0500)]
[clang-tidy] performance-unnecessary-copy-initialization: Correctly match the type name of the thisPointerType.
The matching did not work correctly for pointer and reference types.
Differential Revision: https://reviews.llvm.org/D114212
Reviewed-by: courbet
Nikita Popov [Sat, 20 Nov 2021 20:06:08 +0000 (21:06 +0100)]
[LVI] Drop requirement that modulus is constant
If we're looking only at the lower bound, the actual modulus
doesn't matter. This is a leftover from when I wanted to consider
the upper bound as well, where the modulus does matter.
Nikita Popov [Sat, 20 Nov 2021 18:03:45 +0000 (19:03 +0100)]
[LVI] Support urem in implied conditions
If (X urem M) >= C we know that X >= C. Make use of this fact
when computing the implied condition range.
In some cases we could also establish an upper bound, but that's
both trickier and not interesting in practice.
Alive: https://alive2.llvm.org/ce/z/R5ZGSW
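The implication follows because an unsigned remainder never exceeds the dividend, so X >= (X urem M) >= C. A brute-force check over 8-bit values, as a sanity sketch only (the function name `uremLowerBoundHolds` is hypothetical, and M == 0 is skipped because a remainder by zero is undefined):

```cpp
// Exhaustively checks, for all 8-bit unsigned X, M != 0 and C, that
// (X urem M) >= C implies X >= C.
bool uremLowerBoundHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned M = 1; M < 256; ++M)
      for (unsigned C = 0; C < 256; ++C)
        if (X % M >= C && X < C)
          return false; // counterexample found
  return true;
}
```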
Nikita Popov [Sat, 20 Nov 2021 19:48:56 +0000 (20:48 +0100)]
[CVP] Add tests for implied conditions using urem (NFC)
Florian Hahn [Sat, 20 Nov 2021 17:59:47 +0000 (17:59 +0000)]
[VPlan] Wrap vector loop blocks in region.
A first step towards modeling preheader and exit blocks in VPlan as well.
Keeping the vector loop in a region allows for changing the VF as we
traverse region boundaries.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D113182
Sanjay Patel [Sat, 20 Nov 2021 15:55:41 +0000 (10:55 -0500)]
[InstCombine] add folds for binop with sexted bool and constant operands
This is a generalization/extension of the existing and/or
folds noted with TODO comments. Those have a one-use
constraint that is not necessary.
Potential follow-ups are noted by the TODO comments in
the new function. We can also call this function from
other binop visit* functions, but we need to add tests
first.
This solves:
https://llvm.org/PR52543
https://alive2.llvm.org/ce/z/NWuCR5
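Since a sign-extended bool is either 0 or -1, a binop of it with a constant can only take two values, one per bool state. The sketch below checks that for and, or, and xor over all 8-bit constants. It illustrates the arithmetic only; the helper names are hypothetical and the exact form of the fold in InstCombine is not reproduced here.

```cpp
#include <cstdint>

// sext of an i1: true becomes all-ones (-1), false becomes 0.
std::int8_t sextBool(bool B) { return B ? std::int8_t(-1) : std::int8_t(0); }

// For every 8-bit constant C and both bool values, compare the binop on
// the sign-extended bool against a choice between two constants.
bool sextBoolFoldsHold() {
  for (int B = 0; B <= 1; ++B)
    for (int CI = -128; CI <= 127; ++CI) {
      std::int8_t C = static_cast<std::int8_t>(CI);
      std::int8_t S = sextBool(B != 0);
      // and (sext B), C  ==  B ? C : 0
      if (std::int8_t(S & C) != (B ? C : std::int8_t(0)))
        return false;
      // or (sext B), C   ==  B ? -1 : C
      if (std::int8_t(S | C) != (B ? std::int8_t(-1) : C))
        return false;
      // xor (sext B), C  ==  B ? ~C : C
      if (std::int8_t(S ^ C) != (B ? std::int8_t(~C) : C))
        return false;
    }
  return true;
}
```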
Sanjay Patel [Sat, 20 Nov 2021 15:19:27 +0000 (10:19 -0500)]
[InstCombine] add tests for bitwise logic with bool op; NFC
Arthur O'Dwyer [Mon, 8 Nov 2021 22:00:43 +0000 (17:00 -0500)]
[libc++] [test] Eliminate libcpp-no-noexcept-function-type and libcpp-no-structured-bindings.
At this point, every supported compiler that claims a -std=c++17 mode
should also support these features.
Differential Revision: https://reviews.llvm.org/D113436
Arnab Dutta [Sat, 20 Nov 2021 15:34:59 +0000 (21:04 +0530)]
[MLIR] Simplify Semi-affine expressions by rule based matching and replacing "expr - q * (expr floordiv q)" with "expr mod q" expression.
Add rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic expression, in simplifyAdd function.
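The identity behind the rewrite can be checked numerically for positive q. This is a sanity sketch with hypothetical helper names (`floorDiv`, `floorMod`, `modRewriteHolds`), assuming floor-division semantics and a non-negative remainder; it is not the pattern matcher itself.

```cpp
// Floor division: rounds toward negative infinity.
long floorDiv(long E, long Q) {
  long D = E / Q; // C++ division truncates toward zero
  return (E % Q != 0 && ((E < 0) != (Q < 0))) ? D - 1 : D;
}

// Remainder that is always in [0, Q) for Q > 0.
long floorMod(long E, long Q) {
  long R = E % Q;
  return R < 0 ? R + Q : R;
}

// Checks expr - q * (expr floordiv q) == expr mod q on a small range.
bool modRewriteHolds() {
  for (long E = -50; E <= 50; ++E)
    for (long Q = 1; Q <= 12; ++Q)
      if (E - Q * floorDiv(E, Q) != floorMod(E, Q))
        return false;
  return true;
}
```

For instance, with expr = -7 and q = 3, the floor division gives -3, and -7 - 3 * (-3) = 2, which matches -7 mod 3 = 2.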
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D112985
Joseph Huber [Fri, 19 Nov 2021 15:02:28 +0000 (10:02 -0500)]
[Libomptarget] Remove undefined symbol in old runtime
A function with no definition was left in the old runtime, causing
linker errors when trying to compile.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D114264
Dimitry Andric [Sat, 20 Nov 2021 11:10:06 +0000 (12:10 +0100)]
compiler-rt: Use FreeBSD's elf_aux_info to detect AArch64 HW features
Using the out-of-line LSE atomics helpers for AArch64 on FreeBSD also
requires adding support for initializing __aarch64_have_lse_atomics
correctly. On Linux this is done with getauxval(3), on FreeBSD with
elf_aux_info(3), which has a slightly different interface.
Differential Revision: https://reviews.llvm.org/D109330
Roman Lebedev [Sat, 20 Nov 2021 10:55:13 +0000 (13:55 +0300)]
[NFC][X86][Costmodel] Add AVX512DQ runlines to trunc.ll/extend.ll
Roman Lebedev [Sat, 20 Nov 2021 10:09:18 +0000 (13:09 +0300)]
[NFC][X86][MCA] Add forgotten test coverage for AVX512's VPMOVM2[BWDQ] / VPMOV[BWDQ]2M
Arnab Dutta [Sat, 20 Nov 2021 06:30:49 +0000 (12:00 +0530)]
[MLIR] Avoid creation of buggy affine maps while replacing dimension and symbol
Previously, the call to AffineMap::get() was made before the newly
composed dimensions and symbols were appended to the dimension and
symbol lists whose sizes are passed to it, resulting in wrong dimCount
and symbolCount values being passed as arguments. We move the call to
AffineMap::get() after the dimension and symbol lists are
updated.
Differential Revision: https://reviews.llvm.org/D114237
Craig Topper [Sat, 20 Nov 2021 03:05:10 +0000 (19:05 -0800)]
[X86] Don't combine (x86cmp (trunc (movmsk (bitcast X))), 0) if the truncate discards unknown bits.
We have a transform that tries to turn a pmovmskb into movmskps/pd or
movmskps to movmskpd. This transform isn't valid if the truncate
discarded bits that might be set by the original movmsk.
We could fix this by inserting an AND after the new movmsk to discard
the equivalent of the truncated bits, but I've left that for a later
patch.
Fixes PR52567.
Differential Revision: https://reviews.llvm.org/D114306
Craig Topper [Sat, 20 Nov 2021 02:52:17 +0000 (18:52 -0800)]
[X86] Add test case for pr52567. NFC
Lang Hames [Sat, 20 Nov 2021 05:12:23 +0000 (21:12 -0800)]
[ORC] Make JITDylib::AsynchronousSymbolQuerySet private.
This type does not need to be public
Kazu Hirata [Sat, 20 Nov 2021 05:12:12 +0000 (21:12 -0800)]
[llvm] Use range-based for loops (NFC)
RamNalamothu [Fri, 19 Nov 2021 20:23:38 +0000 (01:53 +0530)]
[AMDGPU] Do not generate ELF symbols for the local branch target labels
The compiler was generating symbols in the final code object for local
branch target labels. This bloats the code object, slows down the loader,
and is only used to simplify disassembly.
Use '--symbolize-operands' with llvm-objdump to improve readability of the
branch target operands in disassembly.
Fixes: SWDEV-312223
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D114273
Lang Hames [Sat, 20 Nov 2021 04:51:44 +0000 (20:51 -0800)]
[ORC][JITLink] Move JITDylib name into JITLinkDylib base class.
This will enable better error messages and debug logs in JITLink.
ksyx [Sat, 13 Nov 2021 20:59:43 +0000 (15:59 -0500)]
[GVN][NFC] Remove redundant check
The if-check above the deleted part guarantees that StoreOffset <= LoadOffset
and that StoreOffset + StoreSize >= LoadOffset + LoadSize. Given that
LoadOffset + LoadSize > LoadOffset when LoadSize > 0, it follows that
StoreOffset + StoreSize > LoadOffset is guaranteed whenever LoadSize > 0.
A type with nonpositive size would be meaningless, so
the check can be removed. The values are converted to signed types to
avoid unsigned operations with negative offsets.
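The implication can be confirmed by brute force over a small range of signed offsets and sizes. This is a sanity sketch only; the function name `coverageImpliesOverlap` is hypothetical and the variable names follow the message above.

```cpp
// Checks that "the store covers the load" implies
// StoreOffset + StoreSize > LoadOffset whenever LoadSize > 0.
bool coverageImpliesOverlap() {
  for (int StoreOffset = -8; StoreOffset <= 8; ++StoreOffset)
    for (int StoreSize = 0; StoreSize <= 8; ++StoreSize)
      for (int LoadOffset = -8; LoadOffset <= 8; ++LoadOffset)
        for (int LoadSize = 1; LoadSize <= 8; ++LoadSize) {
          bool Covers = StoreOffset <= LoadOffset &&
                        StoreOffset + StoreSize >= LoadOffset + LoadSize;
          bool Overlaps = StoreOffset + StoreSize > LoadOffset;
          if (Covers && !Overlaps)
            return false; // the removed check would have been needed
        }
  return true;
}
```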
Part of revision D100179
Reapply commit c35e8185d8c170c20e28956e0c9f3c1be895fefb, fixing the problem
reported by mstorsjo
Nathan Lanza [Wed, 11 Aug 2021 22:55:01 +0000 (18:55 -0400)]
[hmaptool] Port to python3
This is just a few trivial changes -- change the interpreter and fix a
few byte-vs-string issues.
Differential Revision: https://reviews.llvm.org/D107944
James Nagurne [Sat, 20 Nov 2021 00:21:23 +0000 (18:21 -0600)]
[NFC] Test commit, add whitespace to end-of-line
Sam McCall [Sat, 20 Nov 2021 00:10:30 +0000 (01:10 +0100)]
[clangd] Avoid possible crash: apply configuration after binding methods
The configuration may kick off indexing, which may involve sending LSP
messages.
The crash is fiddly to reproduce in a hermetic test (we need background
indexing on without disk storage, and to handle server->client messages
in LSPClient...)
Fixes https://github.com/clangd/clangd/issues/926
Ellis Hoag [Fri, 19 Nov 2021 23:44:48 +0000 (15:44 -0800)]
[InstrProf] Use i32 for GEP index from lowering llvm.instrprof.increment
The `llvm.instrprof.increment` intrinsic uses `i32` for the index. We should use this same type for the index into the GEP instructions.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114268
Krzysztof Drewniak [Thu, 18 Nov 2021 22:37:53 +0000 (22:37 +0000)]
[MLIR][GPU] Link in device libraries during HSA compilation if needed
To perform some operations, such as sin() or printf(), code compiled
for AMD GPUs must be linked to a series of device libraries. This
commit adds support for linking in these libraries.
However, since these device libraries are delivered as LLVM bitcode,
raising the possibility of version incompatibilities, this commit only
links in libraries when the functions from those libraries are called
by the code being compiled.
This code also sets the math flags to their most conservative values,
as MLIR doesn't have a `-ffast-math` equivalent.
Depends on D114114
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114117
Quinn Pham [Thu, 18 Nov 2021 21:52:39 +0000 (15:52 -0600)]
[NFC][llvm] Inclusive language: remove instance of master from Thumb2SizeReduction.cpp
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `Thumb2SizeReduction.cpp`.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D114196
rdzhabarov [Fri, 19 Nov 2021 21:43:17 +0000 (21:43 +0000)]
[mlir] Bug fix. Stream must outlive the pass manager.
Bug fix. Stream must outlive the pass manager.
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D114277
Wei Wang [Fri, 19 Nov 2021 21:14:41 +0000 (13:14 -0800)]
[Sema] fix nondeterminism in ASTContext::getDeducedTemplateSpecializationType
`DeducedTemplateSpecializationTypes` is a `llvm::FoldingSet<DeducedTemplateSpecializationType>` [1],
where `FoldingSetNodeID` is based on the values: {`TemplateName`, `QualType`, `IsDeducedAsDependent`},
those values are also used as `DeducedTemplateSpecializationType` constructor arguments.
A `FoldingSetNodeID` created by the static `DeducedTemplateSpecializationType::Profile` may not be equal
to a `FoldingSetNodeID` created by a member `DeducedTemplateSpecializationType::Profile` of an instance
created with the same {`TemplateName`, `QualType`, `IsDeducedAsDependent`}, which makes
`DeducedTemplateSpecializationTypes` lookups nondeterministic.
Specifically, while the `IsDeducedAsDependent` value is passed to the constructor, the `IsDependent()` method on
the created instance may return a different value, because `IsDependent` is not saved as is:
```name=clang/include/clang/AST/Type.h
DeducedTemplateSpecializationType(TemplateName Template, QualType DeducedAsType, bool IsDeducedAsDependent)
: DeducedType(DeducedTemplateSpecialization, DeducedAsType,
toTypeDependence(Template.getDependence()) | // <~ also considers `TemplateName` parameter
(IsDeducedAsDependent ? TypeDependence::DependentInstantiation : TypeDependence::None)),
```
For example, suppose an instance A with key `FoldingSetNodeID {A, B, false}` is inserted, and then a key
`FoldingSetNodeID {A, B, true}` is probed:
If it happens to correspond to the same bucket in `FoldingSet` as the first key, and `A.Profile()` returns
`FoldingSetNodeID {A, B, true}`, then it's a hit.
If the bucket for the second key is different from that of the first key, instance A is not considered at all, and it's
a miss, even if `A.Profile()` returns `FoldingSetNodeID {A, B, true}`.
Since the `TemplateName` and `QualType` parameter values involve memory pointers, the lookup result depends on the allocator,
and may differ from run to run. When this is used as part of modules compilation, it may result in "module out of date"
errors, if imported modules are built on different machines.
This makes `ASTContext::getDeducedTemplateSpecializationType` consider `Template.isDependent()`, similar to the
`DeducedTemplateSpecializationType` constructor.
Tested on a very big codebase, by running modules compilations from directories with varied path lengths
(which seems to affect the allocator seed).
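The mismatch between the two profiles can be reproduced with a toy type. This is a minimal sketch with hypothetical names (`ToyDeducedType`, `Key`, `fixedLookupKey`) and integer stand-ins for `TemplateName` and `QualType`; it does not use the real `FoldingSet` API.

```cpp
#include <tuple>

// Toy stand-in for FoldingSetNodeID: (template id, type id, dependent bit).
using Key = std::tuple<int, int, bool>;

struct ToyDeducedType {
  int TemplateId;
  int DeducedId;
  bool Dependent; // stored flag also folds in the template's own dependence

  ToyDeducedType(int T, bool TemplateIsDependent, int D,
                 bool IsDeducedAsDependent)
      : TemplateId(T), DeducedId(D),
        Dependent(TemplateIsDependent || IsDeducedAsDependent) {}

  // Key computed before construction, from the constructor arguments.
  static Key profileFromArgs(int T, int D, bool IsDependent) {
    return Key(T, D, IsDependent);
  }

  // Key computed from the stored state of an existing node.
  Key profile() const { return Key(TemplateId, DeducedId, Dependent); }
};

// Direction of the fix: fold the template's dependence into the flag
// before profiling, so the lookup key agrees with the node's own key.
Key fixedLookupKey(int T, bool TemplateIsDependent, int D,
                   bool IsDeducedAsDependent) {
  return ToyDeducedType::profileFromArgs(
      T, D, IsDeducedAsDependent || TemplateIsDependent);
}
```

With a dependent template and `IsDeducedAsDependent == false`, the key built from the raw arguments differs from the key the node reports about itself, while the adjusted key matches.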
1. https://llvm.org/docs/ProgrammersManual.html#llvm-adt-foldingset-h
Patch by Wei Wang and Igor Sugak!
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D112481
Sanjay Patel [Fri, 19 Nov 2021 17:48:27 +0000 (12:48 -0500)]
[InstCombine] add/adjust tests for mask of sext i1; NFC
These are sibling transforms, but the test coverage was
uneven and incomplete.
Stefan Pintilie [Mon, 15 Nov 2021 21:26:30 +0000 (15:26 -0600)]
[PowerPC][NFC] Add a series of codegen tests for vector reductions.
This patch only adds tests for PowerPC. The purpose of these tests
is to track what code is generated for various vector reductions.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D113801
Louis Dionne [Fri, 19 Nov 2021 21:01:39 +0000 (16:01 -0500)]
[libc++][NFC] Add missing include in test
Becca Royal-Gordon [Fri, 19 Nov 2021 20:10:15 +0000 (12:10 -0800)]
Allow __attribute__((swift_attr)) in attribute push pragmas
This change allows SwiftAttr to be used with #pragma clang attribute push
to add Swift attributes to large regions of header files.
We plan to use this to annotate headers with concurrency information.
Patch by: Becca Royal-Gordon
Differential Revision: https://reviews.llvm.org/D112773
Krzysztof Drewniak [Thu, 18 Nov 2021 21:42:42 +0000 (21:42 +0000)]
[MLIR][GPU] Make the path to ROCm a runtime option
Our current build assumes that the path to ROCm we find at build time
will be the path at which ROCm is located when the built code is
executed. This commit adds a --rocm-path option to SerializeToHsaco,
and removes the HIP dependency that the SerializeToHsaco previously had.
Depends on D114113
(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114114