Lei Zhang [Fri, 24 Jun 2022 02:19:08 +0000 (22:19 -0400)]
[mlir][spirv] Fix bitcast input order for UnifyAliasedResourcePass
spv.bitcast from a vector to a scalar expects the lower-numbered
components of the vector to map to the lower-ordered bits of
the scalar. That actually already matches how little endian stores
data in the memory. So we just need to read and push to the back
of the vector sequentially.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D128473
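The ordering rule in the message above can be sketched in plain C++. This is only an illustration, not code from the pass: `packLowFirst` is a hypothetical helper, and the 4 x 8-bit to 32-bit shape is an assumed example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: component i of the vector lands in bits [8*i, 8*i + 8)
// of the scalar, so lower-numbered components map to lower-ordered bits.
std::uint32_t packLowFirst(const std::array<std::uint8_t, 4> &components) {
  std::uint32_t scalar = 0;
  for (std::size_t i = 0; i < components.size(); ++i)
    scalar |= static_cast<std::uint32_t>(components[i]) << (8 * i);
  return scalar;
}
```

On a little-endian machine this is the same value a 4-byte load of the components would produce, which is why reading elements in order and appending them is enough.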
Casey Carter [Fri, 24 Jun 2022 00:46:47 +0000 (17:46 -0700)]
[libcxx][test] Suppress complex<int> warnings when testing MSVC
Dmitri Gribenko [Fri, 24 Jun 2022 00:26:07 +0000 (02:26 +0200)]
clang: Tweak behaviour of warn_empty_while_body and warn_empty_if_body
Use the if/while statement right paren location instead of the end of the
condition expression to determine if the semicolon is on its own line, for the
purpose of not warning about code like this:
  while (foo())
    ;
Using the condition location meant that we would also not report a warning on
code like this:
  while (MACRO(a,
               b));
    body();
The right paren loc wasn't stored in the AST or passed into Sema::ActOnIfStmt
when this logic was first written.
Reviewed By: rnk, gribozavr2
Differential Revision: https://reviews.llvm.org/D128406
Greg Clayton [Mon, 23 May 2022 23:01:00 +0000 (16:01 -0700)]
Add support for decoding base64.
An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests.
Differential Revision: https://reviews.llvm.org/D126254
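For readers unfamiliar with the encoding, here is a minimal sketch of decoding standard padded base64. It is not the function added by this patch: `base64ValueSketch` and `decodeBase64Sketch` are hypothetical names, the bool-plus-out-parameter interface is an assumption, and the character range checks assume an ASCII execution character set.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps one character of the standard alphabet to its
// 6-bit value, or returns -1 for anything else.
int base64ValueSketch(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical decoder: returns false on malformed input.
bool decodeBase64Sketch(const std::string &in, std::string &out) {
  out.clear();
  if (in.size() % 4 != 0) return false;
  for (std::size_t i = 0; i < in.size(); i += 4) {
    unsigned bits = 0; // 24 bits collected from four characters
    unsigned pad = 0;  // number of '=' seen in this group
    for (std::size_t j = 0; j < 4; ++j) {
      char c = in[i + j];
      if (c == '=') {
        // Padding is only valid in the last two slots of the last group.
        if (i + 4 != in.size() || j < 2) return false;
        ++pad;
        bits <<= 6;
      } else {
        int v = base64ValueSketch(c);
        if (v < 0 || pad != 0) return false; // bad char, or data after '='
        bits = (bits << 6) | static_cast<unsigned>(v);
      }
    }
    out.push_back(static_cast<char>((bits >> 16) & 0xFF));
    if (pad < 2) out.push_back(static_cast<char>((bits >> 8) & 0xFF));
    if (pad < 1) out.push_back(static_cast<char>(bits & 0xFF));
  }
  return true;
}
```

Each group of four characters carries 24 bits, so it yields three bytes, or fewer when the final group is padded.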
Evgenii Stepanov [Thu, 23 Jun 2022 23:03:21 +0000 (16:03 -0700)]
Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed
integer overflow:
6148914691236517209 * 100 cannot be represented in
type 'long'
https://lab.llvm.org/buildbot/#/builders/5/builds/25185
This reverts commit
1b24fe34b06cd9f2337313f513a8b19f9a37c5de.
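The sanitizer report above is a signed 64-bit multiplication whose result does not fit. One way to detect that case without invoking undefined behaviour is sketched below. This is not the fix that later landed: `checkedMulSketch` is a hypothetical name, and `__builtin_mul_overflow` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cstdint>

// Hypothetical helper: returns true and stores the product when a * b fits in
// int64_t; returns false when the multiplication would overflow.
bool checkedMulSketch(std::int64_t a, std::int64_t b, std::int64_t &product) {
  // The builtin returns true when the mathematically exact result overflows.
  return !__builtin_mul_overflow(a, b, &product);
}
```

A caller can then saturate or bail out instead of computing the wrapped value.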
Frederik Gossen [Wed, 22 Jun 2022 20:36:24 +0000 (16:36 -0400)]
[MLIR] Add `decomposeMixedStridesOrOffsets` and `decomposeMixedSizes`
Add the reverse functions to the ViewLikeInterface's functions
`getMixedStrides`, `getMixedSizes`, and `getMixedOffsets`. The new functions
are useful to build view-like operations from an array of mixed static/dynamic
values.
Differential Revision: https://reviews.llvm.org/D128376
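The idea of decomposing mixed static/dynamic values can be sketched without MLIR. Everything here is a stand-in: `MixedEntrySketch`, `DecomposedSketch`, and `decomposeMixedSketch` are hypothetical names, integer ids stand in for SSA values, and the sentinel passed by the caller is an assumption rather than the constant MLIR uses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one mixed entry: either a compile-time constant
// or a reference to a runtime value.
struct MixedEntrySketch {
  bool isStatic;
  std::int64_t staticValue; // meaningful only when isStatic is true
  int dynamicValueId;       // meaningful only when isStatic is false
};

struct DecomposedSketch {
  std::vector<std::int64_t> staticValues; // one slot per entry
  std::vector<int> dynamicValueIds;       // only the dynamic entries, in order
};

// Splits the mixed list: static entries keep their constant, dynamic entries
// leave a sentinel in the static list and append their id to the dynamic list.
DecomposedSketch decomposeMixedSketch(const std::vector<MixedEntrySketch> &mixed,
                                      std::int64_t dynamicSentinel) {
  DecomposedSketch result;
  for (const MixedEntrySketch &entry : mixed) {
    if (entry.isStatic) {
      result.staticValues.push_back(entry.staticValue);
    } else {
      result.staticValues.push_back(dynamicSentinel);
      result.dynamicValueIds.push_back(entry.dynamicValueId);
    }
  }
  return result;
}
```

The two output lists are the shape a view-like op builder typically wants: a full-length static array plus the dynamic operands in order.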
Philip Reames [Thu, 23 Jun 2022 22:58:54 +0000 (15:58 -0700)]
[RISCV] Replace two calls to getMinRVVVectorSizeInBits with useRVVForFixedLengthVectors [nfc]
Daniel Bertalan [Thu, 23 Jun 2022 22:19:18 +0000 (00:19 +0200)]
[NFC][lld] Fix typos to test commit access
Jonas Devlieghere [Thu, 23 Jun 2022 21:14:49 +0000 (14:14 -0700)]
[lldb] Fix up Objective-C ISA pointers
Support stripping the PAC bits from Objective-C ISA pointers in the
Objective-C runtime plugin.
Hui Xie [Thu, 23 Jun 2022 20:54:23 +0000 (21:54 +0100)]
Revert "[libc++] P2321R2 section [tuple.tuple]. Adding C++23 constructors, assignment operators and swaps to `tuple`"
When merging the changes of <type_traits> header with the commits on
this header over the last month, several conflicts were mistakenly
resolved and the wrong branch was picked while resolving conflicts,
which leads to CI failure. In order to resolve the conflicts properly
with qualification CI job, this change is reverted.
This reverts commit
95733a55b986e73f4d8f5314e0d4557d8ae0b226.
Derek Schuff [Fri, 17 Jun 2022 20:09:43 +0000 (13:09 -0700)]
[WebAssembly][Object] Remove requirement that objects must have code sections
When parsing name and linking sections, we currently require that the object
must have a code section (it seems that this was intended to verify section
ordering). However it can be useful for binaries to have their code sections
stripped out (e.g. if we just want the debug info). In that case we need
the rest of the known sections (so e.g. we know how many functions there
are, to verify the name section) but not the actual code.
I've removed the restriction completely. I think this is OK because the
section-parsing code already checks function and global indices in many
places for validity and will return appropriate errors if the relevant sections
are missing. Also we can't just replace the requirement of seeing a code section
with a requirement that we see a function or global section, because a binary
may just not have any functions or globals.
But there's only a problem if the name or linking section tries to name a
nonexistent function.
Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084
Differential Revision: https://reviews.llvm.org/D128094
Chelsea Cassanova [Thu, 23 Jun 2022 15:38:18 +0000 (11:38 -0400)]
[lldb/Fuzzer] Have fuzzers write artifacts to specific directory
This makes the LLDB fuzzers write their fuzzer artifacts to
their own directory in the build directory. It also adds an artifact
prefix to the target fuzzer to make it easier to tell which fuzzer
wrote the artifact.
Differential revision: https://reviews.llvm.org/D128450
Jim Ingham [Thu, 23 Jun 2022 20:53:59 +0000 (13:53 -0700)]
The help string for stop-on-shared-library-load was copied to stop-on-exec.
Fix it so it says it does what it does.
Siva Chandra Reddy [Thu, 23 Jun 2022 20:53:09 +0000 (20:53 +0000)]
[libc] Revert "Eliminate the internal header library target."
This reverts commit
306f2731f482d32ccf557996ff122f7293cb30cb. The CMake
version used by the bots does not like it.
Siva Chandra Reddy [Thu, 9 Jun 2022 20:23:33 +0000 (20:23 +0000)]
[libc][NFC] Eliminate the internal header library target.
The internal header library target with name suffix .__header_library
has been removed as it serves no purpose now.
Siva Chandra Reddy [Wed, 22 Jun 2022 22:51:26 +0000 (22:51 +0000)]
[libc][NFC] Convert pthread tests which create threads to integration tests.
Congzhe Cao [Thu, 23 Jun 2022 20:26:24 +0000 (16:26 -0400)]
[LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.
The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be placed as the
outermost loop, loop1 should be placed one more level inside, and loop2
one more level inside, etc. What loop cache analysis does is not only more
comprehensive than the current cost model, it is also a "one-shot" query
which means that we only need to query it once during the entire loop
interchange pass, which is better than the current cost model where we
query it every time we check whether it is profitable to interchange two
loops. Thus complexity is reduced, especially after D120386 where we do
more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections.
One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is to ensure that loop cache
analysis receives a valid cache line size for correct analysis.
Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it
as fall-back in case the new cost model did not run successfully. This is
because currently we have some limitations in delinearization, which sometimes
makes loop cache analysis bail out. The longer term goal is to enhance
delinearization and eventually remove the legacy cost model completely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
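As a reminder of what the transformation itself does, the sketch below shows one loop nest in two loop orders. It is an illustration only, not a test from the patch: the function names and the row-major flat-vector layout are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Column index in the outer loop: consecutive inner iterations touch elements
// that are `cols` apart in a row-major layout, which is cache-unfriendly.
long sumColumnOuter(const std::vector<int> &m, std::size_t rows,
                    std::size_t cols) {
  long total = 0;
  for (std::size_t j = 0; j < cols; ++j)
    for (std::size_t i = 0; i < rows; ++i)
      total += m[i * cols + j];
  return total;
}

// The interchanged nest: the row index is outermost, so the inner loop walks
// contiguous memory. The result is the same; only the access order changes.
long sumRowOuter(const std::vector<int> &m, std::size_t rows,
                 std::size_t cols) {
  long total = 0;
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      total += m[i * cols + j];
  return total;
}
```

A cost model like the one described above is what decides that the second order is the profitable one for this access pattern.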
Siva Chandra Reddy [Wed, 22 Jun 2022 21:27:57 +0000 (21:27 +0000)]
[libc][NFC] Convert threads unittests into integration tests.
This is mostly a mechanical change. In a future pass, all tests from
pthread which create threads will also be converted to integration tests.
Some thread-related features are tightly coupled with the loader. So,
they can only be tested with the in-house loader. Hence, going forward, all
tests which create threads will have to be integration tests.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D128381
Hui Xie [Thu, 12 May 2022 12:23:11 +0000 (13:23 +0100)]
[libc++] P2321R2 section [tuple.tuple]. Adding C++23 constructors, assignment operators and swaps to `tuple`
1. for constructors that take cvref variation of tuple<UTypes...>, there
used to be two SFINAE helper _EnableCopyFromOtherTuple,
_EnableMoveFromOtherTuple. And the implementations of these two helpers
seem to slightly differ from the spec. But now, we need 4 variations.
Instead of adding another two, this change refactored it to a single one
_EnableCtrFromUTypesTuple, which directly maps to the spec without
changing the C++11 behaviour. However, we need the helper __copy_cvref_t
to get the type of std::get<i>(cvref tuple<Utypes...>) for different
cvref, so I made __copy_cvref_t to be available in C++11.
2. for constructors that take variations of std::pair, there used to be
four helpers _EnableExplicitCopyFromPair, _EnableImplicitCopyFromPair,
_EnableImplicitMoveFromPair, _EnableExplicitMoveFromPair. Instead of
adding another four, this change refactored into two helper
_EnableCtrFromPair and _BothImplicitlyConvertible. This also removes the
need to use _nat
3. for const member assignment operator, since the requirement is very
simple, I haven't refactored the old code but instead directly adding
the new c++23 code.
4. for const swap, I pretty much copy pasted the non-const version to make
these overloads look consistent
5. while doing these changes, I found two of the old constructors weren't
marked constexpr for C++20 but they should be. fixed them and added unit
tests
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D116621
Louis Dionne [Mon, 20 Jun 2022 21:10:53 +0000 (17:10 -0400)]
[libc++] Add a test to pin down the set of transitive public includes
A situation that happens fairly often in libc++ is that we remove some
transitive includes in a header (either purposefully or not) and that
ends up breaking users. Of course, we want to be able to remove our
transitive includes, however it's also good to have a grip on that
to know which commit changed what and when. Furthermore, it's good
to accumulate include removals for a couple of releases to avoid
breaking users at every release for this reason.
This commit adds a test that should break whenever we remove an
include. Hence, it should allow us to track which headers include
which other headers transitively, giving us a traceable way to
remove headers.
Differential Revision: https://reviews.llvm.org/D128236
David Blaikie [Wed, 22 Jun 2022 23:26:23 +0000 (23:26 +0000)]
DebugInfo: Fully integrate ctor type homing into 'limited' debug info
Simplify debug info back to just "limited" or "full" by rolling the ctor
type homing fully into the "limited" debug info.
Also fix a bug I found along the way that was causing ctor type homing
to kick in even when something could be vtable homed (where vtable
homing is stronger/more effective than ctor homing) - fixing at the same
time as it keeps the tests (that were testing only "limited non ctor"
homing and now test ctor homing) passing.
Xiang Li [Sun, 19 Jun 2022 22:37:00 +0000 (15:37 -0700)]
[HLSL] Enable half type for hlsl.
HLSL supports half type.
When enable-16bit-types is not set, half will be treated as float.
When enable-16bit-types is set, half will be treated as a real 16-bit float type and mapped to the LLVM half type.
Also change CXXABI to Microsoft to match dxc behavior.
The mangled name for half is "$f16@" when half is treated as a native half type and "$halff@" when it is treated as float.
In AST, half is still half.
The special handling is done in clang CodeGen: when NativeHalfType is false, half will be translated into float.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124790
Nico Weber [Thu, 23 Jun 2022 16:11:47 +0000 (12:11 -0400)]
[lld, ELF and mac] Add --time-trace=<file>, remove --time-trace-file=<file>
`--time-trace=foo` has the same behavior as `--time-trace --time-trace-file=<file>`
had previously.
Also, for mac, make --time-trace-granularity *not* imply --time-trace, to match
behavior of the ELF port.
Differential Revision: https://reviews.llvm.org/D128451
Philip Reames [Thu, 23 Jun 2022 19:13:45 +0000 (12:13 -0700)]
[LV] Avoid a crash when costing a uniform store which doesn't correspond to a legal scatter
If we have an unaligned uniform store, then when costing a scalable VF we can't emit code to scalarize it. (Well, we could, but we haven't implemented that case.) This change replaces an assert with a cost-model bailout such that we reject vectorization with the scalable VF instead of crashing.
Joseph Huber [Thu, 23 Jun 2022 14:06:26 +0000 (10:06 -0400)]
[CUDA] Do not embed a fatbinary when using the new driver
Previously, when using the new driver we created a fatbinary with the
PTX and Cubin output. This was mainly done in an attempt to create some
backwards compatibility with the existing CUDA support that embeds the
fatbinary in each TU. This will most likely be more work than necessary
to actually implement. The linker wrapper cannot do anything with these
embedded PTX files because we do not know how to link them, and if we
did want to include multiple files it should go through the
`clang-offload-packager` instead. Also this didn't respect the setting
that disables embedding PTX (although it wasn't used anyway).
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D128441
Jin Xin Ng [Wed, 1 Jun 2022 17:49:36 +0000 (10:49 -0700)]
[ThinLTO][ELF] Add --thinlto-emit-index-files option
Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution.
Note, this option does not currently work with:
--thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR).
--thinlto-prefix-replace support is left unimplemented but can be implemented if needed
Differential Revision: https://reviews.llvm.org/D127777
Nicolas Vasilache [Thu, 23 Jun 2022 19:14:23 +0000 (12:14 -0700)]
[mlir][Transform] Fix applyToOne corner case when no op is matched.
Such situations manifest themselves with an empty payload which ends up producing empty results.
In such cases, we still want to match the transform op contract and return as many empty SmallVector<Operation*>
as the op requires.
Differential Revision: https://reviews.llvm.org/D128456
Nathan James [Thu, 23 Jun 2022 18:59:30 +0000 (19:59 +0100)]
[clang-tidy] Extend spelling for CheckOptions
The current way to specify CheckOptions is pretty verbose and unintuitive.
Given that the options are a dictionary it makes much more sense to treat them as such in the config files.
Example:
```
CheckOptions: {SomeCheck.Option: true, SomeCheck.OtherOption: 'ignore'}
# Or
CheckOptions:
  SomeCheck.Option: true
  SomeCheck.OtherOption: 'ignore'
```
This change will still handle the old syntax with no issue, ensuring we don't break users' current config files.
The only observable differences are support for the new syntax and that `-dump-config` will emit using the new syntax.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D128337
Med Ismail Bennani [Thu, 23 Jun 2022 18:52:25 +0000 (11:52 -0700)]
[llvm] Update module map to include the `IR/ConstantFold` header
This should fix the build failure occurring when enabling modules
(LLVM_ENABLE_MODULES=On):
https://green.lab.llvm.org/green/job/lldb-cmake/44785/
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Sam McCall [Thu, 23 Jun 2022 18:33:26 +0000 (20:33 +0200)]
[pseudo] Handle no-reductions-available on the fastpath. NFC
This is a ~2% speedup.
Alexey Bataev [Wed, 22 Jun 2022 16:31:03 +0000 (09:31 -0700)]
[SLP]Fix a crash when reorder masked gather nodes with reused scalars.
If the masked gather nodes must be reordered, we can just reorder
scalars, just like for gather nodes. But if the node contains reused
scalars, it must be handled same way as a regular vectorizable node,
since we need to reorder the reused mask, not the scalars directly.
Differential Revision: https://reviews.llvm.org/D128360
Peter Klausler [Wed, 22 Jun 2022 20:27:59 +0000 (13:27 -0700)]
[flang][runtime] Improve G0 output editing
G0 output editing should never overflow an output field and fill it
with asterisks. It should also never elide the "E" in an exponent
field, even if it has more than three digits.
Differential Revision: https://reviews.llvm.org/D128396
Nathan James [Thu, 23 Jun 2022 18:23:08 +0000 (19:23 +0100)]
[clang-tidy] Add `-verify-config` command line argument
Adds a `-verify-config` command line argument, that when specified will verify the Checks and CheckOptions fields in the config files:
- A warning will be raised for any check that doesn't correspond to a registered check, a suggestion will also be emitted for close misses.
- A warning will be raised for any check glob(containing *) that doesn't match any registered check.
- A warning will be raised for any CheckOption that isn't read by any registered check, a suggestion will also be emitted for close misses.
This can be useful when debugging why a certain check isn't enabled, or whether the options are being handled as you expect them to be.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D127446
Peter Klausler [Tue, 21 Jun 2022 00:22:33 +0000 (17:22 -0700)]
[flang] Make SQRT folding exact
Replace the latter half of the SQRT() folding algorithm with code that
calculates an exact root with extra rounding bits, and then lets the
usual normalization and rounding code do the right thing. Extend
tests to catch regressions.
Differential Revision: https://reviews.llvm.org/D128395
Peter Klausler [Fri, 17 Jun 2022 23:31:17 +0000 (16:31 -0700)]
[flang] Fix wording of warning message
"division on intrinsic call" should read "division by zero on intrinsic call".
Differential Revision: https://reviews.llvm.org/D128394
Sam McCall [Thu, 23 Jun 2022 18:06:04 +0000 (20:06 +0200)]
[pseudo] Store last node popped in the queue, not its parent(s). NFC
We have to walk up to the last node to find the start token, but no need
to go even one node further.
This is one node fewer to store, but more importantly if the last node
happens to have multiple parents we avoid storing the sequence multiple times.
This saves ~5% on glrParse.
Based on a comment by hokein@ on https://reviews.llvm.org/D128307
Peter Klausler [Wed, 22 Jun 2022 20:24:51 +0000 (13:24 -0700)]
[flang][runtime] FLUSH(bad or unconnected unit number) is an error
Some I/O control statements are no-ops when attempted on a bad or
unconnected UNIT=, but the standard says that FLUSH is an error
in that case.
Differential Revision: https://reviews.llvm.org/D128392
Wolfgang Pieb [Wed, 22 Jun 2022 20:02:01 +0000 (13:02 -0700)]
[Inline] Introduce a backend option to suppress inlining of functions with large stack sizes.
The hidden option max-inline-stacksize=<N> prevents the inlining of functions
with a stack size larger than N.
Reviewed By: mtrofin, aeubanks
Differential Review: https://reviews.llvm.org/D127988
Slava Zakharin [Thu, 23 Jun 2022 17:10:56 +0000 (10:10 -0700)]
[mlir][math] Lower atan to libm
Differential Revision: https://reviews.llvm.org/D128454
Sam McCall [Tue, 21 Jun 2022 22:20:38 +0000 (00:20 +0200)]
[pseudo] Store reduction sequences by pointer in heaps, instead of by value.
Copying sequences around as the heap resized is significantly expensive.
This speeds up glrParse by ~35% (2.4 => 3.25 MB/s)
Differential Revision: https://reviews.llvm.org/D128307
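The general technique of keeping large elements out of a heap can be sketched as follows. This is not the clang-pseudo data structure: `SequenceSketch`, `ByPrioritySketch`, and `SequenceHeapSketch` are hypothetical names, and the use of `std::deque` for stable storage is an assumption made for the illustration.

```cpp
#include <algorithm>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical payload that is expensive to copy or move around.
struct SequenceSketch {
  int priority;
  std::vector<int> symbols;
};

// Orders pointers by the priority of the sequence they point to.
struct ByPrioritySketch {
  bool operator()(const SequenceSketch *a, const SequenceSketch *b) const {
    return a->priority < b->priority;
  }
};

class SequenceHeapSketch {
  // push_back on a deque never invalidates references to existing elements,
  // so the pointers stored in the heap stay valid as storage grows.
  std::deque<SequenceSketch> storage;
  std::vector<const SequenceSketch *> heap;

public:
  void push(SequenceSketch s) {
    storage.push_back(std::move(s));
    heap.push_back(&storage.back());
    std::push_heap(heap.begin(), heap.end(), ByPrioritySketch());
  }
  const SequenceSketch &top() const { return *heap.front(); }
  void pop() {
    std::pop_heap(heap.begin(), heap.end(), ByPrioritySketch());
    heap.pop_back();
  }
  bool empty() const { return heap.empty(); }
};
```

Heap operations now swap pointers only, so resizing or sifting never copies a whole sequence.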
Matthias Springer [Thu, 23 Jun 2022 17:28:46 +0000 (19:28 +0200)]
[mlir][bufferization][NFC] Make `escape` a dialect attribute
All bufferizable ops that bufferize to an allocation receive a `bufferization.escape` attribute during TensorCopyInsertion.
Differential Revision: https://reviews.llvm.org/D128137
Peter Klausler [Fri, 17 Jun 2022 21:12:13 +0000 (14:12 -0700)]
[flang] Fix bogus errors from SIZE/SHAPE/UBOUND on assumed-shape
While it is indeed an error to use SIZE, SHAPE, or UBOUND on an
assumed-shape dummy argument without also supplying a DIM= argument
to the intrinsic function, it is *not* an error to use these intrinsic
functions on sections or expressions of such arrays. Refine the test
used for the error message.
Differential Revision: https://reviews.llvm.org/D128391
Sam McCall [Tue, 21 Jun 2022 20:19:06 +0000 (22:19 +0200)]
[pseudo] Turn glrReduce into a class, reuse storage across calls.
This is a ~5% speedup, we no longer have to allocate the priority queues and
other collections for each reduction step where we use them.
It's also IMO easier to understand the structure of a class with methods vs a
function with nested lambdas.
Differential Revision: https://reviews.llvm.org/D128301
Philip Reames [Thu, 23 Jun 2022 17:19:45 +0000 (10:19 -0700)]
[RISCV] Fix a crash in InsertVSETVLI where we hadn't properly guarded for a SEWLMULRatioOnly abstract state
A forward abstract state can be in the special SEWLMULRatioOnly state which means we're not allowed to inspect its fields. The scalar to vector move case was missing a guard, and we'd crash on an assert. Test cases included.
Florian Hahn [Thu, 23 Jun 2022 16:46:15 +0000 (18:46 +0200)]
[ConstraintElimination] Use stable_sort to sort worklist.
If there are multiple constraints in the same block, at the moment the
order they are processed may be different depending on the sort
implementation.
Use stable_sort to ensure consistent ordering.
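A small sketch of why the choice of sort matters here. The types are stand-ins, not the pass's worklist entries: each pair is an assumed (block number, insertion index), and `sortByBlockSketch` is a hypothetical name.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Sorts (block number, insertion index) pairs by block number only.
// std::stable_sort guarantees that entries comparing equal keep their
// original relative order; std::sort gives no such guarantee, so its result
// may differ between standard library implementations.
std::vector<std::pair<int, int> >
sortByBlockSketch(std::vector<std::pair<int, int> > worklist) {
  std::stable_sort(worklist.begin(), worklist.end(),
                   [](const std::pair<int, int> &a,
                      const std::pair<int, int> &b) {
                     return a.first < b.first;
                   });
  return worklist;
}
```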
Joseph Huber [Mon, 13 Jun 2022 19:26:55 +0000 (15:26 -0400)]
[Offloading] Embed the target features in the OffloadBinary
The target features are necessary for correctly compiling most programs
in LTO mode. Currently, these are derived in clang at link time and
passed as an argument to the linker wrapper. This is problematic because
it requires knowing the required toolchain at link time, which should
not be necessary. Instead, these features should be embedded into the
offloading binary so we can unify them in the linker wrapper for LTO.
This also required changing the offload packager to interpret multiple
arguments as concatenation with a comma. This is so we can still use the
`,` separator for the argument list.
Depends on D127246
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127686
Arthur Eubanks [Wed, 22 Jun 2022 19:40:03 +0000 (12:40 -0700)]
[docs][NewPM] Add more info on why accessing mutable outer analyses is disallowed
Reviewed By: asbirlea, rnk
Differential Revision: https://reviews.llvm.org/D128374
Peter Klausler [Fri, 17 Jun 2022 19:14:46 +0000 (12:14 -0700)]
[flang][runtime] Handle READ of non-UTF-8 data into multi-byte CHARACTER
When a READ statement reads into a CHARACTER(2 or 4) variable from a
unit whose encoding is not UTF-8, don't copy bytes directly; they must
each be zero-extended.
Differential Revision: https://reviews.llvm.org/D128390
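The difference between copying bytes and zero-extending them can be sketched in C++. This is not the runtime's code: `widenBytesSketch` is a hypothetical name, and `char32_t` stands in for a CHARACTER(4) element.

```cpp
#include <string>

// Hypothetical helper: each input byte becomes one 32-bit character whose
// value is the byte's unsigned value, with the upper bits set to zero.
std::u32string widenBytesSketch(const std::string &bytes) {
  std::u32string wide;
  for (char c : bytes) {
    // Go through unsigned char first so bytes >= 0x80 are not sign-extended.
    wide.push_back(static_cast<char32_t>(static_cast<unsigned char>(c)));
  }
  return wide;
}
```

Copying the raw bytes instead would pack four input bytes into a single 4-byte character, which is the behaviour the message above rules out.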
Arthur Eubanks [Thu, 23 Jun 2022 16:56:31 +0000 (09:56 -0700)]
[test][GlobalOpt] Update precommitted test
Peter Klausler [Fri, 17 Jun 2022 18:45:14 +0000 (11:45 -0700)]
[flang][runtime] Respect PAD='NO' on READ/WRITE
The check for the PAD= setting should examine the mutable modes
of the current I/O statement, not the persistent modes of the
I/O unit.
Differential Revision: https://reviews.llvm.org/D128389
gpetters94 [Tue, 21 Jun 2022 16:51:18 +0000 (16:51 +0000)]
Adding a named op for grouped convolutions
Sam McCall [Tue, 21 Jun 2022 19:47:14 +0000 (21:47 +0200)]
[pseudo] Add a fast-path to GLR reduce when both pop and push are trivial
In general we split a reduce into pop/push, so concurrently-available reductions
can run in the correct order. The data structures for this are expensive.
When only one reduction is possible at a time, we need not do this: we can pop
and immediately push instead.
Strictly this is correct whenever we yield one concurrent PushSpec.
This patch recognizes a trivial but common subset of these cases:
- there must be no pending pushes and only one head available to pop
- the head must have only one reduction rule
- the reduction path must be a straight line (no multiple parents)
On my machine this speeds up by 2.12 -> 2.30 MB/s = 8%
Differential Revision: https://reviews.llvm.org/D128299
Sam McCall [Thu, 23 Jun 2022 16:16:49 +0000 (18:16 +0200)]
Reland "[pseudo] Track heads as GSS nodes, rather than as "pending actions"."
This reverts commit
2c80b5319870b57fbdbb6c9cef9c86c26c65371d.
Fixes LRTable::buildForTest to create states that are referenced but
have no actions.
Peter Klausler [Fri, 17 Jun 2022 18:20:29 +0000 (11:20 -0700)]
[flang] Fix READ/WRITE with POS= on stream units, with refactoring
First, ExternalFileUnit::SetPosition was being used both as a utility
within the class' member functions as well as an API from I/O statement
processing. Make it private, and add APIs for SetStreamPos and SetDirectRec.
Second, ensure that SetStreamPos for POS= positioning in a stream
doesn't leave the current record number and endfile record number
in an arbitrary state. In stream I/O they are used only to manage
end-of-file detection, and shouldn't produce false positive results
from IsAtEnd() after repositioning.
Differential Revision: https://reviews.llvm.org/D128388
Sam McCall [Thu, 23 Jun 2022 16:16:03 +0000 (18:16 +0200)]
Revert "[pseudo] Track heads as GSS nodes, rather than as "pending actions"."
This reverts commit
e3ec054dfdf48f19cb6726cb3f4965b9ab320ed9.
Tests fail in asserts mode: https://lab.llvm.org/buildbot/#/builders/109/builds/41217
Philip Reames [Thu, 23 Jun 2022 16:11:24 +0000 (09:11 -0700)]
[BasicTTI] Avoid crash when costing scalable select expansion
If the target has chosen to expand a scalable vector type, BasicTTI tries to scalarize and we'd crash. As a minimum, we should return an invalid cost instead.
The added tests provide coverage for the moment, but given they show a number of gaps in RISCV costing, they're likely not to cover this code path long term.
Jonas Devlieghere [Thu, 23 Jun 2022 15:08:36 +0000 (08:08 -0700)]
[lldb] Make thread safety the responsibility of the log handlers
Drop the thread-safe flag and make the locking strategy the
responsibility of the individual log handler.
Previously we got away with a non-thread safe mode because we were using
unbuffered streams that rely on the underlying syscalls/OS for
synchronization. With the introduction of log handlers, we can have
arbitrary logic involved in writing out the logs. With this patch the
log handlers can pick the most appropriate locking strategy for their
particular implementation.
Differential revision: https://reviews.llvm.org/D127922
Jonas Devlieghere [Thu, 23 Jun 2022 15:06:17 +0000 (08:06 -0700)]
[lldb] Support a buffered logging mode
This patch adds a buffered logging mode to lldb. A buffer size can be
passed to `log enable` with the -b flag. If no buffer size is specified,
logging is unbuffered.
Differential revision: https://reviews.llvm.org/D127986
Valentin Clement [Thu, 23 Jun 2022 16:04:50 +0000 (18:04 +0200)]
[flang] Handle boxed characters that are values when doing a conversion
Character conversion requires memory storage as it operates on a
sequence of code points.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128438
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Val Donaldson [Thu, 23 Jun 2022 16:03:06 +0000 (18:03 +0200)]
[flang] Increase support for intrinsic module procedures
* Make Semantics test doconcurrent01.f90 an expected failure pending a fix
for a problem in recognizing a PURE prefix specifier for a specific procedure
that occurs in new intrinsic module source code,
* review update
* review update
* Increase support for intrinsic module procedures
The f18 standard defines 5 intrinsic modules that define varying numbers
of procedures, including several operators:
2 iso_fortran_env
55 ieee_arithmetic
10 ieee_exceptions
0 ieee_features
6 iso_c_binding
There are existing fortran source files for each of these intrinsic modules.
This PR adds generic procedure declarations to these files for procedures
that do not already have them, together with associated specific procedure
declarations. It also adds the capability of recognizing intrinsic module
procedures in lowering code, making it possible to use existing language
intrinsic code generation for intrinsic module procedures for both scalar
and elemental calls. Code can then be generated for intrinsic module
procedures using existing options, including front end folding, direct
inlining, and calls to runtime support routines. Detailed code generation
is provided for several procedures in this PR, with others left to future PRs.
Procedure calls that reach lowering and don't have detailed implementation
support will generate a "not yet implemented" message with a recognizable name.
The generic procedures in these modules may each have as many as 36 specific
procedures. Most specific procedures are generated via macros that generate
type specific interface declarations. These specific declarations provide
detailed argument information for each individual procedure call, similar
to what is done via other means for standard language intrinsics. The
modules only provide interface declarations. There are no procedure
definitions, again in keeping with how language intrinsics are processed.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128431
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
LLVM GN Syncbot [Thu, 23 Jun 2022 15:49:40 +0000 (15:49 +0000)]
[gn build] Port
4045b62d4cc9
Craig Topper [Thu, 23 Jun 2022 15:41:12 +0000 (08:41 -0700)]
[RISCV] Disable <vscale x 1 x *> types with Zve32x or Zve32f.
According to the vector spec, mf8 is not supported for i8 if ELEN
is 32. Similarly mf4 is not supported for i16/f16 or mf2 for i32/f32.
Since RVVBitsPerBlock is 64 and LMUL is calculated as
((MinNumElements * ElementSize) / RVVBitsPerBlock) this means we
need to disable any type with MinNumElements==1.
For generic IR, these types will now be widened in type legalization.
For RVV intrinsics, we'll probably hit a fatal error somewhere. I plan
to work on disabling the intrinsics in the riscv_vector.h header.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D128286
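The arithmetic behind this can be worked through with a short sketch. The names `LmulSketch`, `computeLmulSketch`, and `sewFitsSketch` are hypothetical, the code assumes power-of-two inputs, and the `SEW <= LMUL * ELEN` check is my reading of the vector spec's rule for fractional LMUL rather than something stated in the message.

```cpp
// LMUL as a fraction, e.g. {1, 8} for mf8 or {2, 1} for m2.
struct LmulSketch {
  unsigned numerator;
  unsigned denominator;
};

// LMUL = (MinNumElements * ElementSize) / RVVBitsPerBlock, with
// RVVBitsPerBlock = 64 as stated above. Assumes power-of-two inputs.
LmulSketch computeLmulSketch(unsigned minNumElements, unsigned elementBits) {
  const unsigned rvvBitsPerBlock = 64;
  unsigned bits = minNumElements * elementBits;
  LmulSketch lmul;
  if (bits >= rvvBitsPerBlock) {
    lmul.numerator = bits / rvvBitsPerBlock;
    lmul.denominator = 1;
  } else {
    lmul.numerator = 1;
    lmul.denominator = rvvBitsPerBlock / bits;
  }
  return lmul;
}

// Assumed rule: an element width (SEW) is usable only if SEW <= LMUL * ELEN.
bool sewFitsSketch(LmulSketch lmul, unsigned elementBits, unsigned elen) {
  return elementBits * lmul.denominator <= elen * lmul.numerator;
}
```

With ELEN = 32, `<vscale x 1 x i8>` gives mf8, `<vscale x 1 x i16>` gives mf4, and `<vscale x 1 x i32>` gives mf2, and none of the three passes the check, which matches the cases listed in the message.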
Nico Weber [Wed, 22 Jun 2022 14:58:33 +0000 (10:58 -0400)]
[lld/mac] Add a few TimeTraceScopes
Identical literal folding takes ~1.4% of the time, and was missing
from the trace.
Signature computation still needs ~2.2% of the time, so probably worth
explicitly marking its contribution to "Write output file" (9.1%)
Differential Revision: https://reviews.llvm.org/D128343
Craig Topper [Thu, 23 Jun 2022 15:30:43 +0000 (08:30 -0700)]
[RISCV] Add macrofusion infrastructure and one example usage.
This adds the macrofusion plumbing and support fusing LUI+ADDI(W).
This is similar to D73643, but handles a different case. Other cases
can be added in the future.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D128393
Joe Nash [Mon, 20 Jun 2022 13:41:38 +0000 (09:41 -0400)]
[AMDGPU] gfx11 Select on Buffer Atomic FAdd Rtn type
Reviewed By: #amdgpu, foad, rampitec
Differential Revision: https://reviews.llvm.org/D128205
Florian Hahn [Thu, 23 Jun 2022 15:27:33 +0000 (17:27 +0200)]
Revert "[ConstraintElimination] Transfer info from ULT to signed system."
This reverts commit
316e106f49c4c86f3485d69d1539e2aed12251c0.
This breaks a bot with expensive checks.
Sam McCall [Tue, 21 Jun 2022 19:22:22 +0000 (21:22 +0200)]
[pseudo] Track heads as GSS nodes, rather than as "pending actions".
IMO this model is simpler to understand (borrowed from the LR0 patch D127357).
It also makes error recovery easier to implement, as we have a simple list of
head nodes lying around to recover from when needed.
(It's not quite as nice as LR0 in this respect though).
It's slightly slower (2.24 -> 2.12 MB/S on my machine = 5%) but nothing close
to as bad as LR0.
However
- I think we'd have to eat a little performance loss otherwise to implement
error recovery.
- this frees up some complexity budget for optimizations like fastpath push/pop
(this + fastpath is already faster than head)
- I haven't changed the data structure here and it's now pretty dumb, we can
make it faster
Differential Revision: https://reviews.llvm.org/D128297
Mark de Wever [Tue, 28 Dec 2021 17:48:04 +0000 (18:48 +0100)]
[libc++][format] Copy code to new location.
This is a helper patch to ease the reviewing of D128139.
The originals will be removed at a later time when all formatters are
converted to the new style. (Floating-point and pointer aren't up for
review yet.)
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D128367
Florian Hahn [Thu, 23 Jun 2022 15:17:01 +0000 (17:17 +0200)]
[ConstraintElimination] Transfer info from ULT to signed system.
If A u< B holds, then A s>= 0 && A s< B holds if B s>= 0.
https://alive2.llvm.org/ce/z/RrNxHh
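A brute-force sanity check of the implication, independent of the Alive2 proof linked above. This is a minimal sketch that assumes 8-bit two's-complement values; the helper names are hypothetical and not part of the patch.

```python
BITS = 8  # assumed width, small enough to enumerate every pair


def to_signed(x: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return x - (1 << BITS) if x >= 1 << (BITS - 1) else x


def ult_transfers_to_signed() -> bool:
    """Check: A u< B and B s>= 0 implies A s>= 0 and A s< B."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            if a < b and to_signed(b) >= 0:
                if not (to_signed(a) >= 0 and to_signed(a) < to_signed(b)):
                    return False
    return True
```

The precondition B s>= 0 matters: without it, A = 1 and B = 255 satisfy A u< B, yet B is -1 when read as signed, so A s< B fails.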
Jan Svoboda [Thu, 23 Jun 2022 15:15:13 +0000 (17:15 +0200)]
[clang][driver] NFC, test: Make test output order-independent
Daniel Bertalan [Thu, 23 Jun 2022 15:07:15 +0000 (11:07 -0400)]
[lld-macho] Use source information in duplicate symbol errors
Similarly to how undefined symbol diagnostics were changed in D128184,
we now show where in the source file duplicate symbols are defined:
ld64.lld: error: duplicate symbol: _foo
>> defined in bar.c:42
>> /path/to/bar.o
>> defined in baz.c:1
>> /path/to/libbaz.a(baz.o)
For objects that don't contain DWARF data, the format is unchanged.
A slight difference to undefined symbol diagnostics is that we don't
print the name of the symbol on the third line, as it's already
contained on the first line.
Differential Revision: https://reviews.llvm.org/D128425
Bradley Smith [Fri, 17 Jun 2022 10:45:27 +0000 (10:45 +0000)]
[AArch64][SVE] Match (add x (lsr/asr y c)) -> usra/ssra x y c
Differential Revision: https://reviews.llvm.org/D128045
Baptiste Saleil [Thu, 23 Jun 2022 14:16:20 +0000 (10:16 -0400)]
[AMDGPU] Flush the vmcnt counter in loop preheaders when necessary
waitcnt vmcnt instructions are currently generated in loop bodies before using
values loaded outside of the loop. In some cases, it is better to flush the
vmcnt counter in a loop preheader before entering the loop body. This patch
detects these cases and generates waitcnt instructions to flush the counter.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D115747
Nico Weber [Thu, 23 Jun 2022 14:35:48 +0000 (10:35 -0400)]
Revert "[fastalloc] Support allocating specific register class in fastalloc"
This reverts commit
719658d078c4093d1ee716fb65ae94673df7b22b.
Breaks a few things, see comments on https://reviews.llvm.org/D128437
There's disagreement about the best fix.
So let's keep HEAD green while discussions are happening.
Joseph Huber [Thu, 23 Jun 2022 14:36:25 +0000 (10:36 -0400)]
[Binary] Fix leftover line
Joseph Huber [Thu, 23 Jun 2022 14:22:05 +0000 (10:22 -0400)]
[Binary] Reserve the correct size for the OffloadBinary
Summary:
When writing the offload binary, we use a SmallVector. We already know
the size that we expect the buffer to take up so we should reserve all
that memory up-front to improve performance. Also this patch adds some
extra sanity checks for the binary format for safety.
Nikita Popov [Thu, 23 Jun 2022 14:30:02 +0000 (16:30 +0200)]
[BasicAA] Add test for call incorrectly treated as escape source (NFC)
David Green [Thu, 23 Jun 2022 14:25:24 +0000 (15:25 +0100)]
[ValueTracking] Teach isKnownNonZero that a vscale is never 0.
An llvm.vscale will always be at least 1, never zero. Teaching that to
isKnownNonZero can help fold away some statically known compares.
Differential Revision: https://reviews.llvm.org/D128217
Ilya Biryukov [Thu, 23 Jun 2022 13:52:16 +0000 (15:52 +0200)]
[Sema] Fix assertion failure when instantiating requires expression
Fixes #54629.
The crash is caused by the double template instantiation.
See the added test. Here is what happens:
- Template arguments for the partial specialization get instantiated.
- This causes instantiation into the corresponding requires
expression.
- `TemplateInstantiator` correctly handles instantiation of parameters
inside `RequiresExprBody` and instantiates the constraint expression
inside the `NestedRequirement`.
- To build the substituted `NestedRequirement`, `TemplateInstantiator`
calls `Sema::BuildNestedRequirement`, which calls
`CheckConstraintSatisfaction`, which results in another template
instantiation (with empty template arguments). This seems to be an
implementation detail to handle constraint satisfaction and is not
required by the standard.
- The recursive template instantiation tries to find the parameter
inside `RequiresExprBody` and fails with the corresponding assertion.
Note that this only happens as both instantiations happen with the class
partial template specialization set as `Sema.CurContext`, which is
considered a dependent `DeclContext`.
To fix the assertion, avoid doing the recursive template instantiation
and instead evaluate resulting expressions in-place.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D127487
Florian Hahn [Thu, 23 Jun 2022 14:04:45 +0000 (16:04 +0200)]
[LSR] Move transform test from test/Analysis to test/Transforms.
Also auto-generate check lines.
Jay Foad [Thu, 23 Jun 2022 13:59:30 +0000 (14:59 +0100)]
[AMDGPU] Use -check-prefixes in a test. NFC.
Florian Hahn [Thu, 23 Jun 2022 13:57:59 +0000 (15:57 +0200)]
[ConstraintElimination] Transfer info from SLT to unsigned system.
If A s< B holds, then A u< B also holds, if A s>= 0.
https://alive2.llvm.org/ce/z/J4JZuN
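The signed-to-unsigned direction can be spot-checked the same way. Again this is a sketch under an assumed 8-bit width, with hypothetical helper names; it complements the Alive2 link rather than replacing it.

```python
BITS = 8  # assumed width, small enough to enumerate every pair


def to_signed(x: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return x - (1 << BITS) if x >= 1 << (BITS - 1) else x


def slt_transfers_to_unsigned() -> bool:
    """Check: A s< B and A s>= 0 implies A u< B."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            if to_signed(a) < to_signed(b) and to_signed(a) >= 0:
                if not a < b:
                    return False
    return True
```

Dropping the A s>= 0 precondition breaks the implication: A = 255 (signed -1) is s< B = 0, but 255 u< 0 is false.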
chenglin.bi [Thu, 23 Jun 2022 13:47:45 +0000 (21:47 +0800)]
[InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison
if (`C1` is pow2) & (`(C2 & ~(C1-1)) + C1` is pow2):
((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/EJAl1R
((C1 << X) & C2) != 0 -> X < (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/3bVRVz
And remove dead code.
Fix: https://github.com/llvm/llvm-project/issues/56124
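The two rewrites can be checked by enumeration. The sketch below uses unbounded Python integers, so it deliberately ignores fixed-width wraparound and poison for oversized shift amounts; `Log2` is taken to be the floor of the base-2 logarithm, and the function names and search ranges are assumptions chosen for illustration.

```python
def is_pow2(n: int) -> bool:
    """True for 1, 2, 4, ... (zero is not a power of two)."""
    return n > 0 and n & (n - 1) == 0


def log2_floor(n: int) -> int:
    """Floor of log2 for a positive integer."""
    return n.bit_length() - 1


def fold_matches(c1: int, c2: int, x: int) -> bool:
    """Compare the original test with the folded comparison for one input."""
    threshold = log2_floor(c2 + c1) - log2_floor(c1)
    original_is_zero = ((c1 << x) & c2) == 0
    return original_is_zero == (x >= threshold)


def check_all(max_c1_log: int = 5, max_c2: int = 256, max_x: int = 12) -> bool:
    """Enumerate every (C1, C2, X) that satisfies the stated preconditions."""
    for k in range(max_c1_log + 1):
        c1 = 1 << k
        for c2 in range(max_c2):
            if not is_pow2((c2 & ~(c1 - 1)) + c1):
                continue
            if not all(fold_matches(c1, c2, x) for x in range(max_x)):
                return False
    return True
```

For example, with C1 = 2 and C2 = 7 the precondition holds because (7 & ~1) + 2 = 8, and the threshold is Log2(9) - Log2(2) = 3 - 1 = 2, so the test is zero exactly when X >= 2. The != 0 form is the negation, X < 2.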
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D126591
Sam McCall [Thu, 9 Jun 2022 07:06:19 +0000 (09:06 +0200)]
[pseudo] Add xfail tests for a simple-declaration/function-definition ambiguity
I expect to eliminate this ambiguity at the grammar level by use of guards,
because it interferes with brace-based error recovery.
Differential Revision: https://reviews.llvm.org/D127400
Rodrigo Dominguez [Tue, 30 Mar 2021 17:53:17 +0000 (13:53 -0400)]
[AMDGPU] GFX11: remove ShaderType from ds_ordered_count offset field
In GFX11 ShaderType is determined by the hardware and should no longer
be written into bits[3:2] of the ds_ordered_count offset field.
Differential Revision: https://reviews.llvm.org/D128196
Jay Foad [Thu, 23 Jun 2022 13:06:48 +0000 (14:06 +0100)]
[AMDGPU] Precommit test for D128196
Ruiling Song [Wed, 22 Jun 2022 02:50:46 +0000 (10:50 +0800)]
AMDGPU: Don't crash on global_ctor/dtor declaration
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D128320
Valentin Clement [Thu, 23 Jun 2022 12:57:24 +0000 (14:57 +0200)]
[flang] Add lowering TODO for separate module procedures
MODULE FUNCTION and MODULE SUBROUTINE currently cause a lowering crash:
"symbol is not mapped to any IR value" because special care is needed
to handle their interface.
Add a TODO for now.
Example of program that crashed and will hit the TODO:
```
module mod
interface
module subroutine sub
end subroutine
end interface
contains
module subroutine sub
x = 42
end subroutine
end module
```
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128412
Co-authored-by: Jean Perier <jperier@nvidia.com>
Nikita Popov [Thu, 23 Jun 2022 09:44:20 +0000 (11:44 +0200)]
[llvm-c] Add LLVMGetAggregateElement() function
This adds LLVMGetAggregateElement() as a wrapper for
Constant::getAggregateElement(), which allows fetching a
struct/array/vector element without handling different possible
underlying representations.
As the changed echo test shows, previously you for example had to
treat ConstantArray (use LLVMGetOperand) and ConstantDataArray
(use LLVMGetElementAsConstant) separately, not to mention all the
other possible representations (like PoisonValue).
I've deprecated LLVMGetElementAsConstant() in favor of the new
function, which is strictly more powerful (but I could be convinced
to drop the deprecation).
This is partly motivated by https://reviews.llvm.org/D125795,
which drops LLVMConstExtractValue() because the underlying constant
expression no longer exists. This function could previously be used
as a poor man's getAggregateElement().
Differential Revision: https://reviews.llvm.org/D128417
Nikita Popov [Wed, 22 Jun 2022 14:29:15 +0000 (16:29 +0200)]
[X86][AMX] Update tests to use opaque pointers (NFC)
There are some codegen differences here, because presence of
bitcasts affects AMX codegen in minor ways (the bitcasts are not
always in the input IR, but may be added by X86PreAMXConfig
for example).
Differential Revision: https://reviews.llvm.org/D128424
Jacques Pienaar [Thu, 23 Jun 2022 12:31:31 +0000 (05:31 -0700)]
[mlir][pdll] Add new tablegen helper NFC
A command line option injected by the tablegen rule cannot be respected by
PDLL here, so add a new helper function that is a copy of the original without
any additional flags injected. This avoids a compilation failure when
compiler warnings are disabled.
Kept it as a mechanical copy.
Fixes #55716
Nicolas Vasilache [Thu, 23 Jun 2022 09:29:43 +0000 (02:29 -0700)]
[mlir][Transform] Fix implementation of the generic apply that is based on applyToOne.
The result of applying an N-result producing transformation to M payload ops
is an M-wide result, each containing N result operations.
This requires a transposition of the results obtained by calling `applyToOne`.
This revision fixes the issue and adds more advanced tests that exercise the behavior.
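The transposition can be pictured with a small sketch. It assumes that each `applyToOne` call yields one N-element list per payload op and that the combined result is indexed by result position first; that reading of the description, and the function name, are assumptions rather than the actual MLIR implementation.

```python
def transpose_results(per_payload_results):
    """Turn M lists of N results into N lists of M results."""
    if not per_payload_results:
        return []
    n = len(per_payload_results[0])
    # Every payload op is expected to produce the same number of results.
    if any(len(results) != n for results in per_payload_results):
        raise ValueError("each payload op must produce the same number of results")
    return [list(column) for column in zip(*per_payload_results)]


# Three payload ops (M = 3), each producing two results (N = 2).
per_op = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
print(transpose_results(per_op))
```

Here the k-th output list gathers the k-th result of every payload op, so each result position ends up associated with M operations.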
Differential Revision: https://reviews.llvm.org/D128414
Jeroen Dobbelaere [Thu, 23 Jun 2022 12:18:49 +0000 (14:18 +0200)]
Revert "[tbaa] Handle base classes in struct tbaa"
This reverts commit
cdc59e2202c11a6a5dfd2ec83531523c58eaae45.
The Verifier finds a problem in a stage2 build. Reverting so Bruno can investigate.
Paulo Matos [Thu, 23 Jun 2022 12:10:52 +0000 (14:10 +0200)]
[WebAssembly] Update test to run it in opaque pointers mode
When opaque pointers were enabled, -no-opaque-pointers was added to some tests in order not to change behaviour. We now revert this and fix the test.
Reviewed By: asb, tlively
Differential Revision: https://reviews.llvm.org/D128282
Sergey Kosukhin [Wed, 22 Jun 2022 13:10:33 +0000 (16:10 +0300)]
[compiler-rt] Fix false positive detection of a target in compile-only mode
When `compiler-rt` is configured as a runtime, the configure-time target
detection for builtins is done in compile-only mode, which is basically a
test of whether the newly-built `clang` can compile a simple program with
an additional flag (`-m32` and `-m64` in my case). The problem is that on
my Debian system `clang` can compile `int foo(int x, int y) { return x + y; }`
with `-m32` but fails to include `limits.h` (or any other target-specific
header) for the `i386` target:
```
$ /path/to/build/./bin/clang --target=x86_64-unknown-linux-gnu -DVISIBILITY_HIDDEN -O3 -DNDEBUG -m32 -std=c11 -fPIC -fno-builtin -fvisibility=hidden -fomit-frame-pointer -MD -MT CMakeFiles/clang_rt.builtins-i386.dir/absvdi2.c.o -MF CMakeFiles/clang_rt.builtins-i386.dir/absvdi2.c.o.d -o CMakeFiles/clang_rt.builtins-i386.dir/absvdi2.c.o -c /path/to/src/compiler-rt/lib/builtins/absvdi2.c
In file included from /path/to/src/compiler-rt/lib/builtins/absvdi2.c:13:
In file included from /path/to/src/compiler-rt/lib/builtins/int_lib.h:93:
In file included from /path/to/build/lib/clang/15.0.0/include/limits.h:21:
In file included from /usr/include/limits.h:25:
/usr/include/features.h:364:12: fatal error: 'sys/cdefs.h' file not found
^~~~~~~~~~~~~
1 error generated.
```
This is an attempt to make the target detection more robust: extend the test
program with `#include <limits.h>`.
Differential Revision: https://reviews.llvm.org/D127975
Tobias Hieta [Thu, 23 Jun 2022 12:04:23 +0000 (14:04 +0200)]
[NFC] remove trailing whitespace
LLVM GN Syncbot [Thu, 23 Jun 2022 11:53:18 +0000 (11:53 +0000)]
[gn build] Port
2c3bbac0c715
Nikolas Klauser [Thu, 23 Jun 2022 10:23:41 +0000 (12:23 +0200)]
[libc++] Implement ranges::move{, _backward}
This patch also adds a new optimization to `std::move`. It unwraps three `reverse_iterator`s if the wrapped iterator is a `contiguous_iterator` and the iterated type is trivially_movable. This allows us to simplify `ranges::move_backward` to a forward to `std::move` without any pessimization.
Reviewed By: var-const, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D126616
Valentin Clement [Thu, 23 Jun 2022 11:43:38 +0000 (13:43 +0200)]
[flang] Lowering passing variables to OPTIONAL VALUE
The case where the dummy argument is OPTIONAL was missing in the
handling of VALUE numerical and logical dummies (passBy::BaseAddressValueAttribute).
This caused segfaults while unconditionally copying actual arguments that were legally
absent at runtime.
Take this bug as an opportunity to share the code that lowers arguments
that must be passed by BaseAddress, BaseAddressValueAttribute, BoxChar,
and CharBoxValueAttribute.
It has to deal with the exact same issues (being able to make contiguous
copies of the actual argument, potentially conditionally at runtime,
and potentially requiring a copy-back).
The VALUE case is the same as the non value case, except there is never
a copy-back and there is always a copy-in for variables. These two
differences are easily controlled by a byValue flag.
This has the benefit of implementing CHARACTER, VALUE for free, which was
previously a hard TODO.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128418
Co-authored-by: Jean Perier <jperier@nvidia.com>
Florian Hahn [Thu, 23 Jun 2022 11:44:41 +0000 (13:44 +0200)]
[VPlan] Update unit test after
569d84fe99e63.