Quentin Colombet [Mon, 3 Oct 2022 23:28:34 +0000 (23:28 +0000)]
[RISCV][ISel] Fold extensions when all the users can consume them
This patch allows the combines that fold extensions in binary operations
to have more than one use.
The approach here is pretty conservative: if all the users of an
extension can fold the extension, then the folding is done, otherwise we
don't fold.
This is the first step towards avoiding the one-use limitation.
As a result, we make a decision to fold/don't fold for a web of
instructions. An instruction is part of the web of instructions as soon
as it consumes an extension that needs to be folded for all its users.
Because of how SDISel works a web of instructions can be visited over
and over. More precisely, if the folding happens, it happens for the
whole web and that's the end of it, but if the folding fails, the whole
web may be revisited when another member of the web is visited.
To avoid a compile time explosion in pathological cases, we bail out
earlier for webs that are bigger than a given threshold (arbitrarily set
at 18 for now.) This size can be changed using
`--riscv-lower-ext-max-web-size=<maxWebSize>`.
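The all-or-nothing decision and the size cutoff described above can be sketched on a toy graph. This is only an illustration: `ToyNode`, `canFoldWeb`, and the integer node indices are hypothetical stand-ins, not the SelectionDAG types or the code in this patch; only the default threshold of 18 comes from the text.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a DAG node; indices refer to positions in the graph.
struct ToyNode {
  bool IsExtension;         // node is a sext/zext
  bool CanFoldExtOperands;  // node could consume its extended operands directly
  std::vector<int> Operands;
  std::vector<int> Users;
};

// Returns true only if every member of the web reachable from Root can fold.
// Bails out (returns false) once the web grows past MaxWebSize.
bool canFoldWeb(const std::vector<ToyNode> &Graph, int Root,
                std::size_t MaxWebSize = 18) {
  std::vector<int> Worklist{Root};
  std::set<int> Web;
  while (!Worklist.empty()) {
    int N = Worklist.back();
    Worklist.pop_back();
    if (!Web.insert(N).second)
      continue;  // already part of the web
    if (Web.size() > MaxWebSize)
      return false;  // too big: give up to bound compile time
    if (!Graph[N].CanFoldExtOperands)
      return false;  // one user cannot fold, so nobody folds
    // Every user of an extension we consume joins the web.
    for (int Op : Graph[N].Operands) {
      if (!Graph[Op].IsExtension)
        continue;
      for (int User : Graph[Op].Users)
        Worklist.push_back(User);
    }
  }
  return true;
}
```

Note that a failed check leaves nothing cached in this sketch, which mirrors why a web can be revisited from another member.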
At the current time, I don't see a better scheme for that, assuming we
want to stick with doing that in SDISel.
Differential Revision: https://reviews.llvm.org/D133739
David Green [Wed, 5 Oct 2022 20:47:36 +0000 (21:47 +0100)]
[AArch64] Add tablegen patterns for bf16 trn/zip/uzp.
This adds some missing tablegen patterns to handle trn1/trn2/zip1/zip2/uzp1/uzp2,
similar to the Arm handling in 5e1a9d319d2ee5d59a151e1d82f, but via tablegen
patterns for the AArch64 backend.
oToToT [Wed, 5 Oct 2022 20:32:04 +0000 (04:32 +0800)]
[libc++] Fix wrong implementation of CityHash
As PR56606 stated, the current implementation of CityHash in libc++
unintentionally drops some bits. Cast the 32-bit int to a 64-bit int to
avoid this.
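A minimal sketch of this class of bug, assuming a platform where `unsigned int` is 32 bits wide. The helper names `mixNarrow` and `mixWide` are hypothetical and are not the libc++ functions; they only show how shifting before widening loses the high bits.

```cpp
#include <cstdint>

// The shift happens in 32-bit arithmetic, so bits pushed past bit 31 are
// dropped before the result is widened to 64 bits.
constexpr std::uint64_t mixNarrow(std::uint32_t A, std::uint64_t Len) {
  return Len + (A << 3);
}

// Widening first keeps every bit of the shifted value.
constexpr std::uint64_t mixWide(std::uint32_t A, std::uint64_t Len) {
  return Len + (static_cast<std::uint64_t>(A) << 3);
}
```

For inputs whose top three bits are clear, both helpers agree, which is why a bug of this kind can go unnoticed for a long time.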
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D134124
David Blaikie [Wed, 5 Oct 2022 20:22:19 +0000 (20:22 +0000)]
Use inheriting ctors for OSTargetInfo
(& remove PSPTargetInfo because it's unused - it had the wrong ctor in
it anyway, so wouldn't've been able to be instantiated - must've
happened due to bitrot over the years)
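For readers unfamiliar with the feature, a small sketch of inheriting constructors in a wrapper template. `TargetInfoBase`, `ForwardingOSTarget`, and `InheritingOSTarget` are hypothetical names for illustration, not the clang classes.

```cpp
#include <string>
#include <utility>

// Hypothetical base class standing in for a target description.
struct TargetInfoBase {
  std::string Triple;
  explicit TargetInfoBase(std::string T) : Triple(std::move(T)) {}
};

// Hand-written forwarding constructor: has to be repeated in every wrapper,
// and silently goes stale if the base constructor signature changes.
template <typename Target> struct ForwardingOSTarget : Target {
  explicit ForwardingOSTarget(std::string T) : Target(std::move(T)) {}
};

// Inheriting constructors: the wrapper picks up whatever the base offers.
template <typename Target> struct InheritingOSTarget : Target {
  using Target::Target;
};
```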
Mark de Wever [Sat, 17 Sep 2022 11:34:00 +0000 (13:34 +0200)]
[libc++][format] Implements formattable concept.
This concept is introduced in P2286, but was implemented in libc++
before. This implementation was used in the library internally. This
implementation lacked the resolution of LWG3636. The original formatter
had a non-const member function that wasn't trivial to make a const
member. The recent parser improvements made this member a const member
in preparation of LWG3636.
Note LWG3636 isn't voted in. Its status is Ready. P2286's concept has
been written as-if LWG3636 is accepted and refers to that LWG issue.
Updates some tests to make format a const member function and removes a
test that's mainly a duplicate of the formattable concept test.
Implements
- LWG3636 formatter<T>::format should be const-qualified
Implements parts of
- P2286R8 Formatting Ranges
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D134110
TatWai Chong [Wed, 5 Oct 2022 19:47:25 +0000 (12:47 -0700)]
[mlir][tosa] Update TOSA resize to match specification
The stride and shift attributes are removed, and new scale and border attributes are added.
Signed-off-by: TatWai Chong <tatwai.chong@arm.com>
Change-Id: I6cdbeb3978f5ee540bc6cf59eb7c273eb0131430
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D131629
Ben Langmuir [Tue, 4 Oct 2022 22:08:08 +0000 (15:08 -0700)]
[clang] Update ModuleMap::getModuleMapFile* to use FileEntryRef
Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.
Differential Revision: https://reviews.llvm.org/D135220
Filipp Zhinkin [Wed, 5 Oct 2022 20:09:49 +0000 (23:09 +0300)]
[AArch64] Add tests for i128 comparison; NFC
Baseline tests for D135302.
River Riddle [Mon, 3 Oct 2022 19:26:12 +0000 (12:26 -0700)]
[mlir:Parser] Always splice parsed operations to the end of the parsed block
The current splicing behavior dates back to when all blocks had terminators,
so we would "helpfully" splice before the terminator. This doesn't make sense
anymore, and leads to somewhat unexpected results when parsing multiple
pieces of IR into the same block.
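The new behavior can be pictured with a plain `std::list`. This is a sketch only: `Block` and `spliceToEnd` are hypothetical and model a block as a list of operation names, not the MLIR data structures.

```cpp
#include <list>
#include <string>

// Toy model: a block is an ordered list of operation names.
using Block = std::list<std::string>;

// Move every parsed operation to the end of Dest, preserving their order.
// No special case for a trailing terminator-like operation.
void spliceToEnd(Block &Dest, Block &Parsed) {
  Dest.splice(Dest.end(), Parsed);
}
```

Parsing several pieces of IR into the same block then yields them in the order they were parsed.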
Differential Revision: https://reviews.llvm.org/D135096
Zixu Wang [Wed, 5 Oct 2022 17:59:58 +0000 (10:59 -0700)]
[clang][ExtractAPI] Don't print locations for anonymous tags
ExtractAPI doesn't care about locations of anonymous TagDecls. Set the
printing policy to exclude that from anonymous decl names.
Differential Revision: https://reviews.llvm.org/D135295
Ivan Butygin [Thu, 8 Sep 2022 22:04:01 +0000 (00:04 +0200)]
[mlir][gpu] Introduce `host_shared` flag to `gpu.alloc`
Motivation: we have a lowering pipeline based on upstream gpu and spirv dialects, and we are using host shared gpu memory to transfer data between host and device.
Add `host_shared` flag to `gpu.alloc` to distinguish between shared and device-only gpu memory allocations.
Differential Revision: https://reviews.llvm.org/D133533
Slava Zakharin [Wed, 5 Oct 2022 19:48:03 +0000 (12:48 -0700)]
[flang] Fixed build issue after 88f07a736bbc3f0062d7d8f4032f0b54aff5c018
nullptr matches both against ::mlir::UnitAttr and ::mlir::TypeRange,
so the following two candidates fit:
    static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
                                                  ::mlir::OperationState &odsState,
                                                  /*optional*/::mlir::UnitAttr simd)
    static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
                                                  ::mlir::OperationState &odsState,
                                                  ::mlir::TypeRange resultTypes,
                                                  /*optional*/bool simd = false)
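The ambiguity can be reproduced in isolation. In this sketch `UnitAttrLike`, `TypeRangeLike`, and `buildToy` are hypothetical stand-ins that are both constructible from `nullptr`; spelling out the argument type is one way to pick an overload, not necessarily what this fix does.

```cpp
#include <cstddef>

// Hypothetical stand-ins: both accept nullptr through a converting constructor.
struct UnitAttrLike {
  UnitAttrLike(std::nullptr_t = nullptr) {}
};
struct TypeRangeLike {
  TypeRangeLike(std::nullptr_t) {}
};

int buildToy(UnitAttrLike /*simd*/) { return 1; }
int buildToy(TypeRangeLike /*resultTypes*/, bool /*simd*/ = false) { return 2; }

// buildToy(nullptr);  // does not compile: both overloads need one
//                     // user-defined conversion, so the call is ambiguous

// Naming the argument type removes the ambiguity.
int callAttrOverload() { return buildToy(UnitAttrLike{}); }
int callRangeOverload() { return buildToy(TypeRangeLike{nullptr}, false); }
```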
Argyrios Kyrtzidis [Mon, 3 Oct 2022 23:13:26 +0000 (16:13 -0700)]
[clang/Sema] Fix non-deterministic order for certain kind of diagnostics
In the context of caching clang invocations it is important to emit diagnostics in deterministic order;
the same clang invocation should result in the same diagnostic output.
rdar://100336989
Differential Revision: https://reviews.llvm.org/D135118
Joseph Huber [Wed, 5 Oct 2022 16:03:24 +0000 (11:03 -0500)]
[DeviceRTL] Allow IsSPMDMode to be optimized out in LTO mode
A previous patch merged the static and bitcode versions of the
deviceRTL. We previously used the static library's separate compilation
to set a special flag that kept `IsSPMDMode` out of the used list, so
that it could still be optimized out. When they were merged, we could no
longer do this separate compilation, which had allowed users of LTO to
get more optimal code.
This patch rearranges the code. The `IsSPMDMode` global is now
transitively used by its inclusion in the changed `__keep_alive`
function. This allows us to then manually delete the `__keep_alive`
function from the module when building the static library via
`llvm-extract`. The result is that the bitcode library correctly will
maintain the needed shared state, while the static library will be able
to internalize it and optimize it out.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135280
Joseph Huber [Wed, 5 Oct 2022 17:08:22 +0000 (12:08 -0500)]
[OpenMP] Make the exec_mode global have protected visibility
We use protected visibility for almost everything with offloading. This
is because it provides us with the ability to read things from the host
without the expectation that it will be preempted by a shared library
load; bugs related to this have happened when offloading to the host.
This patch just makes the `exec_mode` global generated for each plugin
have protected visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135285
Sam McCall [Wed, 7 Sep 2022 11:30:08 +0000 (13:30 +0200)]
[clangd] Fix non-idempotent cases of canonicalRenameDecl()
The simplest way to ensure full canonicalization is to canonicalize
recursively in most cases.
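Why recursion gives idempotence can be seen on a toy mapping. `StepMap` and `canonicalize` are hypothetical and operate on strings rather than AST declarations; the sketch assumes the one-step mapping has no cycles.

```cpp
#include <map>
#include <string>

// Hypothetical one-step mapping, e.g. "instantiation -> pattern it came from".
using StepMap = std::map<std::string, std::string>;

// Follow the mapping until nothing changes. Because the result has no further
// mapping, canonicalizing it again returns the same name (idempotence).
std::string canonicalize(const StepMap &Step, const std::string &Name) {
  auto It = Step.find(Name);
  if (It == Step.end() || It->second == Name)
    return Name;
  return canonicalize(Step, It->second);
}
```

A single non-recursive lookup would stop after the first hop, so applying it twice could give a different answer than applying it once.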
This fixes an assertion failure and presumably correctness bugs.
It does reveal that D132797's index-based virtual method renames don't handle
templates well (the AST behavior is different and IMO better).
We could choose to disable in this case or change the index behavior,
but this patch doesn't do either.
Differential Revision: https://reviews.llvm.org/D133415
Jakub Kuderski [Wed, 5 Oct 2022 19:09:45 +0000 (15:09 -0400)]
[mlir][arith] Add shli support to WIE
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135234
Diego Caballero [Wed, 5 Oct 2022 18:31:22 +0000 (18:31 +0000)]
[mlir][NFC] Make 'printOp' public in AsmPrinter
This patch moves the 'printOp' functionality to the public API of
AsmPrinter and renames it to 'printCustomOrGenericOp'. No 'parseOp'
is needed at this time as existing APIs are able to parse operations
producing results where results are omitted in the textual form
(the LHS of an operation is redundant when it comes to building the
operation itself as it only contains the result names).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D135006
Sam McCall [Wed, 5 Oct 2022 00:22:38 +0000 (02:22 +0200)]
[clangd] Don't clone SymbolSlab::Builder arenas when finalizing.
SymbolSlab::Builder has an arena to store strings of owned symbols, and
deduplicates them. build() copies all the strings and deduplicates them again!
This is potentially useful: we may have overwritten a symbol and
rendered some strings unreachable.
However in practice this is not the case. When testing on a variety of
files in LLVM (e.g. SemaExpr.cpp), the strings for the full preamble
index are 3MB and shrink by 0.4% (12KB). For comparison the serialized
preamble is >50MB.
There are also hundreds of smaller slabs (file sharding) that do not shrink at
all.
CPU time spent on this is significant (something like 3-5% of buildPreamble).
We're better off not bothering.
Differential Revision: https://reviews.llvm.org/D135231
Leonard Chan [Wed, 5 Oct 2022 18:53:54 +0000 (18:53 +0000)]
Murali Vijayaraghavan [Wed, 5 Oct 2022 18:41:46 +0000 (18:41 +0000)]
Added canonicalization for vector.multi_reduction
If there are reductions only along unit dimensions, then they are folded
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D134996
Mats Petersson [Wed, 5 Oct 2022 18:16:18 +0000 (19:16 +0100)]
Revert "[flang] Add -fpass-plugin option to Flang frontend"
This reverts commit 43fe6f7cc35ded691bbc2fa844086d321e705d46.
Reverting this as CI breaks.
To reproduce, run check-flang, and it will fail with an error saying
.../lib/Bye.so not found in pass-plugin.f90
Leonard Chan [Wed, 5 Oct 2022 18:40:50 +0000 (18:40 +0000)]
[llvm][NFC] Consolidate equivalent function type parsing code into single function
Differential Revision: https://reviews.llvm.org/D135296
Jakub Kuderski [Wed, 5 Oct 2022 18:34:26 +0000 (14:34 -0400)]
[mlir][arith] Add andi, ori, and xori support to WIE
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135204
Ellis Hoag [Wed, 5 Oct 2022 18:06:12 +0000 (11:06 -0700)]
[Dwarf] Remove unnecessary module flags from test
These extra module flags are not needed for this test, so remove them. In fact, leaving them in produces the following error message:
> invalid behavior operand in module flag (unexpected constant)
Differential Revision: https://reviews.llvm.org/D135294
Siva Chandra Reddy [Wed, 5 Oct 2022 17:12:06 +0000 (17:12 +0000)]
[libc][Obvious] Add "__" prefix to sched_getcpucount in the spec and elsewhere.
Without this fix, the declaration in sched.h will not have the "__" prefix and
will cause a compile failure.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D135286
Rob Suderman [Wed, 5 Oct 2022 17:55:25 +0000 (10:55 -0700)]
[mlir][mlprogram] Add CAPI project for MLProgram
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D135291
Quentin Colombet [Thu, 22 Sep 2022 04:16:08 +0000 (04:16 +0000)]
[RISCV][ISel] Refactor the formation of VW operations
This patch centralizes all the combines of add|sub|mul with extended
operands in one "framework".
The rationale for this change is to offer a one-stop-shop for all these
transformations so that, in the future, it is easier to make combine
decisions for a web of instructions (i.e., instructions connected
through s|zext operands).
Technically this patch is not NFC because the new version is more
powerful than the previous version.
In particular, it diverges in two cases:
- VWMULSU can now also be produced from `mul(splat, zext)`, whereas
previously only `mul(sext, splat)` was supported when `splat`s were
involved. (As demonstrated in rvv/fixed-vectors-vwmulsu.ll)
- VWSUB(U) can now also be produced from `sub(splat, ext)`, whereas
previously only `sub(ext, splat)` was supported when `splat`s were
involved. (As demonstrated in rvv/fixed-vectors-vwsub.ll)
If we wanted, we could block these transformations to make this
patch really NFC. For instance, we could do something similar to
`AllowSplatInVW_W`, which prevents the combines from forming vw(add|sub)(u)_w
when the RHS is a splat.
Regarding the "framework" itself, the bulk of the patch is some
boilerplate code that abstracts away the actual extensions that are
present in the DAG. This allows us to handle `vwadd_w(ext a, b)` as if
it were a regular `add(ext a, ext b)`. Since the node `ext b` doesn't
actually exist in the DAG, we have a bunch of methods (all in the
NodeExtensionHelper class) that fake all that for us.
The other half of the change is around `CombineToTry` and
`CombineResult`. These helper structures respectively:
- Represent the kind of combines that can be applied to a node, and
- Store what needs to happen to do that combine.
This can be viewed as a two-step approach:
- First, check if a pattern applies, and
- Second, apply it.
The checks and the materialization of the combines are decoupled so that
in the future we can perform several checks and do all the related
applies in one go.
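The decoupling of checks from applies can be sketched as follows. `ToyOp`, `ToyCombineResult`, `checkWidening`, and `combineAllOrNothing` are hypothetical and much simpler than `CombineToTry` and `CombineResult`; the sketch only shows the shape of "check everything first, then apply in one go".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy operation: a binary op whose operands may be extensions.
struct ToyOp {
  char Opcode;  // '+', '-', or '*'
  bool LHSIsExt;
  bool RHSIsExt;
  bool Widened;  // set once the combine has been applied
};

// Records what has to happen to perform one combine.
struct ToyCombineResult {
  std::size_t OpIndex;
};

// Step 1: check only. Never mutates the ops.
bool checkWidening(const std::vector<ToyOp> &Ops, std::size_t I,
                   ToyCombineResult &Out) {
  if (!Ops[I].LHSIsExt || !Ops[I].RHSIsExt)
    return false;
  Out.OpIndex = I;
  return true;
}

// Step 2: apply every recorded combine.
void applyAll(std::vector<ToyOp> &Ops,
              const std::vector<ToyCombineResult> &Results) {
  for (const ToyCombineResult &R : Results)
    Ops[R.OpIndex].Widened = true;
}

// Run all checks first; mutate only if every check succeeded.
bool combineAllOrNothing(std::vector<ToyOp> &Ops) {
  std::vector<ToyCombineResult> Pending;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    ToyCombineResult R{};
    if (!checkWidening(Ops, I, R))
      return false;
    Pending.push_back(R);
  }
  applyAll(Ops, Pending);
  return true;
}
```

Because nothing is mutated until all checks pass, a failed check leaves the ops untouched.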
Differential Revision: https://reviews.llvm.org/D134703
Jacques Pienaar [Wed, 5 Oct 2022 17:40:58 +0000 (10:40 -0700)]
[mlir] Make UnitAttr's default val in unwrapped builder
UnitAttr is optional, but unwrapped builders require it. Change to constructing
from bool as required for when it is not set at the moment (for UnitAttr nothing
needs to be constructed; this is true for others here too and can be addressed
together).
Differential Revision: https://reviews.llvm.org/D135058
Akira Hatanaka [Wed, 5 Oct 2022 01:35:22 +0000 (18:35 -0700)]
[Sema][ObjC] Fix assertion failure in getCommonNonSugarTypeNode
Instead of checking that the protocols of both types are all equal,
check that the canonical decls are equal.
Nico Weber [Tue, 13 Sep 2022 12:53:05 +0000 (08:53 -0400)]
[aarch64] add missing run line to a test
The CHECK-IOS lines were added in 1c353419ab51f, but without a
matching FileCheck invocation. Add it.
The dead CHECK-IOS lines were found by Daniel Bertalan.
Differential Revision: https://reviews.llvm.org/D133772
Mahesh Ravishankar [Wed, 5 Oct 2022 17:26:34 +0000 (17:26 +0000)]
Add `const` to `dump` method of `OpFoldResult`.
While most `dump` methods are marked `const`, some aren't marked as
`const`. Adding `const` to `OpFoldResult` here since this was
encountered as an issue while debugging (doing `dump` within a debug
console threw an error indicating the method should be marked
`const`).
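A small sketch of why the qualifier matters. `FoldResultLike` and `describe` are hypothetical; the point is only that a member function must be `const` to be callable through a const reference (or on a const object in a debugger expression).

```cpp
#include <string>

// Hypothetical stand-in for a small value type with a debug dump.
struct FoldResultLike {
  int Value;
  // const-qualified, so it can be called on const objects and references.
  std::string dump() const { return "value: " + std::to_string(Value); }
};

std::string describe(const FoldResultLike &R) {
  return R.dump();  // would not compile if dump() were non-const
}
```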
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D135241
Mahesh Ravishankar [Tue, 4 Oct 2022 22:04:43 +0000 (22:04 +0000)]
[mlir][Linalg] Expose vectorization precondition check as a utility function.
This patch exposes the method to check if an op can be vectorized or
not for downstream uses. Also adds a check to mark elementwise operations
that have non-vectorizable ops (like `tensor.extract`) as non vectorizable.
Reviewed By: nicolasvasilache, dcaballe, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135201
Aiden Grossman [Sat, 1 Oct 2022 06:24:32 +0000 (06:24 +0000)]
[lld][ELF] Fix lazy ThinLTO index writing in thin archives
Currently, when --thinlto-emit-index-files is used with LLD and a
thin archive is passed containing references to object files to link
against, where the object files are in a different folder than the thin
archive and some of the archives aren't linked against (i.e. stay lazy),
the empty index file writer ends up trying to write to a path that
doesn't exist. This patch changes the behavior of that function to use
the path of the obj member of the BitcodeFile object rather than just
the path of the BitcodeFile object itself, which matches the behavior of
the default (non-lazy) case.
Fixes #57963
Regression test added.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D135014
Vitaly Buka [Wed, 5 Oct 2022 16:59:25 +0000 (09:59 -0700)]
Revert "[compiler-rt][test] Heed COMPILER_RT_DEBUG when compiling unittests"
Breaks some bots, details in https://reviews.llvm.org/D91620
This reverts commit 93b1256e38f63a81561288b9a90c5d52af63cb6e.
Vitaly Buka [Wed, 5 Oct 2022 16:47:51 +0000 (09:47 -0700)]
Revert "[mlir][sparse] Restore case coverage warning fix"
Breaks https://lab.llvm.org/buildbot/#/builders/168/builds/9288
This reverts commit 83839700c32996c58ddebc0c74e3dc4970e005bc.
Aart Bik [Tue, 4 Oct 2022 21:34:37 +0000 (14:34 -0700)]
[mlir][sparse] introduce a higher-order tensor mapping
This extension to the sparse tensor type system in MLIR
opens up a whole new set of sparse storage schemes, such as
block sparse storage (e.g. BCSR) and ELL (aka jagged diagonals).
This revision merely introduces the type extension and
initial documentation. The actual interpretation of the type
(reading in tensors, lowering to code, etc.) will follow.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D135206
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter month.
Partially implements:
- P1361 Integration of chrono with text formatting
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D134138
Sam McCall [Mon, 26 Sep 2022 23:15:06 +0000 (01:15 +0200)]
Fix SourceManager::isBeforeInTranslationUnit bug with token-pasting
isBeforeInTranslationUnit compares SourceLocations across FileIDs by
mapping them onto a common ancestor file, following include/expansion edges.
It is possible to get a tie in the common ancestor, because multiple
"chunks" of a macro arg will expand to the same macro param token in the body:
    #define ID(X) X
    #define TWO 2
    ID(1 TWO)
Here two FileIDs both expand into `X` in ID's expansion:
- one containing `1` and spelled on line 3
- one containing `2` and spelled by the macro expansion of TWO
isBeforeInTranslationUnit breaks this tie by comparing the two FileIDs:
the one "on the left" is always created first and is numerically smaller.
This seems correct so far.
Prior to this patch it also takes a shortcut (unclear if intentionally).
Instead of comparing the two FileIDs that directly expand to the same location,
it compares the original FileIDs being compared. These may not be the
same if there are multiple macro expansions in between.
This *almost* always yields the right answer, because macro expansion
yields "trees" of FileIDs allocated in a contiguous range: when comparing tree A
to tree B, it doesn't matter what representative you pick.
However, the splitting of >> tokens is modeled as macro expansion (as if
the first '>' was a macro that expands to a '>' spelled in a scratch buffer).
This splitting occurs retroactively when parsing, so the FileID allocated is
larger than expected if it were a real macro expansion performed during lexing.
As a result, macro tree A can be on the left of tree B, and yet contain
a token-split FileID whose numeric value is *greater* than those in B.
In this case the tiebreak gives the wrong answer.
Concretely:
    #define ID(X) X
    template <typename> class S{};
    ID(
      ID(S<S<int>> x);
      int y;
    )
Given Greater = (typeloc of S<int>).getEndLoc();
Y = (decl of y).getLocation();
isBeforeInTranslationUnit(Greater, Y) should return true, but returns false.
Here the common FileID of (Greater, Y) is the body of the outer ID
expansion, and they both expand to X within it.
With the current tiebreak rules, we compare the FileID of Greater (a split)
to the FileID of Y (a macro arg expansion into X of the outer ID).
The former is larger because the token split occurred relatively late.
This patch fixes the issue by removing the shortcut. It tracks the immediate
FileIDs used to reach the common file, and uses these IDs to break ties.
In the example, we now compare the macro arg expansion of the inner ID()
to the macro arg expansion of Y, and find that it is smaller.
This requires some changes to the InBeforeInTUCacheEntry (sic).
We store a little more data so it's probably slightly slower.
It was difficult to resist more invasive changes:
- performance: the sizing is very suspicious, and once the cache "fills up"
we're thrashing a single entry
- API: the class seems to be needlessly complicated
However I tried to avoid mixing these with subtle behavior changes, and
will send a followup instead.
Differential Revision: https://reviews.llvm.org/D134685
Xiang Li [Fri, 30 Sep 2022 17:54:05 +0000 (10:54 -0700)]
[HLSL] Support register binding attribute on global variable
Allow the register binding attribute on global variables.
Report a warning when the register binding attribute is applied to a local variable or static variable.
It will be ignored in this case.
Type check for register binding is tracked with https://github.com/llvm/llvm-project/issues/57886.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134617
Ellis Hoag [Tue, 4 Oct 2022 19:34:31 +0000 (12:34 -0700)]
[Dwarf] Reference the correct CU when inlining
Sometimes when a function is inlined into a different CU, `llvm-dwarfdump --verify` would find an inlined subroutine with an invalid abstract origin. This is because `DwarfUnit::addDIEEntry()` will incorrectly assume the inlined subroutine and the abstract origin are from the same CU if it can't find the CU for the inlined subroutine.
In the added test, the inlined subroutine for `bar()` is created before the CU for `B.swift` is created, so it tries to point to `goo()` in the wrong CU. Interestingly, if we swap the order of the two functions then we don't see a crash since the module for `goo()` is created first.
The fix is to give a parent DIE to `ScopeDIE` before calling `addDIEEntry()` so that its CU can be found. Luckily, `constructInlinedScopeDIE()` is only called once so we can pass it the DIE of the scope's parent and give it a child just after it's created.
`constructInlinedScopeDIE()` should always return a DIE, so assert that it is not null.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D135114
Alexandre Ganea [Fri, 30 Sep 2022 12:27:04 +0000 (08:27 -0400)]
[mlir][unittest] Fix crash when building with MSVC 2022
The test Dialect/Affine/ops.mlir was failing when building with
Visual Studio 2022 version 17.3.5. This was caused by a bad MSVC codegen, when
capturing a `constexpr` in a lambda. The bug was reported to Microsoft, see
differential for more information.
Differential revision: https://reviews.llvm.org/D134227
Alexandre Ganea [Fri, 30 Sep 2022 12:22:54 +0000 (08:22 -0400)]
[mlir] Fix ambiguity when building with Clang 14.0.6
Differential revision: https://reviews.llvm.org/D134219
Alexandre Ganea [Fri, 30 Sep 2022 12:17:14 +0000 (08:17 -0400)]
[Orc] Fix the SharedMemoryMapper dtor
As briefly discussed on https://reviews.llvm.org/rG1134d3a03facccd75efc5385ba46918bef94fcb6, fix the unintended copy while iterating on Reservations and add a mutex guard, to be symmetric with other usages of Reservations.
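The two issues mentioned above (a per-element copy in a range-for loop, and iterating without the lock) can be shown on a toy container. `ToyReservation`, `sumByValue`, and `sumByReference` are hypothetical and unrelated to the real `SharedMemoryMapper` members; the copy counter exists only to make the copy observable.

```cpp
#include <map>
#include <mutex>

// Hypothetical element type that counts how often it is copied.
struct ToyReservation {
  static int Copies;
  int Size = 0;
  ToyReservation() = default;
  ToyReservation(const ToyReservation &O) : Size(O.Size) { ++Copies; }
};
int ToyReservation::Copies = 0;

// Unintended copy: `auto` by value copies each map entry, and no lock is held.
int sumByValue(const std::map<int, ToyReservation> &M) {
  int Total = 0;
  for (auto KV : M)
    Total += KV.second.Size;
  return Total;
}

// Iterate by reference under the mutex, like other users of the container.
int sumByReference(const std::map<int, ToyReservation> &M, std::mutex &Mu) {
  std::lock_guard<std::mutex> Lock(Mu);
  int Total = 0;
  for (const auto &KV : M)
    Total += KV.second.Size;
  return Total;
}
```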
Differential revision: https://reviews.llvm.org/D134212
Sam McCall [Mon, 26 Sep 2022 01:22:09 +0000 (03:22 +0200)]
[Syntax] Fix macro-arg handling in TokenBuffer::spelledForExpanded
A few cases were not handled correctly. Notably:
    #define ID(X) X
    #define HIDE a ID(b)
    HIDE
spelledForExpanded() would claim HIDE is an equivalent range of the 'b' it
contains, despite the fact that HIDE also covers 'a'.
While trying to fix this bug, I found findCommonRangeForMacroArgs hard
to understand (both the implementation and how it's used in spelledForExpanded).
It relies on details of the SourceLocation graph that are IMO fairly obscure.
So I've added/revised quite a lot of comments and made some naming tweaks.
Fixes https://github.com/clangd/clangd/issues/1289
Differential Revision: https://reviews.llvm.org/D134618
Louis Dionne [Tue, 4 Oct 2022 15:28:45 +0000 (11:28 -0400)]
[libc++] Get rid of _LIBCPP_HAS_OPEN_WITH_WCHAR in the test suite
Differential Revision: https://reviews.llvm.org/D135163
Florian Hahn [Wed, 5 Oct 2022 15:25:00 +0000 (16:25 +0100)]
[ConstraintElimination] Convert NewIndices to vector and rename (NFCI).
The callers of getConstraint only require a list of new variables.
Update the naming and types to make this clearer.
Joe Nash [Thu, 29 Sep 2022 17:30:25 +0000 (13:30 -0400)]
[AMDGPU] Fix V_CMP_CLASS_F16_t16_e64 src1 type.
For V_CMP_CLASS_F16_t16_e64 and V_CMPX_CLASS_F16_t16_e64,
https://reviews.llvm.org/D133723 changed the value type of src1 from i32 to i16.
These src1 operands are 16 bits, therefore need to be encoded as true16
operands. So the _e32 type was correctly set to VGPR_32_Lo128.
In _e64 form the operand class went from
VSrc_b32 to VSrc_b16. For some reason, we cannot encode inline literals for
VSrc_b16; see 5f5f566b265db00f577ead268400d99f34ba9cdd. In this phase of
the true16 implementation, VSrc_b16 and VSrc_b32 are still similar,
except for that quirk of inline literals. So set the operand class to regain
that function.
Reviewed By: dp, arsenm
Differential Revision: https://reviews.llvm.org/D134897
Johannes Doerfert [Wed, 5 Oct 2022 15:06:35 +0000 (08:06 -0700)]
[OpenMP][FIX] Update device API to match recent changes
Nikita Popov [Wed, 5 Oct 2022 15:03:48 +0000 (17:03 +0200)]
[PhaseOrdering] Name instructions in test (NFC)
Run through opt -instnamer.
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter year.
Partially implements:
- P1361 Integration of chrono with text formatting
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D133663
Alexey Bataev [Wed, 5 Oct 2022 14:52:39 +0000 (07:52 -0700)]
[SLP][NFC]Add a test for CSE for extractelements.
Michael Maitland [Tue, 4 Oct 2022 19:18:39 +0000 (12:18 -0700)]
[RISCV][CodeGen] Add Scheduling for vset{i}vl{i} instruction
Differential Revision: https://reviews.llvm.org/D135188
Nikita Popov [Wed, 5 Oct 2022 14:40:29 +0000 (16:40 +0200)]
[LICM] Convert tests to opaque pointers (NFC)
Using https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.
The opaque pointer migration resolves the TODO on test_fence3: The
transform now works as expected by dint of the bitcast no longer
existing.
Nikita Popov [Wed, 5 Oct 2022 14:40:29 +0000 (16:40 +0200)]
[LICM] Adjust speculation test to avoid no-op instruction (NFC)
Such GEPs don't exist with opaque pointers, give it an actual
offset.
Juan Manuel MARTINEZ CAAMAÑO [Wed, 5 Oct 2022 13:38:35 +0000 (08:38 -0500)]
[NFC][AMDGPULowerKernelAttributes] Factorize repeated code into function
Differential Revision: https://reviews.llvm.org/D135266
Fraser Cormack [Wed, 5 Oct 2022 07:31:00 +0000 (08:31 +0100)]
[LangRef][VP] Change masked-off lanes from undef to poison
These were all changed in 32b1b06b7081bd722750c6f3d528336f3f7ed34b (as
discussed in D133967) but some intrinsics introduced since have
re-introduced `undef` as the masked-off value.
Reviewed By: reames, eopXD
Differential Revision: https://reviews.llvm.org/D135244
Anton Sidorenko [Wed, 5 Oct 2022 13:58:07 +0000 (14:58 +0100)]
[NFC][RISCV] Move getSEWLMULRatio function to header
More uses of getSEWLMULRatio will be added in D130895.
Reviewed By: craig.topper, frasercrmck
Differential Revision: https://reviews.llvm.org/D135086
Valentin Clement [Wed, 5 Oct 2022 14:05:11 +0000 (16:05 +0200)]
[flang] Deallocate polymorphic and unlimited polymorphic intent(out) allocatable with runtime
Polymorphic and unlimited polymorphic entities should be handled by the runtime. This patch
updates the condition in `genDeallocate` to force polymorphic and unlimited polymorphic entities
to be deallocated through a runtime call and not inlined.
Depends on D135143
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D135144
Nikita Popov [Wed, 5 Oct 2022 13:28:15 +0000 (15:28 +0200)]
[DSE] Convert tests to opaque pointers (NFC)
Using https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.
Sam McCall [Tue, 4 Oct 2022 20:05:16 +0000 (22:05 +0200)]
[Index] USRGeneration doesn't depend on unnamed.printName() => ''. NFC
This prepares for printName() to print `(anonymous struct)` etc in D134813.
Differential Revision: https://reviews.llvm.org/D135191
Sam McCall [Wed, 5 Oct 2022 13:48:28 +0000 (15:48 +0200)]
[clangd] Stop isSpelledInSource from printing source locations.
It shows up on profiles, albeit only at 0.1% or so.
Johannes Doerfert [Mon, 12 Sep 2022 20:22:05 +0000 (13:22 -0700)]
[Attributor] Teach AAPointerInfo about atomic cmpxchg and rmw
The atomic operations behave similarly to a store, except that we don't
know the new value and we read the result first.
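The "read first, then maybe write" shape of these operations is the same one exposed by `std::atomic`. The sketch below uses the C++ library as an analogy, not the LLVM IR instructions themselves; `RmwTrace`, `rmwAdd`, and `cmpXchg` are hypothetical helper names.

```cpp
#include <atomic>

struct RmwTrace {
  int ReadValue;    // value observed by the operation
  int StoredValue;  // value in the cell afterwards
};

// Read-modify-write: the stored value depends on the value read, so it is
// not known from the operands alone.
RmwTrace rmwAdd(std::atomic<int> &Cell, int Delta) {
  int Old = Cell.fetch_add(Delta);
  return {Old, Cell.load()};
}

// Compare-exchange: always reads; stores Desired only if the read value
// equals Expected. Observed receives the value that was read either way.
bool cmpXchg(std::atomic<int> &Cell, int Expected, int Desired, int &Observed) {
  Observed = Expected;
  return Cell.compare_exchange_strong(Observed, Desired);
}
```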
Dmitry Preobrazhensky [Wed, 5 Oct 2022 13:44:00 +0000 (16:44 +0300)]
[AMDGPU][MC][GFX11] Correct e64_dpp variants of v_movreld and v_movrelsd
Differential Revision: https://reviews.llvm.org/D135079
Kerry McLaughlin [Wed, 5 Oct 2022 13:01:24 +0000 (14:01 +0100)]
[AArch64][SME] Set up a lazy-save/restore around calls.
Setting up a lazy-save mechanism around calls is done during SelectionDAG
because calls to intrinsics may be expanded into an actual function call
(e.g. calls to @llvm.cos()), and maintaining an allowed-list in the SMEABI
pass is not feasible.
The approach for conditionally restoring the lazy-save based on the runtime
value of TPIDR2_EL0 is similar to how we handle conditional smstart/smstop.
We create a pseudo-node which gets expanded into a conditional branch and
expands to a call to __arm_tpidr2_restore(%tpidr2_object_ptr).
The lazy-save buffer and TPIDR2 block are only allocated once at the start
of the function. For each call, the TPIDR2 block is initialised, and at
the end of the call, a pseudo node (RestoreZA) is planted.
Patch by Sander de Smalen.
Differential Revision: https://reviews.llvm.org/D133900
Johannes Doerfert [Mon, 12 Sep 2022 01:43:20 +0000 (18:43 -0700)]
[Attributor] AAPointerInfo can model non-escaping call uses
If a call base use will not capture a pointer we can approximate the
effects. This is important especially for readnone/only uses. Even
may-write uses are not too bad with reachability in place. Capturing
is the problem as we lose track of update sides.
Nikita Popov [Wed, 5 Oct 2022 13:26:41 +0000 (15:26 +0200)]
[DSE] Regenerate test checks (NFC)
Nikita Popov [Wed, 5 Oct 2022 13:25:40 +0000 (15:25 +0200)]
[DSE] Fix variable name clash in test (NFC)
update_test_checks.py generates the same identifier for lowercase
and uppercase variable names. Make sure they have a distinct name.
Kadir Cetinkaya [Wed, 5 Oct 2022 10:09:56 +0000 (12:09 +0200)]
[clang][Sema] Fix crash on invalid base destructor
LookupSpecialMember might fail, so change the cast to cast_or_null.
Inside Sema, skip a particular base, similar to other cases, rather than
asserting on dtor showing up.
The other option would be to mark classes with invalid destructors as invalid, but
that seems a lot more invasive, and we would lose lots of diagnostics that
currently work on classes with broken members.
Differential Revision: https://reviews.llvm.org/D135254
Johannes Doerfert [Wed, 31 Aug 2022 19:13:42 +0000 (12:13 -0700)]
[Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
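A minimal sketch of the idea, assuming the constant initializer is modelled
as a flat byte buffer. `extractField` is a hypothetical helper and does not
reflect the Attributor's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model: read `size` bytes at `offset` from the bytes of a
// constant aggregate. Returns nothing if the range is out of bounds.
std::optional<std::vector<std::uint8_t>>
extractField(const std::vector<std::uint8_t> &init, std::size_t offset,
             std::size_t size) {
  if (offset > init.size() || size > init.size() - offset)
    return std::nullopt;
  return std::vector<std::uint8_t>(init.begin() + offset,
                                   init.begin() + offset + size);
}
```

Without both the offset and the size, a reader of the aggregate cannot tell
which member (or which part of a member) an access refers to.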
Johannes Doerfert [Wed, 5 Oct 2022 12:39:02 +0000 (05:39 -0700)]
[Attributor][NFC] Re-run update_test_checks on all Attributor tests
Nikita Popov [Wed, 5 Oct 2022 12:26:57 +0000 (14:26 +0200)]
[MemCpyOpt] Convert tests to opaque pointers (NFC)
Converted using the script at
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.
Nikita Popov [Wed, 5 Oct 2022 12:51:07 +0000 (14:51 +0200)]
[MemCpyOpt] Don't hoist above producer of pointer operand
This was already handled correctly below, but not checked for the
original store pointer operand. Encountered when converting tests
to opaque pointers, where the intermediate bitcast goes away.
Shilei Tian [Wed, 5 Oct 2022 12:43:53 +0000 (08:43 -0400)]
[Clang][OpenMP] Only check value if the expression is not instantiation dependent
Currently the following case fails:
```
template<typename Ty>
Ty foo(Ty *addr, Ty val) {
  Ty v;
#pragma omp atomic compare capture
  {
    v = *addr;
    if (*addr > val)
      *addr = val;
  }
  return v;
}
```
The compiler complains that `addr` is not an lvalue. That's because when an expression
is instantiation dependent, we cannot tell whether it is an lvalue or not.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135224
David Stuttard [Mon, 3 Oct 2022 18:21:18 +0000 (19:21 +0100)]
[AggressiveInstCombine] Fix cases where non-opaque pointers are used
In the case of non-opaque pointers, when combining consecutive loads, we
need to bitcast the pointer source to the combined type size, otherwise
asserts are triggered.
Differential Revision: https://reviews.llvm.org/D135249
Oleksandr "Alex" Zinenko [Wed, 5 Oct 2022 12:33:47 +0000 (14:33 +0200)]
[mlir] tweak declarative assembly doc
Change the formal argument of the `functional-type` directive from "results" to "outputs" to avoid confusion with the `results` directive.
Peixin Qiao [Wed, 5 Oct 2022 12:22:33 +0000 (20:22 +0800)]
[flang][OpenMP] Support privatization for single construct
This supports the lowering of private and firstprivate clauses in single
construct. The alloca ops are emitted in the entry block according to
https://llvm.org/docs/Frontend/PerformanceTips.html#use-of-allocas, and
the load/store ops are emitted in the single region. The data race
problem is handled in OMPIRBuilder. That is, the barrier is emitted in
OMPIRBuilder.
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128596
Nikita Popov [Fri, 30 Sep 2022 10:13:40 +0000 (12:13 +0200)]
Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
The infinite loop seen on buildbots should be fixed by
11897708c0229c92802e747564e7c34b722f045f (assuming there are not
multiple infinite combine loops...)
-----
foldOpIntoPhi() currently only folds operations into the phi if all
but one of the operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
Differential Revision: https://reviews.llvm.org/D134954
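The simplification-based criterion of the foldOpIntoPhi() change above can be
sketched with a toy model. This is not InstCombine code: `Incoming`,
`Simplifier`, `canFoldIntoPhi` and the two example simplifiers are
hypothetical, and incoming phi values are reduced to "known constant" or
"unknown".

```cpp
#include <functional>
#include <optional>
#include <vector>

// Toy model: an incoming phi value is either a known constant or unknown.
using Incoming = std::optional<int>;
// A simplifier returns the simplified result, or nothing if it cannot
// simplify the operation for that incoming value.
using Simplifier = std::function<std::optional<int>(const Incoming &)>;

// Returns true if the operation may be folded into the phi: every incoming
// value must simplify, except for at most one.
bool canFoldIntoPhi(const std::vector<Incoming> &incoming,
                    const Simplifier &simplify) {
  int nonSimplified = 0;
  for (const Incoming &v : incoming)
    if (!simplify(v) && ++nonSimplified > 1)
      return false;
  return true;
}

// Example simplifier for "x + 1": only folds when x is a constant.
std::optional<int> simplifyAddOne(const Incoming &x) {
  if (x)
    return *x + 1;
  return std::nullopt;
}

// Example simplifier for "x & 0": folds to 0 even when x is unknown.
std::optional<int> simplifyAndZero(const Incoming &) { return 0; }
```

The second simplifier shows what pure constant folding misses: the operation
simplifies for every incoming value even though none of them is constant.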
Emmmer [Wed, 28 Sep 2022 15:04:08 +0000 (23:04 +0800)]
[LLDB][RISCV][NFC] Rewrite instruction in algebraic datatype
The old approach (dedicated ExecXXX for each instruction) is not flexible and results in duplicated code when RVC kicks in.
According to the spec, every compressed instruction can be decoded to a non-compressed one. So we can lower compressed instructions to instructions we already had, which requires a decoupling between the decoder and executor.
This patch:
- uses llvm::Optional and its combinators as much as possible.
- uses template constraints on common instructions.
- makes instructions strongly typed (no uint32_t everywhere, because it is error-prone and burdens the developer when lowering RVC) with the help of an algebraic datatype (std::variant).
Note:
(NFC) because this is more of a refactoring in preparation for RVC.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D135015
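The decoder/executor split from the LLDB RISC-V refactoring above can be
sketched as follows. This is an illustration under stated assumptions, not
LLDB's code: `ADDI`, `ADD`, `Inst`, `decodeCompressed` and `Executor` are
hypothetical names, std::optional stands in for llvm::Optional, and only the
compressed C.ADDI encoding is handled.

```cpp
#include <cstdint>
#include <optional>
#include <variant>

// Strongly-typed instructions instead of raw uint32_t encodings.
struct ADDI { std::uint8_t rd, rs1; std::int32_t imm; };
struct ADD  { std::uint8_t rd, rs1, rs2; };
using Inst = std::variant<ADDI, ADD>;

// Decoder: lower the compressed C.ADDI to the ADDI type we already have.
// Returns nothing for encodings this sketch does not handle.
std::optional<Inst> decodeCompressed(std::uint16_t raw) {
  const bool isQuadrant1 = (raw & 0x3) == 0x1;
  const bool isFunct3Zero = (raw >> 13) == 0x0;
  if (!isQuadrant1 || !isFunct3Zero)
    return std::nullopt;
  const std::uint8_t rd = (raw >> 7) & 0x1F;
  // imm[5] is bit 12, imm[4:0] are bits 6:2; sign-extend from 6 bits.
  std::int32_t imm = ((raw >> 7) & 0x20) | ((raw >> 2) & 0x1F);
  if (imm & 0x20)
    imm -= 64;
  return Inst{ADDI{rd, rd, imm}};
}

// Executor: one overload per instruction type, dispatched by std::visit.
struct Executor {
  std::uint64_t *gpr; // points at 32 general-purpose registers
  void operator()(const ADDI &i) const { write(i.rd, gpr[i.rs1] + i.imm); }
  void operator()(const ADD &i) const {
    write(i.rd, gpr[i.rs1] + gpr[i.rs2]);
  }
  // x0 is hard-wired to zero, so writes to it are dropped.
  void write(std::uint8_t rd, std::uint64_t v) const {
    if (rd != 0)
      gpr[rd] = v;
  }
};
```

Because the compressed form decodes to the same `ADDI` value as the
uncompressed one, the executor needs no RVC-specific code path.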
Nikita Popov [Wed, 5 Oct 2022 11:26:20 +0000 (13:26 +0200)]
[InstCombine] Directly replace instr in foldIntegerTypedPHI() (NFCI)
Rather than inserting a ptrtoint + inttoptr pair, directly replace
the inttoptr with the new phi node. This ensures that no other
transform can undo it before the pair gets folded away.
This avoids the infinite loop when combined with D134954.
This is NFCI in the sense that it shouldn't make a difference, but
could due to different worklist order.
Nikita Popov [Wed, 5 Oct 2022 11:11:13 +0000 (13:11 +0200)]
[InstCombine] Add test for infinite combine loop with D134954 (NFC)
The patch interacts badly with foldIntegerTypedPHI().
Guray Ozen [Wed, 5 Oct 2022 11:09:27 +0000 (13:09 +0200)]
[mlir][transform] Add failing test for GPU transform dialect
The GPU transform dialect currently has restrictions, and there are several situations where the transform dialect cannot be used.
This update includes a method to test failing cases in the GPU transform dialect.
Differential Revision: https://reviews.llvm.org/D135063
Guray Ozen [Wed, 5 Oct 2022 06:48:19 +0000 (08:48 +0200)]
[mlir][transform][nfc] typo fix
fix typo
Reviewed By: nicolasvasilache, ftynse
Differential Revision: https://reviews.llvm.org/D135242
LLVM GN Syncbot [Wed, 5 Oct 2022 09:44:51 +0000 (09:44 +0000)]
[gn build] Port f0f474dfd03b
David Sherwood [Wed, 5 Oct 2022 08:12:31 +0000 (08:12 +0000)]
[AArch64][SME] Add codegen pass to handle ZA state in arm_new_za functions.
The new pass implements the following:
* Inserts code at the start of an arm_new_za function to
commit a lazy-save when the lazy-save mechanism is active.
* Adds a smstart intrinsic at the start of the function.
* Adds a smstop intrinsic at the end of the function.
Patch co-authored by kmclaughlin.
Differential Revision: https://reviews.llvm.org/D133896
Fraser Cormack [Wed, 5 Oct 2022 09:32:44 +0000 (10:32 +0100)]
[VP] Fix unused variable in release configurations
Mikhail Goncharov [Wed, 5 Oct 2022 09:32:28 +0000 (11:32 +0200)]
Fix clang baremetal test
def48cae45a5085b7759f2be71768e27718b901a accidentally dropped -no-canonical-prefixes
Florian Hahn [Wed, 5 Oct 2022 09:28:15 +0000 (10:28 +0100)]
[SimpleLoopUnswitch] Clear dispos in deleteDeadBlocksFromLoop.
SimpleLoopUnswitch may remove blocks from loops. Clear block and loop
dispositions in that case, to clean up invalid entries in the cache.
Fixes #58158.
Fixes #58159.
Florian Hahn [Wed, 5 Oct 2022 09:19:54 +0000 (10:19 +0100)]
[SimpleLoopUnswitch] Simplify test, reduce the passes to trigger crash.
This simplifies the test case added in e399dd601 to only require indvars
and simple-loop-unswitch. This allows adding the test case for #58158 to
the same file, keeping related tests together.
Kadir Cetinkaya [Wed, 5 Oct 2022 08:37:12 +0000 (10:37 +0200)]
Revert "[clang][Lex] Fix a crash on malformed string literals"
This reverts commit 36a200208facf58d454c9b7253c956c2f2a8b946.
Nikita Popov [Wed, 5 Oct 2022 08:30:44 +0000 (10:30 +0200)]
[SROA] Regenerate test checks (NFC)
David Sherwood [Wed, 5 Oct 2022 07:41:23 +0000 (07:41 +0000)]
[AArch64][SME] Prevent SVE object address calculations between smstop and call
This patch introduces a new AArch64 ISD node (OBSCURE_COPY) that can
be used when we want to prevent SVE object address calculations
from being rematerialised between a smstop/smstart and a call.
At the moment we use COPY to copy the frame index to a register,
which leads to problems because the "simple register coalescing"
pass understands the COPY instruction and attempts to rematerialise
an address calculation with 'addvl' between an smstop and a call.
When in streaming mode the 'addvl' instruction may have different
behaviour because the streaming SVE vector length is not guaranteed
to equal the normal SVE vector length.
The new ISD opcode OBSCURE_COPY gets lowered to a new pseudo
instruction also called OBSCURE_COPY. This ensures it cannot be
rematerialised and we expand this into a simple move very late in
the machine instruction pipeline.
A new test is added here:
CodeGen/AArch64/sme-streaming-interface.ll
Differential Revision: https://reviews.llvm.org/D134940
Valentin Clement [Wed, 5 Oct 2022 08:04:46 +0000 (10:04 +0200)]
[flang] Update to fir::isUnlimitedPolymorphicType and fir::isPolymorphicType functions
This patch updates the fir::isUnlimitedPolymorphicType function
to reflect the chosen design. It also adds a fir::isPolymorphicType
function.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D135143
Martin Storsjö [Sat, 1 Oct 2022 12:30:25 +0000 (15:30 +0300)]
[AArch64] Add missing SEH_Nop when aligning the stack
This makes sure that the instructions of the prologue match the
SEH opcodes.
Also remove a couple of redundant cases of setting HasWinCFI; it was
already set unconditionally after the conditional cases.
Differential Revision: https://reviews.llvm.org/D135101
David Spickett [Wed, 5 Oct 2022 07:31:03 +0000 (07:31 +0000)]
Fix LLDB build on old Linux kernels (pre-4.1)
These fields are guarded elsewhere, but were missing here.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D133778
Nicolas Vasilache [Tue, 4 Oct 2022 12:14:30 +0000 (05:14 -0700)]
[mlir][Linalg] NFC - Add bbarg pretty printing to linalg::generic
Differential Revision: https://reviews.llvm.org/D135151
Kadir Cetinkaya [Tue, 4 Oct 2022 15:06:24 +0000 (17:06 +0200)]
[clang][Lex] Fix a crash on malformed string literals
Differential Revision: https://reviews.llvm.org/D135161
Nicolas Vasilache [Tue, 4 Oct 2022 22:42:41 +0000 (15:42 -0700)]
[mlir][Linalg] Retire LinalgStrategyLowerVectorsPass and filter-based patterns
Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785
Depends on D135200
Differential Revision: https://reviews.llvm.org/D135222
Rainer Orth [Wed, 5 Oct 2022 07:53:26 +0000 (09:53 +0200)]
[compiler-rt][test] Heed COMPILER_RT_DEBUG when compiling unittests
When trying to debug some `compiler-rt` unittests, I initially had a hard
time because
- even in a `Debug` build one needs to set `COMPILER_RT_DEBUG` to get
debugging info for some of the code and
- even so the unittests used a hardcoded `-O2` which often makes debugging
impossible.
This patch addresses this by instead using `-O0` if `COMPILER_RT_DEBUG` is set.
Two tests in `sanitizer_type_traits_test.cpp` need to be disabled since
they have undefined references to `__sanitizer::integral_constant<bool,
true>::value`.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D91620
Nicolas Vasilache [Tue, 4 Oct 2022 21:35:10 +0000 (14:35 -0700)]
[mlir][Linalg] Retire LinalgStrategyPeelPass and filter-based pattern.
Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785
Differential Revision: https://reviews.llvm.org/D135200