Aaron Ballman [Sat, 17 Sep 2022 12:21:33 +0000 (08:21 -0400)]
Fix release note formatting and style; NFC
Uses double backticks where appropriate, changes some instances of
GH12345 to be Issue 12345, etc.
Aaron Ballman [Sat, 17 Sep 2022 12:06:16 +0000 (08:06 -0400)]
Fix this test to be more robust
The test is failing because it lacks a target triple, so the number of
diagnostics differs between Windows and Linux targets.
This should correct the issue found by:
https://lab.llvm.org/buildbot/#/builders/109/builds/46804
Aaron Ballman [Sat, 17 Sep 2022 11:55:10 +0000 (07:55 -0400)]
Correctly diagnose use of long long literals w/o a suffix
We would diagnose use of `long long` as an extension in C89 and C++98
modes when the user spelled the type `long long` or used the `LL`
literal suffix, but failed to diagnose when the literal had no suffix
but required a `long long` to represent the value.
Christian Sigg [Fri, 16 Sep 2022 20:27:04 +0000 (22:27 +0200)]
[Bazel] Allow lit_test() macro to be used from other repos.
Wrap implicit dependencies in Label() so that they refer to @llvm-project, see https://bazel.build/rules/lib/Label#Label.
This change allows lit_test() to be used from other bazel repositories.
Filipp Zhinkin [Fri, 16 Sep 2022 18:01:09 +0000 (21:01 +0300)]
[ARM] Add more tests on instructions fusion with comparison with zero; NFC
Baseline tests for D131786
Daniel Bertalan [Fri, 16 Sep 2022 16:11:55 +0000 (18:11 +0200)]
[lld-macho] Simplify base address calculation for init offsets (NFC)
Xiang Li [Sat, 17 Sep 2022 07:11:44 +0000 (00:11 -0700)]
[NFC] update CodeGenHLSL tests to use cc1 instead of driver-mode
Alex Zinenko [Fri, 16 Sep 2022 14:16:27 +0000 (16:16 +0200)]
[mlir] use strided layouts in vector transfer on memrefs
One of the vector transformation patterns has been indiscriminately
converting layouts to affine maps. Leverage the strided form when
possible.
Reviewed By: nicolasvasilache, dcaballe
Differential Revision: https://reviews.llvm.org/D134047
Alex Zinenko [Fri, 16 Sep 2022 13:36:40 +0000 (15:36 +0200)]
[mlir] use strided layout in structured codegen-related tests
All relevant operations have been switched to primarily use the strided
layout, but still support the affine map layout. Update the relevant
tests to use the strided format instead for compatibility with how ops
now print by default.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134045
Emmmer [Sun, 11 Sep 2022 08:36:12 +0000 (16:36 +0800)]
[LLDB][RISCV] Add RVM and RVA instruction support for EmulateInstructionRISCV
Add:
- RVM and RVA instruction sets.
- corresponding unittests.
Further work:
- implement RVC, RVF, RVD, and RVV extensions.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D133670
Kazu Hirata [Sat, 17 Sep 2022 01:26:19 +0000 (18:26 -0700)]
Revert "[llvm] Remove llvm::is_trivially_{copy/move}_constructible (NFC)"
This reverts commit 01ffe31cbb54bfd8e38e71b3cf804a1d67ebf9c1.
A build breakage with GCC 7.3 has been reported:
https://reviews.llvm.org/D132311#3797053
FWIW, GCC 7.5 is OK according to Pavel Chupin. I also personally
tested GCC 8.4.0.
Vladislav Dzhidzhoev [Fri, 9 Sep 2022 23:15:39 +0000 (02:15 +0300)]
[GlobalISel][DebugInfo] Salvage trivially dead instructions
Use salvageDebugInfo for instructions erased as trivially dead in
GlobalISel.
It would be helpful to implement support for G_PTR_ADD and G_FRAME_INDEX
in salvageDebugInfo in the future in order to preserve more variable
locations.
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D133986
Maksim Panchenko [Fri, 16 Sep 2022 23:45:19 +0000 (16:45 -0700)]
[BOLT][NFC] Remove unreachable assertion
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D134094
Aart Bik [Fri, 16 Sep 2022 22:22:48 +0000 (15:22 -0700)]
[mlir][sparse] add loop simplification to sparsification pass
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D134090
Matheus Izvekov [Sat, 18 Jun 2022 02:21:59 +0000 (04:21 +0200)]
[clang] Fix AST representation of expanded template arguments.
Extend clang's SubstTemplateTypeParm to represent the pack substitution index.
Fixes PR56099.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D128113
Peiming Liu [Fri, 16 Sep 2022 18:21:59 +0000 (18:21 +0000)]
[mlir][sparse] Only try to compute a better iteration graph when needed
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D134059
Michael Jones [Thu, 15 Sep 2022 18:24:47 +0000 (11:24 -0700)]
[libc][cmake] separate installing headers
Now libc headers can be installed separately from installing the rest of
the libc.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D133960
Kazu Hirata [Fri, 16 Sep 2022 22:36:40 +0000 (15:36 -0700)]
[Inliner] Retire DefaultInlineOrder (NFC)
DefaultInlineOrder was largely an exercise in generalizing the
traversal order of call sites within the inliner.
Now that the module inliner is starting to form its shape, there is no
point in sharing DefaultInlineOrder between the module inliner and the
CGSCC inliner. DefaultInlineOrder and all the other inline orders are
mutually exclusive in the following sense:
- The use of DefaultInlineOrder doesn't make sense in the module
inliner because there is no priority inherent in the order in which
call sites are added to the list of call sites -- SmallVector.
- The use of any other inline order doesn't make sense in the CGSCC
inliner because little prioritization can be done within one CGSCC.
This patch essentially reverts the addition of DefaultInlineOrder so
that the loop structure of Inliner.cpp looks like the state just
before we started working on the module inliner (circa June 2021).
At the same time, we remove the choice of DefaultInlineOrder from
UseInlinePriority.
Differential Revision: https://reviews.llvm.org/D134080
Jacques Pienaar [Fri, 16 Sep 2022 22:08:55 +0000 (15:08 -0700)]
[mlir][emacs] Enable loading bytecode files as text
Use auto-compression-mode to read bytecode files in a human-readable
manner.
Differential Revision: https://reviews.llvm.org/D133879
Alexey Bataev [Wed, 14 Sep 2022 19:28:31 +0000 (12:28 -0700)]
[SLP]Improve isUndefVector function by adding insertelement analysis.
Added the mask and the analysis of the buildvector sequence in the
isUndefVector function, which improves codegen and cost estimation.
Metric: SLP.NumVectorInstructions
Program                                                                    results    results0   diff
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test   27362.00   27360.00   -0.0%
Metric: size..text
Program                                                                    results    results0   diff
test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test         805299.00  806035.00  0.1%
526.blender_r - some extra code is vectorized.
508.namd_r - some extra code is optimized out.
Differential Revision: https://reviews.llvm.org/D133891
Siva Chandra Reddy [Fri, 16 Sep 2022 19:34:44 +0000 (19:34 +0000)]
[libc] Add implementation of POSIX "uname" function.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D134065
Siva Chandra Reddy [Fri, 16 Sep 2022 20:54:57 +0000 (20:54 +0000)]
[libc][Obvious] Fix typo in struct rlimit name - remove the "_t" suffix.
Chris Bieneman [Fri, 16 Sep 2022 19:42:40 +0000 (14:42 -0500)]
[HLSL] Enable availability attribute
Some HLSL functionality is gated on the target shader model version.
Enabling the use of availability markup allows us to diagnose
availability issues easily in the frontend.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D134067
Sotiris Apostolakis [Fri, 16 Sep 2022 20:34:37 +0000 (20:34 +0000)]
[SelectOpti] Restrict load sinking
This is a follow-up to D133777, which resolved a use-after-free case but
did not cover all possible memory bugs due to misplacement of loads.
In short, the overall problem was that sunk loads could be moved after
state-modifying instructions, leading to memory bugs.
The solution is to restrict load sinking unless it is found to be sound.
i) Within a basic block (to-be-sunk load and select-user are in the same BB),
loads can be sunk only if there is no intervening state-modifying instruction.
This is a conservative approach to avoid resorting to alias analysis to detect
potential memory overlap.
ii) Across basic blocks, sinking of loads is avoided. This is because going over
multiple basic blocks looking for memory conflicts could be computationally
expensive and also unlikely to allow loads to sink. Further, experiments showed
that not sinking these loads has a slight positive performance effect.
Maybe for some of these loads, having some separation allows enough time
for the load to be executed in time for its user. This is not the case for
floating point operations that benefit more from sinking.
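The two rules above can be sketched roughly as follows. This is a minimal Python model, not the SelectOptimize code: the `Inst` record, its `modifies_state` flag, and `can_sink_load` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Inst:
    name: str
    block: str                    # name of the basic block holding the instruction
    modifies_state: bool = False  # hypothetical flag: stores, side-effecting calls, ...

def can_sink_load(insts, load_idx, user_idx):
    """Return True if the load at load_idx may be sunk down to its select user."""
    load, user = insts[load_idx], insts[user_idx]
    # (ii) Across basic blocks, sinking is avoided altogether.
    if load.block != user.block:
        return False
    # (i) Within one block, refuse if any instruction in between may modify state.
    between = insts[load_idx + 1:user_idx]
    return not any(inst.modifies_state for inst in between)

safe = [Inst("load", "bb0"), Inst("add", "bb0"), Inst("select", "bb0")]
unsafe = [Inst("load", "bb0"), Inst("store", "bb0", True), Inst("select", "bb0")]
cross = [Inst("load", "bb0"), Inst("select", "bb1")]
```

Here `can_sink_load(safe, 0, 2)` allows the sink, while the `unsafe` and `cross` sequences are rejected. No alias analysis is involved, which is what makes the check conservative.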
The solution in D133777 was essentially undone in this patch,
since the latter is a complete solution to the observed problem.
Overall, the performance impact of this patch is minimal.
Tested on two internal Google workloads with instrPGO.
Search application showed <0.05% perf difference,
while the database one showed a slight improvement,
but not statistically significant.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D133999
Siva Chandra Reddy [Thu, 15 Sep 2022 22:22:59 +0000 (22:22 +0000)]
[libc] Add implementation of POSIX setrlimit and getrlimit functions.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D134016
Teresa Johnson [Fri, 16 Sep 2022 03:23:44 +0000 (20:23 -0700)]
[WPD/LTT] Lower type test feeding assumes via phi correctly
This fixes https://github.com/llvm/llvm-project/issues/57616.
Type test lowering in ThinLTO modules relies on having type id
summaries set up for the referenced types, which provide the type
test resolution. If there is no summary, the type tests are lowered
to false. At the very least, a default type id summary gives the
type tests a resolution of Unknown, which is handled correctly (ignored
by the first invocation of LTT, and lowered to true by the second).
WPD sets up the type id summaries (with a default type test resolution)
as it is processing the type tests, but only does this for the patterns
handled by WPD, which is a type test directly feeding an assume. In the
case of type tests feeding an assume via a phi, the type id summary was
not being set up, leading to the type tests being lowered to false
incorrectly.
Fix this by adding the default type id summary entries for all type ids
used on globals during index-only WPD.
This is not an issue for hybrid (split-lto-unit) LTO, as in that case
the type test resolution is determined and set up during LTT, since the
type definitions are in the regular LTO split module, and exported via
the summary to the ThinLTO split module.
Differential Revision: https://reviews.llvm.org/D134012
Lang Hames [Fri, 16 Sep 2022 18:23:08 +0000 (11:23 -0700)]
[ORC][ORC-RT][MachO] Reset __data and __common sections on library close.
If we want to be able to close and then re-open a library then we need to reset
the data section states when the library is closed. This commit updates
MachOPlatform and the ORC runtime to track __data and __common sections, and
reset the state in MachOPlatformRuntimeState::dlcloseDeinitialize.
This is only a first step to full support -- there are other data sections that
we're not capturing, and we'll probably want a more efficient representation
for the sections (rather than passing their string name over IPC), but this is
a reasonable first step.
This commit also contains a fix to MapperJITLinkMemoryManager that prevents it
from calling OnDeallocated twice in the case of an error.
Maksim Panchenko [Sat, 10 Sep 2022 01:06:13 +0000 (18:06 -0700)]
[BOLT] Change base class of ExecutableFileMemoryManager
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).
Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.
Fixes #56726
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133994
Maksim Panchenko [Thu, 15 Sep 2022 20:31:52 +0000 (13:31 -0700)]
[BOLT] Fix empty function emission in non-relocation mode
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate a 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such a scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133978
Jessica Paquette [Fri, 16 Sep 2022 20:32:42 +0000 (13:32 -0700)]
[GlobalISel] Combine select + fcmp to fminnum/fmaxnum/fminimum/fmaximum
This is a partial port of the code used by the SelectionDAGBuilder to
translate selects.
In particular, see matchSelectPattern in ValueTracking.cpp. This is a
GISel-equivalent of the portion which handles fminnum/fmaxnum/fminimum/fmaximum.
I tried to set it up so it'd be easy to add the non-FP cases. Those are simpler.
On the AArch64-end, it seems like the FP cases are more important for perf
right now, so I bit the bullet and went at the more complicated problem. :)
I elected to do this as a post-legalize combine rather than in the
IRTranslator because:
- Deciding which fmax/fmin to use can depend on legalization rules
- Philosophically-speaking (TM), putting it in a combine just feels cleaner
- Being able to enable/disable the combine is handy
Another option would be to use the ValueTracking code in the IRTranslator and
match what SelectionDAGBuilder::visitSelect does. I think that may be somewhat
annoying since we'd need to write lowerings back into the selects in the
legalizer. I'm not strongly opposed to the approach.
We'd also want to be careful with vector selects once that's implemented,
which explicitly check if a vector select is legal on the target. That'd
probably need a hook.
From what I can tell, doing this as a combine is probably a cleaner option
long-term.
Differential Revision: https://reviews.llvm.org/D116702
Pengxuan Zheng [Fri, 16 Sep 2022 00:52:46 +0000 (17:52 -0700)]
[MachineCSE] Add a threshold to avoid spending too much time in isProfitableToCSE
Currently, it can become extremely costly to compute MayIncreasePressure if the
size of CSUses turns out to be very large. In that case, it's no longer cost
effective to keep computing MayIncreasePressure. Therefore, to limit the amount
of time spent in isProfitableToCSE, we simply conservatively assume
MayIncreasePressure if the size of CSUses is too large. This can reduce overall
compile time by 30% for some benchmarks.
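A minimal sketch of the cutoff described above, in Python rather than the C++ of MachineCSE. The constant `CS_USES_THRESHOLD` and the function names are hypothetical; the actual threshold value chosen by the patch is not stated in this message.

```python
CS_USES_THRESHOLD = 1024  # hypothetical cutoff, not the value used by the patch

def may_increase_pressure(cs_uses, expensive_check, threshold=CS_USES_THRESHOLD):
    """Conservatively answer True when there are too many uses to analyze cheaply."""
    if len(cs_uses) > threshold:
        # Too costly to compute precisely: assume register pressure may increase.
        return True
    # Small enough: run the precise (and potentially slow) analysis.
    return expensive_check(cs_uses)
```

The trade-off is that a large use list may cause a profitable CSE to be skipped, in exchange for bounded compile time.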
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134003
Craig Topper [Fri, 16 Sep 2022 19:59:13 +0000 (12:59 -0700)]
[VP][VE] Default VP_SREM/UREM to Expand and add generic expansion using VP_SDIV/UDIV+VP_MUL+VP_SUB.
I want to default all VP operations to Expand. These 2 were blocking
because VE doesn't support them and the tests were expecting them
to fail a specific way. Using Expand caused them to fail differently.
Seemed better to emulate them using operations that are supported.
@simoll mentioned on Discord that VE has some expansion downstream. Not
sure if it's done like this or in the VE target.
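The generic expansion named in the title amounts to computing the remainder as x - (x / y) * y with a truncating division. A minimal Python sketch, assuming a per-lane mask and an explicit vector length `evl`; the helper names are hypothetical, and fixed-width overflow cases such as INT_MIN / -1 are not modeled:

```python
def trunc_div(x, y):
    """Signed division that rounds toward zero (Python's // rounds toward -inf)."""
    q = abs(x) // abs(y)
    return q if (x >= 0) == (y >= 0) else -q

def vp_srem_expand(xs, ys, mask, evl):
    """Emulate a predicated srem as sdiv + mul + sub on the active lanes."""
    out = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i < evl and mask[i]:
            q = trunc_div(x, y)   # stands in for VP_SDIV
            out.append(x - q * y) # stands in for VP_MUL followed by VP_SUB
        else:
            out.append(None)      # inactive lane: left unspecified in this sketch
    return out
```

For example, `vp_srem_expand([7, -7, 7, 5], [2, 2, -2, 3], [True, True, True, False], 4)` gives `[1, -1, 1, None]`; the sign of each result follows the dividend, as expected for srem.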
Reviewed By: frasercrmck, efocht
Differential Revision: https://reviews.llvm.org/D133514
Xing Xue [Fri, 16 Sep 2022 20:08:40 +0000 (16:08 -0400)]
[libc++][lit][AIX] Enable test case last_write_time.pass.cpp for AIX
Summary:
This patch enables libc++ LIT test case last_write_time.pass.cpp for AIX. Because system call utimensat() of AIX which is used in the libc++ implementation of last_write_time() does not accept the times parameter with a negative tv_sec or tv_nsec field, testing of setting file time to before epoch time is excluded for AIX.
Reviewed by: ldionne, libc++
Differential Revision: https://reviews.llvm.org/D133124
Craig Topper [Fri, 16 Sep 2022 19:55:31 +0000 (12:55 -0700)]
[RISCV] Simplify some code in vector fp<->int handling. NFC
We changed the way container types are selected since this code
was written. We no longer need to use the largest type.
Aiden Grossman [Fri, 16 Sep 2022 19:35:50 +0000 (19:35 +0000)]
[Clang] Give error message for invalid profile path when compiling IR
Before this patch, when compiling an IR file (e.g., the .llvmbc section
from an object file compiled with -Xclang -fembed-bitcode=all) and
profile data was passed in using the -fprofile-instrument-use-path
flag, there would be no error printed (as the previous implementation
relied on the error getting caught again in the constructor of
CodeGenModule which isn't called when -x ir is set). This patch
moves the error checking directly to where the error is caught
originally rather than failing silently in setPGOUseInstrumentor and
waiting to catch it in CodeGenModule to print diagnostic information to
the user.
Regression test added.
Reviewed By: xur, mtrofin
Differential Revision: https://reviews.llvm.org/D132991
David Majnemer [Fri, 16 Sep 2022 18:59:15 +0000 (18:59 +0000)]
Revert "Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView""
This reverts commit cd20a1828605887699579789b5433111d5bc0319 and adds a
"let Heading" to NoStackProtectorDocs.
Kazu Hirata [Fri, 16 Sep 2022 19:32:16 +0000 (12:32 -0700)]
[ModuleInliner] Move InlinePriority and its derived classes to InlineOrder.cpp (NFC)
These classes are referred to only from getInlineOrder in
InlineOrder.cpp. This patch hides the entire class declarations and
definitions in InlineOrder.cpp.
Differential Revision: https://reviews.llvm.org/D134056
Stanislav Mekhanoshin [Fri, 16 Sep 2022 19:09:35 +0000 (12:09 -0700)]
[AMDGPU] Fix runline for windows in sdag-print-divergence.ll. NFC.
Abhina Sreeskantharajan [Fri, 16 Sep 2022 19:13:08 +0000 (15:13 -0400)]
[test] Use host platform specific error message substitution
This patch modifies the testcase to use error substitution so it will pass on all platforms.
Reviewed By: fanbo-meng, zibi
Differential Revision: https://reviews.llvm.org/D134034
Jakub Kuderski [Fri, 16 Sep 2022 19:02:47 +0000 (15:02 -0400)]
[mlir][arith] Remove misleading comment in EmulateWideInt. NFC.
At the request of @Mogball.
Amir Ayupov [Fri, 16 Sep 2022 18:43:16 +0000 (11:43 -0700)]
[BOLT] Verify externally referenced blocks against jump table targets
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data to code references.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D132495
Stanislav Mekhanoshin [Thu, 15 Sep 2022 22:12:14 +0000 (15:12 -0700)]
[SDAG] Print divergence in SDNode::dump
If the target does not support divergence, the field is set to false
and not printed.
Differential Revision: https://reviews.llvm.org/D133984
Wei Yi Tee [Fri, 16 Sep 2022 18:05:45 +0000 (18:05 +0000)]
Revert "[clang][dataflow] Replace `transfer(const Stmt *, ...)` with `transfer(const CFGElement *, ...)` in `Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel`."
This reverts commit 41f235d26887946f472d71a8417507c35d5f9074.
Details at https://lab.llvm.org/buildbot#builders/139/builds/28171.
Breakage due to API change.
Wei Yi Tee [Fri, 16 Sep 2022 17:38:55 +0000 (17:38 +0000)]
[clang][dataflow] Replace `transfer(const Stmt *, ...)` with `transfer(const CFGElement *, ...)` in `Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel`.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D133930
Justin Lebar [Fri, 16 Sep 2022 17:46:23 +0000 (10:46 -0700)]
[NFC] Fix indentation in ValueTracking.h.
In a separate patch I want to modify ValueTracking.h. When I touch the
header, arc wants to clang-format the lines I touch (reasonable!). But
then these whitespace changes get mixed into my patch.
Thomas Raoux [Fri, 16 Sep 2022 00:39:15 +0000 (00:39 +0000)]
[mlir][vector] Remove ExtractMap/InsertMap operations
As discussed on discourse: https://discourse.llvm.org/t/vector-vector-distribution-large-vector-to-small-vector/1983/22
removing insert_map/extract_map op as vector distribution now uses
warp_execute_on_lane_0 op.
Differential Revision: https://reviews.llvm.org/D134000
Michał Górny [Fri, 16 Sep 2022 07:07:10 +0000 (09:07 +0200)]
[clang] [Driver] Add an option to disable default config filenames
Add a `--no-default-config` option that disables the search for default
set of config filenames (based on the compiler executable name).
Suggested in https://discourse.llvm.org/t/rfc-adding-a-default-file-location-to-config-file-support/63606.
Differential Revision: https://reviews.llvm.org/D134018
Brett Wilson [Fri, 16 Sep 2022 17:24:51 +0000 (17:24 +0000)]
[clang-doc] Support default args for functions.
Adds support for default arguments in the internal representation and reads these values from the source. Implements writing these values to YAML but does not implement this for the HTML or markdown outputs.
Reviewed By: paulkirth
Differential Revision: https://reviews.llvm.org/D133732
Zequan Wu [Fri, 9 Sep 2022 18:47:15 +0000 (11:47 -0700)]
[LLDB][NativePDB] ResolveSymbolContext should return the innermost block
Previously, it returned the outermost block if nested blocks had the same
address range. That caused lldb to be unable to find variables that are inside
inner blocks.
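The intended behavior can be illustrated with a small Python model. The tuple layout `(name, start, end, depth)` and the function name are assumptions made for this sketch and do not reflect LLDB's data structures:

```python
def innermost_block(blocks, addr):
    """Pick the deepest block whose [start, end) range contains addr."""
    containing = [b for b in blocks if b[1] <= addr < b[2]]
    if not containing:
        return None
    # When ranges are identical, nesting depth is the tie-breaker.
    return max(containing, key=lambda b: b[3])[0]

blocks = [
    ("outer", 0x100, 0x200, 0),
    ("inner", 0x100, 0x200, 1),  # same address range, nested one level deeper
]
```

With these blocks, a lookup at `0x150` resolves to `"inner"`, so variables declared in the inner scope remain visible.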
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D133601
LLVM GN Syncbot [Fri, 16 Sep 2022 16:49:26 +0000 (16:49 +0000)]
[gn build] Port 7061a3f3f89d
Jan Svoboda [Fri, 16 Sep 2022 16:42:28 +0000 (09:42 -0700)]
[clang][deps] Make sure ScanInstance outlives collector
The `ScanInstance` is a local variable in `DependencyScanningAction::runInvocation()` that is referenced by `ModuleDepCollector`. Since D132405, `ModuleDepCollector` can escape the function and can outlive its `ScanInstance`. This patch fixes that.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D133988
Kazu Hirata [Fri, 16 Sep 2022 16:41:42 +0000 (09:41 -0700)]
[ModuleInliner] clang-format ModuleInliner.cpp (NFC)
Michael Buch [Thu, 15 Sep 2022 02:37:08 +0000 (22:37 -0400)]
[clang][ASTImporter] DeclContext::localUncachedLookup: Continue lookup into decl chain when regular lookup fails
The uncached lookup is mainly used in the ASTImporter/LLDB code-path
where we're not allowed to load from external storage. When importing
a FieldDecl with a DeclContext that had no external visible storage
(but came from a Clang module or PCH), the regular `DeclContext::lookup`
(the above call to `lookup(Name)`) fails because:
1. `DeclContext::buildLookup` doesn't set `LookupPtr` for decls
that came from a module
2. LLDB doesn't use the `SharedImporterState`
In such a case we would never continue with the "slow" path of iterating
through the decl chain on the DeclContext. In some cases this means that
ASTNodeImporter::VisitFieldDecl ends up importing a decl into the
DeclContext a second time.
The patch removes the short-circuit in the case where we don't find
any decls via the regular lookup.
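The control flow after the patch can be sketched as below. This is a simplified Python model with hypothetical names; declarations are represented as `(name, id)` tuples and the lookup table stands in for the structure behind `LookupPtr`:

```python
def local_uncached_lookup(lookup_table, decl_chain, name):
    """Try the regular lookup first; fall back to walking the decl chain."""
    found = list(lookup_table.get(name, []))
    if found:
        return found
    # Previously the function returned here even when nothing was found.
    # The slow path below is what lets an already-imported decl be rediscovered.
    return [decl for decl in decl_chain if decl[0] == name]

decl_chain = [("x", 1), ("y", 2)]
```

With an empty table (the case where no lookup structure was built for module decls), `local_uncached_lookup({}, decl_chain, "x")` still finds `("x", 1)`, so an importer would not add the field a second time.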
**Tests**
* Un-skip the failing LLDB API tests
Differential Revision: https://reviews.llvm.org/D133945
Michael Buch [Thu, 15 Sep 2022 02:36:07 +0000 (22:36 -0400)]
[lldb][tests][gmodules] Test for expression evaluator crash for types referencing the same template
The problem here is that the ASTImporter adds
the template class member FieldDecl to
the DeclContext twice. This happens because
we don't construct a `LookupPtr` for decls
that originate from modules and thus the
ASTImporter never realizes that the FieldDecl
has already been imported. These duplicate
decls then break the assumption of the LayoutBuilder
which expects only a single member decl to
exist.
The test will be fixed by a follow-up revision
and is thus skipped for now.
Differential Revision: https://reviews.llvm.org/D133944
Kazu Hirata [Fri, 16 Sep 2022 16:37:43 +0000 (09:37 -0700)]
[ModuleInliner] Remove a stale comment (NFC)
These comments refer to the nested loop in the module inliner where
the inner loop grouped call sites from the same caller. We don't
group call sites anymore, so the comment has become stale.
Kazu Hirata [Fri, 16 Sep 2022 16:32:02 +0000 (09:32 -0700)]
[ModuleInliner] Remove a redundant variable (NFC)
In the CGSCC inliner, DidInline was used as an indicator to update the call graph.
In the module inliner, DidInline is always true at the end of the
"while" loop, so we can just drop it.
Simon Pilgrim [Fri, 16 Sep 2022 16:21:10 +0000 (17:21 +0100)]
[LoopIdiom][X86] Add non-LZCNT test coverage to 'rshift until zero' idiom tests
Nicolas Vasilache [Tue, 13 Sep 2022 06:01:25 +0000 (23:01 -0700)]
[mlir][scf][Transform] Refactor transform.fuse_into_containing_op so it is iterative and supports output fusion.
This revision revisits the implementation of `transform.fuse_into_containing_op` so that it iterates on
producers one use at a time.
Support is added to fuse a producer through a foreach_thread shared tensor argument, in which case we
tile and fuse the op inside the containing op and update the shared tensor argument to the unique destination operand.
If one cannot find such a unique destination operand the transform fails.
Differential Revision: https://reviews.llvm.org/D134051
mbs [Fri, 16 Sep 2022 16:20:10 +0000 (10:20 -0600)]
[support] Prepare TimeProfiler for cross-thread support
This NFC prepares the TimeProfiler to support the construction
and completion of time profiling 'entries' across threads.
- Add ClockType alias so we can change the clock in one place.
- (trivial) Use C++ usings instead of typedefs.
- Rename Entry to TimeTraceProfilerEntry since this type will eventually become public.
- Add an intro comment.
- Add some smoke unit tests.
Reviewed By: russell.gallop, rriddle, lattner, jloser
Differential Revision: https://reviews.llvm.org/D133153
Wei Yi Tee [Fri, 16 Sep 2022 15:16:49 +0000 (15:16 +0000)]
[clang][dataflow] Replace usage of the deprecated overload of `checkDataflow`.
Updated files:
- `ChromiumCheckModelTest.cpp`.
- `MatchSwitchTest.cpp`.
- `MultiVarConstantPropagationTest.cpp`.
- `SingleVarConstantPropagationTest.cpp`.
- `TestingSupportTest.cpp`.
- `TransferTest.cpp`.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D133865
Kazu Hirata [Fri, 16 Sep 2022 16:15:53 +0000 (09:15 -0700)]
[ModuleInliner] Remove a write-only variable (NFC)
InlinedCallees is a remnant from the CGSCC inliner. We don't use it
in the module inliner.
Jakub Kuderski [Fri, 16 Sep 2022 16:09:23 +0000 (12:09 -0400)]
[mlir][arith] Support wide int shrui emulation
Tested by checking all 16-bit LHS and all valid RHS when emulating i16 with i8 operations.
Reviewed By: antiagainst, Mogball
Differential Revision: https://reviews.llvm.org/D133722
Jakub Kuderski [Fri, 16 Sep 2022 16:02:06 +0000 (12:02 -0400)]
[mlir][arith] Support wide integer multiplication emulation
Emulate multiplication by splitting each input element of type i2N into 4
digits of type iN and bit width i(N/2). This is so that the intermediate
multiplications and additions do not overflow. We extract these i(N/2)
digits from iN vector elements by masking (low digit) and shifting right
(high digit).
The multiplication algorithm used is the standard (long) multiplication.
Multiplying two i2N integers produces (at most) an i4N result, but because
the calculation of the top i2N is not necessary, we omit it.
In total, this implementation performs 10 intermediate multiplications
and 16 additions. The number of multiplications could be decreased by
switching to a more efficient algorithm like Karatsuba. This would,
however, require being able to perform (intermediate) wide additions and
subtractions, so it is not clear that such an implementation would be more
efficient.
I tested this on all 16-bit input pairs, when emulating i16 with i8.
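The digit-splitting scheme can be checked with a short Python model. This is only a sketch of the arithmetic, not the MLIR pattern: Python integers are unbounded, so the column sums here are not constrained to iN the way the real lowering must be, and the count of 16 additions is not reproduced. The function names are hypothetical.

```python
def split_digits(x, n):
    """Split an i2N value into four digits of bit width N/2 (low digit first)."""
    half = n // 2
    mask = (1 << half) - 1
    return [(x >> (half * i)) & mask for i in range(4)]

def emulated_mul(a, b, n):
    """Long multiplication keeping only the low 2N bits; returns (result, #mults)."""
    half = n // 2
    digit_mask = (1 << half) - 1
    da, db = split_digits(a, n), split_digits(b, n)
    columns = [0, 0, 0, 0]
    mults = 0
    for i in range(4):
        for j in range(4 - i):        # digit pairs with i + j > 3 only affect the dropped top half
            columns[i + j] += da[i] * db[j]  # each product fits in N bits
            mults += 1
    result, carry = 0, 0
    for k in range(4):                # propagate carries from low to high digit
        total = columns[k] + carry
        result |= (total & digit_mask) << (half * k)
        carry = total >> half         # the carry out of the top digit is discarded
    return result, mults
```

With `n = 8` (emulating i16 with i8), `emulated_mul(a, b, 8)` agrees with `(a * b) & 0xFFFF` and uses 10 digit multiplications (4 + 3 + 2 + 1), matching the count given above.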
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D133629
Simon Pilgrim [Fri, 16 Sep 2022 15:56:40 +0000 (16:56 +0100)]
[CostModel][X86] Update throughput costs for CTLZ ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 (and recent fixes to the bdver2 + alderlake models)
Adding full CostKinds costs affects some other tests, as they make assumptions about SizeLatency costs, so those need addressing first
Kazu Hirata [Fri, 16 Sep 2022 15:56:17 +0000 (08:56 -0700)]
[IPO] Simplify the module inliner loop (NFC)
In the bottom-up inliner, we have a two-level nested "while" loop,
with the inner one grouping call sites with the same caller. We need
to do so to keep CGSCC up to date.
Now, with the module inliner, we don't have any per-caller work. We
don't update CGSCC. Plus, the caller will likely keep changing as we
pop call sites in some priority order.
This patch simply removes the inner "while" loop while indenting its
body. Further cleanup is possible, but that's left for follow-up
patches.
Differential Revision: https://reviews.llvm.org/D133969
Jakub Kuderski [Fri, 16 Sep 2022 15:49:41 +0000 (11:49 -0400)]
[mlir][arith] Add initial files for (runtime) integration tests
The goal is to have a set of runtime tests to further exercise the
wide integer emulation pass and its conversion patterns. This was
suggested by @Mogball in D133629.
Add a minimal runtime test to demonstrate that printing and the pass
pipeline work as expected.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D134004
Juan Manuel MARTINEZ CAAMAÑO [Fri, 16 Sep 2022 09:40:33 +0000 (09:40 +0000)]
[DAGCombine] Do not fold SRA/SRL of MUL into MULH when MUL's LSB are used, and MUL_LOHI is available
Folding a sra(mul) / srl(mul) into a mulh introduces an extra
multiplication to compute the high half of the multiplication,
while it is more profitable to compute the high and low halves with a
single mul_lohi.
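The relationship between the three forms can be illustrated for the unsigned case with a small Python model. The function names mirror the node names loosely and are hypothetical; the signed (sra / mulhs) case is not modeled:

```python
def mul_lohi(a, b, bits):
    """One widening multiply yielding both halves: (low, high)."""
    mask = (1 << bits) - 1
    full = (a & mask) * (b & mask)
    return full & mask, full >> bits

def mulhu(a, b, bits):
    """High half only; a separate mul is still needed if the low half is used."""
    mask = (1 << bits) - 1
    return ((a & mask) * (b & mask)) >> bits

a, b, bits = 0xFFFF, 0x1234, 16
wide = a * b                 # the zero-extended, 2*bits-wide multiplication
hi_via_shift = wide >> bits  # srl(mul, bits)
lo, hi = mul_lohi(a, b, bits)
```

All three routes agree on the high half, but when the low bits of the product are also needed, mul plus mulhu costs two multiplications where a single mul_lohi suffices.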
Differential Revision: https://reviews.llvm.org/D133768
Matheus Izvekov [Sat, 3 Sep 2022 16:36:59 +0000 (18:36 +0200)]
[clang] Fixes how we represent / emulate builtin templates
We change the template specialization of builtin templates to
behave like aliases.
Though unlike real alias templates, these might still produce a canonical
TemplateSpecializationType when some important argument is dependent.
For example, we can't do anything about make_integer_seq when the
count is dependent, or a type_pack_element when the index is dependent.
We change type deduction to not try to deduce canonical TSTs of
builtin templates.
We also change those builtin templates to produce substitution sugar,
just like a real instantiation would, making the resulting type correctly
represent the template arguments used to specialize the underlying template.
And make_integer_seq will now produce a TST for the specialization
of its first argument, which we use as the underlying type of
the builtin alias.
When performing member access on the resulting type, it's now
possible to map from a Subst* node to the template argument
as-written used in a regular fashion, without special casing.
And this fixes a bunch of bugs with relation to these builtin
templates factoring into deduction.
Fixes GH42102 and GH51928.
Depends on D133261
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D133262
Daniel Bertalan [Mon, 5 Sep 2022 17:03:15 +0000 (19:03 +0200)]
[lld-macho] Parallelize linker optimization hint processing
This commit moves the parsing of linker optimization hints into
`ARM64::applyOptimizationHints`. This lets us avoid allocating memory
for holding the parsed information, and moves work out of
`ObjFile::parse`, which is not parallelized at the moment.
This change reduces the overhead of processing LOHs to 25-30 ms when
linking Chromium Framework on my M1 machine; previously it took close to
100 ms.
There's no statistically significant change in runtime for a --threads=1
link.
Performance figures with all 8 cores utilized:
    N           Min           Max        Median           Avg        Stddev
x  20     3.8027232     3.8760762     3.8505335     3.8454145   0.026352574
+  20     3.7019017     3.8660538     3.7546209     3.7620371   0.032680043
Difference at 95.0% confidence
-0.0833775 +/- 0.019
-2.16823% +/- 0.494094%
(Student's t, pooled s = 0.0296854)
Differential Revision: https://reviews.llvm.org/D133439
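Illustration only: a small C++ sketch that reproduces the summary lines of
the performance figures from the two rows of averages and standard
deviations. The struct and function names are hypothetical. The critical
value passed to `halfWidth` (about 2.0244 for 38 degrees of freedom at 95%
confidence) is taken from a standard Student's t table and is an assumption
of this sketch, not something stated in the commit message.

```cpp
#include <cassert>
#include <cmath>

// Summary statistics for one set of timing runs (hypothetical type).
struct Sample {
  int n;         // number of runs
  double mean;   // average wall time in seconds
  double stddev; // sample standard deviation
};

// Pooled standard deviation of two samples.
inline double pooledStddev(const Sample &a, const Sample &b) {
  assert(a.n + b.n > 2);
  double weighted = (a.n - 1) * a.stddev * a.stddev +
                    (b.n - 1) * b.stddev * b.stddev;
  return std::sqrt(weighted / (a.n + b.n - 2));
}

// Absolute difference of the means (after minus before).
inline double meanDifference(const Sample &before, const Sample &after) {
  return after.mean - before.mean;
}

// Difference of the means as a percentage of the "before" mean.
inline double relativeDifferencePercent(const Sample &before,
                                        const Sample &after) {
  return 100.0 * meanDifference(before, after) / before.mean;
}

// Half-width of the confidence interval for the difference of the means,
// given the critical t value for the chosen confidence level.
inline double halfWidth(const Sample &a, const Sample &b, double tCritical) {
  return tCritical * pooledStddev(a, b) * std::sqrt(1.0 / a.n + 1.0 / b.n);
}
```

Feeding in the two rows above gives a pooled s of roughly 0.02969, a
difference of roughly -0.0834 s (about -2.17%), and a half-width of roughly
0.019 s, in line with the reported figures.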
Sourabh Singh Tomar [Thu, 15 Sep 2022 07:04:38 +0000 (12:34 +0530)]
[flang][OpenMP] Lower OpenMP `taskgroup` construct
Lower Fortran OpenMP `taskgroup` to FIR + OpenMP Dialect.
Reviewed By: kiranchandramohan, peixin
Differential Revision: https://reviews.llvm.org/D133918
Dave Lee [Tue, 13 Sep 2022 00:03:08 +0000 (17:03 -0700)]
[lldb] Use SWIG_fail in python-typemaps.swig (NFC)
When attempting to use SWIG's `-builtin` flag, there were a few compile
failures caused by a mismatch between return type and return value. In those
cases, the return type was `int` but many of the type maps assume returning
`NULL`/`nullptr` (only the latter caused compile failures).
This fix abstracts failure paths to use the `SWIG_fail` macro, which performs
`goto fail;`. Each of the generated functions contains a `fail` label, which
performs any resource cleanup and returns the appropriate failure value.
This change isn't strictly necessary at this point, but seems like the right
thing to do, and for anyone who tries `-builtin` later, it resolves those
issues.
Differential Revision: https://reviews.llvm.org/D133961
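Illustration only, not SWIG-generated code: a minimal C++ sketch of the
`goto fail;` pattern described above. `SKETCH_FAIL` is a hypothetical
stand-in for `SWIG_fail`, and `copyNonNegative` is a made-up function whose
return type is `int`, so its failure value is -1 rather than a null pointer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for SWIG's SWIG_fail, which performs `goto fail;`.
#define SKETCH_FAIL goto fail

// Copies `count` non-negative ints from `input` to `output`.
// Returns 0 on success and -1 on failure.
int copyNonNegative(const int *input, int count, int *output) {
  // Declared before any jump so that no `goto` skips an initialization.
  int *scratch = nullptr;

  if (input == nullptr || output == nullptr || count <= 0)
    SKETCH_FAIL;

  scratch = static_cast<int *>(std::malloc(sizeof(int) * count));
  if (scratch == nullptr)
    SKETCH_FAIL;

  for (int i = 0; i < count; ++i) {
    if (input[i] < 0)
      SKETCH_FAIL; // the shared cleanup below still runs
    scratch[i] = input[i];
  }

  std::memcpy(output, scratch, sizeof(int) * count);
  assert(output[0] >= 0); // only validated values reach the output
  std::free(scratch);
  return 0;

fail:
  // Single exit path for errors: release resources, then return the
  // failure value that matches the function's return type.
  std::free(scratch);
  return -1;
}
```

Because every error path jumps to the same label, the failure value is
written once and always matches the return type.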
Dmitry Preobrazhensky [Fri, 16 Sep 2022 15:18:32 +0000 (18:18 +0300)]
[AMDGPU][MC][NFC] Correct error message
Differential Revision: https://reviews.llvm.org/D134028
Matheus Izvekov [Sat, 3 Sep 2022 17:48:14 +0000 (19:48 +0200)]
NFC: [clang] add template AST test for make_integer_seq and type_pack_element
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D133261
Zahira Ammarguellat [Mon, 29 Aug 2022 14:18:19 +0000 (10:18 -0400)]
Currently the options ‘ffast-math’ and ‘ffp-contract’ are connected.
When ‘ffast-math’ is set, ffp-contract is altered this way:
-ffast-math/-Ofast -> ffp-contract=fast
-fno-fast-math -> if ffp-contract=fast then ffp-contract=on, else
ffp-contract unchanged
This differs from gcc which doesn’t connect the two options.
Connecting these two options in clang resulted in spurious warnings
when the user combines -ffast-math and -fno-fast-math; see
issue https://github.com/llvm/llvm-project/issues/54625.
The issue is that the ‘ffast-math’ option is an on/off flag, but the
‘ffp-contract’ is an on/off/fast flag. So when ‘fno-fast-math’ is used
there is no obvious value for ‘ffp-contract’. What should the value of
ffp-contract be for -ffp-contract=fast -fno-fast-math and -ffast-math
-ffp-contract=fast -fno-fast-math? The current logic sets ffp-contract
back to on in these cases. This doesn’t take into account that the value
of ffp-contract is modified by an explicit ‘ffp-contract’ option.
This patch is proposing a set of rules to apply when ‘ffp-contract’,
ffast-math and fno-fast-math are combined. These rules would give the
user the expected behavior and no diagnostic would be needed.
See RFC
https://discourse.llvm.org/t/rfc-making-ffast-math-option-unrelated-to-ffp-contract-option/61912
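Illustration only, not clang driver code: a minimal C++ sketch of the
pre-patch interaction as it is described above, to make the problematic
cases concrete. The function name and the `initial` parameter are
hypothetical. The new rules proposed by the patch are not modelled here,
since this message does not spell them out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Models the pre-patch rules quoted above, applied left to right:
//   -ffast-math / -Ofast -> ffp-contract=fast
//   -fno-fast-math       -> ffp-contract=on if it was fast, else unchanged
//   -ffp-contract=<v>    -> ffp-contract=<v>
// `initial` is the value in effect before any option is seen (assumption:
// the caller supplies it).
inline std::string resolveFpContractOld(const std::vector<std::string> &args,
                                        std::string initial) {
  assert(!initial.empty());
  const std::string prefix = "-ffp-contract=";
  std::string value = initial;
  for (const std::string &arg : args) {
    if (arg == "-ffast-math" || arg == "-Ofast") {
      value = "fast";
    } else if (arg == "-fno-fast-math") {
      if (value == "fast")
        value = "on"; // loses track of an explicit -ffp-contract=fast
    } else if (arg.rfind(prefix, 0) == 0) {
      value = arg.substr(prefix.size());
    }
  }
  return value;
}
```

With these rules, `-ffp-contract=fast -fno-fast-math` ends up as `on` even
though the user asked for `fast` explicitly, which is the case the message
calls out.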
Matheus Izvekov [Tue, 19 Jul 2022 09:02:32 +0000 (11:02 +0200)]
[clang] extend getCommonSugaredType to merge sugar nodes
This continues D111283 by extending the getCommonSugaredType
implementation to also merge non-canonical type nodes.
We merge these nodes by going up starting from the canonical
node, calculating their merged properties on the way.
If we reach a pair that is too different, or which we could not
otherwise unify, we bail out and don't try to keep going on to
the next pair, in effect stripping out all the remaining top-level
sugar nodes. This avoids mismatching 'companion' nodes, such as
ElaboratedType, so that they don't end up elaborating some other
unrelated thing.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D130308
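Illustration only, not the clang implementation: a much simplified C++
sketch of the walk described above. Each type is modelled as a list of
node descriptions ordered from the canonical node outwards, and two nodes
unify only if they are identical; the real code computes merged properties
for nodes that differ but are compatible. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Each vector lists a type's nodes starting at the canonical node
// (index 0) and moving outwards through successive layers of sugar.
inline std::vector<std::string>
mergeCommonSugar(const std::vector<std::string> &lhs,
                 const std::vector<std::string> &rhs) {
  // The two types must be structurally the same, i.e. share a canonical node.
  assert(!lhs.empty() && !rhs.empty() && lhs[0] == rhs[0]);

  std::vector<std::string> merged{lhs[0]};
  for (std::size_t i = 1; i < lhs.size() && i < rhs.size(); ++i) {
    // On the first pair that cannot be unified, stop: every remaining
    // outer sugar node is dropped rather than paired with the wrong one.
    if (lhs[i] != rhs[i])
      break;
    merged.push_back(lhs[i]);
  }
  return merged;
}
```

Stopping at the first mismatch is what keeps a companion node from one
side from being attached to an unrelated node from the other side.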
Nico Weber [Fri, 16 Sep 2022 14:48:49 +0000 (10:48 -0400)]
[gn build] port 2d52c6bfae80 more (follow-up to 41c79d0b6d4ce, __tuple split)
Sander de Smalen [Fri, 16 Sep 2022 14:21:22 +0000 (14:21 +0000)]
[AArch64][SME] Implement ABI for calls from streaming-compatible functions.
When a function is streaming-compatible and calls a function with a normal or streaming
interface, it may need to enable/disable streaming mode before the call, and
needs to restore PSTATE.SM after the call.
This patch implements this with a Pseudo node that gets expanded to a
conditional branch and smstart/smstop node.
More details about the SME attributes and design can be found
in D131562.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D131578
Sander de Smalen [Fri, 16 Sep 2022 14:09:35 +0000 (14:09 +0000)]
[AArch64][SME] Document SME ABI implementation in LLVM
Adds a design document for implementing the SME ABI in LLVM. This document
can be used as a reference for follow-up patches that attempt to implement
the ABI.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D131562
Sanjay Patel [Fri, 16 Sep 2022 13:54:11 +0000 (09:54 -0400)]
[InstCombine] reduce code duplication in foldICmpMulConstant(); NFC
Matheus Izvekov [Sun, 10 Oct 2021 13:28:37 +0000 (15:28 +0200)]
[clang] use getCommonSugar in an assortment of places
For this patch, a simple search was performed for patterns where there are
two types (usually an LHS and an RHS) which are structurally the same, and there
is some result type which is resolved as either one of them (typically LHS for
consistency).
We change those cases to resolve as the common sugared type between those two,
utilizing the new infrastructure created for this purpose.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D111509
Simon Pilgrim [Fri, 16 Sep 2022 14:23:28 +0000 (15:23 +0100)]
[X86] Add missing (unsupported) zmm vector move classes
Although unsupported on HSW, we reuse this model for KNL which does require them
Noticed when running the cost model fuzz script from D103695 with -mcpu=knl
Liqiang Tao [Fri, 16 Sep 2022 14:15:15 +0000 (22:15 +0800)]
StackProtector: ensure stack checks are inserted before the tail call
The IR stack protector pass should insert stack checks before all tail
calls, not only before musttail calls, so that the attributes `sspreq` and
`tail call`, which are emitted by llvm-opt, can both be enabled by
llvm-llc.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D133860
Muiez Ahmed [Fri, 16 Sep 2022 14:22:21 +0000 (10:22 -0400)]
[SystemZ][z/OS] define REMOVE_ALL_USE_DIRECTORY_ITERATOR (libc++)
This patch fixes the z/OS build by using the first implementation of __remove_all since we don't have access to the openat() family of POSIX functions.
Differential Revision: https://reviews.llvm.org/D132948
Mark de Wever [Thu, 1 Sep 2022 16:38:03 +0000 (18:38 +0200)]
[libc++] Shows the detailed compiler version info.
The libc++ pre-commit CI uses Clang nightly builds. Currently it's not
possible to determine the exact version used since CMake doesn't show
this information by default. Instead use the --version flag to get this
information.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133122
Sander de Smalen [Thu, 15 Sep 2022 15:17:23 +0000 (15:17 +0000)]
[AArch64][SME] Implement ABI for calls to/from streaming functions.
This patch implements the ABI for calls from:
Normal -> Streaming
Normal -> Streaming-compatible
Streaming -> Normal
Streaming -> Streaming-compatible
Streaming -> Streaming
The compiler inserts SMSTART/SMSTOP instructions before and after the call,
depending on the required transition.
More details about the SME attributes and design can be found
in D131562.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D131576
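Illustration only, not code from the patch: a C++ sketch of one reading of
the transitions listed above. It assumes that a mode change is needed only
when a normal caller calls a streaming callee or a streaming caller calls a
normal callee, and that a streaming-compatible callee never needs one. The
design document in D131562 is the authority; the enum and function names
here are hypothetical.

```cpp
#include <cassert>

enum class Interface { Normal, Streaming, StreamingCompatible };
enum class ModeChange { None, SMStart, SMStop };

// What to emit immediately before and after a call (hypothetical type).
struct CallTransition {
  ModeChange beforeCall;
  ModeChange afterCall;
};

// Covers callers with a normal or streaming interface only; calls from
// streaming-compatible functions need a runtime check and are handled by
// a separate patch.
inline CallTransition transitionFor(Interface caller, Interface callee) {
  assert(caller != Interface::StreamingCompatible);

  if (caller == Interface::Normal && callee == Interface::Streaming)
    return {ModeChange::SMStart, ModeChange::SMStop};

  if (caller == Interface::Streaming && callee == Interface::Normal)
    return {ModeChange::SMStop, ModeChange::SMStart};

  // Same mode on both sides, or a streaming-compatible callee that can
  // run in whichever mode the caller is already in.
  return {ModeChange::None, ModeChange::None};
}
```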
Florian Hahn [Fri, 16 Sep 2022 13:57:43 +0000 (14:57 +0100)]
[AArch64] Use tbl for truncating vector FPtoUI conversions.
On AArch64, doing the vector truncate separately after the fptoui
conversion can be lowered more efficiently using tbl.4, building on
D133495.
https://alive2.llvm.org/ce/z/T538CC
Depends on D133495
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D133496
sstwcw [Fri, 16 Sep 2022 13:18:21 +0000 (13:18 +0000)]
[clang-format] Fix template arguments in macros
Fixes https://github.com/llvm/llvm-project/issues/57738
old
```
#define FOO(typeName, realClass) \
{ \
#typeName, foo < FooType>(new foo <realClass>(#typeName)) \
}
```
new
```
#define FOO(typeName, realClass) \
{ #typeName, foo<FooType>(new foo<realClass>(#typeName)) }
```
Previously, when an UnwrappedLine began with a hash in a macro
definition, the program incorrectly assumed the line was a preprocessor
directive. In this case the hash is a stringification operator.
The rule in spaceRequiredBefore was added in 8b5297117b. Its purpose is
to add a space in an include directive. It also added a space to a
template opener when the line began with a stringification hash, so we
changed it.
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D133954
sstwcw [Sat, 10 Sep 2022 19:28:37 +0000 (19:28 +0000)]
[clang-format] Parse the else part of `#if 0`
Fixes https://github.com/llvm/llvm-project/issues/57539
Previously things outside of `#if` blocks were parsed as if only the
first branch of the conditional compilation branch existed, unless the
first condition is 0. In that case the outer parts would be parsed as
if nothing inside the conditional parts existed. Now we use the second
conditional branch if the first condition is 0.
Reviewed By: owenpan
Differential Revision: https://reviews.llvm.org/D133647
Adrian Vogelsgesang [Wed, 14 Sep 2022 10:21:34 +0000 (03:21 -0700)]
[libunwind] Fix usage of `_dl_find_object` on 32-bit x86
On 32-bit x86, `_dl_find_object` also returns a `dlfo_eh_dbase` address.
So far, compiling against a version of `_dl_find_object` which returns a
`dlfo_eh_dbase` was blocked using a `#if` + `#error`. This commit now
removes this compile time assertion and simply ignores the returned
`dlfo_eh_dbase`. All test cases are passing on a 32-bit build now.
According to https://www.gnu.org/software/libc/manual/html_node/Dynamic-Linker-Introspection.html,
`dlfo_eh_dbase` should be the base address for all DW_EH_PE_datarel
relocations. However, glibc/elf/dl-find_object.h says that eh_dbase
is the relocated DT_PLTGOT value. I don't understand how those two
statements fit together, but to fix 32-bit x86, ignoring `dlfo_eh_dbase`
seems to be good enough.
Fixes #57733
Differential Revision: https://reviews.llvm.org/D133846
Mark de Wever [Tue, 13 Sep 2022 15:40:18 +0000 (17:40 +0200)]
[NFC][libc++][test] Uses public functions.
Replaces std::__format_context_create with the public wrapper
test_format_context_create.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133781
Simon Pilgrim [Fri, 16 Sep 2022 12:03:06 +0000 (13:03 +0100)]
[CostModel][X86] Add CostKinds handling for vector integer comparisons
These were based on a mixture of vector integer add/sub costs and the
numbers from the 'cost-tables vs llvm-mca' script from D103695. The extra
costs for different predicates are still proving tricky to implement, but
I've gotten most costs to within +/-1 now. The AVX512 costs are tricky as
we still don't handle predicate results properly, so most of these were
done by hand.
Joseph Huber [Thu, 15 Sep 2022 23:28:52 +0000 (18:28 -0500)]
[Libomptarget] Revert changes to AMDGPU plugin destructors
These patches exposed a lot of problems in the AMD toolchain. Rather
than keep it broken we should revert it to its old semi-functional
state. This will prevent us from using device destructors but should
remove some new bugs. In the future this interface should be changed
once these problems are addressed more correctly.
This reverts commit ed0f21811544320f829124efbb6a38ee12eb9155.
This reverts commit 2b7203a35972e98b8521f92d2791043dc539ae88.
Fixes #57536
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D133997
Florian Hahn [Fri, 16 Sep 2022 11:42:49 +0000 (12:42 +0100)]
[AArch64] Lower vector trunc using tbl.
Similar to using tbl to lower vector ZExts, tbl4 can be used to lower
vector truncates.
The initial version supports i32->i8 conversions.
Depends on D120571
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D133495
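Illustration only, not the lowering itself: a C++ emulation of how a
byte-wise table lookup can perform an i32->i8 vector truncate. The 16
source lanes are laid out as 64 little-endian bytes, and the index vector
picks byte 0 of every lane. The function names are hypothetical, and the
out-of-range behaviour (result byte 0) models the AArch64 `tbl`
instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte-wise table lookup: each result byte is table[index], or 0 when the
// index is out of range.
template <std::size_t TableBytes, std::size_t N>
std::array<uint8_t, N> tableLookup(const std::array<uint8_t, TableBytes> &table,
                                   const std::array<uint8_t, N> &indices) {
  std::array<uint8_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = indices[i] < TableBytes ? table[indices[i]] : 0;
  return out;
}

// Truncates 16 x i32 to 16 x i8 with a single 64-byte table lookup.
inline std::array<uint8_t, 16>
truncViaTable(const std::array<uint32_t, 16> &lanes) {
  // Lay the lanes out as bytes in little-endian order, lane after lane.
  std::array<uint8_t, 64> bytes{};
  for (std::size_t i = 0; i < 16; ++i)
    for (std::size_t b = 0; b < 4; ++b)
      bytes[4 * i + b] = static_cast<uint8_t>((lanes[i] >> (8 * b)) & 0xFF);

  // Select the lowest byte of each lane: indices 0, 4, 8, ..., 60.
  std::array<uint8_t, 16> indices{};
  for (std::size_t i = 0; i < 16; ++i)
    indices[i] = static_cast<uint8_t>(4 * i);

  std::array<uint8_t, 16> out = tableLookup(bytes, indices);
  for (std::size_t i = 0; i < 16; ++i)
    assert(out[i] == static_cast<uint8_t>(lanes[i])); // matches a plain trunc
  return out;
}
```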
Aaron Ballman [Fri, 16 Sep 2022 11:19:30 +0000 (07:19 -0400)]
Fix the clang Sphinx bot
This addresses failures introduced by:
https://lab.llvm.org/buildbot/#/builders/92/builds/32809
It also fixes a secondary issue that crept in after the above build
started failing.
Kadir Cetinkaya [Thu, 15 Sep 2022 18:57:07 +0000 (20:57 +0200)]
[clang(d)] Include/Exclude CLDXC options properly
This handles the new CLDXC options that were introduced in
https://reviews.llvm.org/D128462 inside clang-tooling to make sure cl driver
mode is not broken.
Fixes https://github.com/clangd/clangd/issues/1292.
Differential Revision: https://reviews.llvm.org/D133962
Matheus Izvekov [Fri, 16 Sep 2022 10:03:34 +0000 (12:03 +0200)]
Revert "[clang] use getCommonSugar in an assortment of places"
This reverts commit aff1f6310e5f4cea92c4504853d5fd824754a74f.
Matheus Izvekov [Sun, 10 Oct 2021 13:28:37 +0000 (15:28 +0200)]
[clang] use getCommonSugar in an assortment of places
For this patch, a simple search was performed for patterns where there are
two types (usually an LHS and an RHS) which are structurally the same, and there
is some result type which is resolved as either one of them (typically LHS for
consistency).
We change those cases to resolve as the common sugared type between those two,
utilizing the new infrastructure created for this purpose.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D111509
Nikita Popov [Thu, 15 Sep 2022 15:00:55 +0000 (17:00 +0200)]
[CodeGen] Don't zero callee-save registers with zero-call-used-regs (PR57692)
Callee save registers must be preserved, so -fzero-call-used-regs
should not be zeroing them. The previous implementation only avoided
zeroing callee save registers that were saved and restored inside the
function, but we need to preserve all of them.
Fixes https://github.com/llvm/llvm-project/issues/57692.
Differential Revision: https://reviews.llvm.org/D133946
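Illustration only, not the CodeGen implementation: a C++ sketch of the rule
stated above as a set difference. Registers are modelled as plain names and
the function name is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Given the registers a zero-call-used-regs mode would consider and the
// target's callee-saved set, returns the registers that may be zeroed.
// Every callee-saved register is excluded, whether or not the function
// happens to save and restore it.
inline std::set<std::string>
registersToZero(const std::set<std::string> &candidates,
                const std::set<std::string> &calleeSaved) {
  std::set<std::string> result;
  for (const std::string &reg : candidates)
    if (calleeSaved.count(reg) == 0)
      result.insert(reg);

  for (const std::string &reg : result)
    assert(calleeSaved.count(reg) == 0); // nothing preserved is zeroed
  return result;
}
```

For example, with the x86-64 System V names `rax`, `rbx`, `rcx` and `r12`
as candidates and `rbx` and `r12` as callee-saved, only `rax` and `rcx`
remain eligible for zeroing.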
Stanislav Mekhanoshin [Thu, 15 Sep 2022 19:46:02 +0000 (12:46 -0700)]
[AMDGPU] Added __builtin_amdgcn_ds_bvh_stack_rtn
Differential Revision: https://reviews.llvm.org/D133966
Matheus Izvekov [Fri, 16 Sep 2022 09:37:55 +0000 (11:37 +0200)]
NFC: remove accidental inclusion of libcxx test changes
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>