David Blaikie [Wed, 2 Feb 2022 00:15:25 +0000 (16:15 -0800)]
Test fixes for prior patch
David Blaikie [Tue, 1 Feb 2022 02:27:39 +0000 (18:27 -0800)]
Revert "DebugInfo: Don't put types in type units if they reference internal linkage types"
This reverts commit ab4756338c5b2216d52d9152b2f7e65f233c4dac.
Breaks some cases, including this:
namespace {
template <typename> struct a {};
} // namespace
class c {
c();
};
class b {
b();
a<c> ax;
};
b::b() {}
c::c() {}
By producing a reference to a type unit for "c" but not producing the type unit.
Kirill Stoimenov [Tue, 1 Feb 2022 20:39:29 +0000 (20:39 +0000)]
Revert "[ASan] Not linking asan_static library for DSO."
This reverts commit cf730d8ce1341ba593144df2e2bc8411238e04c3. It turned out that D118184 is causing segfaults in some situations.
Reviewed By: vitalybuka, kda
Differential Revision: https://reviews.llvm.org/D118739
Bixia Zheng [Fri, 28 Jan 2022 18:56:50 +0000 (10:56 -0800)]
[mlir][taco] Add a utility to create an MLIR sparse tensor from a file.
Move the functions that retrieve the supporting C library, compile an MLIR
module and build a JIT execution engine to mlir_pytaco_utils.
Add a function to create an MLIR sparse tensor from a file and return a pointer
to the MLIR sparse tensor as well as the shape of the sparse tensor.
Add unit tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118496
Fangrui Song [Tue, 1 Feb 2022 23:11:16 +0000 (15:11 -0800)]
[Driver][test] Fix fatal-warnings.c CHECK lines and fold the test into as-warnings.c
Hongtao Yu [Tue, 1 Feb 2022 22:44:37 +0000 (14:44 -0800)]
Revert "[llvm-profgen] Clean up unnecessary memory reservations between phases."
This reverts commit 057e784b0962a7c5a17e858932bb6f03c7676c47.
Konstantin Varlamov [Tue, 1 Feb 2022 22:39:53 +0000 (14:39 -0800)]
[libc++][ranges][NFC] In the Ranges status, list the changes to stream.iterators
Sander de Smalen [Tue, 1 Feb 2022 17:27:01 +0000 (17:27 +0000)]
[LV] Allow a scalable VF for the epilogue.
For some reason we limited the epilogue VF to be fixed-width, but there
is not necessarily a reason for doing so. If the main VF=vscale x 16, the
epilogue VF could be either fixed-width or a scalable VF up to vscale x 8.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D118688
Jameson Nash [Tue, 1 Feb 2022 17:05:20 +0000 (12:05 -0500)]
Reland "enable plugins for clang-tidy"
This reverts commit ab3b89855c5318f0009e1f016ffe5b1483507fd0 but
disables the new test if the user has disabled support for building it.
Konstantin Varlamov [Tue, 1 Feb 2022 22:34:40 +0000 (14:34 -0800)]
[libc++][ranges][NFC] In the Ranges status, list the changes to predef.iterators
Anna Thomas [Tue, 1 Feb 2022 21:46:02 +0000 (16:46 -0500)]
[LoopFuse] Add assertion for non-null DT in fusion candidate
The code paths analyzed (all constructor invocations of fusion
candidate) pass in a non-null DT.
Adding this assert as requested in D118472 before converting this to a
reference argument.
Anna Thomas [Tue, 1 Feb 2022 21:29:22 +0000 (16:29 -0500)]
[LoopPeel] Use reference instead of pointer for DT argument
Cleanup code in peelLoop API. We already have usage of DT without guarding
against a null DT, so this change constant folds the remaining null DT
checks.
Also make the argument a reference so that it is clear the argument is
a nonnull DT.
Extracted from D118472.
Nikolas Klauser [Tue, 1 Feb 2022 21:38:27 +0000 (22:38 +0100)]
[libc++] Make _VSTD an alias for std
There is no practical difference between `_VSTD` and `std` so we should just remove `_VSTD`. This is the first step.
Reviewed By: ldionne, #libc
Spies: jeroen.dobbelaere, wmaxey, EricWF, lebedev.ri, __simt__, dim, mgrang, sstefan1, wenlei, smeenai, libcxx-commits, #libc_vendors
Differential Revision: https://reviews.llvm.org/D117811
Aaron Ballman [Tue, 1 Feb 2022 21:37:07 +0000 (16:37 -0500)]
Add ClangLinkerWrapper to the TOC to appease the Sphinx build bot
Rainer Orth [Tue, 1 Feb 2022 21:33:56 +0000 (22:33 +0100)]
[sanitizer_common][test] Enable tests on SPARC
Unfortunately, the `sanitizer_common` tests are disabled on many targets
that are supported by `sanitizer_common`, making it easy to miss issues
with that support. This patch enables SPARC testing.
Besides the enabling proper, the patch fixes (together with D91607
<https://reviews.llvm.org/D91607>) the failures of the `symbolize_pc.cpp`,
`symbolize_pc_demangle.cpp`, and `symbolize_pc_inline.cpp` tests. They
lack calls to `__builtin_extract_return_addr`. When those are added, they
`PASS` when compiled with `gcc`. `clang` incorrectly doesn't implement a
non-default `__builtin_extract_return_addr` on several targets, SPARC
included.
Because `__builtin_extract_return_addr(__builtin_return_addr(0))` is quite
a mouthful and I'm uncertain if the code needs to compile with MSVC, which
apparently has its own `_ReturnAddress`, I've introduced
`__sanitizer_return_addr` to hide the difference and complexity. Because
on 32-bit SPARC `__builtin_extract_return_addr` differs when the calling
function returns a struct, I've added a testcase for that.
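As a rough illustration of the kind of wrapper described above, here is a minimal sketch. The macro names `SKETCH_RETURN_ADDR` and `SKETCH_NOINLINE` and the function `callerPC` are hypothetical; the actual definition of `__sanitizer_return_addr` is not shown in this message and may differ.

```cpp
#include <cassert>

// Hypothetical wrapper names, for illustration only.
#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>
#define SKETCH_NOINLINE __declspec(noinline)
#define SKETCH_RETURN_ADDR() _ReturnAddress()
#else
#define SKETCH_NOINLINE __attribute__((noinline))
// __builtin_extract_return_addr undoes any target-specific encoding of the
// raw value produced by __builtin_return_address.
#define SKETCH_RETURN_ADDR() \
  __builtin_extract_return_addr(__builtin_return_address(0))
#endif

// Returns the address this function will return to, i.e. a PC in its caller.
SKETCH_NOINLINE void *callerPC() {
  void *pc = SKETCH_RETURN_ADDR();
  assert(pc != nullptr && "expected a valid return address");
  return pc;
}
```

Hiding the builtin pair behind one name keeps the tests readable and leaves a single place to adjust if a compiler needs different handling.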
There are a couple more tests failing on SPARC that I will deal with
separately.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D91608
Mark de Wever [Tue, 1 Feb 2022 21:32:49 +0000 (16:32 -0500)]
[libc++] Remove unneeded qualifier.
In D117811 @Quuxplusone pointed out the friend declarations don't need
to be qualified. Removing the qualification should avoid needing to add
a GCC work-around when changing _VSTD to std.
Reviewed By: Quuxplusone, philnik, #libc, ldionne
Differential Revision: https://reviews.llvm.org/D118719
Fangrui Song [Tue, 1 Feb 2022 21:24:39 +0000 (13:24 -0800)]
[hwasan][test] Remove obsoleted/removed -fno-experimental-new-pass-manager
Florian Hahn [Tue, 1 Feb 2022 21:02:41 +0000 (21:02 +0000)]
[GVN] Add additional tests after 216d1a729.
Further extend test coverage added in 216d1a729.
Hongtao Yu [Tue, 1 Feb 2022 04:24:45 +0000 (20:24 -0800)]
[llvm-profgen] Clean up unnecessary memory reservations between phases.
Cleaning up data structures that are not used after a certain point. This further brings down peak memory usage by 15% for a large benchmark.
Before:
note: Before parsePerfTraces
note: VM: 40.73 GB RSS: 39.18 GB
note: Before parseAndAggregateTrace
note: VM: 40.73 GB RSS: 39.18 GB
note: After parseAndAggregateTrace
note: VM: 88.93 GB RSS: 87.97 GB
note: Before generateUnsymbolizedProfile
note: VM: 88.95 GB RSS: 87.99 GB
note: After generateUnsymbolizedProfile
note: VM: 93.50 GB RSS: 92.53 GB
note: After computeSizeForProfiledFunctions
note: VM: 101.13 GB RSS: 99.36 GB
note: After generateProbeBasedProfile
note: VM: 215.61 GB RSS: 210.88 GB
note: After postProcessProfiles
note: VM: 237.48 GB RSS: 212.50 GB
After:
note: Before parsePerfTraces
note: VM: 40.73 GB RSS: 39.18 GB
note: Before parseAndAggregateTrace
note: VM: 40.73 GB RSS: 39.18 GB
note: After parseAndAggregateTrace
note: VM: 88.93 GB RSS: 87.96 GB
note: Before generateUnsymbolizedProfile
note: VM: 88.95 GB RSS: 87.97 GB
note: After generateUnsymbolizedProfile
note: VM: 93.50 GB RSS: 92.51 GB
note: After computeSizeForProfiledFunctions
note: VM: 93.50 GB RSS: 92.53 GB
note: After generateProbeBasedProfile
note: VM: 164.87 GB RSS: 163.55 GB
note: After postProcessProfiles
note: VM: 182.28 GB RSS: 179.43 GB
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D118677
Sanjay Patel [Tue, 1 Feb 2022 20:28:21 +0000 (15:28 -0500)]
[x86] add tests for fmul/fdiv with identity constant in select arm; NFC
Sanjay Patel [Tue, 1 Feb 2022 16:20:27 +0000 (11:20 -0500)]
[x86] add more tests for select with identity constant; NFC
D118644
Daniel Resnick [Thu, 27 Jan 2022 00:13:24 +0000 (17:13 -0700)]
[mlir][capi] Add DialectRegistry to MLIR C-API
Exposes mlir::DialectRegistry to the C API as MlirDialectRegistry along with
helper functions. A hook has been added to MlirDialectHandle that inserts
the dialect into a registry.
A future possible change is removing mlirDialectHandleRegisterDialect in
favor of using mlirDialectHandleInsertDialect, which it is now implemented with.
Differential Revision: https://reviews.llvm.org/D118293
Stanislav Mekhanoshin [Mon, 31 Jan 2022 23:10:08 +0000 (15:10 -0800)]
[AMDGPU] Check atomics aliasing in the clobbering annotation
MemorySSA considers any atomic a def to any operation it dominates
just like a barrier or fence. That is correct from memory state
perspective, but not required for the no-clobber metadata since
we are not using it for reordering. Skip such atomics during the
scan just like a barrier if it does not alias with the load.
Differential Revision: https://reviews.llvm.org/D118661
Louis Dionne [Wed, 26 Jan 2022 16:07:49 +0000 (11:07 -0500)]
[libc++] Fix TOCTOU issue with std::filesystem::remove_all
https://bugs.chromium.org/p/llvm/issues/detail?id=19
rdar://87912416
Differential Revision: https://reviews.llvm.org/D118134
Louis Dionne [Mon, 24 Jan 2022 20:49:56 +0000 (15:49 -0500)]
[libc++][ci] Re-enable the bootstrapping build
Differential Revision: https://reviews.llvm.org/D118067
Florian Hahn [Tue, 1 Feb 2022 20:24:19 +0000 (20:24 +0000)]
[GVN] Add tests for D118143 not requiring loops.
David Green [Tue, 1 Feb 2022 20:18:40 +0000 (20:18 +0000)]
Revert "[DAG] Extend SearchForAndLoads with any_extend handling"
This reverts commit 100763a88fe97b22cd5e3f69d203669aac3ed48f as it was
making incorrect assumptions about implicit zero_extends.
Arthur O'Dwyer [Tue, 18 Jan 2022 12:25:17 +0000 (07:25 -0500)]
[clang] Don't typo-fix an expression in a SFINAE context.
If this is a SFINAE context, then continuing to look up names
(in particular, to treat a non-function as a function, and then
do ADL) might too-eagerly complete a type that it's not safe to
complete right now. We should just say "okay, that's a substitution
failure" and not do any more work than absolutely required.
Fixes #52970.
Differential Revision: https://reviews.llvm.org/D117603
Arthur O'Dwyer [Fri, 28 Jan 2022 20:51:19 +0000 (15:51 -0500)]
[clang] Correctly(?) handle placeholder types in ExprRequirements.
Bug #52905 was originally papered over in a different way, but
I believe this is the actually proper fix, or at least closer to
it. We need to detect placeholder types as close to the front-end
as possible, and cause them to fail constraints, rather than letting
them persist into later stages.
Fixes #52905.
Fixes #52909.
Fixes #53075.
Differential Revision: https://reviews.llvm.org/D118552
Arthur O'Dwyer [Sat, 22 Jan 2022 19:33:12 +0000 (14:33 -0500)]
[libc++] Fix LWG3589 "The const lvalue reference overload of get for subrange..."
https://cplusplus.github.io/LWG/issue3589
Differential Revision: https://reviews.llvm.org/D117961
Florian Mayer [Mon, 31 Jan 2022 21:10:41 +0000 (13:10 -0800)]
[hwasan] work around lifetime issue with setjmp.
setjmp can return twice, but PostDominatorTree is unaware of this. As
such, it overestimates postdominance, leaving some cases (see attached
compiler-rt test) where memory does not get untagged on return. This causes
false positives later in the program execution.
This is a crude workaround to unblock use-after-scope for now. In the
longer term, PostDominatorTree should be made aware of returns_twice
functions, as this may cause problems elsewhere.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D118647
Valentin Clement [Tue, 1 Feb 2022 19:53:00 +0000 (20:53 +0100)]
[flang] Lower basic STOP statement
This patch lowers the STOP statement without arguments
and ERROR STOP. Lowering of the STOP statement with arguments will
come in later patches since it requires some expression lowering
to be added.
STOP statement is lowered to a runtime call.
Also makes sure we are creating a constant in the MLIR arith constant.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan, schweitz
Differential Revision: https://reviews.llvm.org/D118697
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Peter Klausler [Tue, 1 Feb 2022 19:51:19 +0000 (11:51 -0800)]
[flang] Fix/work around warnings from GCC 11
Apply part of a pending patch for GCC 11 warnings, and
rework a piece of code, to dodge warnings on flag from
GCC 11 build bots exposed by a recent patch.
Applying without review to get bots working again; changes
also tested against GCC 9.3.0.
Stanislav Mekhanoshin [Fri, 28 Jan 2022 00:27:43 +0000 (16:27 -0800)]
[AMDGPU] Allow scalar loads after barrier
Currently we cannot convert a vector load into scalar if there
is dominating barrier or fence. It is considered a clobbering
memory access to prevent memory operations reordering. While
reordering is not possible the actual memory is not being clobbered
by a barrier or fence and we can still use a scalar load for a
uniform pointer.
The solution is not to bail on a first clobbering access but
traverse MemorySSA to the root excluding barriers and fences.
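The walk described above can be sketched on a toy model. This is a simplification under stated assumptions: the types `AccessKind` and `MemDef` and the function `isClobbered` are hypothetical, the def chain is linear (no MemoryPhi nodes), and every store is treated as a clobber without any alias query. It is not the AMDGPU annotation pass itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for MemorySSA defs.
enum class AccessKind { LiveOnEntry, Store, Barrier, Fence };

struct MemDef {
  AccessKind kind;
  int definingAccess; // index of the next def towards the root, -1 at root
};

// Walks from `start` to the root. Barriers and fences are skipped because
// they order memory operations but do not write memory themselves.
bool isClobbered(const std::vector<MemDef> &defs, int start) {
  for (int cur = start; cur >= 0; cur = defs[cur].definingAccess) {
    assert(cur < static_cast<int>(defs.size()) && "dangling def index");
    switch (defs[cur].kind) {
    case AccessKind::LiveOnEntry:
      return false; // reached the root without seeing a real write
    case AccessKind::Barrier:
    case AccessKind::Fence:
      continue; // not a clobber: keep walking up the chain
    case AccessKind::Store:
      return true; // a real write: the load may observe changed memory
    }
  }
  return false;
}
```

Bailing out at the first def would report a clobber for a chain ending in a fence, while the walk only reports one if a store is actually found.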
Differential Revision: https://reviews.llvm.org/D118419
Jeremy Morse [Tue, 1 Feb 2022 19:39:09 +0000 (19:39 +0000)]
[DebugInfo][InstrRef][NFC] Bypass a frequently-noop loop
Bypass this loop if it would do nothing -- if there are no register masks
to be examined, there's no point looking at each location to see if the
location has been def'd. Awkwardly, this was responsible for almost an
entire half a percent of performance improvement on CTMark.
Differential Revision: https://reviews.llvm.org/D118613
Jeremy Morse [Tue, 1 Feb 2022 19:19:20 +0000 (19:19 +0000)]
[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out
In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.
It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the amount of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.
In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then choose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.
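A minimal sketch of the idea follows, assuming a hypothetical `SpillSlotNumbering` class and `trackSpill` caller, and using `std::optional` in place of LLVM's `None`. The real InstrRefBasedLDV data structures and the name of the cl::opt are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>

// Hypothetical stand-in for the stack-slot numbering code.
class SpillSlotNumbering {
  std::size_t maxSlots;                  // upper bound on tracked slots
  std::unordered_map<int, unsigned> ids; // frame index -> slot number

public:
  explicit SpillSlotNumbering(std::size_t limit) : maxSlots(limit) {}

  // Returns the number for a slot, or nullopt once the limit is reached.
  std::optional<unsigned> getOrTrack(int frameIndex) {
    auto it = ids.find(frameIndex);
    if (it != ids.end())
      return it->second;
    if (ids.size() >= maxSlots)
      return std::nullopt; // over the limit: refuse to track this slot
    unsigned id = static_cast<unsigned>(ids.size());
    ids.emplace(frameIndex, id);
    assert(ids.size() <= maxSlots);
    return id;
  }
};

// A caller that gives up on a spill when its slot is untracked.
bool trackSpill(SpillSlotNumbering &slots, int frameIndex) {
  std::optional<unsigned> slot = slots.getOrTrack(frameIndex);
  return slot.has_value(); // false means the variable location is dropped
}
```

With a limit of zero, every call to `trackSpill` returns false, which is the situation the added test exercises.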
Differential Revision: https://reviews.llvm.org/D118601
Matt Morehouse [Tue, 1 Feb 2022 19:23:36 +0000 (11:23 -0800)]
[HWASan] Properly handle musttail calls.
Fixes a compile error when the `clang::musttail` attribute is used.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D118712
Chris Bieneman [Mon, 31 Jan 2022 22:15:47 +0000 (16:15 -0600)]
[NFC] These tests require a default target
These test cases all rely on a default target being specified. Adding
the requirement gets the tests properly skipped when
LLVM_DEFAULT_TARGET_TRIPLE is unset.
Shubham Sandeep Rastogi [Tue, 1 Feb 2022 18:30:28 +0000 (10:30 -0800)]
Change namespace llvm::swift to namespace llvm::binaryformat because of clashes with the apple/llvm-project repository
The namespace llvm::swift is causing errors to pop up in the apple/llvm-project build when cherry-picking 4ce1f3d47c33 into apple/llvm-project.
Differential Revision: https://reviews.llvm.org/D118716
Chris Bieneman [Mon, 31 Jan 2022 22:12:41 +0000 (16:12 -0600)]
[NFC] Use llvm-as instead of llc
llvm-as does everything this test requires, but doesn't depend on a
target being registered. This gets the test passing when
LLVM_DEFAULT_TARGET_TRIPLE is unset.
Anna Thomas [Fri, 28 Jan 2022 21:45:04 +0000 (13:45 -0800)]
[InstCombine] Remove weaker fence adjacent to a stronger fence
We have an instCombine rule to remove identical consecutive fences.
We can extend this to remove a weaker fence when it is adjacent to a
stronger fence.
As stated in the LangRef, a fence with a stronger ordering also implies
ordering weaker than itself: "A fence which has seq_cst ordering, in addition to
having both acquire and release semantics specified above, participates in the
global program order of other seq_cst operations and/or fences."
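To make the "stronger implies weaker" relation concrete, here is a small sketch. The `Ordering` enum and both functions are hypothetical and are not InstCombine's code; the sketch models only the ordering, ignores synchronization scopes, and generalizes from a pair of fences to a run of consecutive fences.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of fence orderings; not LLVM's AtomicOrdering enum.
enum class Ordering { Acquire, Release, AcqRel, SeqCst };

// True if a fence with ordering `a` gives every guarantee of one with `b`.
// Acquire and Release are not comparable with each other.
bool subsumes(Ordering a, Ordering b) {
  if (a == b || a == Ordering::SeqCst)
    return true;
  return a == Ordering::AcqRel &&
         (b == Ordering::Acquire || b == Ordering::Release);
}

// Given a run of consecutive fences, drops each fence that a neighbouring
// fence subsumes.
std::vector<Ordering>
dropWeakerAdjacentFences(const std::vector<Ordering> &fences) {
  std::vector<Ordering> out;
  for (Ordering f : fences) {
    if (!out.empty() && subsumes(out.back(), f))
      continue; // the previous fence is at least as strong: drop f
    while (!out.empty() && subsumes(f, out.back()))
      out.pop_back(); // f is stronger: it makes the previous fence redundant
    out.push_back(f);
  }
  assert(out.size() <= fences.size());
  return out;
}
```

An acquire fence next to a release fence is left alone, since neither subsumes the other.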
Reviewed-By: reames
Differential Revision: https://reviews.llvm.org/D118607
Jeremy Morse [Tue, 1 Feb 2022 18:55:08 +0000 (18:55 +0000)]
[DebugInfo][InstrRef][NFC] Don't build a map of un-needed values
When finding locations for variable values at the start of a block, we
build a large map of every value to every location, and then pick out the
locations for values that are desired. This takes up quite a lot of time,
because, unsurprisingly, there are usually more values in registers and
stack slots than there are variables.
This patch instead creates a map of desired values to their locations,
which are initially illegal locations. Then, as we examine every available
value, we can select locations for values we care about, and ignore those
that we don't. This substantially reduces the amount of work done (i.e.,
building a map up of values to locations that nothing wants or needs).
Geomean performance improvement of 1% on CTMark, woo.
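The change in approach can be sketched as follows. `ValueID`, `LocID`, `IllegalLoc` and `pickLocations` are hypothetical names, and the sketch simply keeps the first location seen for each value, whereas the real code may prefer some locations over others.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using ValueID = std::uint64_t;   // hypothetical stand-in for a value number
using LocID = std::int32_t;      // hypothetical stand-in for a location
constexpr LocID IllegalLoc = -1; // marks "no location found yet"

// `available` lists (location, value held there) pairs at block entry.
// Returns a location for each desired value, or IllegalLoc if none holds it.
std::unordered_map<ValueID, LocID>
pickLocations(const std::vector<ValueID> &desired,
              const std::vector<std::pair<LocID, ValueID>> &available) {
  std::unordered_map<ValueID, LocID> result;
  for (ValueID v : desired)
    result.emplace(v, IllegalLoc); // the map is keyed by wanted values only
  for (const auto &[loc, val] : available) {
    assert(loc != IllegalLoc && "available locations must be valid");
    auto it = result.find(val);
    if (it == result.end())
      continue; // nothing wants this value: no map entry is ever built
    if (it->second == IllegalLoc)
      it->second = loc; // keep the first location that holds the value
  }
  return result;
}
```

The map never grows beyond the number of desired values, however many registers and stack slots are scanned.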
Differential Revision: https://reviews.llvm.org/D118597
Joseph Huber [Tue, 1 Feb 2022 16:46:20 +0000 (11:46 -0500)]
[OpenMP] Add kernel string attribute to kernel function
This patch adds a function attribute to the kernel function generated in
OpenMP offloading. We already create a `nvvm.annotations` metadata node
indicating the kernels present in the program. However, this created
some indirection when trying to identify if a specific function was an
entry. We add a single function attribute for each function now to
simplify this.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118708
Jez Ng [Tue, 1 Feb 2022 18:45:38 +0000 (13:45 -0500)]
[lld-macho][nfc] Comments and style fixes
Added some comments (particularly around finalize() and
finalizeContents()) as well as doing some rephrasing / grammar fixes for
existing comments.
Also did some minor style fixups, such as by putting methods together in
a class definition and having fields of similar types next to each
other.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D118714
Tanya Lattner [Tue, 1 Feb 2022 18:45:40 +0000 (10:45 -0800)]
Update status of move.
Fangrui Song [Tue, 1 Feb 2022 18:41:16 +0000 (10:41 -0800)]
[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.
Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.
Respect dso_preemptable and suppress optimization to fix the above issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```
While here, refine the condition for the alias as well.
For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.
For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or are preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```
There are other places where GlobalAlias isInterposable usage may need to be
fixed.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D107249
Mahesh Ravishankar [Tue, 1 Feb 2022 16:54:05 +0000 (16:54 +0000)]
Avoid doing tile + fuse if tile sizes are zero.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D118576
Chris Bieneman [Mon, 31 Jan 2022 22:11:11 +0000 (16:11 -0600)]
[NFC] Add CFGuard to opt build
If you don't include a target that directly references CFGuard it
doesn't get built into opt or the llvm library build, which causes some
test cases to fail.
Including this in opt explicitly resolves those issues.
Fangrui Song [Tue, 1 Feb 2022 18:23:45 +0000 (10:23 -0800)]
[AMDGPU][test] Add dso_local to prevent preemptible alias resolution
Fangrui Song [Tue, 1 Feb 2022 18:19:30 +0000 (10:19 -0800)]
[ELF] Update flag propagation rule to ignore discarded output sections
See the updated insert-before.test for the effects: many synthetic
sections are SHF_ALLOC|SHF_WRITE. If they are discarded, we don't want
to propagate their flags to subsequent output section descriptions.
`getFirstInputSection(sec) == nullptr` can technically be merged into
`isDiscardable` but I'd like to postpone that as not sharing code may give more
refactoring opportunity.
Depends on D118529.
Reviewed By: peter.smith, bluca
Differential Revision: https://reviews.llvm.org/D118530
Fangrui Song [Tue, 1 Feb 2022 18:16:12 +0000 (10:16 -0800)]
[ELF] Rename adjustSectionsBeforeSorting to adjustOutputSections and make it affect INSERT commands
adjustSectionsBeforeSorting updates some output section attributes
(alignment/flags) and removes discardable empty sections. When it is called,
INSERT commands have not been processed. Therefore the flags propagation rule
may not affect output sections defined in an INSERT command properly.
Fix this by moving processInsertCommands before adjustSectionsBeforeSorting.
adjustSectionsBeforeSorting is somewhat misnamed. The order between it and
sortInputSections does not matter. With the pass shuffle, the name of
adjustSectionsBeforeSorting becomes wrong. Therefore rename it. The new
name is not set in stone. The function mixes several tasks and the
code may be refactored in a way that we may give them more meaningful
names.
With this patch, I think the behavior of attribute propagation becomes more
reasonable. In particular, in the absence of non-INSERT SECTIONS,
inserting a section after a SHF_ALLOC one will give us a SHF_ALLOC section,
not a non-SHF_ALLOC one (see linkerscript/insert-after.test).
Reviewed By: peter.smith, bluca
Differential Revision: https://reviews.llvm.org/D118529
David Green [Tue, 1 Feb 2022 18:15:34 +0000 (18:15 +0000)]
[AArch64] Add some CCMP testing. NFC
Fangrui Song [Tue, 1 Feb 2022 18:10:22 +0000 (10:10 -0800)]
[ELF] Deduplicate names of local symbols only with -O2
The deduplication requires a DenseMap the same size as the local part of
.strtab. I optimized it in e20544543478b259eb09fa0a253d4fb1a5525d9e, but it is
still quite slow.
For Release build of clang, deduplication makes .strtab 1.1% smaller and makes the link 3% slower.
For chrome, deduplication makes .strtab 0.1% smaller and makes the link 6% slower.
I suggest that we only perform the optimization with -O2 (default is -O1).
Not deduplicating local symbol names will simplify parallel symbol table write.
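The trade-off can be illustrated with a toy string-table builder. `StringTable` here is hypothetical and uses `std::unordered_map` rather than a DenseMap; it is not lld's implementation, only a sketch of why deduplication costs a lookup per name while saving bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical string-table builder, for illustration only.
class StringTable {
  std::string data = std::string(1, '\0'); // offset 0 is the empty name
  std::unordered_map<std::string, std::size_t> offsets; // used if dedup
  bool dedup;

public:
  explicit StringTable(bool deduplicate) : dedup(deduplicate) {}

  // Appends `name` and returns its offset; reuses an earlier copy when
  // deduplication is enabled.
  std::size_t add(const std::string &name) {
    assert(name.find('\0') == std::string::npos && "no embedded NUL");
    if (dedup) {
      auto it = offsets.find(name);
      if (it != offsets.end())
        return it->second; // same name seen before: share its bytes
    }
    std::size_t off = data.size();
    data += name;
    data.push_back('\0');
    if (dedup)
      offsets.emplace(name, off);
    return off;
  }

  std::size_t size() const { return data.size(); }
};
```

A driver could construct the table with `deduplicate` set only at -O2 or higher. Without the map, each `add` is a plain append, which is also what makes a parallel write simpler.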
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D118577
Fangrui Song [Tue, 1 Feb 2022 17:59:51 +0000 (09:59 -0800)]
[llvm-ar] -s: don't convert a thin archive to a regular one
In binutils, ar -s and ranlib don't convert a thin archive to a regular one.
This behavior makes sense and this patch ports the behavior.
Reviewed By: gbreynoo
Differential Revision: https://reviews.llvm.org/D117443
Fangrui Song [Tue, 1 Feb 2022 17:56:50 +0000 (09:56 -0800)]
[llvm-ar] Add --thin for creating a thin archive
In GNU ar (since 2008), the modifier 'T' means creating a thin archive.
In many other ar implementations (FreeBSD, macOS, elfutils, etc), -T
means "allow filename truncation of extracted files", as specified by
X/Open System Interface.
For portability, 'T' with thin archive semantics should be avoided.
See https://sourceware.org/bugzilla/show_bug.cgi?id=28759 binutils 2.38
will deprecate 'T' (without diagnostic) and add --thin.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D116979
Alexey Bataev [Thu, 16 Dec 2021 16:55:52 +0000 (08:55 -0800)]
[SLP]Alternate vectorization for cmp instructions.
Added support for alternate ops vectorization of the cmp instructions.
It allows vectorizing either cmp instructions with same/swapped
predicate but different (swapped) operands kinds or cmp instructions
with different predicates and compatible operands kinds.
Differential Revision: https://reviews.llvm.org/D115955
Fangrui Song [Tue, 1 Feb 2022 17:53:28 +0000 (09:53 -0800)]
[ELF] Simplify code with invokeELFT. NFC
Krzysztof Parzyszek [Tue, 1 Feb 2022 17:46:34 +0000 (09:46 -0800)]
[Hexagon] Punt on registers without reaching defs in addr mode opt
This fixes https://github.com/llvm/llvm-project/issues/52636.
Josh Mottley [Fri, 7 Jan 2022 17:06:47 +0000 (17:06 +0000)]
[flang] Upstream partial lowering of EXIT intrinsic
This patch adds partial lowering of the "EXIT" intrinsic to
the backend runtime hook implemented in patch D110741. It also adds a
helper function to the `RuntimeCallTestBase.h` for testing for an
intrinsic function call in a `mlir::Block`.
Differential Revision: https://reviews.llvm.org/D118141
Fangrui Song [Tue, 1 Feb 2022 17:47:56 +0000 (09:47 -0800)]
[ELF] De-template LinkerDriver::link. NFC
Replace `f<ELFT>(x)` with `invokeELFT(f, x)`.
The size reduction comes from turning `link` from 4 specializations into 1.
My x86-64 lld executable is 26KiB smaller.
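The general pattern is to keep the driver function non-templated and dispatch to a template only at the call sites that need the ELF type. The sketch below assumes hypothetical tag types and a `dispatchELFT` helper built on a generic lambda; lld's real helper is not shown here and may be implemented differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical runtime tag and compile-time ELF descriptions.
enum class ELFKind { ELF32LE, ELF32BE, ELF64LE, ELF64BE };

struct ELF32LETag { static constexpr unsigned bits = 32; static constexpr bool little = true; };
struct ELF32BETag { static constexpr unsigned bits = 32; static constexpr bool little = false; };
struct ELF64LETag { static constexpr unsigned bits = 64; static constexpr bool little = true; };
struct ELF64BETag { static constexpr unsigned bits = 64; static constexpr bool little = false; };

// Calls `f` with the tag type that matches the runtime `kind`.
template <typename Fn> auto dispatchELFT(ELFKind kind, Fn &&f) {
  switch (kind) {
  case ELFKind::ELF32LE: return f(ELF32LETag{});
  case ELFKind::ELF32BE: return f(ELF32BETag{});
  case ELFKind::ELF64LE: return f(ELF64LETag{});
  case ELFKind::ELF64BE: return f(ELF64BETag{});
  }
  assert(false && "unknown ELF kind");
  std::abort();
}

// A step that genuinely depends on the ELF type stays a template.
template <class ELFT> std::string describe(int numFiles) {
  return std::to_string(ELFT::bits) + "-bit " +
         (ELFT::little ? "LE" : "BE") + ", files=" + std::to_string(numFiles);
}

// The driver itself is instantiated once, not once per ELF type.
std::string linkSketch(ELFKind kind, int numFiles) {
  return dispatchELFT(
      kind, [&](auto tag) { return describe<decltype(tag)>(numFiles); });
}
```

Only the small `describe` step is instantiated four times; the body of `linkSketch` exists once, which is where a size reduction of this kind comes from.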
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D118551
Jonas Paulsson [Tue, 1 Feb 2022 17:28:13 +0000 (11:28 -0600)]
[TableGen] Fix reporting from CodeGenSchedModels::checkCompleteness().
Make the check for a complete SchedModel work as expected: report any
supported instruction not having scheduler info.
For unclear reasons there was a variable 'HadCompleteModel' that caused
e.g. new instructions for a new subtarget not to be reported. This variable
is now simply removed as all in-tree targets seem to build fine without it.
Review: Simon Pilgrim
Differential Revision: https://reviews.llvm.org/D118628
Alexey Bataev [Tue, 1 Feb 2022 17:27:11 +0000 (09:27 -0800)]
[SLP][NFC]Add a test for alternate vectorization in cmp instructions with same/swapped predicate.
Alexander Belyaev [Tue, 1 Feb 2022 17:07:33 +0000 (18:07 +0100)]
Revert "Revert "[mlir] Purge `linalg.copy` and use `memref.copy` instead.""
This reverts commit 25bf6a2a9bc6ecb3792199490c70c4ce50a94aea.
Nikolas Klauser [Tue, 1 Feb 2022 17:11:49 +0000 (18:11 +0100)]
[libc++][NFC] Add namespace comments in ranges
With this patch there should be no more namespaces without a closing comment.
Reviewed By: ldionne, Quuxplusone, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D118668
Peter Steinfeld [Tue, 1 Feb 2022 15:38:10 +0000 (07:38 -0800)]
[flang] Rename the runtime routine that reports a fatal user error
As per Steve Scalpone's suggestion, I've renamed the runtime routine to
better evoke its purpose.
I implemented a routine called "Crash" and added a test.
Differential Revision: https://reviews.llvm.org/D118703
Steven Wan [Tue, 1 Feb 2022 16:23:50 +0000 (11:23 -0500)]
[NFC][AIX]Disable failed tests due to aggressive byval alignment warning on AIX
These tests emit unexpected diagnostics on AIX because the byval alignment warning is emitted too aggressively. https://reviews.llvm.org/D118350 is supposed to provide a proper fix to the problem, but for the time being disable the tests to unblock.
Differential Revision: https://reviews.llvm.org/D118670
Alex Zinenko [Wed, 19 Jan 2022 12:42:49 +0000 (13:42 +0100)]
[mlir] Better error message in PybindAdaptors.h
When attempting to cast a pybind11 handle to an MLIR C API object through
capsules, the binding code would attempt to directly access the "_CAPIPtr"
attribute on the object, leading to a rather obscure AttributeError when the
attribute was missing, e.g., on non-MLIR types. Check for its presence and
throw a TypeError instead.
Depends On D117646
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D117658
Jake Egan [Tue, 1 Feb 2022 16:17:35 +0000 (11:17 -0500)]
[AIX] Bump DWARF versions to 3 because XCOFF64 requires DWARF64
DWARF64 was implemented at version 3, so if a DWARF version less than 3 is specified, DWARF64 does not get selected. Since XCOFF64 requires DWARF64, the modified tests fail on 64-bit AIX. This patch bumps these tests to dwarf version 3 to maintain test coverage on 64-bit AIX.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114110
Olle Fredriksson [Tue, 1 Feb 2022 16:01:15 +0000 (11:01 -0500)]
[DFAJumpThreading] make update order deterministic
We tracked down some non-determinism in compilation output to the
DFAJumpThreading pass. These changes fixed our issue:
* Make the DefMap type a MapVector to make its iteration order depend on
insertion order.
* Sort the values to be inserted into NewDefs by instruction order to
make the insertion order deterministic. Since these values come from
iterating over a ValueMap, which doesn't have deterministic iteration
order, I couldn't fix this at its source.
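Both fixes can be sketched together. `InsertionOrderedMap` is a minimal stand-in written in the spirit of `llvm::MapVector` and is not its real implementation; `Def` and `deterministicNames` are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: iteration follows insertion order.
template <typename K, typename V> class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> index; // key -> position in `entries`
  std::vector<std::pair<K, V>> entries;     // kept in insertion order

public:
  V &operator[](const K &key) {
    auto [it, inserted] = index.emplace(key, entries.size());
    if (inserted)
      entries.emplace_back(key, V{});
    assert(index.size() == entries.size());
    return entries[it->second].second;
  }
  auto begin() const { return entries.begin(); }
  auto end() const { return entries.end(); }
};

struct Def {
  unsigned order;   // position of the defining instruction
  std::string name; // hypothetical stand-in for the defined value
};

// Sorts by instruction order before inserting, so the resulting iteration
// order does not depend on the order the inputs arrived in.
std::vector<std::string> deterministicNames(std::vector<Def> defs) {
  std::sort(defs.begin(), defs.end(),
            [](const Def &a, const Def &b) { return a.order < b.order; });
  InsertionOrderedMap<std::string, unsigned> newDefs;
  for (const Def &d : defs)
    newDefs[d.name] = d.order;
  std::vector<std::string> names;
  for (const auto &entry : newDefs)
    names.push_back(entry.first);
  return names;
}
```

The sort compensates for an unordered source, and the insertion-ordered map preserves that order for every later iteration.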
Reviewed By: alexey.zhikhar
Differential Revision: https://reviews.llvm.org/D118590
Mircea Trofin [Tue, 1 Feb 2022 15:58:00 +0000 (07:58 -0800)]
[nfc][regalloc] Move DefaultEvictionAdvisor::* to RegAllocEvictionAdvisor.cpp
This is leftover from the advisor refactoring. Straight-forward copy and
paste.
Craig Topper [Tue, 1 Feb 2022 15:39:51 +0000 (07:39 -0800)]
[RISCV] Add missing words to comment. NFC
Craig Topper [Tue, 1 Feb 2022 15:23:35 +0000 (07:23 -0800)]
[RISCV] Fix a vsetvli insertion bug involving loads/stores.
The first phase of the analysis can avoid a vsetvli if an earlier
instruction in the block used an SEW and LMUL that when combined with
the EEW of the load/store would produce the desired EMUL. If we
avoided a vsetvli this will affect the global analysis we do in the
second phase.
The third phase where we really insert the vsetvlis needs to agree
with the first phase. If it doesn't we can insert vsetvlis that
invalidate the global analysis.
In the test case there is a VSETVLI in the preheader that sets
SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1.
This VADD is followed by a store that wants SEW=32 LMUL=1/2.
Because it has EEW=32 as part of the opcode, the SEW=64 LMUL=1 from the
VADD can become EMUL=1/2 for the store. So the first phase determines no
vsetvli is needed.
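For reference, the RISC-V vector specification defines EMUL = (EEW / SEW) * LMUL, which for EEW=32, SEW=64, LMUL=1 evaluates to 1/2, the LMUL the store asks for at SEW=32. The sketch below works through that arithmetic; `Ratio`, `computeEMUL` and `storeCanReuseState` are hypothetical names, and the check is far simpler than the pass's actual compatibility logic.

```cpp
#include <cassert>
#include <numeric>

// LMUL and EMUL may be fractional, so model them as num/den.
struct Ratio {
  unsigned num;
  unsigned den;
};

bool operator==(Ratio a, Ratio b) { return a.num * b.den == b.num * a.den; }

// EMUL = (EEW / SEW) * LMUL, reduced to lowest terms.
Ratio computeEMUL(unsigned eew, unsigned sew, Ratio lmul) {
  assert(sew != 0 && lmul.den != 0 && "SEW and LMUL must be non-zero");
  unsigned num = eew * lmul.num;
  unsigned den = sew * lmul.den;
  unsigned g = std::gcd(num, den);
  return Ratio{num / g, den / g};
}

// True if a store with element width `eew` that wants `wantedLMUL` gets that
// EMUL from the current SEW/LMUL, so no new vsetvli is needed for it.
bool storeCanReuseState(unsigned eew, Ratio wantedLMUL, unsigned curSEW,
                        Ratio curLMUL) {
  return computeEMUL(eew, curSEW, curLMUL) == wantedLMUL;
}
```

With SEW=32 LMUL=1 as the current state instead, the same store would get EMUL=1 and the check would fail.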
The third phase manages CurInfo differently than BBInfo.Change from the
first phase. CurInfo is only updated when we see a vsetvli or insert
a vsetvli. This was done to allow predecessor block information from
the global analysis to be applied to multiple instructions. Since the
loop body has no vsetvli we won't update CurInfo for either the VADD
or the VSE. This prevented us from checking the store vsetvli elision
for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which
invalidated the global analysis.
To mitigate this, I've added a BBLocalInfo variable that more closely
matches the first phase propagation. This gets updated based on the
VADD and prevents emitting a vsetvli for the store like we did in the
first phase.
I wonder if we should do an earlier phase to handle the load/store case
by adding more pseudo opcodes and changing the SEW/LMUL for those
instructions before the insertion analysis. That might be more robust
than trying to guarantee two phases make the same decision.
Fixes the test from D118629.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D118667
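The store elision check described in this message can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `Fraction`, `computeEMUL` and `canSkipVSETVLI` are hypothetical names, and the only fact taken from the RISC-V vector spec is EMUL = (EEW / SEW) * LMUL.

```cpp
#include <cassert>

// Hypothetical helper type (not an LLVM type): a register group multiplier
// kept as a reduced fraction so that values such as 1/2 are representable.
struct Fraction {
  unsigned Num;
  unsigned Den;
};

inline bool operator==(Fraction A, Fraction B) {
  return A.Num == B.Num && A.Den == B.Den;
}

inline unsigned gcd(unsigned A, unsigned B) {
  while (B != 0) {
    unsigned T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// EMUL = (EEW / SEW) * LMUL, reduced to lowest terms.
inline Fraction computeEMUL(unsigned EEW, unsigned SEW, Fraction LMUL) {
  assert(EEW != 0 && SEW != 0 && LMUL.Num != 0 && LMUL.Den != 0);
  unsigned Num = EEW * LMUL.Num;
  unsigned Den = SEW * LMUL.Den;
  unsigned G = gcd(Num, Den);
  return {Num / G, Den / G};
}

// A load/store that encodes its EEW in the opcode can reuse the current
// SEW/LMUL if they already yield the register group size it wants.
inline bool canSkipVSETVLI(unsigned EEW, unsigned CurSEW, Fraction CurLMUL,
                           Fraction WantedEMUL) {
  return computeEMUL(EEW, CurSEW, CurLMUL) == WantedEMUL;
}
```

With the values from the test case (EEW=32 under SEW=64 LMUL=1, store wanting LMUL=1/2), `canSkipVSETVLI` returns true, so both the first and the third phase should agree that no vsetvli is needed.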
Stanislav Gatev [Mon, 31 Jan 2022 10:43:07 +0000 (10:43 +0000)]
[clang][dataflow] Enable comparison of distinct values in Environment
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for comparing distinct values when comparing environments.
This includes a breaking change to the `runDataflowAnalysis` interface
as the return type is now `llvm::Expected<...>`.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D118596
Craig Topper [Tue, 1 Feb 2022 07:01:26 +0000 (23:01 -0800)]
[RISCV] Don't make it an error to have Zve* and V at the same time.
This should not be an error. V is a valid implementation of Zve.
Spec clarified here:
https://github.com/riscv/riscv-v-spec/commit/9a877e8553362ff03a9b22b98e321b59aff50398
Differential Revision: https://reviews.llvm.org/D118679
Sam McCall [Tue, 1 Feb 2022 15:01:46 +0000 (16:01 +0100)]
[clangd] Fix handling of co_await in go-to-type
Nikita Popov [Tue, 1 Feb 2022 14:56:42 +0000 (15:56 +0100)]
[GlobalOpt] Avoid early exit before dead constant check
In a similar vein to 236fbf571dc6cebcb81ac5187a170c8de6d5bc0e,
make sure we don't early-exit before the dead constant check.
Jon Chesterfield [Tue, 1 Feb 2022 14:56:14 +0000 (14:56 +0000)]
Revert "[OpenMP][FIX] Explicit barriers in SPMD mode are not aligned"
This seems to be the root cause of hangs on amdgpu. Reverting while investigating.
This reverts commit 7b9844cc8dd0045f5251450ba2980d6d6ac48ef9.
Joseph Huber [Tue, 1 Feb 2022 14:49:58 +0000 (09:49 -0500)]
[OpenMP] Temporarily remove checks to fix failing test on MACOS
Summary:
This patch removes some of the check lines that are problematic on
MACOS. The output on the MAC systems works but should be slightly
different. Because this is simply the output being slightly different
rather than broken functionality the test is being changed.
Shao-Ce SUN [Tue, 1 Feb 2022 14:52:24 +0000 (22:52 +0800)]
[RISCV] Adjust some comments.
Sam McCall [Tue, 1 Feb 2022 14:51:05 +0000 (15:51 +0100)]
[clangd] Group and extend release notes
Nikita Popov [Tue, 1 Feb 2022 14:49:38 +0000 (15:49 +0100)]
[GlobalStatus] Skip non-pointer dead constant users
Constant expressions with a non-pointer result type used an early
exit that bypassed the later dead constant user check, and resulted
in different optimization outcomes depending on whether dead users
were present or not.
This fixes the issue reported in https://reviews.llvm.org/D117223#3287039.
David Green [Tue, 1 Feb 2022 14:51:23 +0000 (14:51 +0000)]
[AArch64] Add signed version of uaddlv test. NFC
Amy Kwan [Fri, 28 Jan 2022 15:26:12 +0000 (09:26 -0600)]
[PowerPC] Update P10 vector insert patterns to use refactored load/stores, and update handling of v4f32 vector insert.
This patch updates the P10 patterns with a load feeding into an insertelt to
utilize the refactored load and store infrastructure, as well as updating any
tests that exhibit any codegen changes.
Furthermore, custom legalization is added for v4f32 on Power9 and above, not
only to assist with adjusting the refactored load/stores for P10 vector insert,
but also to enable the utilization of direct moves.
Differential Revision: https://reviews.llvm.org/D115691
Valentin Clement [Tue, 1 Feb 2022 14:26:47 +0000 (15:26 +0100)]
[flang] Add lowering for basic empty SUBROUTINE
This patch adds the ability to lower an empty subroutine.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D118695
Nikita Popov [Tue, 1 Feb 2022 08:42:36 +0000 (09:42 +0100)]
[AArch64] Do not use ABI alignment for mops.memset.tag
Pointer element types do not imply that the pointer is ABI aligned.
We should be using either an explicit align attribute here, or fall
back to an alignment of 1. This fixes a new element type access
introduced in D117764.
I don't think this makes any practical difference though, as the
lowering does not depend on alignment.
Differential Revision: https://reviews.llvm.org/D118681
Pavel Labath [Fri, 28 Jan 2022 09:53:49 +0000 (10:53 +0100)]
[lldb] Convert ProcessGDBRemoteLog to the new API
Christian Kühnel [Tue, 1 Feb 2022 10:14:07 +0000 (10:14 +0000)]
[clangd] Cleanup of readability-identifier-naming
Auto-generated patch based on clang-tidy readability-identifier-naming.
Only some manual cleanup for `extern "C"` declarations and a GTest change was required.
I'm not sure if this cleanup is actually very useful. It cleans up clang-tidy findings so the number of warnings from clang-tidy should be lower. Since it was easy to do and required only a little cleanup I thought I'd upload it for discussion.
One pattern that keeps recurring: Test **matchers** are also supposed to start with a lowercase letter as per LLVM convention. However, the GTest naming convention for matchers starts with upper case. I would propose to stay consistent with the GTest convention there. However that would imply a lot of `//NOLINT` throughout these files.
To reproduce this patch run:
```
run-clang-tidy -checks="-*,readability-identifier-naming" -fix -format ./clang-tools-extra/clangd
```
To convert the macro names, I was using this script with some manual cleanup afterwards:
https://gist.github.com/ChristianKuehnel/a01cc4362b07c58281554ab46235a077
Differential Revision: https://reviews.llvm.org/D115634
Nathan Sidwell [Mon, 24 Jan 2022 14:38:47 +0000 (06:38 -0800)]
[demangler] Preserve line numbering in copied demangler sources
While prepending lines to the copied source files is functional, it
disturbs the line numbering between the original and the copy. That
makes development more awkward than necessary, as it is the copy that
generally gets compiled first and emits compiler errors.
This uses sed to alter the first two lines, and also emits better
emacs mode setting, getting both C++ mode and read-only mode.
While here, also update and clarify documentation.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118135
Marek Kurdej [Tue, 1 Feb 2022 13:29:31 +0000 (14:29 +0100)]
[clang-format] Use std::iota and reserve when sorting Java imports. NFC.
This way we have at most 1 allocation even if the number of includes is greater than the on-stack size of the small vector.
Marek Kurdej [Tue, 1 Feb 2022 13:24:01 +0000 (14:24 +0100)]
[clang-format] Use std::iota and reserve. NFC.
This way we have at most 1 allocation even if the number of includes is greater than the on-stack size of the small vector.
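A rough sketch of the index-sorting pattern this refers to, using `std::vector` as a stand-in for LLVM's small vector. The function names and the string-based comparison are assumptions made for illustration; the actual clang-format code differs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns the order in which Names must be visited to see them sorted,
// without moving the (potentially expensive) elements themselves.
inline std::vector<std::size_t>
sortedIndices(const std::vector<std::string> &Names) {
  // Size the index vector once up front: at most one allocation, rather
  // than repeated growth from push_back in a loop.
  std::vector<std::size_t> Indices(Names.size());
  // Fill with 0, 1, 2, ...
  std::iota(Indices.begin(), Indices.end(), std::size_t{0});
  std::stable_sort(Indices.begin(), Indices.end(),
                   [&](std::size_t L, std::size_t R) {
                     return Names[L] < Names[R];
                   });
  return Indices;
}

// Materializes the elements in the order given by Indices.
inline std::vector<std::string>
permute(const std::vector<std::string> &Names,
        const std::vector<std::size_t> &Indices) {
  assert(Indices.size() == Names.size());
  std::vector<std::string> Result;
  Result.reserve(Names.size()); // single allocation before the loop
  for (std::size_t I : Indices)
    Result.push_back(Names[I]);
  return Result;
}
```

Sorting indices rather than the elements keeps the original positions available, which is useful when the sorted order has to be mapped back to source locations.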
Marek Kurdej [Tue, 1 Feb 2022 13:10:19 +0000 (14:10 +0100)]
[clang-format] De-pessimize appending newlines. NFC.
* Avoid repeatedly calling std::string::append(char) in a loop.
* Reserve before calling std::string::append(const char *) in a loop.
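The two points above could look roughly like this in isolation. `appendNewlines` and `concatenate` are hypothetical helpers written for this illustration, not functions from clang-format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One call to the count-based overload replaces a loop of
// single-character appends.
inline void appendNewlines(std::string &Text, std::size_t Count) {
  Text.append(Count, '\n');
}

// Computes the total length first so the result buffer is allocated once
// before the loop that appends each C string.
inline std::string concatenate(const std::vector<const char *> &Pieces) {
  std::size_t Total = 0;
  for (const char *Piece : Pieces) {
    assert(Piece != nullptr);
    Total += std::char_traits<char>::length(Piece);
  }
  std::string Result;
  Result.reserve(Total);
  for (const char *Piece : Pieces)
    Result.append(Piece);
  return Result;
}
```

Both variants produce the same string as the naive loops; the difference is only in how many times the buffer may need to grow.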
Marek Kurdej [Tue, 1 Feb 2022 12:55:05 +0000 (13:55 +0100)]
[clang-format] Use ranged for loops. NFC.
Jon Chesterfield [Tue, 1 Feb 2022 12:59:35 +0000 (12:59 +0000)]
[openmp] Disable tests that presently hang on CI
Nicolas Vasilache [Tue, 1 Feb 2022 12:59:11 +0000 (07:59 -0500)]
[mlir][vector][integration] Reactivate LLI in vector integration test.
The test introduced in https://reviews.llvm.org/D118006 was missing a return and would
introduce a non-0 return which would fail tests.
Valentin Clement [Tue, 1 Feb 2022 12:49:49 +0000 (13:49 +0100)]
[flang] Add lowering placeholders
This patch puts in place the different
functions to lower the evaluation list. All functions
are just placeholders with TODOs for now.
Follow up patches will bring the proper lowering in these
functions.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D118678
Simon Pilgrim [Tue, 1 Feb 2022 12:33:17 +0000 (12:33 +0000)]
[DAG] SimplifyMultipleUseDemandedBits - add default Depth = 0 argument.
Simplifies an upcoming change.
Nicolas Vasilache [Tue, 1 Feb 2022 12:24:49 +0000 (07:24 -0500)]
Temporarily disable LLI to investigate weird non 0 error code
Somehow the test introduced in https://reviews.llvm.org/D118006 produces the expected result but running
through lli with Intel SDE activated sneaks in an error code 2 (before this commit) or an error code 10
(after this commit).
As is, the test is still meaningful in that the LLVMIR generation would crash if the `elementtype` is set
improperly.
Still, this should run with lli turned on.
Nico Weber [Tue, 1 Feb 2022 12:22:33 +0000 (07:22 -0500)]
[gn build] unconfuse sync script after 762f0b546328
Alexander Shaposhnikov [Tue, 1 Feb 2022 12:14:25 +0000 (12:14 +0000)]
[CodeGen][AArch64] Fix typo in legalizer-info-validation.mir
Alexander Shaposhnikov [Tue, 1 Feb 2022 12:08:06 +0000 (12:08 +0000)]
[CodeGen][AArch64] Fix typo in arm64-zero-cycle-zeroing.ll