Simon Pilgrim [Sat, 29 Oct 2022 15:28:37 +0000 (16:28 +0100)]
[X86] Remove the WriteDPPSZ schedule pair
There's never been a 512-bit vdpps instruction (and the implementation is so convoluted there probably won't ever be)
Simon Pilgrim [Sat, 29 Oct 2022 15:23:41 +0000 (16:23 +0100)]
[X86] Remove 256-bit scheduler classes
SLM (silvermont) doesn't support any AVX instructions
Serge Pavlov [Fri, 14 Oct 2022 13:07:09 +0000 (20:07 +0700)]
Handle errors in expansion of response files
Previously an error raised during an expansion of response files (including
configuration files) was ignored and only the fact of its presence was
reported to the user with generic error messages. This made it difficult to
analyze problems. For example, if a configuration file tried to read an
nonexistent file, the error message said that 'configuration file cannot
be found', which is wrong and misleading.
This change improves error handling during the expansion so that users
get more informative error messages.
Differential Revision: https://reviews.llvm.org/D136090
Sanjay Patel [Sat, 29 Oct 2022 12:55:37 +0000 (08:55 -0400)]
[InstCombine] reduce code duplication in visitMul(); NFC
Sanjay Patel [Fri, 28 Oct 2022 16:05:10 +0000 (12:05 -0400)]
[InstCombine] add tests for mul with shl operand; NFC
Simon Pilgrim [Sat, 29 Oct 2022 13:09:52 +0000 (14:09 +0100)]
[ARC] Regenerate ldst.ll
Reported by the (experimental) arc buildbot after D136042
Jacques Pienaar [Sat, 29 Oct 2022 12:42:28 +0000 (05:42 -0700)]
[mlir] Split parser fuzzer for bytecode & text
Enable fuzzing these independently. Currently still not linking in
dialects beyond Builtin.
Simon Pilgrim [Sat, 29 Oct 2022 11:29:58 +0000 (12:29 +0100)]
[DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.
Fixes #57872
Differential Revision: https://reviews.llvm.org/D136042
Michał Górny [Sat, 29 Oct 2022 04:44:11 +0000 (06:44 +0200)]
[lit] Deduplicate README and longdescription, and update it
Since long_description is effectively the README that's displayed
on the PyPI project page, combine it with the existing README file and read
the file in `setup.py`. This is a common practice among Python
projects, to the point of declarative-style setuptools configurations
providing a shorthand for it.
While at it, replace the outdated information about LLVM Bugzilla with
a pointer to GitHub issues.
Differential Revision: https://reviews.llvm.org/D137006
Simon Pilgrim [Sat, 29 Oct 2022 11:03:38 +0000 (12:03 +0100)]
[X86] WriteFShuffle256 shuffles aren't microcoded in the llvm sense
znver1/2 might have poor throughput for crosslane shuffles but they don't consume 100 cycles of resources
I think there was a misunderstanding between the AMD definition of microcoding (more than 2-3 uops) and LLVM (here be dragons - impossible to approximately model the instruction)
This is more yak shaving to come from D103695 - this time working out why codegen involving broadcasts gives such weird numbers
Simon Pilgrim [Sat, 29 Oct 2022 10:52:12 +0000 (11:52 +0100)]
[X86] Ensure 256-bit inlane shuffles are set to 2 uops + half rate
znver1 double pumps regular 256-bit shuffles (crosslane shuffles are messier....)
Fixes yet another mismatch between the numbers coming out of the script from D103695 and the znver1 scheduler model
Confirmed with the AMD SoG, Agner + instlatx64
Jonas Hahnfeld [Fri, 28 Oct 2022 21:05:27 +0000 (23:05 +0200)]
[JITLink][RISCV] Add names for GOT/PLT relocations
It is confusing to see "Unrecognized edge kind" in debugging output
for supported relocations; this was probably an oversight in commit
89f546f6ba which added the support.
Differential Revision: https://reviews.llvm.org/D136985
Xiaodong Liu [Thu, 27 Oct 2022 09:42:54 +0000 (17:42 +0800)]
[LoongArch] Improve the "out of range" error information reported by `adjustFixupValue`
Three different conditions reported the same error message. Add
meaningful information to make each of them more informative.
Differential Revision: https://reviews.llvm.org/D136742
Fangrui Song [Sat, 29 Oct 2022 07:15:24 +0000 (00:15 -0700)]
[docs] clang.rst: gnu++14 => gnu++17
LLVM GN Syncbot [Sat, 29 Oct 2022 06:45:02 +0000 (06:45 +0000)]
[gn build] Port 30ea3fcc4c69
owenca [Thu, 27 Oct 2022 09:53:23 +0000 (02:53 -0700)]
[clang-format][NFC] Move BracesRemover tests out of FormatTest.cpp
Differential Revision: https://reviews.llvm.org/D136830
Chia-hung Duan [Fri, 28 Oct 2022 22:17:06 +0000 (22:17 +0000)]
Revert "Revert "[scudo] Fix the calculating of memory group usage""
This reverts commit 69fe7abb393ba7d6ee9c8ff1429316845b5bad37.
Fixed the argument order when calling batchGroupBase().
Differential Revision: https://reviews.llvm.org/D136995
YingChi Long [Sat, 29 Oct 2022 06:20:19 +0000 (14:20 +0800)]
[clang][NFC] sync comments from declaration of InitializePreprocessor
Siva Chandra Reddy [Sat, 29 Oct 2022 05:41:17 +0000 (05:41 +0000)]
[libc] Fix the return value of fread and fwrite.
They were previously returning the number of bytes read. They should
instead be returning the number of objects read.
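A minimal sketch of the corrected accounting, assuming a lower-level read that reports bytes. `objects_transferred` and its parameters are hypothetical names, not the libc implementation:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: converts the byte count reported by a lower-level
// read or write into the object count that fread/fwrite should return.
std::size_t objects_transferred(std::size_t bytes_transferred,
                                std::size_t object_size,
                                std::size_t object_count) {
  // The C standard specifies a return value of zero when either is zero.
  if (object_size == 0 || object_count == 0)
    return 0;
  // A trailing partial object does not count as transferred.
  std::size_t objects = bytes_transferred / object_size;
  assert(objects <= object_count && "cannot transfer more than requested");
  return objects;
}
```

With 4-byte objects, 12 bytes transferred is reported as 3 objects, and 10 bytes as 2.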
Haojian Wu [Sat, 29 Oct 2022 06:00:29 +0000 (08:00 +0200)]
Matt Arsenault [Sat, 29 Oct 2022 05:28:23 +0000 (22:28 -0700)]
ARM: Fix stack warning test
Matt Arsenault [Thu, 27 Oct 2022 23:28:46 +0000 (16:28 -0700)]
clang: Improve errors for DiagnosticInfoResourceLimit
Print source location info and demangle the name, compared
to the default behavior.
Several observations:
1. Specially handling this seems to give source locations
without enabling debug info, and also gives columns compared
to the backend diagnostic.
2. We're duplicating diagnostic effort in DiagnosticInfo
and clang. This feels wrong, but clang can demangle and I guess
have better debug info available? Should clang really have any of this
code? For the purposes of this diagnostic, the important piece
is just reading the source location out of the llvm::Function.
3. lld is not duplicating the same effort as clang with LTO, and
just directly printing the DiagnosticInfo as-is. e.g.
$ clang -fgpu-rdc
lld: error: local memory (480000) exceeds limit (65536) in function '_Z12use_huge_ldsIiEvv'
lld: error: local memory (960000) exceeds limit (65536) in function '_Z12use_huge_ldsIdEvv'
$ clang -fno-gpu-rdc
backend-resource-limit-diagnostics.hip:8:17: error: local memory (480000) exceeds limit (65536) in 'void use_huge_lds<int>()'
__global__ void use_huge_lds() {
^
backend-resource-limit-diagnostics.hip:8:17: error: local memory (960000) exceeds limit (65536) in 'void use_huge_lds<double>()'
2 errors generated when compiling for gfx90a.
4. Backend errors are not observed with -save-temps and -fno-gpu-rdc or -flto,
and the compile incorrectly succeeds.
5. The backend version prints error: <location info>; clang prints <location info>: error:
6. -emit-codegen-only is totally broken for AMDGPU. MC
gets a null target streamer. I do not understand why this
is a thing. This just creates a horrible edge case.
Just work around this by emitting actual code instead of blocking
this patch.
Matt Arsenault [Thu, 27 Oct 2022 23:29:26 +0000 (16:29 -0700)]
DiagnosticInfo: Report function location for resource limits
We have some odd redundancy where clang specially handles
the stack size case. If clang prints it, the source location is first
followed by "warning". The backend diagnostic, as printed by other tools
puts "warning" first.
Matt Arsenault [Sat, 29 Oct 2022 02:01:09 +0000 (19:01 -0700)]
llvm-reduce: Fix typo
Fangrui Song [Sat, 29 Oct 2022 03:46:27 +0000 (20:46 -0700)]
[Frontend] -MP: remove blank lines
GCC 10 removed blank lines for phony targets during a refactoring.
The blank lines do not seem useful, so let's follow suit.
Fangrui Song [Sat, 29 Oct 2022 03:35:29 +0000 (20:35 -0700)]
[Frontend] Fix -MP when main file is <stdin>
rC220726 had a bug: `echo "<cstdlib>" | clang -M -MP -x c++ - 2>/dev/null`
(used by glibc/configure.ac find_cxx_header) omitted a `cstdlib:` line. Instead
of filtering out `<stdin>` in `Dependencies`, retain it (so that the number of
entries does not change whether or not main file is `<stdin>`) and filter the
`PhonyTarget` output.
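A rough sketch of the approach, not the actual clang code: the first entry of the dependency list is assumed to be the main file, and `<stdin>` is skipped only when the phony targets are printed. `phony_targets` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the phony-target lines for -MP, one
// "name:" line per dependency other than the main file.
std::string phony_targets(const std::vector<std::string> &dependencies) {
  std::string output;
  // Entry 0 is the main file (possibly "<stdin>"); it gets no phony target.
  for (std::size_t i = 1; i < dependencies.size(); ++i) {
    const std::string &dep = dependencies[i];
    assert(!dep.empty() && "dependency names must be non-empty");
    // Filter <stdin> at output time instead of dropping it from the list.
    if (dep == "<stdin>")
      continue;
    output += dep + ":\n";
  }
  return output;
}
```

Because `<stdin>` stays in the list, the first real header keeps its position and is no longer mistaken for the main file.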
wren romano [Thu, 27 Oct 2022 23:25:41 +0000 (16:25 -0700)]
[mlir][sparse] Cleaning up function names in test
The old "dumpAndRelease" names are no longer valid since the "release" part is handled separately now.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136899
Peiming Liu [Fri, 28 Oct 2022 20:45:06 +0000 (20:45 +0000)]
[mlir][sparse] run canonicalization pass after DenseOpBufferize.
As pointed out by Matthias: "DenseBufferizationPass should be run right after TensorCopyInsertionPass. (Running it after bufferizing the sparse IR is also OK.) The reason for this is that whether copies are needed or not depends on the structure of the program (SSA use-def chains). In particular, running the canonicalizer in-between is problematic because it could introduce new RaW conflicts"
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136980
Slava Zakharin [Sat, 29 Oct 2022 00:50:11 +0000 (17:50 -0700)]
Fix MLIR build after D136931
ld.lld: error: undefined symbol: mlir::extractFromI64ArrayAttr(mlir::Attribute)
>>> referenced by NVVMToLLVMIRTranslation.cpp:142
(/llvm-project/mlir/lib/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.cpp:142)
>>> referenced by NVVMToLLVMIRTranslation.cpp:152
(/llvm-project/mlir/lib/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.cpp:152)
Matt Arsenault [Thu, 20 Oct 2022 05:11:18 +0000 (22:11 -0700)]
llvm-reduce: Fix producing invalid reductions with landingpads
It's not valid to simply branch to a landingpad block, so it
needs to be removed.
Also stop trying to scan forward to find a block that can be merged.
The predecessor merge rules are more complex than this. This also
would need to have considered landingpads. Just do the minimum
to delete the block, and let the simplify-cfg reduction handle
the branch chain cleanups.
Matt Arsenault [Tue, 4 Oct 2022 05:06:09 +0000 (22:06 -0700)]
llvm-reduce: Fix block reduction with unreachable blocks
Previously this would produce many invalid reductions with
"Instruction does not dominate uses" verifier errors.
This fixes issues in cases where the incoming IR
has unreachable blocks, and the resulting reduction
introduced new reachable blocks.
Have basic-blocks skip functions that have unreachable
blocks, and introduce a separate reduction which only
deletes unreachable blocks. Clean up any newly unreachable
blocks after trimming out the requested deletions.
Includes a variety of meta-reduced tests for llvm-reduce
itself with -abort-on-invalid-reduction that were failing
on different iterations of this patch.
Bugpoint's implementation is much simpler (but currently I don't
understand how it avoids disconnecting interesting blocks from the CFG).
Matt Arsenault [Wed, 19 Oct 2022 20:10:41 +0000 (13:10 -0700)]
llvm-reduce: Don't turn switches into returns
Re-use one of the existing successors as the new default.
This helps with a future patch to fix handling of unreachable
blocks.
Ben Langmuir [Fri, 28 Oct 2022 23:59:38 +0000 (16:59 -0700)]
[clang][test] Require x86 target in a couple new tests
Attempt to fix test failures on bots that don't configure x86 target.
Matt Arsenault [Fri, 28 Oct 2022 23:49:32 +0000 (16:49 -0700)]
clang: Add required target to test
James Y Knight [Fri, 28 Oct 2022 22:38:49 +0000 (18:38 -0400)]
[llvm-tblgen] NFC: Simplify DecoderEmitter.
Currently the DecoderEmitter constructor takes a bunch of string
parameters containing bits of code to interpolate.
However, there are only two ways it can be called. The one used for most
targets doesn't handle the SoftFail DecoderStatus (not a
problem, because they don't use SoftFail). The other mode, which is
used for ARM/AArch64, does handle SoftFail, but requires an externally
defined helper function in those targets.
This is unnecessary complication; remove the parameters, and unify
onto a single version which does support SoftFail, defining the helper
itself.
Peiming Liu [Fri, 28 Oct 2022 21:42:45 +0000 (21:42 +0000)]
[mlir][sparse] Fold invariant op only when it has only one use.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136990
Matt Arsenault [Mon, 24 Oct 2022 18:05:06 +0000 (11:05 -0700)]
llvm-reduce: Stop checking workitem is interesting before each pass
Each delta pass run should have guaranteed the output is still
interesting, so it should be pointless to recheck this each
iteration. I have many issues that take multiple minutes
to reproduce, so this ends up being a huge waste of time.
Also, remove broken line counting. This never worked, since
getLines was failing to open the temporary file which was just
deleted.
Matt Arsenault [Fri, 28 Oct 2022 18:49:15 +0000 (11:49 -0700)]
AMDGPU: Register a null MC streamer for -emit-codegen-only
For some reason null is a valid MC target, used from clang with
-emit-codegen-only. Previously the target streamer was null,
which was inconsistently null checked resulting in crashes
if using amdhsa.
Paul Kirth [Fri, 28 Oct 2022 21:57:23 +0000 (21:57 +0000)]
Revert "[AArch64] Optimize memcmp when the result is tested for [in]equality with 0"
This reverts commit 01ff511593d1a4920fa3c1d450ad2077661e0bdc.
It triggers an assertion failure in SelectionDAG.cpp
see https://github.com/llvm/llvm-project/issues/58675 for details.
Arthur Eubanks [Thu, 4 Aug 2022 00:09:40 +0000 (17:09 -0700)]
[lldb] Support simplified template names
See https://discourse.llvm.org/t/dwarf-using-simplified-template-names/58417 for background on simplified template names.
lldb doesn't work with simplified template names because it uses DW_AT_name which doesn't contain template parameters under simplified template names.
Two major changes are required to make lldb work with simplified template names.
1) When building clang ASTs for struct-like DIEs, we use the name as a cache key. To distinguish between different instantiations of a template class, we need to add in the template parameters.
2) When looking up types, if the requested type name contains '<' and we didn't initially find any types from the index searching the name, strip the template parameters and search the index, then filter out results with non-matching template parameters. This takes advantage of the clang AST's ability to print full names rather than doing it ourselves.
An alternative is to fix up the names in the index to contain the fully qualified name, but that doesn't respect .debug_names.
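The lookup fallback in 2) can be sketched roughly as follows. This is an illustration only: `TypeEntry`, `strip_template_params` and `find_types` are hypothetical names, and the index is modelled as a plain vector rather than lldb's real data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical index entry: the indexed name has no template parameters,
// while full_name is what the AST would print for the type.
struct TypeEntry {
  std::string index_name;
  std::string full_name;
};

// Drops everything from the first '<' on, e.g. "Foo<int>" becomes "Foo".
// Names such as "operator<" would need special handling, omitted here.
std::string strip_template_params(const std::string &name) {
  std::size_t pos = name.find('<');
  return pos == std::string::npos ? name : name.substr(0, pos);
}

std::vector<TypeEntry> find_types(const std::vector<TypeEntry> &index,
                                  const std::string &requested) {
  assert(!requested.empty() && "lookup needs a type name");
  std::vector<TypeEntry> results;
  // First attempt: search the index with the name exactly as requested.
  for (const TypeEntry &entry : index)
    if (entry.index_name == requested)
      results.push_back(entry);
  if (!results.empty() || requested.find('<') == std::string::npos)
    return results;
  // Fallback: search by the stripped name, then keep only the entries
  // whose full name matches the requested template parameters.
  const std::string base = strip_template_params(requested);
  for (const TypeEntry &entry : index)
    if (entry.index_name == base && entry.full_name == requested)
      results.push_back(entry);
  return results;
}
```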
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D134378
Fangrui Song [Fri, 28 Oct 2022 23:15:14 +0000 (16:15 -0700)]
[Driver] Use addOptInFlag/addOptOutFlag. NFC
Aart Bik [Fri, 28 Oct 2022 23:02:33 +0000 (16:02 -0700)]
[mlir][arm][sve] fix broken integration tests
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136997
Ben Langmuir [Thu, 27 Oct 2022 23:52:15 +0000 (16:52 -0700)]
Move getenv for AS_SECURE_LOG_FILE to clang
Avoid calling getenv in the MC layer and let the clang driver do it so
that it is reflected in the command-line as an -mllvm option.
rdar://101558354
Differential Revision: https://reviews.llvm.org/D136888
Aart Bik [Fri, 28 Oct 2022 18:14:58 +0000 (11:14 -0700)]
[mlir][sparse] build proper insertion chain
The alloc->insert/compress->load chain needs to be
properly represented with an SSA chain now in loops
and if statements to properly reflect the modifying
behavior (runtime support lib is forgiving on breaking
this, but the new codegen is not).
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136966
Fangrui Song [Fri, 28 Oct 2022 22:37:46 +0000 (15:37 -0700)]
[Driver] Fix -fdebug-default-version= -Wunused-command-line-argument after D136707
-fdebug-default-version= is designed to suppress -Wunused-command-line-argument
warnings even without a -g option.
Nathan Chancellor [Fri, 28 Oct 2022 22:24:13 +0000 (15:24 -0700)]
Revert "[AArch64]SME2 Outer Product and Accumulate instructions"
This reverts commit 4df36f168763af725ba5bc852a4321afd0f769c4.
This change does not pass check-llvm.
David Blaikie [Fri, 28 Oct 2022 22:29:50 +0000 (22:29 +0000)]
Follow-up to Itanium ABI POD patchnotes
Yaxun (Sam) Liu [Fri, 28 Oct 2022 20:45:52 +0000 (16:45 -0400)]
[HIP] add float to fp16 convert functions
Reviewed by: Brian Sumner, Artem Belevich
Differential Revision: https://reviews.llvm.org/D136981
Chia-hung Duan [Fri, 28 Oct 2022 22:15:50 +0000 (22:15 +0000)]
Revert "[scudo] Fix the calculating of memory group usage"
This reverts commit 6130d67f70ad2e063b4407b6ee31c7db19464fee.
Augusto Noronha [Wed, 26 Oct 2022 21:53:19 +0000 (14:53 -0700)]
[lldb] Explicitly open file to write with utf-8 encoding in crashlog.py
The python "open" function will use the default encoding for the
locale (the result of "locale.getpreferredencoding()"). Explicitly set
the encoding to utf-8 when opening the crashlog for writing, as there may
be non-ascii symbols in there (for example, Swift uses "τ" to indicate
generic parameters).
rdar://101402755
Differential Revision: https://reviews.llvm.org/D136798
Andrzej Warzynski [Fri, 28 Oct 2022 18:17:06 +0000 (18:17 +0000)]
[mlir][sve] Canonicalise MLIR_RUN_ARM_SVE_TESTS
Similarly to other CMake variables used to configure LIT tests, this
patch makes sure that MLIR_RUN_ARM_SVE_TESTS is canonicalised. The
corresponding LIT configuration is updated accordingly.
Differential Revision: https://reviews.llvm.org/D136967
Chia-hung Duan [Thu, 27 Oct 2022 22:47:03 +0000 (22:47 +0000)]
[scudo] Fix the calculating of memory group usage
In SizeClassAllocator64, the boundary of a memory group may not be aligned to
the region begin, which means the begin address of a memory group may be
smaller than the region begin. This leads to a wrong judgement of memory group
usage.
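One way to picture the issue, as a sketch under assumptions rather than the actual scudo fix: when measuring how much of a group can hold blocks, clamp the group's range to the region instead of trusting the group base. All names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bytes of the group
// [group_base, group_base + group_size) that lie inside the region
// [region_begin, region_end). The group base may be below region_begin.
std::uint64_t group_bytes_in_region(std::uint64_t group_base,
                                    std::uint64_t group_size,
                                    std::uint64_t region_begin,
                                    std::uint64_t region_end) {
  assert(region_begin <= region_end && "region bounds are inverted");
  const std::uint64_t begin = std::max(group_base, region_begin);
  const std::uint64_t end = std::min(group_base + group_size, region_end);
  return end > begin ? end - begin : 0;
}
```

For a group based at 0x1000 with size 0x1000 and a region starting at 0x1800, only 0x800 bytes of the group belong to the region, so comparing usage against the full group size would be misleading.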
Differential Revision: https://reviews.llvm.org/D136898
Chia-hung Duan [Thu, 27 Oct 2022 17:51:56 +0000 (17:51 +0000)]
[scudo] Lazy initialize the PageMap while page releasing
We allocate the page map before knowing whether there are groups that can be
released. This may result in many redundant map()/unmap() operations if
there's no page to release.
Make the page map lazily initialized.
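The idea of deferring the allocation can be sketched as below. `LazyPageMap` is a hypothetical stand-in that uses a `std::vector<bool>` in place of scudo's real page map and its map()/unmap() calls.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the page map: storage is only created the
// first time a page is actually marked as releasable.
class LazyPageMap {
public:
  explicit LazyPageMap(std::size_t num_pages) : num_pages_(num_pages) {}

  void mark_releasable(std::size_t page) {
    assert(page < num_pages_ && "page index out of range");
    ensure_initialized();
    (*map_)[page] = true;
  }

  // Reading never triggers the allocation.
  bool is_releasable(std::size_t page) const {
    return map_.has_value() && (*map_)[page];
  }

  bool is_allocated() const { return map_.has_value(); }

private:
  void ensure_initialized() {
    if (!map_)
      map_.emplace(num_pages_, false);
  }

  std::size_t num_pages_;
  std::optional<std::vector<bool>> map_;
};
```

If no page is ever marked, the backing storage is never created, which corresponds to skipping the redundant map()/unmap() pair.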
Differential Revision: https://reviews.llvm.org/D136873
Jonathan Peyton [Mon, 3 Oct 2022 20:14:40 +0000 (15:14 -0500)]
[OpenMP][libomp] Add hidden helper affinity
Add new hidden helper affinity via the environment variable,
KMP_HIDDEN_HELPER_AFFINITY, which allows users to assign thread
affinity to hidden helper threads using the same syntax as
KMP_AFFINITY. OMP_PLACES/OMP_PROC_BIND have no interaction with
KMP_HIDDEN_HELPER_AFFINITY.
Differential Revision: https://reviews.llvm.org/D135113
Jonathan Peyton [Mon, 3 Oct 2022 20:12:08 +0000 (15:12 -0500)]
[OpenMP][libomp] Make affinity warnings parameterized
Separate change that makes the warnings depend on the verbose and
warnings settings of the relevant affinity setting.
Differential Revision: https://reviews.llvm.org/D135112
Jonathan Peyton [Mon, 1 Aug 2022 22:03:27 +0000 (17:03 -0500)]
[OpenMP][libomp] Parameterize affinity functions
This patch parameterizes the affinity initialization code to allow multiple
affinity settings. Almost all global affinity settings are consolidated
and put into a structure kmp_affinity_t. This is in anticipation of the
addition of hidden helper affinity which will have the same syntax and
semantics as KMP_AFFINITY only for the hidden helper team.
Differential Revision: https://reviews.llvm.org/D135109
Hanhan Wang [Fri, 28 Oct 2022 20:02:40 +0000 (13:02 -0700)]
[mlir][scf] Enhance sizes computation in tileUsingSCFForOp.
The boundary is always 1 if the tile size is 1.
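A small sketch of why this holds, assuming the bounded size of a tile is computed as min(tileSize, upperBound - iv) for an induction variable iv inside the loop. The function name and signature are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: size of the tile starting at iv, clamped so that
// it does not run past upper_bound.
std::int64_t bounded_tile_size(std::int64_t iv, std::int64_t tile_size,
                               std::int64_t upper_bound) {
  assert(tile_size >= 1 && iv < upper_bound && "iv must be inside the loop");
  // With a tile size of 1, min(1, upper_bound - iv) is 1 whenever
  // iv < upper_bound, so the min computation can be skipped entirely.
  if (tile_size == 1)
    return 1;
  return std::min(tile_size, upper_bound - iv);
}
```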
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D136884
Benjamin Kramer [Fri, 28 Oct 2022 19:17:42 +0000 (21:17 +0200)]
[mlir][linalg] Add missing region when building TransposeOp
This fixes a regression from ad89eb5b1fccf002eb59dfbab0fdb515ea3e65b7
Craig Topper [Fri, 28 Oct 2022 17:49:27 +0000 (10:49 -0700)]
[RISCV] Merge WriteLDW and WriteLDWU schedule classes.
We don't distinguish signed vs unsigned for B and H loads.
Maybe this split was because LWU isn't in RV32I? I don't think
that distinction matters to the scheduler. If your processor
only supports RV32I then having LWU in the SchedClass doesn't matter.
If your target supports RV64I, then LW and LWU are likely the same.
Aaron Ballman [Fri, 28 Oct 2022 18:42:33 +0000 (14:42 -0400)]
Update the status of some C2x features
Only N2670 had testable changes in it, the rest can be trivially
assumed to be implemented as the changes are editorial.
Michael Jones [Wed, 26 Oct 2022 23:12:49 +0000 (16:12 -0700)]
[libc] add locale free strcoll
The strcoll function is intended to compare strings based on their
ordering in the current locale. Since the locale facilities have not yet
been added, a simple implementation that is the same as strcmp has been
added as a placeholder.
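In other words, the placeholder behaves like the following sketch. `locale_free_strcoll` is a hypothetical name; the real function lives in the libc's own namespace and does not call the system strcmp.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical placeholder: ignores the locale and orders strings the
// same way strcmp does (by unsigned character values).
int locale_free_strcoll(const char *left, const char *right) {
  assert(left && right && "both strings must be non-null");
  return std::strcmp(left, right);
}
```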
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D136802
Michael Jones [Fri, 28 Oct 2022 18:00:14 +0000 (11:00 -0700)]
[libc][obvious] fix scanf parser test
One of the expected values wasn't being initialized correctly.
Differential Revision: https://reviews.llvm.org/D136965
Michael Jones [Wed, 19 Oct 2022 20:28:15 +0000 (13:28 -0700)]
[libc] add scanf parser and core utilities
This is the first piece of scanf. It's very similar in design to printf,
and so much of the code is copied from that. There were potential issues
with conflicting macros so I've also renamed the "ASSERT_FORMAT_EQ"
macro for printf to "ASSERT_PFORMAT_EQ".
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D136288
Michael Jones [Thu, 27 Oct 2022 20:29:55 +0000 (13:29 -0700)]
[libc] add features to bitset
This patch adds the flip, set_range, and operator== functions to bitset.
These will be used in scanf.
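For illustration, a simplified bitset with the three operations might look like this. `SimpleBitset` is a hypothetical stand-in: the real class layout and signatures may differ, and treating `set_range` as inclusive of both ends is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

template <std::size_t N> struct SimpleBitset {
  static_assert(N > 0, "bitset must hold at least one bit");
  static constexpr std::size_t BitsPerWord = 64;
  static constexpr std::size_t Words = (N + BitsPerWord - 1) / BitsPerWord;
  std::uint64_t data[Words] = {};

  void set(std::size_t index) {
    assert(index < N);
    data[index / BitsPerWord] |= std::uint64_t{1} << (index % BitsPerWord);
  }

  bool test(std::size_t index) const {
    assert(index < N);
    return ((data[index / BitsPerWord] >> (index % BitsPerWord)) & 1) != 0;
  }

  // Sets every bit in the inclusive range [start, end].
  void set_range(std::size_t start, std::size_t end) {
    assert(start <= end && end < N);
    for (std::size_t i = start; i <= end; ++i)
      set(i);
  }

  // Inverts all N bits; padding bits in the last word stay clear so that
  // operator== can compare whole words.
  void flip() {
    for (std::size_t i = 0; i < Words; ++i)
      data[i] = ~data[i];
    if (N % BitsPerWord != 0)
      data[Words - 1] &= (std::uint64_t{1} << (N % BitsPerWord)) - 1;
  }

  bool operator==(const SimpleBitset &other) const {
    for (std::size_t i = 0; i < Words; ++i)
      if (data[i] != other.data[i])
        return false;
    return true;
  }
};
```

A scanf-style use would be building the set of accepted characters for a `%[...]` conversion with `set_range`, then calling `flip` for the negated form `%[^...]`.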
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D136881
Arthur Eubanks [Fri, 28 Oct 2022 17:38:12 +0000 (10:38 -0700)]
[polly][test] Remove -polly-target from tests
This flag was removed in D136621.
Yaxun (Sam) Liu [Thu, 27 Oct 2022 16:40:35 +0000 (12:40 -0400)]
[HIP] add fmax/fmin for fp16
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D136859
Arthur Eubanks [Fri, 28 Oct 2022 17:26:58 +0000 (10:26 -0700)]
Revert "[LegacyPM] Remove pipeline extension mechanism"
This reverts commit 4ea6ffb7e8edcea7f2cfb22acc907640a9ba44b9.
Breaks various backends.
Arthur Eubanks [Fri, 28 Oct 2022 17:25:23 +0000 (10:25 -0700)]
[polly] Format RegisterPasses.cpp
Erich Keane [Fri, 28 Oct 2022 17:02:04 +0000 (10:02 -0700)]
[Concepts] Fix an assert when trying to form a recovery expr on a concept
When we failed the lookup of the function, we tried to form a
RecoveryExpr that caused us to recursively re-check the same constraint,
which caused us to try to double-insert the satisfaction into the cache.
This patch makes us just return the inner-cached version instead. We DO
end up double-evaluating thanks to the recovery-expr, but there isn't a
good way around that.
Arthur Eubanks [Wed, 26 Oct 2022 18:33:02 +0000 (11:33 -0700)]
[clang] Remove no-op -fexperimental-new-pass-manager/-fno-legacy-pass-manager flags
These have been no-op for a while and keeping them around may be confusing.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D136789
Craig Topper [Fri, 28 Oct 2022 17:16:57 +0000 (10:16 -0700)]
[RISCV] Optimize i64 insertelt on RV32.
We can use tail undisturbed vslide1down to insert into the vector.
This should make D136640 unneeded.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D136738
Arthur Eubanks [Mon, 24 Oct 2022 17:21:39 +0000 (10:21 -0700)]
[LegacyPM] Remove pipeline extension mechanism
Part of gradually removing the legacy PM optimization pipeline.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D136622
Arthur Eubanks [Mon, 24 Oct 2022 17:12:32 +0000 (10:12 -0700)]
[polly] Remove legacy pass manager hooks
And some options that only throw errors with the new PM.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D136621
Caroline Concatto [Thu, 27 Oct 2022 14:49:36 +0000 (15:49 +0100)]
[AArch64]SME2 Outer Product and Accumulate instructions
This patch adds the assembly/disassembly for the following instructions:
BMOPA: Bitwise exclusive NOR population count outer product and accumulate.
BMOPS: Bitwise exclusive NOR population count outer product and subtract.
SMOPA (2-way): Signed integer sum of outer products and accumulate.
SMOPS (2-way): Signed integer sum of outer products and subtract.
UMOPA (2-way): Unsigned integer sum of outer products and accumulate.
UMOPS (2-way): Unsigned integer sum of outer products and subtract.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D136077
Aaron Ballman [Fri, 28 Oct 2022 16:45:56 +0000 (12:45 -0400)]
Add test coverage for WG14 N2322
The changes in this paper add a new recommended practice. I had
originally marked Clang as supporting this paper because we're not
obligated to follow a recommended practice. However, in retrospect, it
seems more useful to document whether we implement the recommendation
or not. This adds a test for those changes.
Caroline Concatto [Tue, 25 Oct 2022 10:26:53 +0000 (11:26 +0100)]
[AArch64]SME2 Multi-vector - Index/Single/Multi Array Vectors LONG INT MLA sources
This patch adds the assembly/disassembly for the following instructions:
SMLALL: (multiple and indexed vector): Multi-vector signed integer multiply-add long long by indexed element.
(multiple and single vector): Multi-vector signed integer multiply-add long long by vector.
(multiple vectors): Multi-vector signed integer multiply-add long long.
SMLSLL: (multiple and indexed vector): Multi-vector signed integer multiply-subtract long long by indexed element.
(multiple and single vector): Multi-vector signed integer multiply-subtract long long by vector.
(multiple vectors): Multi-vector signed integer multiply-subtract long long.
SUMLALL: (multiple and indexed vector): Multi-vector signed by unsigned integer multiply-add long long by indexed element.
(multiple and single vector): Multi-vector signed by unsigned integer multiply-add long long by vector.
UMLALL: (multiple and indexed vector): Multi-vector unsigned integer multiply-add long long by indexed element.
(multiple and single vector): Multi-vector unsigned integer multiply-add long long by vector.
(multiple vectors): Multi-vector unsigned integer multiply-add long long.
UMLSLL: (multiple and indexed vector): Multi-vector unsigned integer multiply-subtract long long by indexed element.
(multiple and single vector): Multi-vector unsigned integer multiply-subtract long long by vector.
(multiple vectors): Multi-vector unsigned integer multiply-subtract long long.
USMLALL: (multiple and indexed vector): Multi-vector unsigned by signed integer multiply-add long long by indexed element.
(multiple and single vector): Multi-vector unsigned by signed integer multiply-add long long by vector.
(multiple vectors): Multi-vector unsigned by signed integer multiply-add long long.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
It also adds new immediates:
uimm2s4range for off2
uimm1s4range for o1
to represent the vector select offset.
The new operands have the range between the first and the last vector position.
Depends on : D135785
Differential Revision: https://reviews.llvm.org/D136075
Simon Pilgrim [Fri, 28 Oct 2022 16:10:36 +0000 (17:10 +0100)]
[DAG] ExpandIntRes_MINMAX - simplify cases with sufficient number of sign bits
When legalizing a smax/smin/umax/umin op, if we know that the upper half is all sign bits, then we can perform the op on the lower half and then sign extend the result to the upper half.
Alive2: https://alive2.llvm.org/ce/z/rk8Rfd
Fixes #58630
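The fold can be illustrated on a 64-bit value split into 32-bit halves. This is a sketch of the arithmetic only, not the SelectionDAG code, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// True when bits 63..31 are all equal, i.e. the upper half consists only
// of copies of the lower half's sign bit.
bool upper_half_is_sign_bits(std::uint64_t bits) {
  const std::uint64_t top = bits >> 31;
  return top == 0 || top == 0x1FFFFFFFFull;
}

// smax computed on the lower halves, then sign-extended back to 64 bits.
std::int64_t smax_via_low_half(std::int64_t lhs, std::int64_t rhs) {
  assert(upper_half_is_sign_bits(static_cast<std::uint64_t>(lhs)) &&
         upper_half_is_sign_bits(static_cast<std::uint64_t>(rhs)));
  const std::int32_t lo =
      std::max(static_cast<std::int32_t>(lhs), static_cast<std::int32_t>(rhs));
  return lo; // implicit conversion sign-extends
}

// umax works the same way: unsigned compare on the lower halves, then
// sign-extend the winner.
std::uint64_t umax_via_low_half(std::uint64_t lhs, std::uint64_t rhs) {
  assert(upper_half_is_sign_bits(lhs) && upper_half_is_sign_bits(rhs));
  const std::uint32_t lo = std::max(static_cast<std::uint32_t>(lhs),
                                    static_cast<std::uint32_t>(rhs));
  return (lo & 0x80000000u) ? (0xFFFFFFFF00000000ull | lo) : std::uint64_t{lo};
}
```

The unsigned case works because an operand with its sign bit set has an all-ones upper half, so it is larger than any operand with a zero upper half both as a 64-bit value and as a 32-bit lower half.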
Simon Pilgrim [Fri, 28 Oct 2022 13:41:56 +0000 (14:41 +0100)]
[X86] Add basic test coverage for Issue #58630
If we have sufficient sign bits, we should be able to expand the IMINMAX using only the lower half (and then sign-extend the result to the upper half)
Timm Bäder [Fri, 28 Oct 2022 16:06:40 +0000 (18:06 +0200)]
[clang][Interp] Add missing expected test output
Matt Devereau [Fri, 28 Oct 2022 15:56:12 +0000 (15:56 +0000)]
[InstCombine] Add shuffle-binop tests
These tests are a precommit for https://reviews.llvm.org/D135876
Timm Bäder [Fri, 21 Oct 2022 07:10:29 +0000 (09:10 +0200)]
[clang][Interp] Implement inc and dec operators
Differential Revision: https://reviews.llvm.org/D136423
Timm Bäder [Fri, 21 Oct 2022 06:46:33 +0000 (08:46 +0200)]
[clang][Interp][NFC] Use right visit() function
visit (lowercase V) sets DiscardValue to false and calls Visit
(uppercase V). So we can't just call Visit (uppercase V) ourselves,
since then we aren't handling DiscardValue correctly.
This is currently irrelevant but will make a difference later.
Also, the naming isn't my fault and might change later.
Yaxun (Sam) Liu [Thu, 27 Oct 2022 16:04:45 +0000 (12:04 -0400)]
[HIP] add --offload-add-rpath
Add an option --[no-]offload-add-rpath to control whether to
pass -rpath to linker for HIP runtime library. By default it
is off to match gcc/clang behavior for not adding -rpath
for runtime library by default.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D136854
Kevin P. Neal [Fri, 28 Oct 2022 15:37:04 +0000 (11:37 -0400)]
[clang][Docs] Correct typo: "may_trap" is rejected, the value is "maytrap".
Nikita Popov [Fri, 28 Oct 2022 15:29:07 +0000 (17:29 +0200)]
[GVN] Regenerate test checks (NFC)
Craig Topper [Fri, 28 Oct 2022 15:13:35 +0000 (08:13 -0700)]
[RISCV] Adjust RV64I data layout by using n32:64 in layout string
Although i32 type is illegal in the backend, RV64I has pretty good support for i32 types by using W instructions.
By adding n32 to the DataLayout string, middle end optimizations will consider i32 to be a native type. One known effect of this is enabling LoopStrengthReduce on loops with i32 induction variables. This can be beneficial because C/C++ code often has loops with i32 induction variables due to the use of `int` or `unsigned int`.
If this patch exposes performance issues, those are better addressed by tuning LSR or other passes.
Reviewed By: asb, frasercrmck
Differential Revision: https://reviews.llvm.org/D116735
Shilei Tian [Fri, 28 Oct 2022 15:22:07 +0000 (11:22 -0400)]
[NFC][OpenMP] Fix compile warnings introduced by D134396
Momchil Velikov [Fri, 28 Oct 2022 10:30:41 +0000 (11:30 +0100)]
[FuncSpec][NFC] Avoid redundant computations of DominatorTree/LoopInfo
The `FunctionSpecialization` pass needs loop analysis results for its
cost function. For this purpose, it computes the `DominatorTree` and
`LoopInfo` for a function in `getSpecializationBonus`. This function,
however, is called O(number of call sites x number of arguments), but
the DominatorTree/LoopInfo can be computed just once.
This patch plugs into the PassManager infrastructure to obtain
LoopInfo for a function and removes ad-hoc computation from
`getSpecializationBonus`.
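The effect of the change can be approximated with a memoizing wrapper. This is only a stand-in for the caching that the pass manager provides; `AnalysisCache` and `LoopSummary` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical result of an expensive per-function analysis.
struct LoopSummary {
  int loop_count;
};

// Computes the analysis at most once per function and reuses the result
// on every later query.
class AnalysisCache {
public:
  explicit AnalysisCache(std::function<LoopSummary(const std::string &)> compute)
      : compute_(std::move(compute)) {
    assert(compute_ && "a compute callback is required");
  }

  const LoopSummary &get(const std::string &function_name) {
    auto it = cache_.find(function_name);
    if (it == cache_.end()) {
      ++computations_;
      it = cache_.emplace(function_name, compute_(function_name)).first;
    }
    return it->second;
  }

  int computations() const { return computations_; }

private:
  std::function<LoopSummary(const std::string &)> compute_;
  std::map<std::string, LoopSummary> cache_;
  int computations_ = 0;
};
```

However many (call site, argument) pairs query the same function, the expensive computation runs once for it.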
Reviewed By: ChuanqiXu, labrinea
Differential Revision: https://reviews.llvm.org/D136332
Eduard Zingerman [Fri, 28 Oct 2022 14:59:08 +0000 (07:59 -0700)]
[clang][DebugInfo] Emit DISubprogram for extern functions with reserved names
Callsite `DISubprogram` entries are not generated for:
- builtin functions;
- external functions with reserved names (e.g. names starting with "__").
This limitation was added by the commit [1] as a workaround for the
situation described in [2] that triggered the IR verifier error.
The goal of the present commit is to lift this limitation by adjusting
the IR verifier logic.
The logic behind [1] is to avoid the following situation:
- a `DISubprogram` is added for some builtin function;
- there is some location where this builtin is also emitted by a
transformation (w/o debug location);
- the `Verifier::visitCallBase` sees a call to a function with
`DISubprogram` but w/o debug location and emits an error.
Here is an updated example of such a situation, taken from [2]:
```
extern "C" int memcmp(void *, void *, long);
struct a { int b; int c; int d; };
struct e { int f[1000]; };
bool foo(e g, e &h) {
  // DISubprogram for memcmp is created here when [1] is commented out
  return memcmp(&g, &h, sizeof(e));
}
bool bar(a &g, a &h) {
  // memcmp might be generated here by MergeICmps
  return g.b == h.b && g.c == h.c && g.d == h.d;
}
```
This triggers the verifier error when:
- compiled for AArch64:
`clang++ -c -g -Oz -target aarch64-unknown-linux-android21 test.cpp`;
- [1] check is commented out.
Instead of forbidding generation of `DISubprogram` entries as in [1]
one can instead adjust the verifier to additionally check if callee
has a body. Functions w/o bodies cannot be inlined and thus verifier
warning is not necessary.
E.g. `llvm::InlineFunction` requires functions for which
`GlobalValue::isDeclaration() == false`.
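The adjusted rule can be modelled in a few lines. This is a simplified sketch, not the actual `Verifier::visitCallBase` code, which checks further conditions; the `Callee` and `Call` structs and both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical, simplified model of a callee and a call instruction.
struct Callee {
  bool HasDISubprogram; // callee carries a DISubprogram
  bool IsDeclaration;   // callee has no body
};

struct Call {
  Callee Target;
  bool HasDebugLoc; // call instruction has a debug location
};

// Rule as described for the situation in [2]: any call to a function with
// a DISubprogram but without a debug location is rejected.
bool oldRuleRejects(const Call &C) {
  return C.Target.HasDISubprogram && !C.HasDebugLoc;
}

// Adjusted rule: additionally require that the callee has a body, since
// only such functions can be inlined.
bool newRuleRejects(const Call &C) {
  return C.Target.HasDISubprogram && !C.Target.IsDeclaration &&
         !C.HasDebugLoc;
}
```

In this model, a `memcmp` call emitted by a transformation without a debug location is rejected by the old rule but accepted by the new one, because `memcmp` is only a declaration.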
[1] 568db780bb7267651a902da8e85bc59fc89aea70
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=1022296
Differential Revision: https://reviews.llvm.org/D136041
Timm Bäder [Thu, 27 Oct 2022 09:32:31 +0000 (11:32 +0200)]
[clang][Interp] Make sure we free() allocated InitMaps
They get allocated when calling initialize() on a primitive array. And
they get free'd when the array is fully initialized. However, when that
never happens, they get leaked. Fix that by calling the destructor of
global variables.
Differential Revision: https://reviews.llvm.org/D136826
Timm Bäder [Sat, 15 Oct 2022 07:56:26 +0000 (09:56 +0200)]
[clang][Interp] Fix ignoring expression return values
Randomly noticed this. We need to honor DiscardResult here.
Differential Revision: https://reviews.llvm.org/D136013
Timm Bäder [Sat, 15 Oct 2022 07:22:34 +0000 (09:22 +0200)]
[clang][Interp] Fix record members of reference type
When assigning to them, we can't classify the expression type, because
that doesn't contain the right information.
And when reading from them, we need to do the extra deref, just like we
do when reading from a DeclRefExpr.
Differential Revision: https://reviews.llvm.org/D136012
Timm Bäder [Fri, 28 Oct 2022 13:06:24 +0000 (15:06 +0200)]
[clang][Interp] Remove unused getGlobalIdx()
Replace the only use with the version we already use in
VisitDeclRefExpr().
Nicolai Hähnle [Thu, 22 Sep 2022 16:14:45 +0000 (18:14 +0200)]
clang-tblgen build: avoid duplicate inclusion of libLLVMSupport
TableGen executables are supposed to never be linked against libLLVM-*.so,
even when LLVM_LINK_LLVM_DYLIB=ON, presumably for cross-compilation.
It turns out that clang-tblgen *did* link against libLLVM-*.so,
indirectly via clangSupport.
This led to a regression in what should have been unrelated work of
cleaning up ManagedStatics in LLVMSupport. A running clang-tblgen
process ended up with two copies of a cl::opt static global:
- one from libLLVMSupport linked statically into clang-tblgen as a
direct dependency
- one from libLLVMSupport linked into libLLVM-*.so, which clang-tblgen
linked against due to the clangSupport dependency
For a bit more context, see the discussion at
https://discourse.llvm.org/t/flang-aarch64-dylib-buildbot-need-help-understanding-a-regression-in-clang-tblgen/64871/
None of the potential solutions I could find are perfect. Presumably one
possible solution would be to remove "Support" from the explicit
dependencies of clang-tblgen. However, relying on the transitive
inclusion via clangSupport seems risky, and in any case this wouldn't
address the issue of clang-tblgen surprisingly linking against libLLVM-*.so.
This change instead creates a second version of the clangSupport library
that is explicitly linked statically, to be used by clang-tblgen.
v2:
- define an alias so that clang-tblgen can always link against
"clangSupport_tablegen"
- use add_llvm_library(... BUILDTREE_ONLY ...)
v3:
- use the object library version of clangSupport if available
Differential Revision: https://reviews.llvm.org/D134637
Alexander Belyaev [Fri, 28 Oct 2022 13:24:34 +0000 (15:24 +0200)]
[mlir] Rename getInputs->getDpsInputs and getOutputs->getDpsInits in DPS interface.
https://discourse.llvm.org/t/rfc-interface-for-destination-style-ops/64056
Differential Revision: https://reviews.llvm.org/D136943
John Brawn [Fri, 28 Oct 2022 13:38:33 +0000 (14:38 +0100)]
Revert "[MachineCSE] Allow PRE of instructions that read physical registers"
This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a.
This is causing a miscompile in ffmpeg when compiled for armv7.
Sanjay Patel [Fri, 28 Oct 2022 12:41:57 +0000 (08:41 -0400)]
[SDAG] avoid crash from mismatched types in scalar-to-vector fold
This bug was introduced with D136713 / 54eeadcf442df91aed0.
As an enhancement, we could cast operands to the expected type,
but we need to make sure that is done correctly (zext vs. sext).
It's also possible (but seems unlikely) that an operand can have
a type larger than the result type.
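Why the choice of extension matters can be seen on plain integers. The sketch below is not SelectionDAG code; `zeroExtend` and `signExtend` are hypothetical helpers that widen an 8-bit value to 32 bits in the two possible ways.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: the upper bits are filled with zeros.
std::uint32_t zeroExtend(std::uint8_t V) {
  return static_cast<std::uint32_t>(V);
}

// Sign extension: the upper bits are filled with copies of the sign bit.
std::uint32_t signExtend(std::uint8_t V) {
  const auto Signed = static_cast<std::int8_t>(V);
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(Signed));
}
```

The two agree whenever the top bit of the narrow value is clear and differ otherwise, so picking the wrong one silently changes the result.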
Fixes #58661
Qiongsi Wu [Fri, 28 Oct 2022 12:33:57 +0000 (08:33 -0400)]
[clang][LTO] Passing vec-extabi to the Backend on AIX
This patch passes the `vec-extabi` mabi option on to the backend on AIX.
Reviewed By: w2yehia
Differential Revision: https://reviews.llvm.org/D136874
Kadir Cetinkaya [Mon, 24 Oct 2022 15:14:52 +0000 (17:14 +0200)]
[llvm] Fix minimum Apple Clang requirement
This was stated as 9.3, but as pointed out in
https://discourse.llvm.org/t/rfc-bump-minimal-requirements-apple-clang-9-3-10-0-0-before-4th-tue-in-january/66156/7?u=kadircet
9.3 doesn't exist, hence this was effectively 10.0.
This patch merely reflects the reality more closely.
Differential Revision: https://reviews.llvm.org/D136609
Martin Storsjö [Fri, 28 Oct 2022 13:10:17 +0000 (16:10 +0300)]
[clang] Fix a -Wcast-qual GCC warning. NFC.
This fixes the following warning:
../tools/clang/lib/AST/Interp/Disasm.cpp: In member function ‘void clang::interp::Function::dump(llvm::raw_ostream&) const’:
../tools/clang/lib/AST/Interp/Disasm.cpp:43:25: warning: cast from type ‘const clang::interp::Function*’ to type ‘void*’ casts away qualifiers [-Wcast-qual]
43 | OS << " " << (void*)this << ":\n";
| ^~~~