Nico Weber [Fri, 3 Jun 2022 13:28:07 +0000 (09:28 -0400)]
check_clang_tidy.py: Update run line to python3
`python` no longer exists on several systems, and the script
runs under python3 when run as part of lit.
Michał Górny [Wed, 1 Jun 2022 11:00:43 +0000 (13:00 +0200)]
[lldb] [Process/FreeBSD] Do not send SIGSTOP to stopped process
Do not send SIGSTOP when requested to halt a process that's already
stopped. This results in the signal being queued for delivery once
the process is resumed, and unexpectedly stopping it again.
This is necessary for non-stop protocol patches to land.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D126770
Aaron Ballman [Fri, 3 Jun 2022 12:59:00 +0000 (08:59 -0400)]
Correct the behavior of this test for non-Windows targets
This should address build failures like:
https://lab.llvm.org/buildbot/#/builders/188/builds/14980
https://lab.llvm.org/buildbot/#/builders/171/builds/15515
https://lab.llvm.org/buildbot/#/builders/91/builds/9877
Nikita Popov [Fri, 3 Jun 2022 12:36:47 +0000 (14:36 +0200)]
[SCCP] Regenerate test checks with function signature (NFC)
The previous checks were manually modified to avoid the label
clash. Use the --function-signature flag that exists for this
purpose.
Aaron Ballman [Fri, 3 Jun 2022 12:28:16 +0000 (08:28 -0400)]
Updating more entries in the C DR Status page
Adds test coverage or information for ~25 more C DRs.
Nikita Popov [Fri, 3 Jun 2022 12:27:20 +0000 (14:27 +0200)]
[SCCP] Regenerate test checks (NFC)
Hans Wennborg [Fri, 3 Jun 2022 12:23:41 +0000 (14:23 +0200)]
Update old mailing list link in the nullability doc
Benjamin Kramer [Fri, 3 Jun 2022 12:07:56 +0000 (14:07 +0200)]
[VPlan] Silence another unused variable warning in release builds
lewuathe [Fri, 3 Jun 2022 12:04:04 +0000 (14:04 +0200)]
[mlir][complex] Check the correctness of tanh in complex dialect
Correctness check for tanh operation in complex dialect.
Ref: https://reviews.llvm.org/D126858
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D126946
Benjamin Kramer [Fri, 3 Jun 2022 11:59:48 +0000 (13:59 +0200)]
[VPlan] Inline variable into assertion. NFC.
Avoids a warning in release builds
llvm/lib/Transforms/Vectorize/VPlanHCFGBuilder.cpp:311:14: warning: unused variable 'BrCond' [-Wunused-variable]
Value *BrCond = Br->getCondition();
Nico Weber [Fri, 3 Jun 2022 11:49:28 +0000 (07:49 -0400)]
[gn build] (manually) port b94db7ed7eaf (Confusables.inc)
CHIANG, YU-HSUN (Tommy Chiang, oToToT) [Tue, 10 May 2022 01:53:16 +0000 (09:53 +0800)]
[pp-trace] Print HashLoc in InclusionDirective callback
The HashLoc in InclusionDirective callback is an unused parameter.
Since pp-trace is also used as a test of Clang’s PPCallbacks interface,
adding it to the output of pp-trace could avoid some unintended changes to
it.
This should resolve PR52673
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125373
Benjamin Kramer [Fri, 3 Jun 2022 11:25:40 +0000 (13:25 +0200)]
[DAGCombiner] Add bf16 to the matrix of types that we don't promote to integer stores
Remove a few stray semicolons while there.
Paul Walker [Tue, 31 May 2022 09:57:15 +0000 (10:57 +0100)]
[SVE] Refactor sve-bitcast.ll to include all combinations for legal types.
Patch enables custom lowering for MVT::nxv4bf16 because otherwise
the refactored test file triggers a selection failure.
The reason for the refactoring is to highlight cases where the
generated code is wrong.
Florian Hahn [Fri, 3 Jun 2022 11:05:00 +0000 (12:05 +0100)]
[VPlan] Update failing HCFG unit tests after a5bb4a3b4d3db.
Florian Hahn [Fri, 3 Jun 2022 10:47:16 +0000 (11:47 +0100)]
[VPlan] Replace CondBit with BranchOnCond VPInstruction.
This patch removes CondBit and Predicate from VPBasicBlock. To do so,
the patch introduces a new branch-on-cond VPInstruction opcode to model
a branch on a condition explicitly.
This addresses a long-standing TODO/FIXME that blocks shouldn't be users
of VPValues. Those extra users can cause issues for VPValue-based
analyses that don't expect blocks. Addressing this fixme should allow us
to re-introduce 266ea446ab7476.
The generic branch opcode can also be used in follow-up patches.
Depends on D123005.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D126618
Adrian Kuegel [Fri, 3 Jun 2022 10:46:14 +0000 (12:46 +0200)]
[mlir] Fix ClangTidy warning (NFC).
virtual is redundant since the function is already declared 'override'.
David Green [Fri, 3 Jun 2022 10:36:40 +0000 (11:36 +0100)]
[AArch64] Add extra addp codegen tests. NFC
serge-sans-paille [Fri, 15 Oct 2021 13:20:22 +0000 (15:20 +0200)]
[clang-tidy] Confusable identifiers detection
Detect identifiers that are confusable according to Unicode definition
http://www.unicode.org/reports/tr39/#Confusable_Detection
and have conflicting scopes.
Differential Revision: https://reviews.llvm.org/D112916
Kristof Beyls [Fri, 3 Jun 2022 09:24:49 +0000 (11:24 +0200)]
[docs] Fix RST code-block syntax in HowToSetUpLLVMStyleRTTI.rst
LLVM GN Syncbot [Fri, 3 Jun 2022 08:36:05 +0000 (08:36 +0000)]
[gn build] Port a29a1a33ac7b
Martin Boehme [Fri, 3 Jun 2022 08:27:36 +0000 (10:27 +0200)]
[clang-tidy] Add missing close quote in release notes.
Sorry for the breakage.
Nikolas Klauser [Fri, 3 Jun 2022 08:31:30 +0000 (10:31 +0200)]
[libc++] Fix conjunction/disjunction and mark a few LWG issues as complete
Fixes #54803
Fixes #53133
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D125221
Jonas Hahnfeld [Fri, 3 Jun 2022 08:17:10 +0000 (10:17 +0200)]
[cmake] Fix typo in CrossCompile.cmake
Diana Picus [Wed, 25 May 2022 08:42:38 +0000 (08:42 +0000)]
[flang][test-suite] Document need for NO_STOP_MESSAGE environment variable. NFC
When running the llvm-test-suite with flang, we get a lot of failures
because of the output of the `STOP` statement. We can work around them by
setting `NO_STOP_MESSAGE=1` in the environment. This patch adds a few
words about it to the docs about the Fortran part of the llvm-test-suite.
See also https://reviews.llvm.org/D126360
Shraiysh Vaishay [Fri, 3 Jun 2022 07:31:07 +0000 (13:01 +0530)]
[mlir][OpenMP] Add memory_order clause tests
This patch adds tests for memory_order clause for atomic update and
capture operations. This patch also adds a check for making sure that
the operations inside an omp.atomic.capture region do not specify the
memory_order clause.
Reviewed By: kiranchandramohan, peixin
Differential Revision: https://reviews.llvm.org/D126195
Nikita Popov [Thu, 2 Jun 2022 15:09:07 +0000 (17:09 +0200)]
[DAGCombine] Handle promotion of shift with both operands the same
When promoting a shift, make sure we only fetch the second operand
after promoting the first. Load promotion may replace users of the
old load, and we don't want to be left with a dangling reference to
the old load instruction.
The crashing test case is from https://reviews.llvm.org/D126689#3553212.
Differential Revision: https://reviews.llvm.org/D126886
Guillaume Chatelet [Fri, 3 Jun 2022 07:53:36 +0000 (07:53 +0000)]
[NFC] Format CGBuilder.h
Timm Bäder [Fri, 3 Jun 2022 07:42:01 +0000 (09:42 +0200)]
[clang][sema] Remove unused parameter from VerifyBitField
The ZeroWidth parameter is unused in every call site of VerifyBitField.
Martin Boehme [Fri, 3 Jun 2022 07:08:17 +0000 (09:08 +0200)]
[clang-tidy] `bugprone-use-after-move`: Fix handling of moves in lambda captures
Previously, we were treating a move in the lambda capture as if it happened
within the body of the lambda, not within the function that defines the lambda.
This fixes the same bug as https://reviews.llvm.org/D119165 (which it appears
may have been abandoned by the author?) but does so more simply.
Reviewed By: njames93
Differential Revision: https://reviews.llvm.org/D126780
Fangrui Song [Fri, 3 Jun 2022 07:30:34 +0000 (00:30 -0700)]
Revert "[SLP]Improve shuffles cost estimation where possible."
This reverts commit 9980c9971892378ea82475e000de8df210a58e69.
Caused assertion failures: https://reviews.llvm.org/D115462#3555350
Nicolas Vasilache [Fri, 3 Jun 2022 07:13:06 +0000 (07:13 +0000)]
[mlir][SCF] Add bufferization hook for scf.foreach_thread and terminator.
`scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands
and they bufferize to equivalent buffers in the absence of other conflicts.
`scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D126769
Jonas Hahnfeld [Mon, 30 May 2022 18:54:21 +0000 (20:54 +0200)]
[Driver] Add multiarch path for RISC-V
This is required to find headers on the Debian port for RISC-V.
Differential Revision: https://reviews.llvm.org/D126672
Martin Storsjö [Mon, 2 May 2022 21:22:27 +0000 (00:22 +0300)]
[clang] [MSVC] Enable unwind tables for ARM
The backend now can generate working unwind information for this
target.
Improve the existing windows-exceptions.cpp testcase to check for
the state of unwind tables on all MSVC architectures.
Differential Revision: https://reviews.llvm.org/D126862
Martin Storsjö [Thu, 2 Jun 2022 10:17:14 +0000 (13:17 +0300)]
[ARM] Fix restoring stack for varargs with SEH split frame pointer push
Previously, the "add sp, #12" ended up inserted after "bx lr".
Differential Revision: https://reviews.llvm.org/D126872
Timm Bäder [Wed, 18 May 2022 08:31:41 +0000 (10:31 +0200)]
[clang][driver] Dynamically select gcc-toolset/devtoolset
Instead of adding all devtoolset and gcc-toolset prefixes to the list of
prefixes, just scan the /opt/rh/ directory for the one with the highest
version number and only add that one.
Differential Revision: https://reviews.llvm.org/D125862
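A minimal sketch of the selection rule described in the commit above, not the Clang driver's actual code. The helper name `pick_newest_toolset` and the directory-name pattern are assumptions made for illustration; the only idea taken from the commit is "scan the entries and keep the one with the highest version number".

```python
import re
from typing import Iterable, Optional

# Assumed naming scheme for entries under /opt/rh/ (illustrative only).
_TOOLSET_RE = re.compile(r"^(?:gcc-toolset|devtoolset)-(\d+)$")


def pick_newest_toolset(dir_names: Iterable[str]) -> Optional[str]:
    """Return the toolset directory with the highest numeric version, or None."""
    best_name, best_version = None, -1
    for name in dir_names:
        match = _TOOLSET_RE.match(name)
        if match is None:
            continue  # ignore unrelated directories
        version = int(match.group(1))
        if version > best_version:
            best_name, best_version = name, version
    return best_name
```

On a real system the input could come from `os.listdir("/opt/rh")`; only the returned entry would then be added to the prefix list.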
Alexander Batashev [Fri, 3 Jun 2022 06:07:42 +0000 (09:07 +0300)]
[mlir][cf] Implement missing SwitchOp::build function
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D126594
bzcheeseman [Fri, 3 Jun 2022 04:58:14 +0000 (21:58 -0700)]
[LLVM][Docs] Update for HowToSetUpLLVMStyleRTTI.rst, NFC.
This patch updates the document with some advanced use cases and examples on how to set up and use LLVM-style RTTI. It includes a few motivating examples to get readers comfortable with the concepts.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D126943
Max Kazantsev [Fri, 3 Jun 2022 05:31:06 +0000 (12:31 +0700)]
[NFC][MemDep] Remove unnecessary Worklist.clear
This execution path leads to return 'false' where the Worklist
will be deallocated anyways. No need to clear it separately.
Serguei Katkov [Fri, 27 May 2022 07:51:31 +0000 (14:51 +0700)]
[SSAUpdaterImpl] Do not generate phi node with all the same incoming values
If all values available to a basic block are the same, do not build a new phi node and
just use that value.
Reviewed By: sameerds
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D126525
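The rule in the commit above can be sketched as follows. This is a toy model with a hypothetical helper name (`single_incoming_value`), not SSAUpdaterImpl code: predecessors are mapped to the value available from them, and a phi is only needed when those values differ.

```python
from typing import Hashable, Mapping, Optional


def single_incoming_value(incoming: Mapping[str, Hashable]) -> Optional[Hashable]:
    """Map of predecessor block name -> available value.

    Return the common value if every predecessor supplies the same one;
    return None when the values differ (a phi node would be needed) or
    when there are no predecessors.
    """
    values = set(incoming.values())
    if len(values) == 1:
        return next(iter(values))
    return None
```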
Tue Ly [Sun, 8 May 2022 17:47:08 +0000 (13:47 -0400)]
[libc] Automatically add -mfma flag for architectures supporting FMA.
Detect if the architecture supports FMA instructions and if
the targets depend on fma.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D123615
Douglas Chen [Fri, 3 Jun 2022 05:19:03 +0000 (13:19 +0800)]
[M68k] Instruction selection to choose neg x when mul x -1 (Fix issue 48588)
This patch fixes issue 48588 (https://github.com/llvm/llvm-project/issues/48588).
I compared the results of instruction selection between SelectionDAG and FastISel for `%mul = mul i32 %A, 4294967295`:
(seldag-isel) mul --> sub --> SUB32dp
(fast-isel) mul --> sub --> NEG32d
The fix overrides the virtual function M68kDAGToDAGISel::IsProfitableToFold() to return `false` when it tries to match SUB, so that the node matches NEG instead.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D116886
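As a quick check of the arithmetic behind the commit above: 4294967295 is the i32 bit pattern of -1, so multiplying by it is the same as negating modulo 2^32. The helper names below are illustrative and model 32-bit wraparound in plain Python; they are not part of the M68k backend.

```python
MASK32 = 0xFFFFFFFF  # 4294967295, the i32 bit pattern of -1


def mul_i32(a: int, b: int) -> int:
    """32-bit wrapping multiply on unsigned bit patterns."""
    return (a * b) & MASK32


def neg_i32(a: int) -> int:
    """32-bit wrapping negate on an unsigned bit pattern."""
    return (-a) & MASK32


def mul_by_minus_one_is_neg(a: int) -> bool:
    """True when `mul a, 4294967295` and `neg a` agree for this input."""
    return mul_i32(a, 4294967295) == neg_i32(a)
```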
Thomas Raoux [Fri, 3 Jun 2022 04:23:27 +0000 (04:23 +0000)]
[mlir][VectorToGPU] Fix bug generating incorrect ldmatrix ops
ldmatrix transpose can only be used with types that are 16bits wide.
Differential Revision: https://reviews.llvm.org/D126846
Serguei Katkov [Fri, 27 May 2022 05:17:46 +0000 (12:17 +0700)]
[MachineSSAUpdate] Add a test for redundant phi generation.
Thomas Raoux [Wed, 1 Jun 2022 05:42:00 +0000 (05:42 +0000)]
[mlir][scf] Add option to loop pipelining to not peel the epilogue
Add an option to predicate the epilogue within the kernel instead of
peeling the epilogue. This is a useful option to prevent generating
a large amount of code for deep pipelines. This currently requires a user
lambda to implement operation predication.
Differential Revision: https://reviews.llvm.org/D126753
Craig Topper [Fri, 3 Jun 2022 03:49:15 +0000 (20:49 -0700)]
[RISCV] Give CSImm12MulBy4 PatLeaf priority over CSImm12MulBy8. NFC
The immediate range check for CSImm12MulBy8 included some values
covered by CSImm12MulBy4. I assume CSImm12MulBy4 had priority due
to pattern order in the td file, but this makes the priority
explicit in the predicate.
Fangrui Song [Fri, 3 Jun 2022 03:34:52 +0000 (20:34 -0700)]
[llvm-c-test] Default to opaque pointers
Fangrui Song [Fri, 3 Jun 2022 03:27:10 +0000 (20:27 -0700)]
[llvm-c][test] Convert tests to opaque pointers
echo.ll is unchanged to test typed pointers.
River Riddle [Thu, 2 Jun 2022 04:00:49 +0000 (21:00 -0700)]
[mlir][NFC] Simplify the various `parseSourceFile<T>` overloads
These effectively all share the same implementation, i.e. forward
to the non-templated overload and then construct the container op.
Amir Ayupov [Fri, 3 Jun 2022 02:08:59 +0000 (19:08 -0700)]
[BOLT][NFC] Make ICP::verifyProfile static
Follow LLVM style guide suggestion to avoid function definitions in anonymous
namespaces: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124896
Shilei Tian [Fri, 3 Jun 2022 01:50:07 +0000 (21:50 -0400)]
[NFC][Doc] Finish atomic compare
Shilei Tian [Fri, 3 Jun 2022 01:38:12 +0000 (21:38 -0400)]
[Clang][OpenMP] Add the codegen support for `atomic compare capture`
This patch adds the codegen support for `atomic compare capture` in clang.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D120290
Amir Ayupov [Mon, 23 May 2022 04:44:27 +0000 (21:44 -0700)]
[BOLT][DOCS] Add PACKAGE_VERSION to doxygen config
Clang's doxygen documentation specifies LLVM revision. Do the same for BOLT.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126912
Daniil Suchkov [Fri, 3 Jun 2022 00:52:08 +0000 (00:52 +0000)]
Revert "[LoopInterchange] New cost model for loop interchange"
Reverting the commit due to numerous buildbot failures.
This reverts commit 006334470d8d1b5d8f630890336fcb45795749d1.
Mike Rice [Fri, 3 Jun 2022 00:29:54 +0000 (17:29 -0700)]
[OpenMP][NFC] update status for 'omp_all_memory' directive to 'done'
Akira Hatanaka [Fri, 20 May 2022 19:16:29 +0000 (12:16 -0700)]
[Sema] Reject list-initialization of enumeration types from a
brace-init-list containing a single element of a different scoped
enumeration type
It is rejected because it doesn't satisfy the condition that the element
has to be implicitly convertible to the underlying type of the
enumeration.
http://eel.is/c++draft/dcl.init.list#3.8
Differential Revision: https://reviews.llvm.org/D126084
Aart Bik [Thu, 2 Jun 2022 22:11:02 +0000 (15:11 -0700)]
[mlir][python][f16] add ctype python binding support for f16
Similar to complex128/complex64, float16 has no direct support
in the ctypes implementation. This fixes the issue by using a
custom F16 type to change the view in and out of MLIR code.
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D126928
Akira Hatanaka [Fri, 3 Jun 2022 00:02:41 +0000 (17:02 -0700)]
Add a release note for the scoped enum initialization bug fix in
https://reviews.llvm.org/D126084
River Riddle [Thu, 2 Jun 2022 23:35:09 +0000 (16:35 -0700)]
[vscode-mlir] Bump to version 0.8
Since version 0.7 we've added:
* Initial language support for TableGen
* Tweaked syntax highlighting for PDLL
* Added a new command to view intermediate PDLL output
River Riddle [Tue, 3 May 2022 18:43:30 +0000 (11:43 -0700)]
[mlir:PDLL] Add better support for providing Constraint/Pattern/Rewrite documentation
This commit enables providing long-form documentation more seamlessly to the LSP
by revamping decl documentation. For ODS imported constructs, we now also import
descriptions and attach them to decls when possible. For PDLL constructs, the LSP will
now try to provide documentation by parsing the comments directly above the decls
location within the source file. This commit also adds a new parser flag
`enableDocumentation` that gates the import and attachment of ODS documentation,
which is unnecessary in the normal build process (i.e. it should only be used/consumed
by tools).
Differential Revision: https://reviews.llvm.org/D124881
Arjun P [Thu, 2 Jun 2022 23:29:17 +0000 (00:29 +0100)]
[MLIR][Presburger] Simplex: remove redundant member vars nRow, nCol
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D126790
Chia-hung Duan [Thu, 2 Jun 2022 23:23:31 +0000 (23:23 +0000)]
[mlir:MultiOpDriver] Quick fix the assertion position
The assertion should come after the null check.
Nicolas van Kempen [Thu, 2 Jun 2022 21:51:13 +0000 (22:51 +0100)]
[clang-tidy] Add proper emplace checks to modernize-use-emplace
modernize-use-emplace only recommends going from a push_back to an
emplace_back, but does not provide a recommendation when emplace_back is
improperly used. This adds the functionality of warning the user when
an unnecessary temporary is created while calling emplace_back or other "emplacy"
functions from the STL containers.
Reviewed By: kuhar, ivanmurashko
Differential Revision: https://reviews.llvm.org/D101471
Congzhe Cao [Thu, 2 Jun 2022 21:53:13 +0000 (17:53 -0400)]
[LoopInterchange] New cost model for loop interchange
This patch proposes to use a new cost model for loop interchange, which
is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops
[loop0, loop1, loop2, ...] where loop0 should be placed as the outermost
loop, loop1 should be placed one more level inside, and loop2 one more level
inside, etc. What loop cache analysis does is not only more comprehensive than
the current cost model, it is also a "one-shot" query which means that we only
need to query it once during the entire loop interchange pass, which is better
than the current cost model where we query it every time we check whether it is
profitable to interchange two loops. Thus complexity is reduced, especially after
D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections.
Test coverage for loop interchange is not reduced.
For now we do not completely remove the legacy cost model, but keep it as a
fall-back in case the new cost model does not run successfully. This is because
currently we have some limitations in delinearization, which sometimes makes
loop cache analysis bail out. The longer term goal is to enhance delinearization
and eventually remove the legacy cost model completely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
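One way to picture the "one-shot" query described in the commit above is sketched below. It assumes, purely for illustration, that profitability of swapping two loops can be read off the single ordering returned by the analysis; the function names are hypothetical and the pass's real decision logic may differ.

```python
from typing import Dict, Sequence


def build_rank(ordered_loops: Sequence[str]) -> Dict[str, int]:
    """Query the ordering once: position 0 is the desired outermost loop."""
    return {loop: depth for depth, loop in enumerate(ordered_loops)}


def should_interchange(rank: Dict[str, int], outer: str, inner: str) -> bool:
    """Swap only if the current inner loop belongs further out than the outer one."""
    return rank[inner] < rank[outer]
```

Each later profitability check is then a dictionary lookup rather than a fresh cost-model query.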
Paul Pluzhnikov [Thu, 2 Jun 2022 21:58:56 +0000 (17:58 -0400)]
Clean "./" from __FILE__ expansion.
This is an alternative to https://reviews.llvm.org/D121733
and helps with Clang header modules in which __FILE__
may expand to "./foo.h" or "foo.h" depending on whether the file was
included directly or not.
Only do this when UseTargetPathSeparator is true, as we are already
changing the path in that case.
Reviewed By: ayzhao
Differential Revision: https://reviews.llvm.org/D126396
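The effect described in the commit above can be illustrated with a small sketch. It only models the leading "./" case from the commit message and uses a hypothetical helper name; it is not the code Clang runs when expanding __FILE__.

```python
def strip_leading_dot_slash(path: str) -> str:
    """Drop one leading "./" so "./foo.h" and "foo.h" are spelled the same."""
    if path.startswith("./"):
        return path[2:]
    return path
```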
Shilei Tian [Thu, 2 Jun 2022 21:42:50 +0000 (17:42 -0400)]
[Clang][OpenMP] Avoid using `IgnoreImpCasts` if possible
This patch removes all `IgnoreImpCasts` in Sema, and only uses it if necessary. If the expression is not of the same type as the pointer value, a cast is inserted.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D126602
Paul Robinson [Thu, 2 Jun 2022 21:40:52 +0000 (14:40 -0700)]
[PS5] Ignore 'packed' on one-byte bitfields, matching PS4
Mehdi Amini [Thu, 2 Jun 2022 21:24:06 +0000 (21:24 +0000)]
Revert "[mlir] Add integer range inference analysis"
This reverts commit 1350c9887dca5ba80af8e3c1e61b29d6696eb240.
Shared library build is broken with undefined references.
Matt Arsenault [Thu, 2 Jun 2022 18:59:27 +0000 (14:59 -0400)]
AMDGPU: Move SpilledReg from MFI to SIRegisterInfo
This isn't the most natural place for it, but it avoids a circular
include dependency in an out of tree patch.
Julien Pages [Thu, 2 Jun 2022 20:55:39 +0000 (16:55 -0400)]
[AMDGPU] Improve codegen of extractelement/insertelement in some cases
This patch improves the codegen of extractelement and insertelement for vectors
containing 8 elements. Before, a dag combine transformation was generating a
sequence of 8 select/cmp.
This patch changes the upper limit for this transformation and the movrel
instruction will eventually be used instead. Extractelement/insertelement for
vectors containing fewer than 8 elements are unchanged.
Differential Revision: https://reviews.llvm.org/D126389
Alexander Smarus [Wed, 1 Jun 2022 18:10:06 +0000 (18:10 +0000)]
cmake fill `cmake_args` when cross-compiling external project with non-clang compiler
This makes it possible to crosscompile runtimes with cl.exe on Windows.
An external project is completely misconfigured otherwise because
cmake_args is set only for native builds or builds crosscompiled with
clang.
Differential Revision: https://reviews.llvm.org/D122578
Reviewed By: beanz, compnerd
David Blaikie [Sun, 9 May 2021 03:20:52 +0000 (20:20 -0700)]
Support warn_unused_result on typedefs
It's not as robust as using the attribute on enums/classes (the
type information may be lost through a function pointer, a declaration
or use of the underlying type without using the typedef, etc.), but I
think there's still value in being able to attribute a typedef and have
all return types written with that typedef pick up the
warn_unused_result behavior.
Specifically I'd like to be able to annotate LLVMErrorRef (a wrapper for
llvm::Error used in the C API - the underlying type is a raw pointer, so
it can't be attributed itself) to reduce the chance of unhandled errors.
Differential Revision: https://reviews.llvm.org/D102122
Craig Topper [Thu, 2 Jun 2022 20:28:07 +0000 (13:28 -0700)]
[RISCV] Add custom isel for (add X, imm) used by load/stores.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.
This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126576
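The split into (ADDI (ADD X, Imm-Lo12), Lo12) from the commit above relies on Lo12 being the sign-extended low 12 bits of the immediate. The sketch below checks that arithmetic in plain Python; the function names are illustrative and this is not the RISC-V isel code.

```python
from typing import Tuple


def sign_extend_12(value: int) -> int:
    """Interpret the low 12 bits of value as a signed 12-bit number."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def split_add_immediate(imm: int) -> Tuple[int, int]:
    """Return (imm - lo12, lo12) so that the two parts sum back to imm.

    lo12 fits an ADDI or load/store offset; the remainder has its low
    12 bits clear.
    """
    lo12 = sign_extend_12(imm)
    return imm - lo12, lo12
```

For example, 0x12FFF splits into 0x13000 and -1, because the low 12 bits 0xFFF sign-extend to -1.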
Florian Hahn [Thu, 2 Jun 2022 20:43:50 +0000 (21:43 +0100)]
[CaptureTracking] Increase limit and use it for all visited uses.
Currently the MaxUsesToExplore limit only applies to the number of users
per value, not the total number of users to explore.
The current limit of 20 pessimizes IR with opaque pointers in some
cases. Without opaque pointers, we have deeper pointer def-use chains in
general due to extra bitcasts and geps for structs with index 0.
With opaque pointers the def-use chain is not as deep but wider, due to
bitcasts & 0-geps missing.
To improve the situation for opaque pointers, this patch does 2 things:
1. Apply the limit to the total number of uses visited. From the
wording in the description of the option it seems like this may be
the original intention. With the current implementation we could
still end up walking a lot of uses.
2. Increase the limit to 100. This is quite arbitrary, but enables
a good number of additional optimizations.
Those adjustments have a noticeable compile-time impact though. In part
that is likely due to additional transformations (and conversely
the current baseline misses optimizations after switching to opaque
pointers).
This recovers some regressions that showed up after enabling opaque
pointers.
Limit=100:
* NewPM-O3: +0.21%
* NewPM-ReleaseThinLTO: +0.87%
* NewPM-ReleaseLTO-g: +0.46%
https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions
Limit=60:
* NewPM-O3: +0.14%
* NewPM-ReleaseThinLTO: +0.41%
* NewPM-ReleaseLTO-g: +0.21%
https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions
Limit=40:
* NewPM-O3: +0.11%
* NewPM-ReleaseThinLTO: +0.12%
* NewPM-ReleaseLTO-g: +0.09%
https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D126236
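The difference between a per-value limit and a limit on the total number of visited uses, as described in the commit above, can be sketched with a toy def-use graph. The data layout and the name `explore_uses` are assumptions for illustration, not the CaptureTracking implementation.

```python
from collections import deque
from typing import Dict, List


def explore_uses(users: Dict[str, List[str]], root: str, max_uses: int) -> bool:
    """Walk all transitive uses of root under one shared budget.

    Returns True if every use was visited, False if the walk gave up
    because more than max_uses uses would have been visited in total.
    """
    visited = set()
    worklist = deque()

    def add_uses(value: str) -> bool:
        for user in users.get(value, []):
            if user in visited:
                continue
            if len(visited) >= max_uses:  # budget is global, not per value
                return False
            visited.add(user)
            worklist.append(user)
        return True

    if not add_uses(root):
        return False
    while worklist:
        if not add_uses(worklist.popleft()):
            return False
    return True
```

With a wide graph where each value has only three users, a per-value limit of ten would never trigger, while the total budget does once the twelve uses exceed it.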
Fangrui Song [Thu, 2 Jun 2022 20:37:19 +0000 (13:37 -0700)]
[ELF] Remove support for legacy .zdebug sections
.zdebug is unlikely to be used any longer: gcc -gz switched from legacy
.zdebug to SHF_COMPRESSED with binutils 2.26 (2016), which was
several years ago. clang 14 dropped -gz=zlib-gnu support. According to
Debian Code Search (`gz=zlib-gnu`), no project uses -gz=zlib-gnu.
Remove .zdebug support to (a) simplify code and (b) allow removal of llvm-mc's
--compress-debug-sections=zlib-gnu.
In case the old object file `a.o` uses .zdebug, run `objcopy --decompress-debug-sections a.o`
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126793
Fangrui Song [Thu, 2 Jun 2022 20:28:42 +0000 (13:28 -0700)]
[docs] Mention LLVMContext::setOpaquePointers for C++ API
Krzysztof Drewniak [Thu, 2 Jun 2022 19:04:42 +0000 (19:04 +0000)]
[mlir] Add integer range inference analysis
This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.
The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.
The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.
Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.
Depends on D124020
Depends on D124021
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D124023
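To illustrate why the commit above tracks signed and unsigned bounds separately, here is a toy 8-bit model of range inference for addition. `IntRange` and `infer_add` are hypothetical names and do not mirror the InferIntRangeInterface or ConstantIntRanges APIs; on possible overflow the sketch conservatively falls back to the full range.

```python
from dataclasses import dataclass

BITS = 8
UMAX = (1 << BITS) - 1
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


@dataclass(frozen=True)
class IntRange:
    umin: int  # unsigned lower bound
    umax: int  # unsigned upper bound
    smin: int  # signed lower bound
    smax: int  # signed upper bound


def infer_add(lhs: IntRange, rhs: IntRange) -> IntRange:
    """Bounds for lhs + rhs, widening each interpretation independently."""
    umin, umax = lhs.umin + rhs.umin, lhs.umax + rhs.umax
    if umax > UMAX:  # unsigned wraparound is possible
        umin, umax = 0, UMAX
    smin, smax = lhs.smin + rhs.smin, lhs.smax + rhs.smax
    if smin < SMIN or smax > SMAX:  # signed overflow is possible
        smin, smax = SMIN, SMAX
    return IntRange(umin, umax, smin, smax)
```

Adding a value in [100, 120] to one in [10, 20] stays within [110, 140] as an unsigned 8-bit number, but may exceed 127 as a signed one, so only the signed bounds are widened.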
Mingming Liu [Wed, 1 Jun 2022 15:53:31 +0000 (08:53 -0700)]
[Inline][Remark][NFC] Optionally provide inline context to inline
advisor.
This patch has no functional change and is merely a preparation patch for
the main functional change. The motivating use case is to annotate inline
remark pass name with context information (e.g. prelink or postlink,
CGSCC or always-inliner), see D125495 for more details.
Differential Revision: https://reviews.llvm.org/D126824
Adrian Prantl [Thu, 2 Jun 2022 20:05:33 +0000 (13:05 -0700)]
Adapt IRForTarget::RewriteObjCConstStrings() for D126689.
With opaque pointers, the LLVM IR expected by this function changed.
Philip Reames [Thu, 2 Jun 2022 20:04:13 +0000 (13:04 -0700)]
[RISCV] Inline one copy of needVSETVLI into the other [NFC]
Calling the non-MI version directly was unsound (as fixed in
dcdb0bf2), so remove that version to decrease likelihood of future mistakes.
Xiang Li [Thu, 2 Jun 2022 07:25:12 +0000 (00:25 -0700)]
[HLSL] Add WaveActiveCountBits as Language builtin function for HLSL
One clang builtin is introduced:
uint WaveActiveCountBits( bool bBit ) as a Language builtin function for HLSL.
Details of WaveActiveCountBits are at
https://github.com/microsoft/DirectXShaderCompiler/wiki/Wave-Intrinsics#uint-waveactivecountbits-bool-bbit-
This is only the clang part of the change, which makes WaveActiveCountBits available in the AST.
The llvm intrinsic for WaveActiveCountBits will be added in a separate PR.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D126857
Sanjay Patel [Thu, 2 Jun 2022 18:02:11 +0000 (14:02 -0400)]
[InstCombine] add tests for mul with low-bit mask operand; NFC
Sanjay Patel [Thu, 2 Jun 2022 17:03:37 +0000 (13:03 -0400)]
[InstCombine] make pattern matching more consistent; NFC
We could go either way on this and several similar matches.
Just matching as a binop is possibly slightly more efficient;
we don't need to re-confirm the opcode of the instruction.
Maksim Panchenko [Thu, 2 Jun 2022 01:18:54 +0000 (18:18 -0700)]
[BOLT][NFC] Fix braces in BinaryEmitter
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126844
Craig Topper [Thu, 2 Jun 2022 19:25:40 +0000 (12:25 -0700)]
[DAGCombiner][RISCV] Improve computeKnownBits for (smax X, C) where C is non-negative.
If C is non-negative, the result of the smax must also be
non-negative, so all sign bits of the result are 0.
This allows DAGCombiner to remove a zext_inreg in the modified test.
This zext_inreg started as a sext that became zext before type
legalization then was promoted to a zext_inreg.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126896
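The known-bits fact used in the commit above can be checked exhaustively for 8-bit values. This is an illustrative model with hypothetical helper names, not DAGCombiner code: if C is non-negative, smax(X, C) is at least C, so its sign bit is clear.

```python
def smax_i8(x: int, c: int) -> int:
    """Signed max of two i8 values, returned as an 8-bit pattern."""
    return max(x, c) & 0xFF


def sign_bit_is_zero(bits: int) -> bool:
    """True when bit 7 of an 8-bit pattern is clear."""
    return (bits & 0x80) == 0
```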
Joe Loser [Thu, 2 Jun 2022 18:11:06 +0000 (12:11 -0600)]
[libc++][test] Fix unused variable warning in string_view tests
In 6423a9f0ec8ba70049ea76e7bcfc9a9d1a54e826, I accidentally thought this was
getting tested, but these variables are unused. Just remove the lines instead of
leaving them commented out.
Differential Revision: https://reviews.llvm.org/D126901
Daniel Douglas [Thu, 2 Jun 2022 19:04:58 +0000 (14:04 -0500)]
[OpenMP][libomp] do not try to dlopen libmemkind on macOS
The memkind library is only available for Linux. Calling dlopen here
can also be problematic in a client app that forked.
Differential Revision: https://reviews.llvm.org/D126579
Paul Robinson [Thu, 2 Jun 2022 19:25:48 +0000 (12:25 -0700)]
[PS5] Pack non-POD members in packed structs, matching PS4 ABI
Paul Robinson [Thu, 2 Jun 2022 19:14:54 +0000 (12:14 -0700)]
[PS5] Apply 'packed' attribute to base classes, matching PS4 ABI
Florian Hahn [Thu, 2 Jun 2022 18:47:43 +0000 (19:47 +0100)]
[GVN] Add test for capture tracking use limit.
Test for capture-tracking-max-uses-to-explore, adjusted in D126236.
Aart Bik [Thu, 2 Jun 2022 17:32:04 +0000 (10:32 -0700)]
[mlir][sparse][bufferization] fix doc on new init operation
The example was still using the -now- removed sparse_tensor.init_tensor.
Also, I made the input operands of the matrix multiplication sparse too
(since it looks a bit strange to multiply two dense matrices into a sparse).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D126897
Adrian Prantl [Thu, 2 Jun 2022 18:41:33 +0000 (11:41 -0700)]
Adapt IRForTarget::RewriteObjCSelector() for D126689.
With opaque pointers, the LLVM IR expected by this function changed.
Joe Nash [Mon, 16 May 2022 19:19:31 +0000 (15:19 -0400)]
[AMDGPU] gfx11 vop3 and inherited vop instructions
This patch includes MC layer support for VOP3 encoded instructions and generic VOP support
classes.
Some VOP1 and VOP2 instructions which share an encoding with gfx10 and are using
the AssemblerPredicate = isGFX10Plus are also enabled. That predicate
will be changed to isGFX10Only in a later patch.
Patch 15/N for upstreaming of AMDGPU gfx11 architecture.
Depends on D126468
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D126475
Chia-hung Duan [Thu, 2 Jun 2022 18:27:36 +0000 (18:27 +0000)]
[mlir:MultiOpDriver] Don't add ops which are not in the allowed list
In strict mode, only newly inserted operations are allowed to be added to the
worklist. Before this change, the driver would add the users of a replaced op
without checking whether those users were allowed to be pushed onto the
worklist.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D126899
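The check added by the commit above amounts to filtering before pushing onto the worklist. The sketch below uses plain strings for operations and a hypothetical function name; it is not the MultiOpDriver code.

```python
from typing import Iterable, List, Set


def push_allowed_users(worklist: List[str], users: Iterable[str], allowed: Set[str]) -> None:
    """Append only those users of a replaced op that strict mode permits."""
    for op in users:
        if op in allowed:
            worklist.append(op)
```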
Alexey Bataev [Thu, 9 Dec 2021 18:34:08 +0000 (10:34 -0800)]
[SLP]Improve shuffles cost estimation where possible.
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.
Differential Revision: https://reviews.llvm.org/D115462
Anders Waldenborg [Mon, 9 May 2022 06:11:34 +0000 (08:11 +0200)]
scan-build-py: Change scripts to explicitly require python3
The "#!" lines in all scan-build-py scripts were using just bare
"/usr/bin/python" which according to PEP-0394 can be either python3,
python2 or not exist at all.
E.g. in the latest Debian and Ubuntu releases "/usr/bin/python" does not
exist at all by default and the user must install the python-is-python2 or
python-is-python3 package to get the bare, versionless "python"
command.
Until recently (70b06fe8a186 "scan-build-py: Force the opening in utf-8"
changed "libscanbuild") these scripts worked in both python2 and
python3, but now they (rightfully) are python3 only, and broke on
systems where the "python" command means python2.
By changing the "#!" to be "python3" it is not only explicit that the
scripts require python3, they also work on systems where the "python" command
is python2 or nonexistent.
Differential Revision: https://reviews.llvm.org/D126804
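A small sketch of how one might spot the problematic "#!" lines described in the commit above. The helper name is hypothetical and the parsing is deliberately simple (it does not handle `env` options); it is not part of scan-build-py.

```python
def uses_bare_python(first_line: str) -> bool:
    """True if a script's first line invokes the unversioned "python" command."""
    if not first_line.startswith("#!"):
        return False
    parts = first_line[2:].split()
    if not parts:
        return False
    interpreter = parts[0].rsplit("/", 1)[-1]
    if interpreter == "env" and len(parts) > 1:
        interpreter = parts[1]  # "#!/usr/bin/env python" style
    return interpreter == "python"
```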
Paul Pluzhnikov [Thu, 2 Jun 2022 17:30:24 +0000 (10:30 -0700)]
Fix a buglet in remove_dots().
The function promises to canonicalize the path, but neglected to do so
for the root component.
For example, calling remove_dots("/tmp/foo.c", Style::windows_backslash)
resulted in "/tmp\foo.c". Now it produces "\tmp\foo.c".
Also fix FIXME in the corresponding test.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D126412
Joe Nash [Thu, 12 May 2022 17:32:19 +0000 (13:32 -0400)]
[AMDGPU] gfx11 ds instructions
MC layer support for ds instructions
Contributors:
Piotr Sobczak <Piotr.Sobczak@amd.com>
Patch 14/N for upstreaming of AMDGPU gfx11 architecture.
Depends on D126463
Reviewed By: arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D126468
Paul Robinson [Thu, 2 Jun 2022 18:00:32 +0000 (11:00 -0700)]
[PS5] Make passing unions in registers match PS4 ABI
Paul Robinson [Thu, 2 Jun 2022 17:53:40 +0000 (10:53 -0700)]
[PS5] Classify __m64 as integer, matching PS4 ABI