Valentin Clement [Mon, 24 Jul 2023 16:33:09 +0000 (09:33 -0700)]
[flang][openacc] Update materialization recipe for private copy in reduction init region
Update the code generated in the init region to materialize the private
copy.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D155882
Alexey Bataev [Mon, 10 Jul 2023 20:49:26 +0000 (13:49 -0700)]
[SLP]Check scalars before trying scheduling.
Need to check the scalars if they can be vectorized before trying to
schedule them. It may save compile time and improve vectorization on
large functions/basic blocks.
Differential Revision: https://reviews.llvm.org/D154891
Owen Pan [Sun, 23 Jul 2023 20:45:22 +0000 (13:45 -0700)]
[clang-format] Insert namespace comments with leading spaces
Insert missing namespace comments with SpacesBeforeTrailingComments
leading spaces.
Fixes #64051.
Differential Revision: https://reviews.llvm.org/D156065
Matt Arsenault [Tue, 27 Jun 2023 14:16:23 +0000 (10:16 -0400)]
RegisterCoalescer: Fix empty subrange verifier error
In this example an implicit def had live-out undef subrange
defs. After coalescing with the def from a previous block, the
undef-defed lanes are no longer live out of the block in the new
interval. An empty subrange was tentatively created for these lanes,
but it must be deleted.
Matt Arsenault [Wed, 21 Jun 2023 12:36:22 +0000 (08:36 -0400)]
RegisterCoalescer: Fix verifier error on redef of subregister for live out implicit_defs
A live out implicit_def wasn't deleted, but the subranges weren't
correctly updated. The main range was correct but the def
corresponding to the initial main range def instruction was missing
from the lanes redefined in another block.
The written lanes are not quite the same as the valid lanes in the
case of an implicit_def.
Fixes verifier error in blender. There is an additional verifier error in
some of the testcase variants where an empty subrange remains.
Corentin Jabot [Mon, 10 Jul 2023 06:54:15 +0000 (08:54 +0200)]
[Clang] Fix consteval propagation for aggregates and defaulted constructors
This patch does a few things:
* Fix aggregate initialization.
When an aggregate has an initializer that is immediate-escalating,
the context in which it is used automatically becomes an immediate function.
The wording does that by pretending an aggregate initialization is itself
an invocation, which is not really how clang works, so my previous attempt
was... wrong.
* Fix initialization of defaulted constructors with immediate escalating
default member initializers.
The wording was silent about that case and I did not handle it fully
https://cplusplus.github.io/CWG/issues/2760.html
* Fix diagnostics
In some cases clang would produce additional and unhelpful
diagnostics by listing the invalid references to consteval
functions that appear in immediate escalating functions
Fixes https://github.com/llvm/llvm-project/issues/63742
Reviewed By: aaron.ballman, #clang-language-wg, Fznamznon
Differential Revision: https://reviews.llvm.org/D155175
Lang Hames [Mon, 24 Jul 2023 00:56:49 +0000 (17:56 -0700)]
[llvm-jitlink] Don't return immediately in -noexec mode, just skip execution.
Skipping execution rather than bailing out early means that:
1. Explicit teardown of JIT'd code will happen at the same point (via the call
to ExecutionSession::endSession) regardless of whether -noexec is used.
2. The -show-times option will work with -noexec.
Lang Hames [Thu, 20 Jul 2023 16:32:38 +0000 (09:32 -0700)]
[llvm-jitlink] Move statistics code into a separate file.
Further isolates statistics gathering / reporting code from the rest of llvm-jitlink.
Craig Topper [Mon, 24 Jul 2023 16:05:43 +0000 (09:05 -0700)]
[RISCV] Remove combineCmpOp and associated code. NFCI
This code was originally added in D134277. This transform is now
available in target independent DAG combine after D153502.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D156075
Alex Bradbury [Mon, 24 Jul 2023 15:58:21 +0000 (16:58 +0100)]
Revert "[clang][RISCV] Fix ABI handling of empty structs with hard FP calling conventions in C++"
This reverts commit 17a58b3ca7ec18585e9ea8ed8b39d72fe36fb6cb and the
minor documentation fix 569e99a471f618b7fdf045d5e96f21d3e3a7f898.
An issue was reported in https://reviews.llvm.org/D142327#inline-1510301
so reverting until it can be investigated and fixed.
Matt Arsenault [Sun, 2 Jul 2023 00:28:48 +0000 (20:28 -0400)]
AMDGPU: Implement combineRepeatedFPDivisors
Yuanqiang Liu [Mon, 24 Jul 2023 15:00:34 +0000 (17:00 +0200)]
[NVPTX] Expand select_cc on bfloat16 type
Expand select_cc on bfloat16 and bfloat16v2 type.
Differential Revision: https://reviews.llvm.org/D156085
Nikita Popov [Mon, 24 Jul 2023 14:57:57 +0000 (16:57 +0200)]
[ConstantFolding] Avoid use of ConstantExpr::getOr() (NFC)
Constant folding cannot fail here, because we're really working
on plain integers. It might be better to make all of this work
on APInts instead of Constants.
Sander de Smalen [Mon, 24 Jul 2023 14:47:19 +0000 (14:47 +0000)]
[AArch64] Ignore instructions not supported by CPU in AArch64SVESchedPseudoTest
When adding new Pseudos for instructions that are not supported
by the CPU for which the scheduler model is being tested, the test fails
if these pseudos are not covered by the regexes in the scheduling model.
Rather than failing, this test should check that the CPU supports the
original instruction modelled by the pseudo. If not, the pseudo is
not relevant to the scheduling model being tested.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D156094
Nikita Popov [Mon, 24 Jul 2023 14:45:32 +0000 (16:45 +0200)]
[InstCombine] Avoid uses of ConstantExpr::getOr()
Replace these with IRBuilder uses, as we don't (from a type
perspective) care about Constant results.
Switch the predicate to m_ImmConstant() instead of isa<Constant>
to guarantee that these do get folded away and our assumptions
about simplifications hold true.
Craig Topper [Mon, 24 Jul 2023 14:43:01 +0000 (07:43 -0700)]
[RISCV] Add CZERO_EQZ/CZERO_NEZ to ComputeNumSignBitsForTargetNode.
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D156082
Craig Topper [Mon, 24 Jul 2023 14:42:45 +0000 (07:42 -0700)]
[RISCV] Add test case for D156082 to condops.ll
This test is copied from select-cc.ll. It wasn't worth adding
Zicond RUN lines to that file.
Reviewed By: asb, wangpc
Differential Revision: https://reviews.llvm.org/D156083
Craig Topper [Mon, 24 Jul 2023 14:18:22 +0000 (07:18 -0700)]
[RISCV] Add CZERO_EQZ/CZERO_NEZ to computeKnownBitsForTargetNode.
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D156081
Sander de Smalen [Mon, 24 Jul 2023 13:57:12 +0000 (13:57 +0000)]
[Clang][AArch64] svldr_vnum/svstr_vnum should use cntsb iso vscale for the offset
The specification for LDR/STR says that:
The ZA array vector is selected by the sum of the vector select register
and immediate offset, modulo the number of bytes in a Streaming SVE
vector. [..] This instruction does not require the PE to be in Streaming
SVE mode
When the instruction is used outside of streaming mode, 'vscale' will result
in the wrong value being used for the offset because LLVM's code-generator
will emit the non-streaming 'RDVL/ADDVL' instead of the 'RDSVL/ADDSVL'
instructions which are used to get the Streaming-SVE vector length.
Reviewed By: bryanpkc
Differential Revision: https://reviews.llvm.org/D156121
Lorenzo Chelini [Mon, 24 Jul 2023 09:01:36 +0000 (11:01 +0200)]
[MLIR][Linalg] Move AggregatedOpInterface in linalg namespace (NFC)
For now, the interface is specific to linalg only.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D156091
Pravin Jagtap [Mon, 24 Jul 2023 14:19:36 +0000 (10:19 -0400)]
[AMDGPU] Fix llvm.amdgcn.wave.reduce.umax/umin MIR tests
Fixes the MIR tests reported in https://lab.llvm.org/buildbot/#/builders/16/builds/51955
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D156125
Simon Pilgrim [Mon, 24 Jul 2023 14:11:15 +0000 (15:11 +0100)]
[X86] fpclamptosat.ll - add nounwind to get rid of cfi noise
Helps cleanup D150372
Quinn Dawkins [Sun, 23 Jul 2023 21:02:14 +0000 (17:02 -0400)]
[mlir][Transform] Allow printing inside matchers
Enables printf style debugging of matchers through `transform.print`
within the body of a matcher.
Differential Revision: https://reviews.llvm.org/D156078
David Green [Mon, 24 Jul 2023 13:55:38 +0000 (14:55 +0100)]
[AArch64] Add vselect(fmin/fmax) SVE patterns
For both minnum/maxnum and minimum/maximum, this adds tablegen patterns for
vselect(fmin/fmax), creating predicated fminnm/fmaxnm/fmin/fmax nodes.
Differential Revision: https://reviews.llvm.org/D155872
David Green [Fri, 21 Jul 2023 07:50:27 +0000 (08:50 +0100)]
[AArch64] Extra testing for vselect(fmin/fmax) patterns. NFC
See D155872.
Simon Pilgrim [Mon, 24 Jul 2023 13:50:10 +0000 (14:50 +0100)]
[X86] combineConcatVectorOps - add concat(ctpop)/concat(ctlz)/concat(cttz) handling
Simon Pilgrim [Mon, 24 Jul 2023 13:41:26 +0000 (14:41 +0100)]
[X86] Add some basic concat(ctpop)/concat(ctlz)/concat(cttz) widening tests
Zain Jaffal [Mon, 24 Jul 2023 13:46:48 +0000 (14:46 +0100)]
[Remark] Overload `<<` for Remark, RemarkType and RemarkLocation.
Represent different remark concepts as strings by overloading the `<<`
operator.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D155058
Jie Fu [Mon, 24 Jul 2023 13:35:52 +0000 (21:35 +0800)]
[clang][dataflow] Fix build failure due to -Wunused-variable in DataflowEnvironment.cpp (NFC)
/data/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:125:11: error: unused variable 'StructVal2' [-Werror,-Wunused-variable]
auto *StructVal2 = cast<StructValue>(&Val2);
^
1 error generated.
Joseph Huber [Mon, 24 Jul 2023 13:28:16 +0000 (08:28 -0500)]
[OpenMP] Make the nested parallelism global hidden
Summary:
These will probably be removed with the kernel environment, but they
should have hidden visibility so they can be optimized out.
pvanhout [Mon, 24 Jul 2023 13:18:06 +0000 (15:18 +0200)]
[GlobalISel] Fix GIM_CheckIsSameOperandIgnoreCopies
If the MI had more than one def, it incorrectly returned true.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D156119
David Spickett [Mon, 24 Jul 2023 13:23:30 +0000 (13:23 +0000)]
[lldb] Remove Windows XFAIL for TestDollarInVariable.py
Since 5d66f9fd8e97c05a5dba317d3ad2566e61ead1ff this test has been
unexpectedly passing on Linaro's Windows on Arm lldb bot:
https://lab.llvm.org/buildbot/#/builders/219/builds/4320
I can't explain exactly how that happened, but I do see a bunch
of QEnvironment packets going by in that test. It is very likely
that the order would have been different on Windows.
Indeed, when it was xfailed back in df9051e7cfda5519f4584cda22e9ef2006517e94
the reason was not known either.
Martin Braenne [Thu, 20 Jul 2023 11:12:58 +0000 (11:12 +0000)]
[clang][dataflow] Remove checks that test for consistency between `StructValue` and `AggregateStorageLocation`.
Now that the redundancy between these two classes has been eliminated, these
checks aren't needed any more.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155813
Martin Braenne [Thu, 20 Jul 2023 11:12:39 +0000 (11:12 +0000)]
[clang][dataflow] Eliminate duplication between `AggregateStorageLocation` and `StructValue`.
After this change, `StructValue` is just a wrapper for an `AggregateStorageLocation`. For the wider context, see https://discourse.llvm.org/t/70086.
## How to review
- Start by looking at the comments added / changed in Value.h, StorageLocation.h,
and DataflowEnvironment.h. This will give you a good overview of the semantic
changes.
- Look at the corresponding .cpp files that implement the semantic changes.
- Transfer.cpp, TypeErasedDataflowAnalysis.cpp, and RecordOps.cpp show how the
core of the framework is affected by the semantic changes.
- UncheckedOptionalAccessModel.cpp shows how this complex model is affected by
the changes.
- Many of the changes in the rest of the patch are mechanical in nature.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155446
Florian Hahn [Mon, 24 Jul 2023 13:17:17 +0000 (15:17 +0200)]
[LAA] Add assertion to check both Start and End are invariant (NFC).
Add extra assert to check invariant of RuntimePointerChecking::insert to
guard against subtle changes when extending the scope of LAA.
Guray Ozen [Thu, 20 Jul 2023 15:28:51 +0000 (17:28 +0200)]
[llvm][nvptx] Add sm_90a
This work adds `sm_90a` as an NVPTX target. `sm_90a` is required to generate wgmma and setmaxnreg instructions.
Here is the information about the "a" suffix from the PTX documentation:
Target architectures with suffix “a”, such as sm_90a, include architecture-accelerated features that are supported on the specified architecture only, hence such targets do not follow the onion layer model. Therefore, PTX code generated for such targets cannot be run on later generation devices. Architecture-accelerated features can only be used with targets that support these features.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D155851
Guray Ozen [Mon, 24 Jul 2023 10:51:22 +0000 (12:51 +0200)]
[mlir][gpu] Increase default SM version from 35 to 50
The current default SM version is 35, but it was deprecated a long time ago. D155563 introduced ptxas compilation, and using sm_35 causes failures on the buildbot. This change increases the default SM version to 50.
Differential Revision: https://reviews.llvm.org/D156098
Guray Ozen [Mon, 24 Jul 2023 10:45:04 +0000 (12:45 +0200)]
[mlir][gpu] Fallback to JIT compilation
A recent change introduced compilation with the ptxas compiler. That change is important for being able to use different versions of the ptxas compiler without changing the compiler.
It causes some failures on the buildbot. This change adds a fallback mechanism to JIT compilation, which is the original path.
Differential Revision: https://reviews.llvm.org/D156096
Tony Tao [Mon, 24 Jul 2023 13:02:59 +0000 (09:02 -0400)]
[SystemZ][z/OS] Add OpenFlags to CreateMissingDirectories path when creating temp files
Additional patch to https://reviews.llvm.org/D103806 to add the same flags in the path where the CreateMissingDirectories boolean is set to true.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D155651
Simon Pilgrim [Mon, 24 Jul 2023 12:49:45 +0000 (13:49 +0100)]
[X86] combineConcatVectorOps - add basic concat(unpack(x,y),unpack(z,w)) -> unpack(concat(x,z),concat(y,w)) handling
Very limited support as we don't want to interfere with build_vector patterns
Podchishchaeva, Mariya [Mon, 24 Jul 2023 12:31:47 +0000 (05:31 -0700)]
[NFC][clang] Fix static analyzer concerns
HeaderIncludesCallback and HeaderIncludesJSONCallback classes may own
resources and free them in the destructor. However, they don't have
user-written copy constructors/assignment operators, so an attempt to copy a
HeaderIncludesCallback object will use the compiler-generated copy
constructor, which only does a member-wise copy, and afterwards there will be
use-after-free issues.
Reviewed By: aaron.ballman, tahonermann
Differential Revision: https://reviews.llvm.org/D155842
John Brawn [Mon, 24 Jul 2023 12:29:22 +0000 (13:29 +0100)]
Revert "[Sema] Fix handling of functions that hide classes"
This reverts commit dfca88341794eec88c5009a93c569172fff62635.
Causes clang/test/Modules/stress1.cpp to fail.
Podchishchaeva, Mariya [Mon, 24 Jul 2023 12:15:40 +0000 (05:15 -0700)]
[NFC][clang] Fix static analyzer concerns
OMPTransformDirectiveScopeRAII doesn't have user-written copy
constructor/assignment operator but it frees memory in the destructor.
Delete these members since it doesn't seem that OMPTransformDirectiveScopeRAII
objects are intended to be copied.
Reviewed By: tahonermann, ABataev
Differential Revision: https://reviews.llvm.org/D155849
John Brawn [Wed, 28 Jun 2023 09:31:38 +0000 (10:31 +0100)]
[Sema] Fix handling of functions that hide classes
When a function is declared in the same scope as a class with the same
name then the function hides that class. Currently this is done by a
single check after the main loop in LookupResult::resolveKind, but
this can give the wrong result when we have a using declaration in
multiple namespace scopes in two different ways:
* When the using declaration is hidden in one namespace but not the
other we can end up considering only the hidden one when deciding
if the result is ambiguous, causing an incorrect "not ambiguous"
result.
* When two classes with the same name in different namespace scopes
are both hidden by using declarations this can result in
incorrectly deciding the result is ambiguous. There's currently a
comment saying this is expected, but I don't think that's correct.
Solve this by checking each Decl to see if it's hidden by some other
Decl in the same scope. This means we have to delay removing anything
from Decls until after the main loop, in case a Decl is hidden by
another that is removed due to being non-unique.
Differential Revision: https://reviews.llvm.org/D154503
Sander de Smalen [Fri, 14 Jul 2023 08:45:10 +0000 (09:45 +0100)]
[AArch64] NFC: Move fadda tests to separate file.
We want to test the fadda tests with 'streaming-compatible' flags,
such that we can ensure no 'fadda' (not valid in streaming mode) is
generated.
Sander de Smalen [Thu, 13 Jul 2023 09:45:22 +0000 (10:45 +0100)]
[AArch64][SME] NFC: Pass target feature on RUN line, instead of function attribute.
This is anticipating adding new RUN lines testing for +sme, alongside +sve/+sve2.
Joseph Huber [Sat, 22 Jul 2023 02:30:36 +0000 (21:30 -0500)]
[NVPTX] Fix lack of `.noreturn` on certain functions for aliases
Forgot to include this special handling on the declaration of the alias
function.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D156012
Aaron Ballman [Mon, 24 Jul 2023 11:32:48 +0000 (07:32 -0400)]
Fix the Clang sphinx build
This addresses issues found by:
https://lab.llvm.org/buildbot/#/builders/92/builds/47783
XChy [Mon, 24 Jul 2023 11:04:32 +0000 (13:04 +0200)]
[InstCombine] Transform bitwise (A >> C - 1, zext(icmp)) -> zext (bitwise(A < 0, icmp))
This extends foldCastedBitwiseLogic to handle the similar cases.
I have recently submitted a patch to implement a single fold like:
(A > 0) | (A < 0) -> zext (A != 0)
But it is not general enough, and some problems like
a < b & a >= b - 1 happen again.
So I generalize this fold by matching the pattern
bitwise(A >> C - 1, zext(icmp)), and replace A >> C - 1 with
zext(A < 0) here. (C is the scalar size bits of the type of A.)
Then we get bitwise(zext(A < 0), zext(icmp)), this will be folded
by original code in foldCastedBitwiseLogic, into
zext(bitwise(A < 0, icmp)). And finally, any related icmp fold will
be automatically implemented because bitwise(icmp,icmp) had been
implemented.
The proof of the correctness is obvious, because the folds below
were previously proved and implemented.
A >> C - 1 -> zext(A < 0)
bitwise(zext(A), zext(B)) -> zext(bitwise(A, B))
And the fold of this patch is the combination of folds above.
Fixes https://github.com/llvm/llvm-project/issues/63751.
Differential Revision: https://reviews.llvm.org/D154791
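The combination of the two folds above can be checked exhaustively on a small width. The following is only a sketch in Python, not the InstCombine code: it assumes `A >> C - 1` is a logical shift, uses a hypothetical 8-bit type, and models the `icmp` result as 0 or 1. All names (`BITS`, `lshr_sign_bit`, `fold_is_sound`) are made up for illustration.

```python
BITS = 8  # hypothetical width, small enough to check every value
MASK = (1 << BITS) - 1


def is_negative(a: int) -> int:
    """icmp slt A, 0 for a BITS-wide two's-complement value, as 0 or 1."""
    return 1 if a & (1 << (BITS - 1)) else 0


def lshr_sign_bit(a: int) -> int:
    """A >> (C - 1) as a logical shift, where C == BITS (assumption)."""
    return (a & MASK) >> (BITS - 1)


BITWISE_OPS = {
    "and": lambda x, y: x & y,
    "or": lambda x, y: x | y,
    "xor": lambda x, y: x ^ y,
}


def fold_is_sound() -> bool:
    """Compare bitwise(A >> C - 1, zext(cond)) with zext(bitwise(A < 0, cond))."""
    for op in BITWISE_OPS.values():
        for a in range(1 << BITS):
            for cond in (0, 1):  # cond models the i1 result of an icmp
                before = op(lshr_sign_bit(a), cond)  # zext(cond) is cond itself
                after = op(is_negative(a), cond)  # zext of an i1 is 0 or 1
                if before != after:
                    return False
    return True
```

Calling `fold_is_sound()` walks every 8-bit value of A for and/or/xor; it says nothing about wider types beyond the two proved folds listed above.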
XChy [Mon, 24 Jul 2023 11:02:40 +0000 (13:02 +0200)]
[InstCombine] Add tests for bitwise (A >> C - 1, zext(icmp)) -> zext (bitwise(A<0, icmp)) fold (NFC)
Tests for an upcoming bitwise (A >> C - 1, zext(icmp)) ->
zext (bitwise(A<0, icmp)) fold.
Differential Revision: https://reviews.llvm.org/D154789
Florian Hahn [Mon, 24 Jul 2023 10:50:46 +0000 (11:50 +0100)]
[LV] Re-use existing broadcast value for live-ins.
When requesting a vector value for a live-in, we can re-use the
broadcast of the live-in of part 0 for parts > 0.
Nikita Popov [Mon, 24 Jul 2023 10:43:52 +0000 (12:43 +0200)]
[InstCombine] Add test for infinite combine loop (NFC)
Guard against the issue reported at
https://reviews.llvm.org/rG086ee99564af#1230303.
Sandeep Kosuri [Mon, 24 Jul 2023 10:33:19 +0000 (05:33 -0500)]
[OPENMP][NFC] Editing OpenMP support page
dingfei [Mon, 24 Jul 2023 10:38:42 +0000 (18:38 +0800)]
[clang][ASTDumper] Remove redundant dump of BlockDecl's ParmVarDecl
ParmVarDecl of BlockDecl is unnecessarily dumped twice.
Remove this duplication, as is done for other FunctionDecls.
Fixes https://github.com/llvm/llvm-project/issues/64005 (#2)
Differential Revision: https://reviews.llvm.org/D155985
Simon Pilgrim [Mon, 24 Jul 2023 10:27:02 +0000 (11:27 +0100)]
[X86] combineConcatVectorOps - add concat(psadbw(x,y),psadbw(z,w)) -> psadbw(concat(x,z),concat(y,w)) handling
Simon Pilgrim [Mon, 24 Jul 2023 10:13:56 +0000 (11:13 +0100)]
[X86] Add reduce_add(ctpop(x)) 'count all bits in a vector' tests
Also add some basic buildvector variants: build_vector(reduce_add(ctpop(x0)), reduce_add(ctpop(x1)), ...)
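As a rough illustration of what 'count all bits in a vector' means, the sketch below models a vector as a Python list of lanes and compares reduce_add(ctpop(x)) with a single population count over the concatenated bits. The 32-bit lane width and the helper names are assumptions for illustration, not taken from the tests.

```python
def ctpop(value: int) -> int:
    """Population count of a non-negative integer."""
    return bin(value).count("1")


def reduce_add_ctpop(lanes, lane_bits=32):
    """reduce_add(ctpop(x)): sum the per-lane population counts."""
    mask = (1 << lane_bits) - 1
    return sum(ctpop(lane & mask) for lane in lanes)


def concat_lanes(lanes, lane_bits=32):
    """Pack the lanes into one wide integer, lane 0 in the low bits."""
    mask = (1 << lane_bits) - 1
    wide = 0
    for index, lane in enumerate(lanes):
        wide |= (lane & mask) << (index * lane_bits)
    return wide
```

For any list of lanes, `reduce_add_ctpop(lanes)` equals `ctpop(concat_lanes(lanes))`, since each set bit is counted exactly once either way.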
dingfei [Mon, 24 Jul 2023 10:31:01 +0000 (18:31 +0800)]
[Sema][ObjC] Invalidate BlockDecl with invalid ParmVarDecl
BlockDecl should be invalidated because of its invalid ParmVarDecl.
Fixes #1 of https://github.com/llvm/llvm-project/issues/64005
Differential Revision: https://reviews.llvm.org/D155984
Guray Ozen [Mon, 24 Jul 2023 07:40:38 +0000 (09:40 +0200)]
[mlir][gpu] Improving Cubin Serialization with ptxas Compiler
This work improves how we compile the generated PTX code by using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has some limitations: it doesn't always produce the same binary output as the ptxas compiler, leading to potential inconsistencies in the generated cubin files.
This work directly uses the ptxas compiler for PTX compilation, which gives more consistent and reliable results when generating cubin files. Key benefits:
- Using the ptxas compiler directly ensures that the cubin files generated during the build process remain consistent with CUDA compilation using `nvcc` or `clang`.
- It allows developers to experiment with different ptxas versions without changing the compiler. Performance varies among ptxas versions, so one can easily try different ones.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155563
dingfei [Mon, 24 Jul 2023 10:21:14 +0000 (18:21 +0800)]
[Sema][ObjC] Invalidate BlockDecl with invalid return expr & its parent BlockExpr
Invalidate a BlockDecl with an implicit return type if any of its return value exprs is invalid.
Propagate the error info up by replacing the BlockExpr with a RecoveryExpr.
The idea for this fix was given by @hokein (Haojian Wu).
Fixes https://github.com/llvm/llvm-project/issues/63863.
Differential Revision: https://reviews.llvm.org/D155396
pvanhout [Mon, 24 Jul 2023 10:22:26 +0000 (12:22 +0200)]
[test][TableGen] Reenable pattern-parsing.td with reverse_iteration
D155821 should have fixed this.
Nikita Popov [Tue, 4 Jul 2023 11:06:20 +0000 (13:06 +0200)]
[Mips] Fix argument lowering for illegal vector types (PR63608)
The Mips MSA ABI requires that legal vector types are passed in
scalar registers in packed representation. E.g. a type like v16i8
would be passed as two i64 registers.
The implementation attempts to do the same for illegal vectors with
non-power-of-two element counts or non-power-of-two element types.
However, the SDAG argument lowering code doesn't support this, and
it is not easy to extend it to support this (we would have to deal
with situations like passing v7i18 as two i64 values).
This patch instead opts to restrict the special argument lowering
to only vectors with power-of-two elements and round element types.
Everything else is lowered naively, that is by passing each element
in promoted registers.
Fixes https://github.com/llvm/llvm-project/issues/63608.
Differential Revision: https://reviews.llvm.org/D154445
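The restriction can be pictured with a small predicate. This is a hedged sketch, not the Mips backend code: it assumes "round" means a power-of-two bit width of at least 8 bits, and the function names are hypothetical.

```python
def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def is_round_element(elt_bits: int) -> bool:
    """Assumed meaning of a "round" element type: power-of-two width, >= 8 bits."""
    return elt_bits >= 8 and is_power_of_two(elt_bits)


def uses_packed_lowering(num_elts: int, elt_bits: int) -> bool:
    """True if the vector would take the special packed argument lowering."""
    return is_power_of_two(num_elts) and is_round_element(elt_bits)


def packed_i64_count(num_elts: int, elt_bits: int) -> int:
    """Number of i64 registers needed to hold the packed representation."""
    total_bits = num_elts * elt_bits
    return (total_bits + 63) // 64
```

Under these assumptions, v16i8 qualifies and needs two i64 registers, matching the example above, while v7i18 does not qualify and would be lowered element by element.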
Jay Foad [Mon, 24 Jul 2023 09:57:37 +0000 (10:57 +0100)]
[CodeGen] Add machine verification to some tests
This is to catch errors in an upcoming patch.
Dhruv Chawla [Fri, 21 Jul 2023 14:17:02 +0000 (19:47 +0530)]
[NFC][ValueTracking]: Remove redundant computeKnownBits call for LoadInst in isKnownNonZero
For load instructions, computeKnownBits only checks the range metadata.
This check is already present in isKnownNonZero, so there is no need to
fall through to computeKnownBits.
This change gives a speed improvement of 0.12-0.18%:
https://llvm-compile-time-tracker.com/compare.php?from=3c6ed559e5274307995586c1499a2c8e4e0276a0&to=78b462d8c4ae079638b728c6446da5999c4ee9f8&stat=instructions:u
Differential Revision: https://reviews.llvm.org/D155958
Luke Lau [Mon, 24 Jul 2023 09:58:22 +0000 (10:58 +0100)]
[RISCV] Set Fast flag for unaligned memory accesses
The +unaligned-scalar-mem and +unaligned-vector-mem features were added in
D126085 and D149375 respectively to allow subtargets to indicate that
they supported misaligned loads/stores with "sufficient" performance.
This is separate from whether or not the target actually supports
misaligned accesses, which could be determined from Zicclsm.
This patch enables the Fast flag under the assumption that any subtarget
that declares support for +unaligned-*-mem will want to opt into
optimisations that take advantage of misaligned scalar accesses, such as
store merging.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D150771
Tomasz Kamiński [Mon, 24 Jul 2023 09:02:44 +0000 (11:02 +0200)]
[NFC][analyzer] Enable implicit destructor for cfg-lifetime tests
This enables `cfg-temporary-dtors`, `cfg-rich-constructors`, and
`cfg-implicit-dtors` (defaults for CSA) for CFGLifetime test,
making it consistent with the `cfg-scopes` test.
Before the fixes implemented in https://reviews.llvm.org/D153273,
these flags were incompatible.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D155694
WANG Rui [Mon, 24 Jul 2023 09:47:09 +0000 (17:47 +0800)]
[LoongArch] Implement isZextFree
This returns true for 8-bit and 16-bit loads, allowing ld.bu/ld.hu to be selected and avoiding unnecessary masks.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154819
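To see why the mask is unnecessary, the sketch below models a sign-extending byte load followed by a mask and compares it with a zero-extending byte load. It is a Python model under the assumption of 64-bit registers; the function names mirror the mnemonics but are otherwise hypothetical.

```python
REG_BITS = 64  # assumed register width
REG_MASK = (1 << REG_BITS) - 1


def ld_b(byte: int) -> int:
    """Model of a sign-extending byte load into a 64-bit register."""
    byte &= 0xFF
    if byte & 0x80:
        return (byte - 0x100) & REG_MASK
    return byte


def ld_bu(byte: int) -> int:
    """Model of a zero-extending byte load."""
    return byte & 0xFF


def ld_b_then_mask(byte: int) -> int:
    """The suboptimal sequence: sign-extending load plus an explicit mask."""
    return ld_b(byte) & 0xFF
```

For every byte value, `ld_b_then_mask` and `ld_bu` agree, so selecting the zero-extending load directly drops one instruction without changing the result.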
WANG Rui [Mon, 24 Jul 2023 09:47:08 +0000 (17:47 +0800)]
[LoongArch] Add test case showing suboptimal codegen when loading unsigned char/short
Implementing isZextFree will allow ld.bu or ld.hu to be selected rather than ld.b+mask and ld.h+mask.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154818
WANG Rui [Mon, 24 Jul 2023 09:35:03 +0000 (17:35 +0800)]
[LoongArch] Implement isLegalICmpImmediate
This causes a trivial improvement in the legalicmpimm.ll test case.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154811
WANG Rui [Mon, 24 Jul 2023 09:35:02 +0000 (17:35 +0800)]
[LoongArch][NFC] Add tests for (X & -256) == 256 -> (X >> 8) == 1
Add tests for (X & -256) == 256 -> (X >> 8) == 1.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D154810
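The equivalence under test can be checked directly. This is a sketch in Python, whose integers are unbounded and whose `>>` is an arithmetic shift, so it only approximates fixed-width machine behaviour; the helper names are made up.

```python
def masked_compare(x: int) -> bool:
    """(X & -256) == 256: clear the low 8 bits, then compare."""
    return (x & -256) == 256


def shifted_compare(x: int) -> bool:
    """(X >> 8) == 1: drop the low 8 bits, then compare."""
    return (x >> 8) == 1


def equivalent_on(values) -> bool:
    """True if both forms agree on every value in the iterable."""
    return all(masked_compare(x) == shifted_compare(x) for x in values)
```

Both forms are true exactly for X in the range 256 to 511, which is why the shifted form can replace the masked one.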
Simon Pilgrim [Mon, 24 Jul 2023 09:37:03 +0000 (10:37 +0100)]
[clangd] InlayHints.cpp - fix MSVC "not all control paths return a value" warning. NFC.
Simon Pilgrim [Mon, 24 Jul 2023 09:34:40 +0000 (10:34 +0100)]
[include_cleaner] IncludeSpeller.cpp - fix MSVC "not all control paths return a value" warning. NFC.
Simon Pilgrim [Mon, 24 Jul 2023 09:33:28 +0000 (10:33 +0100)]
[JITLink] ppc64.h - fix MSVC "not all control paths return a value" warning. NFC.
Alex Bradbury [Mon, 24 Jul 2023 09:36:42 +0000 (10:36 +0100)]
[clang][docs] Attempt to fix warning when building ReleaseNotes
I believe my previous patch, 17a58b3ca7ec18585e9ea8ed8b39d72fe36fb6cb,
introduced a warning here.
Alex Bradbury [Mon, 24 Jul 2023 09:22:38 +0000 (10:22 +0100)]
[clang][RISCV] Fix ABI handling of empty structs with hard FP calling conventions in C++
As reported in <https://github.com/llvm/llvm-project/issues/58929>,
Clang's handling of empty structs in the case of small structs that may
be eligible to be passed using the hard FP calling convention doesn't
match g++. In general, C++ record fields are never empty unless
[[no_unique_address]] is used, but the RISC-V FP ABI overrides this.
After this patch, fields of structs that contain empty records will be
ignored, even in C++, when considering eligibility for the FP calling
convention ('flattening'). See also the relevant psABI issue
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/358> which
seeks to clarify the documentation.
Fixes https://github.com/llvm/llvm-project/issues/58929
Differential Revision: https://reviews.llvm.org/D142327
WANG Rui [Mon, 24 Jul 2023 09:16:03 +0000 (17:16 +0800)]
[LoongArch] Implement isLegalAddImmediate
This brings a trivial improvement in the and-add-lsr.ll test case.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154762
WANG Rui [Mon, 24 Jul 2023 09:03:32 +0000 (17:03 +0800)]
[LoongArch] Add tests for (and (add x, c1), (lshr y, c2))
Add tests for (and (add x, c1), (lshr y, c2)).
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154809
Tomasz Kamiński [Mon, 24 Jul 2023 07:23:01 +0000 (09:23 +0200)]
[analyzer] Fix crash in GenericTaintChecker when propagating taint to AllocaRegion
The `GenericTaintChecker` checker was crashing when the taint
was propagated to an `AllocaRegion` in the following code:
```
int x;
void* p = alloca(10);
memcpy(p, &x, sizeof(x));
```
This crash was caused by the fact that determining type of
`AllocaRegion` returns a null `QualType`.
This patch makes `AllocaRegion` expose its type as `void`,
making it consistent with the results of `malloc` or `new`,
which produce a `SymRegion` with a `void*` symbol.
Reviewed By: steakhal, xazax.hun
Differential Revision: https://reviews.llvm.org/D155847
WANG Rui [Mon, 24 Jul 2023 05:24:07 +0000 (13:24 +0800)]
[DAGCombine] Canonicalize operands for visitANDLike
During the construction of SelectionDAG, there are no explicit canonicalization rules to adjust the order of operands for AND nodes. This may prevent the optimization in DAGCombiner::visitANDLike from being triggered. This patch canonicalizes the operands before matches, which can be observed to improve optimization on the RISC-V target architecture.
Canonicalize:
```
and(x, add) -> and(add, x)
```
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D154760
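The canonicalization amounts to an operand swap before pattern matching. Below is a toy sketch on a hypothetical `Node` tuple, not the SelectionDAG API; it only shows the shape of the rule `and(x, add) -> and(add, x)`.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical expression node: an opcode plus its operand nodes."""
    opcode: str
    operands: tuple = ()


def canonicalize_and(node: Node) -> Node:
    """Move an add operand of a commutative and into the first position."""
    if node.opcode != "and" or len(node.operands) != 2:
        return node
    lhs, rhs = node.operands
    if rhs.opcode == "add" and lhs.opcode != "add":
        return Node("and", (rhs, lhs))
    return node
```

After this step a matcher only needs to look for the add on the left-hand side, instead of testing both operand orders.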
WANG Rui [Mon, 24 Jul 2023 05:23:57 +0000 (13:23 +0800)]
[RISCV] Add tests for (and (add x, c1), (lshr y, c2))
Add tests for (and (add x, c1), (lshr y, c2)).
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154808
Sander de Smalen [Mon, 24 Jul 2023 08:04:15 +0000 (08:04 +0000)]
[AArch64][SME] Use `fmov` instead of NEON `movi` for FP value.
NEON `movi` is not valid in Streaming SVE mode, so use an `fmov`
instruction instead for zero-initializing a FP value.
Reviewed By: hassnaa-arm
Differential Revision: https://reviews.llvm.org/D155432
Michael Platings [Mon, 24 Jul 2023 08:35:19 +0000 (09:35 +0100)]
Revert "[NFC] Add checks for self-assignment."
This reverts commit 8ac137acefc01caf636db5f95eb0977c97def1ba.
The code does not compile.
pvanhout [Mon, 24 Jul 2023 08:28:39 +0000 (10:28 +0200)]
[TableGen][GlobalISel] Fix warning when casting to `void *`
Sindhu Chittireddy [Thu, 20 Jul 2023 03:15:47 +0000 (20:15 -0700)]
[NFC] Add checks for self-assignment.
Differential Revision: https://reviews.llvm.org/D155776
Antonio Frighetto [Mon, 24 Jul 2023 07:23:31 +0000 (09:23 +0200)]
[clang][CodeGen] Introduce `-frecord-command-line` for MachO
Allow clang driver command-line recording when
targeting MachO object files as well.
Reviewed-by: sgraenitz
Differential Revision: https://reviews.llvm.org/D155716
pvanhout [Thu, 20 Jul 2023 12:24:12 +0000 (14:24 +0200)]
[TableGen][GlobalISel] Guarantee stable iteration order for stop-after-parse
Builds on top of 6de2735c2428 to fix the remaining issues with iteration order in the MatchTable Combiner backend.
See D155789 as well.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D155821
Yingwei Zheng [Mon, 24 Jul 2023 07:03:34 +0000 (15:03 +0800)]
[ConstraintElim] Add facts implied by MinMaxIntrinsic
Fixes https://github.com/llvm/llvm-project/issues/63896 and https://github.com/rust-lang/rust/issues/113757.
This patch adds facts implied by llvm.smin/smax/umin/umax intrinsics.
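The kind of fact involved can be sketched in plain C++. This is a minimal illustration, not the ConstraintElimination code; the function names are hypothetical. For `r = smin(a, b)` the implied facts are `r <= a` and `r <= b` under signed comparison, and for `r = umax(a, b)` they are `r >= a` and `r >= b` under unsigned comparison:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// r = smin(a, b) implies r <=s a and r <=s b.
inline int32_t sminWithFacts(int32_t A, int32_t B) {
  int32_t R = std::min(A, B);
  assert(R <= A && R <= B && "facts implied by smin");
  return R;
}

// r = umax(a, b) implies r >=u a and r >=u b.
inline uint32_t umaxWithFacts(uint32_t A, uint32_t B) {
  uint32_t R = std::max(A, B);
  assert(R >= A && R >= B && "facts implied by umax");
  return R;
}
```

The assertions can never fire, which is the point: a pass that knows these relations can fold later comparisons against the min/max result.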
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D155412
Piotr Zegar [Sun, 23 Jul 2023 19:22:42 +0000 (19:22 +0000)]
[clang-tidy] Add bugprone-empty-catch check
Detects and suggests addressing issues with empty catch statements.
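For illustration, the pattern such a check targets looks roughly like the first function below, and the second shows one way to address it. This is a hypothetical sketch: the function names are not from the patch, and the check's exact diagnostics and options are not shown here.

```cpp
#include <exception>
#include <string>

// Empty catch: a parse failure is silently swallowed and becomes
// indistinguishable from a legitimate 0.
inline int parseOrZeroSilently(const std::string &Text) {
  try {
    return std::stoi(Text);
  } catch (const std::exception &) {
  }
  return 0;
}

// One way to address it: handle the error explicitly.
inline bool parseChecked(const std::string &Text, int &Out) {
  try {
    Out = std::stoi(Text);
    return true;
  } catch (const std::exception &) {
    return false; // The failure is surfaced to the caller.
  }
}
```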
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D144748
Balazs Benics [Mon, 24 Jul 2023 06:26:54 +0000 (08:26 +0200)]
[analyzer][docs] Add CSA release notes
We'll soon branch off, and start releasing clang-17.
Here is a patch, adjusting the release notes for what we achieved since
the last release.
I used this command to inspect the interesting commits:
```
git log --oneline llvmorg-16.0.0..llvm/main \
clang/{lib/StaticAnalyzer,include/clang/StaticAnalyzer} | \
grep -v NFC | grep -v -i revert
```
This filters in CSA directories and filters out NFC and revert commits.
Given that in the release-notes, we usually don't put links to commits,
I'll remove them from this patch as well. I just put them there to make
it easier to review for you.
I tried to group the changes into meaningful chunks, and dropped some of
the uninteresting commits.
I've also dropped the commits that were backported to clang-16.
Check out how it looks, and propose changes like usual.
---
FYI, `ninja docs-clang-html` produces the HTML docs, including the `ReleaseNotes`.
The produced artifact will be at `build/tools/clang/docs/html/ReleaseNotes.html`.
Differential Revision: https://reviews.llvm.org/D155445
Craig Topper [Mon, 24 Jul 2023 05:42:36 +0000 (22:42 -0700)]
[RISCV] Add Zicond RUN lines to xaluo.ll. NFC
A couple of these tests show a need for computeKnownBits support
for Zicond.
Kai Luo [Mon, 24 Jul 2023 06:00:20 +0000 (14:00 +0800)]
[JITLink][PowerPC] Correct handling of R_PPC64_REL24_NOTOC
According to the ELFv2 ABI:
> This relocation type is used to specify a function call where the TOC pointer is not initialized. It is similar to R_PPC64_REL24 in that it specifies a symbol to be resolved. If the symbol resolves to a function that requires a TOC pointer (as determined by st_other bits) then a link editor must arrange for the call to be via the global entry point of the called function. Any stub code must not rely on a valid TOC base address in r2.
This patch fixes handling of `R_PPC64_REL24_NOTOC` by using the same stub code sequence as lld.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D155672
Yingwei Zheng [Mon, 24 Jul 2023 05:57:00 +0000 (13:57 +0800)]
[ConstraintElim] Add test cases from PR63896. NFC.
This patch adds some test cases from https://github.com/llvm/llvm-project/issues/63896.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D155853
Jim Lin [Mon, 24 Jul 2023 04:29:07 +0000 (12:29 +0800)]
[RISCV] Remove unused check prefixes for tests. NFC
Also remove the warning line stating that these prefixes are unused.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156048
eopXD [Fri, 21 Jul 2023 03:59:34 +0000 (20:59 -0700)]
[RISCV] Support register allocation for GHC when f/d is not specified in the architecture
This patch supports register allocation for floating-point types when
`zfinx` and `zdinx` are specified in the architecture for the GHC
calling convention.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155910
wanglei [Mon, 24 Jul 2023 03:40:29 +0000 (11:40 +0800)]
[LoongArch] Add definition for LVZ/LBT instructions
This patch defines the `LVZ` and `LBT` extension instructions, which
provide enough definitions for llvm-mc and llvm-objdump to correctly
handle these instructions.
It also defines the `SCR` (Scratchpad Register) register class, which
is used by the `LBT` extension instructions.
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D155917
Jacques Pienaar [Mon, 24 Jul 2023 04:40:12 +0000 (21:40 -0700)]
[mlir] Enable converting properties during C create
This enables querying properties passed as attributes at construction
time. This is needed in particular for type inference, where the
Operation has not yet been created. It allows Python construction of
operations whose type inference depends on properties.
Differential Revision: https://reviews.llvm.org/D156070
esmeyi [Mon, 24 Jul 2023 04:35:24 +0000 (00:35 -0400)]
[XCOFF] Write source language ID and CPU version ID into C_FILE symbol.
Summary: The source language ID and CPU version ID are required by debuggers on AIX. AIX's system assembler determines the source language ID based on the source file's name suffix, and the behavior in this patch is consistent with it.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D155684
Jacques Pienaar [Mon, 24 Jul 2023 04:26:52 +0000 (21:26 -0700)]
[mlir][py] Reuse more of CAPI build time inference.
This reduces the code generated for type inference and instead reuses
the facilities on the CAPI side that perform the same role.
Differential Revision: https://reviews.llvm.org/D156041
Pravin Jagtap [Mon, 24 Jul 2023 04:05:42 +0000 (00:05 -0400)]
[AMDGPU] Add llvm.amdgcn.wave.reduce.umin/umax Intrinsic.
When the input to the intrinsic is a uniform value, the reduced value is
the same as the input, whereas if the input value is divergent we need
to iterate over all active lanes of the wavefront to perform
the reduction.
The control flow for a loop has been set up, which
iterates over only the active lanes to perform the reduction.
Introduced the WAVE_REDUCE_UMIN_PSEUDO_U32 and
WAVE_REDUCE_UMAX_PSEUDO_U32 pseudos, which
are lowered post-ISel (in `EmitInstrWithCustomInserter`).
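The lane loop described above can be sketched on the host as follows. This is a minimal scalar model under stated assumptions, not the lowering emitted by this patch: the 32-lane wave, the `ExecMask` parameter, and the function name are hypothetical, and `UINT32_MAX` is chosen here as the identity value for an unsigned minimum.

```cpp
#include <array>
#include <cstdint>

// Scalar model of an unsigned-min reduction across the active lanes of a
// hypothetical 32-lane wave. Bit i of ExecMask set means lane i is active.
inline uint32_t waveReduceUMin(const std::array<uint32_t, 32> &LaneValues,
                               uint32_t ExecMask) {
  uint32_t Result = UINT32_MAX; // Identity for unsigned min.
  uint32_t Remaining = ExecMask;
  while (Remaining != 0) {
    // Find the lowest active lane that has not been processed yet.
    unsigned Lane = 0;
    while (((Remaining >> Lane) & 1u) == 0)
      ++Lane;
    if (LaneValues[Lane] < Result)
      Result = LaneValues[Lane];
    Remaining &= Remaining - 1; // Clear that lane's bit.
  }
  return Result;
}
```

When every active lane holds the same value, the loop returns that value, which matches the uniform case where no iteration is needed at all.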
Reviewed By: arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D154858
Jim Lin [Mon, 24 Jul 2023 02:33:40 +0000 (10:33 +0800)]
[RISCV] Adjust definition order in RISCVInstrInfoZvk.td to match the other td files
The definition order is operand/SDNode, instruction class template,
instruction, pseudo instruction, codegen patterns, ....
Jie Fu [Mon, 24 Jul 2023 03:33:02 +0000 (11:33 +0800)]
[AArch64][GlobalISel] Remove unused variable 'v2s8' in AArch64LegalizerInfo.cpp (NFC)
/Users/jiefu/llvm-project/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp:53:13: error: unused variable 'v2s8' [-Werror,-Wunused-variable]
const LLT v2s8 = LLT::fixed_vector(2, 8);
^
1 error generated.