2 years ago[InstCombine] Transform strrchr to memrchr for constant strings
Martin Sebor [Fri, 1 Jul 2022 16:09:42 +0000 (10:09 -0600)]
[InstCombine] Transform strrchr to memrchr for constant strings

Add an emitter for the memrchr common extension and simplify the strrchr
call handler to use it. This enables transforming calls with the empty
string to the test C ? 0 : S, since only a nul C can match the terminator.
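
The equivalence behind this transform can be sketched in Python, modelling a
C string as a bytes object without its terminator and a pointer as an offset.
The helper names are illustrative only and are not LLVM APIs; memrchr is
modelled with bytes.rfind over the string plus its terminating nul.

```python
def memrchr(buf: bytes, c: int, n: int):
    """Model of memrchr: offset of the last byte equal to c in buf[:n], or None."""
    idx = buf[:n].rfind(bytes([c & 0xFF]))
    return None if idx < 0 else idx


def strrchr(s: bytes, c: int):
    """Model of strrchr on a constant string s (given without its terminator)."""
    # The search covers strlen(s) + 1 bytes so that c == 0 finds the terminator.
    return memrchr(s + b"\0", c, len(s) + 1)
```

For the empty string the search covers a single byte, the terminator, so only
a nul character can match.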

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128954

2 years ago[reland][Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef.
Alexey Lapshin [Tue, 28 Jun 2022 16:52:12 +0000 (19:52 +0300)]
[reland][Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef.

This review is extracted from D96035.

This patch adds the possibility to keep not only a DwarfStringPoolEntry but also
a pointer to it. The DwarfStringPoolEntryRef keeps a reference to the string map entry.
The string map keeps the string data and the corresponding DwarfStringPoolEntry
info. Not all string map entries may be included in the result,
so not all string entries need DwarfStringPoolEntry
info. Currently StringMap keeps a DwarfStringPoolEntry for all entries,
which leads to extra memory usage. This patch allows keeping
DwarfStringPoolEntry info only for the entries that really need it.

[reland] : make msan happy.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D126883

2 years ago[LLD][COFF] Ignore /kernel flag
Pengxuan Zheng [Tue, 21 Jun 2022 01:44:32 +0000 (18:44 -0700)]
[LLD][COFF] Ignore /kernel flag

Microsoft documents the flag, but it is not clear whether there is more to
it. Ignore the flag for now until we find out more about it.

https://docs.microsoft.com/en-us/cpp/build/reference/kernel-create-kernel-mode-binary?view=msvc-170

Reviewed By: thieta, hans

Differential Revision: https://reviews.llvm.org/D128238

2 years ago[MLIR][Presburger] support symbolicLexMin for IntegerRelation
Arjun P [Fri, 1 Jul 2022 16:53:13 +0000 (17:53 +0100)]
[MLIR][Presburger] support symbolicLexMin for IntegerRelation

This also changes the space of the returned lexmin for IntegerPolyhedrons;
the symbols in the poly now correspond to symbols in the result rather than dims.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D128933

2 years ago[MLIR][Presburger] Simplex: refactor (symbolic)lex to support specifying multiple...
Arjun P [Fri, 1 Jul 2022 16:46:50 +0000 (17:46 +0100)]
[MLIR][Presburger] Simplex: refactor (symbolic)lex to support specifying multiple varKinds as symbols

This is also required to support lexmin for relations.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D128931

2 years ago[libc][math] Improved ExhaustiveTest performance.
Kirill Okhotnikov [Fri, 1 Jul 2022 14:37:27 +0000 (16:37 +0200)]
[libc][math] Improved ExhaustiveTest performance.

The previous implementation split the value range across threads. Because
the tested functions perform very differently over different ranges,
CPU utilization was poor. The current implementation splits the test range
into small pieces, and each thread takes the next piece when it finishes the
previous one. As a result, the CPU load stays constant during testing.
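
The scheduling idea can be sketched as a shared cursor from which idle threads
claim the next small chunk. This is a minimal illustration in Python, not the
libc test harness itself; run_in_chunks and its parameters are hypothetical
names.

```python
import threading


def run_in_chunks(start, stop, chunk_size, num_threads, check):
    """Apply check(lo, hi) to every chunk [lo, hi) of [start, stop).

    Returns the sorted list of chunks for which check returned a false value.
    """
    lock = threading.Lock()
    next_lo = start
    failures = []

    def worker():
        nonlocal next_lo
        while True:
            # Claim the next unprocessed chunk; a thread that finishes early
            # simply comes back here for more work.
            with lock:
                if next_lo >= stop:
                    return
                lo = next_lo
                next_lo = min(lo + chunk_size, stop)
                hi = next_lo
            if not check(lo, hi):
                with lock:
                    failures.append((lo, hi))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(failures)
```

Small chunks keep slow ranges from pinning a single thread while the others
sit idle.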

Differential Revision: https://reviews.llvm.org/D128995

2 years ago[llvm-objdump] -r: print non-SHF_ALLOC relocations for non-ET_REL files
Fangrui Song [Fri, 1 Jul 2022 16:08:42 +0000 (09:08 -0700)]
[llvm-objdump] -r: print non-SHF_ALLOC relocations for non-ET_REL files

ET_EXEC and ET_DYN files may contain non-SHF_ALLOC relocation sections
(e.g. ld --emit-relocs). Match GNU objdump by dumping them.

* Remove Object/dynamic-reloc.test. Replace it with a -r RUN line in dynamic-relocs.test
* Update relocations-in-nonreloc.test to set sh_link/sh_info. GNU
  objdump seems to ignore a SHT_REL/SHT_RELA section not linking to SHT_SYMTAB.
  The test did not test what it intended to test.

Fix https://github.com/llvm/llvm-project/issues/41246

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D128959

2 years ago[OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd...
Fazlay Rabbi [Fri, 1 Jul 2022 00:08:17 +0000 (17:08 -0700)]
[OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct

This patch gives basic parsing and semantic support for
"parallel masked taskloop simd" construct introduced in
OpenMP 5.1 (section 2.16.10)

Differential Revision: https://reviews.llvm.org/D128946

2 years agoRevert "[NFC] Add a missing test for for clang-repl"
Jun Zhang [Fri, 1 Jul 2022 15:55:55 +0000 (23:55 +0800)]
Revert "[NFC] Add a missing test for for clang-repl"

This reverts commit 2750985a5ccb97f4630c3443e75d78ed435d2bd0.
This made the Windows buildbot unhappy :(

2 years ago[NFC] Add a missing test for for clang-repl
Jun Zhang [Fri, 1 Jul 2022 15:26:08 +0000 (23:26 +0800)]
[NFC] Add a missing test for for clang-repl

This adds a missing test for 0ecbedc0986bd4b7b90a60a5f31d32337160d4c4
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D128991

2 years ago[MLIR][Linalg] Update filename to reflect implementation (NFC)
lorenzo chelini [Fri, 1 Jul 2022 09:44:39 +0000 (11:44 +0200)]
[MLIR][Linalg] Update filename to reflect implementation (NFC)

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D128978

2 years ago[AArch64] Make nxv1i1 types a legal type for SVE.
Sander de Smalen [Fri, 1 Jul 2022 14:29:07 +0000 (14:29 +0000)]
[AArch64] Make nxv1i1 types a legal type for SVE.

One motivation for adding support for these types is the LD1Q/ST1Q
instructions in SME, for which we have defined a number of load/store
intrinsics which at the moment still take a `<vscale x 16 x i1>` predicate
regardless of their element type.

This patch adds basic support for the nxv1i1 type such that it can be passed to and
returned from functions, as well as enough support for some existing tests that
result in an nxv1i1 type. It also adds support for splats.

Other operations (e.g. insert/extract subvector, logical ops, etc) will be
supported in follow-up patches.

Reviewed By: paulwalker-arm, efriedma

Differential Revision: https://reviews.llvm.org/D128665

2 years ago[AST] Don't assert instruction reads/writes memory (PR51333)
Nikita Popov [Thu, 16 Jun 2022 08:22:11 +0000 (10:22 +0200)]
[AST] Don't assert instruction reads/writes memory (PR51333)

This function is well-defined for an instruction that doesn't access
memory (and thus trivially doesn't alias anything in the AST), so
drop the assert. We can end up with a readnone call here if we
originally created a MemoryDef for an indirect call, which was
later replaced with a direct readnone call.

Fixes https://github.com/llvm/llvm-project/issues/51333.

Differential Revision: https://reviews.llvm.org/D127947

2 years ago[pseudo] temporary fix for missing generated header after fe66aebd755191fac6
Sam McCall [Fri, 1 Jul 2022 14:44:36 +0000 (16:44 +0200)]
[pseudo] temporary fix for missing generated header after fe66aebd755191fac6

Better fix to be added by Haojian later!

2 years ago[Build][NFC] Fixes for building on Windows with libc++
Andrew Ng [Thu, 23 Jun 2022 14:16:48 +0000 (15:16 +0100)]
[Build][NFC] Fixes for building on Windows with libc++

Differential Revision: https://reviews.llvm.org/D128514

2 years ago[SCEV] Remove unnecessary pointer handling in BuildConstantFromSCEV (NFCI)
Nikita Popov [Fri, 1 Jul 2022 14:28:56 +0000 (16:28 +0200)]
[SCEV] Remove unnecessary pointer handling in BuildConstantFromSCEV (NFCI)

Nowadays, we do not allow pointers in multiplies, and adds can only
have a single pointer, which is also guaranteed to be last by
complexity sorting. As such, we can somewhat simplify the treatment
of pointer types.

2 years ago[LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266)
Nikita Popov [Fri, 1 Jul 2022 14:11:39 +0000 (16:11 +0200)]
[LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266)

LoopSimplify only requires that the loop predecessor has a single
successor and is safe to hoist into -- it doesn't necessarily have
to be an unconditional BranchInst.

Adjust LoopDeletion to assert conditions closer to what it actually
needs for correctness, namely a single successor and a
side-effect-free terminator (as the terminator is getting dropped).

Fixes https://github.com/llvm/llvm-project/issues/56266.

2 years ago[clangd][ObjC] Fix ObjC method definition completion
David Goldman [Wed, 29 Jun 2022 14:04:21 +0000 (10:04 -0400)]
[clangd][ObjC] Fix ObjC method definition completion

D124637 improved filtering of method expressions, but not method
definitions. With this change, clangd will now filter ObjC method
definition completions based on their entire selector instead of
only the first selector fragment.

Differential Revision: https://reviews.llvm.org/D128821

2 years agoRe-apply "Deferred Concept Instantiation Implementation""
Erich Keane [Thu, 30 Jun 2022 19:03:42 +0000 (12:03 -0700)]
Re-apply "Deferred Concept Instantiation Implementation""

This reverts commit d4d47e574ecae562ab32f8ac7fa3f4d424bb6574.

This fixes the lldb crash that was observed by ensuring that our
friend-'template contains reference to' TreeTransform properly handles a
TemplateDecl.

2 years ago[NFC][OpenMP][CUDA] Remove unnecessary default label
Shilei Tian [Fri, 1 Jul 2022 13:50:29 +0000 (09:50 -0400)]
[NFC][OpenMP][CUDA] Remove unnecessary default label

2 years ago[ConstantRange] Fix sdiv() with one bit values (PR56333)
Nikita Popov [Fri, 1 Jul 2022 13:43:27 +0000 (15:43 +0200)]
[ConstantRange] Fix sdiv() with one bit values (PR56333)

Signed one bit values can only be -1 or 0, not positive. The code
was interpreting the 1 as -1 and intersecting with a full range
rather than an empty one.
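
The one-bit case can be checked with a small two's-complement model. This is
an illustrative Python sketch, not the ConstantRange code; the function names
are made up for the example.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def signed_values(bits: int):
    """All values representable in a signed integer of the given width."""
    return sorted(to_signed(v, bits) for v in range(1 << bits))
```

With a width of one bit, the only bit is the sign bit, so the bit pattern 1
denotes -1 and no positive value exists.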

Fixes https://github.com/llvm/llvm-project/issues/56333.

2 years ago[SVE][AArch64] Refine hasSVEArgsOrReturn
Matt Devereau [Tue, 7 Jun 2022 11:19:23 +0000 (11:19 +0000)]
[SVE][AArch64] Refine hasSVEArgsOrReturn

As described in aapcs64 (https://github.com/ARM-software/abi-aa/blob/2022Q1/aapcs64/aapcs64.rst#scalable-vector-registers)
AAVPCS is used only when registers z0-z7 take an SVE argument. This fixes the case where floats occupy the lower bits
of registers z0-z7 but SVE arguments in registers greater than z7 cause a function to use AAVPCS where it should use AAPCS.

Moving SVE function deduction from AArch64RegisterInfo::hasSVEArgsOrReturn to AArch64TargetLowering::LowerFormalArguments
where physical register lowering is more accurate fixes this.

Differential Revision: https://reviews.llvm.org/D127209

2 years ago[AMDGPU][GlobalISel] Always use VGPR bank for G_FCMP
Mirko Brkusanin [Fri, 1 Jul 2022 10:50:58 +0000 (12:50 +0200)]
[AMDGPU][GlobalISel] Always use VGPR bank for G_FCMP

Differential Revision: https://reviews.llvm.org/D128980

2 years ago[LLVM][LTO][LLD] Enable Profile Guided Layout (--call-graph-profile-sort) for FullLTO
Ben Dunbobbin [Thu, 30 Jun 2022 22:01:30 +0000 (23:01 +0100)]
[LLVM][LTO][LLD] Enable Profile Guided Layout (--call-graph-profile-sort) for FullLTO

The CGProfilePass needs to be run during FullLTO compilation at link
time to emit the .llvm.call-graph-profile section to the compiled LTO
object file. Currently, it is being run only during the initial
LTO-prelink compilation stage (to produce the bitcode files to be
consumed by the linker) and so the section is not produced.

ThinLTO is not affected because:
- For ThinLTO-prelink compilation the CGProfilePass pass is not run
  because ThinLTO-prelink passes are added via
  buildThinLTOPreLinkDefaultPipeline. Normal and FullLTO-prelink
  passes are both added via buildPerModuleDefaultPipeline which uses
  the LTOPreLink parameter to customize its behavior for the
  FullLTO-prelink pass differences.
- ThinLTO backend compilation phase adds the CGProfilePass (see:
  buildModuleOptimizationPipeline).

Adjust when the pass is run so that the .llvm.call-graph-profile
section is produced correctly for FullLTO.

Fixes #56185 (https://github.com/llvm/llvm-project/issues/56185)

2 years ago[IRBuilder] Move CreateNeg() to fold API
Nikita Popov [Fri, 1 Jul 2022 12:54:10 +0000 (14:54 +0200)]
[IRBuilder] Move CreateNeg() to fold API

Remove the CreateNeg() method from IRBuilderFolder and base it on
CreateSub(0, V) instead, which will call FoldNoWrapBinaryOp().

May not be NFC if InstSimplifyFolder is used.

2 years ago[IRBuilder] Move CreateNot() to fold API
Nikita Popov [Fri, 1 Jul 2022 12:47:56 +0000 (14:47 +0200)]
[IRBuilder] Move CreateNot() to fold API

Drop the IRBuilderFolder method entirely and base this on
CreateXor(V, -1) instead, so this will now go through FoldBinOp.

May not be NFC if the InstSimplifyFolder is used.
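
The identity used here, and the analogous one in the CreateNeg() change, can
be checked on fixed-width integers. This is a minimal Python sketch with
hypothetical helper names; it models the arithmetic only, not the IRBuilder
API.

```python
def wrap(value: int, bits: int) -> int:
    """Reduce value to an unsigned integer of the given bit width."""
    return value & ((1 << bits) - 1)


def not_via_xor(v: int, bits: int) -> int:
    """Bitwise NOT expressed as xor with the all-ones constant (-1)."""
    return wrap(v ^ wrap(-1, bits), bits)


def neg_via_sub(v: int, bits: int) -> int:
    """Negation expressed as a subtraction from zero."""
    return wrap(0 - v, bits)
```

Because both operations are ordinary binary operators with a constant
operand, they can be routed through the generic binary-operator folding
hooks.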

2 years ago[LV] Don't optimize exit cond during epilogue vectorization.
Florian Hahn [Fri, 1 Jul 2022 12:48:38 +0000 (13:48 +0100)]
[LV] Don't optimize exit cond during epilogue vectorization.

At the moment, the same VPlan can be used for code generation of both the
main vector and epilogue vector loop. This can lead to wrong results, if
the plan is optimized based on the VF of the main vector loop and then
re-used for the epilogue loop.

One example where this is problematic is if the scalar loops need to
execute at least one iteration, e.g. due to interleave groups.

To prevent mis-compiles in the short-term, disable optimizing exit
conditions for VPlans when using epilogue vectorization. The proper fix
is to avoid re-using the same plan for both loops, which will require
support for cloning plans first.

Fixes #56319.

2 years ago[lldb/test] Don't use preexec_fn for launching inferiors
Pavel Labath [Fri, 1 Jul 2022 12:32:50 +0000 (14:32 +0200)]
[lldb/test] Don't use preexec_fn for launching inferiors

As the documentation states, using this is not safe in multithreaded
programs, and I have traced it to a rare deadlock in some of the tests.

The reason this was introduced was to be able to attach to a program
from the very first instruction, where our usual mechanism of
synchronization -- waiting for a file to appear -- does not work.

However, this is only needed for a single test
(TestGdbRemoteAttachWait) so instead of doing this everywhere, I create
a bespoke solution for that single test. The solution basically
consists of outsourcing the preexec_fn code to a separate (and
single-threaded) shim process, which enables attaching and then executes
the real program.
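
The shim pattern can be sketched as follows. This is a hedged illustration,
not the code added by the patch: SHIM_SOURCE and shim_command are hypothetical
names, and the step that enables attaching is left as a comment because its
details are platform specific.

```python
import sys

# Source of a single-threaded shim process. It would first do whatever the
# old preexec_fn did (left as a comment here), then replace itself with the
# real program, so the program runs from its very first instruction.
SHIM_SOURCE = (
    "import os, sys\n"
    "# (hypothetical) enable attaching here, before the real program starts\n"
    "os.execvp(sys.argv[1], sys.argv[1:])\n"
)


def shim_command(program, args=()):
    """Build the argv that launches `program` through the shim."""
    # With `python -c`, sys.argv[0] is "-c", so the program is sys.argv[1].
    return [sys.executable, "-c", SHIM_SOURCE, program, *args]
```

The resulting list can be passed to subprocess.Popen without a preexec_fn
argument, so no Python code runs between fork and exec in the multithreaded
test process.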

This pattern could be generalized in case we needed to use it for other
tests, but I suspect that we will not be having many tests like this.

This effectively reverts commit
a997a1d7fbe229433fb458bb0035b32424ecf3bd.

2 years ago[SimplifyLibCalls] Use inbounds GEP
Nikita Popov [Fri, 1 Jul 2022 12:27:38 +0000 (14:27 +0200)]
[SimplifyLibCalls] Use inbounds GEP

When converting strchr(p, '\0') to p + strlen(p) we know that
strlen() must return an offset that is inbounds of the allocated
object (otherwise it would be UB), so we can use an inbounds GEP.
An equivalent argument can be made for the other cases.
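
A minimal sketch of the inbounds argument, modelling memory as a Python bytes
object and pointers as offsets into it. Both modelling choices and the
function names are assumptions made for illustration.

```python
def strlen(mem: bytes, p: int) -> int:
    """Distance from offset p to the next nul byte in mem."""
    # bytes.index raises ValueError if there is no terminator, which mirrors
    # the fact that strlen on an unterminated string is undefined.
    return mem.index(b"\0", p) - p


def strchr_nul(mem: bytes, p: int) -> int:
    """Model of strchr(p, '\\0'): offset of the terminator of the string at p."""
    return p + strlen(mem, p)
```

The returned offset always names a byte that strlen actually read, so it
cannot point outside the allocated object.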

2 years ago[InstCombine] add code comment for icmp transform; NFC
Sanjay Patel [Fri, 1 Jul 2022 12:21:55 +0000 (08:21 -0400)]
[InstCombine] add code comment for icmp transform; NFC

This was accidentally left out of cc88445a9106

2 years agoAdd some more expected warnings to this C99 DR test
Aaron Ballman [Fri, 1 Jul 2022 12:11:46 +0000 (08:11 -0400)]
Add some more expected warnings to this C99 DR test

This should address the issue found by:
https://lab.llvm.org/buildbot/#/builders/171/builds/16835

2 years agoEnsure that the generic associations aren't redundant
Aaron Ballman [Fri, 1 Jul 2022 11:48:07 +0000 (07:48 -0400)]
Ensure that the generic associations aren't redundant

This should hopefully address the test failure found in:
https://lab.llvm.org/buildbot/#/builders/171/builds/16833

2 years ago[AArch64][SVE] Create AArch64ISD node for DUPQLANE128
Matt Devereau [Thu, 23 Jun 2022 14:58:56 +0000 (14:58 +0000)]
[AArch64][SVE] Create AArch64ISD node for DUPQLANE128

Create an AArch64ISD node instead of emitting machine node DUP_ZZI_Q.
This allows a simpler DAG combine for work previously attempted
in https://reviews.llvm.org/D128503

Differential Revision: https://reviews.llvm.org/D128902

2 years agoFix this C99 DR to be more robust
Aaron Ballman [Fri, 1 Jul 2022 11:33:37 +0000 (07:33 -0400)]
Fix this C99 DR to be more robust

This should fix the following test issue on ARM:
https://lab.llvm.org/buildbot/#/builders/171/builds/16815

2 years ago[VPlan] Move addMetadata to VPTransformState (NFC).
Florian Hahn [Fri, 1 Jul 2022 11:03:24 +0000 (12:03 +0100)]
[VPlan] Move addMetadata to VPTransformState (NFC).

The moved helpers are only used for codegen. It will allow moving the
remaining ::execute implementations out of LoopVectorize.cpp.

Depends on D127966.
Depends on D127965.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127968

2 years agoRevert "[reland] algorithm_test.cpp"
Guillaume Chatelet [Fri, 1 Jul 2022 10:48:57 +0000 (10:48 +0000)]
Revert "[reland] algorithm_test.cpp"

This reverts commit 1514acb20f404fa3fe0e20f068b1caf763396176.

2 years ago[reland] algorithm_test.cpp
Guillaume Chatelet [Thu, 30 Jun 2022 14:43:52 +0000 (14:43 +0000)]
[reland] algorithm_test.cpp

Removing `-ffreestanding` for the tests should allow us to use `<iostream>`

Differential Revision: https://reviews.llvm.org/D128916

2 years ago[VE][NFC] Correct comment
Kazushi (Jam) Marukawa [Fri, 1 Jul 2022 10:24:33 +0000 (19:24 +0900)]
[VE][NFC] Correct comment

2 years ago[LV] Update test for #56319 to use interleave group.
Florian Hahn [Fri, 1 Jul 2022 10:12:00 +0000 (11:12 +0100)]
[LV] Update test for #56319 to use interleave group.

The original test was over-reduced. It requires an interleave group, so
the last vector iteration of the epilogue vector loop doesn't execute.

2 years ago[flang] File omp_lib.f90 is not a standard intrinsic module
Valentin Clement [Fri, 1 Jul 2022 10:04:19 +0000 (12:04 +0200)]
[flang] File omp_lib.f90 is not a standard intrinsic module

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128976

Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

2 years ago[lld-macho] Fix left shift of negative value UB
Daniel Bertalan [Fri, 1 Jul 2022 09:47:34 +0000 (11:47 +0200)]
[lld-macho] Fix left shift of negative value UB

I introduced this mistake in 573c7e6b3c79c7ce80a2221e000fab7dd20c0bb4.

Fixes the failure on this UBSan bot:
https://lab.llvm.org/buildbot/#/builders/5/builds/25537

2 years ago[AMDGPU][GFX908][DOC][NFC] Update assembler syntax description
Dmitry Preobrazhensky [Fri, 1 Jul 2022 09:34:59 +0000 (12:34 +0300)]
[AMDGPU][GFX908][DOC][NFC] Update assembler syntax description

Summary of changes:
- Remove dst for global_atomic_add_f32, global_atomic_pk_add_f16.
- Make vdata input-only for buffer_atomic_add_f32, buffer_atomic_pk_add_f16.
- Other minor improvements.

2 years agoRevert rG057db2002bb3: [X86] combineAndnp - constant fold ANDNP(C,X) -> AND(~C,X)
Simon Pilgrim [Fri, 1 Jul 2022 09:36:01 +0000 (10:36 +0100)]
Revert rG057db2002bb3: [X86] combineAndnp - constant fold ANDNP(C,X) -> AND(~C,X)

If the LHS op has a single use then using the more general AND op is likely to allow commutation, load folding, generic folds etc.

Reverted due to reports from @alexfh about it causing an infinite loop (repro still pending).

2 years ago[AMDGPU][GFX940][DOC][NFC] Update assembler syntax description
Dmitry Preobrazhensky [Mon, 27 Jun 2022 16:30:44 +0000 (19:30 +0300)]
[AMDGPU][GFX940][DOC][NFC] Update assembler syntax description

Summary of changes:
- Update SMEM syntax (see https://reviews.llvm.org/D127314).
- Minor improvements.

2 years ago[LV] Add test case for #56319.
Florian Hahn [Fri, 1 Jul 2022 09:09:23 +0000 (10:09 +0100)]
[LV] Add test case for #56319.

Test case for PR56319.

2 years ago[gn build] (manually) port fe66aebd7551 (PseudoCLI)
Nico Weber [Fri, 1 Jul 2022 08:31:35 +0000 (04:31 -0400)]
[gn build] (manually) port fe66aebd7551 (PseudoCLI)

2 years ago[gn build] (manually) port cd2292ef824 (PseudoCXX)
Nico Weber [Wed, 25 May 2022 12:39:29 +0000 (08:39 -0400)]
[gn build] (manually) port cd2292ef824 (PseudoCXX)

This target will be used in the next commit.

2 years ago[clangd] Also mark output arguments of array subscript expressions
Christian Kandeler [Fri, 1 Jul 2022 08:43:23 +0000 (04:43 -0400)]
[clangd] Also mark output arguments of array subscript expressions

... with the "usedAsMutableReference" semantic token modifier.
It's quite unusual to declare the index parameter of a subscript
operator as a non-const reference type, but arguably that makes it even
more helpful to be aware of it when working with such code.

Reviewed By: nridge

Differential Revision: https://reviews.llvm.org/D128892

2 years agoRevert "[FPEnv] Allow CompoundStmt to keep FP options"
Serge Pavlov [Fri, 1 Jul 2022 08:32:56 +0000 (15:32 +0700)]
Revert "[FPEnv] Allow CompoundStmt to keep FP options"

On some buildbots test `ast-print-fp-pragmas.c` fails, need to investigate it.

This reverts commit 0401fd12d4aa0553347fe34d666fb236d8719173.
This reverts commit b822efc7404bf09ccfdc1ab7657475026966c3b2.

2 years ago[flang] Fix for broken/degenerate forall case
Valentin Clement [Fri, 1 Jul 2022 08:36:45 +0000 (10:36 +0200)]
[flang] Fix for broken/degenerate forall case

Fix for broken/degenerate forall case where there is no assignment to an
array under the explicit iteration space. While this is a multiple
assignment, semantics only raises a warning.
The fix is to add a test that the explicit space has any sort of array
to be updated, and if not then the do_loop nest will not require a
terminator to forward array values to the next iteration.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128973

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

2 years ago[fix/build] bazel rule for ParallelCombiningOpInterface
Mikhail Goncharov [Fri, 1 Jul 2022 08:33:48 +0000 (10:33 +0200)]
[fix/build] bazel rule for ParallelCombiningOpInterface

2 years ago[LLDB] Xfail TestStepNoDebug.py AArch64/Windows
Muhammad Omair Javaid [Fri, 1 Jul 2022 08:21:27 +0000 (12:21 +0400)]
[LLDB] Xfail TestStepNoDebug.py AArch64/Windows

LLDB fails to step in/out/over code with missing debug information.
This is only reproducible on AArch64/Windows. I have reported an issue
upstream at llvm.org/pr56292

This patch XFAILs TestStepNoDebug.py for AArch64/Windows.

2 years ago[SCEV] pre-commit test case for D127835, NFC
Chen Zheng [Wed, 29 Jun 2022 09:07:23 +0000 (05:07 -0400)]
[SCEV] pre-commit test case for D127835, NFC

2 years agoFix warning on unhandled enumeration value
Serge Pavlov [Fri, 1 Jul 2022 08:17:04 +0000 (15:17 +0700)]
Fix warning on unhandled enumeration value

2 years ago[flang] Add correct number of args for wait
Valentin Clement [Fri, 1 Jul 2022 08:16:09 +0000 (10:16 +0200)]
[flang] Add correct number of args for wait

Add source coordinates to BeginWait and BeginWaitAll calls

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128970

Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

2 years ago[InstructionSimplify] handle denormal input for fcmp
Chen Zheng [Mon, 27 Jun 2022 13:56:20 +0000 (09:56 -0400)]
[InstructionSimplify] handle denormal input for fcmp

Handle denormal constant input for fcmp instructions based on the
denormal handling mode.
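
The effect of the denormal mode on a comparison can be sketched as follows.
This Python illustration uses double precision and string mode names chosen
for the example; it is not the InstructionSimplify code.

```python
import math
import sys


def flush_denormal(x: float, mode: str) -> float:
    """Return the value a comparison would see for input x under `mode`."""
    # sys.float_info.min is the smallest positive normal double.
    is_denormal = x != 0.0 and abs(x) < sys.float_info.min
    if not is_denormal or mode == "ieee":
        return x  # denormals are kept as they are
    if mode == "preserve-sign":
        return math.copysign(0.0, x)  # flush to zero, keeping the sign
    if mode == "positive-zero":
        return 0.0  # flush to +0.0 regardless of sign
    raise ValueError(f"unknown denormal mode: {mode}")
```

A comparison such as "x == 0.0" on a denormal constant is therefore false
under IEEE handling but true once the input is flushed.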

Reviewed By: spatel, dcandler

Differential Revision: https://reviews.llvm.org/D128647

2 years ago[lld-macho] Handle LOH_ARM64_ADRP_LDR linker optimization hints
Daniel Bertalan [Thu, 30 Jun 2022 09:01:18 +0000 (11:01 +0200)]
[lld-macho] Handle LOH_ARM64_ADRP_LDR linker optimization hints

This linker optimization hint transforms a pair of adrp+ldr (immediate)
instructions into an ldr (literal) load from a PC-relative address if
it is 4-byte aligned and within +/- 1 MiB, as ldr can encode a signed
19-bit offset that gets multiplied by 4.
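
The applicability check described above can be sketched as follows. This is a
Python illustration with a hypothetical function name, not the lld-macho
helper; it returns the signed word offset rather than an encoded instruction.

```python
def ldr_literal_offset(ldr_address: int, target_address: int):
    """Signed 19-bit word offset for `ldr (literal)`, or None if not encodable."""
    delta = target_address - ldr_address
    # The offset is stored in units of 4 bytes, so it must be 4-byte aligned.
    if delta % 4 != 0:
        return None
    imm19 = delta // 4
    # A signed 19-bit field covers [-2**18, 2**18 - 1] words, about +/- 1 MiB.
    if not -(1 << 18) <= imm19 < (1 << 18):
        return None
    return imm19
```

The asymmetric bound means a target exactly 1 MiB ahead is just out of reach,
while one exactly 1 MiB behind is still encodable.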

In the wild, only a small number of these hints are applicable because
not many loads end up close enough to the data segment. However, the
added helper functions will be useful in implementing the rest of the
LOH types.

Differential Revision: https://reviews.llvm.org/D128942

2 years ago[FPEnv] Allow CompoundStmt to keep FP options
Serge Pavlov [Mon, 28 Sep 2020 07:32:06 +0000 (14:32 +0700)]
[FPEnv] Allow CompoundStmt to keep FP options

The AST does not have special nodes for pragmas. Instead, a pragma modifies
some state variables of Sema, which in turn results in modified
attributes of AST nodes. This technique applies to floating point
operations as well. Every AST node that can depend on FP options keeps
the current set of them.

This technique works well for options like exception behavior or fast
math options. They instruct the compiler how to modify
code generation for the affected nodes. However, the technique has
problems with FP control modes. Modifying an FP control mode
(like rounding direction) usually requires operations on hardware, like
writing to control registers. It must be done prior to the first
operation that depends on the control mode. In particular, such
operations are required to implement `pragma STDC FENV_ROUND`:
the compiler should set up the necessary rounding direction at the beginning
of the compound statement where the pragma occurs. As there is no representation
for pragmas in the AST, code generation becomes a complicated task in
this case.

To solve this issue, FP options are kept inside CompoundStmt. Unlike FP
options in expressions, these do not affect any operation on FP values,
but only inform the codegen about the FP options that act in the body of
the statement. As all pragmas that modify the FP environment may occur only
at the start of a compound statement or at global level, this solution
works for all relevant pragmas. The options are kept as a difference
from the options in the enclosing compound statement or the default options,
which helps codegen set only the changed control modes.
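
Storing a difference rather than a full option set can be sketched like this.
The dictionary keys and the function name are hypothetical and only
illustrate the idea; they do not reflect how Clang represents FP options.

```python
def fp_options_delta(enclosing: dict, current: dict) -> dict:
    """Options in `current` whose value differs from the enclosing scope."""
    return {
        name: value
        for name, value in current.items()
        if enclosing.get(name) != value
    }
```

For example, if the enclosing scope rounds to nearest and a nested compound
statement switches to rounding upward, the delta holds only the rounding
entry, so only that control mode needs to be written on entry.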

Differential Revision: https://reviews.llvm.org/D123952

2 years ago[ConstExpr] Don't create insertvalue expressions
Nikita Popov [Wed, 29 Jun 2022 08:48:40 +0000 (10:48 +0200)]
[ConstExpr] Don't create insertvalue expressions

In preparation for the removal in D128719, this stops creating
insertvalue constant expressions (well, unless they are directly
used in LLVM IR).

Differential Revision: https://reviews.llvm.org/D128792

2 years ago[mlir][SCF] Add a ParallelCombiningOpInterface to decouple scf::PerformConcurrently...
Nicolas Vasilache [Thu, 30 Jun 2022 10:37:21 +0000 (03:37 -0700)]
[mlir][SCF] Add a ParallelCombiningOpInterface to decouple scf::PerformConcurrently from its contained operations

This allows purging references of scf.ForeachThreadOp and scf.PerformConcurrentlyOp from
ParallelInsertSliceOp.
This will allow moving the op closer to tensor::InsertSliceOp with which it should share much more
code.

In the future, the decoupling will also allow extending the type of ops that can be used in the
parallel combinator as well as semantics related to multiple concurrent inserts to the same
result.

Differential Revision: https://reviews.llvm.org/D128857

2 years ago[mlir][vector] Untangle TransferWriteDistribution and avoid crashing in the 0-D case.
Nicolas Vasilache [Wed, 29 Jun 2022 08:59:33 +0000 (01:59 -0700)]
[mlir][vector] Untangle TransferWriteDistribution and avoid crashing in the 0-D case.

This revision avoids a crash in the 0-D case of distributing vector.transfer ops out of
vector.warp_execute_on_lane_0.
Due to the code complexity and lack of documentation, it took untangling the implementation
before realizing that the simple fix was to fail in the 0-D case.
The rewrite is still very useful to understand this code better.

Differential Revision: https://reviews.llvm.org/D128793

2 years ago[SCCP] Only handle unknown lattice values in resolvedUndefsIn()
Nikita Popov [Tue, 21 Jun 2022 08:34:41 +0000 (10:34 +0200)]
[SCCP] Only handle unknown lattice values in resolvedUndefsIn()

This is a minor refinement of resolvedUndefsIn(), mostly for clarity.
If the value of an instruction is undef, then that's already a legal
final result -- we can safely rauw such an instruction with undef.
We only need to mark unknown values as overdefined, as that's the
result we get for an instruction that has not been processed because
it has an undef operand.
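
The refinement can be sketched with a simplified four-state lattice. This is
a Python illustration under the assumption of a reduced lattice; the names
are hypothetical and the real SCCP solver tracks more states.

```python
from enum import Enum


class Lattice(Enum):
    UNKNOWN = "unknown"          # not processed yet
    UNDEF = "undef"              # already a legal final result
    CONSTANT = "constant"
    OVERDEFINED = "overdefined"


def resolve_undefs(state: dict) -> bool:
    """Mark still-unknown values as overdefined; return True if any changed."""
    changed = False
    for inst, value in state.items():
        # Only unknown values need resolving; undef values are left alone.
        if value is Lattice.UNKNOWN:
            state[inst] = Lattice.OVERDEFINED
            changed = True
    return changed
```

Values that are already undef stay untouched because replacing such an
instruction with undef is a valid outcome.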

Differential Revision: https://reviews.llvm.org/D128251

2 years ago[AMDGPU] Add WMMA clang builtins
Piotr Sobczak [Fri, 1 Jul 2022 06:18:09 +0000 (08:18 +0200)]
[AMDGPU] Add WMMA clang builtins

Add WMMA clang builtins and tests. Extra changes in code
are needed to handle function overloads.

WavefrontSize 32:
__builtin_amdgcn_wmma_f32_16x16x16_f16_w32
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32
__builtin_amdgcn_wmma_f16_16x16x16_f16_w32
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32

WavefrontSize 64:
__builtin_amdgcn_wmma_f32_16x16x16_f16_w64
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64
__builtin_amdgcn_wmma_f16_16x16x16_f16_w64
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128952

2 years ago[AMDGPU] Update WMMA intrinsics with explicit f16 types
Piotr Sobczak [Fri, 1 Jul 2022 06:18:03 +0000 (08:18 +0200)]
[AMDGPU] Update WMMA intrinsics with explicit f16 types

Update intrinsics to use n x f16 and n x i16 instead
of 32-bit types. This may avoid the need for a bitcast
and is probably less confusing.

Depends on making v16f16 and v16i16 types legal.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128951

2 years ago[NFC] add --match-full-lines to the RUN line
Chen Zheng [Fri, 1 Jul 2022 06:36:23 +0000 (02:36 -0400)]
[NFC] add --match-full-lines to the RUN line

2 years ago[pseudo] Define a clangPseudoCLI library.
Haojian Wu [Tue, 28 Jun 2022 20:37:03 +0000 (22:37 +0200)]
[pseudo] Define a clangPseudoCLI library.

- define a common data structure Language which is a compiled result of the
  bnf grammar. It is defined in Language.h;
- create a clangPseudoCLI lib which defines a grammar commandline flag and
  exposes a function to get the Language. It supports --grammar=cxx,
  --grammar=/path/to/file.bnf;
- use the clangPseudoCLI in clang-pseudo, fuzzer, and benchmark tools (
  simplify the code and use the prebuilt cxx grammar);

Split out from https://reviews.llvm.org/D127448.

Differential Revision: https://reviews.llvm.org/D128679

2 years ago[flang] Fix APFloat conversion cases
Valentin Clement [Fri, 1 Jul 2022 06:29:19 +0000 (08:29 +0200)]
[flang] Fix APFloat conversion cases

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D128935

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Peter Steinfeld <psteinfeld@nvidia.com>

2 years ago[Inline] don't add noalias metadata for unknown objects.
Chen Zheng [Fri, 24 Jun 2022 12:04:06 +0000 (08:04 -0400)]
[Inline] don't add noalias metadata for unknown objects.

The unidentified objects recognized in `getUnderlyingObjects` may
still alias the noalias parameter, because `getUnderlyingObjects`
may stop before reaching the underlying object once it hits
`MaxLookup`. The real underlying object of the unidentified object
may still be the noalias parameter.
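
The effect of a lookup limit can be sketched as a bounded walk along a chain
of derived pointers. This Python model is a simplification with hypothetical
names; it is not the `getUnderlyingObjects` implementation.

```python
def underlying_object(value, defs, max_lookup):
    """Follow `defs` (value -> the value it is derived from) at most
    max_lookup times. Returns (result, reached_root)."""
    for _ in range(max_lookup):
        if value not in defs:
            return value, True  # nothing left to strip: a real root
        value = defs[value]
    # The limit was hit; the result is a root only if it happens to be one.
    return value, value not in defs
```

When the walk stops early, the intermediate value it returns looks like an
unidentified object even though its real root may be the noalias argument, so
treating it as distinct from that argument is not safe.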

Originally Patched By: tingwang

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127202

2 years ago[lldb] Add tests which simulate the various std::string layouts
Pavel Labath [Thu, 21 Apr 2022 08:57:18 +0000 (10:57 +0200)]
[lldb] Add tests which simulate the various std::string layouts

Checking whether a formatter change does not break some of the supported
string layouts is difficult because it requires tracking down and/or
building different versions and build configurations of the library.

The purpose of this patch is to avoid that by providing an in-tree
simulation of the string class. It is a reduced version of the real
string class, obtained by eliminating all non-trivial code, leaving
just the internal data structures used by the data formatter. Different
versions of the class can be simulated through preprocessor defines.

The test (ab)uses the fact that our formatters kick in for any
double-underscore sub-namespace of `std`, so it avoids colliding with
the real string class by declaring the test class in the std::__lldb
namespace.

I do not consider this to be a replacement for the existing data
formatter tests, as producing this kind of a test is not trivial, and it
is easy to make a mistake in the process. However, it's also not
realistic to expect that every person changing the data formatter will
test it against all versions of the real class, so I think it can be
useful as a first line of defence.

Adding support for new layouts can become particularly unwieldy, but
this complexity will also be reflected in the actual code, so if we find
ourselves needing to support too many variants, we may need to start
dropping support for old ones, or come up with a completely different
strategy.

Differential Revision: https://reviews.llvm.org/D124155

2 years ago[lldb/dyld-posix] Avoid reading the module list in inconsistent states
Pavel Labath [Mon, 20 Jun 2022 13:47:27 +0000 (15:47 +0200)]
[lldb/dyld-posix] Avoid reading the module list in inconsistent states

New glibc versions (since 2.34 or including this
<https://github.com/bminor/glibc/commit/ed3ce71f5c64c5f07cbde0ef03554ea8950d8f2c>
patch) trigger the rendezvous breakpoint after they have already added
some modules to the list. This did not play well with our dynamic
loader plugin which was doing a diff of the reported modules in the
before (RT_ADD) and after (RT_CONSISTENT) states. Specifically, it
caused us to miss some of the modules.

While I think the old behavior makes more sense, I don't think that lldb
is doing the right thing either, as the documentation states that we
should not be expecting a consistent view in the RT_ADD (and RT_DELETE)
states.

Therefore, this patch changes the lldb algorithm to compare the module
list against the previous consistent snapshot. This fixes the previous
issue, and I believe it is more correct in general. It also reduces the
number of times we are fetching the module info, which should speed up
the debugging of processes with many shared libraries.

The change in RefreshModules ensures we don't broadcast the loaded
notification for the dynamic loader (ld.so) module more than once.
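
The comparison against the previous consistent snapshot can be sketched roughly as follows. This is a simplified model, not the lldb implementation: modules are represented by name only, and `ModuleSet`, `ModuleDiff`, and `diffAgainstSnapshot` are hypothetical names used for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

using ModuleSet = std::set<std::string>;

struct ModuleDiff {
  std::vector<std::string> Added;   // in Current but not in Previous
  std::vector<std::string> Removed; // in Previous but not in Current
};

// Compare the module list read in a consistent state with the snapshot
// taken at the previous consistent state. Intermediate (add/delete)
// states are never inspected in this model.
inline ModuleDiff diffAgainstSnapshot(const ModuleSet &Previous,
                                      const ModuleSet &Current) {
  ModuleDiff D;
  for (const auto &M : Current)
    if (!Previous.count(M))
      D.Added.push_back(M);
  for (const auto &M : Previous)
    if (!Current.count(M))
      D.Removed.push_back(M);
  return D;
}
```

Because only consistent states are compared, modules that the loader added before triggering the breakpoint still show up in `Added`.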

Differential Revision: https://reviews.llvm.org/D128264

2 years ago[PATCH] [lldb-server] Skip shared regions for memory allocation
Emre Kultursay [Fri, 1 Jul 2022 05:44:01 +0000 (13:44 +0800)]
[PATCH] [lldb-server] Skip shared regions for memory allocation

Differential Revision: https://reviews.llvm.org/D128832

2 years ago[clang][NFC][tests] dr208.c optional signext handling
Hubert Tong [Fri, 1 Jul 2022 04:03:58 +0000 (00:03 -0400)]
[clang][NFC][tests] dr208.c optional signext handling

Fixes llvm/llvm-project#56325.

2 years ago[mlir][Vector] Fold InsertStridedSliceOp of ExtractStridedSliceOp.
jacquesguan [Thu, 30 Jun 2022 11:24:31 +0000 (19:24 +0800)]
[mlir][Vector] Fold InsertStridedSliceOp of ExtractStridedSliceOp.

This patch adds support for folding InsertStridedSliceOp(ExtractStridedSliceOp(dst), dst) to dst.
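
Why the fold is sound can be seen in a 1-D toy model. This is a sketch, not MLIR code: the two helper functions are hypothetical, and it assumes the insert uses the same offset and stride as the extract, which the commit message does not spell out.

```cpp
#include <cstddef>
#include <vector>

// Read Size elements from V, starting at Offset and stepping by Stride.
inline std::vector<int> extractStridedSlice(const std::vector<int> &V,
                                            std::size_t Offset,
                                            std::size_t Size,
                                            std::size_t Stride) {
  std::vector<int> R;
  for (std::size_t I = 0; I < Size; ++I)
    R.push_back(V[Offset + I * Stride]);
  return R;
}

// Return a copy of Dst with the elements of Src written at Offset,
// stepping by Stride.
inline std::vector<int> insertStridedSlice(const std::vector<int> &Src,
                                           std::vector<int> Dst,
                                           std::size_t Offset,
                                           std::size_t Stride) {
  for (std::size_t I = 0; I < Src.size(); ++I)
    Dst[Offset + I * Stride] = Src[I];
  return Dst;
}
```

Writing a slice back into the vector it was read from, at the same positions, changes nothing, so the whole expression can be replaced by dst.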

Differential Revision: https://reviews.llvm.org/D128903

2 years ago[mlir][Vector] Fold InsertStridedSliceOp of two splat with the same input to splat.
jacquesguan [Thu, 30 Jun 2022 08:30:59 +0000 (16:30 +0800)]
[mlir][Vector] Fold InsertStridedSliceOp of two splat with the same input to splat.

This patch folds InsertStridedSliceOp(SplatOp(X):src_type, SplatOp(X):dst_type) to SplatOp(X):dst_type.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D128891

2 years ago[ODRHash diagnostics] Move common code for calculating diag locations in `DiagnoseODR...
Volodymyr Sapsai [Fri, 24 Jun 2022 02:05:50 +0000 (19:05 -0700)]
[ODRHash diagnostics] Move common code for calculating diag locations in `DiagnoseODRMismatch` into a lambda. NFC.

Differential Revision: https://reviews.llvm.org/D128489

2 years agoRemove unneeded cl::ZeroOrMore. NFC
Fangrui Song [Fri, 1 Jul 2022 02:11:27 +0000 (19:11 -0700)]
Remove unneeded cl::ZeroOrMore. NFC

2 years ago[mlir] Remove unneeded cl::ZeroOrMore for ListOption variables. NFC
Fangrui Song [Fri, 1 Jul 2022 02:04:44 +0000 (19:04 -0700)]
[mlir] Remove unneeded cl::ZeroOrMore for ListOption variables. NFC

2 years ago[ODRHash diagnostics] Split `err_module_odr_violation_mismatch_decl_diff` into per...
Volodymyr Sapsai [Thu, 23 Jun 2022 03:18:42 +0000 (20:18 -0700)]
[ODRHash diagnostics] Split `err_module_odr_violation_mismatch_decl_diff` into per-entity diagnostics. NFC.

We'll need to add more cases for Objective-C entities and adding
everything to `err_module_odr_violation_mismatch_decl_diff` makes it
harder to work with over time.

Differential Revision: https://reviews.llvm.org/D128488

2 years ago[mlir][tblgen] Improving error messages
wren romano [Mon, 27 Jun 2022 23:28:45 +0000 (16:28 -0700)]
[mlir][tblgen] Improving error messages

This differential improves two error conditions, by detecting them earlier and by providing better messages to help users understand what went wrong.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D128555

2 years ago[ISel] Match all bits when merge undefs for DAG combine
Xiang1 Zhang [Fri, 1 Jul 2022 01:05:36 +0000 (09:05 +0800)]
[ISel] Match all bits when merge undefs for DAG combine

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D128570

2 years agoRevert "[ISel] Match all bits when merge undef(s) for DAG combine"
Xiang1 Zhang [Fri, 1 Jul 2022 00:59:04 +0000 (08:59 +0800)]
Revert "[ISel] Match all bits when merge undef(s) for DAG combine"

This reverts commit 5fe5aa284efed1ee1492e1f266351b35f0a8bb69.

2 years ago[ISel] Match all bits when merge undef(s) for DAG combine
Xiang1 Zhang [Thu, 30 Jun 2022 11:07:25 +0000 (19:07 +0800)]
[ISel] Match all bits when merge undef(s) for DAG combine

2 years ago[SLP][NFC]Clean up operands of the removed insertelements, NFC.
Alexey Bataev [Fri, 1 Jul 2022 00:10:04 +0000 (17:10 -0700)]
[SLP][NFC]Clean up operands of the removed insertelements, NFC.

Replace all operands of the insertelement instructions that were replaced by
shuffles with poison, to avoid false-positive reports about an incorrect function.

2 years ago[X86] Pre-commit tests for D128769. NFC
Craig Topper [Tue, 28 Jun 2022 23:10:22 +0000 (16:10 -0700)]
[X86] Pre-commit tests for D128769. NFC

2 years ago[RISCV] Avoid repeated code in SelectAddrRegImm. NFC
Craig Topper [Fri, 1 Jul 2022 00:15:55 +0000 (17:15 -0700)]
[RISCV] Avoid repeated code in SelectAddrRegImm. NFC

2 years ago[SVE] Use CPY to zero active lanes of a floating point vector.
Paul Walker [Fri, 24 Jun 2022 08:21:28 +0000 (09:21 +0100)]
[SVE] Use CPY to zero active lanes of a floating point vector.

Patterns exist for the integer case that are trivially expandable
to cover 0.0f.

Differential Revision: https://reviews.llvm.org/D128669

2 years ago[SVE] Extend "and(ipg,cmp(x,y))" patterns to cover the case when y is an immediate.
Paul Walker [Wed, 22 Jun 2022 15:05:17 +0000 (16:05 +0100)]
[SVE] Extend "and(ipg,cmp(x,y))" patterns to cover the case when y is an immediate.

Differential Revision: https://reviews.llvm.org/D128479

2 years ago[BOLT][DWARF] Support mix mode DWARF
Alexander Yermolovich [Thu, 30 Jun 2022 23:50:49 +0000 (16:50 -0700)]
[BOLT][DWARF] Support mix mode DWARF

Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy DWARF with DWARF5 split DWARF.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D128232

2 years ago[runtimes] adds llvm-libgcc to the list of runtimes to be sorted
Christopher Di Bella [Sat, 25 Jun 2022 00:17:30 +0000 (00:17 +0000)]
[runtimes] adds llvm-libgcc to the list of runtimes to be sorted

llvm-libgcc is not a part of `LLVM_ALL_RUNTIMES` because llvm-libgcc is
incompatible with an explicit libunwind and compiler-rt. This meant that
it was being filtered out and not built.

Differential Revision: https://reviews.llvm.org/D128568

2 years ago[MC][Mips] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *
Fangrui Song [Thu, 30 Jun 2022 23:39:23 +0000 (16:39 -0700)]
[MC][Mips] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *

... to match most other common architectures which already support BFD_RELOC_*.
BFD_RELOC_NONE provides a generic way of indicating a dependency between two
sections and is useful for some instrumentations which encode symbol index
information (e.g. `.cg_profile`).

2 years ago[VE] Support load/store vm registers
Kazushi (Jam) Marukawa [Sat, 25 Jun 2022 02:34:08 +0000 (11:34 +0900)]
[VE] Support load/store vm registers

Support load/store vm registers to memory location as a first step.
As a next step, support load/store vm registers to stack location.
This patch also adds several regression tests for not only load/store
vm registers but also missing load/store for vr registers.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D128610

2 years ago[Sanitizer][Darwin] Lookup dyld image header via shared cache
Julian Lettner [Thu, 30 Jun 2022 19:13:15 +0000 (12:13 -0700)]
[Sanitizer][Darwin] Lookup dyld image header via shared cache

On macOS 13+, dyld itself has moved into the shared cache.  Looking it
up via vm_region_recurse_64() now causes spins/hangs/crashes.  We use a
different set of dyld APIs to find the image header in the shared cache.

rdar://92131949

Differential Revision: https://reviews.llvm.org/D128936

2 years agoAdds AST matcher for ObjCStringLiteral
Rashmi Mudduluru [Wed, 29 Jun 2022 21:21:42 +0000 (14:21 -0700)]
Adds AST matcher for ObjCStringLiteral

Differential Revision: https://reviews.llvm.org/D128103

2 years ago[RISCV] Remove an unnecessary copy of X0 in selectShiftMask.
Craig Topper [Thu, 30 Jun 2022 22:10:31 +0000 (15:10 -0700)]
[RISCV] Remove an unnecessary copy of X0 in selectShiftMask.

We know which instruction we're emitting, so it's OK to directly
encode X0 into the instruction. We only need to create a copy when
a constant 0 is selected without the context of which instruction uses it.

2 years ago[NFC] Switch a few uses of undef to poison as placeholders for unreachable code
Nuno Lopes [Thu, 30 Jun 2022 22:01:27 +0000 (23:01 +0100)]
[NFC] Switch a few uses of undef to poison as placeholders for unreachable code

2 years agoImprove the formatting of static_assert messages
Corentin Jabot [Wed, 29 Jun 2022 17:13:19 +0000 (19:13 +0200)]
Improve the formatting of static_assert messages

Display 'static_assert failed: message' instead of
'static_assert failed "message"' to be consistent
with other implementations and be slightly more
readable.

Reviewed By: #libc, aaron.ballman, philnik, Mordante

Differential Revision: https://reviews.llvm.org/D128844

2 years ago[fix/build] Fix bazel build rule.
rdzhabarov [Thu, 30 Jun 2022 21:43:09 +0000 (21:43 +0000)]
[fix/build] Fix bazel build rule.

2 years ago[LLDB][NativePDB] Return LLDB_INVALID_ADDRESS in PdbIndex::MakeVirtualAddress when...
Zequan Wu [Thu, 30 Jun 2022 21:32:31 +0000 (14:32 -0700)]
[LLDB][NativePDB] Return LLDB_INVALID_ADDRESS in PdbIndex::MakeVirtualAddress when input is invalid due to missing address info in symbol/public records.

2 years ago[flang] Expand semantics test coverage of collective subroutines
Naje George [Thu, 30 Jun 2022 21:26:54 +0000 (14:26 -0700)]
[flang] Expand semantics test coverage of collective subroutines

Add non-standard conforming calls that violate the intent(inout)
of errmsg argument for co_sum, co_max, co_min, and co_broadcast.
Add non-standard conforming calls that violate the argument
typing of errmsg argument for co_max, co_min, and co_broadcast.
Add standard conforming calls that reorder keyword arguments
for co_sum and co_reduce.

Reviewed By: ktras

Differential Revision: https://reviews.llvm.org/D128468

2 years ago[InstCombine] Changing constant-indexed GEP of GEP to i8* for merging
William Huang [Thu, 30 Jun 2022 18:12:20 +0000 (18:12 +0000)]
[InstCombine] Changing constant-indexed GEP of GEP to i8* for merging

When merging GEP of GEP with constant indices, if the second GEP's offset is not divisible by the first GEP's element size, convert both types to i8* and merge.
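
The arithmetic behind this rule can be sketched as below. This is a simplified model, not the InstCombine code: `MergedGEP` and `mergeConstantGEPs` are hypothetical names, each GEP is reduced to a single constant index or byte offset, and the element size is assumed to be positive.

```cpp
#include <cstdint>

struct MergedGEP {
  bool UseByteIndexing; // true: index through i8* with a byte offset
  std::int64_t Index;   // element index, or byte offset if UseByteIndexing
};

// FirstElemSize: size in bytes of the first GEP's element type (> 0).
// FirstIndex:    constant index of the first GEP, in elements.
// SecondOffset:  constant byte offset contributed by the second GEP.
inline MergedGEP mergeConstantGEPs(std::int64_t FirstElemSize,
                                   std::int64_t FirstIndex,
                                   std::int64_t SecondOffset) {
  std::int64_t TotalBytes = FirstIndex * FirstElemSize + SecondOffset;
  // Divisible: the sum is still a whole number of first-type elements.
  if (SecondOffset % FirstElemSize == 0)
    return {false, TotalBytes / FirstElemSize};
  // Not divisible: fall back to byte addressing so the offsets can be summed.
  return {true, TotalBytes};
}
```

For example, with an 8-byte element type, index 2 plus a 16-byte offset merges to element index 4, while index 2 plus a 4-byte offset can only be expressed as a byte offset of 20.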

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D125934