platform/upstream/llvm.git
2 years agoIntrinsics: Mark llvm.eh.sjlj.callsite argument as immarg
Matt Arsenault [Sun, 17 Apr 2022 20:39:02 +0000 (16:39 -0400)]
Intrinsics: Mark llvm.eh.sjlj.callsite argument as immarg

The assert in SelectionDAG implies that it is.

2 years agoAArch64/GlobalISel: Add -global-isel-abort=1 to select tests
Matt Arsenault [Mon, 11 Apr 2022 20:04:44 +0000 (16:04 -0400)]
AArch64/GlobalISel: Add -global-isel-abort=1 to select tests

Otherwise the legalizer verifier error isn't triggered since the
default is fallback.

2 years agoGlobalISel: Add LegalizeMutations to help use More/FewerElements
Matt Arsenault [Tue, 12 Apr 2022 02:04:17 +0000 (22:04 -0400)]
GlobalISel: Add LegalizeMutations to help use More/FewerElements

2 years agoAArch64/GlobalISel: Reduce use of getMinClassForRegBank
Matt Arsenault [Sat, 9 Apr 2022 13:49:08 +0000 (09:49 -0400)]
AArch64/GlobalISel: Reduce use of getMinClassForRegBank

getMinClassForRegBank and getRegClassForTypeOnBank were basically
identical functions with different APIs. Consolidate on the version
that uses LLT instead of a bitwidth, since that would be more
appropriate to use in a generic API. Keep getMinClassForRegBank around
for now, since copies are a special case that can't simply read the
type from the register operands.

2 years agoGlobalISel: Add LLT helper to multiply vector sizes
Matt Arsenault [Sat, 9 Apr 2022 14:45:31 +0000 (10:45 -0400)]
GlobalISel: Add LLT helper to multiply vector sizes

2 years agoAArch64/GlobalISel: Remove some null checks for getVRegDef
Matt Arsenault [Sat, 9 Apr 2022 13:11:06 +0000 (09:11 -0400)]
AArch64/GlobalISel: Remove some null checks for getVRegDef

getVRegDef is not allowed to fail for generic virtual registers, so
there's not much point in checking it.

2 years agoAArch64/GlobalISel: Remove asserts on copy instructions
Matt Arsenault [Sat, 9 Apr 2022 13:01:32 +0000 (09:01 -0400)]
AArch64/GlobalISel: Remove asserts on copy instructions

These things are checked in the verifier already, so there's not much
point in re-asserting them here. They aren't directly verified for the
copy-like extension artifacts, but the incorrect output copies would
be caught on the other side.

2 years ago[lldb/gdb-remote] Fix -Wswitch after D116462
Fangrui Song [Wed, 20 Apr 2022 01:01:06 +0000 (18:01 -0700)]
[lldb/gdb-remote] Fix -Wswitch after D116462

2 years agoApply clang-tidy fixes for llvm-twine-local in OpenMPToLLVMIRTranslation.cpp (NFC)
Mehdi Amini [Sat, 16 Apr 2022 08:21:24 +0000 (08:21 +0000)]
Apply clang-tidy fixes for llvm-twine-local in OpenMPToLLVMIRTranslation.cpp (NFC)

2 years ago[CodeGen] Fix -Wswitch after D116462
Fangrui Song [Wed, 20 Apr 2022 00:33:15 +0000 (17:33 -0700)]
[CodeGen] Fix -Wswitch after D116462

2 years ago[CodeGen] Fix -Wswitch after D116462
Fangrui Song [Wed, 20 Apr 2022 00:28:54 +0000 (17:28 -0700)]
[CodeGen] Fix -Wswitch after D116462

2 years ago[DFSan] Print an error before calling null extern_weak functions, incase dfsan instru...
Andrew Browne [Tue, 19 Apr 2022 22:45:28 +0000 (15:45 -0700)]
[DFSan] Print an error before calling null extern_weak functions, in case dfsan instrumentation optimized out a null check.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D124051

2 years ago[msan] Destroy ConstantTokenNone before types above
Vitaly Buka [Tue, 19 Apr 2022 23:29:07 +0000 (16:29 -0700)]
[msan] Destroy ConstantTokenNone before types above

~ConstantTokenNone accesses them, so it should be destroyed first.

2 years ago[msan] Disable assert with msan
Vitaly Buka [Tue, 19 Apr 2022 23:26:17 +0000 (16:26 -0700)]
[msan] Disable assert with msan

The assert uses data from the just-destroyed BasicBlock.

2 years ago[msan] Advance before destroying entry
Vitaly Buka [Tue, 19 Apr 2022 23:22:37 +0000 (16:22 -0700)]
[msan] Advance before destroying entry

-fsanitize-memory-use-after-dtor reports this memory access.

2 years ago[SPIR-V](6/6) Add the module analysis pass and the simplest tests
Ilia Diachkov [Thu, 14 Apr 2022 00:46:45 +0000 (03:46 +0300)]
[SPIR-V](6/6) Add the module analysis pass and the simplest tests

This patch adds one SPIRV analysis pass and extends AsmPrinter. It is
essential for minimal SPIR-V output. It also adds a few simple tests
to show that the target basically works.

Differential Revision: https://reviews.llvm.org/D116465

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years ago[SPIR-V](5/6) Add LegalizerInfo, InstructionSelector and utilities
Ilia Diachkov [Wed, 13 Apr 2022 22:11:15 +0000 (01:11 +0300)]
[SPIR-V](5/6) Add LegalizerInfo, InstructionSelector and utilities

The patch adds SPIRVLegalizerInfo, SPIRVInstructionSelector and
SPIRV-specific utilities.

Differential Revision: https://reviews.llvm.org/D116464

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years ago[SPIR-V](4/6) Add target lowering, TargetMachine and AsmPrinter
Ilia Diachkov [Wed, 13 Apr 2022 22:10:25 +0000 (01:10 +0300)]
[SPIR-V](4/6) Add target lowering, TargetMachine and AsmPrinter

The patch contains target lowering for SPIRV. Also it implements
TargetMachine and AsmPrinter.

Differential Revision: https://reviews.llvm.org/D116463

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years ago[SPIR-V](3/6) Add MC layer, object file support, and InstPrinter
Ilia Diachkov [Wed, 13 Apr 2022 22:10:08 +0000 (01:10 +0300)]
[SPIR-V](3/6) Add MC layer, object file support, and InstPrinter

The patch adds SPIRV-specific MC layer implementation, SPIRV object
file support and SPIRVInstPrinter.

Differential Revision: https://reviews.llvm.org/D116462

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years ago[SPIR-V](2/6) Add SPIRV target description files
Ilia Diachkov [Wed, 13 Apr 2022 22:07:33 +0000 (01:07 +0300)]
[SPIR-V](2/6) Add SPIRV target description files

Differential Revision: https://reviews.llvm.org/D115786

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years ago[SPIR-V](1/6) Add stub for SPIRV backend
Ilia Diachkov [Wed, 13 Apr 2022 22:06:35 +0000 (01:06 +0300)]
[SPIR-V](1/6) Add stub for SPIRV backend

This patch contains enough for lib/Target/SPIRV to compile: a basic
SPIRVTargetMachine and SPIRVTargetInfo.

Differential Revision: https://reviews.llvm.org/D115009

Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2 years agoAMDGPU: More mad_64_32 test cases for multiple uses
Nicolai Hähnle [Tue, 19 Apr 2022 22:34:23 +0000 (17:34 -0500)]
AMDGPU: More mad_64_32 test cases for multiple uses

Also use gfx90a for the gfx9 test, whose code gen should be affected by
faster multiply-add instructions.

2 years ago[PS5] Avoid a driver crash
Paul Robinson [Tue, 19 Apr 2022 22:55:01 +0000 (15:55 -0700)]
[PS5] Avoid a driver crash

In some cases, an error constructing a compiler or assembler job could
leave the Inputs in a state that the code for constructing the linker
job was not ready for.

2 years ago[OpenMP] Add necessary registered targets for linker wrapper test
Joseph Huber [Tue, 19 Apr 2022 22:46:12 +0000 (18:46 -0400)]
[OpenMP] Add necessary registered targets for linker wrapper test

Summary:
The linker wrapper needs to use the registered backend to perform LTO.
This was causing problems on the buildbots that didn't support it.

2 years ago[OpenMP] Fix deleted move constructor failing on some compiles
Joseph Huber [Tue, 19 Apr 2022 22:40:15 +0000 (18:40 -0400)]
[OpenMP] Fix deleted move constructor failing on some compiles

Summary:
A previous commit added some new errors that were not correctly cast
to an r-value. This doesn't work on some compilers.
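
For illustration only, a minimal sketch of the kind of pattern involved,
assuming the problem was a move-only error object returned from a function
whose return type is a different wrapper type. `DemoError`, `DemoExpected`
and `makeFailure` are hypothetical stand-ins, not the actual LLVM classes.

```cpp
#include <utility>

// Hypothetical move-only error type (a stand-in, not llvm::Error).
struct DemoError {
  DemoError() = default;
  DemoError(DemoError &&) = default;
  DemoError(const DemoError &) = delete;
};

// Hypothetical result wrapper that can only be built from an rvalue error.
struct DemoExpected {
  bool HasError;
  DemoExpected(DemoError &&) : HasError(true) {}
};

DemoExpected makeFailure() {
  DemoError E;
  // Returning plain `E` relies on an implicit move into a different type,
  // which some older compilers reject. The explicit cast works everywhere.
  return std::move(E);
}
```

With the explicit `std::move`, the conversion no longer depends on which
implicit-move rules a given compiler implements.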

2 years ago[OpenMP] Add better testing for the linker wrapper
Joseph Huber [Tue, 19 Apr 2022 18:14:16 +0000 (14:14 -0400)]
[OpenMP] Add better testing for the linker wrapper

The linker wrapper is used to perform linking and wrapping of embedded
device object files. Currently its internals are not able to be tested
easily. This patch adds the `--dry-run` and `--print-wrapped-module`
options to investigate the link jobs that will be run along with the
wrapped code that will be created to register the binaries.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D124039

2 years ago[BPF] Fix a bug in BPFMISimplifyPatchable pass
Peter Klausler [Mon, 4 Apr 2022 18:39:51 +0000 (11:39 -0700)]
[BPF] Fix a bug in BPFMISimplifyPatchable pass

LLVM BPF pass SimplifyPatchable is used to do necessary
code conversion for CO-RE operations. When studying bpf
selftest 'exhandler', I found a corner case not handled properly.
The following is the C code, modified from original 'exhandler'
code.
  int g;
  int test(struct t1 *p) {
    struct t2 *q = p->q;
    if (q)
      return 0;
    struct t3 *f = q->f;
    if (!f) g = 5;
    return 0;
  }

For code:
  struct t3 *f = q->f;
  if (!f) ...
The IR before BPFMISimplifyPatchable pass looks like:
  %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1"
  %6:gpr = LDD killed %5:gpr, 0
  %7:gpr = LDD killed %6:gpr, 0
  JNE_ri killed %7:gpr, 0, %bb.3
  JMP %bb.2
Note that the compiler knows q = 0 based on dataflow and value analysis.
The correct generated code after the pass should be
  %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1"
  %7:gpr = LDD killed %5:gpr, 0
  JNE_ri killed %7:gpr, 0, %bb.3
  JMP %bb.2

But the current implementation optimizes the above code further and
generates
  %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1"
  JNE_ri killed %5:gpr, 0, %bb.3
  JMP %bb.2
which is incorrect.

This patch adds a cache that remembers the load insns not associated
with a CO-RE offset value and skips these load insns during
transformation.

Differential Revision: https://reviews.llvm.org/D123883

2 years ago[MLIR] [Python] Add a method to clear live operations map
John Demme [Tue, 19 Apr 2022 22:03:15 +0000 (15:03 -0700)]
[MLIR] [Python] Add a method to clear live operations map

Introduce a method on PyMlirContext (and plumb it through to Python) to
invalidate all of the operations in the live operations map and clear
it. Since Python has no notion of private data, an end-developer could
reach into some 3rd party API which uses the MLIR Python API (that is
behaving correctly with regard to holding references) and grab a
reference to an MLIR Python Operation, preventing it from being
deconstructed out of the live operations map. This allows the API
developer to clear the map when it calls C++ code which could delete
operations, protecting itself from its users.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123895

2 years ago[RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x).
Craig Topper [Tue, 19 Apr 2022 21:38:23 +0000 (14:38 -0700)]
[RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x).

There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.
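
As a quick sanity check of the identity behind this fold, here is a small
sketch that models only the low 32 bits of the W instructions (the sign
extension into the upper half of the register is ignored). The helper names
are made up for this example.

```cpp
#include <cstdint>

// Rotate a 32-bit value left; only the low 5 bits of the amount are used.
uint32_t rotl32(uint32_t V, uint32_t Amt) {
  Amt &= 31;
  return (V << Amt) | (V >> ((32 - Amt) & 31));
}

// Original pattern: (xor (sllw 1, x), -1).
uint32_t notOfShiftedOne(uint32_t X) {
  return (1u << (X & 31)) ^ 0xFFFFFFFFu;
}

// Folded pattern: (rolw ~1, x).
uint32_t rotatedNotOne(uint32_t X) { return rotl32(~1u, X); }
```

Both forms clear exactly one bit: shifting a single 1 and inverting gives
the same result as rotating the single 0 of ~1 into position.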

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123983

2 years ago[Driver][test] Remove unneeded clang from -cc1 CHECK lines
Fangrui Song [Tue, 19 Apr 2022 21:58:48 +0000 (14:58 -0700)]
[Driver][test] Remove unneeded clang from -cc1 CHECK lines

The convention is to omit "clang" for -cc1 CHECK lines and test that -triple is adjacent to -cc1.

2 years ago[modules] Merge variable template specializations.
Richard Smith [Tue, 19 Apr 2022 21:40:52 +0000 (14:40 -0700)]
[modules] Merge variable template specializations.

2 years ago[BPF] Emit fatal error if out of range for FK_PCRel_2 branch target
Yonghong Song [Fri, 15 Apr 2022 21:57:48 +0000 (14:57 -0700)]
[BPF] Emit fatal error if out of range for FK_PCRel_2 branch target

Currently for the branch insn like
   "if $dst "#OpcodeStr#" $imm goto $BrDst"
The $BrDst range needs to be in the range of [INT16_MIN, INT16_MAX].

When running bpf selftest with latest llvm, I found
pyperf600.o generated insn with range outside
of [INT16_MIN, INT16_MAX], which caused verifier failure.
See below insn #12.

  0000000000000000 <on_event>:
  ; {
         0:       7b 1a 00 ff 00 00 00 00 *(u64 *)(r10 - 256) = r1
  ;       uint64_t pid_tgid = bpf_get_current_pid_tgid();
         1:       85 00 00 00 0e 00 00 00 call 14
         2:       bf 06 00 00 00 00 00 00 r6 = r0
  ;       pid_t pid = (pid_t)(pid_tgid >> 32);
         3:       bf 61 00 00 00 00 00 00 r1 = r6
         4:       77 01 00 00 20 00 00 00 r1 >>= 32
         5:       63 1a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r1
         6:       bf a2 00 00 00 00 00 00 r2 = r10
         7:       07 02 00 00 fc ff ff ff r2 += -4
  ;       PidData* pidData = bpf_map_lookup_elem(&pidmap, &pid);
         8:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
        10:       85 00 00 00 01 00 00 00 call 1
        11:       bf 08 00 00 00 00 00 00 r8 = r0
  ;       if (!pidData)
        12:       15 08 15 e8 00 00 00 00 if r8 == 0 goto -6123 <LBB0_27588+0xffffffffffdae100>
        13:       b4 01 00 00 00 00 00 00 w1 = 0

We may need to add a new insn to extend the range of $BrDst.
This patch adds a fatal error for the out-of-range case so that the
compiler reports what would otherwise be incorrect code generation.
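
A minimal sketch of the range check described here, not the actual BPF
backend code. The function names are hypothetical, and the 59413 used in
the note below is only an illustrative value, not the real offset from
pyperf600.o.

```cpp
#include <cstdint>

// True if a branch offset (in instructions) fits the signed 16-bit field.
bool fitsBranchField(int64_t Offset) {
  return Offset >= INT16_MIN && Offset <= INT16_MAX;
}

// What lands in the field if an out-of-range offset is stored unchecked:
// only the low 16 bits survive (two's complement assumed).
int16_t truncatedOffset(int64_t Offset) {
  return static_cast<int16_t>(static_cast<uint16_t>(Offset));
}

// Decode the little-endian 16-bit offset field of an encoded insn.
int16_t decodeOffset(uint8_t Lo, uint8_t Hi) {
  return static_cast<int16_t>(static_cast<uint16_t>(Lo | (Hi << 8)));
}
```

The offset bytes `15 e8` of insn #12 decode to -6123, matching the
disassembly. An unchecked offset such as 59413 would be stored as that same
-6123, which is why a silent truncation produces a wrong branch target.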

Differential Revision: https://reviews.llvm.org/D123877

2 years ago[gn build] Port bac6cd5bf856
LLVM GN Syncbot [Tue, 19 Apr 2022 21:23:58 +0000 (21:23 +0000)]
[gn build] Port bac6cd5bf856

2 years ago[misexpect] Re-implement MisExpect Diagnostics
Paul Kirth [Fri, 1 Apr 2022 22:03:48 +0000 (22:03 +0000)]
[misexpect] Re-implement MisExpect Diagnostics

Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) For IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907

2 years ago[clang][dataflow] Do not crash on missing `Value` for struct-typed variable init.
Yitzhak Mandelbaum [Thu, 14 Apr 2022 13:41:06 +0000 (13:41 +0000)]
[clang][dataflow] Do not crash on missing `Value` for struct-typed variable init.

Remove constraint that an initializing expression of struct type must have an
associated `Value`. This invariant is not and will not be guaranteed by the
framework, because of potentially uninitialized fields.

Differential Revision: https://reviews.llvm.org/D123961

2 years ago[Libomptarget][remote] Fix compile-time error
Atmn Patel [Wed, 16 Feb 2022 23:20:43 +0000 (18:20 -0500)]
[Libomptarget][remote] Fix compile-time error

This fixes a compile-time error recently introduced within the remote
offloading plugin. This patch also removes some extra linker flags that are unnecessary, and adds an explicit abseil linker flag without which we occasionally get problems.

Differential Revision: https://reviews.llvm.org/D119984

2 years ago[gn build] Port c57f03415f96
LLVM GN Syncbot [Tue, 19 Apr 2022 20:13:49 +0000 (20:13 +0000)]
[gn build] Port c57f03415f96

2 years ago[clang][Sema] Add flag to LookupName to force C/ObjC codepath
Alex Langford [Mon, 18 Apr 2022 18:41:20 +0000 (11:41 -0700)]
[clang][Sema] Add flag to LookupName to force C/ObjC codepath

Motivation: The intent here is for use in Swift.
When building a clang module for swift consumption, swift adds an
extension block to the module for name lookup purposes. Swift calls
this a SwiftLookupTable. One purpose that this serves is to handle
conflicting names between ObjC classes and ObjC protocols. They exist in
different namespaces in ObjC programs, but in Swift they would exist in
the same namespace. Swift handles this by appending a suffix to a
protocol name if it shares a name with a class. For example, if you have
an ObjC class named "Foo" and a protocol with the same name, the
protocol would be renamed to "FooProtocol" when imported into swift.

When constructing the previously mentioned SwiftLookupTable, we use
Sema::LookupName to look up name conflicts for the previous problem.
By this time, the Parser has long finished its job so the call to
LookupName gets nullptr for its Scope (TUScope will be nullptr
by this point). The C/ObjC path does not have this problem because it
only uses the Scope in specific scenarios. The C++ codepath uses the
Scope quite extensively and will fail early on if the Scope it gets is
null. In our very specific case of looking up ObjC classes with a
specific name, we want to force Sema::LookupName to take the C/ObjC
codepath even if C++ or ObjC++ is enabled.

2 years ago[mlir] Adds getUpperBound() to LoopLikeInterface.
Krzysztof Drewniak [Tue, 8 Mar 2022 18:39:52 +0000 (18:39 +0000)]
[mlir] Adds getUpperBound() to LoopLikeInterface.

getUpperBound is analogous to getLowerBound(), except for the upper
bound, and is used in range analysis.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124020

2 years ago[mlir][transform] Introduce transform.sequence op
Alex Zinenko [Tue, 19 Apr 2022 14:36:37 +0000 (16:36 +0200)]
[mlir][transform] Introduce transform.sequence op

Sequence is an important transform combination primitive that just indicates
transform ops being applied in a row. The simplest version fails
immediately if any transformation in the sequence fails. Introducing this
operation allows one to start placing transform IR within other IR.

Depends On D123135

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D123664

2 years ago[analyzer] Implemented RangeSet::Factory::castTo function to perform promotions,...
Denys Petrov [Tue, 25 May 2021 15:24:48 +0000 (18:24 +0300)]
[analyzer] Implemented RangeSet::Factory::castTo function to perform promotions, truncations and conversions.

Summary: Handle casts for ranges working similarly to APSIntType::apply function but for the whole range set. Support promotions, truncations and conversions.
Example:
promotion: char [0, 42] -> short [0, 42] -> int [0, 42] -> llong [0, 42]
truncation: llong [4295033088, 4295033130] -> int [65792, 65834] -> short [256, 298] -> char [0, 42]
conversion: char [-42, 42] -> uint [0, 42]U[4294967254, 4294967295] -> short[-42, 42]
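
The arithmetic in these examples can be reproduced with a small standalone
sketch. It is not the analyzer's implementation (which works on APSInt
range sets); `Range`, `wrap` and `castRange` are hypothetical names, a
single input range is handled, and target widths are limited to 32 bits so
that everything fits in int64_t.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<int64_t, int64_t>;

// Wrap V into the value space of a Bits-wide integer (Bits <= 32 here).
static int64_t wrap(int64_t V, unsigned Bits, bool IsUnsigned) {
  const int64_t Mod = int64_t{1} << Bits;
  int64_t R = ((V % Mod) + Mod) % Mod;       // now in [0, 2^Bits)
  if (!IsUnsigned && R >= Mod / 2)
    R -= Mod;                                // reinterpret as signed
  return R;
}

// Cast one closed range to a Bits-wide type; the result is one range, or
// two when the cast makes the range wrap around the type's limits.
std::vector<Range> castRange(Range In, unsigned Bits, bool IsUnsigned) {
  assert(Bits >= 1 && Bits <= 32 && In.first <= In.second);
  const int64_t Mod = int64_t{1} << Bits;
  const int64_t Min = IsUnsigned ? 0 : -(Mod / 2);
  const int64_t Max = Min + Mod - 1;
  if (In.second - In.first >= Mod - 1)
    return {{Min, Max}};                     // covers the whole type
  const int64_t Lo = wrap(In.first, Bits, IsUnsigned);
  const int64_t Hi = wrap(In.second, Bits, IsUnsigned);
  if (Lo <= Hi)
    return {{Lo, Hi}};
  return {{Min, Hi}, {Lo, Max}};             // wrapped: split in two
}
```

For instance, `castRange({-42, 42}, 32, true)` yields the two ranges
[0, 42] and [4294967254, 4294967295] from the conversion example.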

Differential Revision: https://reviews.llvm.org/D103094

2 years ago[MLIR] Add function to create BFloat16 array attribute
Ashay Rane [Tue, 19 Apr 2022 18:53:29 +0000 (18:53 +0000)]
[MLIR] Add function to create BFloat16 array attribute

This patch adds a new function `mlirDenseElementsAttrBFloat16Get()`,
which accepts the shaped type, the number of BFloat16 values, and a
pointer to an array of BFloat16 values, each of which is a `uint16_t`
value.
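
Since each BFloat16 value is passed as a raw `uint16_t`, callers need to
produce that bit pattern themselves. One possible way, sketched below under
the assumption that plain truncation (no rounding) is acceptable, is to keep
the upper 16 bits of an IEEE-754 single-precision float.
`floatToBFloat16Bits` is a hypothetical helper and not part of the MLIR C
API.

```cpp
#include <cstdint>
#include <cstring>

// Convert a float to a bfloat16 bit pattern by keeping its upper 16 bits.
// This truncates the mantissa; it does not round to nearest.
uint16_t floatToBFloat16Bits(float F) {
  uint32_t Bits;
  static_assert(sizeof(Bits) == sizeof(F), "float must be 32 bits wide");
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<uint16_t>(Bits >> 16);
}
```

An array filled with such values could then be handed to the new function
together with its element count and the shaped type.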

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D123981

2 years ago[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls.
Jonas Paulsson [Thu, 14 Apr 2022 08:50:26 +0000 (10:50 +0200)]
[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls.

A new set of overloaded functions named getOrInsertLibFunc() is now supposed
to be used instead of getOrInsertFunction() when building a libcall from
within an LLVM optimizer. The idea is that this new function also makes
sure that any mandatory argument attributes are added to the function
prototype (after calling getOrInsertFunction()).

inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it
only adds attributes that are not necessary for correctness but merely
help later optimizations.

Generally, the front end is responsible for building a correct function
prototype with the needed argument attributes. If the middle end however is
the one creating the call, e.g. when replacing one libcall with another, it
then must take this responsibility.

This continues the work of properly handling argument extension if required
by the target ABI when building a lib call. getOrInsertLibFunc() now does
this for all libcalls currently built by any LLVM optimizer. It is expected
that when in the future a new optimization builds a new libcall with an
integer argument it is to be added to getOrInsertLibFunc() with the proper
handling. Note that not all targets have it in their ABI to sign/zero extend
integer arguments to the full register width, but this will be done
selectively as determined by getExtAttrForI32Param().

Review: Eli Friedman, Nikita Popov, Dávid Bolvanský

Differential Revision: https://reviews.llvm.org/D123198

2 years ago[InstCombine] C0 shift (X add nuw C) --> (C0 shift C) shift X
Sanjay Patel [Tue, 19 Apr 2022 19:09:45 +0000 (15:09 -0400)]
[InstCombine] C0 shift (X add nuw C) --> (C0 shift C) shift X

With 'nuw' we can convert the increment of the shift amount
into a pre-shift (constant fold) of the shifted constant:
https://alive2.llvm.org/ce/z/FkTyR2
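
A small sketch of the same identity on 32-bit unsigned values, for the shl
and lshr cases. The function names are made up for this example, and the
asserts stand in for the 'nuw' and in-range shift amount requirements
(shifting by 32 or more is undefined in C++).

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: C0 << (X + C), where X + C does not wrap.
uint32_t shlOfAdd(uint32_t C0, uint32_t X, uint32_t C) {
  assert(X < 32 && C < 32 && X + C < 32 && "shift amount out of range");
  return C0 << (X + C);
}

// Folded pattern: (C0 << C) << X, with C0 << C being a constant fold.
uint32_t shlFolded(uint32_t C0, uint32_t X, uint32_t C) {
  assert(X < 32 && C < 32 && X + C < 32 && "shift amount out of range");
  return (C0 << C) << X;
}

// The same pair for a logical shift right.
uint32_t lshrOfAdd(uint32_t C0, uint32_t X, uint32_t C) {
  assert(X < 32 && C < 32 && X + C < 32 && "shift amount out of range");
  return C0 >> (X + C);
}

uint32_t lshrFolded(uint32_t C0, uint32_t X, uint32_t C) {
  assert(X < 32 && C < 32 && X + C < 32 && "shift amount out of range");
  return (C0 >> C) >> X;
}
```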

Fixes issue #41976

2 years ago[InstCombine] add tests for shift-of-add with constants; NFC
Sanjay Patel [Tue, 19 Apr 2022 18:48:51 +0000 (14:48 -0400)]
[InstCombine] add tests for shift-of-add with constants; NFC

2 years ago[ASan] Removed checks if the tested functions were emitted.
Kirill Stoimenov [Tue, 19 Apr 2022 19:00:41 +0000 (19:00 +0000)]
[ASan] Removed checks if the tested functions were emitted.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D124030

2 years ago[NFC][SLP] Improved description of getShallowScore() and getScoreAtLevelRec()
Vasileios Porpodas [Tue, 19 Apr 2022 17:59:55 +0000 (10:59 -0700)]
[NFC][SLP] Improved description of getShallowScore() and getScoreAtLevelRec()

Differential Revision: https://reviews.llvm.org/D124027

2 years ago[CUDA][HIP] Fix delete operator for -fopenmp
Yaxun (Sam) Liu [Tue, 19 Apr 2022 02:21:47 +0000 (22:21 -0400)]
[CUDA][HIP] Fix delete operator for -fopenmp

When a new operator is called in an OpenMP parallel region,
the delete operator is resolved and checked. Due to an issue
similar to the one fixed by https://reviews.llvm.org/D121765,
the caller was not determined correctly when resolving the
delete operator, which results in the error
shown in https://godbolt.org/z/jKhd8qKos.

This patch fixes the issue in a similar way as
https://reviews.llvm.org/D121765

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123976

2 years ago[IRSim] Ignore debug instructions when creating canonical numbering
Andrew Litteken [Sat, 16 Apr 2022 21:11:39 +0000 (16:11 -0500)]
[IRSim] Ignore debug instructions when creating canonical numbering

When constructing canonical relationships between two regions, the first instruction of a basic block from the first region is used to find the corresponding basic block from the second region. However, debug instructions are not included in similarity matching, and therefore do not have a canonical numbering. This patch makes sure to ignore the debug instructions when finding the first instruction in a basic block.
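
A minimal sketch of the idea, using a hypothetical `DemoInst` record rather
than the real LLVM instruction classes: when picking the instruction that
represents a basic block, debug instructions are skipped.

```cpp
#include <vector>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct DemoInst {
  bool IsDebug; // true for debug-only instructions
  int Id;       // arbitrary identifier used in this example
};

// Returns the first non-debug instruction of a block, or nullptr if the
// block contains only debug instructions (or is empty).
const DemoInst *firstNonDebug(const std::vector<DemoInst> &Block) {
  for (const DemoInst &I : Block)
    if (!I.IsDebug)
      return &I;
  return nullptr;
}
```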

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D123903

2 years ago[Go] Remove PopulateLTOPassManager binding after D123882
Fangrui Song [Tue, 19 Apr 2022 18:16:27 +0000 (11:16 -0700)]
[Go] Remove PopulateLTOPassManager binding after D123882

2 years ago[compiler-rt] Use ld64 flag -lto_library instead of DYLD_LIBRARY_PATH
Nico Weber [Tue, 19 Apr 2022 17:29:44 +0000 (13:29 -0400)]
[compiler-rt] Use ld64 flag -lto_library instead of DYLD_LIBRARY_PATH

Makes

 bin/llvm-lit \
  projects/compiler-rt/test/profile/Profile-arm64/instrprof-darwin-dead-strip.c

pass on my machine.

Without this change, ld64 complains that the bitcode was generated by LLVM 15
while the reader is 13.1 -- the version of Xcode on my machine. Looks like the
DYLD_LIBRARY_PATH technique isn't working.

-lto_library was added back in ld64-136, which was in Xcode 4.6, which was
released over 10 years ago. So relying on it should be safe by now.

Differential Revision: https://reviews.llvm.org/D124018

2 years agoPrint custom assembly on pass failure by default
Mehdi Amini [Tue, 19 Apr 2022 17:26:33 +0000 (17:26 +0000)]
Print custom assembly on pass failure by default

The printer is now resilient to invalid IR and automatically falls
back to the generic form on invalid IR. Using the generic printer on
pass failure was a conservative option before the printer was made
failsafe.

Reviewed By: lattner, rriddle, jpienaar, bondhugula

Differential Revision: https://reviews.llvm.org/D123915

2 years ago[clangd] Dont include version string in update tasks
Kadir Cetinkaya [Tue, 19 Apr 2022 16:21:03 +0000 (18:21 +0200)]
[clangd] Dont include version string in update tasks

Including the version string increases the cardinality of span latency
metrics. Previously it was shown to the user via file status updates as
`Running Update (x)`; after this change we'll only display `Running Update`.
This also affects logs in case of a crash, but contents and version number
for inputs are printed separately in that case already.

Differential Revision: https://reviews.llvm.org/D124013

2 years agoApply clang-tidy fixes for llvm-qualified-auto in OpenMPToLLVMIRTranslation.cpp ...
Mehdi Amini [Sat, 16 Apr 2022 08:20:41 +0000 (08:20 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in OpenMPToLLVMIRTranslation.cpp (NFC)

2 years agoApply clang-tidy fixes for performance-unnecessary-value-param in ControlFlowInterfac...
Mehdi Amini [Sat, 16 Apr 2022 08:06:25 +0000 (08:06 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in ControlFlowInterfaces.cpp (NFC)

2 years ago[InstCombine] add tests for freeze of partial undef vector constants; NFC
Sanjay Patel [Mon, 18 Apr 2022 20:45:25 +0000 (16:45 -0400)]
[InstCombine] add tests for freeze of partial undef vector constants; NFC

2 years ago[OCaml] Fix pass builder test
Nikita Popov [Tue, 19 Apr 2022 16:34:31 +0000 (18:34 +0200)]
[OCaml] Fix pass builder test

The LTO API has been removed.

2 years ago[Test] Add more tests showing duplicate PHIs generated by RS4GC (NFC)
Dmitry Makogon [Tue, 19 Apr 2022 15:53:57 +0000 (22:53 +0700)]
[Test] Add more tests showing duplicate PHIs generated by RS4GC (NFC)

This adds more tests with derived pointers.

2 years ago[PPCGCodeGeneration] Look for function instead of function pointer type
Nikita Popov [Tue, 19 Apr 2022 15:59:34 +0000 (17:59 +0200)]
[PPCGCodeGeneration] Look for function instead of function pointer type

What this code is actually interested in are references to functions.
Use of a function pointer type is being used as an imprecise proxy
for that.

2 years ago[PPCGCodeGeneration] Avoid another pointer element type access
Nikita Popov [Tue, 19 Apr 2022 15:25:47 +0000 (17:25 +0200)]
[PPCGCodeGeneration] Avoid another pointer element type access

Use an API that returns both the address and the element type,
and use that for the load type.

2 years ago[PerfectShuffle] Remove unused variables from D123386. NFC
David Green [Tue, 19 Apr 2022 15:22:04 +0000 (16:22 +0100)]
[PerfectShuffle] Remove unused variables from D123386. NFC

2 years ago[VPlan] Remove unused SCEV forward declaration (NFC).
Florian Hahn [Tue, 19 Apr 2022 15:16:17 +0000 (17:16 +0200)]
[VPlan] Remove unused SCEV forward declaration (NFC).

2 years ago[PPCGCodeGeneration] Avoid pointer element type access
Nikita Popov [Tue, 19 Apr 2022 15:09:11 +0000 (17:09 +0200)]
[PPCGCodeGeneration] Avoid pointer element type access

Pass through the ArrayTy instead.

2 years ago[ASan] Fixed a reporting bug in (load|store)N functions which would print unknown...
Kirill Stoimenov [Mon, 18 Apr 2022 23:36:06 +0000 (23:36 +0000)]
[ASan] Fixed a reporting bug in (load|store)N functions which would print unknown-crash instead of the proper error message when the data access is unaligned.

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D123643

2 years ago[SystemZ] Handle SystemZ specific inline assembly address operands.
Jonas Paulsson [Tue, 22 Mar 2022 09:40:18 +0000 (10:40 +0100)]
[SystemZ] Handle SystemZ specific inline assembly address operands.

Handle ZQ, ZR, ZS and ZT inline assembly operand constraints.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110267

2 years ago[analyzer] Expose Taint.h to plugins
Tom Ritter [Tue, 19 Apr 2022 14:55:01 +0000 (16:55 +0200)]
[analyzer] Expose Taint.h to plugins

Reviewed By: NoQ, xazax.hun, steakhal

Differential Revision: https://reviews.llvm.org/D123155

2 years ago[llvm-ar][test] Rename two tests and use correct thin command
gbreynoo [Tue, 19 Apr 2022 14:10:51 +0000 (15:10 +0100)]
[llvm-ar][test] Rename two tests and use correct thin command

Two tests used the term "full archive" rather than "regular"; these have
been updated, including the test names. They now also use --thin rather
than the deprecated T. This change was made in preparation for D123142.

Differential Revision: https://reviews.llvm.org/D123778

2 years ago[clang] Adding Platform/Architecture Specific Resource Header Installation Targets
Qiongsi Wu [Tue, 19 Apr 2022 14:08:57 +0000 (10:08 -0400)]
[clang] Adding Platform/Architecture Specific Resource Header Installation Targets

The goal of this patch is to improve distribution build's flexibility to include only applicable header files.

Currently, the clang-resource-headers target contains nearly all the files in clang/lib/Headers. Most of these files are platform specific (e.g. immintrin.h is x86 specific). A distribution build will have to either include all the headers for all the platforms, or not include any headers. For example, if a distribution build for powerpc includes the clang-resource-headers target, it will include all the x86 specific headers, even though the x86 specific headers cannot be used.

This patch breaks up the clang-resource-headers list to a core list and platform specific lists. With the patch, a distribution build can now include the ppc-resource-headers to include the headers applicable to the powerpc platform.

Specifically, one can now have

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;ppc-resource-headers" ... ../llvm
ninja install-distribution then installs the powerpc headers.

Similarly, one can do

cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;x86-resource-headers" ... ../llvm
to include headers applicable to the x86 platform in a distribution installation.

To implement this behaviour, the patch does two things:
* It breaks up the long header file list into a core list and platform specific lists.
* It adds numerous platform specific installation targets.

Differential Revision: https://reviews.llvm.org/D123498

2 years ago[clang][AArch64] Remove BTI after setjmp from release notes
David Spickett [Tue, 19 Apr 2022 13:48:26 +0000 (13:48 +0000)]
[clang][AArch64] Remove BTI after setjmp from release notes

This is now going into 14.0.2 as
571c7d8f6dae1a8797ae3271c0c09fc648b1940b so will not be
new in clang-15.

2 years ago[AArch64] Add lane moves to PerfectShuffle tables
David Green [Tue, 19 Apr 2022 13:49:50 +0000 (14:49 +0100)]
[AArch64] Add lane moves to PerfectShuffle tables

This teaches the perfect shuffle tables about lane inserts, which can
help reduce the cost of many entries. Many of the shuffle masks are
one-away from being correct, and a simple lane move can be a lot simpler
than trying to use ext/zip/etc. Because they are not exactly like the
other masks handled in the perfect shuffle tables, they require special
casing to generate them, with a special InsOp Operator.

The lane to insert into is encoded as the RHSID, and the move from is
grabbed from the original mask. This helps reduce the maximum perfect
shuffle entry cost to 3, with many more shuffles being generatable in a
single instruction.

Differential Revision: https://reviews.llvm.org/D123386

2 years ago[SLP][NFC]Add a test for reducing same values, NFC.
Alexey Bataev [Tue, 19 Apr 2022 13:48:21 +0000 (06:48 -0700)]
[SLP][NFC]Add a test for reducing same values, NFC.

2 years agoRevert "[SLP]Improve reductions analysis and emission, part 1."
Alexey Bataev [Tue, 19 Apr 2022 12:36:23 +0000 (05:36 -0700)]
Revert "[SLP]Improve reductions analysis and emission, part 1."

This reverts commit 0e1f4d4d3cb08ff84df5adc4f5e41d0a2cebc53d to fix
a crash reported in PR54976

2 years ago[clangd] IncludeCleaner: Add filtering mechanism
Kirill Bobyrev [Tue, 19 Apr 2022 12:56:21 +0000 (14:56 +0200)]
[clangd] IncludeCleaner: Add filtering mechanism

This introduces filtering out inclusions based on the resolved path. This
mechanism will be important for disabling warnings for headers that we cannot
diagnose correctly yet.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D123488

2 years ago[OpenMP][Docs] Remove old 14.0 release information
Joseph Huber [Tue, 19 Apr 2022 12:45:51 +0000 (08:45 -0400)]
[OpenMP][Docs] Remove old 14.0 release information

Summary:
This patch removes the OpenMP sections in the release notes. These will
be filled once the release is close and implementations are finalized.

2 years ago[OpenMP] Make Xopenmp-target args compile-only to silence warnings
Joseph Huber [Tue, 19 Apr 2022 11:47:33 +0000 (07:47 -0400)]
[OpenMP] Make Xopenmp-target args compile-only to silence warnings

Summary:
Previously we needed the `Xopenmp-target=` option during the linking
phase so the old offloading driver knew which items to extract and link
for the device. Now that the new driver has become the default this is
no longer necessary and will cause a warning to be emitted for the
unused argument. This should be silenced to avoid noise.

2 years ago[MLIR][GPU] Add canonicalizer for gpu.memcpy
Arnab Dutta [Tue, 19 Apr 2022 11:08:06 +0000 (16:38 +0530)]
[MLIR][GPU] Add canonicalizer for gpu.memcpy

Fold away the gpu.memcpy op when the only uses of dest are
the memcpy op in question and its allocation and deallocation
ops.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D121279

2 years ago[AArch64] Only mark cost 1 perfect shuffles as legal
David Green [Tue, 19 Apr 2022 11:58:55 +0000 (12:58 +0100)]
[AArch64] Only mark cost 1 perfect shuffles as legal

The perfect shuffle tables encode a cost of either 0 (a nop-copy) or 1
(a single instruction) with a cost encoding of 0 in the upper 2 bits.
All perfect shuffles with any cost are then marked as legal shuffles
though (the maximum encoded cost is 3), which can confuse the DAG
combiner into thinking the shuffles are cheaper than they should be.

Limiting legal shuffles to single instructions seems to do better in
most cases, producing fewer instructions for complex shuffles. There are
some cases that now become tbl, which may be better or worse depending
on whether the instruction is in a loop and the tbl load can be hoisted
out.

Differential Revision: https://reviews.llvm.org/D123377

2 years agoRevert "[Concepts] Fix overload resolution bug with constrained candidates"
Roy Jacobson [Tue, 19 Apr 2022 11:51:21 +0000 (07:51 -0400)]
Revert "[Concepts] Fix overload resolution bug with constrained candidates"

This reverts commit 454d1df9423c95e54c3a2f5cb58d864096032d09.

2 years ago[VPlan] Expand induction step in VPlan pre-header.
Florian Hahn [Tue, 19 Apr 2022 11:06:39 +0000 (13:06 +0200)]
[VPlan] Expand induction step in VPlan pre-header.

This patch moves SCEV expansion of steps used by
VPWidenIntOrFpInductionRecipes to the pre-header using
VPExpandSCEVRecipe. This ensures that those steps are expanded while the
CFG is in a valid state. Previously, SCEV expansion may happen during
vector body code-generation, during which the CFG may be invalid,
causing issues with SCEV expansion.

Depends on D122095.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D122096

2 years ago[AArch64] Cost all perfect shuffles entries as cost 1
David Green [Tue, 19 Apr 2022 11:05:05 +0000 (12:05 +0100)]
[AArch64] Cost all perfect shuffles entries as cost 1

A brief introduction to perfect shuffles - AArch64 NEON has a number of
shuffle operations - dups, zips, exts, movs etc that can in some way
shuffle around the lanes of a vector. Given a shuffle of size 4 with 2
inputs, some shuffle masks can be easily codegen'd to a single
instruction. A <0,0,1,1> mask for example is a zip LHS, LHS. This is
great, but some masks are not so simple, like a <0,0,1,2>. It turns out
we can generate that from zip LHS, <0,2,0,2>, having generated
<0,2,0,2> from uzp LHS, LHS, producing the result in 2 instructions.
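
The two-instruction example above can be checked with a small model of the
lane semantics. This is only a sketch: zip1 and uzp1 below are hypothetical
Python helpers that mimic the low-half interleave and even-lane extraction
on 4-element lists of lane indices. They are not LLVM code.

```python
def zip1(a, b):
    """Interleave the low halves of a and b: [a0, b0, a1, b1] for 4 lanes."""
    out = []
    for i in range(len(a) // 2):
        out += [a[i], b[i]]
    return out


def uzp1(a, b):
    """Keep the even-indexed lanes of the concatenation a + b."""
    return (a + b)[0::2]


# Represent LHS by its own lane indices, so every result reads as a mask.
lhs = [0, 1, 2, 3]

single = zip1(lhs, lhs)    # one instruction
tmp = uzp1(lhs, lhs)       # first of two instructions
double = zip1(lhs, tmp)    # second instruction, reusing tmp

print(single, tmp, double)
```

Because LHS is modelled as [0, 1, 2, 3], each result can be compared directly
against the masks quoted in the text.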

It is not obvious from a given mask how to get there though. So we have
a simple program (PerfectShuffle.cpp in the util folder) that can scan
through all combinations of 4-element vectors and generate the perfect
combination of results needed for each shuffle mask (for some definition
of perfect). This is run offline to generate a table that is queried for
generating shuffle instructions. (Because the table could get quite big,
it is limited to 4 element vectors).

In the perfect shuffle tables zip, uzp and trn shuffles were being cost
as 2, which is higher than needed and skews the perfect shuffle tables
to create inefficient combinations. This sets them to 1 and regenerates
the tables. The codegen will usually be better and the costs should be
more precise (but there can be less second-order re-use of values from
multiple shuffles; these cases should be fixed up in subsequent patches).

Differential Revision: https://reviews.llvm.org/D123379

2 years agoFix SLP score for out of order contiguous loads
Alban Bridonneau [Tue, 19 Apr 2022 10:23:44 +0000 (11:23 +0100)]
Fix SLP score for out of order contiguous loads

SLP uses the distance between pointers to optimize
the getShallowScore. However the current code misses
the case where we are trying to vectorize for VF=4, and the distance
between pointers is 2. In that case the returned score
reflects the case of contiguous loads, when it's not actually
contiguous.

The attached unit tests have 5 loads, where the program order
is not the same as the offset order in the GEPs. So, the choice
of which 4 loads to bundle together matters. If we pick the
first 4, then we can vectorize with VF=4. If we pick the
last 4, then we can only vectorize with VF=2.

This patch makes a more conservative choice, to consider
all distances>1 to not be a case of contiguous load, and
give those cases a lower score.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D123516

2 years ago[AMDGPU][MC] Corrected error message "image data size does not match dmask and tfe"
Dmitry Preobrazhensky [Tue, 19 Apr 2022 10:52:58 +0000 (13:52 +0300)]
[AMDGPU][MC] Corrected error message "image data size does not match dmask and tfe"

Differential Revision: https://reviews.llvm.org/D123929

2 years ago[analyzer] Remove HasAlphaDocumentation tablegen enum value
Balazs Benics [Tue, 19 Apr 2022 10:14:27 +0000 (12:14 +0200)]
[analyzer] Remove HasAlphaDocumentation tablegen enum value

D121387 simplified the doc url generation process, so we no longer need
the HasAlphaDocumentation enum entry. This patch removes that.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D121459

2 years ago[analyzer] ClangSA should tablegen doc urls referring to the main doc page
Balazs Benics [Tue, 19 Apr 2022 10:14:27 +0000 (12:14 +0200)]
[analyzer] ClangSA should tablegen doc urls referring to the main doc page

AFAIK we should prefer
https://clang.llvm.org/docs/analyzer/checkers.html to
https://clang-analyzer.llvm.org/{available_checks,alpha_checks}.html

This patch will ensure that the doc urls produced by tablegen for the
ClangSA, will use the new url. Nothing else will be changed.

Reviewed By: martong, Szelethus, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D121387

2 years ago[analyzer] Turn missing tablegen doc entry of a checker into fatal error
Balazs Benics [Tue, 19 Apr 2022 10:14:27 +0000 (12:14 +0200)]
[analyzer] Turn missing tablegen doc entry of a checker into fatal error

It turns out that all checkers explicitly specify `Documentation<>`.
It makes sense to require this, so emit a fatal tablegen error if a
checker is missing the entry.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D122244

2 years ago[analyzer][NFC] Introduce the checker package separator character
Balazs Benics [Tue, 19 Apr 2022 10:14:27 +0000 (12:14 +0200)]
[analyzer][NFC] Introduce the checker package separator character

Reviewed By: martong, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D122243

2 years ago[lldb] Handle empty search string in "memory find"
David Spickett [Thu, 14 Apr 2022 14:06:27 +0000 (14:06 +0000)]
[lldb] Handle empty search string in "memory find"

Given that you'd never find an empty string, just error.

Also add a test that an invalid expr generates an error.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D123793

2 years ago[OpenCL] opencl-c.h: Add const to get_image_num_samples
Sven van Haastregt [Tue, 19 Apr 2022 09:16:44 +0000 (10:16 +0100)]
[OpenCL] opencl-c.h: Add const to get_image_num_samples

Align with the `-fdeclare-opencl-builtins` option and other
get_image_* builtins which have the const attribute.

Differential Revision: https://reviews.llvm.org/D122728

2 years ago[mlir][emitc] Add test for invalid type
Marius Brehler [Mon, 11 Apr 2022 13:09:21 +0000 (13:09 +0000)]
[mlir][emitc] Add test for invalid type

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D123503

2 years ago[Concepts] Fix overload resolution bug with constrained candidates
Roy Jacobson [Fri, 15 Apr 2022 15:58:11 +0000 (11:58 -0400)]
[Concepts] Fix overload resolution bug with constrained candidates

When doing overload resolution, we have to check that candidates' parameter types are equal before trying to find a better candidate through checking which candidate is more constrained.
This revision adds this missing check and makes us diagnose those cases as ambiguous calls when the types are not equal.

Fixes GitHub issue https://github.com/llvm/llvm-project/issues/53640

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D123182

2 years ago[AMDGPU] Select d16 stores even when sramecc is enabled
Jay Foad [Tue, 22 Jun 2021 12:06:02 +0000 (13:06 +0100)]
[AMDGPU] Select d16 stores even when sramecc is enabled

The sramecc feature changes the behaviour of d16 loads so they do not
preserve the unused 16 bits of the result register, but it has no impact
on d16 stores, so we should make use of them even when the feature is
enabled.

Differential Revision: https://reviews.llvm.org/D104912

2 years ago[clang][lexer] Allow u8 character literal prefixes in C2x
Timm Bäder [Tue, 8 Feb 2022 09:13:11 +0000 (10:13 +0100)]
[clang][lexer] Allow u8 character literal prefixes in C2x

Implement N2418 for C2x.

Differential Revision: https://reviews.llvm.org/D119221

2 years ago[Support] Optimize (.*) regex matches
Nikita Popov [Thu, 14 Apr 2022 09:49:35 +0000 (11:49 +0200)]
[Support] Optimize (.*) regex matches

If capturing groups are used, the regex matcher handles something
like `(.*)suffix` by first doing a maximal match of `.*`, trying to
match `suffix` afterward, and then reducing the maximal stop
position one by one until this finally succeeds. This makes the
match quadratic in the length of the line (with large constant factors).

This is particularly problematic because regexes of this form are
ubiquitous in FileCheck (something like `[[VAR:%.*]] = ...` falls
in this category), making FileCheck executions much slower than
they have any right to be.

This implements a very crude optimization that checks if suffix
starts with a fixed character, and steps back to the last occurrence
of that character, instead of stepping back by one character at a
time. This drops FileCheck time on
clang/test/CodeGen/RISCV/rvv-intrinsics/vloxseg_mask.c from
7.3 seconds to 2.7 seconds.
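
The difference between the two backtracking strategies can be sketched with a
toy model of matching `(.*)suffix` against a single line. This is not the
regex engine's code: match_naive and match_skipping are hypothetical helpers,
the suffix is assumed to be a non-empty literal string, and the attempt
counter exists only to make the saving visible.

```python
def match_naive(line, suffix):
    """Greedy `.*`, then step the stop position back one character at a time."""
    attempts = 0
    stop = len(line)
    while stop >= 0:
        attempts += 1
        if line.startswith(suffix, stop):
            return stop, attempts
        stop -= 1
    return None, attempts


def match_skipping(line, suffix):
    """Same search, but only try positions holding the suffix's first character."""
    attempts = 0
    first = suffix[0]  # assumes a non-empty literal suffix
    stop = line.rfind(first)
    while stop >= 0:
        attempts += 1
        if line.startswith(suffix, stop):
            return stop, attempts
        # Jump back to the previous occurrence of the first character.
        stop = line.rfind(first, 0, stop)
    return None, attempts


line = "%result = add i32 %a, %b"
suffix = " = add"
naive_stop, naive_attempts = match_naive(line, suffix)
skip_stop, skip_attempts = match_skipping(line, suffix)
print(line[:naive_stop], naive_attempts, skip_attempts)
```

Both functions find the same stop position; the skipping variant only tests
positions where a match could possibly start. When the first character is
very common, as a space is in this line, many candidate positions remain.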

An obvious further improvement would be to check more than one
character (once again, this is particularly relevant for FileCheck,
because the next character is usually a space, which happens to
have many occurrences).

This should help with https://github.com/llvm/llvm-project/issues/54821.

2 years ago[mlir][interfaces] Fix infinite loop in insideMutuallyExclusiveRegions
Matthias Springer [Tue, 19 Apr 2022 07:21:08 +0000 (16:21 +0900)]
[mlir][interfaces] Fix infinite loop in insideMutuallyExclusiveRegions

This function was missing a termination condition.

2 years agoApply clang-tidy fixes for performance-unnecessary-value-param in JitRunner.cpp ...
Mehdi Amini [Sat, 16 Apr 2022 08:04:56 +0000 (08:04 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in JitRunner.cpp (NFC)

2 years agoApply clang-tidy fixes for performance-for-range-copy in MemRefOps.cpp (NFC)
Mehdi Amini [Sat, 16 Apr 2022 07:43:24 +0000 (07:43 +0000)]
Apply clang-tidy fixes for performance-for-range-copy in MemRefOps.cpp (NFC)

2 years ago[NFC] Remove unused variable
Chuanqi Xu [Tue, 19 Apr 2022 07:12:44 +0000 (15:12 +0800)]
[NFC] Remove unused variable

2 years ago[mlir][interfaces] Add helpers for detecting recursive regions
Matthias Springer [Tue, 19 Apr 2022 07:12:40 +0000 (16:12 +0900)]
[mlir][interfaces] Add helpers for detecting recursive regions

Add helper functions to check if an op may be executed multiple times based on RegionBranchOpInterface.

Differential Revision: https://reviews.llvm.org/D123789

2 years ago[RISCV] Fix lowering of BUILD_VECTORs as VID sequences
Fraser Cormack [Thu, 14 Apr 2022 12:09:09 +0000 (13:09 +0100)]
[RISCV] Fix lowering of BUILD_VECTORs as VID sequences

This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.

The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.
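
The derive-then-validate structure described above can be sketched as
follows. This is a simplified illustration and not the RISC-V lowering code:
match_step_sequence is a hypothetical helper, the candidate step is derived
naively from the first change in value, and the model
elems[i] == addend + floor(i * step) is an assumption made for the example.

```python
import math
from fractions import Fraction


def match_step_sequence(elems):
    """Return (step, addend) if every element fits addend + floor(i * step)."""
    if not elems:
        return None
    addend = elems[0]
    step = Fraction(0)
    # First pass: derive a candidate step from the first change in value.
    # Elements before that change are skipped without being checked.
    for idx in range(1, len(elems)):
        diff = elems[idx] - addend
        if diff != 0:
            step = Fraction(diff, idx)
            break
    # Second pass: validate the whole sequence against the candidate step,
    # including the elements that the first pass skipped or never reached.
    for idx, val in enumerate(elems):
        if val != addend + math.floor(idx * step):
            return None
    return step, addend


print(match_step_sequence([0, 0, 1, 1, 2, 2]))
print(match_step_sequence([0, 0, 1, 1, 1, 2]))
```

Keeping the validation in a separate loop means the derivation loop does not
need to track which earlier elements still have to be re-checked.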

Fixes the tests introduced in D123785.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123786

2 years ago[RISCV] Add tests showing incorrect BUILD_VECTOR lowering
Fraser Cormack [Thu, 14 Apr 2022 12:03:56 +0000 (13:03 +0100)]
[RISCV] Add tests showing incorrect BUILD_VECTOR lowering

These tests both use vector constants misidentified as VID sequences.
Because the initial run of elements has a zero step, the elements are
skipped until such a step can be identified. The bug is that the skipped
elements are never validated, even though the computed step is
incompatible across the entire sequence.

A fix will follow in a subsequent patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123785