platform/upstream/llvm.git
3 years ago[mlir][linalg] fixing hard-coded variable names in a test (NFC)
Tobias Gysi [Mon, 12 Apr 2021 09:35:26 +0000 (09:35 +0000)]
[mlir][linalg] fixing hard-coded variable names in a test (NFC)

The patch fixes hard-coded variable names in the vector-to-loops test.

3 years ago[AMDGPU][MC][NFC] Removed extra spaces
Dmitry Preobrazhensky [Mon, 12 Apr 2021 10:30:29 +0000 (13:30 +0300)]
[AMDGPU][MC][NFC] Removed extra spaces

Fixed bugs 49646, 49647.

Differential Revision: https://reviews.llvm.org/D100173

3 years ago[IR] Fix Wdocumentation warning. NFCI.
Simon Pilgrim [Mon, 12 Apr 2021 10:20:32 +0000 (11:20 +0100)]
[IR] Fix Wdocumentation warning. NFCI.

3 years ago[AArch64] ACLE: Fix issue for mismatching enum types with builtins.
Sander de Smalen [Mon, 12 Apr 2021 08:49:00 +0000 (09:49 +0100)]
[AArch64] ACLE: Fix issue for mismatching enum types with builtins.

This patch fixes an issue with the SVE prefetch and qinc/qdec intrinsics
that take an `enum` argument, but where the builtin prototype encodes
these as `int`. Some code in SemaDecl found the mismatch and chose
to forget about the builtin altogether, which meant that any future
code using that builtin would fail. The code that forgets about the
builtin was actually obsolete after D77491 and should have been removed.
This patch now removes that code.

This patch also fixes another issue with the SVE prefetch intrinsic
when built with C++, where the builtin didn't accept the correct
pointer type, which should be `const void *`.

Reviewed By: tambre

Differential Revision: https://reviews.llvm.org/D100046

3 years ago[AMDGPU] Fix ubsan error
Sebastian Neubauer [Mon, 12 Apr 2021 10:10:32 +0000 (12:10 +0200)]
[AMDGPU] Fix ubsan error

The RegScavenger can be null sometimes, so a pointer is needed.

Fixes UBSan error introduced in f9a8c6a0e505.

3 years ago[LLDB] Fix buildbots breakage due to TestGuessLanguage.py
Muhammad Omair Javaid [Mon, 12 Apr 2021 10:10:41 +0000 (15:10 +0500)]
[LLDB] Fix buildbots breakage due to TestGuessLanguage.py

Fix LLDB buildbot breakage due to D99250.

Differential Revision: https://reviews.llvm.org/D99250

3 years ago[AMDGPU] Fix saving fp and bp
Sebastian Neubauer [Mon, 12 Apr 2021 09:47:16 +0000 (11:47 +0200)]
[AMDGPU] Fix saving fp and bp

Spilling the fp or bp to scratch could overwrite VGPRs of inactive
lanes. Fix that by using only the active lanes of the scavenged VGPR.

This builds on the assumptions that
1. a function is never called with exec=0
2. lanes do not die in a function, i.e. exec!=0 in the function epilog
3. no new lanes are active when exiting the function, i.e. exec in the
   epilog is a subset of exec in the prolog.
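
The three assumptions above can be written as a predicate over exec masks. This is only an illustrative sketch: `execAssumptionsHold` is a hypothetical helper, not part of the patch, and it assumes a 64-lane wave modelled as a `uint64_t`.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): checks the three assumptions
// on the exec mask at function entry (prolog) and exit (epilog).
bool execAssumptionsHold(uint64_t execProlog, uint64_t execEpilog) {
  bool calledWithLanes = execProlog != 0;             // 1. never called with exec=0
  bool lanesSurvive = execEpilog != 0;                // 2. exec!=0 in the epilog
  bool noNewLanes = (execEpilog & ~execProlog) == 0;  // 3. epilog exec is a subset
  return calledWithLanes && lanesSurvive && noNewLanes;
}
```

For example, a prolog mask of 0xFF with an epilog mask of 0x0F satisfies all three, while an epilog mask of 0x1F after a prolog mask of 0x0F violates the third.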

Differential Revision: https://reviews.llvm.org/D96869

3 years ago[AMDGPU] Autogenerate test. NFC
Sebastian Neubauer [Mon, 12 Apr 2021 09:51:28 +0000 (11:51 +0200)]
[AMDGPU] Autogenerate test. NFC

3 years ago[AMDGPU] Unify spill code
Sebastian Neubauer [Mon, 12 Apr 2021 09:19:04 +0000 (11:19 +0200)]
[AMDGPU] Unify spill code

Instead of reimplementing spilling in prolog and epilog, reuse
buildSpillLoadStore.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D99269

3 years ago[AMDGPU] Save VGPR of whole wave when spilling
Sebastian Neubauer [Mon, 12 Apr 2021 08:25:54 +0000 (10:25 +0200)]
[AMDGPU] Save VGPR of whole wave when spilling

Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot
determine if a VGPR is used in other lanes or not, so we need to save
all lanes of the VGPR. We even need to save the VGPR if it is marked as
dead.

The generated code depends on two things:
- Can we scavenge an SGPR to save EXEC?
- And can we scavenge a VGPR?

If we can scavenge an SGPR, we
- save EXEC into the SGPR
- set the needed lane mask
- save the temporary VGPR
- write the spilled SGPR into VGPR lanes
- save the VGPR again to the target stack slot
- restore the VGPR
- restore EXEC

If we were not able to scavenge an SGPR, we do the same operations, but
every time the temporary VGPR is written to memory, we
- write VGPR to memory
- flip exec (s_not exec, exec)
- write VGPR again (previously inactive lanes)

Surprisingly often, we are able to scavenge an SGPR, even though we are
at the brink of running out of SGPRs.
Scavenging a VGPR does not have a great effect (saves three instructions
if no SGPR was scavenged), but we need to know if the VGPR we use is
live before or not, otherwise the machine verifier complains.
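
To see why writing the VGPR twice with a flipped exec mask saves every lane, here is a small simulation. It is a sketch under assumptions: the 64-lane wave, the `Vgpr` type and both function names are invented for illustration and do not correspond to code in the patch.

```cpp
#include <array>
#include <cstdint>

constexpr int kLanes = 64;                    // assumed wave size
using Vgpr = std::array<uint32_t, kLanes>;    // one 32-bit value per lane

// Models a store that only writes the lanes enabled in exec.
void storeActiveLanes(const Vgpr &reg, uint64_t exec, Vgpr &memory) {
  for (int lane = 0; lane < kLanes; ++lane)
    if ((exec >> lane) & 1)
      memory[lane] = reg[lane];
}

// Models the no-SGPR sequence: write, flip exec, write again.
// After both stores every lane of reg has reached memory.
void saveWholeWave(const Vgpr &reg, uint64_t exec, Vgpr &memory) {
  storeActiveLanes(reg, exec, memory);  // currently active lanes
  exec = ~exec;                         // corresponds to: s_not exec, exec
  storeActiveLanes(reg, exec, memory);  // previously inactive lanes
}
```

Because exec and its complement together cover all lanes, the result does not depend on which lanes were active at the start.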

Differential Revision: https://reviews.llvm.org/D96336

3 years ago[OpenCL] Accept .rgba in OpenCL 3.0
Sven van Haastregt [Mon, 12 Apr 2021 08:30:06 +0000 (09:30 +0100)]
[OpenCL] Accept .rgba in OpenCL 3.0

The .rgba vector component accessors are supported in OpenCL C 3.0.

Previously, the diagnostic would check `OpenCLVersion` for version 2.2
(value 220) and report those accessors are an OpenCL 2.2 feature.
However, there is no "OpenCL C version 2.2", so change the check and
diagnostic text to 3.0 only.

A spurious `OpenCLVersion` argument was passed into the diagnostic;
remove that.

Differential Revision: https://reviews.llvm.org/D99969

3 years ago[AArch64] Adds memory operands for indexed loads.
Stelios Ioannou [Fri, 9 Apr 2021 16:36:20 +0000 (17:36 +0100)]
[AArch64] Adds memory operands for indexed loads.

This patch adds the memory operands for indexed loads so
that certain optimizations can take place.

Differential Revision: https://reviews.llvm.org/D100215/

Change-Id: I539fcf046ca4ad1e7df1d893f57d751419d8364d

3 years ago[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.
Esme-Yi [Mon, 12 Apr 2021 07:42:54 +0000 (07:42 +0000)]
[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.

Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in DWARF 5, are unexpected in earlier versions. Fixing the mismatch has no drawbacks for other debuggers and helps dbx.

Reviewed By: aprantl, shchenz

Differential Revision: https://reviews.llvm.org/D99250

3 years ago[clang][AST] Handle overload callee type in CallExpr::getCallReturnType.
Balázs Kéri [Mon, 12 Apr 2021 06:52:40 +0000 (08:52 +0200)]
[clang][AST] Handle overload callee type in CallExpr::getCallReturnType.

The function did not handle every case, which in some cases
caused an assertion failure.
After the fix, the function returns DependentTy if the exact
return type cannot be determined.

It seems that clang itself does not call the function in the
affected cases but some checker or other code may call it.

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D95244

3 years ago[NFC][Debug] Fix unnecessary deep-copy for vector to save compiling time
Zhang Qing Shan [Mon, 12 Apr 2021 06:55:03 +0000 (14:55 +0800)]
[NFC][Debug] Fix unnecessary deep-copy for vector to save compiling time

We saw a large compile-time impact after enabling the debug entry value
feature for the X86 platform (D73534). Compile time went from 900s to 1600s
with our testcase, caused by repeated memory allocation and freeing.

'using FwdRegWorklist = MapVector<unsigned, SmallVector<FwdRegParamInfo, 2>>;'
The value type of this map is a vector, and we were missing a reference when
accessing the element, so every access deep-copied the vector. The same happens
for `auto CalleesMap = MF->getCallSitesInfo();`, which is a DenseMap.
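
The cost of a missing reference can be illustrated with standard containers. This sketch uses `std::map` and `std::vector` as stand-ins for MapVector and SmallVector; `ParamList`, `copyCount` and both functions are hypothetical names used only to count copies.

```cpp
#include <cstddef>
#include <map>
#include <vector>

int copyCount = 0;  // incremented on every deep copy of a ParamList

// Stand-in for the vector stored as the map value.
struct ParamList {
  std::vector<int> items;
  ParamList() = default;
  ParamList(const ParamList &other) : items(other.items) { ++copyCount; }
  ParamList &operator=(const ParamList &) = default;
};

using Worklist = std::map<unsigned, ParamList>;

// Missing reference: `auto` deduces a value type, so each element is copied.
std::size_t totalByValue(const Worklist &worklist) {
  std::size_t total = 0;
  for (const auto &entry : worklist) {
    auto list = entry.second;  // deep copy
    total += list.items.size();
  }
  return total;
}

// With the reference, the element is accessed in place.
std::size_t totalByReference(const Worklist &worklist) {
  std::size_t total = 0;
  for (const auto &entry : worklist) {
    const auto &list = entry.second;  // no copy
    total += list.items.size();
  }
  return total;
}
```

Both functions return the same total; only the by-value version increments `copyCount`, once per map entry.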

Reviewed by: djtodoro, flychen50

Differential Revision: https://reviews.llvm.org/D100162

3 years ago[libtooling][clang-tidy] Fix compiler warnings in testcase [NFC]
Mikael Holmen [Mon, 12 Apr 2021 06:23:47 +0000 (08:23 +0200)]
[libtooling][clang-tidy] Fix compiler warnings in testcase [NFC]

Without the fix we get:

06:31:09 In file included from ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:3:
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1392:11: error: comparison of integers of different signs: 'const int' and 'const unsigned int' [-Werror,-Wsign-compare]
06:31:09   if (lhs == rhs) {
06:31:09       ~~~ ^  ~~~
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1421:12: note: in instantiation of function template specialization 'testing::internal::CmpHelperEQ<int, unsigned int>' requested here
06:31:09     return CmpHelperEQ(lhs_expression, rhs_expression, lhs, rhs);
06:31:09            ^
06:31:09 ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:60:3: note: in instantiation of function template specialization 'testing::internal::EqHelper<false>::Compare<int, unsigned int>' requested here
06:31:09   EXPECT_EQ(4, Errors[0].Message.FileOffset);
06:31:09   ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1924:63: note: expanded from macro 'EXPECT_EQ'
06:31:09                       EqHelper<GTEST_IS_NULL_LITERAL_(val1)>::Compare, \
06:31:09                                                               ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1392:11: error: comparison of integers of different signs: 'const int' and 'const unsigned long' [-Werror,-Wsign-compare]
06:31:09   if (lhs == rhs) {
06:31:09       ~~~ ^  ~~~
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1421:12: note: in instantiation of function template specialization 'testing::internal::CmpHelperEQ<int, unsigned long>' requested here
06:31:09     return CmpHelperEQ(lhs_expression, rhs_expression, lhs, rhs);
06:31:09            ^
06:31:09 ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:64:3: note: in instantiation of function template specialization 'testing::internal::EqHelper<false>::Compare<int, unsigned long>' requested here
06:31:09   EXPECT_EQ(1, Errors[0].Message.Ranges.size());
06:31:09   ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1924:63: note: expanded from macro 'EXPECT_EQ'
06:31:09                       EqHelper<GTEST_IS_NULL_LITERAL_(val1)>::Compare, \
06:31:09                                                               ^
06:31:09 2 errors generated.
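
One common way to avoid this class of warning is to give the expected literal the same signedness as the value it is compared with. The sketch below mimics the comparison template with a hypothetical `cmpEq`. The unsigned literals are an assumption about how such a fix could look, not a quote of the actual change.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the comparison helper in the log above.
template <typename L, typename R>
bool cmpEq(const L &lhs, const R &rhs) {
  return lhs == rhs;
}

// `4u` is unsigned, so both operands have the same signedness.
bool offsetIsFour(unsigned fileOffset) { return cmpEq(4u, fileOffset); }

// size() returns an unsigned type, so compare against an unsigned 1.
bool hasOneRange(const std::vector<int> &ranges) {
  return cmpEq(std::size_t{1}, ranges.size());
}
```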

3 years ago[NFC] [Clang]: fix spelling mistake in assert message
Jim Lin [Mon, 12 Apr 2021 06:10:52 +0000 (14:10 +0800)]
[NFC] [Clang]: fix spelling mistake in assert message

Reviewed By: Jim

Differential Revision: https://reviews.llvm.org/D71541

3 years agofix typo in a CMake SANITIZER_CAN_USE_CXXABI variable initial definition
Jim Lin [Mon, 12 Apr 2021 06:04:38 +0000 (14:04 +0800)]
fix typo in a CMake SANITIZER_CAN_USE_CXXABI variable initial definition

The current variable name isn't used anywhere else, which indicates it's
a typo.  Let's fix it before someone copy+pastes it somewhere else.

Reviewed By: Jim

Differential Revision: https://reviews.llvm.org/D39157

3 years ago[X86] Pass to transform tdpbsud&tdpbusd&tdpbuud intrinsics to scalar operation
Bing1 Yu [Wed, 24 Mar 2021 08:53:12 +0000 (16:53 +0800)]
[X86] Pass to transform tdpbsud&tdpbusd&tdpbuud intrinsics to scalar operation

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D99244

3 years ago[NARY] Don't optimize min/max if there are side uses
Evgeniy Brevnov [Fri, 9 Apr 2021 08:41:27 +0000 (15:41 +0700)]
[NARY] Don't optimize min/max if there are side uses

Say we have
%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%2,%a)

The optimization tries to reassociate the last one so that we can rewrite it to %3=min(%1, %c) and remove %2.
But if %2 has other uses outside of %3, then we can't remove %2 and end up with:

%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1, %c)

This is not harmful by itself, but it is not profitable and changes the IR for no good reason.
Worse, it triggers the next iteration, which finds that the optimization is applicable to %2 and %3 and generates:

%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1,%c)
%4=min(%2,%a)

and so on...

The solution is to prevent the optimization in the first place if the intermediate result (%2) has side uses and
is known not to be removed.
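
The side-use check can be sketched on a toy representation of the min operations. `MinOp`, `countUses` and `shouldReassociate` are hypothetical names for illustration; the actual pass works on LLVM IR values, not strings.

```cpp
#include <string>
#include <vector>

// Toy model of `result = min(lhs, rhs)`.
struct MinOp {
  std::string result;
  std::string lhs;
  std::string rhs;
};

// Counts how often `value` appears as an operand of the given operations.
int countUses(const std::vector<MinOp> &ops, const std::string &value) {
  int uses = 0;
  for (const MinOp &op : ops) {
    if (op.lhs == value) ++uses;
    if (op.rhs == value) ++uses;
  }
  return uses;
}

// Reassociation only pays off if the intermediate result has exactly one
// use overall, because only then can it be removed afterwards.
bool shouldReassociate(const std::vector<MinOp> &ops, const std::string &inner,
                       int usesElsewhere) {
  return countUses(ops, inner) + usesElsewhere == 1;
}
```

With the three operations from the example, `%2` qualifies when it has no uses elsewhere and is rejected as soon as one extra use exists.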

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D100170

3 years ago[X86] Remove FeatureCLWB from FeaturesICLClient
Freddy Ye [Mon, 12 Apr 2021 02:36:08 +0000 (10:36 +0800)]
[X86] Remove FeatureCLWB from FeaturesICLClient

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100279

3 years ago[lld-macho][nfc] Convert tabs to spaces
Jez Ng [Mon, 12 Apr 2021 03:23:37 +0000 (23:23 -0400)]
[lld-macho][nfc] Convert tabs to spaces

3 years ago[Debug-Info] make fortran CHARACTER(1) type as valid unsigned type
Chen Zheng [Mon, 12 Apr 2021 03:13:17 +0000 (23:13 -0400)]
[Debug-Info] make fortran CHARACTER(1) type as valid unsigned type

This resolves https://bugs.llvm.org/show_bug.cgi?id=49872

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D100015

3 years ago[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info...
yifeng.dongyifeng [Mon, 12 Apr 2021 02:59:22 +0000 (10:59 +0800)]
[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters.

The first one is the real parameters of the coroutine function, the
other one just for copying parameters to the coroutine frame.

Consider the following C++ code:
```
struct coro {
  ...
};

coro foo(struct test & t) {
  ...
  co_await suspend_always();
  ...
  co_await suspend_always();
  ...
  co_await suspend_always();
}

int main(int argc, char *argv[]) {
  auto c = foo(...);
  c.handle.resume();
  ...
}
```

Function foo is a standard coroutine function with only one
parameter, named t (ignoring `this` for now).
When we compile this function with LLVM, we get the
following IR:

```
!2921 = distinct !DISubprogram(name: "foo", linkageName: "_ZN6Object3fooE4test", scope: !2211, file: !45, line: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !44, declaration: !2328, retainedNodes: !2922)
!2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45, line: 48, type: !838)
...
!2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags: DIFlagArtificial)
```
We can see that there are two identical DIVariables named t in the same
DWARF scope for foo.resume.
When we use llvm-dwarfdump to dump the DWARF info of this
ELF file, we get the following output:

```
0x00006684:   DW_TAG_subprogram
                DW_AT_low_pc    (0x00000000004013a0)
                DW_AT_high_pc   (0x00000000004013a8)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_object_pointer    (0x0000669c)
                DW_AT_GNU_all_call_sites        (true)
                DW_AT_specification     (0x00005b5c "_ZN6Object3fooE4test")

0x000066a5:     DW_TAG_formal_parameter
                DW_AT_name    ("t")
                DW_AT_decl_file       ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp")
                DW_AT_decl_line       (48)
                DW_AT_type    (0x00004146 "test")

0x000066ba:     DW_TAG_variable
                  DW_AT_name    ("t")
                  DW_AT_type    (0x00004146 "test")
                  DW_AT_artificial      (true)
```
The ELF file also has two 't' entries in the same scope.
Unfortunately, this can confuse the debugger, which then
fails to print parameters at O0 or above.
This patch makes coroutine parameters and move
parameters use the same DIVar, which should fix the
problems described above.

Test Plan: check-clang

Reviewed By: aprantl, jmorse

Differential Revision: https://reviews.llvm.org/D97533

3 years ago[PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled
Qiu Chaofan [Mon, 12 Apr 2021 02:31:07 +0000 (10:31 +0800)]
[PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled

XSCMPUQP is not available for pre-P9 subtargets. This patch will lower
them into libcall for correct behavior on power7/power8.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92083

3 years ago[RISCV][Clang] Add some RVV Permutation intrinsic functions.
Zakk Chen [Thu, 8 Apr 2021 17:15:09 +0000 (10:15 -0700)]
[RISCV][Clang] Add some RVV Permutation intrinsic functions.

Support the following instructions.
1. Vector Slide Instructions
2. Vector Register Gather Instructions
3. Vector Compress Instruction

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100127

3 years ago[RISCV][Clang] Add all RVV Mask intrinsic functions.
Zakk Chen [Thu, 8 Apr 2021 15:28:15 +0000 (08:28 -0700)]
[RISCV][Clang] Add all RVV Mask intrinsic functions.

1. Redefine the vpopc and vfirst IR intrinsics so that they work with the
clang tablegen generator, which always appends a type for vl
in IntrinsicType of clang codegen.
2. Remove the `c` type transformer and add `u` and `l` for the unsigned long
and long types.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100120

3 years ago[RISCV][Clang] Add more RVV load/store intrinsic functions.
Zakk Chen [Sun, 11 Apr 2021 14:25:06 +0000 (07:25 -0700)]
[RISCV][Clang] Add more RVV load/store intrinsic functions.

Support the following instructions.
1. Mask load and store
2. Vector Strided Instructions
3. Vector Indexed Store Instructions

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99965

3 years ago[RISCV][Clang] Add all RVV Reduction intrinsic functions.
Zakk Chen [Tue, 6 Apr 2021 14:57:41 +0000 (07:57 -0700)]
[RISCV][Clang] Add all RVV Reduction intrinsic functions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99964

3 years ago[RISCV][Clang] Add RVV merge intrinsic functions.
Zakk Chen [Tue, 6 Apr 2021 14:23:30 +0000 (07:23 -0700)]
[RISCV][Clang] Add RVV merge intrinsic functions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99963

3 years ago[RISCV][Clang] Add RVV Type-Convert intrinsic functions.
Zakk Chen [Thu, 1 Apr 2021 16:21:11 +0000 (09:21 -0700)]
[RISCV][Clang] Add RVV Type-Convert intrinsic functions.

Fix extension macro condition.

Support below instructions:
1. Single-Width Floating-Point/Integer Type-Convert Instructions
2. Widening Floating-Point/Integer Type-Convert Instructions
3. Narrowing Floating-Point/Integer Type-Convert Instructions

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99742

3 years ago[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.
Zakk Chen [Thu, 8 Apr 2021 15:21:06 +0000 (08:21 -0700)]
[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.

Support vfclass, vfmerge, vfrec7, vfrsqrt7, vfsqrt instructions.

Reviewed By: craig.topper

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99741

3 years ago[RISCV][Clang] Add more RVV Floating-Point intrinsic functions.
Zakk Chen [Thu, 8 Apr 2021 15:09:42 +0000 (08:09 -0700)]
[RISCV][Clang] Add more RVV Floating-Point intrinsic functions.

Support below instructions.
1. Vector Widening Floating-Point Add/Subtract Instructions
2. Vector Widening Floating-Point Multiply
3. Vector Single-Width Floating-Point Fused Multiply-Add Instructions
4. Vector Widening Floating-Point Fused Multiply-Add Instructions
5. Vector Floating-Point Compare Instructions

Reviewed By: craig.topper, HsiangKai

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99669

3 years ago[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.
Zakk Chen [Thu, 8 Apr 2021 14:29:59 +0000 (07:29 -0700)]
[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.

Support the following instructions which have the same class.
1. Vector Single-Width Floating-Point Subtract Instructions
2. Vector Single-Width Floating-Point Multiply/Divide Instructions
3. Vector Floating-Point MIN/MAX Instructions
4. Vector Floating-Point Sign-Injection Instructions

Reviewed By: craig.topper

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99668

3 years ago[RISCV][Clang] Add RVV Widening Integer Add/Subtract intrinsic functions.
Zakk Chen [Tue, 6 Apr 2021 10:26:44 +0000 (03:26 -0700)]
[RISCV][Clang] Add RVV Widening Integer Add/Subtract intrinsic functions.

Reviewed By: craig.topper

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99526

3 years ago[RISCV][NFC] Remove unneeded explict XLenVT type on codegen patterns
Jim Lin [Mon, 12 Apr 2021 02:15:35 +0000 (10:15 +0800)]
[RISCV][NFC] Remove unneeded explict XLenVT type on codegen patterns

The customized SDNode already specifies the explicit XLenVT type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100190

3 years ago[RISCV] Update computeKnownBitsForTargetNode to treat READ_VLENB as being 16 byte...
Craig Topper [Mon, 12 Apr 2021 00:54:20 +0000 (17:54 -0700)]
[RISCV] Update computeKnownBitsForTargetNode to treat READ_VLENB as being 16 byte aligned.

According to the 0.10 spec, VLEN is at least 128 bits and is a
power of 2.
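
The alignment follows from simple arithmetic: VLENB is VLEN in bytes, so a power-of-two VLEN of at least 128 bits gives a VLENB that is a multiple of 16, and its low four bits are known to be zero. A small sketch of that reasoning, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// VLENB is VLEN expressed in bytes. The precondition mirrors the spec
// statement quoted above: VLEN >= 128 and a power of two.
uint64_t vlenbFor(uint64_t vlenBits) {
  assert(vlenBits >= 128 && (vlenBits & (vlenBits - 1)) == 0);
  return vlenBits / 8;
}

// True when the low four bits of VLENB are zero, i.e. 16-byte aligned.
bool lowFourBitsKnownZero(uint64_t vlenBits) {
  return (vlenbFor(vlenBits) & 0xF) == 0;
}
```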

3 years ago[RISCV] Use SLLI/SRLI instead of SLLIW/SRLIW for (srl (and X, 0xffff), C) custom...
Craig Topper [Sun, 11 Apr 2021 18:57:52 +0000 (11:57 -0700)]
[RISCV] Use SLLI/SRLI instead of SLLIW/SRLIW for (srl (and X, 0xffff), C) custom isel on RV64.

We don't need the sign extending behavior here and SLLI/SRLI
are able to compress to C.SLLI/C.SRLI.
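
On a 64-bit register, masking with 0xffff and then shifting right by C gives the same result as shifting left by 48 and then logically right by 48 + C. The sketch below checks that identity. The shift amounts are an assumption about how the pattern maps to SLLI/SRLI and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: (srl (and X, 0xffff), C)
uint64_t maskThenShift(uint64_t x, unsigned c) {
  assert(c < 16);
  return (x & 0xffff) >> c;
}

// Shift-pair form: the left shift discards the upper 48 bits, the logical
// right shift brings the 16-bit field back down and applies C.
uint64_t shiftPair(uint64_t x, unsigned c) {
  assert(c < 16);
  return (x << 48) >> (48 + c);
}
```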

3 years ago[NFCI][SimplifyCFG] PerformValueComparisonIntoPredecessorFolding(): improve Dominator...
Roman Lebedev [Sun, 11 Apr 2021 20:44:24 +0000 (23:44 +0300)]
[NFCI][SimplifyCFG] PerformValueComparisonIntoPredecessorFolding(): improve Dominator Tree updating

Same as with previous patches.

3 years ago[NFCI][SimplifyCFG] mergeEmptyReturnBlocks(): improve Dominator Tree updating
Roman Lebedev [Sun, 11 Apr 2021 20:25:40 +0000 (23:25 +0300)]
[NFCI][SimplifyCFG] mergeEmptyReturnBlocks(): improve Dominator Tree updating

Same as with previous patches.

3 years ago[NFCI][Local] MergeBasicBlockIntoOnlyPred(): improve Dominator Tree updating
Roman Lebedev [Sun, 11 Apr 2021 20:17:48 +0000 (23:17 +0300)]
[NFCI][Local] MergeBasicBlockIntoOnlyPred(): improve Dominator Tree updating

Same as with TryToSimplifyUncondBranchFromEmptyBlock()/MergeBlockIntoPredecessor() patch.

3 years ago[NFCI][BasicBlockUtils] MergeBlockIntoPredecessor(): improve Dominator Tree updating
Roman Lebedev [Sun, 11 Apr 2021 20:08:19 +0000 (23:08 +0300)]
[NFCI][BasicBlockUtils] MergeBlockIntoPredecessor(): improve Dominator Tree updating

Same as with TryToSimplifyUncondBranchFromEmptyBlock() patch.

3 years ago[NFCI][Local] TryToSimplifyUncondBranchFromEmptyBlock(): improve Dominator Tree updating
Roman Lebedev [Sun, 11 Apr 2021 19:39:22 +0000 (22:39 +0300)]
[NFCI][Local] TryToSimplifyUncondBranchFromEmptyBlock(): improve Dominator Tree updating

First, we don't need vector-ness for the predecessor lists.

Secondly, like elsewhere, do insertions before deletions.

Lastly, the check that we actually need to insert an edge,
that it doesn't exist already, is backwards. Instead of
looking at successors of every single 'PredOfBB',
just always look at predecessors of the 'Succ'.
The result is always the same, but we avoid *really* inefficient code.

3 years ago[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first
Roman Lebedev [Sun, 11 Apr 2021 19:02:57 +0000 (22:02 +0300)]
[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first

While we may indeed end up pushing fewer updates than we reserve space
for, self-dominating updates aren't frequent enough for that to matter.
But this should matter for normal updates.
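
The reserve-then-filter idea can be sketched with a plain `std::vector`. `EdgeUpdate` and `collectUpdates` are hypothetical stand-ins, and treating an update with identical endpoints as self-dominating is an assumption made for this illustration.

```cpp
#include <vector>

// Hypothetical stand-in for a dominator tree edge update.
struct EdgeUpdate {
  int from;
  int to;
};

// Reserves for the worst case up front, then skips self-dominating updates.
// The reservation may be slightly too large, but it avoids reallocations
// while the normal updates are appended.
std::vector<EdgeUpdate> collectUpdates(const std::vector<EdgeUpdate> &updates) {
  std::vector<EdgeUpdate> pending;
  pending.reserve(updates.size());
  for (const EdgeUpdate &update : updates) {
    if (update.from == update.to)
      continue;  // self-dominating, nothing to record
    pending.push_back(update);
  }
  return pending;
}
```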

3 years ago[LoopUnroll] Add AArch64 test case with large vector ops.
Florian Hahn [Sat, 10 Apr 2021 14:23:47 +0000 (15:23 +0100)]
[LoopUnroll] Add AArch64 test case with large vector ops.

Add test case to illustrate over-eager unrolling on AArch64, due to the
cost-model not estimating the size of vector loads/stores accurately.

3 years ago[VectorCombine] Add tests for load/extract scalarization.
Florian Hahn [Sun, 11 Apr 2021 15:51:37 +0000 (16:51 +0100)]
[VectorCombine] Add tests for load/extract scalarization.

Add tests where scalarizing a vector load + extract is profitable.

3 years ago[X86][AVX512] Fold not(kmov(x)) -> kmov(not(x)) and not(widen_subvector(x)) -> widen_...
Simon Pilgrim [Sun, 11 Apr 2021 19:06:53 +0000 (20:06 +0100)]
[X86][AVX512] Fold not(kmov(x)) -> kmov(not(x)) and not(widen_subvector(x)) -> widen_subvector(not(x))

Improve AVX512 mask inversion, rG38c799bce801 exposed some missing opportunities to move scalar not() back onto the boolvector types for folding with setcc etc.

3 years ago[WebAssembly] Update v128.any_true
Thomas Lively [Sun, 11 Apr 2021 18:13:16 +0000 (11:13 -0700)]
[WebAssembly] Update v128.any_true

In the final SIMD spec, there is only a single v128.any_true instruction, rather
than one for each lane interpretation because the semantics do not depend on the
lane interpretation.

Differential Revision: https://reviews.llvm.org/D100241

3 years ago[X86] combineXor - Pull out repeated getOperand() calls. NFCI.
Simon Pilgrim [Sun, 11 Apr 2021 18:01:59 +0000 (19:01 +0100)]
[X86] combineXor - Pull out repeated getOperand() calls. NFCI.

3 years ago[X86] Fold cmpeq/ne(and(X,Y),Y) --> cmpeq/ne(and(~X,Y),0)
Simon Pilgrim [Sun, 11 Apr 2021 17:41:51 +0000 (18:41 +0100)]
[X86] Fold cmpeq/ne(and(X,Y),Y) --> cmpeq/ne(and(~X,Y),0)

Followup to D100177, handle a similar (demorgan inverse style) case from PR47797 as well

The AVX512 test cases could be further improved if we folded not(iX bitcast(vXi1)) -> (iX bitcast(not(vXi1)))

Alive2: https://alive2.llvm.org/ce/z/AnA_-W
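
The equivalence behind the fold can be checked exhaustively on a small bit width: (X & Y) == Y holds exactly when Y has no bit set outside X, i.e. when (~X & Y) == 0. A minimal sketch on 8-bit values, with function names chosen for illustration:

```cpp
#include <cstdint>

// cmpeq(and(X,Y),Y)
bool originalForm(uint8_t x, uint8_t y) { return (x & y) == y; }

// cmpeq(and(~X,Y),0)
bool foldedForm(uint8_t x, uint8_t y) {
  return (static_cast<uint8_t>(~x) & y) == 0;
}
```

The cmpne case follows by negating both sides.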

3 years ago[RISCV] Drop earlyclobber constraint from vwadd(u).wx, vwsub(u).wx, vfwadd.wf and...
Craig Topper [Sun, 11 Apr 2021 17:19:43 +0000 (10:19 -0700)]
[RISCV] Drop earlyclobber constraint from vwadd(u).wx, vwsub(u).wx, vfwadd.wf and vfwsub.wf.

The first source has the same EEW as the destination and the other
source is a scalar so the overlap constraints don't apply to
the unmasked version.

For the masked version we have a constraint that the destination
can't be V0 so that covers the only overlap issue there.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D100217

3 years ago[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffff) when zext...
Craig Topper [Sun, 11 Apr 2021 16:51:45 +0000 (09:51 -0700)]
[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffff) when zext.h is supported.

Similar to what we do for zext.w.

Disable the (srl (and X, 0xffff), C) custom isel when zext.h is
available.

3 years ago[RISCV] Add i8 and i16 srli and srai tests to Zbb/Zbp test files. NFC
Craig Topper [Sun, 11 Apr 2021 16:40:13 +0000 (09:40 -0700)]
[RISCV] Add i8 and i16 srli and srai tests to Zbb/Zbp test files. NFC

These require the input to be zero or sign extended. If we have
sext.b, sext.h or zext.h instructions we can use them. Otherwise
we need to use a pair of shifts to accomplish the zero/sign extend
and the final shift.

We don't currently use zext.h when it is available.

3 years ago[InstCombine] Improve "get low bit mask upto and including bit X" pattern
Roman Lebedev [Sun, 11 Apr 2021 14:58:47 +0000 (17:58 +0300)]
[InstCombine] Improve "get low bit mask upto and including bit X" pattern

https://alive2.llvm.org/ce/z/3u-48R

3 years ago[NFC][InstCombine] Add tests for "get low bit mask upto and including bit X" pattern
Roman Lebedev [Sun, 11 Apr 2021 14:20:59 +0000 (17:20 +0300)]
[NFC][InstCombine] Add tests for "get low bit mask upto and including bit X" pattern

3 years ago[InstCombine] (X | Op01C) + Op1C --> X + (Op01C + Op1C) iff the or is actually an add
Roman Lebedev [Sun, 11 Apr 2021 13:33:47 +0000 (16:33 +0300)]
[InstCombine] (X | Op01C) + Op1C --> X + (Op01C + Op1C) iff the or is actually an add

https://alive2.llvm.org/ce/z/Coc5yf
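
An `or` behaves like an `add` when its operands share no set bits, since no carries can occur. Under that condition the two constants can be combined. The sketch below uses hypothetical function names and 32-bit unsigned arithmetic, which wraps the same way on both sides.

```cpp
#include <cassert>
#include <cstdint>

// The or is "actually an add" when X and the constant have no common bits.
bool orIsAdd(uint32_t x, uint32_t c1) { return (x & c1) == 0; }

// (X | Op01C) + Op1C
uint32_t beforeFold(uint32_t x, uint32_t c1, uint32_t c2) {
  return (x | c1) + c2;
}

// X + (Op01C + Op1C), valid only under the orIsAdd precondition.
uint32_t afterFold(uint32_t x, uint32_t c1, uint32_t c2) {
  assert(orIsAdd(x, c1));
  return x + (c1 + c2);
}
```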

3 years ago[NFC][InstCombine] Add a few test of adding to add-like or
Roman Lebedev [Sun, 11 Apr 2021 13:49:21 +0000 (16:49 +0300)]
[NFC][InstCombine] Add a few test of adding to add-like or

3 years ago[NFC][LoopVectorize] Autogenerate interleaved-accesses.ll
Roman Lebedev [Sun, 11 Apr 2021 13:37:21 +0000 (16:37 +0300)]
[NFC][LoopVectorize] Autogenerate interleaved-accesses.ll

3 years ago[LoopIdiom] left-shift-until-bittest: set all allowed no-wrap flags on add/sub
Roman Lebedev [Sun, 11 Apr 2021 12:38:15 +0000 (15:38 +0300)]
[LoopIdiom] left-shift-until-bittest: set all allowed no-wrap flags on add/sub

I've checked each one of these with alive2,
and this is both correct and precise.

3 years ago[NFC][LoopIdiom] left-shift-until-bittest: add small-bitwidth tests
Roman Lebedev [Sun, 11 Apr 2021 12:55:09 +0000 (15:55 +0300)]
[NFC][LoopIdiom] left-shift-until-bittest: add small-bitwidth tests

3 years ago[NFC][LoopIdiom] Regenerate left-shift-until-bittest.ll
Roman Lebedev [Sun, 11 Apr 2021 12:51:06 +0000 (15:51 +0300)]
[NFC][LoopIdiom] Regenerate left-shift-until-bittest.ll

3 years ago[libc++] [CI] Validate the output of the generated scripts.
Mark de Wever [Sun, 4 Apr 2021 18:11:48 +0000 (20:11 +0200)]
[libc++] [CI] Validate the output of the generated scripts.

This adds a CI job validating that the outputs of
utils/generate_feature_test_macro_components.py,
libcxx/utils/generate_header_inclusion_tests.py, and
utils/generate_header_tests.py are up to date.

The validation method has been copied from the Format job.

Differential Revision: https://reviews.llvm.org/D99862

3 years agoUpdate personal info in CREDITS.TXT
Zhang Qing Shan [Sun, 11 Apr 2021 11:25:02 +0000 (19:25 +0800)]
Update personal info in CREDITS.TXT

3 years agoTypo fix
Sushma Unnibhavi [Sun, 11 Apr 2021 06:53:20 +0000 (12:23 +0530)]
Typo fix

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D100254

3 years agoMissing syntax highlighting for LLVM IR in Langref
Sushma Unnibhavi [Sun, 11 Apr 2021 06:47:49 +0000 (12:17 +0530)]
Missing syntax highlighting for LLVM IR in Langref

Added syntax highlighting

Differential Revision: https://reviews.llvm.org/D100125

3 years agoRevert "Remove "Rewrite Symbols" from codegen pipeline"
Arthur Eubanks [Sun, 11 Apr 2021 06:28:16 +0000 (23:28 -0700)]
Revert "Remove "Rewrite Symbols" from codegen pipeline"

This reverts commit 6210261ecb21c84c9a440a76c0ccbc8ad211bed3.

addr-label.ll crashes on armv7.

3 years agoRemove "Rewrite Symbols" from codegen pipeline
Arthur Eubanks [Thu, 1 Apr 2021 06:12:36 +0000 (23:12 -0700)]
Remove "Rewrite Symbols" from codegen pipeline

It breaks up the function pass manager in the codegen pipeline.

With empty parameters, it looks at the -mllvm flag -rewrite-map-file.
This is likely not in use.

Add a check that we only have one function pass manager in the codegen
pipeline.

This required reverting commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c:
"[AsmPrinter] Delete dead takeDeletedSymbsForFunction()".
This was not NFC as initially thought. By coalescing two function
pass managers, this exposed the reverted code as necessary.
addr-label.ll was crashing due to an emitted blockaddress's block being
removed but the label not emitted.

Some tests relied on the fact that we had a module pass somewhere in the
codegen pipeline.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99707

3 years ago[Polly] Partially refactoring of IslAstInfo and IslNodeBuilder to use isl++. NFC.
patacca [Sat, 10 Apr 2021 21:25:05 +0000 (16:25 -0500)]
[Polly] Partially refactoring of IslAstInfo and IslNodeBuilder to use isl++. NFC.

Polly uses algorithms from the Integer Set Library (isl), which is written in C and therefore does not match the rest of LLVM, which is written in C++.

Changes made:
 - Refactoring the following methods of class IslAstInfo
   - isParallel() isExecutedInParallel() isReductionParallel() getSchedule() getMinimalDependenceDistance() getBrokenReductions()
 - Refactoring the following methods of class IslNodeBuilder
   - getReferencesInSubtree() getScheduleForAstNode()
 - Refactoring function getBrokenReductionsStr()
 - Fixed the mismatching function declaration for getScheduleForAstNode()

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D99971

3 years ago[CVP] @llvm.[us]{min,max}() intrinsics handling
Roman Lebedev [Sat, 10 Apr 2021 21:23:27 +0000 (00:23 +0300)]
[CVP] @llvm.[us]{min,max}() intrinsics handling

If we can tell that either one of the arguments is taken,
bypass the intrinsic.

Notably, we are indeed fine with a non-strict predicate:
* UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf
      https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i
* UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx
* SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W
* SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr

Much like with all other comparison handling in CVP,
while we could sort-of handle two Value's,
at least for plain ICmpInst it does not appear to be worthwhile.

This only fires 78 times on test-suite + dt + rs,
but we don't canonicalize to these yet. (only SCEV produces them)
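
As a rough illustration of the fold described above, the sketch below uses a hypothetical closed-interval type `Range` (a stand-in for the range information LVI provides, not LLVM's ConstantRange) and a hypothetical helper `alwaysTakenByUMin`. It shows why the non-strict comparison is enough: on a tie both arguments are equal, so returning either one is correct.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical closed interval [Lo, Hi] of unsigned values. This is an
// assumption for illustration, not LLVM's ConstantRange.
struct Range {
  uint32_t Lo, Hi;
};

// Returns 0 if umin(A, B) always yields A, 1 if it always yields B, and
// std::nullopt if the ranges alone do not decide the result.
inline std::optional<int> alwaysTakenByUMin(Range A, Range B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "malformed range");
  if (A.Hi <= B.Lo) // non-strict: on a tie the two arguments are equal
    return 0;
  if (B.Hi <= A.Lo)
    return 1;
  return std::nullopt;
}
```

The other three intrinsics follow the same pattern with the comparison direction or signedness changed.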

3 years ago[NFC][CVP] Add tests for @llvm.[us]{min,max}() intrinsics
Roman Lebedev [Sat, 10 Apr 2021 21:10:47 +0000 (00:10 +0300)]
[NFC][CVP] Add tests for @llvm.[us]{min,max}() intrinsics

3 years ago[IVUsers] Check LoopSimplify cache earlier (NFC)
Nikita Popov [Sat, 10 Apr 2021 20:17:35 +0000 (22:17 +0200)]
[IVUsers] Check LoopSimplify cache earlier (NFC)

Check the cache before calling isLoopSimplifyForm(). Otherwise we'd
always perform the check for the innermost loop and only skip it
for dominating loops.

3 years ago[CSSPGO] Fix dangling context strings and improve profile order consistency and error...
Wenlei He [Thu, 8 Apr 2021 06:06:39 +0000 (23:06 -0700)]
[CSSPGO] Fix dangling context strings and improve profile order consistency and error handling

This patch fixes the following issues, along with some refactoring:

1. Fix bugs where the StringRef for a context string outlives the underlying std::string. We now keep a string table in the profile generator to hold the std::strings. We also do the same for bracketed context strings in the profile writer.
2. Make sure profile output strictly follows (total sample, name) order. Previously, there was an inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent.
3. Enhance error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when the string table is not populated correctly for an extended binary profile.
4. Keep all internal context representations bracket free. This avoids creating new strings for context trimming, merging and preinline. The getNameWithContext API is now simplified accordingly.
5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch.
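
The lifetime problem from item 1 can be sketched in standard C++, with `std::string_view` standing in for StringRef. `StringTable`, `intern` and `makeContext` are hypothetical names, and the `" @ "` separator is only illustrative. The point is that the table owns the strings, so views handed out stay valid as long as the table lives.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical string table: it owns the std::strings so that the views it
// hands out remain valid for the lifetime of the table.
class StringTable {
  // Node-based container: references to elements survive rehashing.
  std::unordered_set<std::string> Storage;

public:
  std::string_view intern(const std::string &S) {
    return *Storage.insert(S).first;
  }
};

inline std::string_view makeContext(StringTable &Table,
                                    const std::string &Caller,
                                    const std::string &Callee) {
  // The concatenation is a temporary. Returning a view of it directly would
  // dangle, so it is interned first.
  return Table.intern(Caller + " @ " + Callee);
}
```

Interning the same context twice returns a view of the same stored string.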

Differential Revision: https://reviews.llvm.org/D100090

3 years ago[NFC][JumpThreading] Increment 'NumFolds' statistic all places terminator becomes...
Roman Lebedev [Sat, 10 Apr 2021 18:24:29 +0000 (21:24 +0300)]
[NFC][JumpThreading] Increment 'NumFolds' statistic all places terminator becomes uncond

3 years ago[NFC][CVP] Add statistic for function pointer argument non-null-ness deduction
Roman Lebedev [Sat, 10 Apr 2021 18:23:20 +0000 (21:23 +0300)]
[NFC][CVP] Add statistic for function pointer argument non-null-ness deduction

3 years ago[CVP] LVI: Use in-block values when checking value signedness domain
Roman Lebedev [Sat, 10 Apr 2021 18:05:17 +0000 (21:05 +0300)]
[CVP] LVI: Use in-block values when checking value signedness domain

This has a huge positive impact on all the folds that use these helpers,
as can be seen on vanilla test-suite + rawspeed + darktable:
correlated-value-propagation.NumSRems             +75.68% (+ 28)
correlated-value-propagation.NumAShrs             +63.87% (+198)
correlated-value-propagation.NumSDivs             +49.42% (+127)
correlated-value-propagation.NumSExt              + 8.85% (+593)
correlated-value-propagation.NumUDivURemsNarrowed + 8.65% (+34)

... while having pretty minimal compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=e8c7f43e2c2c6f3581ec1c6489ec21ad9f98958a&to=4cd197711e58ee1b2faeee0c35eea54540185569&stat=instructions

3 years ago[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue
Roman Lebedev [Sat, 10 Apr 2021 17:45:37 +0000 (20:45 +0300)]
[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue

The default is likely wrong.
Out of all the callers, only a single one needs to pass in false (JumpThreading),
everything else either already passes true, or should pass true.

Until the default is flipped, at least make it harder to unintentionally
add new callers with UseBlockValue=false.

3 years ago[NFC] Rename LimitingIntrinsic into MinMaxIntrinsic
Roman Lebedev [Sat, 10 Apr 2021 17:34:27 +0000 (20:34 +0300)]
[NFC] Rename LimitingIntrinsic into MinMaxIntrinsic

As requested in post-commit review

3 years ago[flang] Accept & fold IEEE_SELECTED_REAL_KIND
peter klausler [Wed, 7 Apr 2021 20:21:10 +0000 (13:21 -0700)]
[flang] Accept & fold IEEE_SELECTED_REAL_KIND

F18 supports the standard intrinsic function SELECTED_REAL_KIND
but not its synonym in the standard module IEEE_ARITHMETIC
named IEEE_SELECTED_REAL_KIND until this patch.

Differential Revision: https://reviews.llvm.org/D100066

3 years ago[libtooling][clang-tidy] Fix off-by-one rendering issue with SourceRanges
Whisperity [Sat, 10 Apr 2021 16:48:22 +0000 (18:48 +0200)]
[libtooling][clang-tidy] Fix off-by-one rendering issue with SourceRanges

There was an off-by-one issue with calculating the *exact* end location
of token ranges (as given by SomeDecl->getSourceRange()) which resulted in:

  xxx(something)
      ^~~~~~~~   // Note the missing ~ under the last character.

In addition, a test is added to keep the behaviour in check in the future.
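
One common way such an off-by-one arises is treating the start of the last token as the end of the range. This is an assumption about the general mechanism, not a description of the actual fix. The sketch below uses a hypothetical helper `underline` with 0-based columns and computes the end as one past the last character of the final token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the marker line for a highlighted range (hypothetical helper).
// Begin is the column of the first character, LastTokBegin the column where
// the final token of the range starts, and LastTokLen that token's length.
inline std::string underline(std::size_t Begin, std::size_t LastTokBegin,
                             std::size_t LastTokLen) {
  assert(Begin <= LastTokBegin && LastTokLen > 0 && "malformed range");
  // One past the last character; forgetting to add the token length is what
  // would drop markers at the end of the range.
  std::size_t End = LastTokBegin + LastTokLen;
  std::string Marker(Begin, ' ');
  Marker += '^';
  Marker.append(End - Begin - 1, '~');
  return Marker;
}
```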

This patch hotfixes commit 3b677b81cec7b3c5132aee8fccc30252d87deb69.

3 years ago[NFC][ConstantRange] Add 'icmp' helper method
Roman Lebedev [Sat, 10 Apr 2021 16:37:59 +0000 (19:37 +0300)]
[NFC][ConstantRange] Add 'icmp' helper method

"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, clean them all up.
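
The question quoted above can be sketched on a simplified model. `Interval`, `Pred` and `icmpHolds` are hypothetical names, and the intervals here are closed, non-empty and non-wrapping, unlike LLVM's ConstantRange. The predicate "holds" only if it is true for every pair of values drawn from the two ranges.

```cpp
#include <cassert>
#include <cstdint>

enum class Pred { EQ, NE, ULT, ULE, UGT, UGE };

// Hypothetical closed, non-empty, non-wrapping unsigned interval [Lo, Hi].
struct Interval {
  uint32_t Lo, Hi;
};

// True only if "a Pred b" holds for every a in A and every b in B.
inline bool icmpHolds(Pred P, Interval A, Interval B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "malformed interval");
  switch (P) {
  case Pred::EQ: // both must be the same single value
    return A.Lo == A.Hi && B.Lo == B.Hi && A.Lo == B.Lo;
  case Pred::NE: // the intervals must not overlap
    return A.Hi < B.Lo || B.Hi < A.Lo;
  case Pred::ULT:
    return A.Hi < B.Lo;
  case Pred::ULE:
    return A.Hi <= B.Lo;
  case Pred::UGT:
    return A.Lo > B.Hi;
  case Pred::UGE:
    return A.Lo >= B.Hi;
  }
  return false;
}
```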

3 years agoRevert "[NFC][ConstantRange] Add 'icmp' helper method"
Roman Lebedev [Sat, 10 Apr 2021 16:37:53 +0000 (19:37 +0300)]
Revert "[NFC][ConstantRange] Add 'icmp' helper method"

This reverts commit 17cf2c94230bc107e7294ef84fad3b47f4cd1b73.

3 years agoRevert "zz"
Roman Lebedev [Sat, 10 Apr 2021 16:37:16 +0000 (19:37 +0300)]
Revert "zz"

It wasn't meant to be committed, two commits should have been squashed.

This reverts commit 0c184154969c020db416bd7066af80ffd2a27ac4.

3 years ago[NFC][ConstantRange] Add 'icmp' helper method
Roman Lebedev [Sat, 10 Apr 2021 14:58:47 +0000 (17:58 +0300)]
[NFC][ConstantRange] Add 'icmp' helper method

"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, clean them all up.

3 years agozz
Roman Lebedev [Sat, 10 Apr 2021 14:10:51 +0000 (17:10 +0300)]
zz

3 years ago[libtooling][clang-tidy] Fix diagnostics not highlighting fed SourceRanges
Whisperity [Mon, 15 Mar 2021 16:06:03 +0000 (17:06 +0100)]
[libtooling][clang-tidy] Fix diagnostics not highlighting fed SourceRanges

Fixes bug http://bugs.llvm.org/show_bug.cgi?id=49000.

This patch allows Clang-Tidy checks to do

    diag(X->getLocation(), "text") << Y->getSourceRange();

and get the highlight of `Y` as expected:

    warning: text [blah-blah]
        xxx(something)
        ^   ~~~~~~~~~

Reviewed-By: aaron.ballman, njames93
Differential Revision: http://reviews.llvm.org/D98635

3 years ago[CVP] @llvm.abs() handling
Roman Lebedev [Sat, 10 Apr 2021 12:52:28 +0000 (15:52 +0300)]
[CVP] @llvm.abs() handling

Iff we know the signedness domain of the argument,
we can either skip @llvm.abs, or do negation directly.

Notably, INT_MIN can belong to either domain:
* X u<= INT_MIN --> X  is always fine
  https://alive2.llvm.org/ce/z/QB8j-C https://alive2.llvm.org/ce/z/7sFKpS
* X s<= 0 --> -X  is always fine
  https://alive2.llvm.org/ce/z/QbGSyq https://alive2.llvm.org/ce/z/APsN84

If all else fails, try to infer the NSW flag:
https://alive2.llvm.org/ce/z/qCJfYm
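
The two rewrites can be checked exhaustively on a small model. The sketch below uses 8-bit integers and hypothetical helpers `negWrap` and `absWrap`. It assumes the variant of abs in which INT_MIN is not poison, so abs(INT_MIN) wraps to INT_MIN, and it relies on narrowing to `int8_t` wrapping (guaranteed since C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Wrapping 8-bit negation, done in unsigned arithmetic to avoid signed overflow.
inline int8_t negWrap(int8_t X) {
  return static_cast<int8_t>(
      static_cast<uint8_t>(0u - static_cast<uint8_t>(X)));
}

// Model of abs where INT_MIN is not poison: abs(INT_MIN) wraps to INT_MIN.
inline int8_t absWrap(int8_t X) { return X < 0 ? negWrap(X) : X; }

// Checks both rewrites for all 256 8-bit values.
inline bool rewritesHold() {
  for (int I = -128; I <= 127; ++I) {
    int8_t X = static_cast<int8_t>(I);
    bool ULEIntMin = static_cast<uint8_t>(X) <= 0x80u; // X u<= INT_MIN
    if (ULEIntMin && absWrap(X) != X) // abs(X) --> X
      return false;
    if (X <= 0 && absWrap(X) != negWrap(X)) // abs(X) --> -X
      return false;
  }
  return true;
}
```

INT_MIN satisfies both conditions, which is why it can belong to either domain.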

3 years ago[NFC][CVP] Add `@llvm.abs` test cases
Roman Lebedev [Sat, 10 Apr 2021 12:41:43 +0000 (15:41 +0300)]
[NFC][CVP] Add `@llvm.abs` test cases

3 years ago[Matrix] Implement C-style explicit type conversions for matrix types.
Saurabh Jha [Sat, 10 Apr 2021 09:25:34 +0000 (10:25 +0100)]
[Matrix] Implement C-style explicit type conversions for matrix types.

This implements C-style type conversions for matrix types, as specified
in clang/docs/MatrixTypes.rst.

Fixes PR47141.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D99037

3 years ago[RISCV][Clang] Add RVV vleff intrinsic functions.
Hsiangkai Wang [Fri, 9 Apr 2021 23:02:08 +0000 (07:02 +0800)]
[RISCV][Clang] Add RVV vleff intrinsic functions.

Reviewed By: craig.topper, liaolucy, jrtc27, khchen

Differential Revision: https://reviews.llvm.org/D99151

3 years agoTemporarily revert "[CGCall] Annotate `this` argument with alignment"
Roman Lebedev [Sat, 10 Apr 2021 07:41:16 +0000 (10:41 +0300)]
Temporarily revert "[CGCall] Annotate `this` argument with alignment"

As per @jyknight, "It seems like there's a bug with vtable thunks getting the wrong information."
See https://reviews.llvm.org/D99790#2680857, https://godbolt.org/z/MxhYMe1q7

This reverts commit 0aa0458f1429372038ca6a4edc7e94c96cd9a753.

3 years ago[AMDGPU][CostModel] Refine cost model for control-flow instructions.
dfukalov [Tue, 16 Feb 2021 19:20:06 +0000 (22:20 +0300)]
[AMDGPU][CostModel] Refine cost model for control-flow instructions.

Added cost estimation for switch instruction, updated costs of branches, fixed
phi cost.
Had to increase `-amdgpu-unroll-threshold-if` default value since conditional
branch cost (size) was corrected to higher value.
Test renamed to "control-flow.ll".

Removed redundant code in `X86TTIImpl::getCFInstrCost()` and
`PPCTTIImpl::getCFInstrCost()`.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D96805

3 years ago[clang][AVR] Support variable decorator '__flash'
Ben Shi [Sat, 10 Apr 2021 03:23:55 +0000 (11:23 +0800)]
[clang][AVR] Support variable decorator '__flash'

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D96853

3 years agoSupport: Add move semantics to mapped_file_region
Duncan P. N. Exon Smith [Fri, 9 Apr 2021 01:43:21 +0000 (18:43 -0700)]
Support: Add move semantics to mapped_file_region

Update llvm::sys::fs::mapped_file_region to have a move constructor and
a move assignment operator, allowing it to be used as an Optional. Also,
update FileOutputBuffer's OnDiskBuffer to take advantage of this,
avoiding an extra allocation from the unique_ptr.

A nice follow-up would be to make the mapped_file_region constructor
private and replace its use with a factory function, such as
mapped_file_region::create(), that returns an Expected (or ErrorOr). I
don't plan on doing that immediately, but I might swing back later.

No functionality change, besides the saved allocation in OnDiskBuffer.
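
The benefit of move semantics here can be sketched with standard C++. `Region` and `Buffer` are hypothetical stand-ins (a heap buffer instead of a real mapping, and `std::optional` instead of LLVM's Optional). Once the handle is movable, it can live inline in an optional member instead of behind a unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical move-only handle standing in for a mapped region. It owns a
// heap buffer rather than a real memory mapping.
class Region {
  char *Data = nullptr;
  std::size_t Size = 0;

public:
  explicit Region(std::size_t N) : Data(new char[N]()), Size(N) {}
  Region(const Region &) = delete;
  Region &operator=(const Region &) = delete;
  Region(Region &&O) noexcept : Data(O.Data), Size(O.Size) {
    O.Data = nullptr; // leave the source empty so it releases nothing
    O.Size = 0;
  }
  Region &operator=(Region &&O) noexcept {
    if (this != &O) {
      delete[] Data;
      Data = std::exchange(O.Data, nullptr);
      Size = std::exchange(O.Size, 0);
    }
    return *this;
  }
  ~Region() { delete[] Data; }
  std::size_t size() const { return Size; }
};

// The handle is stored inline, so no separate allocation is needed for it.
struct Buffer {
  std::optional<Region> Mapping;
  void open(std::size_t N) { Mapping = Region(N); }
};
```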

Differential Revision: https://reviews.llvm.org/D100159

3 years ago[flang] RANDOM_NUMBER, RANDOM_SEED, RANDOM_INIT in runtime
peter klausler [Wed, 7 Apr 2021 20:14:14 +0000 (13:14 -0700)]
[flang] RANDOM_NUMBER, RANDOM_SEED, RANDOM_INIT in runtime

Add APIs, initial non-coarray implementations, and unit
tests for the intrinsic subroutines for pseudo-random
number generation.

Differential Revision: https://reviews.llvm.org/D100064

3 years ago[lld-macho][nfc] Remove DYSYM8 reloc attribute
Jez Ng [Fri, 9 Apr 2021 23:47:10 +0000 (19:47 -0400)]
[lld-macho][nfc] Remove DYSYM8 reloc attribute

It's likely redundant, per discussion with @gkm. The BYTE8
attribute covers the bit width requirement already.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D100133

3 years ago[flang] Enforce a limit on recursive PDT instantiations
peter klausler [Wed, 7 Apr 2021 20:17:39 +0000 (13:17 -0700)]
[flang] Enforce a limit on recursive PDT instantiations

For pernicious test cases with explicit non-constant actual
type parameter expressions in components, e.g.:

  type :: t(k)
    integer, kind :: k
    type(t(k+1)), pointer :: p
  end type

we should detect the infinite recursion and complain rather
than looping until the stack overflows.
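
A depth limit of this kind might look roughly like the following C++ sketch. The names `instantiate` and `MaxInstantiationDepth`, the limit value and the error text are all assumptions for illustration. The message does not state what flang actually uses.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical limit, chosen only for this example.
constexpr int MaxInstantiationDepth = 100;

// Models instantiating t(K) whose component has type t(K+1). StopAt, if set,
// is the K at which a well-behaved type stops referring to further
// instantiations; std::nullopt models the pernicious case above.
// Returns an error message when the limit is hit, std::nullopt on success.
inline std::optional<std::string> instantiate(int K, int Depth,
                                              std::optional<int> StopAt) {
  if (Depth > MaxInstantiationDepth)
    return std::string("too many recursive instantiations of t(k)");
  if (StopAt && K >= *StopAt)
    return std::nullopt;
  return instantiate(K + 1, Depth + 1, StopAt);
}
```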

Differential Revision: https://reviews.llvm.org/D100065

3 years agoRevert "[AMDGPU] Remove MachineDCE after SIFoldOperands"
Mitch Phillips [Fri, 9 Apr 2021 22:36:11 +0000 (15:36 -0700)]
Revert "[AMDGPU] Remove MachineDCE after SIFoldOperands"

This reverts commit 5a0117b2d0eaedffeeb393bd9915f11cccfe241b.

Reason: Dependent change d19a42eba98fe853dd52f7dc89d8cd2727c7fc1c broke
the ASan buildbots.

3 years agoRevert "[AMDGPU] SIFoldOperands: eagerly erase dead REG_SEQUENCEs"
Mitch Phillips [Fri, 9 Apr 2021 22:02:33 +0000 (15:02 -0700)]
Revert "[AMDGPU] SIFoldOperands: eagerly erase dead REG_SEQUENCEs"

This reverts commit d19a42eba98fe853dd52f7dc89d8cd2727c7fc1c.

Reason: Broke the ASan buildbots. See the original phabricator review
for more details: https://reviews.llvm.org/D100188

3 years ago[AArch64][GlobalISel] Swap compare operands when it may be profitable
Jessica Paquette [Mon, 9 Nov 2020 21:35:41 +0000 (13:35 -0800)]
[AArch64][GlobalISel] Swap compare operands when it may be profitable

This adds support for swapping comparison operands when it may introduce new
folding opportunities.

This is roughly the same as the code added to AArch64ISelLowering in
162435e7b5e026b9f988c730bb6527683f6aa853.

For an example of a testcase which exercises this, see
llvm/test/CodeGen/AArch64/swap-compare-operands.ll

(Godbolt for that testcase: https://godbolt.org/z/43WEMb)

The idea behind this is that sometimes, we may be able to fold away, say, a
shift or extend in a compare by swapping its operands.

e.g. in the case of this compare:

```
lsl x8, x0, #1
cmp x8, x1
cset w0, lt
```

The following is equivalent:

```
cmp x1, x0, lsl #1
cset w0, gt
```

Most of the code here is just a reimplementation of what already exists in
AArch64ISelLowering.

(See `getCmpOperandFoldingProfit` and `getAArch64Cmp` for the equivalent code.)
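
The condition-code side of the swap can be illustrated independently of the backend. In this sketch `Cond`, `swapOperands` and `evaluate` are hypothetical names, not the AArch64 helpers mentioned above. Swapping the operands of a compare requires mirroring the ordering conditions, while equality conditions are unchanged.

```cpp
#include <cassert>
#include <cstdint>

enum class Cond { EQ, NE, LT, LE, GT, GE };

// Condition to use once the two compare operands have been swapped.
inline Cond swapOperands(Cond C) {
  switch (C) {
  case Cond::LT: return Cond::GT;
  case Cond::GT: return Cond::LT;
  case Cond::LE: return Cond::GE;
  case Cond::GE: return Cond::LE;
  case Cond::EQ:
  case Cond::NE: return C; // symmetric, nothing to change
  }
  return C;
}

// Evaluates "L <C> R" on signed 64-bit values.
inline bool evaluate(Cond C, int64_t L, int64_t R) {
  switch (C) {
  case Cond::EQ: return L == R;
  case Cond::NE: return L != R;
  case Cond::LT: return L < R;
  case Cond::LE: return L <= R;
  case Cond::GT: return L > R;
  case Cond::GE: return L >= R;
  }
  return false;
}
```

For any condition, `evaluate(C, L, R)` and `evaluate(swapOperands(C), R, L)` agree, which is what lets the shifted operand move to the second position.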

Note that most of the AND code in the testcase doesn't actually fold. It seems
like we're missing selection support for that sort of fold right now, since SDAG
happily folds these away (e.g. testSwapCmpWithShiftedZeroExtend8_32 in the
original .ll testcase).

Differential Revision: https://reviews.llvm.org/D89422

3 years ago[flang] Check for conflicting BIND(C) names
peter klausler [Wed, 7 Apr 2021 20:23:45 +0000 (13:23 -0700)]
[flang] Check for conflicting BIND(C) names

Check for two or more symbols that define a data object or entry point
with the same interoperable BIND(C) name.
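
A check of this shape can be sketched as a single pass over the symbols with a map keyed by binding name. `Symbol` and `findBindCConflicts` are hypothetical names, and the sketch is written in C++ rather than taken from flang's semantics code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol record: a Fortran name and its interoperable C name.
struct Symbol {
  std::string FortranName, BindName;
};

// Returns (first definer, later definer) pairs for every symbol that reuses
// a BIND(C) name already claimed by an earlier symbol.
inline std::vector<std::pair<std::string, std::string>>
findBindCConflicts(const std::vector<Symbol> &Symbols) {
  std::map<std::string, std::string> Seen; // bind name -> first definer
  std::vector<std::pair<std::string, std::string>> Conflicts;
  for (const Symbol &S : Symbols) {
    auto [It, Inserted] = Seen.emplace(S.BindName, S.FortranName);
    if (!Inserted)
      Conflicts.emplace_back(It->second, S.FortranName);
  }
  return Conflicts;
}
```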

Differential Revision: https://reviews.llvm.org/D100067