platform/upstream/llvm.git
Vy Nguyen [Tue, 4 May 2021 20:23:21 +0000 (16:23 -0400)]
[lld-macho] Check simulator platforms to avoid issuing false positive errors.

Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator.

Differential Revision: https://reviews.llvm.org/D101855

Nico Weber [Wed, 5 May 2021 22:06:52 +0000 (18:06 -0400)]
[gn build] (semi-manually) port 0b10bb7ddd3c

Matt Arsenault [Sun, 14 Mar 2021 17:52:31 +0000 (13:52 -0400)]
AMDGPU: Add a few more tail call tests

Add some cases I noticed were missing when porting to GlobalISel. The
cases that required any argument splitting did not work at first.

Matt Arsenault [Wed, 5 May 2021 21:22:10 +0000 (17:22 -0400)]
ARM/GlobalISel: Don't store a MachineInstrBuilder reference

This is basically a pointer anyway

Richard Smith [Wed, 5 May 2021 21:44:49 +0000 (14:44 -0700)]
When performing template argument deduction to select a partial
specialization while substituting a partial template parameter pack,
don't try to extend the existing deduction.

This caused us to select the wrong partial specialization in some rare
cases. A recent change to libc++ caused this to happen in practice for
code using std::conjunction.

Stanislav Mekhanoshin [Mon, 3 May 2021 18:01:13 +0000 (11:01 -0700)]
[AMDGPU] Improve global SADDR selection

An address can be a uniform sum of two i64 values.
This regularly happens in a loop where the index is an induction
variable promoted to 64 bits by LSR. We can materialize
zero in a VGPR and still use the SADDR form of the load.

Differential Revision: https://reviews.llvm.org/D101591

Kirill Bobyrev [Wed, 5 May 2021 21:39:37 +0000 (23:39 +0200)]
[clangd] Split CC and refs limit and increase refs limit to 1000

Related discussion: https://github.com/clangd/clangd/discussions/761

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D101902

Matt Arsenault [Wed, 5 May 2021 17:55:24 +0000 (13:55 -0400)]
GlobalISel: Update documentation

Matt Arsenault [Wed, 5 May 2021 02:29:30 +0000 (22:29 -0400)]
AMDGPU/GlobalISel: Remove unnecessary override

This is the same as the default implementation

Matt Arsenault [Sun, 28 Feb 2021 16:35:37 +0000 (11:35 -0500)]
X86/GlobalISel: Use generic version of splitToValueTypes

The custom insert of an unmerge and the callback weirdness should be
unnecessary. Since handleAssignments should now use
getRegisterTypeForCallingConv as the SelectionDAG builder would, this
should now just be able to use the generic code. X86-32 relies on the
generated CCAssignFns not seeing illegal types and sharing code with
x86_64, so i64 values would incorrectly be assigned to 64-bit
registers.

Matt Arsenault [Tue, 13 Apr 2021 17:45:35 +0000 (13:45 -0400)]
GlobalISel: Use DAG call lowering infrastructure in a more compatible way

Unfortunately the current call lowering code is built on top of the
legacy MVT/DAG based code. However, GlobalISel was not using it the
same way. In short, the DAG passes legalized types to the assignment
function, and GlobalISel was passing the original raw type if it was
simple.

I do believe the DAG lowering is conceptually broken since it requires
picking a type up front before knowing how/where the value will be
passed. This ends up being a problem for AArch64, which wants to pass
i1/i8/i16 values as a different size if passed on the stack or in
registers.

The argument type decision is split across 3 different places which is
hard to follow. SelectionDAG builder uses
getRegisterTypeForCallingConv to pick a legal type, tablegen gives the
illusion of controlling the type, and the target may have additional
hacks in the C++ part of the call lowering. AArch64 hacks around this
by not using the standard AnalyzeFormalArguments and special casing
i1/i8/i16 by looking at the underlying type of the original IR
argument.

I believe people have generally assumed the calling convention code is
processing the original types, and I've discovered a number of dead
paths in several targets.

x86 actually relies on the opposite behavior from AArch64, and relies
on x86_32 and x86_64 sharing calling convention code where the 64-bit
cases implicitly do not work on x86_32 due to using the pre-legalized
types.

AMDGPU targets without legal i16/f16 have always used a broken ABI
that promotes to i32/f32. GlobalISel accidentally fixed this to be the
ABI we should have, but this fixes it so we're using the worse ABI
that is compatible with the DAG. Ideally we would fix the DAG to match
the old GlobalISel behavior, but I don't wish to fight that battle.

A new native GlobalISel call lowering framework should let the target
process the incoming types directly.

CCValAssigns select a "ValVT" and "LocVT" but the meanings of these
aren't entirely clear. Different targets don't use them consistently,
even within their own call lowering code. My current belief is that the
intent was for "ValVT" to be the legalized value type to use
in the end, and for "LocVT" to be the ABI passed type
(which is also legalized).

With the default CCState::Analyze functions always passing the same
type for these arguments, these only differ when the TableGen part of
the lowering decides to promote the type from one legal type to
another. AArch64's i1/i8/i16 hack ends up inverting the meanings of
these values, so I had to add an additional hack to let the target
interpret how large the argument memory is.

Since targets don't consistently interpret ValVT and LocVT, this
doesn't produce quite equivalent code to the initial DAG
lowerings. I've opted to consistently interpret LocVT as the in-memory
size for stack passed values, and ValVT as the register type to assign
from that memory. We therefore produce extending loads directly out of
the IRTranslator, whereas the DAG would emit regular loads of smaller
values. This will also produce loads/stores that are wider than the
argument value if the allocated stack slot is larger (and there will
be undef padding bytes). If we had the optimizations to reduce
load/stores based on truncated values, this wouldn't produce a
different end result.

Since ValVT/LocVT are more consistently interpreted, we now will emit
more G_BITCASTs as requested by the CCAssignFn. For example AArch64
was directly assigning types to some physical vector registers which
according to the tablegen spec should have been cast to a vector
with a different element type.

This also moves the responsibility for inserting
G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the
generic code, which is closer to how SelectionDAGBuilder works.

I had to xfail an x86 test since I don't see a quick way to fix it
right now (I filed bug 50035 for this). It's broken independently of
this change, and only triggers since now we end up with more ands
which hit the improperly handled selection pattern.

I also observed that FP arguments that need promotion (e.g. f16 passed
as f32) are broken, and use regular G_TRUNC and G_ANYEXT.

TL;DR: the current call lowering infrastructure is bad and nobody has
ever understood how it chooses types.

Emilio Cota [Wed, 5 May 2021 21:26:50 +0000 (14:26 -0700)]
[mlir] Add polynomial approximation for math::ExpM1

This approximation matches the one in Eigen.

```
name                      old cpu/op  new cpu/op  delta
BM_mlir_Expm1_f32/10      90.9ns ± 4%  52.2ns ± 4%  -42.60%    (p=0.000 n=74+87)
BM_mlir_Expm1_f32/100      837ns ± 3%   231ns ± 4%  -72.43%    (p=0.000 n=79+69)
BM_mlir_Expm1_f32/1k      8.43µs ± 3%  1.58µs ± 5%  -81.30%    (p=0.000 n=77+83)
BM_mlir_Expm1_f32/10k     83.8µs ± 3%  15.4µs ± 5%  -81.65%    (p=0.000 n=83+69)
BM_eigen_s_Expm1_f32/10   68.8ns ±17%  72.5ns ±14%   +5.40%  (p=0.000 n=118+115)
BM_eigen_s_Expm1_f32/100   694ns ±11%   717ns ± 2%   +3.34%   (p=0.000 n=120+75)
BM_eigen_s_Expm1_f32/1k   7.69µs ± 2%  7.97µs ±11%   +3.56%   (p=0.000 n=95+117)
BM_eigen_s_Expm1_f32/10k  88.0µs ± 1%  89.3µs ± 6%   +1.45%   (p=0.000 n=74+106)
BM_eigen_v_Expm1_f32/10   44.3ns ± 6%  45.0ns ± 8%   +1.45%   (p=0.018 n=81+111)
BM_eigen_v_Expm1_f32/100   351ns ± 1%   360ns ± 9%   +2.58%    (p=0.000 n=73+99)
BM_eigen_v_Expm1_f32/1k   3.31µs ± 1%  3.42µs ± 9%   +3.37%   (p=0.000 n=71+100)
BM_eigen_v_Expm1_f32/10k  33.7µs ± 8%  34.1µs ± 9%   +1.04%    (p=0.007 n=99+98)
```

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D101852
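
The message does not spell out the approximation itself. As a rough scalar
illustration of why a dedicated expm1 is worth having, the sketch below uses
the classic `(u - 1) * x / log(u)` correction with `u = exp(x)`. This is an
assumption made for illustration, not the code from the patch, and
`expm1Sketch` is a hypothetical name.

```cpp
#include <cmath>

// Hypothetical helper: computes exp(x) - 1 without the cancellation that
// the naive expression suffers from when x is close to zero.
float expm1Sketch(float x) {
  const float u = std::exp(x);
  if (u == 1.0f)
    return x; // exp(x) rounded to 1, so exp(x) - 1 is approximately x.
  const float uMinusOne = u - 1.0f;
  if (uMinusOne == -1.0f)
    return -1.0f; // exp(x) is negligible next to 1.
  if (std::isinf(u))
    return u; // Overflow: propagate +inf.
  // Dividing by log(u) compensates for the rounding error made in u.
  return uMinusOne * (x / std::log(u));
}
```

For small inputs such as `1e-5f`, the naive `std::exp(x) - 1.0f` loses most of
its significant digits, while the corrected form stays close to `std::expm1`.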

Michael Kitzan [Sat, 1 May 2021 02:50:54 +0000 (19:50 -0700)]
[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs

- Move the code preventing CSE of `isConvergent` instrs into
  `ProcessBlockCSE` (from `isProfitableToCSE`)
- Add comments explaining why `isConvergent` is used to prevent
  CSE of non-local instrs in MachineCSE and the new test

Giorgis Georgakoudis [Wed, 5 May 2021 18:46:02 +0000 (11:46 -0700)]
[Utils][NFC] Rename replace-function-regex in update_cc_test_checks

This patch renames replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value, not only functions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101934

Krzysztof Parzyszek [Fri, 23 Apr 2021 20:07:00 +0000 (15:07 -0500)]
Preserve metadata on masked intrinsics in auto-upgrade

When auto-upgrade was replacing a call to a masked intrinsic, it would
not copy the metadata from the original call.

If an intrinsic had metadata, but did not need any updates, the metadata
would stay, but if an update was needed, the metadata would end up being removed.
A similar effect could be observed with masked_expandload and
masked_compressstore, which at the moment are not handled by auto-upgrade:
the metadata remained untouched.

Differential Revision: https://reviews.llvm.org/D101201

Roman Lebedev [Wed, 5 May 2021 20:46:35 +0000 (23:46 +0300)]
[NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)

Thomas Lively [Wed, 5 May 2021 20:46:45 +0000 (13:46 -0700)]
[WebAssembly] Add SIMD const_splat intrinsics

These intrinsics do not correspond to their own underlying instruction, but are
a convenience for the common case of materializing a constant vector that has
the same value in each lane.

Differential Revision: https://reviews.llvm.org/D101885

Isuru Fernando [Wed, 28 Apr 2021 17:50:51 +0000 (12:50 -0500)]
[lld] Convert LLVM_CMAKE_PATH to a CMake path

Otherwise I get the following error on Windows.
```
CMake Error at D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2 (set):
  Syntax error in cmake code at

    D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2

  when parsing string

    D:\bld\lld_1569206597988\_h_env\Library\lib\cmake\llvm

  Invalid character escape '\b'.

CMake Error at D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:100 (try_compile):
  Failed to configure test project build system.
Call Stack (most recent call first):
  D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:57 (__CHECK_SYMBOL_EXISTS_IMPL)
  D:/bld/lld_1569206597988/_h_env/Library/lib/cmake/llvm/HandleLLVMOptions.cmake:943 (check_symbol_exists)
  CMakeLists.txt:56 (include)
```

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D68158
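
The error above arises because the backslashes of a native Windows path are
read as escape sequences once the path is spliced into CMake code. CMake's
`file(TO_CMAKE_PATH ...)` command converts such a path to forward slashes;
whether the patch uses exactly that command is an assumption. The core of the
conversion can be sketched in C++ as follows (`toForwardSlashes` is a
hypothetical name, and the path in the usage note is shortened):

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: rewrite a native Windows path so that it contains no
// backslashes, which a CMake string literal would treat as escape characters.
std::string toForwardSlashes(std::string path) {
  std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}
```

With this, `D:\bld\Library\lib\cmake\llvm` becomes
`D:/bld/Library/lib/cmake/llvm`, which no longer contains the `\b` sequence
that CMake rejected.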

Rob Suderman [Wed, 5 May 2021 20:10:49 +0000 (13:10 -0700)]
[mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv

Implements support for undilated depthwise convolution using the existing
depthwise convolution operation. Once convolutions migrate to yaml defined
versions we can rewrite for cleaner implementation.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D101579

Vitaly Buka [Tue, 4 May 2021 23:34:59 +0000 (16:34 -0700)]
[scudo] Align objects with alignas

Operator new must align allocations for types with large alignment.

Before C++17 the behavior was implementation-defined, and both clang and g++
before 11 ignored alignment. Misaligned objects mysteriously crashed
tests on Ubuntu 14.

The alternatives are to compile with -std=c++17 or -faligned-new, but they were
discarded as less portable.

Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D101874
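
One portable way to get aligned allocations without relying on C++17 is a
class-specific operator new. The sketch below shows the idea under the
assumption that `posix_memalign` is available; `CacheLineObject` and
`isAligned` are hypothetical names, and this is not the code from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h> // posix_memalign (POSIX, not ISO C++)

// Hypothetical over-aligned type; the name and size are illustrative.
struct alignas(64) CacheLineObject {
  char payload[64];

  // Class-specific operator new that honours alignas even before C++17.
  static void *operator new(std::size_t size) {
    void *p = nullptr;
    if (posix_memalign(&p, alignof(CacheLineObject), size) != 0)
      throw std::bad_alloc();
    return p;
  }
  static void operator delete(void *p) { free(p); }
};

// Returns true if p is aligned to the given power-of-two alignment.
bool isAligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A plain `new CacheLineObject` then returns storage for which
`isAligned(ptr, alignof(CacheLineObject))` holds, regardless of the language
standard the translation unit is compiled with.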

Arthur O'Dwyer [Tue, 20 Apr 2021 19:38:57 +0000 (15:38 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps.

This simply applies Howard's commit 4c80bfbd53caf consistently
across all the associative and unordered container tests.

"unord.set/insert_hint_const_lvalue.pass.cpp" failed with `-D_LIBCPP_DEBUG=1`
before this patch; it was the only one that incorrectly reused
invalid iterator `e`. The others already used valid iterators
(generally `c.end()`); I'm just making them all match the same pattern
of usage: "e, then r, then c.end() for the rest."

Differential Revision: https://reviews.llvm.org/D101679
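
Inserting into an unordered container may trigger a rehash, which invalidates
previously obtained iterators, so a hint saved before an insertion is not
safe to reuse afterwards. A minimal sketch of the safe pattern, which asks for
a fresh `c.end()` on every insertion (`insertWithFreshHints` is a hypothetical
name, not taken from the tests):

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Inserts every value using a hint that is obtained right before the call,
// so the hint can never be an iterator invalidated by an earlier rehash.
// Returns the number of elements that were actually added.
std::size_t insertWithFreshHints(std::unordered_set<int> &c,
                                 const std::vector<int> &values) {
  std::size_t inserted = 0;
  for (int v : values) {
    const std::size_t before = c.size();
    c.insert(c.end(), v); // fresh, valid hint
    inserted += c.size() - before;
  }
  return inserted;
}
```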

Arthur O'Dwyer [Tue, 20 Apr 2021 19:59:22 +0000 (15:59 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance type.

Convert to a primitive type first; then use primitive `>=` on that value.

Differential Revision: https://reviews.llvm.org/D101678
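
The idea can be sketched outside libc++ as follows. `Steps`, `advanceSketch`
and `valueAt` are hypothetical names, and the sketch only handles
bidirectional iterators; it illustrates the "convert first, compare second"
rule and is not the library implementation.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical distance type that converts to an integer.
struct Steps {
  std::ptrdiff_t value;
  operator std::ptrdiff_t() const { return value; }
};

// A hostile overload found by ADL; comparing a Steps directly would select it
// and fail to compile.
template <class T> bool operator>=(const Steps &, const T &) = delete;

// Sketch for bidirectional iterators only.
template <class BidirIt, class Distance>
void advanceSketch(BidirIt &it, Distance n) {
  using Diff = typename std::iterator_traits<BidirIt>::difference_type;
  Diff count = static_cast<Diff>(n); // convert to a primitive type first
  if (count >= 0) {                  // built-in comparison, no ADL involved
    for (; count > 0; --count)
      ++it;
  } else {
    for (; count < 0; ++count)
      --it;
  }
}

// Usage helper: value found n steps from the beginning of the list.
int valueAt(const std::list<int> &values, Steps n) {
  auto it = values.begin();
  advanceSketch(it, n);
  return *it;
}
```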

Arthur O'Dwyer [Tue, 20 Apr 2021 22:21:59 +0000 (18:21 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees.

`__debug_less` ends up running the comparator up-to-twice per comparison,
because whenever `(x < y)` it goes on to verify that `!(y < x)`.
This breaks the strict "Complexity" guarantees of algorithms like
`inplace_merge`, which we test in the test suite. So, just skip the
complexity assertions in debug mode.

Differential Revision: https://reviews.llvm.org/D101677
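
The doubling is easy to reproduce with a small stand-in. `CheckedLess` below is
a hypothetical wrapper written in the spirit of `__debug_less`, not the libc++
code, and `CountingLess` is a hypothetical comparator that counts its calls.

```cpp
#include <cstdlib>

// Hypothetical comparator that counts how often it is invoked.
struct CountingLess {
  int *calls;
  bool operator()(int a, int b) const {
    ++*calls;
    return a < b;
  }
};

// Sketch of a checking wrapper: whenever x < y holds, it also verifies
// that y < x does not, which costs a second comparator call.
template <class Compare> struct CheckedLess {
  Compare comp;
  template <class T, class U> bool operator()(const T &x, const U &y) {
    const bool less = comp(x, y);
    if (less && comp(y, x)) // second call only happens when x < y
      std::abort();         // comparator is not a valid strict weak ordering
    return less;
  }
};
```

A comparison that returns true therefore costs two calls and one that returns
false costs one, which is enough to exceed an exact comparison-count bound.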

Arthur O'Dwyer [Wed, 21 Apr 2021 01:51:41 +0000 (21:51 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.

The range of char pointers [data, data+size] is a valid closed range,
but the range [begin, end) is valid only half-open.

Differential Revision: https://reviews.llvm.org/D101676
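
A short sketch of the distinction, using hypothetical helper names: reading
the terminator through the pointer range is fine (since C++11), whereas the
iterator `end()` may only be compared against, never dereferenced or
incremented.

```cpp
#include <cstddef>
#include <string>

// Reads the terminator through the pointer range, where data() + size() is a
// valid position to read.
char terminatorOf(const std::string &s) { return s.data()[s.size()]; }

// Counts characters by walking the half-open range [begin, end); end() is
// only used as a sentinel.
std::size_t countByIteration(const std::string &s) {
  std::size_t n = 0;
  for (auto it = s.begin(); it != s.end(); ++it)
    ++n;
  return n;
}
```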

Arthur O'Dwyer [Tue, 27 Apr 2021 13:10:04 +0000 (09:10 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign.

This appears to be a bug in our string::assign: when assigning into
a longer string, from a shorter snippet of itself, we invalidate
iterators before doing the copy. We should invalidate them afterward.
Also drive-by improve the formatting of a function header.

Differential Revision: https://reviews.llvm.org/D101675

Arthur O'Dwyer [Mon, 26 Apr 2021 13:56:50 +0000 (09:56 -0400)]
[libc++] Move <__sso_allocator> out of include/ into src/. NFCI.

This allocator is not intended for libc++'s users to use;
it's strictly an implementation detail of `src/locale.cpp`.
So, move it to the `src/include/` directory.

Drive-by const-qualify its comparison operators.

For consistency with `__hidden_allocator` (defined in `src/thread.cpp`),
do *not* remove it from "libcxx/lib/libc++unexp.exp",
"libcxx/utils/symcheck-blacklists/linux_blacklist.txt", etc.

Differential Revision: https://reviews.llvm.org/D101293

Thomas Lively [Wed, 5 May 2021 20:16:55 +0000 (13:16 -0700)]
[WebAssembly] Fix constness of pointer params to load intrinsics

Update the SIMD builtin load functions to take pointers to const data and update
the intrinsics themselves to not cast away constness.

Differential Revision: https://reviews.llvm.org/D101884

Thomas Lively [Wed, 5 May 2021 20:04:04 +0000 (13:04 -0700)]
[WebAssembly] Update narrowing builtin function operand types

Make the inputs to all narrowing builtins signed, which is how they are
interpreted by the underlying instructions (only the result changes sign
between instructions).

Differential Revision: https://reviews.llvm.org/D101883
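
A scalar model of one lane makes the point concrete: both narrowing variants
read the 16-bit input as signed and differ only in the range they saturate to.
The function names are hypothetical, and the sketch models the 16-to-8-bit
case only.

```cpp
#include <algorithm>
#include <cstdint>

// Signed result: the signed input is clamped to [-128, 127].
std::int8_t narrowToI8(std::int16_t v) {
  return static_cast<std::int8_t>(std::max<int>(-128, std::min<int>(127, v)));
}

// Unsigned result: the input is still read as signed, so negative values
// clamp to 0 and large values clamp to 255.
std::uint8_t narrowToU8(std::int16_t v) {
  return static_cast<std::uint8_t>(std::max<int>(0, std::min<int>(255, v)));
}
```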

Tomasz Miąsko [Wed, 5 May 2021 19:29:03 +0000 (12:29 -0700)]
Add fuzzer for Rust demangler

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101823

Jez Ng [Wed, 5 May 2021 19:46:42 +0000 (15:46 -0400)]
[lld-macho] Try to unbreak build

Looks like the PointerUnion casting cares about const-ness...

Martin Storsjö [Mon, 5 Apr 2021 21:17:30 +0000 (00:17 +0300)]
[libcxx] [ci] Add a Windows CI configuration for a statically linked libc++

On Windows, static vs DLL linking affects details in quite a few
cases, so it's good to have coverage for both cases.

Testing with static linking also increases coverage for a number of
cases and individual checks that have had to be waived for the DLL
case, and allows testing libc++experimental, increasing the number
of test cases actually executed by 180 (176 new tests from
libc++experimental and 4 ones that are XFAIL windows-dll).

Also drop the "generic-" prefix from these configuration names, as
they perhaps don't match what the "generic" prefix originally meant
in the other generic-posix configurations.

Differential Revision: https://reviews.llvm.org/D101565

Louis Dionne [Wed, 5 May 2021 19:05:45 +0000 (15:05 -0400)]
[libc++] NFC: Remove stray semicolon in from-scratch config files

Thomas Lively [Wed, 5 May 2021 18:59:33 +0000 (11:59 -0700)]
[WebAssembly] Set alignment to 1 for SIMD memory intrinsics

The WebAssembly SIMD intrinsics in wasm_simd128.h generally try not to require
any particular alignment for memory operations to be maximally flexible. For
builtin memory access functions and their corresponding LLVM IR intrinsics,
there's no way to set the expected alignment, so the best we can do is set the
alignment to 1 in the backend. This change means that the alignment hints in the
emitted code will no longer be incorrect when users use the intrinsics to access
unaligned data.

Differential Revision: https://reviews.llvm.org/D101850
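
In portable C++, the analogue of a memory access that assumes alignment 1 is
a `memcpy`-based load or store. The helpers below are hypothetical and are not
the wasm_simd128.h implementation; they only illustrate an access that is
valid at any byte offset.

```cpp
#include <cstdint>
#include <cstring>

// Loads a 32-bit value from an address with no alignment requirement.
std::uint32_t loadU32Unaligned(const void *p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}

// Stores a 32-bit value to an address with no alignment requirement.
void storeU32Unaligned(void *p, std::uint32_t value) {
  std::memcpy(p, &value, sizeof(value));
}
```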

Jon Chesterfield [Wed, 5 May 2021 18:58:51 +0000 (19:58 +0100)]
[libomptarget] Initial documentation on amdgpu offload

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101927

Evgenii Stepanov [Wed, 5 May 2021 02:16:30 +0000 (19:16 -0700)]
[hwasan] Fix missing synchronization in AllocThread.

The problem was introduced in D100348.

It's really hard to trigger the bug in a stress test - the race is just too
narrow - but the new checks in Thread::Init should at least provide a usable
diagnostic if the problem ever returns.

Differential Revision: https://reviews.llvm.org/D101881

Jez Ng [Wed, 5 May 2021 18:40:41 +0000 (14:40 -0400)]
[lld-macho] Preliminary support for ARM_RELOC_BR24

ARM_RELOC_BR24 is used for BL/BLX instructions from within ARM (i.e. not
Thumb) code. This diff just handles the basic case: branches from ARM to
ARM, or from ARM to Thumb where no shimming is required. (See comments
in ARM.cpp for why shims are required.)

Note: I will likely be deprioritizing ARM work for the near future to
focus on other parts of LLD. Apologies for the half-done state of this;
I'm just trying to wrap up what I've already worked on.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D101814
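
For the basic ARM-to-ARM case, the relocation boils down to writing a signed
word offset into the low 24 bits of the instruction. The sketch below follows
the architectural B/BL encoding (the PC reads as the instruction address plus
8); it is not lld's code, it omits range checks and the BLX-to-Thumb case, and
`encodeArmBranch24` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: patch the imm24 field of an ARM-mode B/BL instruction
// so that it branches from insnAddr to targetAddr.
std::uint32_t encodeArmBranch24(std::uint32_t insn, std::uint64_t insnAddr,
                                std::uint64_t targetAddr) {
  // In ARM state the PC reads as the instruction address plus 8.
  const std::int64_t offset = static_cast<std::int64_t>(targetAddr) -
                              static_cast<std::int64_t>(insnAddr + 8);
  // imm24 holds the word offset (byte offset divided by 4), two's complement.
  const std::uint32_t imm24 =
      static_cast<std::uint32_t>(offset / 4) & 0x00ffffffu;
  // Keep the condition and opcode bits, replace the immediate.
  return (insn & 0xff000000u) | imm24;
}
```

For example, a `bl` at address 0x1000 that targets 0x1010 has a byte offset of
8 relative to PC+8, so its immediate field is 2.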

Jez Ng [Wed, 5 May 2021 18:38:36 +0000 (14:38 -0400)]
[lld-macho] Have --reproduce account for path rerooting

We need to account for path rerooting when generating the response
file. We could either reroot the paths before generating the file, or pass
through the original filenames and change just the syslibroot. I've opted for
the latter, in order that the reproduction run more closely mirrors the
original.

We must also be careful *not* to make an absolute path relative if it is
shadowed by a rerooted path. See repro6.tar in reroot-path.s for
details.

I've moved the call to `createResponseFile()` after the initialization of
`config->systemLibraryRoots`, since it now needs to know what those roots are.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101224
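
As background, rerooting can be sketched as follows, assuming it means "prefix
an absolute path with a system library root if the result exists". The
function below is a hypothetical stand-in rather than lld's implementation,
and the `exists` callback replaces a real file system query so the sketch
stays self-contained.

```cpp
#include <functional>
#include <string>
#include <vector>

// Returns the first existing root + path for an absolute path, or the path
// itself if it is relative or no rerooted candidate exists.
std::string rerootPath(const std::string &path,
                       const std::vector<std::string> &roots,
                       const std::function<bool(const std::string &)> &exists) {
  if (path.empty() || path[0] != '/')
    return path; // only absolute paths are rerooted
  for (const std::string &root : roots) {
    const std::string candidate = root + path;
    if (exists(candidate))
      return candidate;
  }
  return path;
}
```

Because the outcome depends on the roots, a response file that records the
original file names together with the syslibroot lets the reproduction run
repeat the same lookup.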

Harald van Dijk [Wed, 5 May 2021 18:25:34 +0000 (19:25 +0100)]
Make clangd CompletionModel not depend on directory layout.

The current code accounts for two possible layouts, but there is at
least a third supported layout: clang-tools-extra may also be checked
out as clang/tools/extra with the releases, which was not yet handled.
Rather than treating that as a special case, use the location of
CompletionModel.cmake to handle all three cases. This should address the
problems that prompted D96787 and the problems that prompted the
proposed revert D100625.

Reviewed By: usaxena95

Differential Revision: https://reviews.llvm.org/D101851

Nick Desaulniers [Wed, 5 May 2021 18:01:33 +0000 (11:01 -0700)]
[Clang] remove text extension from diag::err_drv_invalid_value_with_suggestion

This hinders translations, as per:
https://clang.llvm.org/docs/InternalsManual.html#the-format-string

Reviewed By: MaskRay, xbolva00

Differential Revision: https://reviews.llvm.org/D101387

Roman Lebedev [Wed, 5 May 2021 17:34:43 +0000 (20:34 +0300)]
[NFC][SimplifyCFG] Update documentation comments for SinkCommonCodeFromPredecessors() after 1886aad

Fangrui Song [Wed, 5 May 2021 17:26:57 +0000 (10:26 -0700)]
[llvm-objcopy][ELF] --only-keep-debug: set offset/size of segments with no sections to zero

PR50160: we currently ignore non-PT_PHDR segments with no sections, not
accounting for their p_offset and p_filesz: this can cause an out-of-bounds write
in `writeSegmentData` if p_offset+p_filesz is larger than the total file
size.

This can be fixed by setting p_offset=p_filesz=0. The logic nicely unifies with
the logic added in D90897.

Reviewed By: jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D101560
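
The failure mode and the fix can be modelled with a toy segment record. The
types and function names below are hypothetical, and the sketch leaves out the
PT_PHDR exception mentioned above.

```cpp
#include <cstdint>

// Toy stand-in for a program header: file offset, size in the file, and
// whether any section is mapped into the segment.
struct SegmentSketch {
  std::uint64_t offset;
  std::uint64_t fileSize;
  bool hasSections;
};

// True if writing the segment's data stays within the output file.
bool fitsInFile(const SegmentSketch &seg, std::uint64_t totalFileSize) {
  return seg.offset <= totalFileSize &&
         seg.fileSize <= totalFileSize - seg.offset;
}

// The fix in miniature: a segment without sections gets offset = size = 0.
void zeroEmptySegment(SegmentSketch &seg) {
  if (!seg.hasSections) {
    seg.offset = 0;
    seg.fileSize = 0;
  }
}
```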

Saleem Abdulrasool [Wed, 5 May 2021 17:15:14 +0000 (10:15 -0700)]
RISCV: clang-format RISC-V AsmParser (NFC)

This corrects a few issues identified by `clang-format`.  This is meant
to be preparation for a subsequent change.

Roman Lebedev [Wed, 5 May 2021 16:21:22 +0000 (19:21 +0300)]
[NFC][X86][CostModel] Add tests for byteswap intrinsic

Philipp Krones [Wed, 5 May 2021 17:03:02 +0000 (10:03 -0700)]
[MC] Untangle MCContext and MCObjectFileInfo

This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: you can't construct a MOFI without an MCContext,
which in turn has to be constructed with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.

This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:

- TargetTriple
- ObjectFileType
- SubtargetInfo

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101462
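
The shape of the change can be shown with two toy classes: both objects are
constructed independently and only linked afterwards. All names below are
hypothetical stand-ins and do not reproduce the MC API.

```cpp
struct ObjectFileInfoSketch;

// Stand-in for the context: it stores a pointer to the object file info but
// does not need one in order to be constructed.
struct ContextSketch {
  const ObjectFileInfoSketch *objectFileInfo = nullptr;
  void setObjectFileInfo(const ObjectFileInfoSketch *info) {
    objectFileInfo = info;
  }
};

// Stand-in for the object file info: it is initialized from a context that
// already exists.
struct ObjectFileInfoSketch {
  ContextSketch *context = nullptr;
  void init(ContextSketch &ctx) { context = &ctx; }
};

// Wires the two objects together without constructing either with a dummy.
void wire(ContextSketch &ctx, ObjectFileInfoSketch &info) {
  info.init(ctx);
  ctx.setObjectFileInfo(&info);
}
```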

Philip Reames [Wed, 5 May 2021 16:55:09 +0000 (09:55 -0700)]
[LV] Workaround PR49900 (a crash due to analyzing partially mutated IR)

LoopVectorize has a fairly deeply baked in design problem where it will try to query analysis (primarily SCEV, but also ValueTracking) in the midst of mutating IR. In particular, the intermediate IR state does not represent the semantics of the original (or final) program.

Fixing this for real is hard, but all of the cases seen so far share a common symptom. In cases seen to date, the analysis being queried is the computation of the original loop's trip count. We can fix this particular instance of the issue by simply computing the trip count early, and caching it.

I want to be really clear that this is nothing but a workaround. It does nothing to fix the root issue, and at best, delays the time until we have to fix this for real. Florian and I have discussed an eventual solution in the review comments for https://reviews.llvm.org/D100663, but it's a lot of work.

Test taken from https://reviews.llvm.org/D100663.

Differential Revision: https://reviews.llvm.org/D101487

Javier Setoain [Mon, 19 Apr 2021 14:37:29 +0000 (15:37 +0100)]
[mlir][ArmSVE] Add masked arithmetic operations

These instructions map to SVE-specific intrinsics that accept a
predicate operand to support control flow in vector code.

Differential Revision: https://reviews.llvm.org/D100982

Nico Weber [Wed, 5 May 2021 16:21:54 +0000 (12:21 -0400)]
[clang] remove an incremental build workaround

This cleaned up an oversight over a year ago. Should no longer be needed.

Stanislav Mekhanoshin [Wed, 5 May 2021 16:08:14 +0000 (09:08 -0700)]
[AMDGPU] Pre-commit 2 new saddr load tests. NFC.

Sergei Grechanik [Wed, 5 May 2021 15:33:19 +0000 (08:33 -0700)]
[mlir][Affine][Vector] Support vectorizing reduction loops

This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Compared to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
  using `vector.reduce`.
- In some cases a mask is applied to the vector yielded at the end of
  the loop to prevent garbage values from being written to the
  accumulator.

Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to arrays of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.

Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D100694
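
A scalar C++ model of what such a vectorized sum reduction computes, assuming
a vector width of 4 and addition as the reduction: each lane keeps its own
partial sum, lanes past the end of the data are masked to the neutral element,
and the lanes are combined into a scalar after the loop. The function name and
the width are illustrative, not taken from the patch.

```cpp
#include <array>
#include <cstddef>
#include <vector>

float sumWithVectorLanes(const std::vector<float> &data) {
  constexpr std::size_t kLanes = 4; // illustrative vector width
  std::array<float, kLanes> acc{};  // vector accumulator, starts at 0
  for (std::size_t base = 0; base < data.size(); base += kLanes) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      const std::size_t i = base + lane;
      // Mask: lanes past the end contribute the neutral element 0 instead
      // of a garbage value.
      acc[lane] += (i < data.size()) ? data[i] : 0.0f;
    }
  }
  // Final horizontal reduction of the vector accumulator to a scalar.
  float result = 0.0f;
  for (float v : acc)
    result += v;
  return result;
}
```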

Jinsong Ji [Wed, 5 May 2021 16:10:28 +0000 (16:10 +0000)]
[clang][Driver] Add -fintegrated-as to debug-pass-structure test

CGProfilePass is not always on; it is disabled when using
non-integrated assemblers.

  // Only enable CGProfilePass when using integrated assembler, since
  // non-integrated assemblers don't recognize .cgprofile section.
  PMBuilder.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;

Add -fintegrated-as to make sure the output doesn't rely on the platform default.

Reviewed By: evgeny777

Differential Revision: https://reviews.llvm.org/D101918

Sushma Unnibhavi [Wed, 5 May 2021 16:07:22 +0000 (21:37 +0530)]
Added a faster method to clone llvm project [DOCS]

Reviewed By: xgupta, amccarth

Differential Revision: https://reviews.llvm.org/D101433

Pooja Yadav [Wed, 5 May 2021 15:57:24 +0000 (21:27 +0530)]
[docs] Update the llvm/example section

Added details about the llvm/example section.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D101284

Jessica Clarke [Wed, 5 May 2021 16:00:40 +0000 (17:00 +0100)]
Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"

This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9dedf00b24a96b8781e3b39d5282c43e91.

Guillaume Chatelet [Wed, 5 May 2021 15:52:42 +0000 (15:52 +0000)]
[libc] Normalize LIBC_TARGET_MACHINE

The current implementation defines LIBC_TARGET_MACHINE using CMAKE_SYSTEM_PROCESSOR.
Unfortunately, CMAKE_SYSTEM_PROCESSOR is OS-dependent and can produce different results.
Evidence of this is the variety of matchers used to detect whether the architecture is x86.

This patch normalizes LIBC_TARGET_MACHINE and renames it LIBC_TARGET_ARCHITECTURE.
I've added many architectures but we may want to limit ourselves to x86 and ARM.

Differential Revision: https://reviews.llvm.org/D101524
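
The kind of normalization meant here can be sketched as a small lookup. The
spellings on the left are values that CMAKE_SYSTEM_PROCESSOR is known to take
on different systems (for example `AMD64` on Windows and `x86_64` on Linux);
the canonical names on the right and the function name are assumptions made
for illustration, not the table from the patch.

```cpp
#include <string>

// Hypothetical normalization of an OS-dependent processor name.
std::string normalizeArchitecture(const std::string &processor) {
  if (processor == "x86_64" || processor == "AMD64" || processor == "amd64")
    return "x86_64";
  if (processor == "aarch64" || processor == "arm64" || processor == "ARM64")
    return "aarch64";
  return processor; // unknown names are passed through unchanged
}
```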

Fraser Cormack [Wed, 5 May 2021 15:40:28 +0000 (16:40 +0100)]
[RISCV][NFC] Fix up pseudoinstruction name in comment

Jessica Clarke [Wed, 5 May 2021 14:32:34 +0000 (15:32 +0100)]
[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics

Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.
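
As a rough illustration of the reasoning above (a toy model, not the SelectionDAG code): once a narrow load is known to be zero-extended, the upper bits are known zero, and once it is known to be sign-extended, the number of sign bits is known. The helper names below are hypothetical.

```python
def known_zero_after_zext(mem_bits: int, reg_bits: int) -> int:
    """Mask of bits known to be zero after zero-extending mem_bits to reg_bits."""
    full = (1 << reg_bits) - 1
    low = (1 << mem_bits) - 1
    return full & ~low


def sign_bits_after_sext(mem_bits: int, reg_bits: int) -> int:
    """Number of identical top bits after sign-extending mem_bits to reg_bits."""
    return reg_bits - mem_bits + 1


def and_is_redundant(mask: int, known_zero: int, reg_bits: int) -> bool:
    """`x & mask` is a no-op if every bit the mask clears is already known zero."""
    full = (1 << reg_bits) - 1
    cleared = full & ~mask
    return (cleared & ~known_zero) == 0


# An i8 atomic load zero-extended into an i32 register makes `and x, 0xff` redundant.
kz = known_zero_after_zext(8, 32)
print(hex(kz), and_is_redundant(0xFF, kz, 32), sign_bits_after_sext(32, 64))
```

The same check with a narrower mask such as 0x0f is not redundant, because it clears bits that are not known to be zero.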

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342

3 years ago[GlobalISel] Fix buildZExtInReg creating new register.
Vang Thao [Tue, 4 May 2021 23:43:07 +0000 (16:43 -0700)]
[GlobalISel] Fix buildZExtInReg creating new register.

Fix a bug where buildZExtInReg will create and use a new register instead of using the register from parameter DstOp Res.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D101871

3 years ago[InstCombine] improve readability; NFC
Sanjay Patel [Wed, 5 May 2021 14:29:52 +0000 (10:29 -0400)]
[InstCombine] improve readability; NFC

3 years ago[MIPS][MSA] Regenerate immediates tests. NFCI.
Simon Pilgrim [Wed, 5 May 2021 14:55:53 +0000 (15:55 +0100)]
[MIPS][MSA] Regenerate immediates tests. NFCI.

Simplifies an upcoming patch diff

3 years ago[MIPS][MSA] Regenerate i5-b tests. NFCI.
Simon Pilgrim [Wed, 5 May 2021 14:52:44 +0000 (15:52 +0100)]
[MIPS][MSA] Regenerate i5-b tests. NFCI.

Simplifies an upcoming patch diff

3 years ago[MIPS][MSA] Regenerate bitwise tests. NFCI.
Simon Pilgrim [Wed, 5 May 2021 14:52:03 +0000 (15:52 +0100)]
[MIPS][MSA] Regenerate bitwise tests. NFCI.

Simplifies an upcoming patch diff

3 years ago[AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks
Baptiste Saleil [Wed, 5 May 2021 14:56:40 +0000 (10:56 -0400)]
[AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks

3 years ago[mlir][linalg] Fix bug in the fusion on tensors index op handling.
Tobias Gysi [Wed, 5 May 2021 13:58:57 +0000 (13:58 +0000)]
[mlir][linalg] Fix bug in the fusion on tensors index op handling.

The old index op handling let the new index operations point back to the
producer block. As a result, after fusion some index operations in the
fused block had back references to the old producer block resulting in
illegal IR. The patch now relies on a block and value mapping to avoid
such back references.

Differential Revision: https://reviews.llvm.org/D101887

3 years ago[AMDGPU][OpenMP] Fix clang driver crash when provided -c
Pushpinder Singh [Wed, 5 May 2021 12:02:25 +0000 (12:02 +0000)]
[AMDGPU][OpenMP] Fix clang driver crash when provided -c

The offload action is used in four different ways as explained
in Driver.cpp:4495. When -c is present, the final phase will be
assemble (linker when -c is not present). However, this phase
is skipped according to D96769 for amdgcn. So the offload action
ends up in the following situation:

 compile (device) ---> offload ---> offload

Without -c, the chain looks like:

 compile (device) ---> offload ---> linker (device) ---> offload

The former situation creates an unhandled case, which causes the
crash. The solution presented in this patch delays the D96769
logic until job creation time. This keeps the offload action
in one of the four specified situations.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D101901

3 years ago[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM
Anirudh Prasad [Wed, 5 May 2021 14:21:27 +0000 (10:21 -0400)]
[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM

- As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information)
- This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set
- This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings`

Reviewed By: abhina.sreeskantharajan, Kai

Differential Revision: https://reviews.llvm.org/D101660

3 years ago[AArch64] Fix for the pre-indexed paired load/store optimization.
Stelios Ioannou [Wed, 5 May 2021 10:02:33 +0000 (11:02 +0100)]
[AArch64] Fix for the pre-indexed paired load/store optimization.

This patch fixes an issue where a pre-indexed store, e.g.
STR x1, [x0, #24]!, and a store like STR x0, [x0, #8] are
merged into a single store: STP x1, x0, [x0, #24]!.
They shouldn’t be merged because the second store uses
x0 as both the stored value and the address, so it needs to use the updated x0.
Therefore, it should not be folded into an STP <>pre.
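
A toy model of why the merge is wrong, using plain Python integers for registers and a dict for memory. The merged form is modelled under the assumption that the stored registers are read before the base write-back; real hardware may behave differently when the base register is also a stored register.

```python
def separate_stores(x0: int, x1: int):
    """STR x1, [x0, #24]!  followed by  STR x0, [x0, #8]."""
    mem = {}
    x0 = x0 + 24          # pre-index: base is updated first
    mem[x0] = x1          # store x1 at the updated base
    mem[x0 + 8] = x0      # second store writes the UPDATED x0
    return x0, mem


def merged_pair(x0: int, x1: int):
    """STP x1, x0, [x0, #24]!  (assumed: sources read before write-back)."""
    mem = {}
    addr = x0 + 24
    mem[addr] = x1
    mem[addr + 8] = x0    # writes the OLD x0 under this assumption
    x0 = addr             # write-back of the base register
    return x0, mem


print(separate_stores(100, 7))
print(merged_pair(100, 7))
```

Both sequences leave x0 at the same value and store x1 to the same slot, but the second slot differs: the separate stores write the updated base, while the modelled pair writes the original one.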

Additionally a new test case is added to verify this fix.

Differential Revision: https://reviews.llvm.org/D101888

Change-Id: I26f1985ac84e970961e2cdca23c590fa6773851a

3 years ago[OpenCL] Add clang extension for non-portable kernel parameters.
Anastasia Stulova [Wed, 5 May 2021 12:18:00 +0000 (13:18 +0100)]
[OpenCL] Add clang extension for non-portable kernel parameters.

Added __cl_clang_non_portable_kernel_param_types extension that
allows using non-portable types as kernel parameters. This allows
bypassing the portability guarantees from the restrictions specified
in C++ for OpenCL v1.0 s2.4.

Currently this only disables the restrictions related to the data
layout. The programmer should ensure the compiler generates the same
layout for host and device or otherwise the argument should only be
accessed on the device side. This extension could be extended to other
cases (e.g. permitting size_t) if desired in the future.

Patch by olestrohm (Ole Strohm)!

https://reviews.llvm.org/D101168

3 years ago[DebugInfo][test][MIPS] Use mtriple in tests
Jinsong Ji [Wed, 5 May 2021 13:51:02 +0000 (13:51 +0000)]
[DebugInfo][test][MIPS] Use mtriple in tests

Mips tests use -march in RUN lines;
this fails on AIX, where we get the mips-ibm-aix triple.

This was exposed recently by https://reviews.llvm.org/D101194, which changed the default getMultiarchTriple in the toolchain.

Update the tests to use -mtriple instead to avoid unintended failures.

Reviewed By: atanasyan

Differential Revision: https://reviews.llvm.org/D101863

3 years ago[SystemZ][z/OS] Fix return values in AutoConversion functions
Abhina Sreeskantharajan [Wed, 5 May 2021 13:41:45 +0000 (09:41 -0400)]
[SystemZ][z/OS] Fix return values in AutoConversion functions

My previous patch https://reviews.llvm.org/rG1527a5e4b4834e65678f9c30f786a2f4c17932bf incorrectly set int return values instead of std::error_code. This patch correctly returns an std::error_code value.

Reviewed By: fanbo-meng, Jonathan.Crowther

Differential Revision: https://reviews.llvm.org/D101904

3 years ago[AArch64] Fix scalar imm variants of SIMD shift left instructions
Andrew Savonichev [Thu, 29 Apr 2021 16:34:39 +0000 (19:34 +0300)]
[AArch64] Fix scalar imm variants of SIMD shift left instructions

This issue was reported in PR50057: Cannot select:
t10: i64 = AArch64ISD::VSHL t2, Constant:i32<2>

Shift intrinsics (llvm.aarch64.neon.ushl.i64 and sshl) with a constant
shift operand are lowered into AArch64ISD::VSHL in tryCombineShiftImm.
VSHL has i64 and v1i64 patterns for a right shift, but only v1i64 for
a left shift.

This patch adds the missing i64 pattern for AArch64ISD::VSHL, and LIT
tests to cover scalar variants (i64 and v1i64) of all shift
intrinsics (only ushl and sshl cases fail without the patch, others
were just not covered).

Differential Revision: https://reviews.llvm.org/D101580

3 years agoMake dependency between certain analysis passes transitive (reapply)
Bjorn Pettersson [Tue, 4 May 2021 17:08:58 +0000 (19:08 +0200)]
Make dependency between certain analysis passes transitive (reapply)

LazyBlockFrequencyInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nested required
analysis passes. One needs to use addRequiredTransitive to describe
that the nested passes can't be freed until those analysis passes
are no longer used themselves.

There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is an
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.

The intention of the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
needs five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.

This is a reapply of commit 3655f0757f2b4b, that was reverted in
33ff3c20498ef5c2057 due to problems with assertions in the polly
lit tests. That problem is supposed to be solved by also adjusting
ScopPass to explicitly preserve LazyBlockFrequencyInfo and
LazyBranchProbabilityInfo (it already preserved
OptimizationRemarkEmitter which depends on those lazy passes).

Differential Revision: https://reviews.llvm.org/D100958

3 years ago[X86][SSE] Move unpack(hop,hop) fold from foldShuffleOfHorizOp to combineTargetShuffle
Simon Pilgrim [Wed, 5 May 2021 11:21:30 +0000 (12:21 +0100)]
[X86][SSE] Move unpack(hop,hop) fold from foldShuffleOfHorizOp to combineTargetShuffle

By moving this after more of the shuffle canonicalization we reduce the demanded vector elts, avoiding a few unnecessary copies/moves etc.

3 years agoRevert "[Passes] Enable the relative lookup table converter pass on aarch64"
Martin Storsjö [Wed, 5 May 2021 12:23:14 +0000 (15:23 +0300)]
Revert "[Passes] Enable the relative lookup table converter pass on aarch64"

This reverts commit 57b259a852a6383880f5d0875d848420bb3c2945.

The relative lookup table converter pass seems to cause problems
for chromium on Windows/ARM64, see https://crbug.com/1204788.

3 years ago[RISCV][VP][NFC] Add tests for VP_SREM and VP_UREM
Fraser Cormack [Wed, 5 May 2021 12:11:11 +0000 (13:11 +0100)]
[RISCV][VP][NFC] Add tests for VP_SREM and VP_UREM

As agreed in D101826, these are follow-up tests for the RISC-V VP
support.

3 years ago[AMDGPU] Autogenerate checks for a clustering test and add GFX10
Jay Foad [Wed, 5 May 2021 12:05:38 +0000 (13:05 +0100)]
[AMDGPU] Autogenerate checks for a clustering test and add GFX10

3 years ago[RISCV][VP][NFC] Add tests for VP_MUL and VP_[US]DIV
Fraser Cormack [Wed, 5 May 2021 12:08:11 +0000 (13:08 +0100)]
[RISCV][VP][NFC] Add tests for VP_MUL and VP_[US]DIV

As agreed in D101826, these are follow-up tests for the RISC-V VP
support.

3 years ago[X86]Fix a crash trying to convert indices to proper type.
Alexey Bataev [Tue, 4 May 2021 14:48:06 +0000 (07:48 -0700)]
[X86]Fix a crash trying to convert indices to proper type.

Need to perform a bitcast on IndicesVec rather than subvector extract
if the original size of the IndicesVec is the same as the size of the
destination type.

Differential Revision: https://reviews.llvm.org/D101838

3 years ago[MLIR] Rename free function `verify` on OffsetSizeAndStrideOpInterface
Uday Bondhugula [Wed, 5 May 2021 07:47:33 +0000 (13:17 +0530)]
[MLIR] Rename free function `verify` on OffsetSizeAndStrideOpInterface

Using a free function verify(<Op>) is error prone. Rename it.

Differential Revision: https://reviews.llvm.org/D101886

3 years ago[RISCV][VP][NFC] Add tests for VP_SHL and VP_LSHR
Fraser Cormack [Wed, 5 May 2021 12:01:04 +0000 (13:01 +0100)]
[RISCV][VP][NFC] Add tests for VP_SHL and VP_LSHR

As agreed in D101826, these are follow-up tests for the RISC-V VP
support. Tests for VP_ASHR were landed as part of D101826.

3 years ago[RISCV][VP][NFC] Add tests for VP_AND, VP_XOR, VP_OR
Fraser Cormack [Wed, 5 May 2021 11:56:16 +0000 (12:56 +0100)]
[RISCV][VP][NFC] Add tests for VP_AND, VP_XOR, VP_OR

As agreed in D101826, these are follow-up tests for the RISC-V VP
support.

3 years ago[RISCV][VP] Lower VP ISD nodes to RVV instructions
Fraser Cormack [Thu, 29 Apr 2021 15:58:56 +0000 (16:58 +0100)]
[RISCV][VP] Lower VP ISD nodes to RVV instructions

This patch supports all of the current set of VP integer binary
intrinsics by lowering them to RVV instructions. It does so by using
the existing RISCVISD *_VL custom nodes as an intermediate layer. Both
scalable and fixed-length vectors are supported by using this method.
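
For readers unfamiliar with VP intrinsics, here is a minimal semantic sketch, assuming the usual vector-predication shape of a binary operation that takes a per-lane mask and an explicit vector length (EVL). The function name is hypothetical, lanes are modelled as unbounded Python integers, and disabled lanes are shown as None to stand for an unspecified result.

```python
from operator import add, mul


def vp_binop(op, a, b, mask, evl):
    """Apply op lane-wise where the mask is set and the lane index is below evl."""
    assert len(a) == len(b) == len(mask)
    return [
        op(x, y) if (m and i < evl) else None  # None: lane result is unspecified
        for i, (x, y, m) in enumerate(zip(a, b, mask))
    ]


print(vp_binop(add, [1, 2, 3, 4], [10, 20, 30, 40], [True, False, True, True], 3))
print(vp_binop(mul, [1, 2, 3, 4], [10, 20, 30, 40], [True, True, True, True], 2))
```

An all-ones mask with EVL equal to the vector length reduces this to an ordinary element-wise operation.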

One notable change to the existing vector codegen strategy is that
scalable all-ones and all-zeros mask SPLAT_VECTORs are now lowered to
RISCVISD VMSET_VL and VMCLR_VL nodes to match their fixed-length
BUILD_VECTOR counterparts. This allows them to reuse the existing
"all-ones" VL patterns.

To reduce the size of the phabricator diff, some tests are intentionally
left out and will be added later if the patch is accepted.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D101826

3 years ago[mlir] Use ReassociationIndices instead of affine maps in linalg.reshape.
Alexander Belyaev [Wed, 5 May 2021 09:02:24 +0000 (11:02 +0200)]
[mlir] Use ReassociationIndices instead of affine maps in linalg.reshape.

Differential Revision: https://reviews.llvm.org/D101861

3 years ago [DOCS] Added example for G_EXTRACT and G_INSERT
Sushma Unnibhavi [Wed, 5 May 2021 10:11:23 +0000 (15:41 +0530)]
 [DOCS] Added example for G_EXTRACT and G_INSERT

Reviewed By: xgupta, gargaroff

Differential Revision: https://reviews.llvm.org/D101227

3 years agoRequire asserts for clang/test/Headers/wasm.c
Hans Wennborg [Wed, 5 May 2021 09:42:16 +0000 (11:42 +0200)]
Require asserts for clang/test/Headers/wasm.c

The test doesn't pass in no-asserts builds, see comment on
https://reviews.llvm.org/D101805

3 years ago[RISCV] Cap legal fixed-length vectors to 256-element types
Fraser Cormack [Tue, 4 May 2021 14:18:28 +0000 (15:18 +0100)]
[RISCV] Cap legal fixed-length vectors to 256-element types

Previously, RISC-V would make legal all fixed-length vector types whose
sizes are less than or equal to some function of the minimum value of
VLEN and the maximum-permissible LMUL grouping.

Due to vector legalization issues, this patch instead caps the legal
fixed-length vector types to those with 256 elements. This value was
chosen because it is the longest vector length which has corresponding
MVTs across all supported element types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D101839

3 years ago[AMDGPU] Select V_CVT_*16_F16 more often
Julien Pagès [Wed, 5 May 2021 07:53:59 +0000 (08:53 +0100)]
[AMDGPU] Select V_CVT_*16_F16 more often

Improve the code generation of fp_to_sint
and fp_to_uint for 16-bit integers.

Differential Revision: https://reviews.llvm.org/D101481

Patch by Julien Pagès!

3 years ago[mlir][ArmSVE] Add basic arithmetic operations
Javier Setoain [Wed, 5 May 2021 07:38:50 +0000 (09:38 +0200)]
[mlir][ArmSVE] Add basic arithmetic operations

While we figure out how to best add Standard support for scalable
vectors, these instructions provide a workaround for basic arithmetic
between scalable vectors.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D100837

3 years ago[llvm-objdump] Add -M {att,intel} & deprecate --x86-asm-syntax={att,intel}
Fangrui Song [Wed, 5 May 2021 07:20:41 +0000 (00:20 -0700)]
[llvm-objdump] Add -M {att,intel} & deprecate --x86-asm-syntax={att,intel}

The internal `cl::opt` option --x86-asm-syntax sets the AsmParser and AsmWriter
dialect. The option is used by llc and llvm-mc tests to set the AsmWriter dialect.

This patch adds -M {att,intel} as GNU objdump compatible aliases (PR43413).

Note: the dialect is initialized when the MCAsmInfo is constructed.
`MCInstPrinter::applyTargetSpecificCLOption` is called too late and its MCAsmInfo
reference is const, so changing the `cl::opt` in
`MCInstPrinter::applyTargetSpecificCLOption` is not an option, at least without
a large amount of refactoring.

Reviewed By: hoy, jhenderson, thakis

Differential Revision: https://reviews.llvm.org/D101695

3 years ago[clang][TargetCXXABI] Fix -Wreturn-type warning (NFC)
Yang Fan [Wed, 5 May 2021 06:44:48 +0000 (14:44 +0800)]
[clang][TargetCXXABI] Fix -Wreturn-type warning (NFC)

GCC warning:
```
In file included from /llvm-project/clang/include/clang/Basic/LangOptions.h:22,
                 from /llvm-project/clang/include/clang/Frontend/CompilerInvocation.h:16,
                 from /llvm-project/clang/lib/Frontend/CompilerInvocation.cpp:9:
/llvm-project/clang/include/clang/Basic/TargetCXXABI.h: In static member function ‘static bool clang::TargetCXXABI::isSupportedCXXABI(const llvm::Triple&, clang::TargetCXXABI::Kind)’:
/llvm-project/clang/include/clang/Basic/TargetCXXABI.h:114:3: warning: control reaches end of non-void function [-Wreturn-type]
  114 |   };
      |   ^
```

3 years ago[lldb/Test] Disable testBreakpointByLineAndColumnNearestCode on Windows
Med Ismail Bennani [Wed, 5 May 2021 06:01:50 +0000 (06:01 +0000)]
[lldb/Test] Disable testBreakpointByLineAndColumnNearestCode on Windows

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

3 years ago[dfsan] Turn off all dfsan test cases on non x86_64 OSs
Jianzhou Zhao [Wed, 5 May 2021 05:30:53 +0000 (05:30 +0000)]
[dfsan] Turn off all dfsan test cases on non x86_64 OSs

https://reviews.llvm.org/D101666 enables sanitizer allocator.
This broke all test cases on non x86-64.

3 years ago[lldb/Symbol] Fix column breakpoint `move_to_nearest_code` match
Med Ismail Bennani [Wed, 5 May 2021 04:28:28 +0000 (04:28 +0000)]
[lldb/Symbol] Fix column breakpoint `move_to_nearest_code` match

This patch fixes the column symbol resolution when creating a breakpoint
with the `move_to_nearest_code` flag set.

In order to achieve this, the patch adds column information handling in
the `LineTable`'s `LineEntry` finder. After experimenting a little, it
turns out the most natural approach in case of an inaccurate column match,
is to move backward and match the previous `LineEntry` rather than going
forward like we do with simple line breakpoints.
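
The "move backward" choice can be sketched as follows. This is a simplified model, not lldb's implementation: the entries on one source line are reduced to a sorted list of start columns, and the fallback for a request that precedes every entry is a hypothetical choice made for the example.

```python
from bisect import bisect_right


def pick_column_entry(columns, requested):
    """Return the start column of the entry chosen for `requested`.

    columns: sorted start columns of the line-table entries on one line.
    An inexact request snaps BACKWARD to the closest preceding entry.
    """
    i = bisect_right(columns, requested)
    if i == 0:
        # Nothing at or before the requested column; fall back to the
        # first entry, if any (assumption made for this sketch only).
        return columns[0] if columns else None
    return columns[i - 1]


print(pick_column_entry([4, 12, 20], 15))  # inexact: snaps back to 12, not forward to 20
print(pick_column_entry([4, 12, 20], 12))  # exact match
```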

The patch also reflows the function to reduce code duplication.

Finally, it updates the `BreakpointResolver` heuristic to align it with
the `LineTable` method.

rdar://73218201

Differential Revision: https://reviews.llvm.org/D101221

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

3 years ago[gn build] Port 600686d75f55
LLVM GN Syncbot [Wed, 5 May 2021 04:57:55 +0000 (04:57 +0000)]
[gn build] Port 600686d75f55

3 years agoFix typo, arvm7 -> armv7
Brad Smith [Wed, 5 May 2021 04:55:36 +0000 (00:55 -0400)]
Fix typo, arvm7 -> armv7

3 years ago[libcxx][ranges] Add ranges::ssize CPO.
zoecarver [Fri, 23 Apr 2021 18:23:22 +0000 (11:23 -0700)]
[libcxx][ranges] Add ranges::ssize CPO.

Based on D101079.

Differential Revision: https://reviews.llvm.org/D101189

3 years ago[libcxx][ranges] Add ranges::size CPO.
zoecarver [Thu, 22 Apr 2021 16:29:02 +0000 (09:29 -0700)]
[libcxx][ranges] Add ranges::size CPO.

The beginning of [range.prim].

Differential Revision: https://reviews.llvm.org/D101079

3 years ago[InstCombine] Fold more select of selects using isImpliedCondition
Juneyoung Lee [Tue, 4 May 2021 01:16:21 +0000 (10:16 +0900)]
[InstCombine] Fold more select of selects using isImpliedCondition

This is a simple folding that does these:

```
select x_inv, true, (select y, x, false)
=>
select x_inv, true, y
```
https://alive2.llvm.org/ce/z/-STJ2d

```
select (select y, x, false), true, x_inv
=>
select y, true, x_inv
```
https://alive2.llvm.org/ce/z/6ruYt6
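
As a quick sanity check of the two folds, here is a brute-force truth table over plain booleans. It assumes the simplest implied-condition case, where x_inv is the logical negation of x, and it ignores poison and undef, which the Alive2 links above cover.

```python
from itertools import product


def select(cond, if_true, if_false):
    """Boolean model of an IR select."""
    return if_true if cond else if_false


def fold1_equal(x, x_inv, y):
    # select x_inv, true, (select y, x, false)  =>  select x_inv, true, y
    return select(x_inv, True, select(y, x, False)) == select(x_inv, True, y)


def fold2_equal(x, x_inv, y):
    # select (select y, x, false), true, x_inv  =>  select y, true, x_inv
    return select(select(y, x, False), True, x_inv) == select(y, True, x_inv)


bools = [False, True]
print(all(fold1_equal(x, not x, y) for x, y in product(bools, repeat=2)))
print(all(fold2_equal(x, not x, y) for x, y in product(bools, repeat=2)))
# With an unrelated x_inv the first fold no longer holds:
print(fold1_equal(False, False, True))
```

The last line shows why the relationship between x and x_inv matters: with x and x_inv both false and y true, the original expression is false but the folded one is true.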

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D101807

3 years ago[InstCombine] Precommit tests for D101807 (NFC)
Juneyoung Lee [Tue, 4 May 2021 01:14:02 +0000 (10:14 +0900)]
[InstCombine] Precommit tests for D101807 (NFC)

3 years ago[libcxx][ranges] Add `random_access_{iterator,range}`.
zoecarver [Sat, 24 Apr 2021 00:11:44 +0000 (17:11 -0700)]
[libcxx][ranges] Add `random_access_{iterator,range}`.

Differential Revision: https://reviews.llvm.org/D101316

3 years ago[MLIR][SCF] Combine adjacent scf.if with same condition
William S. Moses [Mon, 3 May 2021 23:20:10 +0000 (19:20 -0400)]
[MLIR][SCF] Combine adjacent scf.if with same condition

Differential Revision: https://reviews.llvm.org/D101798