platform/upstream/llvm.git
22 months ago[BOLT] Track fragment info for all split fragments
Fabian Parzefall [Wed, 24 Aug 2022 17:03:01 +0000 (10:03 -0700)]
[BOLT] Track fragment info for all split fragments

To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132051

22 months ago[BOLT] Allocate FunctionFragment on heap
Fabian Parzefall [Wed, 24 Aug 2022 17:02:35 +0000 (10:02 -0700)]
[BOLT] Allocate FunctionFragment on heap

This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132050

22 months ago[BOLT] Towards FunctionLayout const-correctness
Fabian Parzefall [Wed, 24 Aug 2022 17:02:16 +0000 (10:02 -0700)]
[BOLT] Towards FunctionLayout const-correctness

A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132049

22 months ago[mlir] Support llvm.readnone attribute for all FunctionOpInterface ops.
Slava Zakharin [Tue, 9 Aug 2022 00:47:05 +0000 (17:47 -0700)]
[mlir] Support llvm.readnone attribute for all FunctionOpInterface ops.

The attribute is translated into LLVM's function attribute 'readnone'.
There is no explicit verification regarding conflicting 'readnone'
and function attributes from 'passthrough'; LLVM would, however, assert
if they are incompatible during LLVM IR creation.

Differential Revision: https://reviews.llvm.org/D131457

22 months ago[LV] Support predicated div/rem operations via safe-divisor select idiom
Philip Reames [Wed, 24 Aug 2022 16:48:01 +0000 (09:48 -0700)]
[LV] Support predicated div/rem operations via safe-divisor select idiom

This patch adds support for vectorizing conditionally executed div/rem operations via a variant of widening. The existing support for predicated divrem in the vectorizer requires scalarization, which we can't do for scalable vectors.

The basic idea is that we can always divide (take remainder) by 1 without executing UB. As such, we can use the active lane mask to conditionally select either the actual divisor for active lanes, or a constant one for inactive lanes. We already account for the cost of the active lane mask, so the only additional cost is a splat of one and the vector select. This is one of several possible approaches to this problem; see the review thread for discussion on some of the others. This one was chosen mostly because it was straightforward, and none of the others seemed obviously better.

I enabled the new code only for scalable vectors. We could legally enable it for fixed vectors as well, but I haven't thought through the cost tradeoffs between widening and scalarization enough to know if that's profitable. This will be explored in future patches.
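
The safe-divisor idea can be modelled lane by lane in plain C++. This is only a sketch of the idiom, not the vectorizer's code: `predicatedDiv` and `inactiveValue` are hypothetical names, and a `std::vector` stands in for a hardware vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lane-by-lane model of a predicated vector division.
// Inactive lanes divide by 1 instead of their real divisor, so the division
// can run unconditionally without a lane dividing by zero.
std::vector<int> predicatedDiv(const std::vector<int> &num,
                               const std::vector<int> &den,
                               const std::vector<bool> &mask,
                               int inactiveValue) {
  assert(num.size() == den.size() && num.size() == mask.size());
  std::vector<int> result(num.size());
  for (std::size_t lane = 0; lane < num.size(); ++lane) {
    // select(mask, divisor, 1): the "safe divisor".
    int safeDivisor = mask[lane] ? den[lane] : 1;
    // Executed on every lane, active or not.
    int quotient = num[lane] / safeDivisor;
    // Inactive lanes discard the quotient (hypothetical placeholder value).
    result[lane] = mask[lane] ? quotient : inactiveValue;
  }
  return result;
}
```

Lanes 1 and 3 below have a zero divisor but are masked off, so calling `predicatedDiv({10, 7, 9, 4}, {2, 0, 3, 0}, {true, false, true, false}, -1)` never divides by zero.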

Differential Revision: https://reviews.llvm.org/D130164

22 months ago[VPlan] Remove unneeded `struct` prefix for VPTransformState args (NFC).
Florian Hahn [Wed, 24 Aug 2022 16:57:57 +0000 (17:57 +0100)]
[VPlan] Remove unneeded `struct` prefix for VPTransformState args (NFC).

22 months agoextending code layout alg
spupyrev [Fri, 15 Jul 2022 18:52:56 +0000 (11:52 -0700)]
extending code layout alg

The diff modifies the ext-tsp code layout algorithm in the following ways:
(i) fixes merging of cold block chains (this is a port of D129397);
(ii) adjusts the cost model utilized for optimization;
(iii) adjusts some APIs so that the implementation can be used in BOLT; this is
a prerequisite for D129895.

The only non-trivial change is (ii). Here we introduce different weights for
conditional and unconditional branches in the cost model. Based on the new model
it is slightly more important to increase the number of "fall-through
unconditional" jumps, which makes sense, as placing two blocks with an
unconditional jump next to each other reduces the number of jump instructions in
the generated code. Experimentally, this has a mild impact on performance;
I've seen a perf win of up to 0.2%-0.3% on some benchmarks.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D129893

22 months ago[ELF] Parallelize writes of different OutputSections
Fangrui Song [Wed, 24 Aug 2022 16:40:03 +0000 (09:40 -0700)]
[ELF] Parallelize writes of different OutputSections

We currently process one OutputSection at a time and for each OutputSection
write contained input sections in parallel. This strategy does not leverage
multi-threading well. Instead, parallelize writes of different OutputSections.

The default TaskSize for parallelFor often leads to inferior sharding. We
prepare the task in the caller instead.

* Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup
* Add llvm::parallel::TaskGroup::execute.
* Change writeSections to declare TaskGroup and pass it to writeTo.

Speed-up with --threads=8:

* clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast
* clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast
* chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast
* scylladb build/release: 1.09x as fast

On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast.

Differential Revision: https://reviews.llvm.org/D131247

22 months ago[llvm] Teach LLVM about filesets
Jonas Devlieghere [Wed, 24 Aug 2022 15:35:59 +0000 (08:35 -0700)]
[llvm] Teach LLVM about filesets

Teach LLVM about filesets. Filesets were added in macOS 11 (Big Sur) to
combine multiple Mach-O files. They introduce a new load command
(LC_FILESET_ENTRY) consisting of a fileset_entry_command.

  struct fileset_entry_command {
      uint32_t     cmd;        /* LC_FILESET_ENTRY */
      uint32_t     cmdsize;    /* includes entry_id string */
      uint64_t     vmaddr;     /* memory address of the entry */
      uint64_t     fileoff;    /* file offset of the entry */
      union lc_str entry_id;   /* contained entry id */
      uint32_t     reserved;   /* reserved */
  };

This patch teaches LLVM about the new load command and the corresponding
data.
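
As a rough illustration of how such a command could be read, the sketch below pulls the entry id string out of raw command bytes. It is not LLVM's parser: `FilesetEntryHeader`, `readEntryId` and `makeExampleCommand` are hypothetical names, `union lc_str` is modelled as its 32-bit offset, host byte order is assumed, and the `cmd` value is my understanding of Apple's `loader.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Fixed-size part of fileset_entry_command; field order follows the struct
// quoted above, with `union lc_str` reduced to its 32-bit offset.
struct FilesetEntryHeader {
  uint32_t cmd;
  uint32_t cmdsize;
  uint64_t vmaddr;
  uint64_t fileoff;
  uint32_t entryIdOffset; // counted from the start of the command
  uint32_t reserved;
};
static_assert(sizeof(FilesetEntryHeader) == 32, "unexpected padding");

// Returns the NUL-terminated entry id, or "" if the command is malformed.
std::string readEntryId(const std::vector<uint8_t> &bytes) {
  FilesetEntryHeader header;
  if (bytes.size() < sizeof(header))
    return "";
  std::memcpy(&header, bytes.data(), sizeof(header));
  if (header.cmdsize > bytes.size() || header.entryIdOffset >= header.cmdsize)
    return "";
  std::string id;
  // Never read past cmdsize, even if the terminator is missing.
  for (std::size_t i = header.entryIdOffset;
       i < header.cmdsize && bytes[i] != 0; ++i)
    id.push_back(static_cast<char>(bytes[i]));
  return id;
}

// Builds a made-up command for demonstration purposes only.
std::vector<uint8_t> makeExampleCommand(const std::string &id) {
  FilesetEntryHeader header{};
  header.cmd = 0x80000035u; // assumed value of LC_FILESET_ENTRY
  header.entryIdOffset = sizeof(header);
  std::size_t size = sizeof(header) + id.size() + 1;
  size = (size + 7) / 8 * 8; // pad the command to a multiple of 8 bytes
  header.cmdsize = static_cast<uint32_t>(size);
  std::vector<uint8_t> bytes(size, 0);
  std::memcpy(bytes.data(), &header, sizeof(header));
  std::memcpy(bytes.data() + sizeof(header), id.data(), id.size());
  return bytes;
}
```

The key point is that `cmdsize` covers both the fixed header and the trailing string, so the string is found via the offset rather than at a fixed position.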

Differential revision: https://reviews.llvm.org/D132432

22 months ago[NFC][mlir] Add support for llvm style casting for mlir types
Tyker [Wed, 17 Aug 2022 04:16:34 +0000 (21:16 -0700)]
[NFC][mlir] Add support for llvm style casting for mlir types

Note:
When operating on a Type hierarchy with LeafType inheriting from MiddleType, which inherits from mlir::Type,
calling LeafType::classof(MiddleType) will always return false.
This is because classof calls the static getTypeID from its parent instead of the dynamic Type::getTypeID,
so classof in this context checks whether the TypeID of LeafType is the same as the TypeID of MiddleType, which is always false.
This is bypassed in this commit inside CastInfo<To, From>::isPossible by calling classof with an mlir::Type,
but other unsuspecting users of LeafType::classof(MiddleType) would still get an incorrect result.
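
The pitfall described in the note can be reproduced with a small mock hierarchy. The classes below are not MLIR's real ones; they only mirror the name-hiding situation, with integer ids standing in for TypeID and hypothetical helper functions for the two call styles.

```cpp
#include <cassert>

// Mock base: carries the dynamic id of the value it actually holds.
struct Type {
  int dynamicID = 0;
  int getTypeID() const { return dynamicID; } // dynamic lookup
};

struct MiddleType : Type {
  // Static id of the class itself; this hides Type::getTypeID.
  static int getTypeID() { return 1; }
};

struct LeafType : MiddleType {
  static int getTypeID() { return 2; }

  template <typename T> static bool classof(T val) {
    // If T is MiddleType, val.getTypeID() resolves to the static
    // MiddleType::getTypeID() and never looks at dynamicID.
    return val.getTypeID() == LeafType::getTypeID();
  }
};

// A value that really is a LeafType, viewed through MiddleType.
inline MiddleType makeLeafAsMiddle() {
  MiddleType value;
  value.dynamicID = LeafType::getTypeID();
  return value;
}

// Passing the MiddleType view compares 1 against 2.
inline bool queryThroughMiddle() {
  return LeafType::classof(makeLeafAsMiddle());
}

// Converting to the base type first restores the dynamic lookup.
inline bool queryThroughBase() {
  MiddleType value = makeLeafAsMiddle();
  return LeafType::classof(static_cast<const Type &>(value));
}
```

In this mock, `queryThroughMiddle()` is false and `queryThroughBase()` is true, which matches the workaround of calling classof with the base type.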

22 months ago[MLIR][OpenMP] Add support for safelen clause
Prabhdeep Singh Soni [Fri, 19 Aug 2022 20:40:39 +0000 (16:40 -0400)]
[MLIR][OpenMP] Add support for safelen clause

This supports translation from MLIR to LLVM IR using OMPIRBuilder for
OpenMP safelen clause in SIMD construct.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D132245

22 months agoRevert "[MLIR][OpenMP] Add support for safelen clause"
Prabhdeep Singh Soni [Wed, 24 Aug 2022 16:28:02 +0000 (12:28 -0400)]
Revert "[MLIR][OpenMP] Add support for safelen clause"

This reverts commit 172fe1706d83832a330170f43fe52aab1b75e7de.

22 months ago[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch
Simon Pilgrim [Wed, 24 Aug 2022 16:28:07 +0000 (17:28 +0100)]
[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch

This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.

For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.

Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.
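
The promotion trick for i8 can be sketched in scalar C++. This is an illustration under stated assumptions, not the X86 lowering code: `cttz32NonZero` is a hypothetical stand-in for a 32-bit BSF, and `cttz8` shows the masking above the msb.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a 32-bit BSF: only meaningful for a non-zero source.
unsigned cttz32NonZero(uint32_t x) {
  assert(x != 0 && "BSF-style helper requires a non-zero input");
  unsigned count = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++count;
  }
  return count;
}

// i8 CTTZ promoted to 32 bits. Setting bit 8, just above the i8 msb, makes
// the operand known non-zero and makes a zero input produce 8.
unsigned cttz8(uint8_t x) {
  return cttz32NonZero(static_cast<uint32_t>(x) | 0x100u);
}
```

Because the promoted operand can never be zero, no separate zero-source handling is needed in this model.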

Differential Revision: https://reviews.llvm.org/D132520

22 months ago[MLIR][OpenMP] Add support for safelen clause
Prabhdeep Singh Soni [Fri, 19 Aug 2022 20:40:39 +0000 (16:40 -0400)]
[MLIR][OpenMP] Add support for safelen clause

This supports translation from MLIR to LLVM IR using OMPIRBuilder for
OpenMP safelen clause in SIMD construct.

22 months agoRevert "Add support for safelen clause"
Prabhdeep Singh Soni [Wed, 24 Aug 2022 16:15:41 +0000 (12:15 -0400)]
Revert "Add support for safelen clause"

This reverts commit 3dd4d6a0cec85d96af0340a48aaacf638215fe76.

22 months ago[InstCombine] ease use constraint in tryFactorization()
Sanjay Patel [Wed, 24 Aug 2022 15:39:24 +0000 (11:39 -0400)]
[InstCombine] ease use constraint in tryFactorization()

The stronger one-use checks prevented transforms like this:
(x * y) + x --> x * (y + 1)
(x * y) - x --> x * (y - 1)

https://alive2.llvm.org/ce/z/eMhvQa

This is one of the IR transforms suggested in issue #57255.

This should be better in IR because it removes a use of a
variable operand (we already fold the case with a constant
multiply operand).
The backend should be able to re-distribute the multiply if
that's better for the target.
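
For intuition, the two forms can be compared in wrapping unsigned arithmetic. This is a sanity-check sketch with hypothetical function names, not a substitute for the Alive2 proof linked above.

```cpp
#include <cassert>
#include <cstdint>

// (x * y) + x: uses x twice.
uint32_t mulAddOriginal(uint32_t x, uint32_t y) { return (x * y) + x; }
// x * (y + 1): uses x once.
uint32_t mulAddFactored(uint32_t x, uint32_t y) { return x * (y + 1u); }

// (x * y) - x and its factored form x * (y - 1).
uint32_t mulSubOriginal(uint32_t x, uint32_t y) { return (x * y) - x; }
uint32_t mulSubFactored(uint32_t x, uint32_t y) { return x * (y - 1u); }
```

Since unsigned arithmetic wraps modulo 2^32, the factored forms agree with the originals even when the intermediate product overflows.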

Differential Revision: https://reviews.llvm.org/D132412

22 months agoAdd support for safelen clause
Prabhdeep Singh Soni [Fri, 19 Aug 2022 20:40:39 +0000 (16:40 -0400)]
Add support for safelen clause

This supports translation from MLIR to LLVM IR using OMPIRBuilder for
OpenMP safelen clause in SIMD construct.

22 months ago[InstCombine] Canonicalize ((X & -X) - 1) --> ((X - 1) & ~X) (PR51784)
Simon Pilgrim [Wed, 24 Aug 2022 15:50:34 +0000 (16:50 +0100)]
[InstCombine] Canonicalize ((X & -X) - 1) --> ((X - 1) & ~X) (PR51784)

Enables the ctpop((x & -x ) - 1) -> cttz(x, false) fold

Alive2: https://alive2.llvm.org/ce/z/EDk4h7 (((X & -X) - 1) --> (~X & (X - 1)) )

Alive2: https://alive2.llvm.org/ce/z/8Yr3XG (CTPOP -> CTTZ)
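
Both identities can also be spot-checked in C++. The helpers below are hypothetical and use simple loops rather than intrinsics; `cttz32` returns the bit width for a zero input, which is what `cttz(x, false)` means.

```cpp
#include <cassert>
#include <cstdint>

unsigned popcount32(uint32_t v) {
  unsigned count = 0;
  while (v != 0) {
    v &= v - 1; // clear the lowest set bit
    ++count;
  }
  return count;
}

// cttz with a defined result for zero (the bit width).
unsigned cttz32(uint32_t v) {
  if (v == 0)
    return 32;
  unsigned count = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++count;
  }
  return count;
}

// Checks ((X & -X) - 1) == (~X & (X - 1)) and ctpop(...) == cttz(X, false).
bool identitiesHoldFor(uint32_t x) {
  uint32_t original = (x & (0u - x)) - 1u;
  uint32_t canonical = ~x & (x - 1u);
  return original == canonical && popcount32(canonical) == cttz32(x);
}

// Exhaustive check over [0, limit].
bool identitiesHoldUpTo(uint32_t limit) {
  for (uint64_t x = 0; x <= limit; ++x)
    if (!identitiesHoldFor(static_cast<uint32_t>(x)))
      return false;
  return true;
}
```

Both expressions produce a mask of the bits below the lowest set bit of X, so counting its set bits gives the number of trailing zeros.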

Fixes #51126

Differential Revision: https://reviews.llvm.org/D110488

22 months ago[mlir][Vector] Support 0-D vectors in FMAOp
Michal Terepeta [Wed, 24 Aug 2022 14:59:59 +0000 (07:59 -0700)]
[mlir][Vector] Support 0-D vectors in FMAOp

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115742

22 months ago[Libomptarget] Replace use of `dlopen` with LLVM's dynamic library support
Joseph Huber [Tue, 9 Aug 2022 15:37:59 +0000 (11:37 -0400)]
[Libomptarget] Replace use of `dlopen` with LLVM's dynamic library support

This patch replaces uses of `dlopen` and `dlsym` with LLVM's support
with `loadPermanentLibrary` and `getSymbolAddress`. This allows us to
remove the explicit dependency on the `dl` libraries in the CMake. This
removes another explicit dependency and solves an issue encountered
while building on Windows platforms. The one downside to this is that
the LLVM library does not currently support `dlclose` functionality, but
this could be added in the future.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D131507

22 months ago[Libomptarget] Remove use of ELF link_address in x86_64 plugin
Joseph Huber [Tue, 9 Aug 2022 19:13:18 +0000 (15:13 -0400)]
[Libomptarget] Remove use of ELF link_address in x86_64 plugin

We use the offloading entries array to determine the relative names and
addresses of device-side kernel functions. The x86_64 plugin previously
derived the device-side entry table by first identifying the
`omp_offloading_entries` section offset in the loaded ELF. Then we would
use the base offset of the loaded dynamic library to identify the
entries array within the loaded image. This relied on some more
unconventional methods which prevented us from using the LLVM dynamic
library loader for this plugin. This patch simplifies this by instead
copying the host-side entry and replacing its address with the
device-side address looked up through `dlsym`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D131516

22 months ago[RISCV][NFC] Minor cleanup in RISCVInstrInfo::getOutliningType
Kito Cheng [Wed, 24 Aug 2022 15:39:17 +0000 (23:39 +0800)]
[RISCV][NFC] Minor cleanup in RISCVInstrInfo::getOutliningType

The only use of TM is checking the result of TargetMachine::getFunctionSections;
check that directly instead of introducing a local variable.

22 months ago[InstCombine] improve readability in tryFactorization(); NFC
Sanjay Patel [Wed, 24 Aug 2022 14:43:46 +0000 (10:43 -0400)]
[InstCombine] improve readability in tryFactorization(); NFC

Added/removed braces, reduced indents, and renamed a variable.

22 months ago[InstCombine] add tests for mul+sub common factor; NFC
Sanjay Patel [Tue, 23 Aug 2022 21:59:29 +0000 (17:59 -0400)]
[InstCombine] add tests for mul+sub common factor; NFC

22 months ago[mlgo] Fix cmake logic detecting tf pip package location
Mircea Trofin [Wed, 24 Aug 2022 15:28:23 +0000 (08:28 -0700)]
[mlgo] Fix cmake logic detecting tf pip package location

New logic works for both `tensorflow` and `tf-nightly`.

22 months agoRevert rGc360955c4804e9b25017372cb4c6be7adcb216ce "[InstCombine] Canonicalize ((X...
Simon Pilgrim [Wed, 24 Aug 2022 15:26:20 +0000 (16:26 +0100)]
Revert rGc360955c4804e9b25017372cb4c6be7adcb216ce "[InstCombine] Canonicalize ((X & -X) - 1) --> (~X & (X - 1)) (PR51784)"

The test changes are failing on some buildbots (but not others.....).

22 months ago[X86][FP16] Add the missing legal action for EXTRACT_SUBVECTOR
Phoebe Wang [Wed, 24 Aug 2022 15:24:41 +0000 (23:24 +0800)]
[X86][FP16] Add the missing legal action for EXTRACT_SUBVECTOR

Fixes #57340

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132563

22 months ago[runtimes][NFC] Colocate handling of LLVM_ENABLE_PROJECTS and LLVM_ENABLE_RUNTIMES
Louis Dionne [Tue, 23 Aug 2022 15:10:18 +0000 (11:10 -0400)]
[runtimes][NFC] Colocate handling of LLVM_ENABLE_PROJECTS and LLVM_ENABLE_RUNTIMES

This will make the following patches to migrate projects off of the
LLVM_ENABLE_PROJECTS build onto the LLVM_ENABLE_RUNTIMES build much
easier to comprehend. This patch should be a NFC since it keeps the
same set of runtimes being built by default.

Differential Revision: https://reviews.llvm.org/D132478

22 months ago[LLDB] Skip TestFunctionTemplateSpecializationTempArgs for Arm64/Windows
Muhammad Omair Javaid [Wed, 24 Aug 2022 15:06:50 +0000 (20:06 +0500)]
[LLDB] Skip TestFunctionTemplateSpecializationTempArgs for Arm64/Windows

This test fails on the buildbot but passes on standalone builds. I am
marking it as skipped until the actual problem is found and resolved.

22 months ago[flang] Create a temporary of the correct size when lowering SetLength
Valentin Clement [Wed, 24 Aug 2022 14:56:14 +0000 (16:56 +0200)]
[flang] Create a temporary of the correct size when lowering SetLength

This patch creates a temporary of the appropriate length while lowering SetLength.

The corresponding character can be truncated or padded if necessary.

This fixes an issue with array constructors in arguments and also with statement functions.

```
  character(7) :: str = "1234567"
  call s(str(1:1))
contains
 subroutine s(a)
  character(*) :: a
  call s2([Character(3)::a])
 end subroutine
 subroutine s2(c)
  character(3) :: c(1)
  print "(4a)", c(1), "end"
 end subroutine
end
```

Prior to the patch, the example prints `123end` instead of `1. end`
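
The truncate-or-pad behaviour can be modelled with a short C++ helper. This only mirrors Fortran's blank-padding rule for character values; `setLength` here is a hypothetical function, not the flang lowering.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns `value` adjusted to exactly `length` characters:
// longer values are truncated, shorter ones are padded with blanks.
std::string setLength(const std::string &value, std::size_t length) {
  std::string result = value.substr(0, length); // truncate if too long
  result.resize(length, ' ');                   // pad with blanks if too short
  return result;
}
```

In the Fortran example, `a` is the one-character value `"1"`, so a correctly sized temporary for `Character(3)::a` holds `1` followed by two blanks, whereas the seven-character `str` would truncate to `123`.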

Reviewed By: PeteSteinfeld, jeanPerier

Differential Revision: https://reviews.llvm.org/D132464

22 months ago[instcombine] Test for zero initialisation optimisation of integer product
Zain Jaffal [Wed, 24 Aug 2022 14:54:19 +0000 (15:54 +0100)]
[instcombine] Test for zero initialisation optimisation of integer product

Following the work on `D131672` we do the same optimisations for integer products.
We add tests to check whether a loop gets removed if we repeatedly multiply array elements with an accumulator initialised to zero.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D132553

22 months ago[AMDGPU] Remove old isCheapToSpeculateCttz FIXME
Simon Pilgrim [Wed, 24 Aug 2022 14:45:23 +0000 (15:45 +0100)]
[AMDGPU] Remove old isCheapToSpeculateCttz FIXME

As confirmed on D132520 - this should always return true

22 months ago[bolt] Fix a test affected by D131589.
Simon Tatham [Wed, 24 Aug 2022 14:34:10 +0000 (15:34 +0100)]
[bolt] Fix a test affected by D131589.

This test contained some data tables that llvm-objdump was
disassembling as code, so the test was recovering the 32-bit values in
the table from the instruction encoding column of the disassembly.

D131589 changed how llvm-objdump decides what to disassemble as code
or as data. As a result, these data tables are now being disassembled
as data, which I think is actually more sensible -- but the test
wasn't expecting it, and got confused.

22 months ago[AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD
Pierre van Houtryve [Wed, 24 Aug 2022 12:13:04 +0000 (12:13 +0000)]
[AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD

Allows things like `(G_PTR_ADD (G_PTR_ADD a, b), c)` to be
simplified into a single ADD3 instruction instead of two adds.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D131254

22 months ago[mlir] Rename arith.addi_carry to arith.addui_carry
Jakub Kuderski [Wed, 24 Aug 2022 14:40:41 +0000 (10:40 -0400)]
[mlir] Rename arith.addi_carry to arith.addui_carry

The intention is to have this op lowered to
`llvm.intr.uadd.with.overflow` or `spv.IAddCarry`. LLVM has a second
intrinsic for signed add-with-overflow, `llvm.intr.sadd.with.overflow`,
with different semantics. Therefore we should have 2 ops with `arith`,
and be explicit about signed/unsigned semantics.

Rename `arith.addi_carry` to `arith.addui_carry` before we introduce a
signed version of this op: `arith.addsi_carry`.
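
The difference between the two semantics shows up clearly on 8-bit values. The sketch below is plain C++ with hypothetical names; it models the flag each flavour would report and is not the MLIR lowering.

```cpp
#include <cassert>
#include <cstdint>

struct AddResult {
  uint8_t sum; // low 8 bits, identical for both flavours
  bool flag;   // carry (unsigned) or overflow (signed)
};

// Unsigned add: the flag is set when the true sum does not fit in 8 bits.
AddResult addUnsignedCarry(uint8_t a, uint8_t b) {
  unsigned wide = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return {static_cast<uint8_t>(wide), wide > 0xFFu};
}

// Signed add: the same bits are read as two's complement values and the
// flag is set when the true sum leaves the range [-128, 127].
AddResult addSignedOverflow(uint8_t a, uint8_t b) {
  int wide = static_cast<int>(static_cast<int8_t>(a)) +
             static_cast<int>(static_cast<int8_t>(b));
  unsigned bits = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return {static_cast<uint8_t>(bits), wide < -128 || wide > 127};
}
```

For the inputs 200 and 100 the unsigned version reports a carry while the signed version reports no overflow; for 100 and 100 it is the other way round, which is why one op cannot serve both meanings.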

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D132491

22 months ago[MLIR] Fix cast warning in pass tablegen test
Michele Scuttari [Wed, 24 Aug 2022 14:35:26 +0000 (16:35 +0200)]
[MLIR] Fix cast warning in pass tablegen test

22 months ago[runtimes] Don't link against compiler-rt when we don't find it
Louis Dionne [Wed, 24 Aug 2022 14:27:42 +0000 (10:27 -0400)]
[runtimes] Don't link against compiler-rt when we don't find it

Otherwise, we would end up passing `-lNOTFOUND` to the compiler, which
caused various compiler checks to fail and ended up breaking the build
in the most obscure ways. For example, checks for -faligned-allocation
would fail because the compiler would complain about an unknown library
called NOTFOUND, and we would end up not passing -faligned-allocation
anywhere in our build. This is madness.

An even better alternative would be to simply FATAL_ERROR if we don't
find the builtins library. However, it seems like our build has been
working fine without finding it for a while, so instead of making a
bunch of builds fail, we can figure out why linking against compiler-rt
doesn't actually seem to be required in a follow-up, and perhaps
relax that.

22 months ago[InstCombine] Canonicalize ((X & -X) - 1) --> (~X & (X - 1)) (PR51784)
Simon Pilgrim [Wed, 24 Aug 2022 14:31:04 +0000 (15:31 +0100)]
[InstCombine] Canonicalize ((X & -X) - 1) --> (~X & (X - 1)) (PR51784)

Enables the ctpop((x & -x ) - 1) -> cttz(x, false) fold

Alive2: https://alive2.llvm.org/ce/z/EDk4h7 (((X & -X) - 1) --> (~X & (X - 1)) )

Alive2: https://alive2.llvm.org/ce/z/8Yr3XG (CTPOP -> CTTZ)

Fixes #51126

Differential Revision: https://reviews.llvm.org/D110488

22 months agoRevert "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops"
Stephen Tozer [Wed, 24 Aug 2022 13:54:33 +0000 (14:54 +0100)]
Revert "[DebugInfo] Extend the InstrRef LDV to support DbgValues with many Ops"

Reverting due to reported errors when running Linux kernel builds with
KMSAN -gdwarf-4.

This reverts commit 2cb9e1ac422f46de0ab728c6c9d50ebafbfe372a.

22 months ago[update_llc_test_checks][VE] Handle .Lfoo$local in function regex
Alex Richardson [Wed, 24 Aug 2022 13:48:11 +0000 (13:48 +0000)]
[update_llc_test_checks][VE] Handle .Lfoo$local in function regex

While working on https://reviews.llvm.org/D131429, I got a test diff in
one of the VE tests and running update_llc_test_checks.py deleted all the
code for that function. This updates the regex to handle this new output.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D131431

22 months ago[update_llc_test_checks][VE] Add baseline test for PIC function regex
Alex Richardson [Wed, 24 Aug 2022 13:47:57 +0000 (13:47 +0000)]
[update_llc_test_checks][VE] Add baseline test for PIC function regex

While working on https://reviews.llvm.org/D131429, I got a test diff in
one of the VE tests and running update_llc_test_checks.py deleted all the
code for that function. This is a baseline test for this bug (incorrect
regex for VE when .Lfoo$local symbols are used).

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D131434

22 months ago[RegisterInfoEmitter] Generate isConstantPhysReg(). NFCI
Alex Richardson [Wed, 24 Aug 2022 13:42:50 +0000 (13:42 +0000)]
[RegisterInfoEmitter] Generate isConstantPhysReg(). NFCI

This commit moves the information on whether a register is constant into
the Tablegen files to allow generating the implementation of
isConstantPhysReg(). I've marked isConstantPhysReg() as final in this
generated file to ensure that changes are made to tablegen instead of
overriding this function, but if that turns out to be too restrictive,
we can remove the qualifier.

This should be pretty much NFC, but I did notice that e.g. the AMDGPU
generated file also includes the LO16/HI16 registers now.

The new isConstant flag will also be used by D131958 to ensure that
constant registers are marked as call-preserved.

Differential Revision: https://reviews.llvm.org/D131962

22 months ago[CMake] Avoid `LLVM_BINARY_DIR` when other more specific variable are better-suited
John Ericson [Sat, 20 Aug 2022 22:11:58 +0000 (18:11 -0400)]
[CMake] Avoid `LLVM_BINARY_DIR` when other more specific variable are better-suited

A simple sed doing these substitutions:

- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?lib(${LLVM_LIBDIR_SUFFIX})?\>` -> `${LLVM_LIBRARY_DIR}`
- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?bin\>` -> `${LLVM_TOOLS_BINARY_DIR}`

where `\>` means "word boundary".

The only manual modifications were reverting changes in

- `compiler-rt/cmake/Modules/CompilerRTUtils.cmake`
- `runtimes/CMakeLists.txt`

because these were "entry points" where we wanted to tread carefully so as not to introduce a "loop" which would end with an undefined variable being expanded to nothing.

This hopefully increases readability overall, and also decreases the usages of `LLVM_LIBDIR_SUFFIX`, preparing us for D130586.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D132316

22 months ago[llvm-objdump] Handle multiple syms at same addr in disassembly.
Simon Tatham [Wed, 24 Aug 2022 12:13:38 +0000 (13:13 +0100)]
[llvm-objdump] Handle multiple syms at same addr in disassembly.

The main disassembly loop in llvm-objdump works by iterating through
the symbols in a code section, and for each one, dumping the range of
the section from that symbol to the next. If there's another symbol
defined at the same location, then that range will have length 0, and
llvm-objdump will skip over the symbol entirely.

As a result, llvm-objdump will only show the last of the symbols
defined at that address. Not only that, but the other symbols won't
even be checked against the `--disassemble-symbol` list. So if you
have two symbols `foo` and `bar` defined in the same place, then one
of `--disassemble-symbol=foo` and `--disassemble-symbol=bar` will
generate an error message and no disassembly.

I think a better approach in that situation is to prioritise display
of the symbol the user actually asked for. Also, if the user
specifically asks for disassembly of //both// of two symbols defined
at the same address, the best response I can think of is to
disassemble the code once, preceded by both symbol names.

This involves teaching llvm-objdump to be able to display more than
one symbol name at the head of a disassembled section, which also
makes it possible to implement a `--show-all-symbols` option to
display //every// symbol defined in the code, not just the most
preferred one at each address.
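
The grouping idea can be sketched as follows. This is a simplified model with hypothetical types (`Symbol`, `Range`, `groupByAddress`); it leaves out symbol priorities and the `--disassemble-symbol` filtering that the real tool has to handle.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Symbol {
  uint64_t addr;
  std::string name;
};

// One disassembly range, headed by every symbol defined at its start.
struct Range {
  uint64_t start;
  uint64_t end;
  std::vector<std::string> names;
};

// `sorted` must be ordered by address; `sectionEnd` closes the last range.
std::vector<Range> groupByAddress(const std::vector<Symbol> &sorted,
                                  uint64_t sectionEnd) {
  std::vector<Range> ranges;
  for (const Symbol &symbol : sorted) {
    if (!ranges.empty() && ranges.back().start == symbol.addr) {
      // Same address: add the name instead of creating an empty range.
      ranges.back().names.push_back(symbol.name);
      continue;
    }
    if (!ranges.empty())
      ranges.back().end = symbol.addr;
    ranges.push_back({symbol.addr, sectionEnd, {symbol.name}});
  }
  return ranges;
}
```

With `foo` and `bar` both at address 0 and `baz` at 16, this yields two ranges, the first of which carries both names, so neither symbol is skipped.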

This change also turns out to fix a bug in which `--disassemble-all`
on a mixed Arm/Thumb ELF file would fail to switch disassembly states
between Arm and Thumb functions, because the mapping symbols were
accidentally ignored.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D131589

22 months ago[flang] Lower F08 parity intrinsic
Tarun Prabhu [Wed, 24 Aug 2022 13:57:21 +0000 (07:57 -0600)]
[flang] Lower F08 parity intrinsic

Lower F08 parity intrinsic. This largely follows the implementation of the ANY
and ALL intrinsics which are related.

Differential Revision: https://reviews.llvm.org/D129788

22 months ago[CUDA][OpenMP] Fix the new driver crashing on multiple device-only outputs
Joseph Huber [Fri, 19 Aug 2022 15:38:12 +0000 (11:38 -0400)]
[CUDA][OpenMP] Fix the new driver crashing on multiple device-only outputs

The new driver supports device-only compilation for the offloading
device. The way this is handled is a little different from the old
offloading driver. The old driver would put all the outputs in the final
action list akin to a linker job. The new driver, however, generates these
in the middle of the host's job, so we instead put them all in a single
offloading action. However, we only handled these kinds of offloading
actions correctly when there was only a single input. When we had
multiple inputs we would instead attempt to get the host job, which
didn't exist, and crash.

This patch simply adds some extra logic to generate the jobs for all
dependencies if there is no host action.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D132248

22 months ago[RISCV] Don't outline pcrel-lo operand.
Kito Cheng [Wed, 24 Aug 2022 13:16:52 +0000 (21:16 +0800)]
[RISCV] Don't outline pcrel-lo operand.

This issue was found by building llvm-testsuite with `-Oz`: the linker complains
`dangerous relocation: %pcrel_lo missing matching %pcrel_hi`. The cause turned
out to be that we outlined the pcrel-lo instruction but left the pcrel-hi in
place. That is not a problem in general; the problem is that the two end up in
different sections, and a pcrel-hi/pcrel-lo pair (e.g. AUIPC+ADDI) *MUST* be
present in the same section due to the implementation.

Outlined functions are placed in the .text section, but the source functions
are placed in .text.<function-name> if function-section is enabled or the
function has the `comdat` attribute.

There are a few solutions for this issue:
1. Always disallow instructions with pcrel-lo flags.
2. Only disallow instructions with pcrel-lo flags when function-section is
   enabled or the function has the `comdat` attribute.
3. Check whether the corresponding pcrel-high instruction is also included in
   the outlining candidate sequence, and allow outlining only when it is.

The first is the most conservative and might lose some optimization
opportunities; the second could save those opportunities; the last is hard to
implement and has no benefit, since pcrel-high instructions use different
labels even when accessing the same symbol.

Using a custom section name might also cause this problem, but that case is
already filtered out by RISCVInstrInfo::isFunctionSafeToOutlineFrom.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D132528

22 months ago[InstCombine] Prevent complexity commutation in dec_mask_commute_neg_i32
Simon Pilgrim [Wed, 24 Aug 2022 13:35:04 +0000 (14:35 +0100)]
[InstCombine] Prevent complexity commutation in dec_mask_commute_neg_i32

Noticed by @spatel in D110488

22 months ago[RISCV] Precommit test for machine outliner issue for instruction with pcrel-lo.
Kito Cheng [Wed, 24 Aug 2022 13:16:06 +0000 (21:16 +0800)]
[RISCV] Precommit test for machine outliner issue for instruction with pcrel-lo.

Differential Revision: https://reviews.llvm.org/D132527

22 months ago[clang] Explicitly add libcxx and libcxxabi runtimes during Stage2 builds
Louis Dionne [Wed, 24 Aug 2022 13:11:49 +0000 (09:11 -0400)]
[clang] Explicitly add libcxx and libcxxabi runtimes during Stage2 builds

For the time being, we are still building libc++ and libc++abi's headers
during stage 2 builds. Encode that in the cache file so that CI jobs don't
have to manually specify LLVM_ENABLE_RUNTIMES when doing a stage 2 build.

22 months ago[FLANG]Add maxval simplification support
Mats Petersson [Fri, 29 Jul 2022 19:19:04 +0000 (20:19 +0100)]
[FLANG]Add maxval simplification support

Add a simplification pass for the MAXVAL intrinsic function.

This refactors some of the code to allow variation on the
initialization value and operation performed within the loop,
reusing the majority of code for both SUM and MAXVAL.

Adding tests for the test-cases that produce different output
than the SUM function.

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D132234

22 months ago[NFC] Colocate cache values for controling libc++ headers build in stage 2
Louis Dionne [Wed, 24 Aug 2022 12:50:46 +0000 (08:50 -0400)]
[NFC] Colocate cache values for controling libc++ headers build in stage 2

22 months ago[NFC] Mark variable as maybe_unused to silence warning
Kiran Chandramohan [Wed, 24 Aug 2022 11:47:47 +0000 (12:47 +0100)]
[NFC] Mark variable as maybe_unused to silence warning

22 months ago[InstCombine] Add tests for ((X & -X) - 1) --> (~X & (X - 1)) canonicalization
Simon Pilgrim [Wed, 24 Aug 2022 11:53:43 +0000 (12:53 +0100)]
[InstCombine] Add tests for ((X & -X) - 1) --> (~X & (X - 1)) canonicalization

As originally suggested on D110488

22 months ago[LV] Replace fixed-order cost model with a SK_Splice shuffle
David Green [Wed, 24 Aug 2022 12:00:32 +0000 (13:00 +0100)]
[LV] Replace fixed-order cost model with a SK_Splice shuffle

The existing cost model for fixed-order recurrences models the phi as an
extract shuffle of a v1 vector. The shuffle produced should be modelled as a
splice, as it takes two vector inputs and extracts a subset of their
lanes. On certain architectures the existing cost model can drastically
under-estimate the correct cost for the shuffle, so this changes it to a
SK_Splice and passes a correct Mask through to the getShuffleCost call.
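
To make the shape of that shuffle concrete, here is a lane-level sketch for a recurrence that reads the value from the previous scalar iteration. `spliceForRecurrence` is a hypothetical helper, and a `std::vector` stands in for a hardware vector; it is not the cost-model code.

```cpp
#include <cassert>
#include <vector>

// Builds the "previous iteration" vector for a recurrence: the last lane of
// the previous vector followed by all but the last lane of the current one.
std::vector<int> spliceForRecurrence(const std::vector<int> &previous,
                                     const std::vector<int> &current) {
  assert(!current.empty() && previous.size() == current.size());
  std::vector<int> result;
  result.reserve(current.size());
  result.push_back(previous.back());
  result.insert(result.end(), current.begin(), current.end() - 1);
  return result;
}
```

With a vector width of 4, previous lanes `{1, 2, 3, 4}` and current lanes `{5, 6, 7, 8}` give `{4, 5, 6, 7}`, so lanes are drawn from both inputs, which a single-vector extract does not capture.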

I believe this might be the first use of a SK_Splice shuffle cost model
outside of scalable vectors, and some targets may require additions to
the cost-model to correctly account for them. In-tree targets all appear to
have been updated where needed.

Differential Revision: https://reviews.llvm.org/D132308

22 months ago[cross-project] Disable debug-types-section tests on Apple systems
Felipe de Azevedo Piovezan [Wed, 24 Aug 2022 11:30:30 +0000 (07:30 -0400)]
[cross-project] Disable debug-types-section tests on Apple systems

The -fdebug-types-section flag is not supported on Apple platforms.

Reviewed By: Michael137

Differential Revision: https://reviews.llvm.org/D132410

22 months agotsan: add ability to compile for different Go subarch values.
Keith Randall [Wed, 24 Aug 2022 10:50:27 +0000 (06:50 -0400)]
tsan: add ability to compile for different Go subarch values.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D131927

22 months ago[mlir][arith] Fold `andi x, not(x)` to zero
Markus Böck [Wed, 24 Aug 2022 11:09:46 +0000 (13:09 +0200)]
[mlir][arith] Fold `andi x, not(x)` to zero

A bitwise and of a value with its own bitwise negation is always 0, regardless of the integer type. This patch adds detection of this pattern to the `fold` method of `arith.andi`.
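
As a quick sanity check of the property being folded, the following Python sketch models fixed-width integers with an explicit mask. The helper name `and_with_not` and the tested widths are assumptions made for this illustration; it is not the MLIR fold implementation.

```python
def and_with_not(x: int, width: int) -> int:
    """Compute x & ~x for an integer of the given bit width."""
    mask = (1 << width) - 1
    not_x = ~x & mask  # bitwise negation restricted to `width` bits
    return (x & mask) & not_x


def always_zero(width: int) -> bool:
    """Check every value representable in `width` bits."""
    return all(and_with_not(x, width) == 0 for x in range(1 << width))
```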

Differential Revision: https://reviews.llvm.org/D131860

22 months agoAArch64 SVE
Hassnaa Hamdi [Tue, 23 Aug 2022 10:44:23 +0000 (10:44 +0000)]
AArch64 SVE
Add SVE patterns to make use of predicated smin, umin, smax, and umax instructions,
add sve-min-max-pred.ll test file for the new patterns

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D132122

22 months ago[AMDGPU][MC][GFX11][NFC] Split tests for SOP formats
Dmitry Preobrazhensky [Wed, 24 Aug 2022 10:57:48 +0000 (13:57 +0300)]
[AMDGPU][MC][GFX11][NFC] Split tests for SOP formats

Differential Revision: https://reviews.llvm.org/D132474

22 months ago[AMDGPU][MC][GFX11][NFC] Add missing tests for SOP instructions
Dmitry Preobrazhensky [Wed, 24 Aug 2022 10:45:20 +0000 (13:45 +0300)]
[AMDGPU][MC][GFX11][NFC] Add missing tests for SOP instructions

Differential Revision: https://reviews.llvm.org/D132404

22 months ago[AMDGPU][MC][GFX11][NFC] Update tests for FLAT instructions
Dmitry Preobrazhensky [Wed, 24 Aug 2022 10:38:09 +0000 (13:38 +0300)]
[AMDGPU][MC][GFX11][NFC] Update tests for FLAT instructions

Differential Revision: https://reviews.llvm.org/D132402

22 months ago[AMDGPU][MC][NFC] Rename disassembler tests
Dmitry Preobrazhensky [Wed, 24 Aug 2022 10:26:57 +0000 (13:26 +0300)]
[AMDGPU][MC][NFC] Rename disassembler tests

Make test names more uniform.

Differential Revision: https://reviews.llvm.org/D132472

22 months ago[AMDGPU][MC][GFX8][NFC] Consolidate tests by encoding
Dmitry Preobrazhensky [Wed, 24 Aug 2022 10:11:31 +0000 (13:11 +0300)]
[AMDGPU][MC][GFX8][NFC] Consolidate tests by encoding

Differential Revision: https://reviews.llvm.org/D132469

22 months ago[DAG] matchRotateHalf - constify SelectionDAG arg. NFC.
Simon Pilgrim [Wed, 24 Aug 2022 09:57:30 +0000 (10:57 +0100)]
[DAG] matchRotateHalf - constify SelectionDAG arg. NFC.

Based off Issue #57283 - we need to try harder to ensure we're not creating nodes on-the-fly - so make sure we're just using SelectionDAG for analysis where possible

22 months ago[RISCV] : Add support for immediate operands.
MarkGoncharovAl [Wed, 24 Aug 2022 09:34:24 +0000 (17:34 +0800)]
[RISCV] : Add support for immediate operands.

llvm-exegesis uses operand type information provided in tablegen files to initialize
the immediate arguments of an instruction. Some immediate operands simply don't have such information,
so this patch sets a specific operand type on each of them
and adds verification methods for those types.

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D131771

22 months agoRevert "[Pipelines] Introduce DAE after ArgumentPromotion"
Pavel Samolysov [Wed, 24 Aug 2022 09:44:13 +0000 (12:44 +0300)]
Revert "[Pipelines] Introduce DAE after ArgumentPromotion"

This reverts commit 3f20dcbf708cb23f79c4866d8285a8ae7bd885de.

22 months ago[NVPTX] SHL.64 $r, 31 cannot be converted to a mulwide.s32
Dmitry Vassiliev [Wed, 24 Aug 2022 09:39:41 +0000 (11:39 +0200)]
[NVPTX] SHL.64 $r, 31 cannot be converted to a mulwide.s32

In order to convert to mulwide.s32, we compute the 2nd operand as MulWide.32 $r, (1 << 31).
(1 << 31) is interpreted as a negative number, and is not equivalent to the original instruction.
The code `int64_t r = (int64_t)a << 31;` was incorrectly compiled to `mul.wide.s32 %rd7, %r1, -2147483648;`
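
The sign problem can be reproduced with plain integer arithmetic. The Python sketch below is a model of the two operations under the assumption that `mul.wide.s32` sign-extends both 32-bit operands and produces a 64-bit product; the helper names `to_signed`, `shl64_by` and `mul_wide_s32` are made up for this illustration.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def shl64_by(a: int, shift: int) -> int:
    """(int64_t)a << shift, with `a` treated as a signed 32-bit value."""
    return to_signed(to_signed(a, 32) << shift, 64)


def mul_wide_s32(a: int, b: int) -> int:
    """Signed 32x32 -> 64-bit multiply; both inputs are read as signed 32-bit."""
    return to_signed(to_signed(a, 32) * to_signed(b, 32), 64)
```

For a shift of 30 the two computations agree, but for 31 the multiplier `1 << 31` reads as -2147483648 in 32 bits, so the product has the wrong sign.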

Reviewed By: jchlanda

Differential Revision: https://reviews.llvm.org/D132516

22 months ago[LoongArch] Implement TargetLowering::hasAndNot() for more optimization chances
gonglingqin [Wed, 24 Aug 2022 09:12:59 +0000 (17:12 +0800)]
[LoongArch] Implement TargetLowering::hasAndNot() for more optimization chances

Differential Revision: https://reviews.llvm.org/D132282

22 months ago[AMDGPU] Remove unused S_ADD_U64_CO_PSEUDO and S_SUB_U64_CO_PSEUDO
Jay Foad [Wed, 24 Aug 2022 09:28:07 +0000 (10:28 +0100)]
[AMDGPU] Remove unused S_ADD_U64_CO_PSEUDO and S_SUB_U64_CO_PSEUDO

22 months ago[clang] Allow using -rtlib=platform to switch to the default rtlib on all targets
Martin Storsjö [Mon, 28 Mar 2022 20:23:36 +0000 (23:23 +0300)]
[clang] Allow using -rtlib=platform to switch to the default rtlib on all targets

Normally, passing -rtlib=platform overrides any earlier -rtlib
options, and overrides any hardcoded CLANG_DEFAULT_RTLIB option.
However, some targets, MSVC and Darwin, have custom logic for
disallowing specific -rtlib= option values; amend these checks to
allow the -rtlib=platform option.

Differential Revision: https://reviews.llvm.org/D132444

22 months ago[mlir] Apply ClangTidy readability fix.
Adrian Kuegel [Wed, 24 Aug 2022 08:34:26 +0000 (10:34 +0200)]
[mlir] Apply ClangTidy readability fix.

Use .empty() instead of checking for size() == 0.

22 months agoRevert "[Clang] Avoid using unwind library in the MSVC environment"
Petr Hosek [Wed, 24 Aug 2022 08:24:18 +0000 (08:24 +0000)]
Revert "[Clang] Avoid using unwind library in the MSVC environment"

This reverts commit eca29d4a37b8d1c93fe99be6289a60bb11cf789d since
the test fails in the per-target-runtime-dir layout.

22 months ago[mlir][Bazel] Fix bazel build.
Adrian Kuegel [Wed, 24 Aug 2022 08:13:28 +0000 (10:13 +0200)]
[mlir][Bazel] Fix bazel build.

22 months ago[runtimes] Use a response file for runtimes test suites
Petr Hosek [Thu, 18 Aug 2022 08:25:13 +0000 (08:25 +0000)]
[runtimes] Use a response file for runtimes test suites

We don't know which test suites are going to be included by runtimes
builds until the sub-build has run, so we cannot include them earlier,
and running the sub-build is not possible during the LLVM build
configuration. We instead use a response file that's populated by the
runtimes build as a level of indirection.

This addresses the issue described in:
https://discourse.llvm.org/t/cmake-regeneration-is-broken/62788

Differential Revision: https://reviews.llvm.org/D132438

22 months ago[lit] Support reading arguments from a file
Petr Hosek [Mon, 15 Aug 2022 17:59:39 +0000 (17:59 +0000)]
[lit] Support reading arguments from a file

This allows reading arguments from file using the response file syntax.
We would like to use this in the LLVM build to pass test suites from
subbuilds.
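
Python's `argparse` supports response-file syntax through its `fromfile_prefix_chars` parameter. The sketch below shows that mechanism on a stand-alone parser; whether lit is wired up in exactly this way is an assumption here, and the parser, its options and the helper `parse_with_response_file` are hypothetical names used only for illustration.

```python
import argparse
import os
import tempfile


def parse_with_response_file(lines):
    """Write `lines` to a temporary response file and parse `@<file>`."""
    # With fromfile_prefix_chars set, an argument such as "@args.rsp" is
    # replaced by the file's contents, one argument per line.
    parser = argparse.ArgumentParser(prog="demo", fromfile_prefix_chars="@")
    parser.add_argument("test_paths", nargs="*")
    parser.add_argument("-v", "--verbose", action="store_true")

    with tempfile.NamedTemporaryFile("w", suffix=".rsp", delete=False) as f:
        f.write("\n".join(lines) + "\n")
        path = f.name
    try:
        return parser.parse_args(["@" + path])
    finally:
        os.unlink(path)
```

A caller that only learns the list of test suites late can write them to the file and keep the command line itself fixed.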

Differential Revision: https://reviews.llvm.org/D132437

22 months ago[MLIR] Split autogenerated pass declarations & C++ controllable pass options
Michele Scuttari [Wed, 24 Aug 2022 07:59:50 +0000 (09:59 +0200)]
[MLIR] Split autogenerated pass declarations & C++ controllable pass options

The pass tablegen backend has been reworked to remove the monolithic nature of the autogenerated declarations.
The pass public header can be generated with the -gen-pass-decls option. It contains options structs and registrations: the inclusion of options structs can be controlled individually for each pass by defining the GEN_PASS_DECL_PASSNAME macro; the declarations of the registrations have been kept together and can still be included by defining the GEN_PASS_REGISTRATION macro.
The private code used for the pass implementation (i.e. the pass base class and the constructor definitions, if missing from tablegen) can be generated with the -gen-pass-defs option. Similarly to the declarations file, the definitions of each pass can be enabled by defining the GEN_PASS_DEF_PASSNAME macro.
While doing so, the pass base class has been enriched to also accept the aforementioned struct of options and copy them to the actual pass options, thus allowing each pass to also be configurable within C++ and not only through the command line.

Reviewed By: rriddle, mehdi_amini, Mogball, jpienaar

Differential Revision: https://reviews.llvm.org/D131839

22 months ago[Pipelines] Introduce DAE after ArgumentPromotion
Pavel Samolysov [Wed, 29 Jun 2022 10:46:10 +0000 (13:46 +0300)]
[Pipelines] Introduce DAE after ArgumentPromotion

The ArgumentPromotion pass uses Mem2Reg promotion at the end to cut
down generated `alloca` instructions as well as meaningless `store`s,
and this behavior can leave unused (dead) arguments. To eliminate the
dead arguments, and thereby let DeadCodeElimination remove the inserted
`GEP`s, `load`s and `cast`s in the callers that become dead, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830

22 months ago[docs] Add LICENSE.txt to the root of the mono-repo
Tobias Hieta [Wed, 24 Aug 2022 07:33:58 +0000 (09:33 +0200)]
[docs] Add LICENSE.txt to the root of the mono-repo

This will make it easier to find the LICENSE and some
software also looks in the root to automatically find it.

Reviewed By: kristof.beyls, lattner

Differential Revision: https://reviews.llvm.org/D132018

22 months ago[AArch64][X86] Add some fixed-order-recurrence tests to check the costmodel of fixed...
David Green [Wed, 24 Aug 2022 07:18:01 +0000 (08:18 +0100)]
[AArch64][X86] Add some fixed-order-recurrence tests to check the costmodel of fixed order recurrences. NFC

22 months ago[AArch64][SVE] Remove -O1 from SVE intrinsic tests.
David Green [Wed, 17 Aug 2022 10:16:59 +0000 (11:16 +0100)]
[AArch64][SVE] Remove -O1 from SVE intrinsic tests.

This removes -O1 from the SVE ACLE intrinsics tests and replaces it with
-O0 and "opt -mem2reg -instcombine -tailcallelim". InstCombine and
TailCallElim are only added to keep the differences smaller and can be
removed in follow-up patches. The only remaining differences in the
tests are tbaa nodes not being emitted under -O0, and the removal of
some tailcall flags.

22 months ago[mlir][Bazel] Fix bazel build.
Adrian Kuegel [Wed, 24 Aug 2022 06:51:44 +0000 (08:51 +0200)]
[mlir][Bazel] Fix bazel build.

To avoid a dependency cycle, add BytecodeImplementation.h header to the
"IR" target.

22 months agoFix warning from a7bfdc23ab3ade54da99f0f59dababe4d71ae75b
Mahesh Ravishankar [Wed, 24 Aug 2022 06:36:40 +0000 (06:36 +0000)]
Fix warning from a7bfdc23ab3ade54da99f0f59dababe4d71ae75b

22 months ago[RISCV] Add zihintntl compressed instructions
Alex [Mon, 22 Aug 2022 09:50:19 +0000 (17:50 +0800)]
[RISCV] Add zihintntl compressed instructions

Add zihintntl compressed instructions and some files related to zihintntl.
This patch is based on {D121670}.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121779

22 months ago[DAGCombine] Add more tests for cmp to sbb combination; NFC
Paweł Bylica [Tue, 23 Aug 2022 11:35:40 +0000 (13:35 +0200)]
[DAGCombine] Add more tests for cmp to sbb combination; NFC

Add 2 more tests for potential DAG combine of cmp into sbb.

Differential Revision: https://reviews.llvm.org/D132463

22 months ago[mlir][Linalg] Handle multi-result operations in Elementwise op fusion.
Mahesh Ravishankar [Wed, 24 Aug 2022 05:56:13 +0000 (05:56 +0000)]
[mlir][Linalg] Handle multi-result operations in Elementwise op fusion.

This drops the artificial requirement of producers having a single
result value to be able to fuse with consumers.

The current default also only fuses producer with consumer when the
producer has a single use. This is a simplifying assumption. There are
legitimate use cases where a producer can be fused with consumer and
the fused op could be used to replace the uses of the producer as
well. This needs to be done with care to avoid use-def violations. To
allow for downstream users to explore more fusion opportunities, the
core transformation method is exposed as a utility function.

This patch also modifies the control function to take just the fused
operand as the argument. This is enough information for the callers to
get the producer and the consumer operations being considered to
fuse. It also provides information of which producer result is used.

Differential Revision: https://reviews.llvm.org/D132301

22 months ago[AIX] use the original name as the input to create the new symbol for TLS symbol.
esmeyi [Wed, 24 Aug 2022 05:36:40 +0000 (01:36 -0400)]
[AIX] use the original name as the input to create the new symbol for TLS symbol.

Summary: Currently, an error is reported when a thread local symbol has an invalid name. D100956 creates a new symbol to prefix the TLS symbol name with a dot. When the symbol is renamed, the error occurs. This patch uses the original symbol name (the name in the symbol table) as the input when creating the symbol for the TOC entry.

Reviewed By: shchenz, lkail

Differential Revision: https://reviews.llvm.org/D132348

22 months ago[RISCV] Handle register spill in branch relaxation
ZHU Zijia [Wed, 24 Aug 2022 05:27:56 +0000 (13:27 +0800)]
[RISCV] Handle register spill in branch relaxation

In branch relaxation pass, `j`'s with offset over 1MiB will be relaxed
to `jump` pseudo-instructions.

This patch allocates a stack slot for functions with a size greater than
1MiB. If the register scavenger cannot find a scratch register for
`jump`, spill a register to the slot before the jump and restore it
after the jump.

.mbb:
        foo
        j       .dest_bb
        bar
        bar
        bar
.dest_bb:
        baz

The above code will be relaxed to the following code.

.mbb:
        foo
        sd      s11, 0(sp)
        jump    .restore_bb, s11
        bar
        bar
        bar
        j       .dest_bb
.restore_bb:
        ld      s11, 0(sp)
.dest_bb:
        baz
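
The 1MiB limit comes from the signed 21-bit, 2-byte-aligned immediate of the RISC-V `jal` encoding that `j` expands to. The Python sketch below only models that range check; it is not the branch relaxation pass, and the names `JAL_MIN`, `JAL_MAX` and `needs_relaxation` are made up for this illustration.

```python
# jal encodes a signed 21-bit offset whose lowest bit is always zero,
# giving a reachable range of [-2**20, 2**20 - 2] bytes.
JAL_MIN = -(1 << 20)
JAL_MAX = (1 << 20) - 2


def needs_relaxation(offset: int) -> bool:
    """Return True if a `j` with this byte offset cannot be encoded directly."""
    return not (JAL_MIN <= offset <= JAL_MAX)
```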

Depends on D129999.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D130560

22 months ago[RISCV][TableGen] Mark MachineInstr with FrameIndex as not compressible
ZHU Zijia [Wed, 24 Aug 2022 05:23:38 +0000 (13:23 +0800)]
[RISCV][TableGen] Mark MachineInstr with FrameIndex as not compressible

If a MachineInstr's operand should be Reg in compiler's output but is
currently FrameIndex, `isCompressibleInst()` will terminate at
`MachineOperandType::getReg()`.

This patch adds `.isReg()` checks to make `isCompressibleInst()` return
false for these MachineInstr, allowing `getInstSizeInBytes()` to return
a value and `EstimateFunctionSizeInBytes()` to work as intended.

See https://reviews.llvm.org/D129999#3694222 for details.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D129999

22 months ago[mlir][math] Lower math.floor,ceil to libm
Kai Sasaki [Wed, 24 Aug 2022 01:58:01 +0000 (10:58 +0900)]
[mlir][math] Lower math.floor,ceil to libm

Lower math.floor and math.ceil to libm

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131876

22 months agoReland "[MLIR]Extend vector.gather to support n-D result"
Che-Yu Wu [Wed, 24 Aug 2022 04:12:50 +0000 (04:12 +0000)]
Reland "[MLIR]Extend vector.gather to support n-D result"

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D132507

22 months ago[MSAN] Handle array alloca with non-i64 size specification
Keno Fischer [Wed, 24 Aug 2022 03:23:36 +0000 (03:23 +0000)]
[MSAN] Handle array alloca with non-i64 size specification

The array size specification of an alloca can be of any integer type,
so zext or trunc it to intptr before attempting to multiply it
with an intptr constant.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131846

22 months ago[MSAN] Correct shadow type for atomicrmw instrumentation
Keno Fischer [Wed, 24 Aug 2022 03:23:31 +0000 (03:23 +0000)]
[MSAN] Correct shadow type for atomicrmw instrumentation

We were passing the type of `Val` to `getShadowOriginPtr`, rather
than the type of `Val`'s shadow, resulting in broken IR. The fix
is simple.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131845

22 months ago[Polly] Don't use `llvm-config` anymore (in CMake sad path)
John Ericson [Sat, 20 Aug 2022 21:38:05 +0000 (17:38 -0400)]
[Polly] Don't use `llvm-config` anymore (in CMake sad path)

If `LLVM_BUILD_MAIN_SRC_DIR` is not defined, just assume we are in
regular monorepo layout. Non-standard (and not really supported) layouts
can still be configured manually.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D132314

22 months ago[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit
Bing1 Yu [Wed, 24 Aug 2022 01:41:40 +0000 (09:41 +0800)]
[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141

22 months ago[DAG] MatchRotate - bail if we fail to match a shl/srl pair
Simon Pilgrim [Wed, 24 Aug 2022 02:04:59 +0000 (03:04 +0100)]
[DAG] MatchRotate - bail if we fail to match a shl/srl pair

extractShiftForRotate may fail to return canonicalized shifts due to constant folding or other simplification that can occur in getNode()

Fixes Issue #57283

22 months ago[HLSL] Infer language from file extension
Chris Bieneman [Wed, 24 Aug 2022 01:52:29 +0000 (20:52 -0500)]
[HLSL] Infer language from file extension

This allows the language mode for HLSL to be inferred from the file
extension.

22 months ago[NFC] Fix warning
Chris Bieneman [Wed, 24 Aug 2022 01:49:56 +0000 (20:49 -0500)]
[NFC] Fix warning

This change came in a few hours ago and introduced a warning. The fix
is trivial, so I'm providing it. The original change was reviewed here:

https://reviews.llvm.org/D132331

22 months agoRevert "[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit"
Bing1 Yu [Wed, 24 Aug 2022 01:38:46 +0000 (09:38 +0800)]
Revert "[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit"

This reverts commit 07e34763b02728857e1d6e8ccd2b82820eb3c0cc.

22 months ago[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit
Bing1 Yu [Tue, 23 Aug 2022 08:25:48 +0000 (16:25 +0800)]
[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141