platform/upstream/llvm.git
3 years ago[X86] Don't attempt to fold sub(C1, xor(X, C2)) with opaque constants
Simon Pilgrim [Thu, 11 Mar 2021 11:56:58 +0000 (11:56 +0000)]
[X86] Don't attempt to fold sub(C1, xor(X, C2)) with opaque constants

Fixes PR49451
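
The commit message does not spell out the fold itself. As a rough sketch, assuming the rewrite has the shape sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1 + 1), the underlying two's-complement identity can be checked as follows (the helper names and the 32-bit width are illustrative assumptions, not taken from the patch):

```python
MASK = 0xFFFFFFFF  # assumption: model 32-bit wraparound arithmetic


def sub_of_xor(c1, x, c2):
    """Original form: C1 - (X ^ C2), truncated to 32 bits."""
    return (c1 - (x ^ c2)) & MASK


def add_of_xor(c1, x, c2):
    """Rewritten form: (X ^ ~C2) + (C1 + 1), truncated to 32 bits."""
    return ((x ^ (~c2 & MASK)) + c1 + 1) & MASK


# Check the identity on a handful of boundary values.
samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xDEADBEEF, MASK]
identity_holds = all(
    sub_of_xor(c1, x, c2) == add_of_xor(c1, x, c2)
    for c1 in samples
    for x in samples
    for c2 in samples
)
print(identity_holds)
```

The identity follows from -y = ~y + 1 and ~(X ^ C2) = X ^ ~C2. Such a rewrite needs the actual constant values, which is presumably why it cannot apply to opaque constants.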

3 years ago[mlir] Correct verifyCompatibleShapes
Tres Popp [Wed, 10 Mar 2021 11:55:29 +0000 (12:55 +0100)]
[mlir] Correct verifyCompatibleShapes

verifyCompatibleShapes is not transitive. Create an n-ary version and
update SameOperandShapes and SameOperandAndResultShapes traits to use
it.
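
A minimal sketch of why a pairwise compatibility check is not transitive, assuming shapes are lists whose dynamic dimensions are marked with `None` (the `DYNAMIC` marker and both helper names are hypothetical, and unranked shapes are ignored here):

```python
DYNAMIC = None  # hypothetical marker for a dynamic ("?") dimension


def compatible(a, b):
    """Two ranked shapes are compatible if every dimension pair matches
    or at least one side of the pair is dynamic."""
    if len(a) != len(b):
        return False
    return all(x is DYNAMIC or y is DYNAMIC or x == y for x, y in zip(a, b))


def all_compatible(shapes):
    """N-ary check: every pair of shapes must be compatible."""
    return all(
        compatible(a, b)
        for i, a in enumerate(shapes)
        for b in shapes[i + 1:]
    )


static_a = [3, 2]
partly_dynamic = [DYNAMIC, 2]
static_b = [4, 2]

# Each static shape is compatible with the partly dynamic one...
print(compatible(static_a, partly_dynamic), compatible(partly_dynamic, static_b))
# ...but the two static shapes are not compatible with each other.
print(compatible(static_a, static_b))
```

Checking only neighbouring operands would accept `[3, 2]`, `[?, 2]`, `[4, 2]`; an n-ary check over the whole list rejects it. The pairwise loop is just one way to write such a check and is not claimed to match the patch.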

Differential Revision: https://reviews.llvm.org/D98331

3 years ago[clangd] Group filename calculations in SymbolCollector, and cache more.
Sam McCall [Wed, 10 Mar 2021 21:57:06 +0000 (22:57 +0100)]
[clangd] Group filename calculations in SymbolCollector, and cache more.

Also give CanonicalIncludes a less powerful interface (canonicalizes
symbols vs headers separately) so we can cache its results better.

Prior to this:
 - path->uri conversions were not consistently cached; caching is
   particularly cheap when we start from a FileEntry* (which we often can)
 - only a small fraction of header-to-include calculation was cached

This is a significant speedup at least for dynamic indexing of preambles.
On my machine, opening XRefs.cpp:

```
PreambleCallback 1.208 -> 1.019 (-15.7%)
BuildPreamble    5.538 -> 5.214 (-5.8%)
```

Differential Revision: https://reviews.llvm.org/D98371

3 years ago[Statepoint Lowering] Handle the case with several gc.result
Serguei Katkov [Thu, 11 Mar 2021 05:53:45 +0000 (12:53 +0700)]
[Statepoint Lowering] Handle the case with several gc.result

Recently gc.result has been marked readnone instead of readonly, which
opens the door for various optimizations to duplicate gc.result.
Statepoint lowering is not ready to see several gc.results.
The problem appears when one gc.result is located in the same basic
block and another is located in a different basic block.
In this case we need to both export the VR and fill the local setValue.

Note that this case indicates insufficient optimization before CodeGen:
the local gc.result evidently dominates all other gc.results, which is
something GVN and EarlyCSE handle.

Even so, the backend should not crash on valid IR, even if that IR is not optimal.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98393

3 years ago[mlir] Fix invalid hoisting of dependent allocs in buffer hoisting pass.
Julian Gross [Tue, 9 Mar 2021 11:58:21 +0000 (12:58 +0100)]
[mlir] Fix invalid hoisting of dependent allocs in buffer hoisting pass.

Buffer hoisting moved allocs upwards even when they had dependencies within
their nested region. This patch fixes this issue.

https://bugs.llvm.org/show_bug.cgi?id=49142

Differential Revision: https://reviews.llvm.org/D98248

3 years agoFix MSVC "'type cast': conversion from 'unsigned int' to 'const llvm::CallBase *...
Simon Pilgrim [Thu, 11 Mar 2021 10:40:11 +0000 (10:40 +0000)]
Fix MSVC "'type cast': conversion from 'unsigned int' to 'const llvm::CallBase *' of greater size" warning. NFCI.

3 years ago[FileCheck] Fix naming of OverflowErrorStr var
Thomas Preud'homme [Wed, 10 Mar 2021 13:43:20 +0000 (13:43 +0000)]
[FileCheck] Fix naming of OverflowErrorStr var

As pointed out by Joel E. Denny in D97845, the OverflowErrorStr variable
is misnamed because the error is raised for any parsing error. Note that
in FileCheck proper this only happens in case of (under|over)flow
because the regex will ensure a number in the correct format is matched.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D98342

3 years ago[IPO] Fix EXPENSIVE_CHECKS assert added at D83744. NFCI.
Simon Pilgrim [Thu, 11 Mar 2021 10:29:02 +0000 (10:29 +0000)]
[IPO] Fix EXPENSIVE_CHECKS assert added at D83744. NFCI.

It wasn't taking into account that QueryingAA was a pointer.

3 years agoFix MSVC "result of 32-bit shift implicitly converted to 64 bits" warnings. NFCI.
Simon Pilgrim [Thu, 11 Mar 2021 10:08:20 +0000 (10:08 +0000)]
Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warnings. NFCI.

3 years ago[clang][ARM] Refactor ComputeLLVMTriple code for ARM
David Spickett [Tue, 9 Mar 2021 12:06:59 +0000 (12:06 +0000)]
[clang][ARM] Refactor ComputeLLVMTriple code for ARM

This moves code that sets the architecture name
and Float ABI into two new functions in
ToolChains/Arch/ARM.cpp, greatly simplifying ComputeLLVMTriple.

It also does some light refactoring in setArchNameInTriple to
move local variables closer to their first use.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D98253

3 years ago[OpenCL] Add missing atomic_xchg overload
Sven van Haastregt [Thu, 11 Mar 2021 10:20:29 +0000 (10:20 +0000)]
[OpenCL] Add missing atomic_xchg overload

3 years ago[MCA] Support in-order CPUs with MicroOpBufferSize=1
Jay Foad [Tue, 9 Mar 2021 16:12:36 +0000 (16:12 +0000)]
[MCA] Support in-order CPUs with MicroOpBufferSize=1

Differential Revision: https://reviews.llvm.org/D98356

3 years agoReapply [LICM] Make promotion faster
Nikita Popov [Tue, 9 Mar 2021 11:07:55 +0000 (12:07 +0100)]
Reapply [LICM] Make promotion faster

Relative to the previous implementation, this always uses
aliasesUnknownInst() instead of aliasesPointer() to correctly
handle atomics. The added test case was previously miscompiled.

-----

Even when MemorySSA-based LICM is used, an AST is still populated
for scalar promotion. As the AST has quadratic complexity, a lot
of time is spent in this step despite the existing access count
limit. This patch optimizes the identification of promotable stores.

The idea here is pretty simple: We're only interested in must-alias
mod sets of loop invariant pointers. As such, only populate the AST
with loop-invariant loads and stores (anything else is definitely
not promotable) and then discard any sets which alias with any of
the remaining, definitely non-promotable accesses.

If we promoted something, check whether this has made some other
accesses loop invariant and thus possible promotion candidates.

This is much faster in practice, because we need to perform AA
queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
instead of O(NumTotal^2), and NumPromotable tends to be small.
Additionally, promotable accesses have loop invariant pointers,
for which AA is cheaper.
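
To make the complexity argument concrete, here is a small arithmetic sketch. The function names and the sample counts are illustrative assumptions; the formulas are the asymptotic bounds quoted above, not exact query counts:

```python
def queries_before(num_total):
    """Bound for the old scheme: O(NumTotal^2)."""
    return num_total ** 2


def queries_after(num_promotable, num_non_promotable):
    """Bound for the new scheme:
    O(NumPromotable^2 + NumPromotable * NumNonPromotable)."""
    return num_promotable ** 2 + num_promotable * num_non_promotable


# Hypothetical loop: 1000 memory accesses, 10 of them promotable candidates.
total, promotable = 1000, 10
before = queries_before(total)
after = queries_after(promotable, total - promotable)
print(before, after, before // after)
```

With these made-up numbers the bound drops from 1,000,000 to 10,000, a factor of 100, which matches the intuition that NumPromotable tends to be small.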

This has a significant positive compile-time impact. We save ~1.8%
geomean on CTMark at O3, with 6% on lencod in particular and 25%
on individual files.

Conceptually, this change is NFC, but may not be so in practice,
because the AST is only an approximation, and can produce
different results depending on the order in which accesses are
added. However, there is at least no impact on the number of promotions
(licm.NumPromoted) in test-suite O3 configuration with this change.

Differential Revision: https://reviews.llvm.org/D89264

3 years ago[lldb] Remove implicit_const_form_support.test
Pavel Labath [Tue, 9 Mar 2021 15:27:08 +0000 (16:27 +0100)]
[lldb] Remove implicit_const_form_support.test

It is superseded by dwarf5-implicit-const.s (added in D98197), which tests it more thoroughly.

3 years agoSave and restore previous terminal after setting the terminal for checking if termina...
Augusto Noronha [Thu, 11 Mar 2021 09:17:21 +0000 (10:17 +0100)]
Save and restore previous terminal after setting the terminal for checking if terminal supports colors.

The call to "set_curterm" inside the "terminalHasColors" function breaks
the EditLine configuration on some Linux distributions, causing certain
characters that have functions bound to them to not show up and
backspace to stop deleting characters (only visually). This patch
ensures that term struct is restored after the routine for checking if
terminal supports colors is done, which fixes the aforementioned issue.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D95230

3 years ago[mlir] Change test-gpu-to-cubin to derive from SerializeToBlobPass
Christian Sigg [Wed, 10 Mar 2021 16:52:58 +0000 (17:52 +0100)]
[mlir] Change test-gpu-to-cubin to derive from SerializeToBlobPass

Clean-up after D98279, remove one call to createConvertGPUKernelToBlobPass().

Depends On D98203

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D98360

3 years ago[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC)
Nikita Popov [Sat, 13 Feb 2021 16:43:17 +0000 (17:43 +0100)]
[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC)

These are incompatible with opaque pointers. This is in preparation
of dropping this API on the IRBuilder side as well.

Instead explicitly pass the loaded type.

3 years ago[AArch64][compiler-rt] Fix PAC instructions for older compilers
Oliver Stannard [Thu, 11 Mar 2021 09:16:30 +0000 (09:16 +0000)]
[AArch64][compiler-rt] Fix PAC instructions for older compilers

The paciasp and autiasp instructions are only accepted by recent
compilers, but have the same encoding as hint instructions, so we can
use the hint mnemonic to support older compilers.
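
As a sketch of the encoding relationship, the A64 HINT instruction places its 7-bit immediate in bits 11:5 of the base encoding 0xD503201F (which is NOP, i.e. HINT #0). The helper name below is an assumption for illustration; the immediates 25 and 29 are the architectural hint numbers for paciasp and autiasp:

```python
HINT_BASE = 0xD503201F  # A64 encoding of HINT #0 (NOP)


def hint_encoding(imm):
    """Return the 32-bit A64 encoding of `hint #imm` (imm is 7 bits wide)."""
    if not 0 <= imm < 128:
        raise ValueError("hint immediate must fit in 7 bits")
    return HINT_BASE | (imm << 5)


PACIASP = 0xD503233F  # documented encoding of paciasp
AUTIASP = 0xD50323BF  # documented encoding of autiasp

print(hex(hint_encoding(25)), hex(hint_encoding(29)))
```

So an assembler that rejects `paciasp` can still emit the same bytes via `hint #25`, and likewise `hint #29` for `autiasp`.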

3 years ago[Debugify][OriginalDIMode] Export the report into JSON file
Djordje Todorovic [Mon, 22 Feb 2021 14:01:54 +0000 (06:01 -0800)]
[Debugify][OriginalDIMode] Export the report into JSON file

Using the original-di check with debugify in combination with
llvm/utils/llvm-original-di-preservation.py makes for a very
user-friendly tool. An example of the HTML page with the issues
related to debug info can be found at [0].

[0] https://djolertrk.github.io/di-checker-html-report-example/

Differential Revision: https://reviews.llvm.org/D82546

3 years ago[MLIR] Add canonicalization for `shape.is_broadcastable`
Frederik Gossen [Thu, 11 Mar 2021 09:09:26 +0000 (10:09 +0100)]
[MLIR] Add canonicalization for `shape.is_broadcastable`

Canonicalize `is_broadcastable` to constant true if fewer than 2 unique shape
operands. Eliminate redundant operands, otherwise.
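
The rule can be sketched roughly as follows, assuming operands are represented by their SSA value names as strings. The function name and the tuple used to stand in for the rebuilt op are hypothetical, not MLIR API:

```python
def canonicalize_is_broadcastable(operands):
    """Sketch of the canonicalization on a list of operand names.

    Returns True (a constant) when fewer than two unique operands remain,
    otherwise a stand-in for the op rebuilt with duplicates removed.
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(operands))
    if len(unique) < 2:
        return True
    return ("shape.is_broadcastable", unique)


print(canonicalize_is_broadcastable(["%a", "%a"]))
print(canonicalize_is_broadcastable(["%a", "%b", "%a"]))
```

A shape is always broadcastable with itself, which is why a single unique operand folds to constant true.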

Differential Revision: https://reviews.llvm.org/D98361

3 years ago[mlir] Add NVVM to CUBIN conversion to mlir-opt
Christian Sigg [Thu, 11 Mar 2021 08:56:00 +0000 (09:56 +0100)]
[mlir] Add NVVM to CUBIN conversion to mlir-opt

If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt.

The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner.

Depends On D98279

Reviewed By: herhut, rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D98203

3 years agoFix unused lambda capture in a non-asserts build
David Blaikie [Thu, 11 Mar 2021 08:21:02 +0000 (00:21 -0800)]
Fix unused lambda capture in a non-asserts build

For locally scoped lambdas like this there's no particular benefit to
explicitly listing captures - or avoiding capturing this. Switch to [&]
and make it all easier to maintain.

(& driveby change std::function to llvm::function_ref)

3 years ago[SEH] Fix capture of this in lambda functions
Olivier Goffart [Mon, 1 Mar 2021 14:17:43 +0000 (15:17 +0100)]
[SEH] Fix capture of this in lambda functions

Commit 1b04bdc2f3ffaa7a0e1e3dbdc3a0cd08f0b9a4ce added support for
capturing the 'this' pointer in a SEH context (__finally or __except),
but the case in which the 'this' pointer is part of a lambda capture
was not handled properly.

Differential Revision: https://reviews.llvm.org/D97687

3 years ago[sanitizer] Change NanoTime to use clock_gettime on non-glibc
Fangrui Song [Thu, 11 Mar 2021 07:02:51 +0000 (23:02 -0800)]
[sanitizer] Change NanoTime to use clock_gettime on non-glibc

This avoids the `__NR_gettimeofday` syscall number, which does not exist on 32-bit musl (it has `__NR_gettimeofday_time32`).

This switched Android to `clock_gettime` as well, which should work according to the old code before D96925.

Tested on Alpine Linux x86-64 (musl) and FreeBSD x86-64.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D98121

3 years ago[InstrProfiling] Don't generate __llvm_profile_runtime_user
Petr Hosek [Wed, 10 Mar 2021 09:09:29 +0000 (01:09 -0800)]
[InstrProfiling] Don't generate __llvm_profile_runtime_user

This is no longer needed; we can add __llvm_profile_runtime directly
to llvm.compiler.used or llvm.used to achieve the same effect.

Differential Revision: https://reviews.llvm.org/D98325

3 years ago[tsan] Fix aarch64-*-linux after D86377
Fangrui Song [Thu, 11 Mar 2021 06:16:03 +0000 (22:16 -0800)]
[tsan] Fix aarch64-*-linux after D86377

All check-tsan tests fail on aarch64-*-linux because HeapMemEnd() > ShadowBeg()
for the following code path:
```
 #if defined(__aarch64__) && !HAS_48_BIT_ADDRESS_SPACE
   ProtectRange(HeapMemEnd(), ShadowBeg());
```

Restore the behavior before D86377 for aarch64-*-linux.

3 years agoRename top-level LICENSE.txt files to LICENSE.TXT
Leonard Chan [Thu, 11 Mar 2021 05:25:15 +0000 (21:25 -0800)]
Rename top-level LICENSE.txt files to LICENSE.TXT

This makes all the license filenames uniform across subprojects.

Differential Revision: https://reviews.llvm.org/D98380

3 years ago[RISCV] Merge fixed-vectors-int-splat-rv32.ll and fixed-vectors-int-splat-rv64.ll.
Craig Topper [Thu, 11 Mar 2021 04:11:21 +0000 (20:11 -0800)]
[RISCV] Merge fixed-vectors-int-splat-rv32.ll and fixed-vectors-int-splat-rv64.ll.

The vXi64 test cases no longer crash on rv32.

3 years ago[mlir][AVX512] Implement sparse vector dot product integration test.
Matthias Springer [Thu, 11 Mar 2021 03:57:28 +0000 (12:57 +0900)]
[mlir][AVX512] Implement sparse vector dot product integration test.

This test operates on two hardware-vector-sized vectors and utilizes vp2intersect and mask.compress.

PHAB_REVIEW=D98099

3 years ago[RISCV] Add additional checking to tablegen RISCVVEmitter requested in D95016.
Craig Topper [Thu, 11 Mar 2021 03:46:23 +0000 (19:46 -0800)]
[RISCV] Add additional checking to tablegen RISCVVEmitter requested in D95016.

This errors, but doesn't give source location. We'd need to pass
the Record through several layers to get to the location.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D98379

3 years ago[RISCV] Add v2i64 _vi_ and _iv_ test cases to fixed-vectors-int.ll since we no longer...
Craig Topper [Thu, 11 Mar 2021 03:09:16 +0000 (19:09 -0800)]
[RISCV] Add v2i64 _vi_ and _iv_ test cases to fixed-vectors-int.ll since we no longer crash.

I think we were missing some build_vector or other support and
skipped these test cases. They work now but don't generate
optimal code.

3 years agoRevert "WIP"
David Blaikie [Thu, 11 Mar 2021 03:17:14 +0000 (19:17 -0800)]
Revert "WIP"

Accidental commit.

This reverts commit 60238f29bf489dea7fabbcd5c69753d60a562b36.

3 years agoWIP
David Blaikie [Thu, 11 Mar 2021 03:01:31 +0000 (19:01 -0800)]
WIP

3 years agoResolve unused variable warning (NFC)
Juneyoung Lee [Thu, 11 Mar 2021 03:03:03 +0000 (12:03 +0900)]
Resolve unused variable warning (NFC)

3 years ago[gn build] (manually) Port d6a0560bf258
Nico Weber [Thu, 11 Mar 2021 02:56:49 +0000 (21:56 -0500)]
[gn build] (manually) Port d6a0560bf258

3 years ago[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Zakk Chen [Fri, 5 Mar 2021 15:40:28 +0000 (07:40 -0800)]
[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.

Demonstrate how to generate vadd/vfadd intrinsic functions

1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, based on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
It generates the overloaded version of the header for the generic API.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think they demonstrate how type transformers work, and we need
them for the whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D95016

3 years ago[llvm] Fix thinko in getVendorSignature(), where expected values of ECX and EDX...
Vy Nguyen [Wed, 10 Mar 2021 07:41:58 +0000 (02:41 -0500)]
[llvm] Fix thinko in getVendorSignature(), where expected values of ECX and EDX were flipped for the AMD case.

Follow up to D97504

Differential Revision: https://reviews.llvm.org/D98322
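
For context, CPUID leaf 0 returns the 12-byte vendor string in the register order EBX, EDX, ECX, so the expected ECX and EDX values are easy to swap. A small sketch (the helper name is an assumption; the register constants are the well-known little-endian encodings of "AuthenticAMD"):

```python
import struct


def vendor_string(ebx, ecx, edx):
    """Assemble the CPUID leaf-0 vendor string.

    The 12 bytes come from EBX, then EDX, then ECX, each little-endian.
    """
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# "Auth" / "enti" / "cAMD" as little-endian 32-bit values.
AMD_EBX, AMD_EDX, AMD_ECX = 0x68747541, 0x69746E65, 0x444D4163

print(vendor_string(AMD_EBX, AMD_ECX, AMD_EDX))
# Swapping ECX and EDX scrambles the string.
print(vendor_string(AMD_EBX, AMD_EDX, AMD_ECX))
```

With the correct order this yields the AMD vendor string; with ECX and EDX flipped the middle and last four characters trade places.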

3 years ago[InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC)
Juneyoung Lee [Thu, 11 Mar 2021 02:12:31 +0000 (11:12 +0900)]
[InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC)

3 years ago[AMDGPU] Always create Stack Object for reserved VGPR
Ruiling Song [Wed, 10 Mar 2021 03:04:54 +0000 (11:04 +0800)]
[AMDGPU] Always create Stack Object for reserved VGPR

As we may overwrite inactive lanes of a caller-save-vgpr, we should
always save/restore the reserved vgpr for sgpr spill.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D98319

3 years ago[dfsan] Update atomics.ll test
George Balatsouras [Thu, 11 Mar 2021 00:01:43 +0000 (16:01 -0800)]
[dfsan] Update atomics.ll test

Remove hard-coded shadow width references and remove irrelevant instructions.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D98376

3 years ago[ValueMapper] Add debug output for metadata remapping
Ruiling Song [Mon, 1 Feb 2021 09:24:12 +0000 (17:24 +0800)]
[ValueMapper] Add debug output for metadata remapping

This is useful for debugging which pointers are updated during the remapping
process.

Differential Revision: https://reviews.llvm.org/D95775

3 years ago[mir] Change 'undef' for MMO base addresses to 'unknown-address'
Daniel Sanders [Sat, 6 Mar 2021 04:47:09 +0000 (20:47 -0800)]
[mir] Change 'undef' for MMO base addresses to 'unknown-address'

Differential Revision: https://reviews.llvm.org/D98100

3 years ago[mlir] Optimize the implementation of RegionDCE
River Riddle [Thu, 11 Mar 2021 00:13:25 +0000 (16:13 -0800)]
[mlir] Optimize the implementation of RegionDCE

The current implementation has some inefficiencies that become noticeable when running on large modules. This revision optimizes the code, and updates some out-dated idioms with newer utilities. The main components of this optimization include:

* Add an overload of Block::eraseArguments that allows for O(N) erasure of disjoint arguments.
* Don't process entry block arguments given that we don't erase them at this point.
* Don't track individual operation results, given that we don't erase them. We can just track the parent operation.
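
The first bullet can be illustrated with a sketch of single-pass erasure. This is Python rather than the C++ of Block::eraseArguments, and the function name and list representation are assumptions for illustration only:

```python
def erase_arguments(args, indices_to_erase):
    """Erase a disjoint set of argument indices in one O(N) pass.

    Erasing one index at a time shifts the tail of the list on every call,
    which is quadratic in the worst case; a single filtering pass is not.
    """
    doomed = set(indices_to_erase)
    return [arg for i, arg in enumerate(args) if i not in doomed]


block_args = ["%arg0", "%arg1", "%arg2", "%arg3", "%arg4"]
print(erase_arguments(block_args, [1, 3]))
```

The set lookup makes each membership test constant time, so the cost is linear in the number of arguments regardless of how many are erased.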

Differential Revision: https://reviews.llvm.org/D98309

3 years ago[clang][Driver] Expose -fexperimental-relative-c++-abi-vtables flag
Leonard Chan [Thu, 11 Mar 2021 00:27:26 +0000 (16:27 -0800)]
[clang][Driver] Expose -fexperimental-relative-c++-abi-vtables flag

Initially, this flag was meant to only be used through cc1 and not directly
through the clang driver. However, we accidentally ended up using this flag
as a driver flag already for selecting multilibs within the fuchsia toolchain.
We're currently in an awkward state where it's only accepted as a driver flag
when targeting Fuchsia, and in all other instances it can only be added via
-Xclang. Since we're ready to use this in Fuchsia, we can just expose this to
the driver for simplicity.

Differential Revision: https://reviews.llvm.org/D98375

3 years ago[gn build] Port 4f16e177e104
LLVM GN Syncbot [Wed, 10 Mar 2021 23:36:48 +0000 (23:36 +0000)]
[gn build] Port 4f16e177e104

3 years agoRevert "[AST] Add generator for source location introspection"
Stephen Kelly [Wed, 10 Mar 2021 23:35:50 +0000 (23:35 +0000)]
Revert "[AST] Add generator for source location introspection"

This reverts commit d627a27d264b47eda3f15f086ff419dfe053ebf7.

This fails to link on Windows somehow.

3 years agoRevert "Workaround a -Wmisleading-indentation warning"
Stephen Kelly [Wed, 10 Mar 2021 23:35:41 +0000 (23:35 +0000)]
Revert "Workaround a -Wmisleading-indentation warning"

This reverts commit 5c22e2bec008760cc7078d8d14382ef4762c5d54.

3 years agoRe-land "[PDB] Defer relocating .debug$S until commit time and parallelize it"
Reid Kleckner [Wed, 10 Mar 2021 22:51:52 +0000 (14:51 -0800)]
Re-land "[PDB] Defer relocating .debug$S until commit time and parallelize it"

This reverts commit bacf9cf2c5cdec3567580e5030c4c82f42b3d745 and
reinstates commit 1a9bd5b81328adf0dd5a8b4f3ad5949463e66da3.

Reverting this commit did not appear to make the problem go away, so we
can go ahead and reland it.

3 years agoWorkaround a -Wmisleading-indentation warning
Stephen Kelly [Wed, 10 Mar 2021 23:11:35 +0000 (23:11 +0000)]
Workaround a -Wmisleading-indentation warning

Because the generated code is not formatted, it can cause warnings.

3 years agoRevert "Replace func name with regex in update_cc_test_checks"
Giorgis Georgakoudis [Wed, 10 Mar 2021 22:56:49 +0000 (14:56 -0800)]
Revert "Replace func name with regex in update_cc_test_checks"

This reverts commit bf58d6a1f92244c797a280d318a56d7d3fc4a704.

Breaks tests, fix

3 years agoUpdate __is_unsigned builtin to match the Standard.
zoecarver [Wed, 10 Mar 2021 22:59:38 +0000 (14:59 -0800)]
Update __is_unsigned builtin to match the Standard.

Updates __is_unsigned to have the same behavior as the standard
specifies. This is in line with 511dbd8, which applied the same change
to __is_signed.

Refs D67897.

Differential Revision: https://reviews.llvm.org/D98104

3 years ago[mlir] Add polynomial approximation for math::Log2
Emilio Cota [Wed, 10 Mar 2021 19:29:26 +0000 (11:29 -0800)]
[mlir] Add polynomial approximation for math::Log2

```
name                     old cpu/op  new cpu/op  delta
BM_mlir_Log2_f32/10       134ns ±15%    45ns ± 4%  -66.39%  (p=0.000 n=20+17)
BM_mlir_Log2_f32/100     1.03µs ±16%  0.12µs ±10%  -88.78%  (p=0.000 n=20+18)
BM_mlir_Log2_f32/1k      10.3µs ±16%   0.7µs ± 5%  -93.24%  (p=0.000 n=20+17)
BM_mlir_Log2_f32/10k      104µs ±15%     7µs ±14%  -93.25%  (p=0.000 n=20+20)
BM_eigen_s_Log2_f32/10   95.3ns ±17%  90.9ns ± 6%     ~     (p=0.228 n=20+18)
BM_eigen_s_Log2_f32/100   907ns ± 3%   911ns ± 6%     ~     (p=0.539 n=16+20)
BM_eigen_s_Log2_f32/1k   9.88µs ± 4%  9.85µs ± 3%     ~     (p=0.790 n=16+17)
BM_eigen_s_Log2_f32/10k   105µs ±10%   110µs ±16%     ~     (p=0.459 n=16+20)
BM_eigen_v_Log2_f32/10   32.5ns ±31%  33.9ns ±14%   +4.31%  (p=0.028 n=17+20)
BM_eigen_v_Log2_f32/100   176ns ± 8%   180ns ± 7%   +2.19%  (p=0.045 n=16+17)
BM_eigen_v_Log2_f32/1k   1.44µs ± 4%  1.50µs ± 9%   +3.91%  (p=0.001 n=16+17)
BM_eigen_v_Log2_f32/10k  14.5µs ±10%  15.0µs ± 8%   +3.92%  (p=0.002 n=16+19)
```
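
The commit message gives only benchmarks, so the following is a generic sketch of the technique, not the polynomial or range reduction used in the patch: split x into exponent and mantissa, then approximate the logarithm of the mantissa with a short polynomial. The series, its degree, and the function name are all assumptions.

```python
import math

SQRT_HALF = math.sqrt(0.5)


def approx_log2(x):
    """Approximate log2(x) for x > 0 via range reduction plus a polynomial."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    m, e = math.frexp(x)  # x == m * 2**e with m in [0.5, 1)
    if m < SQRT_HALF:
        # Re-centre the mantissa around 1: m in [sqrt(1/2), sqrt(2)).
        m *= 2.0
        e -= 1
    s = (m - 1.0) / (m + 1.0)
    s2 = s * s
    # ln(m) = 2 * atanh(s) = 2 * (s + s^3/3 + s^5/5 + ...), truncated at s^9.
    poly = 1.0 + s2 * (1.0 / 3 + s2 * (1.0 / 5 + s2 * (1.0 / 7 + s2 / 9)))
    return e + 2.0 * s * poly / math.log(2.0)


for value in (0.1, 1.0, 8.0, 1000.0):
    print(value, approx_log2(value), math.log2(value))
```

Because the reduced mantissa stays close to 1, |s| stays below about 0.18 and a handful of terms already gives an error far below single-precision resolution.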

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D98282

3 years agoRevert "[cmake] Enable -Werror=return-type"
Dave Lee [Wed, 10 Mar 2021 22:46:52 +0000 (14:46 -0800)]
Revert "[cmake] Enable -Werror=return-type"

This reverts commit ce94a161651d0edd313d0fa65571eb53d3a34d13.

3 years ago[AST] Add generator for source location introspection
Stephen Kelly [Sat, 12 Dec 2020 13:17:49 +0000 (13:17 +0000)]
[AST] Add generator for source location introspection

Generate a json file containing descriptions of AST classes and their
public accessors which return SourceLocation or SourceRange.

Use the JSON file to generate a C++ API and implementation for accessing
the source locations and method names for accessing them for a given AST
node.

This new API can be used to implement 'srcloc' output in clang-query:

  http://ce.steveire.com/z/m_kTIo

In this first version of this feature, only the accessors for Stmt
classes are generated, not Decls, TypeLocs etc.  Those can be added
after this change is reviewed, as this change is mostly about
infrastructure of these code generators.

Differential Revision: https://reviews.llvm.org/D93164

3 years ago[nfc] [lldb] Remove variable ranges_base in DWARFUnit::AddUnitDIE
Jan Kratochvil [Wed, 10 Mar 2021 22:31:05 +0000 (23:31 +0100)]
[nfc] [lldb] Remove variable ranges_base in DWARFUnit::AddUnitDIE

3 years agoAdd noreturn attribute to non-returning functions
Aditya Kumar [Tue, 23 Feb 2021 19:17:29 +0000 (11:17 -0800)]
Add noreturn attribute to non-returning functions

Differential Revision: https://reviews.llvm.org/D97308

3 years agollvm-lto: default Relocation Model should be selected by the TargetMachine.
Wael Yehia [Wed, 10 Mar 2021 22:20:09 +0000 (17:20 -0500)]
llvm-lto: default Relocation Model should be selected by the TargetMachine.

Right now, the createTargetMachine function in LTOBackend.cpp (used by llvm-lto, and other components) selects the default Relocation Model when none is specified in the module.
Other components (such as opt and llc) that construct a TargetMachine delegate the decision on the default value to the polymorphic TargetMachine's constructor.

This commit aligns llvm-lto with other components.

Reviewed By: daltenty, fhahn

Differential Revision: https://reviews.llvm.org/D97507

3 years ago[AArch64] Extend vecreduce -> udot handling to mla reductions
David Green [Wed, 10 Mar 2021 22:25:12 +0000 (22:25 +0000)]
[AArch64] Extend vecreduce -> udot handling to mla reductions

We previously had lowering for:
  vecreduce.add(zext(X)) to vecreduce.add(UDOT(zero, X, one))
This extends that to also handle:
  vecreduce.add(mul(zext(X), zext(Y))) to vecreduce.add(UDOT(zero, X, Y))
It extends the existing code to optionally handle a mul with equal
extends.
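
A sketch of why the two forms agree, modelling UDOT as accumulating four unsigned 8-bit products into each 32-bit lane. The Python function names are illustrative assumptions, and this models the arithmetic only, not the DAG combine:

```python
MASK32 = 0xFFFFFFFF


def udot(acc, x, y):
    """Model of UDOT: each 32-bit lane accumulates four u8*u8 products."""
    out = []
    for lane, start in enumerate(acc):
        total = start
        for k in range(4):
            total += x[4 * lane + k] * y[4 * lane + k]
        out.append(total & MASK32)
    return out


def vecreduce_add(lanes):
    """Model of vecreduce.add over 32-bit lanes."""
    return sum(lanes) & MASK32


def reduce_mul_zext(x, y):
    """Reference: vecreduce.add(mul(zext(X), zext(Y)))."""
    return sum(a * b for a, b in zip(x, y)) & MASK32


x = list(range(16))  # sixteen u8 elements
y = [255] * 16
zero = [0, 0, 0, 0]
print(reduce_mul_zext(x, y), vecreduce_add(udot(zero, x, y)))
```

Passing a vector of ones as `y` recovers the earlier vecreduce.add(zext(X)) case.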

Differential Revision: https://reviews.llvm.org/D97280

3 years ago[Attributor] Attributor call site specific AAValueConstantRange
kuterd [Sun, 24 Jan 2021 14:04:22 +0000 (17:04 +0300)]
[Attributor] Attributor call site specific AAValueConstantRange

This patch makes use of the context bridges introduced in D83299 to make
AAValueConstantRange call site specific.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D83744

3 years ago[lldb] Ignore linkage diagnostic for LLDBSwigPythonBreakpointCallbackFunction (NFC)
Dave Lee [Wed, 10 Mar 2021 20:48:05 +0000 (12:48 -0800)]
[lldb] Ignore linkage diagnostic for LLDBSwigPythonBreakpointCallbackFunction (NFC)

Ignore `-Wreturn-type-c-linkage` diagnostics for `LLDBSwigPythonBreakpointCallbackFunction`.

The function is defined in `python-wrapper.swig` which uses `extern "C" { ... }` blocks.
The declaration of this function in `ScriptInterpreterPython.cpp` already uses these
same pragmas to silence the warning there.

This prevents `-Werror` builds from failing.

Differential Revision: https://reviews.llvm.org/D98368

3 years ago[lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC
Vedant Kumar [Tue, 9 Mar 2021 18:12:18 +0000 (10:12 -0800)]
[lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC

GetXcodeSDK() consistently takes over 1 second to complete if the
queried SDK is missing, because `xcrun` doesn't cache negative lookups.

Because there are multiple simulator platforms, this can add 4+ seconds
to `lldb -b some_object_file.o`.

To work around this, skip the call to GetXcodeSDK() when setting up
simulator platforms if the specified arch doesn't have what looks like a
simulator triple.

Some other ways to fix this:
- Fix caching in xcrun (rdar://74882205)
- Test for arch compat before calling SomePlatform::CreateInstance() (much
  larger change)

Differential Revision: https://reviews.llvm.org/D98272

3 years ago[flang][driver] Formatting OpenMP sema check as per clang-format
Arnamoy Bhattacharyya [Wed, 10 Mar 2021 21:47:56 +0000 (16:47 -0500)]
[flang][driver] Formatting OpenMP sema check as per clang-format

3 years ago[NFC] Fix a compiler warning
Quentin Colombet [Wed, 10 Mar 2021 21:28:53 +0000 (13:28 -0800)]
[NFC] Fix a compiler warning

Fix a warning caused by -Wrange-loop-analysis

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D98297

3 years ago[AArch64] Extend vecreduce -> udot handling to v8i8
David Green [Wed, 10 Mar 2021 21:03:15 +0000 (21:03 +0000)]
[AArch64] Extend vecreduce -> udot handling to v8i8

https://reviews.llvm.org/D88577 added v16i8 vecreduce to udot/sdot
lowering. This extends that to v8i8 too, generalizing the pattern to
handle the extra types.

Differential Revision: https://reviews.llvm.org/D97279

3 years ago[VPlan] Support to widen select instructions in VPlan native path
Mauri Mustonen [Wed, 10 Mar 2021 20:22:16 +0000 (20:22 +0000)]
[VPlan] Support to widen select instructions in VPlan native path

Add support to widen select instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer.

Previously, select instructions were handled by the wrong recipe, which resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D97136

3 years agoReplace func name with regex in update_cc_test_checks
Giorgis Georgakoudis [Sat, 20 Feb 2021 04:22:14 +0000 (20:22 -0800)]
Replace func name with regex in update_cc_test_checks

The patch adds an argument to update_cc_test_checks for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes. Example:

The function signature for the following function:

`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`

with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:

`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
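
A minimal sketch that reproduces the example above, assuming the pattern contains exactly one capture group written literally as `(.*)`. The helper name is hypothetical and this is not the script's actual implementation:

```python
import re


def generalize_function_name(func_name, pattern):
    """Keep the regex prefix and substitute the captured text for the group.

    Returns the FileCheck-style `{{...}}` regex, or the name unchanged
    when the pattern does not match.
    """
    match = re.fullmatch(pattern, func_name)
    if match is None:
        return func_name
    # Assumption: the only group is the literal text "(.*)" in the pattern.
    generalized = pattern.replace("(.*)", match.group(1), 1)
    return "{{" + generalized + "}}"


name = "__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker"
regex = "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"
print("CHECK-LABEL: @" + generalize_function_name(name, regex) + "(")
```

The hash-like parts of the name stay as regex fragments, while the stable mangled suffix is matched literally. A real implementation would also need to escape regex metacharacters in the captured text.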

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97107

3 years ago[mlir] Remove unnecessary copying of pass options
Christian Sigg [Wed, 10 Mar 2021 20:32:52 +0000 (21:32 +0100)]
[mlir] Remove unnecessary copying of pass options

I missed a comment in D98279 that you don't need to copy pass options.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D98366

3 years ago[llvm-objcopy][NFC] replace class Buffer/MemBuffer/FileBuffer with streams.
Alexey Lapshin [Sat, 24 Oct 2020 14:35:55 +0000 (17:35 +0300)]
[llvm-objcopy][NFC] replace class Buffer/MemBuffer/FileBuffer with streams.

During D88827 it was requested to remove the local implementation
of Memory/File Buffers:

// TODO: refactor the buffer classes in LLVM to enable us to use them here
// directly.

This patch uses raw_ostream instead of Buffers. Generally, using streams
could allow us to reduce memory usages. No need to load all data into the
memory - the data could be streamed through a smaller buffer.
Thus, this patch uses raw_ostream as an interface for output data:

Error executeObjcopyOnBinary(CopyConfig &Config,
                             object::Binary &In,
                             raw_ostream &Out);

Note 1. This patch does not change the implementation of Writers
so that data would be directly stored into raw_ostream.
This is assumed to be done later.

Note 2. It would be better if Writers were implemented in such a way
that data could be streamed without seeking/updating. If that would be
inconvenient then raw_ostream could be replaced with raw_pwrite_stream
to have a possibility to seek back and update file headers.
This is assumed to be done later if necessary.

Note 3. Current FileOutputBuffer allows using a memory-mapped file.
The raw_fd_ostream (which could be used if data should be stored in the file)
does not allow us to use a memory-mapped file. Memory map functionality
could be implemented for raw_fd_ostream:

It is possible to add resize() method into raw_ostream.

class raw_ostream {
  void resize(uint64_t size);
}

That method, implemented for raw_fd_ostream, could create a memory-mapped file.
The streamed data would be written into that memory file then.
Thus we would be able to use memory-mapped files with raw_fd_ostream.
This is assumed to be done later if necessary.

Differential Revision: https://reviews.llvm.org/D91028

3 years ago[mlir][spirv] Define spv.Image Operation
Weiwei Li [Wed, 10 Mar 2021 20:43:29 +0000 (15:43 -0500)]
[mlir][spirv] Define spv.Image Operation

Co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D98270

3 years ago[AMDGPU] Disable SCC bit on fp atomics
Stanislav Mekhanoshin [Mon, 8 Mar 2021 23:12:54 +0000 (15:12 -0800)]
[AMDGPU] Disable SCC bit on fp atomics

Differential Revision: https://reviews.llvm.org/D98221

3 years ago[AMDGPU] Always expand system scope fp atomics on gfx90a
Stanislav Mekhanoshin [Fri, 5 Mar 2021 23:25:55 +0000 (15:25 -0800)]
[AMDGPU] Always expand system scope fp atomics on gfx90a

FP atomics in system scope cannot be used and shall always
be expanded in a CAS loop.
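
For readers unfamiliar with the expansion, a CAS loop emulating a floating-point atomic add looks roughly like the host-side C++ sketch below. This is an analogy using std::atomic, not the AMDGPU lowering; `atomicFAddViaCas` and `casLoopDemo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Emulates an atomic floating-point add with a compare-and-swap loop.
// Returns the value that was stored before the addition.
float atomicFAddViaCas(std::atomic<float> &Target, float Operand) {
  float Expected = Target.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak writes the current value back into
  // Expected, so each retry recomputes the sum from the freshest value.
  while (!Target.compare_exchange_weak(Expected, Expected + Operand,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return Expected;
}

// Usage sketch: one addition, checking both the old and the new value.
float casLoopDemo() {
  std::atomic<float> Value{1.0f};
  float Old = atomicFAddViaCas(Value, 2.5f);
  assert(Old == 1.0f && "the previous value is returned");
  return Value.load();
}
```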

Differential Revision: https://reviews.llvm.org/D98085

3 years agoRun non-filechecked commands in update_cc_test_checks.py
Giorgis Georgakoudis [Fri, 19 Feb 2021 18:45:40 +0000 (10:45 -0800)]
Run non-filechecked commands in update_cc_test_checks.py

Some tests in clang require running non-filechecked commands to generate the actual filecheck input. For example, tests for openmp offloading require generating the host bc without any checking, before running the clang command to actually generate the filechecked IR of the target device. This patch enables `update_cc_test_checks.py` to run non-filechecked run lines in-place.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97068

3 years ago[dfsan] Update fast16labels.ll test
George Balatsouras [Wed, 10 Mar 2021 01:07:43 +0000 (17:07 -0800)]
[dfsan] Update fast16labels.ll test

Remove hard-coded shadow width references. Separate CHECK lines that only apply to fast16 mode.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D98308

3 years ago[DSE] Extending isOverwrite to support offsetted fully overlapping stores
Matteo Favaro [Wed, 10 Mar 2021 20:07:54 +0000 (21:07 +0100)]
[DSE] Extending isOverwrite to support offsetted fully overlapping stores

The isOverwrite function identifies whether two stores are fully
overlapping. Ideally we would like to identify all the instances of
OW_Complete, as they yield possibly killable stores.
The current implementation cannot spot instances where the earlier
store is offset relative to the later store but still fully
overlapped. The limitation seems to lie in the computation of the
base pointers with the GetPointerBaseWithConstantOffset API, which
often yields different base pointers even if the stores are
guaranteed to partially overlap (e.g. the alias analysis returns
AliasResult::PartialAlias).

The patch relies on the offsets computed and cached by BatchAAResults
(available after D93529) to determine if the offsetted overlapping
is OW_Complete.
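
As a rough illustration of the check involved, the sketch below tests whether an earlier store is fully covered by a later one. It assumes both offsets are already expressed relative to one common base. `StoreRange` and `isCompletelyOverwritten` are hypothetical names, and this is not the DSE implementation.

```cpp
#include <cassert>
#include <cstdint>

// Byte range [Offset, Offset + Size) relative to a common base pointer.
struct StoreRange {
  int64_t Offset;
  uint64_t Size;
};

// True when every byte written by Earlier is rewritten by Later, i.e. the
// earlier range is contained in the later one.
bool isCompletelyOverwritten(StoreRange Earlier, StoreRange Later) {
  assert(Earlier.Size > 0 && Later.Size > 0 && "empty stores not modelled");
  if (Earlier.Offset < Later.Offset)
    return false;
  // Non-negative distance between the two start offsets, computed in
  // unsigned arithmetic so the subtraction cannot overflow.
  uint64_t Delta = static_cast<uint64_t>(Earlier.Offset) -
                   static_cast<uint64_t>(Later.Offset);
  // Equivalent to Delta + Earlier.Size <= Later.Size without overflow.
  return Delta <= Later.Size && Earlier.Size <= Later.Size - Delta;
}
```

For example, a 4-byte store at offset 4 is fully covered by a later 8-byte store at offset 0, while an 8-byte store at offset 4 is not.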

Differential Revision: https://reviews.llvm.org/D97676

3 years ago[lld-macho][NFC] add const to pointer/reference induction variables of range-based...
Greg McGary [Wed, 10 Mar 2021 05:41:34 +0000 (21:41 -0800)]
[lld-macho][NFC] add const to pointer/reference induction variables of range-based for loops

Pointer and reference induction variables of range-based for loops are often const, and code authors are often lax about qualifying them.
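
A small sketch of the style in question, using a hypothetical `Section` type rather than actual lld-macho code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a linker data structure.
struct Section {
  std::string Name;
  uint64_t Size;
};

uint64_t totalSize(const std::vector<Section *> &Sections) {
  uint64_t Total = 0;
  // The loop only reads through the pointer, so the pointee is const.
  for (const Section *S : Sections) {
    assert(S && "null section");
    Total += S->Size;
  }
  return Total;
}

std::size_t longestName(const std::vector<Section> &Sections) {
  std::size_t Longest = 0;
  // A const reference avoids a copy and documents that S is not modified.
  for (const Section &S : Sections)
    if (S.Name.size() > Longest)
      Longest = S.Name.size();
  return Longest;
}
```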

Differential Revision: https://reviews.llvm.org/D98317

3 years agoRemove original implementation of UniqueInternalLinkageNames pass.
Sriraman Tallam [Tue, 9 Mar 2021 06:33:00 +0000 (22:33 -0800)]
Remove original implementation of UniqueInternalLinkageNames pass.

D96109, which was recently submitted, contains the refactored implementation of
-funique-internal-linkage-names, adding the unique suffixes in clang rather
than as an LLVM pass. This change deletes the former implementation.

Differential Revision: https://reviews.llvm.org/D98234

3 years ago[InstCombine] Regenerate test checks (NFC)
Nikita Popov [Wed, 10 Mar 2021 19:26:29 +0000 (20:26 +0100)]
[InstCombine] Regenerate test checks (NFC)

3 years agoRevert "[mlir][Vector][Affine] Improve affine vectorizer algorithm"
Alex Zinenko [Wed, 10 Mar 2021 19:25:49 +0000 (20:25 +0100)]
Revert "[mlir][Vector][Affine] Improve affine vectorizer algorithm"

This reverts commit 95db7b4aeaad590f37720898e339a6d54313422f.

This breaks the vectorize_2d.mlir and vectorize_3d.mlir tests under ASAN
(use after free).

3 years agoRevert "[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer."
Alex Zinenko [Wed, 10 Mar 2021 19:25:32 +0000 (20:25 +0100)]
Revert "[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer."

This reverts commit 77a9d1549fcc57946b66fd5bacef3b48a613e872.

Parent commit is broken.

3 years ago[RuntimeDyld] Support more relocations
Rafael Auler [Thu, 4 Mar 2021 00:03:14 +0000 (16:03 -0800)]
[RuntimeDyld] Support more relocations

This patch introduces functionality used by BOLT when
re-linking the final binary. It adds new relocation types that
are currently unsupported by RuntimeDyldELF.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D97899

3 years ago[lldb] Fix PushPlan to set subplan to private
Dave Lee [Wed, 17 Feb 2021 23:38:04 +0000 (15:38 -0800)]
[lldb] Fix PushPlan to set subplan to private

Call `SetPrivate(true)` for subplans pushed via `PushPlan()`, as described in its
docstring.

Differential Revision: https://reviews.llvm.org/D96916

3 years ago[NFC] Fix compiler warnings
Quentin Colombet [Wed, 10 Mar 2021 18:36:59 +0000 (10:36 -0800)]
[NFC] Fix compiler warnings

Fix warnings caused by -Wrange-loop-analysis.

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D98298

3 years ago[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer.
Diego Caballero [Wed, 10 Mar 2021 18:39:39 +0000 (20:39 +0200)]
[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer.

This patch adds support for vectorizing loops with 'iter_args' when those loops
are not a vector dimension. This allows vectorizing outer loops with an inner
'iter_args' loop (e.g., reductions). Vectorizing scenarios where 'iter_args'
loops are vector dimensions would require more work (e.g., analysis,
generating horizontal reduction, etc.) not included in this patch.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97892

3 years ago[mlir][Vector][Affine] Improve affine vectorizer algorithm
Diego Caballero [Wed, 10 Mar 2021 18:11:16 +0000 (20:11 +0200)]
[mlir][Vector][Affine] Improve affine vectorizer algorithm

This patch replaces the root-terminal vectorization approach implemented in the
Affine vectorizer with a topological order approach that vectorizes all the
operations within the target loop nest. These are the most important changes
introduced by the new algorithm:
  * Removed tracking of root and terminal ops. Existing vectorization
    functionality is preserved and extended so that loop nests without
    root-terminal chains can be vectorized.
  * Vectorizing a loop nest now only requires a single topological traversal.
  * A new vector loop nest is incrementally built along the vectorization
    process. The original scalar loop is kept intact. No cloning guard is needed
    to recover the scalar loop if vectorization fails. This approach also
    simplifies the challenging task of replacing a loop operation amid the
    vectorization process without invalidating the analysis information that
    depends on the original loop.
  * Vectorization of specific operations has been implemented as independent,
    preparing them to be moved to a potential vectorization interface.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97442

3 years ago[PowerPC] Implement patterns for PC-Rel zextload/extload byte loads
Amy Kwan [Fri, 5 Mar 2021 05:43:57 +0000 (23:43 -0600)]
[PowerPC] Implement patterns for PC-Rel zextload/extload byte loads

This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads.

Differential Revision: https://reviews.llvm.org/D98042

3 years ago[clang] Don't assert in EmitAggregateCopy on trivial_abi types
Arthur Eubanks [Wed, 3 Mar 2021 17:55:02 +0000 (09:55 -0800)]
[clang] Don't assert in EmitAggregateCopy on trivial_abi types

Fixes PR42961.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D97872

3 years ago[DebugInfo][NFC] Refactor BinOp+GEP salvaging in salvageDebugInfoImpl
gbtozers [Tue, 8 Dec 2020 16:01:26 +0000 (16:01 +0000)]
[DebugInfo][NFC] Refactor BinOp+GEP salvaging in salvageDebugInfoImpl

This patch refactors out the salvaging of GEP and BinOp instructions into
separate functions, in preparation for further changes to the salvaging of these
instructions coming in another patch; there should be no functional change as a
result of this refactor.

Differential Revision: https://reviews.llvm.org/D92851

3 years ago[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent...
Craig Topper [Wed, 10 Mar 2021 17:46:16 +0000 (09:46 -0800)]
[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.

On riscv32, i64 isn't a legal scalar type but we would like to
support scalable vectors of i64.

This patch introduces a new node that can represent a splat made
of multiple scalar values. I've used this new node to solve the current
crashes we experience when getConstant is used after type legalization.

For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS
when needed and then handling the SPLAT_VECTOR_PARTS later during
LegalizeOps. I've removed the special case I previously put in for
ABS for D97991 as the default expansion is now able to successfully
use getConstant.
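
Conceptually, a splat made of two i32 parts can be modelled as below. This is a plain C++ sketch, not SelectionDAG code; `splitI64` and `splatFromParts` are hypothetical names, and putting the low half first is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Splits a 64-bit scalar into (Lo, Hi) 32-bit parts, as a target without a
// legal i64 scalar type would have to.
std::pair<uint32_t, uint32_t> splitI64(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

// Models a splat built from two 32-bit parts: every 64-bit lane of the
// result holds the value Hi:Lo.
std::vector<uint64_t> splatFromParts(uint32_t Lo, uint32_t Hi,
                                     std::size_t NumLanes) {
  assert(NumLanes > 0 && "a vector has at least one lane");
  uint64_t Lane = (static_cast<uint64_t>(Hi) << 32) | Lo;
  return std::vector<uint64_t>(NumLanes, Lane);
}
```

Splitting a value and splatting its parts reproduces the original 64-bit value in every lane.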

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98004

3 years ago[RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32.
Craig Topper [Wed, 10 Mar 2021 17:37:25 +0000 (09:37 -0800)]
[RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32.

Currently we crash in type legalization any time an intrinsic
uses a scalar i64 on RV32.

This patch adds support for type legalizing this to prevent
crashing. I don't promise that it produces the best possible codegen,
just that it is functional.

This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x
intrinsic and intrinsics that take a scalar input, splat it and
then do some operation.

For vmv.v.x we'll either rely on hardware sign extension for
constants or we'll convert it to multiple splats and bit
manipulation.

For vmv.s.x we use a really suboptimal sequence inspired by what
we do for an INSERT_VECTOR_ELT.

For the third case we'll either try to use the .vi form for
constants or convert to a complicated splat and bitmanip and use
the .vv form of the operation.

I've renamed the ExtendOperand field to SplatOperand and now use it
specifically for the third case. The first two cases are handled
by custom lowering specifically for those intrinsics.

I haven't updated all tests yet, but I tried to cover a subset
that includes single-width, widening, and narrowing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97895

3 years ago[compiler-rt] Fix stale incremental builds when using `LLVM_BUILD_EXTERNAL_COMPILER_R...
Dan Liew [Tue, 9 Mar 2021 21:59:46 +0000 (13:59 -0800)]
[compiler-rt] Fix stale incremental builds when using `LLVM_BUILD_EXTERNAL_COMPILER_RT=ON`.

When building with `LLVM_BUILD_EXTERNAL_COMPILER_RT=ON` (e.g. Swift does
this) we do an "external" build of compiler-rt where we build
compiler-rt with the just built clang.

Unfortunately building in this mode had a bug where compiler-rt would
not get rebuilt if compiler-rt sources changed. This is problematic
for incremental builds because it meant that the compiler-rt binaries
were stale.

The fix is to use the `BUILD_ALWAYS` ExternalProject_Add option which
means the build command for compiler-rt is always run.

In principle if all of the following are true:

* compiler-rt has already been built.
* there are no compiler-rt source changes.
* the compiler hasn't changed.
* ninja is being used as the generator for the compiler-rt build.

then the overhead for always running the build command for incremental
builds is negligible.

However, in practice clang gets rebuilt every time the HEAD commit
changes (due to commit hash being embedded in the output of `--version`)
which means all of compiler-rt will be rebuilt every time this happens.
While this is annoying it's better to do the slow but correct thing
rather than the fast but incorrect thing.

rdar://75150660

Differential Revision: https://reviews.llvm.org/D98291

3 years ago[flang] Fix call to CHECK() on overriding an erroneous type-bound procedure
Peter Steinfeld [Wed, 10 Mar 2021 16:09:57 +0000 (08:09 -0800)]
[flang] Fix call to CHECK() on overriding an erroneous type-bound procedure

You can define a base type with a type-bound procedure which is erroneously
missing a NOPASS attribute and then define another type that extends the base
type and overrides the erroneous procedure.  In this case, when we perform
semantic checking on the overriding procedure, we verify the "pass index" of
the overriding procedure.  The attempt to get the procedure's pass index fails
a call to CHECK().

I fixed this by calling SetError() on the symbol of the overridden procedure in
the base type.  Then, I check HasError() before executing the code that invokes
the failing call to CHECK().  I also added a test that will cause the compiler
to fail the call to CHECK() without this change.

Differential Revision: https://reviews.llvm.org/D98355

3 years ago[lldb] [test] Update XFAILs for FreeBSD/aarch64
Michał Górny [Wed, 3 Mar 2021 14:57:51 +0000 (15:57 +0100)]
[lldb] [test] Update XFAILs for FreeBSD/aarch64

3 years ago[lldb] [Process/FreeBSD] Introduce aarch64 hw break/watchpoint support
Michał Górny [Tue, 9 Feb 2021 20:10:09 +0000 (21:10 +0100)]
[lldb] [Process/FreeBSD] Introduce aarch64 hw break/watchpoint support

Split out the common base of Linux hardware breakpoint/watchpoint
support for AArch64 into a Utility class, and use it to implement
the matching support on FreeBSD.

Differential Revision: https://reviews.llvm.org/D96548

3 years ago[InstCombine][SimplifyLibCalls] An extra sqrtf was produced because of transformation...
Daniil Seredkin [Wed, 10 Mar 2021 17:30:53 +0000 (12:30 -0500)]
[InstCombine][SimplifyLibCalls] An extra sqrtf was produced because of transformations in optimizePow function

See: https://bugs.llvm.org/show_bug.cgi?id=47613

There was an extra sqrt call because shrinking emitted a new powf while optimizePow, at the same time, replaced the previous pow with sqrt. Both instructions then sit in the InstCombine worklist, even though %powf is not used by anyone (it stays alive because of errno).

As a result we have two instructions:

  %powf = call fast float @powf(float %x, float 5.000000e-01)
  %sqrt = call fast double @sqrt(double %dx)

%powf will be converted to %sqrtf on a later iteration.

As a quick fix, I moved shrinking to the end of optimizePow so that pow is replaced with sqrt first, which avoids emitting a new shrunk powf.

Differential Revision: https://reviews.llvm.org/D98235

3 years ago[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on...
Craig Topper [Wed, 10 Mar 2021 17:10:11 +0000 (09:10 -0800)]
[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32.

The type legalizer will visit the result before the operands. To
avoid creating an illegal target specific node or falling back to
scalarization, we need to manually split vector operands.

This still doesn't handle the case of non-power of 2 operands
which need to be widened. I'm not sure the type legalizer is
ready for it. I think we would need to insert an
INSERT_SUBVECTOR with the power of 2 type we want, with an undef
first operand, and the non-power of 2 original operand as the vector
to insert. Then fill the padded elements with neutral elements.
Alternatively we INSERT_SUBVECTOR into a neutral vector.
From there we carry on splitting if needed to get to a legal type
then do the target specific code.
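
The idea of padding with neutral elements and then splitting can be sketched on plain arrays as follows. This is an illustration for an add reduction (neutral element 0), not legalizer code; `padToPowerOfTwo` and `reduceAddBySplitting` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Pads V with Neutral up to the next power-of-two length.
std::vector<int64_t> padToPowerOfTwo(std::vector<int64_t> V, int64_t Neutral) {
  std::size_t N = 1;
  while (N < V.size())
    N *= 2;
  V.resize(N, Neutral);
  return V;
}

// Add reduction that repeatedly combines the low and high halves lane by
// lane until the vector is no wider than LegalWidth, then finishes with a
// scalar loop standing in for the target specific reduction.
int64_t reduceAddBySplitting(std::vector<int64_t> V, std::size_t LegalWidth) {
  assert(LegalWidth > 0 && (LegalWidth & (LegalWidth - 1)) == 0 &&
         "legal width must be a power of two");
  V = padToPowerOfTwo(std::move(V), /*Neutral=*/0);
  while (V.size() > LegalWidth) {
    std::size_t Half = V.size() / 2;
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half];
    V.resize(Half);
  }
  int64_t Sum = 0;
  for (int64_t X : V)
    Sum += X;
  return Sum;
}
```

Because the padding uses the neutral element of the operation, the extra lanes do not change the result. Other reductions would need their own neutral value.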

The problem with this is the type legalizer doesn't know how to
widen an insert_subvector yet. We would need to add that including
the handling for a non-undef first vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98292

3 years agoRevert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopN...
Ta-Wei Tu [Wed, 10 Mar 2021 17:24:43 +0000 (01:24 +0800)]
Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`"

This reverts commit df9158c9a45a6902c2b0394f9bd6512e3e441f31.

3 years ago[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR
Stephen Tozer [Wed, 10 Mar 2021 14:25:09 +0000 (14:25 +0000)]
[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR

This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after
finalize-isel), excluding the debug liveness passes and DWARF emission. This
most significantly affects MachineSink, which now needs to consider all used
registers of a debug value when sinking, but for most passes this change is
simply replacing getDebugOperand(0) with an iteration over all debug operands.

Differential Revision: https://reviews.llvm.org/D92578

3 years ago[dfsan] Tracking origins at phi nodes
Jianzhou Zhao [Tue, 9 Mar 2021 04:13:16 +0000 (04:13 +0000)]
[dfsan] Tracking origins at phi nodes

This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D98268

3 years ago[flang][driver] Revert RUN-line change
Andrzej Warzynski [Wed, 10 Mar 2021 16:53:56 +0000 (16:53 +0000)]
[flang][driver] Revert RUN-line change

In https://reviews.llvm.org/D98283, the RUN line in pre-fir-tree04.f90
was updated to use `%flang_fc1` instead of `%f18` (so that the test is
shared between the old and the new driver). Unfortunately, the new
driver does not yet know how to find standard intrinsics modules. As a
result, the test fails when `FLANG_BUILD_NEW_DRIVER` is set to On.

I'm restoring the original RUN line. This is rather straightforward, so
sending without a review. This should make Flang builders happy.

3 years ago[DSE] Handle memmove with equal non-const sizes
Dávid Bolvanský [Wed, 10 Mar 2021 16:51:39 +0000 (17:51 +0100)]
[DSE] Handle memmove with equal non-const sizes

Follow up for fhahn's D98284. Also fixes a case from PR47644.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D98346