Andy Wingo [Tue, 3 Nov 2020 18:46:23 +0000 (10:46 -0800)]
[WebAssembly] Implement ref.null
This patch adds a new "heap type" operand kind to the WebAssembly MC
layer, used by ref.null. Currently the possible values are "extern" and
"func"; when typed function references come, though, this operand may be
a type index.
Note that the "heap type" production is still known as "refedtype" in
the draft proposal; changing its name in the spec is
ongoing (https://github.com/WebAssembly/reference-types/issues/123).
The register form of ref.null is still untested.
Differential Revision: https://reviews.llvm.org/D90608
Aaron En Ye Shi [Wed, 28 Oct 2020 17:51:55 +0000 (17:51 +0000)]
[HIP] Math Headers to use type promotion
Similar to the libcxx implementation of cmath function
overloads, use type promotion templates to determine
return types of multi-argument math functions.
Fixes: SWDEV-256825
Reviewed By: tra, yaxunl
Differential Revision: https://reviews.llvm.org/D90409
Artem Belevich [Thu, 29 Oct 2020 22:19:06 +0000 (15:19 -0700)]
[HIP] Use argv[0] as the default choice for the Executable name.
The path produced by getMainExecutable() may not be the right one when the files are installed in
a symlinked tree and when the real location of llvm-objdump is in a different directory.
Given that clang-offload-bundler is invoked by clang, the driver already does the job of figuring out
the right path (e.g. it pays attention to -no-canonical-prefixes).
Offload bundler should use it, instead of trying to figure out the path on its
own.
Differential Revision: https://reviews.llvm.org/D90436
Artem Belevich [Fri, 25 Sep 2020 23:25:27 +0000 (16:25 -0700)]
[CUDA] Allow local static variables with target attributes.
While CUDA documentation claims that such variables are not allowed[1], NVCC has
been accepting them since CUDA-10.0[2] and some headers in CUDA-11 rely on this
working.
1. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#static-variables-function
2. https://godbolt.org/z/zsodzc
Differential Revision: https://reviews.llvm.org/D88345
Jameson Nash [Tue, 3 Nov 2020 18:17:52 +0000 (13:17 -0500)]
[GVN] small improvements to comments
Jonas Devlieghere [Tue, 3 Nov 2020 18:21:00 +0000 (10:21 -0800)]
[crashlog] Modularize parser
Instead of parsing the crashlog in one big loop, use methods that
correspond to the different parsing modes.
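A minimal sketch of the idea, not the actual crashlog code: each parsing mode gets its own method, and a dispatch table replaces the single loop body. The mode names, the `ModalParser` class and the toy input format are all hypothetical.

```python
from enum import Enum, auto


class Mode(Enum):
    # Hypothetical modes; the real script defines its own set.
    NORMAL = auto()
    THREAD = auto()


class ModalParser:
    """Toy line parser: one method per mode instead of one large loop."""

    def __init__(self):
        self.mode = Mode.NORMAL
        self.info = {}
        self.threads = []
        # Dispatch table mapping each mode to the method that handles it.
        self.handlers = {
            Mode.NORMAL: self.parse_normal,
            Mode.THREAD: self.parse_thread,
        }

    def parse(self, text):
        for line in text.splitlines():
            self.handlers[self.mode](line)
        return self

    def parse_normal(self, line):
        if line.startswith("Thread "):
            # A thread header switches the parser into THREAD mode.
            self.threads.append({"name": line.rstrip(":"), "frames": []})
            self.mode = Mode.THREAD
        elif ":" in line:
            key, value = line.split(":", 1)
            self.info[key.strip()] = value.strip()

    def parse_thread(self, line):
        if not line.strip():
            # A blank line ends the thread section.
            self.mode = Mode.NORMAL
        else:
            self.threads[-1]["frames"].append(line.strip())
```

Adding a mode then means adding one method and one table entry, rather than another branch inside a shared loop.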
Differential revision: https://reviews.llvm.org/D90665
Simon Pilgrim [Tue, 3 Nov 2020 18:13:21 +0000 (18:13 +0000)]
Cleanup namespace comment to fix clang-tidy warning. NFCI.
Simon Pilgrim [Tue, 3 Nov 2020 18:09:15 +0000 (18:09 +0000)]
[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
Craig Topper [Tue, 3 Nov 2020 17:33:06 +0000 (09:33 -0800)]
[RISCV] Add missing patterns for rotr with immediate for Zbb/Zbp extensions.
DAGCombine doesn't canonicalize rotl/rotr with immediate so we
need patterns for both.
Remove the custom matcher for rotl to RORI and just use a SDNodeXForm
to convert the immediate instead. Doing this gives priority to the
rev32/rev16 versions of grevi over rori since an explicit immediate
is more precise than any immediate. I also added rotr patterns for
rev32/rev16. And removed the (or (shl), (shr)) patterns that should be
combined to rotl by DAG combine.
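The immediate conversion relies on the identity rotl(x, n) == rotr(x, (w - n) mod w) for a w-bit value. The Python sketch below checks that identity; the helper names are illustrative, and it says nothing about how the SDNodeXForm itself is written.

```python
def rotl(x, n, width=32):
    """Rotate the width-bit value x left by n bits."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x << n) | (x >> (width - n))) & mask


def rotr(x, n, width=32):
    """Rotate the width-bit value x right by n bits."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x >> n) | (x << (width - n))) & mask


def rotl_imm_to_rotr_imm(n, width=32):
    """Immediate for a right rotate equivalent to a left rotate by n."""
    return (width - n) % width
```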
There is at least one other grev pattern that probably needs
another rotr pattern, but we need more test coverage first.
Differential Revision: https://reviews.llvm.org/D90575
Simon Pilgrim [Tue, 3 Nov 2020 17:30:17 +0000 (17:30 +0000)]
[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
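As a rough model of what a shared helper has to compute (a Python sketch under stated assumptions, not the LLVM implementation): known bits are tracked as two masks, and when only a range for the shift amount is known, the result keeps just the facts that hold for every amount in that range. The `Known` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Known:
    """Bits known to be 0 (zero mask) and bits known to be 1 (one mask)."""
    zero: int
    one: int


def lshr_const(k, amt, width):
    """Known bits of a logical right shift by the constant amt."""
    mask = (1 << width) - 1
    # The amt vacated high bits are known zero after the shift.
    high = (mask << (width - amt)) & mask
    return Known((k.zero >> amt) | high, k.one >> amt)


def lshr_range(k, min_amt, max_amt, width):
    """Known bits when the shift amount lies in [min_amt, max_amt]."""
    assert 0 <= min_amt <= max_amt < width
    zero = one = (1 << width) - 1
    for amt in range(min_amt, max_amt + 1):
        r = lshr_const(k, amt, width)
        # Keep only the facts that hold for every possible amount.
        zero &= r.zero
        one &= r.one
    return Known(zero, one)
```

Even for a completely unknown input, a minimum shift amount of 3 already proves that the top three result bits are zero.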
Simon Pilgrim [Tue, 3 Nov 2020 16:50:18 +0000 (16:50 +0000)]
[AMDGPU] Regenerate load i16 tests to use update_llc_test_checks.py script. NFCI.
Necessary for upcoming KnownBits::lshr support.
Louis Dionne [Tue, 3 Nov 2020 17:05:55 +0000 (12:05 -0500)]
[libc++] Move <memory> helpers outside of std::allocator_traits
They don't really belong as members of allocator_traits.
Jonas Devlieghere [Tue, 3 Nov 2020 04:29:32 +0000 (20:29 -0800)]
[crashlog] Move crash log parsing into its own class
Move crash log parsing out of the CrashLog class and into its own class
and add more tests.
Differential revision: https://reviews.llvm.org/D90664
Nico Weber [Tue, 3 Nov 2020 16:55:22 +0000 (11:55 -0500)]
Make test/tools/llvm-dlltool/tool-name.test pass, and make it run
The test hasn't run since it was added in D71302.
Tony [Tue, 3 Nov 2020 02:14:45 +0000 (02:14 +0000)]
[NFC][AMDGPU] Minor editorial improvements to AMDGPUUsage.rst
Differential Revision: https://reviews.llvm.org/D90661
etiotto [Tue, 3 Nov 2020 16:44:18 +0000 (11:44 -0500)]
[compiler-rt][profile][AIX]: Enable compiler-rt profile build on AIX
This patch adds support for building the compiler-rt profile library on AIX.
Reviewed by: phosek
Differential Revision: https://reviews.llvm.org/D90619
Esme-Yi [Tue, 3 Nov 2020 16:34:02 +0000 (16:34 +0000)]
Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit
119ab2181e6ed823849c93d55af8e989c28c9f3c.
Tim Renouf [Fri, 30 Oct 2020 08:21:12 +0000 (08:21 +0000)]
[AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447
Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
Tim Renouf [Tue, 6 Oct 2020 17:23:59 +0000 (18:23 +0100)]
[AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
Michał Górny [Tue, 3 Nov 2020 14:00:58 +0000 (15:00 +0100)]
[lldb] [Process/FreeBSDRemote] Fix "Fix attaching via lldb-server"
One of the changes seems to have been lost in rebase. Reapply.
Valentin Clement [Tue, 3 Nov 2020 16:12:14 +0000 (11:12 -0500)]
[openmp][openacc][NFC] Simplify access and validation of DirectiveBase information
This patch adds some helpers to the DirectiveLanguage wrapper to initialize it from
the RecordKeeper and validate the records. This simplifies the arguments of many functions,
since only the DirectiveLanguage is passed.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90358
Sanjay Patel [Tue, 3 Nov 2020 15:41:52 +0000 (10:41 -0500)]
[CostModel] fix cost calc bug for sadd/ssub with overflow
As noted in D90554, there's an opcode typo in using an easily
misused cost model API: getCmpSelInstrCost(). Beyond that, the
assumed sequence of ops is questionable, but that would be
another patch.
My guess is that the x86 test diffs show that we are probably
wrong both before and after this change, so there will be no
practical difference.
As an example, I tried this test which shows a cost of '7'
either way:
define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
%V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
%ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
%r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
%z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
ret <4 x i32> %z
}
$ llc -o - sadd.ll -mattr=avx
vpaddd %xmm1, %xmm0, %xmm2
vpcmpgtd %xmm2, %xmm0, %xmm0
vpxor %xmm0, %xmm1, %xmm0
vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0
Differential Revision: https://reviews.llvm.org/D90681
Hans Wennborg [Tue, 3 Nov 2020 15:55:12 +0000 (16:55 +0100)]
Fix GCC error: specialization of 'template<class LeafTy> struct llvm::LinearPolyBaseTypeTraits' in different namespace
Joachim Protze [Tue, 3 Nov 2020 15:31:50 +0000 (16:31 +0100)]
[OpenMP][Tools] clang-format Archer (NFC)
Joe Ellis [Tue, 3 Nov 2020 15:24:41 +0000 (15:24 +0000)]
[SVE][InstCombine] Improve specificity of InstCombine TypeSize test
The test was using -O2, where -instcombine will suffice.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90684
Jay Foad [Mon, 2 Nov 2020 13:05:15 +0000 (13:05 +0000)]
[AMDGPU] Fix ds_read2/write2 with unaligned offsets
These instructions use a scaled offset. We were wrongly selecting them
even when the required offset was not a multiple of the scale factor.
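The legality condition can be sketched as follows. This is a Python illustration, not the selection code; it assumes each offset field is an unsigned 8-bit count of elements (so the scale factor is the element size in bytes), and the function name is hypothetical.

```python
def encode_read2_offsets(byte_off0, byte_off1, elt_size):
    """Return the two scaled offset fields, or None if read2 is not usable.

    Assumes 8-bit unsigned offset fields counted in units of elt_size bytes.
    """
    fields = []
    for off in (byte_off0, byte_off1):
        # An offset that is not a multiple of the scale cannot be encoded.
        if off % elt_size != 0:
            return None
        scaled = off // elt_size
        if not 0 <= scaled <= 0xFF:
            return None
        fields.append(scaled)
    return tuple(fields)
```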
Differential Revision: https://reviews.llvm.org/D90607
Martin Storsjö [Mon, 28 Oct 2019 21:46:34 +0000 (23:46 +0200)]
[libcxx] Error out if __libcpp_mbsrtowcs_l fails in __time_get_storage
If __libcpp_mbsrtowcs_l outputs zero wchar_t's for week days or
month names (due to errors in the locale function setup), these are
matched all the time in __time_get_storage::__analyze, ending up in
an infinite loop, allocating more memory until killed.
Differential Revision: https://reviews.llvm.org/D69553
Martin Storsjö [Fri, 23 Oct 2020 07:54:02 +0000 (10:54 +0300)]
[libcxx] [libcxxabi] Set flags for visibility when statically linking libcxxabi into libcxx for windows
Previously, these had to be set manually when building each of the
projects standalone, in order to get proper symbol visibility when
combining the two libraries.
Differential Revision: https://reviews.llvm.org/D90021
Pavel Labath [Mon, 7 Sep 2020 07:33:58 +0000 (09:33 +0200)]
[lldb/Utility] Add unit tests for RegisterValue::GetScalarValue
Buggy cases are commented out.
Also sneak in a modernization of a RegisterValue constructor.
Mircea Trofin [Tue, 3 Nov 2020 15:08:51 +0000 (07:08 -0800)]
[Docs][FileCheck] Small fix.
Jameson Nash [Tue, 3 Nov 2020 13:54:51 +0000 (08:54 -0500)]
make the AsmPrinterHandler array public
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed; this just cleans up the code
so that it can be exposed publicly.
Replaces https://reviews.llvm.org/D74158
Differential Revision: https://reviews.llvm.org/D89613
Nathan James [Tue, 3 Nov 2020 14:57:08 +0000 (14:57 +0000)]
[ADT] Add SmallVector::pop_back_n
Adds a method called pop_back_n to SmallVector.
This is more readable and less error prone than the alternatives of using
```lang=c++
Vector.resize(Vector.size() - N);
Vector.erase(Vector.end() - N, Vector.end());
for (unsigned I = 0;I<N;++I) Vector.pop_back();
```
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90576
Anton Afanasyev [Tue, 3 Nov 2020 14:40:41 +0000 (17:40 +0300)]
[SLP][X86][Test] Extend test coverage for PR47629
Add two cases for `<8 x i32>`. Precommit for PR47629 and D90445. NFC
Jay Foad [Tue, 3 Nov 2020 14:24:44 +0000 (14:24 +0000)]
[AMDGPU] Precommit globalisel tests for ds_read2_b64 with large offset
Nathan James [Tue, 3 Nov 2020 14:36:50 +0000 (14:36 +0000)]
[ASTMatchers] Made isExpandedFromMacro Polymorphic
Made the isExpandedFromMacro matcher work on Stmt's, TypeLocs and Decls in line with the other macro expansion matchers.
Also tweaked it to take a `std::string` instead of a `StringRef`.
This prevents potential use-after-free bugs if the matcher is created with a string that's destroyed before the matcher finishes matching.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D90303
Simon Pilgrim [Tue, 3 Nov 2020 13:49:00 +0000 (13:49 +0000)]
[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
LLVM GN Syncbot [Tue, 3 Nov 2020 13:58:51 +0000 (13:58 +0000)]
[gn build] Port
1667d23e585
Nico Weber [Tue, 3 Nov 2020 13:58:23 +0000 (08:58 -0500)]
[gn build] (manually) port
1af3cb5424d
Jay Foad [Tue, 3 Nov 2020 13:31:59 +0000 (13:31 +0000)]
[AMDGPU] Specify a triple to avoid codegen changes depending on host OS
Lei Zhang [Tue, 3 Nov 2020 13:10:56 +0000 (08:10 -0500)]
[mlir][spirv] Support for a few more decorations in (de)serialization
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D90655
Sanjay Patel [Tue, 3 Nov 2020 12:55:55 +0000 (07:55 -0500)]
[x86] update cost table comments for maxnum; NFC
Follow-up suggested in D90613.
Yaxun (Sam) Liu [Wed, 28 Oct 2020 14:44:21 +0000 (10:44 -0400)]
[CUDA][HIP] Fix linkage for -fgpu-rdc
Currently for explicit template function instantiation in CUDA/HIP device
compilation clang emits instantiated kernel with external linkage
and instantiated device function with internal linkage.
This is fine for -fno-gpu-rdc since there is only one TU.
However, this causes duplicate symbols for kernels with -fgpu-rdc if
the same instantiation happens in multiple TUs, or missing symbols
if a device function calls an explicitly instantiated template function
in a different TU.
To make explicit template function instantiation work for
-fgpu-rdc we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
Differential Revision: https://reviews.llvm.org/D90311
Roman Lebedev [Tue, 3 Nov 2020 12:32:31 +0000 (15:32 +0300)]
[InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator
In particular, it makes it fire for C=0, because negator doesn't want
to perform that fold since in general it's not beneficial.
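The fold is valid under wrapping arithmetic, which a brute-force check over 8-bit values confirms. This Python sketch only verifies the algebra; it does not model InstCombine or the Negator, and the helper names are made up.

```python
def sub(a, b, width=8):
    """Wrapping subtraction on width-bit integers."""
    return (a - b) % (1 << width)


def add(a, b, width=8):
    """Wrapping addition on width-bit integers."""
    return (a + b) % (1 << width)


def fold_holds(c, c2, width=8):
    """Check C - (X + C2) == (C - C2) - X for every width-bit X."""
    return all(
        sub(c, add(x, c2, width), width) == sub(sub(c, c2, width), x, width)
        for x in range(1 << width)
    )
```

With C = 0 the left-hand side is the plain negation -(X + C2), which is the case mentioned above.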
Roman Lebedev [Tue, 3 Nov 2020 10:34:38 +0000 (13:34 +0300)]
[InstCombine] Negator: - (C - %x) --> %x - C (PR47997)
This relaxes one-use restriction on that `sub` fold,
since apparently the addition of Negator broke
preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.
Roman Lebedev [Tue, 3 Nov 2020 11:06:42 +0000 (14:06 +0300)]
[NFC][InstCombine] Negator: add test coverage for `(?? - (%y + C))` pattern (PR47997)
Roman Lebedev [Tue, 3 Nov 2020 11:04:36 +0000 (14:04 +0300)]
[NFC][InstCombine] Negator: add test coverage for `(?? - (C - %y))` pattern (PR47997)
Roman Lebedev [Tue, 3 Nov 2020 10:18:20 +0000 (13:18 +0300)]
[NFC][InstCombine] Add test coverage for PR47997
Florian Hahn [Tue, 3 Nov 2020 10:32:38 +0000 (10:32 +0000)]
[SCCP] Handle bitcast of vector constants.
Vectors where all elements have the same known constant range are treated as a
single constant range in the lattice. When bitcasting such vectors, there is a
mis-match between the width of the lattice value (single constant range) and
the original operands (vector). Go to overdefined in that case.
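To see why the element range cannot be reused after the bitcast, consider a `<4 x i8>` whose elements all lie in [0, 2). The Python sketch below packs such vectors into an i32 (little-endian lane order assumed, helper name hypothetical) and shows that the resulting integers fall far outside [0, 2).

```python
from itertools import product


def bitcast_v4i8_to_i32(elements):
    """Pack four i8 lanes into one i32, lane 0 in the lowest byte."""
    value = 0
    for lane, elt in enumerate(elements):
        value |= (elt & 0xFF) << (8 * lane)
    return value


# Every element is in the constant range [0, 2), i.e. 0 or 1.
vectors = list(product(range(0, 2), repeat=4))
casted = sorted(bitcast_v4i8_to_i32(v) for v in vectors)
```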
Fixes PR47991.
David Green [Tue, 3 Nov 2020 12:58:10 +0000 (12:58 +0000)]
[ARM] Remove unused variable. NFC
Joachim Protze [Tue, 3 Nov 2020 11:31:05 +0000 (12:31 +0100)]
[OpenMP][libomptarget][Tests] fix failing test
D88149 updated `omp_get_initial_device` behavior to conform with OpenMP 5.1.
omp_get_initial_device() == omp_get_num_devices()
Joachim Protze [Mon, 2 Nov 2020 15:43:12 +0000 (16:43 +0100)]
[OpenMP][OMPT][NFC] Fix flaky test
As reported by @ronlieb, the test shows intermittent failures.
The test failed if the dependent task had already finished by the time the depending
task was created. We have other tests to check for the dependences pair.
Joachim Protze [Fri, 30 Oct 2020 08:36:07 +0000 (09:36 +0100)]
[OpenMP][Tool] Handle detached tasks in Archer
Since detached tasks are supported by clang and the OpenMP runtime, Archer
must expect to receive the corresponding callbacks.
This patch adds support to interpret the synchronization semantics of
omp_fulfill_event and cleans up the handling of task switches.
Hans Wennborg [Tue, 3 Nov 2020 12:01:55 +0000 (13:01 +0100)]
Revert "[CodeGen] [WinException] Only produce handler data at the end of the function if needed"
This caused an explosion in ICF times during linking on Windows when libfuzzer
instrumentation is enabled. For a small binary we see ICF time go from ~0 to
~10 s. For a large binary it goes from ~1 s to forever (I gave up after 30
minutes).
See comment on the code review.
> If we are going to write handler data (that is written as variable
> length data following after the unwind info in .xdata), we need to
> emit the handler data immediately, but for cases where no such
> info is going to be written, skip emitting it right away. (Unwind
> info for all remaining functions that hasn't gotten it emitted
> directly is emitted at the end.)
>
> This does slightly change the ordering of sections (triggering a
> bunch of updates to DebugInfo/COFF tests), but the change should be
> benign.
>
> This also matches GCC's assembly output, which doesn't output
> .seh_handlerdata unless it actually is needed.
>
> For ARM64, the unwind info can be packed into the runtime function
> entry itself (leaving no data in the .xdata section at all), but
> that can only be done if there's no follow-on data in the .xdata
> section. If emission of the unwind info is triggered via
> EmitWinEHHandlerData (or the .seh_handlerdata directive), which
> implicitly switches to the .xdata section, there's a chance of the
> caller wanting to pass further data there, so the packed format
> can't be used in that case.
>
> Differential Revision: https://reviews.llvm.org/D87448
This reverts commit
36c64af9d7f97414d48681b74352c9684077259b.
Stefan Gränitz [Tue, 3 Nov 2020 12:05:38 +0000 (12:05 +0000)]
[JITLink][ELF] Implement R_X86_64_PLT32 relocations
Basic implementation for call and jmp branches with 32 bit offset. Branches to local targets produce
Branch32 edges that are resolved like regular PCRel32 relocations. Branches to external (undefined)
targets produce Branch32ToStub edges and go through a PLT entry by default. If the target happens to
get resolved within the 32 bit range from the callsite, the edge is relaxed during post-allocation
optimization. There is a test for each of these cases.
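The range test behind the relaxation step can be sketched like this. It is a simplified Python model: the edge-kind names come from the message above, but the function names, the default addend of -4 and the exact displacement formula are assumptions, not JITLink's code.

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def pcrel32_displacement(fixup_addr, target_addr, addend=-4):
    """Value stored for a 32-bit PC-relative fixup: target + addend - fixup."""
    return target_addr + addend - fixup_addr


def initial_edge_kind(target_is_external):
    """Edge kind chosen before addresses are known."""
    return "Branch32ToStub" if target_is_external else "Branch32"


def relax_after_allocation(kind, fixup_addr, target_addr):
    """Bypass the stub when the resolved target is directly reachable."""
    if kind == "Branch32ToStub":
        disp = pcrel32_displacement(fixup_addr, target_addr)
        if INT32_MIN <= disp <= INT32_MAX:
            return "Branch32"
    return kind
```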
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D90331
Hiral Oza [Tue, 3 Nov 2020 10:11:44 +0000 (10:11 +0000)]
[clang-tidy] adding "--config-file=<file-path>" to specify custom config file.
Let clang-tidy read its config from the specified file.
Example:
$ clang-tidy --config-file=/some/path/myTidyConfig --list-checks --
...this will read config from '/some/path/myTidyConfig'.
ClangTidyMain.cpp reads ConfigFile into a string and then assigns the data to 'Config', i.e. internally it follows the same code path as '--config'.
This may speed up clang-tidy, since it now just looks up <file-path>
instead of searching for ".clang-tidy" in the parent directories.
Directly specifying the config path also helps when setting up build dependencies.
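The lookup difference can be pictured with a small Python sketch. It is only a model of the behaviour described here, not clang-tidy's implementation, and both function names are hypothetical: without the option, each parent directory of the source file is probed for a `.clang-tidy` file; with it, the given path is read directly.

```python
from pathlib import Path


def find_config_by_search(source_file):
    """Walk up from the source file's directory looking for .clang-tidy."""
    directory = Path(source_file).resolve().parent
    for candidate_dir in (directory, *directory.parents):
        candidate = candidate_dir / ".clang-tidy"
        if candidate.is_file():
            return candidate
    return None


def load_config(source_file, config_file=None):
    """Return the config text, preferring an explicitly given file."""
    if config_file is not None:
        path = Path(config_file)
    else:
        path = find_config_by_search(source_file)
    return path.read_text() if path is not None else None
```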
Thanks to @DmitryPolukhin for the valuable suggestion. This patch now proposes
changes only in ClangTidyMain.cpp.
Reviewed By: DmitryPolukhin
Differential Revision: https://reviews.llvm.org/D89936
serge-sans-paille [Tue, 3 Nov 2020 11:57:44 +0000 (12:57 +0100)]
Fix 'default label in switch which covers all enumeration values' warning
David Green [Tue, 3 Nov 2020 11:53:09 +0000 (11:53 +0000)]
[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops
If an instruction will be lowered to a call there is no advantage of
using a low overhead loop as the LR register will need to be spilled and
reloaded around the call, and the low overhead will end up being
reverted. This teaches our hardware loop lowering that these memory
intrinsics will be calls under certain situations.
Differential Revision: https://reviews.llvm.org/D90439
David Green [Tue, 3 Nov 2020 11:44:50 +0000 (11:44 +0000)]
[ARM] Low overhead loop memcpy lowering test. NFC
Sander de Smalen [Tue, 3 Nov 2020 10:25:06 +0000 (10:25 +0000)]
[AArch64][SVE] NFC: Guard all SVE tests for TypeSize warnings.
This patch adds a bunch of CHECK lines to guard against implicit
conversions of TypeSize -> uint64_t occurring in code-paths that previously
were safe for scalable vectors.
Alexander Bosch [Tue, 13 Oct 2020 14:02:21 +0000 (16:02 +0200)]
[MLIR] Added test operations to replace linalg dependency for
BufferizeTests.
Summary:
Added test operations to replace the LinalgDialect dependency in tests
which use the buffer-deallocation, buffer-hoisting,
buffer-loop-hoisting, promote-buffers-to-stack,
buffer-placement-preparation-allowed-memref-results and
buffer-placement-preparation pass. Adapted the corresponding tests cases
and TestBufferPlacement.cpp.
Differential Revision: https://reviews.llvm.org/D90037
Mehdi Amini [Tue, 3 Nov 2020 11:17:39 +0000 (11:17 +0000)]
Make the implicit nesting behavior of the PassManager user-controllable and default to false
This is an error-prone behavior: I frequently have ~20 min debugging sessions when I hit
an unexpected implicit nesting. This default makes the C++ API safer for users.
Depends On D90669
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D90671
Mehdi Amini [Tue, 3 Nov 2020 11:17:01 +0000 (11:17 +0000)]
Handle the verifier at run() time in the PassManager instead of build time
This simplifies a few parts of the pass manager, but in particular we don't add as many
verifier passes as there are passes in the pipeline, and we can now enable/disable the
verifier after the fact on an already built PassManager.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D90669
Mehdi Amini [Tue, 3 Nov 2020 05:00:43 +0000 (05:00 +0000)]
Change the PrintOpStatsPass to operate on any operation instead of just ModuleOp
This allows using it on other operations, such as a GPUModule.
Mehdi Amini [Mon, 2 Nov 2020 21:27:28 +0000 (21:27 +0000)]
Remove mlir-c/Core.h which is superseded by the new API in mlir-c/IR.h
This header was an initial early attempt at a crude C API for bindings,
but it isn't used and is redundant with the new API. At this point it only
contributes to more confusion.
Differential Revision: https://reviews.llvm.org/D90643
Stephen Kelly [Mon, 2 Nov 2020 23:30:52 +0000 (23:30 +0000)]
Add test missing from previous commit
Simon Pilgrim [Tue, 3 Nov 2020 10:49:33 +0000 (10:49 +0000)]
[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.
This should allow us to begin resolving PR46896 et al.
Differential Revision: https://reviews.llvm.org/D90625
Alexander Belyaev [Tue, 3 Nov 2020 10:36:14 +0000 (11:36 +0100)]
[mlir] Convert `memref_reshape` to LLVM.
https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15
Differential Revision: https://reviews.llvm.org/D90377
Florian Hahn [Tue, 3 Nov 2020 09:55:47 +0000 (09:55 +0000)]
[SLP] Pass VecPred argument to getCmpSelInstrCost.
Check if all compares in VL have the same predicate and pass it to
getCmpSelInstrCost, to improve cost-modeling on targets that only
support compare/select combinations for certain uniform predicates.
This leads to additional vectorization in some cases
```
Same hash: 217 (filtered out)
Remaining: 19
Metric: SLP.NumVectorInstructions
Program base slp2 diff
test-suite...marks/SciMark2-C/scimark2.test 11.00 26.00 136.4%
test-suite...T2006/445.gobmk/445.gobmk.test 79.00 135.00 70.9%
test-suite...ediabench/gsm/toast/toast.test 54.00 71.00 31.5%
test-suite...telecomm-gsm/telecomm-gsm.test 54.00 71.00 31.5%
test-suite...CI_Purple/SMG2000/smg2000.test 426.00 542.00 27.2%
test-suite...ch/g721/g721encode/encode.test 30.00 24.00 -20.0%
test-suite...000/186.crafty/186.crafty.test 116.00 138.00 19.0%
test-suite...ications/JM/ldecod/ldecod.test 697.00 765.00 9.8%
test-suite...6/464.h264ref/464.h264ref.test 822.00 886.00 7.8%
test-suite...chmarks/MallocBench/gs/gs.test 154.00 162.00 5.2%
test-suite...nsumer-lame/consumer-lame.test 621.00 651.00 4.8%
test-suite...lications/ClamAV/clamscan.test 223.00 231.00 3.6%
test-suite...marks/7zip/7zip-benchmark.test 680.00 695.00 2.2%
test-suite...CFP2000/177.mesa/177.mesa.test 2121.00 2129.00 0.4%
test-suite...:: External/Povray/povray.test 2406.00 2412.00 0.2%
test-suite...TimberWolfMC/timberwolfmc.test 634.00 634.00 0.0%
test-suite...CFP2006/433.milc/433.milc.test 1036.00 1036.00 0.0%
test-suite.../Benchmarks/nbench/nbench.test 321.00 321.00 0.0%
test-suite...ctions-flt/Reductions-flt.test NaN 5.00 nan%
```
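The uniform-predicate check described in this commit could look roughly like the following. It is a Python sketch with a hypothetical function name; `None` stands in for whatever "no common predicate" value is handed to the cost model.

```python
def common_predicate(predicates):
    """Return the predicate shared by all compares, or None if they differ."""
    if not predicates:
        return None
    first = predicates[0]
    return first if all(p == first for p in predicates) else None
```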
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D90124
serge-sans-paille [Tue, 3 Nov 2020 10:14:38 +0000 (11:14 +0100)]
[lld] missing doc entry for error handling script
Fix http://lab.llvm.org:8011/#/builders/69/builds/67
Nicholas Guy [Thu, 22 Oct 2020 12:41:05 +0000 (13:41 +0100)]
[AArch64] Redundant masks in downcast long multiply
Adds patterns to catch masks preceding a long multiply
and generate a single umull/smull instruction instead.
Differential revision: https://reviews.llvm.org/D89956
serge-sans-paille [Mon, 19 Oct 2020 11:19:52 +0000 (13:19 +0200)]
Provide a hook to customize missing library error handling
Make it possible for lld users to provide a custom script that would help to
find missing libraries. A possible scenario could be:
% clang /tmp/a.c -fuse-ld=lld -loauth -Wl,--error-handling-script=/tmp/addLibrary.py
unable to find library -loauth
looking for relevant packages to provides that library
liboauth-0.9.7-4.el7.i686
liboauth-devel-0.9.7-4.el7.i686
liboauth-0.9.7-4.el7.x86_64
liboauth-devel-0.9.7-4.el7.x86_64
pix-1.6.1-3.el7.x86_64
Where addLibrary would be called with the missing library name as first argument
(in that case addLibrary.py oauth)
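A hook along these lines might look as follows. This is a hedged Python sketch, not the script used above: it assumes, as stated here, that the missing library name arrives as the first argument (the exact argument layout should be checked against the lld documentation), and it replaces the package-manager query with a hard-coded, hypothetical `PACKAGE_INDEX`.

```python
import sys

# Hypothetical stand-in for a real package database query.
PACKAGE_INDEX = {
    "oauth": [
        "liboauth-0.9.7-4.el7.x86_64",
        "liboauth-devel-0.9.7-4.el7.x86_64",
    ],
}


def suggest_packages(library, index=PACKAGE_INDEX):
    """Return the packages believed to provide -l<library>."""
    return index.get(library, [])


def main(argv):
    """Entry point; argv[1] is assumed to be the missing library name."""
    if len(argv) < 2:
        print("usage: addLibrary.py <library>", file=sys.stderr)
        return 1
    library = argv[1]
    print(f"unable to find library -l{library}")
    for package in suggest_packages(library):
        print(package)
    return 0
```

A standalone script would finish with `sys.exit(main(sys.argv))`.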
Differential Revision: https://reviews.llvm.org/D87758
David Green [Tue, 3 Nov 2020 09:58:28 +0000 (09:58 +0000)]
[CostModel] Make target intrinsics cheap by default
This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.
Differential Revision: https://reviews.llvm.org/D90597
Sander de Smalen [Mon, 2 Nov 2020 21:32:57 +0000 (21:32 +0000)]
[NFCI] Add StackOffset class and base classes for ElementCount, TypeSize.
This patch adds a linear polynomial base class, called LinearPolyBase, which
serves as a base class for StackOffset. It tries to represent a linear
polynomial like:
c0 * scale0 + c1 * scale1 + ... + cK * scaleK
where the scale is implicit, meaning that only the coefficients are
encoded.
This patch also adds a univariate linear polynomial, which serves as
a base class for ElementCount and TypeSize. This tries to represent a
linear polynomial where only one dimension can be set at any one time,
i.e. a TypeSize is either fixed-sized, or scalable-sized, but cannot be
a combination of the two.
class LinearPolyBase
^
|
+---- class StackOffset (dimensions = 2 (fixed/scalable), type = int64_t)
class UnivariateLinearPolyBase
|
|
+---- class LinearPolySize (dimensions = 2 (fixed/scalable))
^
|
+-------- class ElementCount (type = unsigned)
|
|
+-------- class TypeSize (type = uint64_t)
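The two shapes can be modelled in a few lines of Python. This is an illustrative sketch only: the class names are hypothetical stand-ins rather than the LLVM classes, and `vscale` denotes the runtime multiplier of the scalable dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackOffsetModel:
    """fixed * 1 + scalable * vscale; only the coefficients are stored."""
    fixed: int = 0
    scalable: int = 0

    def __add__(self, other):
        return StackOffsetModel(self.fixed + other.fixed,
                                self.scalable + other.scalable)

    def evaluate(self, vscale):
        return self.fixed + self.scalable * vscale


@dataclass(frozen=True)
class TypeSizeModel:
    """Univariate form: a single coefficient in exactly one dimension."""
    min_value: int
    is_scalable: bool = False

    def __add__(self, other):
        # Mixing fixed and scalable sizes would need two dimensions.
        if self.is_scalable != other.is_scalable:
            raise ValueError("cannot add fixed and scalable sizes")
        return TypeSizeModel(self.min_value + other.min_value,
                             self.is_scalable)
```

The multivariate model can hold 16 fixed bytes plus 2 scalable units at once, while the univariate one rejects such a mix.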
Reviewed By: ctetreau, david-arm
Differential Revision: https://reviews.llvm.org/D88982
Pedro Tammela [Sun, 1 Nov 2020 16:23:36 +0000 (16:23 +0000)]
[LLDB][NFC] treat Lua error codes in a more explicit manner
This patch is a minor suggestion to not rely on the fact
that the `LUA_OK` macro is 0.
This assumption could change in future versions of the C API.
Differential Revision: https://reviews.llvm.org/D90556
Tres Popp [Tue, 3 Nov 2020 09:23:54 +0000 (10:23 +0100)]
[mlir] Add to shape.is_broadcastable description
Tres Popp [Tue, 13 Oct 2020 15:56:45 +0000 (17:56 +0200)]
[mlir] Add partial lowering of shape.cstr_broadcastable.
Because cstr operations allow more instruction reordering than asserts, we only
lower cstr_broadcastable to std ops with cstr_require. This ensures that the
more drastic lowering to asserts happens only when the user explicitly asks for it.
Differential Revision: https://reviews.llvm.org/D89325
Michał Górny [Mon, 2 Nov 2020 22:48:55 +0000 (23:48 +0100)]
[lldb] [Plugins/FreeBSDRemote] Disable GetMemoryRegionInfo()
Disable GetMemoryRegionInfo() in order to unbreak expression parsing.
For some reason, the presence of a non-stub function causes LLDB to fail
to detect system libraries correctly. Because LLDB is then unable to find
mmap() and allocate memory, this leads to the expression parser being
broken.
The issue is non-trivial and it is going to require more time debugging.
On the other hand, the downsides of missing the function are minimal
(2 failing tests), and the benefit of working expression parser
justifies disabling it temporarily. Furthermore, the old FreeBSD plugin
did not implement it anyway, so it allows us to switch to the new plugin
without major regressions.
The really curious part is that the respective code in the NetBSD plugin
yields very similar results, yet does not seem to break the expression
parser.
Differential Revision: https://reviews.llvm.org/D90650
Michał Górny [Mon, 2 Nov 2020 16:01:40 +0000 (17:01 +0100)]
[lldb] [Process/FreeBSDRemote] Remove GetSharedLibraryInfoAddress override
Remove the NetBSD-specific override of GetSharedLibraryInfoAddress(),
restoring the generic implementation from NativeProcessELF.
Differential Revision: https://reviews.llvm.org/D90620
Michał Górny [Sat, 31 Oct 2020 09:30:13 +0000 (10:30 +0100)]
[lldb] [Process/FreeBSDRemote] Fix attaching via lldb-server
Fix two bugs that caused attaching to a process in a pre-connected
lldb-server to fail. These are:
1. Prematurely reporting status in NativeProcessFreeBSD::Attach().
The SetState() call defaulted to notify the process, and LLGS tried
to send the stopped packet before the process instance was assigned
to it. While at it, add an assert for that in LLGS.
2. Duplicate call to ReinitializeThreads() (via SetupTrace()) that
overwrote the stopped status in threads. Now SetupTrace() is called
directly by NativeProcessFreeBSD::Attach() (not the Factory) in place
of ReinitializeThreads().
This fixes at least commands/process/attach/TestProcessAttach.py
and python_api/hello_world/TestHelloWorld.py.
Differential Revision: https://reviews.llvm.org/D90525
Michał Górny [Fri, 30 Oct 2020 11:00:45 +0000 (12:00 +0100)]
[lldb] [Host/{free,net}bsd] Fix process matching by name
Fix process matching by name to make 'process attach -n ...' work.
The process finding code has an optimization that defers getting
the process name and executable format after the numeric (PID, UID...)
parameters are tested. However, the ProcessInstanceInfoMatch.Matches()
method has been matching process name against the incomplete process
information as well, and effectively no process ever matched.
In order to fix this, create a copy of ProcessInstanceInfoMatch, set
it to ignore the process name and use this copy for the initial match.
The same fix applies to FreeBSD and NetBSD host code.
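The two-stage match can be sketched as follows. This is a Python model with hypothetical names, not the LLDB code: a copy of the criteria with the name cleared filters on the cheap numeric fields, and the name is compared only after it has been fetched.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Criteria:
    name: Optional[str] = None
    uid: Optional[int] = None

    def matches(self, proc):
        if self.uid is not None and proc.get("uid") != self.uid:
            return False
        if self.name is not None and proc.get("name") != self.name:
            return False
        return True


def find_processes(criteria, procs, fetch_name):
    """procs carry only numeric fields; fetch_name(pid) is the costly step."""
    # First pass: ignore the name, since it has not been read yet.
    numeric_only = copy.copy(criteria)
    numeric_only.name = None
    found = []
    for proc in procs:
        if not numeric_only.matches(proc):
            continue
        full = dict(proc, name=fetch_name(proc["pid"]))
        # Second pass: the full criteria, now that the name is known.
        if criteria.matches(full):
            found.append(full)
    return found
```

Matching the unmodified criteria against a record that has no name yet fails for every process, which is the behaviour the fix removes.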
Differential Revision: https://reviews.llvm.org/D90454
Michał Górny [Wed, 28 Oct 2020 11:40:45 +0000 (12:40 +0100)]
[lldb] [Process/FreeBSDRemote] Implement thread GetName()
Implement NativeThreadFreeBSD::GetName(). This is based
on the equivalent code in the legacy FreeBSD plugin, except it is
modernized a bit to use llvm::Optional and std::vector for data storage.
Differential Revision: https://reviews.llvm.org/D90298
Georgii Rymar [Tue, 13 Oct 2020 13:17:04 +0000 (16:17 +0300)]
[llvm-readobj/libObject] - Allow dumping objects that have a broken SHT_SYMTAB_SHNDX section.
Currently it is impossible to create an instance of ELFObjectFile when the
SHT_SYMTAB_SHNDX can't be read. We error out when we fail to parse the
SHT_SYMTAB_SHNDX section in the factory method.
This change delays reading of the SHT_SYMTAB_SHNDX section entries,
so that llvm-readobj is now able to work with such inputs.
Differential revision: https://reviews.llvm.org/D89379
Petar Avramovic [Tue, 3 Nov 2020 08:23:56 +0000 (09:23 +0100)]
AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner
Change match/apply functions into methods of a new target-specific combiner
helper class. Use a reference to the MachineIRBuilder from the helper instead
of constructing a new MachineIRBuilder each time a new instruction needs to
be made. This allows correct tracking of newly created instructions.
Differential Revision: https://reviews.llvm.org/D90623
Martin Storsjö [Tue, 3 Nov 2020 08:20:36 +0000 (10:20 +0200)]
[clang] Fix the fsanitize.c testcase after eaae6fdf67e1f. NFC.
After that commit, the vptr sanitizer is enabled for mingw targets.
Max Kazantsev [Tue, 3 Nov 2020 07:59:04 +0000 (14:59 +0700)]
[NFC] Refactor code in IndVars, preparing for further improvement
Martin Storsjö [Sun, 1 Nov 2020 20:49:49 +0000 (22:49 +0200)]
[clang] [MinGW] Allow using the vptr sanitizer
Differential Revision: https://reviews.llvm.org/D90572
Martin Storsjö [Sun, 1 Nov 2020 20:52:08 +0000 (22:52 +0200)]
[compiler-rt] [ubsan] Use the itanium type info lookup for mingw targets
Differential Revision: https://reviews.llvm.org/D90571
Esme-Yi [Tue, 3 Nov 2020 07:44:11 +0000 (07:44 +0000)]
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: This patch depends on D89846. We have patterns to fold two RLWINMs in ppc-mi-peephole, but some RLWINMs are generated after RA, for example rGc4690b007743. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.
Reviewed By: shchenz, steven.zhang
Differential Revision: https://reviews.llvm.org/D89855
Martin Storsjö [Thu, 29 Oct 2020 10:40:13 +0000 (12:40 +0200)]
[libcxx] [test] Use error_code::default_error_condition to check errors against the expected codes
error_code returned from functions might not be of the generic category,
but of the system category, which can have different error code values.
Use default_error_condition() to remap errors to the generic category
where possible, to allow comparing them to the expected values.
Use the ErrorIs() helper instead of a direct comparison against
an expected value.
Differential Revision: https://reviews.llvm.org/D90602
Martin Storsjö [Mon, 2 Nov 2020 08:19:42 +0000 (10:19 +0200)]
[libcxx] Avoid double frees of file descriptors in the fallback ifstream/ofstream codepath
So far, most actual uses of libc++ std::filesystem probably use
the sendfile or fcopyfile implementations.
Differential Revision: https://reviews.llvm.org/D90601
Martin Storsjö [Fri, 30 Oct 2020 23:01:28 +0000 (01:01 +0200)]
[libcxx] [test] Create symlink_to_dir as the right kind, as a directory symlink
This was missed in 5c39eebc126d.
Differential Revision: https://reviews.llvm.org/D90600
Martin Storsjö [Thu, 22 Oct 2020 10:28:35 +0000 (13:28 +0300)]
[libcxx] [test] Avoid an unused variable in non-libcpp cases in path.append
Differential Revision: https://reviews.llvm.org/D89947
Martin Storsjö [Wed, 14 Oct 2020 10:16:03 +0000 (13:16 +0300)]
[libcxx] [test] Fix the fs.op.absolute test to cope with windows paths
Prepend the root path on the already_absolute testcase, and construct
a path ending with the preferred separator for the test reference for
"foo/".
Differential Revision: https://reviews.llvm.org/D89944
Martin Storsjö [Fri, 30 Oct 2020 17:11:23 +0000 (19:11 +0200)]
[libcxxabi] Build all of libcxxabi with _LIBCPP_BUILDING_LIBRARY defined
Various definitions from libcxx need to be set in the same way
as if building libcxx itself.
Differential Revision: https://reviews.llvm.org/D90476
Stephan Bergmann [Thu, 15 Oct 2020 16:10:22 +0000 (18:10 +0200)]
[scan-build] Fix clang++ pathname again
e00629f777d9d62875730f40d266727df300dbb2 "[scan-build] Fix clang++ pathname" had
removed the -MAJOR.MINOR suffix, but since presumably LLVM 7 the suffix is only
-MAJOR, so ClangCXX (i.e., the CLANG_CXX environment variable passed to
clang/tools/scan-build/libexec/ccc-analyzer) now contained a non-existing
/path/to/clang-12++ (which apparently went largely unnoticed as
clang/tools/scan-build/libexec/ccc-analyzer falls back to just 'clang++' if the
executable denoted by CLANG_CXX does not exist).
For the new clang/test/Analysis/scan-build/cxx-name.test to be effective,
%scan-build must now take care to pass the clang executable's resolved pathname
(i.e., ending in .../clang-MAJOR rather than just .../clang) to --use-analyzer.
Differential Revision: https://reviews.llvm.org/D89481
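The pathname derivation discussed above can be sketched like this. It is a
Python illustration of the idea only; scan-build itself is a Perl script,
and the function name clangxx_path and the exact regular expression are
assumptions rather than the code from the patch.

```python
import re


def clangxx_path(clang: str) -> str:
    """Derive the clang++ pathname from a resolved clang pathname."""
    if clang.endswith("++"):
        return clang
    # Strip a trailing version suffix: -MAJOR (newer releases)
    # or -MAJOR.MINOR (older releases).
    base = re.sub(r"-\d+(\.\d+)?$", "", clang)
    return base + "++"
```

A pattern that only removes -MAJOR.MINOR leaves /path/to/clang-12 untouched
and so produces the non-existing /path/to/clang-12++ mentioned above.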
Max Kazantsev [Tue, 3 Nov 2020 07:13:28 +0000 (14:13 +0700)]
[NFC] Split lambda into 2 parts for further reuse
Craig Topper [Tue, 3 Nov 2020 06:54:20 +0000 (22:54 -0800)]
[RISCV] Remove isel patterns for fshl/fshr with same inputs. NFC
These were being selected to ROL/ROR, but DAG combine should
canonicalize fshl/fshr with same inputs to rotl/rotr which we
also have patterns for.
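The equivalence this relies on can be checked with a small model of 32-bit
funnel shifts. The sketch below is in Python and follows the usual
definition of fshl/fshr (concatenate the two inputs, shift, keep one half);
the helper names are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF


def fshl32(a: int, b: int, s: int) -> int:
    """Funnel shift left: shift a:b left by s mod 32, keep the high word."""
    s %= 32
    wide = ((a & MASK32) << 32) | (b & MASK32)
    return ((wide << s) >> 32) & MASK32


def fshr32(a: int, b: int, s: int) -> int:
    """Funnel shift right: shift a:b right by s mod 32, keep the low word."""
    s %= 32
    wide = ((a & MASK32) << 32) | (b & MASK32)
    return (wide >> s) & MASK32


def rotl32(x: int, s: int) -> int:
    x &= MASK32
    s %= 32
    return ((x << s) | (x >> (32 - s))) & MASK32


def rotr32(x: int, s: int) -> int:
    x &= MASK32
    s %= 32
    return ((x >> s) | (x << (32 - s))) & MASK32
```

When both inputs are the same value x, fshl32(x, x, s) equals rotl32(x, s)
and fshr32(x, x, s) equals rotr32(x, s) for every shift amount, which is why
separate isel patterns for that case are redundant.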
Max Kazantsev [Tue, 3 Nov 2020 06:18:46 +0000 (13:18 +0700)]
[IndVars] Use knowledge about execution on last iteration when removing checks
If we know that some check will not be executed on the last iteration, we can use this
fact to eliminate its check.
Differential Revision: https://reviews.llvm.org/D88210
Reviewed By: ebrevnov
Esme-Yi [Tue, 3 Nov 2020 06:28:56 +0000 (06:28 +0000)]
[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Summary: We have patterns to fold two RLWINMs in ppc-mi-peephole, but some RLWINMs are generated after RA, for example D88274. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.
This is an NFC patch to move the folding patterns to PPCInstrInfo; follow-up work will call it in pre-emit-peephole and expand the patterns to handle more cases.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D89846
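As background for the fold, here is a sketch of why two rotate-and-mask
operations can collapse into one. It is a Python model, not the
PPCInstrInfo code: masks are passed as plain 32-bit values instead of MB/ME
fields, and fold_rlwinm_pair is a hypothetical helper that simply declines
when the combined mask is empty or is not a single (possibly wrapping) run
of ones.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, s: int) -> int:
    x &= MASK32
    s %= 32
    return ((x << s) | (x >> (32 - s))) & MASK32


def rlwinm(x: int, sh: int, mask: int) -> int:
    """Rotate left by sh, then AND with mask."""
    return rotl32(x, sh) & mask


def is_run_of_ones(mask: int) -> bool:
    """True for a non-empty, possibly wrapping, contiguous run of 1 bits."""
    transitions = bin(mask ^ rotl32(mask, 1)).count("1")
    return mask != 0 and transitions <= 2


def fold_rlwinm_pair(sh1: int, mask1: int, sh2: int, mask2: int):
    # Rotation distributes over AND, so
    #   rotl(rotl(x, sh1) & mask1, sh2) & mask2
    #     == rotl(x, sh1 + sh2) & (rotl(mask1, sh2) & mask2)
    mask = rotl32(mask1, sh2) & mask2
    if not is_run_of_ones(mask):
        return None  # not expressible as one rotate-and-mask in this model
    return ((sh1 + sh2) % 32, mask)
```

The identity in the comment is what makes the fold valid; the remaining
work is deciding whether the combined mask is still encodable.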
Sameeran joshi [Wed, 28 Oct 2020 18:13:49 +0000 (23:43 +0530)]
[Flang][OpenMP][NFC][1/2] Reorder OmpStructureChecker and simplify it.
`OmpStructureChecker` has too much boilerplate code in its source file.
It was not easy to figure out the separation of clauses inside 'OmpClause' and
the ones which had a separate node in parse-tree.h.
This patch:
1. Removes the boilerplate by defining a few macros.
2. Separates constructs, directives and clauses (subclasses are separated).
3. Uses macros, which could be shared between OMP and OACC; template
specializations might have been costly.
Follows the same strategy used for `AccStructureChecker`.
The next patch in the series to simplify OmpStructureChecker will try to
simplify the boilerplate inside the functions and either create abstractions
or reuse whatever is already available in check-directive-structure.h.
Reviewed By: kiranchandramohan, clementval
Differential Revision: https://reviews.llvm.org/D90324