Vitaly Buka [Fri, 31 Jul 2020 01:48:34 +0000 (18:48 -0700)]
[ValueTracking] Remove AllocaForValue parameter
findAllocaForValue uses AllocaForValue to cache resolved values.
The function is used only to resolve arguments of lifetime
intrinsics, which usually are not far from allocas, so the benefit of
reusing results is likely unnoticeable.
In followup patches I'd like to replace the function with
GetUnderlyingObjects.
Depends on D84616.
Differential Revision: https://reviews.llvm.org/D84617
Shilei Tian [Fri, 31 Jul 2020 01:37:01 +0000 (21:37 -0400)]
[OpenMP] Refactored the function `targetDataEnd`
Refactored the function `targetDataEnd` in preparation for fixing
the issue of ahead-of-time target memory deallocation. This patch only
renames `targetDataEnd`-related variables and functions to conform
with the LLVM coding standards.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D84991
Vitaly Buka [Fri, 31 Jul 2020 01:22:59 +0000 (18:22 -0700)]
[NFC] Move findAllocaForValue into ValueTracking.h
Differential Revision: https://reviews.llvm.org/D84616
Shilei Tian [Fri, 31 Jul 2020 01:05:30 +0000 (21:05 -0400)]
[OpenMP] Refactored the function `target`
Refactored the function `target` in preparation for fixing the
issue of ahead-of-time device memory deallocation.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D84816
dfukalov [Thu, 30 Jul 2020 01:12:17 +0000 (04:12 +0300)]
[NFC][AMDGPU] Improve fused fmul+fadd tests.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D84903
Scott Constable [Fri, 31 Jul 2020 00:21:48 +0000 (17:21 -0700)]
[X86] Fix for ballooning compile times due to Load Value Injection (LVI) mitigations
Fix for the issue raised in https://github.com/rust-lang/rust/issues/74632.
The current heuristic for inserting LFENCEs uses a quadratic-time algorithm. This can apparently cause substantial compilation slowdowns for building Rust projects, where functions > 5000 LoC are apparently common.
The updated heuristic in this patch implements a linear-time algorithm. On a set of benchmarks, the slowdown factor for the generated code was comparable (2.55x geo mean for the quadratic-time heuristic, vs. 2.58x for the linear-time heuristic). Both heuristics offer the same security properties, namely, mitigating LVI.
This patch also includes some formatting fixes.
Differential Revision: https://reviews.llvm.org/D84471
Craig Topper [Fri, 31 Jul 2020 00:05:06 +0000 (17:05 -0700)]
[X86] Separate CPU Feature lists in X86.td between architecture features and tuning features
After the recent change to the tuning settings for pentium4 to improve our default 32-bit behavior, I've decided to see about implementing -mtune support. This way we could have a default architecture CPU of "pentium4" or "x86-64" and a default tuning cpu of "generic". And we could change our "pentium4" tuning settings back to what they were before.
As a step to supporting this, this patch separates all of the features lists for the CPUs into 2 lists. I'm using the Proc class and a new ProcModel class to concat the 2 lists before passing to the target independent ProcessorModel. Future work to truly support mtune would change ProcessorModel to take 2 lists separately. I've diffed the X86GenSubtargetInfo.inc file before and after this patch to ensure that the final feature list for the CPUs isn't changed.
Differential Revision: https://reviews.llvm.org/D84879
kuterd [Thu, 30 Jul 2020 20:26:39 +0000 (23:26 +0300)]
[Attributor] Add time trace support.
This patch adds time trace functionality to have a better understanding
of the analysis times.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84980
Craig Topper [Thu, 30 Jul 2020 18:25:55 +0000 (11:25 -0700)]
[ValueTracking] Add basic computeKnownBits support for llvm.abs intrinsic
This includes basic support for computeKnownBits on abs. I've left FIXMEs for more complicated things we could do.
Differential Revision: https://reviews.llvm.org/D84963
Vedant Kumar [Thu, 30 Jul 2020 23:19:05 +0000 (16:19 -0700)]
[profile] Remove dependence on getpagesize from InstrProfilingBuffer.c.o
InstrProfilingBuffer.c.o is generic code that must support compilation
into freestanding projects. This gets rid of its dependence on the
_getpagesize symbol from libc, shifting it to InstrProfilingFile.c.o.
This fixes a build failure seen in a firmware project.
rdar://66249701
Davide Italiano [Thu, 30 Jul 2020 23:20:38 +0000 (16:20 -0700)]
[debugserver/Apple Silicon] Handoff connections when attaching to translated processes
When we detect a process that the native debugserver cannot handle,
handoff the connection fd to the translated debugserver.
Eli Friedman [Mon, 27 Jul 2020 21:08:31 +0000 (14:08 -0700)]
[LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR
Just the obvious implementation that rewrites the result type. Also fix
warning from EXTRACT_SUBVECTOR legalization that triggers on the test.
Differential Revision: https://reviews.llvm.org/D84706
Amara Emerson [Fri, 24 Jul 2020 20:01:36 +0000 (13:01 -0700)]
[AArch64][GlobalISel] Add legalization & selection support for G_INTRINSIC_LRINT.
Differential Revision: https://reviews.llvm.org/D84552
Mircea Trofin [Thu, 30 Jul 2020 23:08:06 +0000 (16:08 -0700)]
[doc] Describe the header guard style
clang-tidy's llvm-header-guard rule references the LLVM style - where it's
missing.
Differential Revision: https://reviews.llvm.org/D84989
Siva Chandra Reddy [Wed, 29 Jul 2020 06:42:11 +0000 (23:42 -0700)]
[libc] Add a tool called WrapperGen.
This tool will be used to generate C wrappers for the C++ LLVM libc
implementations. This change does not hook this tool up to anything yet.
However, it can be useful for cases where one does not want to run the
objcopy step (to insert the C symbol in the object file) but can make use
of LTO to eliminate the cost of the additional wrapper call. This can be
relevant for certain downstream platforms. If this tool can benefit other
libc platforms in general, then it can be integrated into the build system
with options to use or not use the wrappers. An example of such a
platform is CUDA.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D84848
Eli Friedman [Mon, 27 Jul 2020 21:01:46 +0000 (14:01 -0700)]
[clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary
fptosi/fptoui have similar, but not identical, semantics. In
particular, the behavior on overflow is different.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit. (The
corresponding patch for 32-bit is more involved because the equivalent
intrinsics don't exist, as far as I can tell.)
Differential Revision: https://reviews.llvm.org/D84703
LLVM GN Syncbot [Thu, 30 Jul 2020 22:29:22 +0000 (22:29 +0000)]
[gn build] Port 763671f387f
Lang Hames [Thu, 30 Jul 2020 05:55:33 +0000 (22:55 -0700)]
[llvm-jitlink] Add -harness option to llvm-jitlink.
The -harness option enables new testing use-cases for llvm-jitlink. It takes a
list of objects to treat as a test harness for any regular objects passed to
llvm-jitlink.
If any files are passed using the -harness option then the following
transformations are applied to all other files:
(1) Symbol definitions that are referenced by the harness files are promoted
to default scope. (This enables access to statics from test harness).
(2) Symbol definitions that clash with definitions in the harness files are
deleted. (This enables interposition by test harness).
(3) All other definitions in regular files are demoted to local scope.
(This causes untested code to be dead stripped, reducing memory cost and
eliminating spurious unresolved symbol errors from untested code).
These transformations allow the harness files to reference and interpose
symbols in the regular object files, which can be used to support execution
tests (including fuzz tests) of functions in relocatable objects produced by a
build.
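The three transformations can be sketched as one decision per symbol definition in a regular (non-harness) object. This is only an illustrative model, not the llvm-jitlink implementation: the `Scope` enum, the function name, and the choice to test for a clash before testing for a reference are all assumptions.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of what happens to a definition; not a JITLink type.
enum class Scope { Default, Local, Deleted };

// Decide the fate of one symbol defined in a regular object, given the
// names the harness defines and the names the harness references.
Scope scopeAfterHarness(const std::string &Name,
                        const std::set<std::string> &HarnessDefs,
                        const std::set<std::string> &HarnessRefs) {
  assert(!Name.empty() && "symbol names are assumed to be non-empty");
  if (HarnessDefs.count(Name))
    return Scope::Deleted; // (2) clash: the harness definition interposes
  if (HarnessRefs.count(Name))
    return Scope::Default; // (1) referenced by the harness: promote
  return Scope::Local;     // (3) everything else: demote to local scope
}
```

Under this model a static helper that the harness calls becomes visible, while code the harness never reaches stays local and can be dead stripped.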
Lang Hames [Thu, 30 Jul 2020 03:46:56 +0000 (20:46 -0700)]
[JITLink] Allow JITLinkContext::notifyResolved to return an Error.
This allows clients to detect invalid transformations applied by JITLink passes
(e.g. inserting or removing symbols in unexpected ways) and terminate linking
with an error.
This change is used to simplify the error propagation logic in
ObjectLinkingLayer.
Zequan Wu [Tue, 21 Jul 2020 20:46:11 +0000 (13:46 -0700)]
[COFF] Port CallGraphSort to COFF from ELF
Rahul Joshi [Thu, 30 Jul 2020 21:18:33 +0000 (14:18 -0700)]
[MLIR][NFC] Add SymbolUse::UseRange::empty()
Differential Revision: https://reviews.llvm.org/D84984
Matt Arsenault [Wed, 1 Jul 2020 16:48:42 +0000 (12:48 -0400)]
AMDGPU: Fix liveness errors when copying AGPR tuples
Avoid recursively calling copyPhysReg for AGPR handling. This was
dropping the necessary super register implicit defs to avoid liveness
verifier errors.
Thomas Raoux [Thu, 30 Jul 2020 21:56:50 +0000 (14:56 -0700)]
[mlir][spirv] Add support for converting memref of vector to SPIR-V
This allows declaring buffers and alloc of vectors so that we can support vector
load/store.
Differential Revision: https://reviews.llvm.org/D84982
Nathan James [Thu, 30 Jul 2020 21:57:32 +0000 (22:57 +0100)]
[clang-tidy][NFC] Use StringMap for ClangTidyCheckFactories::FactoryMap
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D84926
Richard Smith [Thu, 30 Jul 2020 21:17:26 +0000 (14:17 -0700)]
PR46908: Emit undef destroying_delete_t as an aggregate RValue.
We previously used a non-aggregate RValue to represent the passed value,
which violated the assumptions of call arg lowering in some cases, in
particular on 32-bit Windows, where we'd end up producing an FCA store
with TBAA metadata, that the IR verifier would reject.
Jez Ng [Thu, 30 Jul 2020 21:38:58 +0000 (14:38 -0700)]
[lld-macho] Add comment for literal argument
Changpeng Fang [Thu, 30 Jul 2020 21:37:06 +0000 (14:37 -0700)]
AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst
Summary:
This is in response to the review of https://reviews.llvm.org/D84873:
The expensive check should be reordered last
Reviewers:
arsenm
Differential Revision:
https://reviews.llvm.org/D84890
Jez Ng [Thu, 30 Jul 2020 21:29:14 +0000 (14:29 -0700)]
[lld-macho] Make __LINKEDIT sections contiguous
codesign (or more specifically libstuff) checks that each section in
__LINKEDIT ends where the next one starts -- no gaps are permitted. This
diff achieves it by aligning every section's start and end points to
WordSize.
Remarks: ld64 appears to satisfy the constraint by adding padding bytes
when generating the __LINKEDIT data, e.g. by emitting BIND_OPCODE_DONE
(which is a 0x0 byte) repeatedly. I think the approach this diff takes
is a bit more elegant, but I'm not sure if it's too restrictive. In
particular, it assumes padding always uses the zero byte. But we can
revisit this later.
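A minimal sketch of the alignment idea, assuming a word size of 8 bytes and hypothetical names (`alignUp`, `Span`, `layOutContiguously`); it is not lld's code. Rounding every end point up to the word size means the next section starts exactly where the previous one ends.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round Offset up to the next multiple of Align (Align must be a power of two).
uint64_t alignUp(uint64_t Offset, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0);
  return (Offset + Align - 1) & ~(Align - 1);
}

struct Span {
  uint64_t Start;
  uint64_t End;
};

// Lay out sections back to back. Each section's end is padded up to
// WordSize, so Span[i].End == Span[i + 1].Start and no gaps remain.
std::vector<Span> layOutContiguously(uint64_t Base,
                                     const std::vector<uint64_t> &RawSizes,
                                     uint64_t WordSize = 8) {
  std::vector<Span> Out;
  uint64_t Cursor = alignUp(Base, WordSize);
  for (uint64_t Size : RawSizes) {
    uint64_t End = alignUp(Cursor + Size, WordSize);
    Out.push_back({Cursor, End});
    Cursor = End;
  }
  return Out;
}
```

With raw sizes 5, 16 and 3 starting at offset 0, the spans come out as [0, 8), [8, 24) and [24, 32).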
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84718
Jez Ng [Thu, 30 Jul 2020 21:28:45 +0000 (14:28 -0700)]
[lld-macho] Implement -headerpad
Tools like `install_name_tool` and `codesign` may modify the Mach-O
header and increase its size. The linker has to provide padding to make this
possible. This diff does that, plus sets its default value to 32 bytes (which
is what ld64 does).
Unlike ld64, however, we lay out our sections *exactly* `-headerpad` bytes from
the header, whereas ld64 just treats the padding requirement as a lower bound.
ld64 actually starts laying out the non-header sections in the __TEXT segment
from the end of the (page-aligned) segment rather than the front, so its
binaries typically have more than `-headerpad` bytes of actual padding.
We should consider implementing the same alignment behavior.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84714
Jez Ng [Thu, 30 Jul 2020 21:28:41 +0000 (14:28 -0700)]
[lld-macho] Support __dso_handle for C++
The C++ ABI requires dylibs to pass a pointer to __cxa_atexit which does
e.g. cleanup of static global variables. The C++ spec says that the pointer
can point to any address in one of the dylib's segments, but in practice
ld64 seems to set it to point to the header, so that's what's implemented
here.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83603
Nikita Popov [Thu, 30 Jul 2020 20:54:53 +0000 (22:54 +0200)]
[ConstantRange][CVP] Make use of abs poison flag
Pass the abs poison flag to the underlying ConstantRange
implementation, allowing CVP to simplify based on it.
Importantly, this recognizes that abs with poison flag is actually
non-negative...
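To illustrate why the flag matters, here is a brute-force sketch on 8-bit values. It is not the ConstantRange API: `absRangeI8` is a hypothetical helper that only tracks the smallest and largest result and ignores wrapped ranges. Without the flag, abs(-128) wraps back to -128; with the flag, that input yields poison and contributes no defined result.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Smallest and largest value of abs(x) for x in [Lo, Hi], modelled on i8.
// Returns {127, -128} (Min > Max) if no input produces a defined result.
std::pair<int, int> absRangeI8(int Lo, int Hi, bool IntMinIsPoison) {
  assert(-128 <= Lo && Lo <= Hi && Hi <= 127);
  int Min = 127, Max = -128;
  for (int X = Lo; X <= Hi; ++X) {
    int R;
    if (X == -128) {
      if (IntMinIsPoison)
        continue; // poison: no defined result to account for
      R = -128;   // two's complement abs(-128) wraps to -128
    } else {
      R = X < 0 ? -X : X;
    }
    Min = std::min(Min, R);
    Max = std::max(Max, R);
  }
  return {Min, Max};
}
```

For the input range [-128, 5], the result spans -128 to 127 without the flag and 0 to 127 with it, which is the non-negativity the message refers to.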
Jon Roelofs [Wed, 29 Jul 2020 19:14:17 +0000 (13:14 -0600)]
[SelectionDAG] Fix lowering of vector geps
This fixes an assertion failure that was being triggered in
SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32>
to i64 (which should have been <2xi64>).
Fixes: rdar://66016901
Differential Revision: https://reviews.llvm.org/D84884
Jonas Devlieghere [Thu, 30 Jul 2020 20:50:05 +0000 (13:50 -0700)]
[lldb/Docs] Remove stale bot on GreenDragon and add reproducer one
- Remove the link to the Python 3 job which no longer exists.
- Add a link to the reproducer job.
Nikita Popov [Thu, 30 Jul 2020 20:47:33 +0000 (22:47 +0200)]
[ConstantRange] Support abs with poison flag
This just adds the ConstantRange support, including exhaustive
testing. It's not wired up to the IR intrinsic flag yet.
Jonas Devlieghere [Thu, 30 Jul 2020 20:46:47 +0000 (13:46 -0700)]
[lldb/Docs] Add lldb-arm-ubuntu to the list of bots
Peiyuan Song [Thu, 30 Jul 2020 20:37:17 +0000 (23:37 +0300)]
[compiler-rt] [profile] fix profile generate for mingw x86_64
Differential Revision: https://reviews.llvm.org/D84757
Peiyuan Song [Thu, 30 Jul 2020 20:32:37 +0000 (23:32 +0300)]
[LLD] [Mingw] Don't export symbols from profile generate
Differential Revision: https://reviews.llvm.org/D84756
Nikita Popov [Thu, 30 Jul 2020 20:15:06 +0000 (22:15 +0200)]
[ConstantRange][CVP] Compute min/max/abs intrinsic ranges
Wire up ConstantRange::intrinsic() to the existing primitives for
min, max and abs.
The poison flag on abs is not yet taken into account.
Nikita Popov [Thu, 30 Jul 2020 20:16:11 +0000 (22:16 +0200)]
[CVP] Add tests for min/max/abs intrinsic comparisons (NFC)
Petr Hosek [Fri, 17 Jul 2020 02:50:34 +0000 (19:50 -0700)]
[CMake][Fuchsia] Include additional tools in the toolchain
These are needed on Windows.
Differential Revision: https://reviews.llvm.org/D83999
Peter Steinfeld [Thu, 30 Jul 2020 17:51:44 +0000 (10:51 -0700)]
[flang] Fix an assert on duplicate initializations
When declaring the same variable twice with an initialization, we were failing
an internal check. I fixed this by checking to see if the associated symbol
already had an error.
I added tests for pointer and non-pointer initialization of duplicate names.
Differential Revision: https://reviews.llvm.org/D84969
Petr Hosek [Wed, 24 Jun 2020 03:00:04 +0000 (20:00 -0700)]
[ELF] Add --dependency-file option
Clang and GCC have a feature (-MD flag) to create a dependency file
in a format that build systems such as Make or Ninja can read, which
specifies all the additional inputs such as .h files.
This change introduces the same functionality to lld bringing it to
feature parity with ld and gold which gained this feature recently.
See https://sourceware.org/bugzilla/show_bug.cgi?id=22843 for more
details and discussion.
The implementation corresponds to -MD -MP compiler flag where the
generated dependency file also includes phony targets which works
around the errors where the dependency is removed. This matches the
format used by ld and gold.
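As a rough illustration of the -MD -MP shape, the sketch below builds a Make-style rule followed by one empty phony rule per input. The function name `makeDepFile` is hypothetical, escaping of special characters in paths is omitted, and the exact whitespace lld emits may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build "target: inputs..." plus an empty rule for every input. The empty
// rules keep Make from failing when an input file has been removed.
std::string makeDepFile(const std::string &Target,
                        const std::vector<std::string> &Inputs) {
  assert(!Target.empty());
  std::string Out = Target + ":";
  for (const std::string &In : Inputs)
    Out += " \\\n  " + In; // one input per continuation line
  Out += "\n";
  for (const std::string &In : Inputs)
    Out += "\n" + In + ":\n"; // phony target for this input
  return Out;
}
```

Calling it with target `a.out` and inputs `a.o`, `b.o` gives a rule for `a.out` listing both objects, then the empty rules `a.o:` and `b.o:`.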
Fixes PR42806
Differential Revision: https://reviews.llvm.org/D82437
Nikita Popov [Sun, 19 Jul 2020 19:28:14 +0000 (21:28 +0200)]
[SCCP] Remove dead switch cases based on range information
Determine whether switch edges are feasible based on range information,
and remove non-feasible edges later on.
This does not try to determine whether the default edge is dead,
as we'd have to determine that the range is fully covered by the
cases for that.
Another limitation here is that we don't remove dead cases that
have the same successor as a live case. I'm not handling this
because I wanted to keep the edge removal based on feasible edges
only, rather than inspecting ranges again there -- this does not
seem like a particularly useful case to handle.
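The edge-based approach and its two limitations can be sketched as follows. This is a simplified model with hypothetical names (`Case`, `feasibleSuccessors`, `canRemoveCase`) and a plain inclusive range [Lo, Hi] instead of LLVM's range type; successors are identified by integer ids.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct Case {
  int64_t Value; // case constant
  int Successor; // id of the target block
};

// Successors that remain feasible when the condition lies in [Lo, Hi].
// The default successor is always kept, since proving it dead would
// require showing that the cases cover the whole range.
std::set<int> feasibleSuccessors(const std::vector<Case> &Cases,
                                 int DefaultSucc, int64_t Lo, int64_t Hi) {
  assert(Lo <= Hi);
  std::set<int> Feasible{DefaultSucc};
  for (const Case &C : Cases)
    if (Lo <= C.Value && C.Value <= Hi)
      Feasible.insert(C.Successor);
  return Feasible;
}

// A case is removed only if its edge is infeasible. A dead case that
// shares its successor with a live case is therefore kept.
bool canRemoveCase(const Case &C, const std::set<int> &Feasible) {
  return Feasible.count(C.Successor) == 0;
}
```

With cases 0, 1, 10 and 11 and a known range of [0, 3], case 10 is removable only if no live case branches to the same block.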
Differential Revision: https://reviews.llvm.org/D84270
Jonas Devlieghere [Thu, 30 Jul 2020 18:47:55 +0000 (11:47 -0700)]
[lldb/Test] Use self.assertIn in TestGdbRemoteTargetXmlPacket
On the ARM buildbot the returned architecture is `armv8l` while
getArchitecture() just returns `arm`.
Florian Hahn [Thu, 30 Jul 2020 18:12:28 +0000 (19:12 +0100)]
[LAA] Avoid adding pointers to the checks if they are not needed.
Currently we skip alias sets with only reads or a single write and no
reads, but still add the pointers to the list of pointers in RtCheck.
This can lead to cases where we try to access a pointer that does not
exist when grouping checks. In most cases, the way we access
PositionMap masked that, as the value would default to index 0.
But in the example in PR46854 it causes a crash.
This patch updates the logic to avoid adding pointers for alias sets
that do not need any checks. It makes things slightly more verbose, by
first checking the numbers of reads/writes and bailing out early if we don't
need checks for the alias set.
I think this makes the logic a bit simpler to follow.
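A minimal sketch of the early bail-out, assuming a simplified `AliasSet` that stores integer pointer ids; the struct and both function names are hypothetical and do not mirror the LoopAccessAnalysis data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified alias set: ids of pointers that are read and written.
struct AliasSet {
  std::vector<int> Reads;
  std::vector<int> Writes;
};

// Sets with only reads, or with a single write and no reads, need no checks.
bool needsRuntimeChecks(std::size_t NumReads, std::size_t NumWrites) {
  if (NumWrites == 0)
    return false;
  if (NumWrites == 1 && NumReads == 0)
    return false;
  return true;
}

// Collect pointers only from alias sets that actually need checks, so
// later grouping never looks up a pointer that was not registered.
std::vector<int> pointersToCheck(const std::vector<AliasSet> &Sets) {
  std::vector<int> Out;
  for (const AliasSet &AS : Sets) {
    if (!needsRuntimeChecks(AS.Reads.size(), AS.Writes.size()))
      continue; // bail out before touching the pointer list
    Out.insert(Out.end(), AS.Writes.begin(), AS.Writes.end());
    Out.insert(Out.end(), AS.Reads.begin(), AS.Reads.end());
  }
  return Out;
}
```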
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D84608
Alexander Belyaev [Thu, 30 Jul 2020 18:16:51 +0000 (20:16 +0200)]
[mlir] NFC: Expose `getElementPtrType` and `getSizes` methods of AllocOpLowering.
Differential Revision: https://reviews.llvm.org/D84917
Sanjay Patel [Thu, 30 Jul 2020 18:16:18 +0000 (14:16 -0400)]
[InstCombine] update test checks; NFC
Ettore Tiotto [Thu, 30 Jul 2020 15:36:09 +0000 (15:36 +0000)]
Fix computeHostNumPhysicalCores() for Linux on POWER and Linux on Z
ThinLTO is run using a single thread on Linux on Power. The
compute_thread_count() routine calls getHostNumPhysicalCores which
returns -1 by default, and so `MaxThreadCount` is set to 1.
    unsigned llvm::ThreadPoolStrategy::compute_thread_count() const {
      int MaxThreadCount = UseHyperThreads
                               ? computeHostNumHardwareThreads()
                               : sys::getHostNumPhysicalCores();
      if (MaxThreadCount <= 0)
        MaxThreadCount = 1;
      …
    }
Fix: provide custom implementation of getHostNumPhysicalCores for
Linux on Power and Linux on Z.
Reviewed By: Kai, uweigand
Differential Revision: https://reviews.llvm.org/D84764
Wouter van Oortmerssen [Mon, 27 Jul 2020 21:59:31 +0000 (14:59 -0700)]
[WebAssembly] Fixed 64-bit indices in br_table
LLVM selection dag assumes "switch" indices are pointer sized, which causes problems for our 32-bit br_table. The new function ensures 32-bit operands don't get unnecessarily extended, and 64-bit operands get truncated.
Note that the changes to the existing test test exactly that: the addition of -NEXT in 2 places ensures no extension is inserted (which the test previously ignored) and that the wrap is present (previously omitted in wasm64 mode).
Differential Revision: https://reviews.llvm.org/D84705
Stanislav Mekhanoshin [Thu, 30 Jul 2020 00:17:45 +0000 (17:17 -0700)]
[AMDGPU] Do not use undef on indirect source
We are using undef on the indirect move source subreg and then
using implicit super-reg. This creates a problem in RA when
Greedy decides to split the register. It reassigns the implicit
super-reg but does not bother to change undef source because
it is really does not matter. The fix is to stop lying to RA and
drop undef flag.
This has also hit a problem in SIFoldOperands as it can fold
immediate into an indirect move since there is no undef flag
anymore. That results in multiple test failures, so added the
check for this case.
Differential Revision: https://reviews.llvm.org/D84899
Jonas Devlieghere [Thu, 30 Jul 2020 17:34:32 +0000 (10:34 -0700)]
[lldb] Add copy ctor/assignment operator to SBCommandInterpreterRunOptions
Jordan Rupprecht [Thu, 30 Jul 2020 17:25:28 +0000 (10:25 -0700)]
[lldb][test] Move registers-target-xml-reading target to the correct test location.
This test was added in D74217 (and the `.categories` file later added in
ccf1c30cde6e1e763e7c9cdd48a609a805166699) around the same time I moved the test tree from `lldb/packages/Python/lldbsuite/test` to `lldb/test/API` (D71151). Since this got lost in the move, it isn't running. (I introduced an intentional syntax error, and `ninja check-lldb` passes).
I moved it to the correct location, and now it runs and passes -- locally, at least -- as `ninja check-lldb-api-tools-lldb-server-registers-target-xml-reading`.
Simon Pilgrim [Thu, 30 Jul 2020 16:37:57 +0000 (17:37 +0100)]
LoopUnroll.cpp - pass std::vector by const reference to needToInsertPhisForLCSSA helper. NFCI.
Avoid an unnecessary pass by value.
Yuanfang Chen [Wed, 29 Jul 2020 00:08:24 +0000 (17:08 -0700)]
[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations
Problem:
Right now, our "Running pass" output is not accurate when passes are wrapped in an adaptor, because the adaptor is never skipped while the wrapped pass could be. The other problem is that "Running pass" for an adaptor is printed before any "Running pass" of the passes/analyses it depends on (for example, FunctionToLoopPassAdaptor), so the order of printing is not the actual order.
Solution:
Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever an adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder.)
This patch moves debug logging for passes into a PassInstrument callback. We can then be sure that all running passes are logged, and in the correct order.
This could also be used to implement hierarchical pass logging in the legacy PM. We could also move logging of the pass manager to this if we want.
The test fixes look messy. They include these changes:
- Remove PassInstrumentationAnalysis
- Remove PassAdaptor
- If a PassAdaptor is for a real pass, the pass is added
- Pass reorder (to the correct order), related to PassAdaptor
- Add missing passes (due to Debuglogging not passed down)
Reviewed By: asbirlea, aeubanks
Differential Revision: https://reviews.llvm.org/D84774
Craig Topper [Thu, 30 Jul 2020 16:54:02 +0000 (09:54 -0700)]
[WebAssembly] Fix GCC 5 build.
Hans' speculative fix in b7292f2db02d37c9291afc0613a3fbce0a4ad4e8
didn't work for me. This seems to.
Johannes Doerfert [Thu, 30 Jul 2020 16:53:06 +0000 (11:53 -0500)]
[MLIR][OpenMP] Fix OpenMPIRBuilder usage after D82470
Jordan Rupprecht [Wed, 29 Jul 2020 23:58:29 +0000 (16:58 -0700)]
[lldb][NFC][test] Fix comment referring to FileCheck instead of yaml2obj
cgyurgyik [Thu, 30 Jul 2020 16:36:28 +0000 (12:36 -0400)]
[libc] Implements isdigit and isalnum. Adds a utility header to inline
functions to avoid overhead of function calls.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D84893
Kuba Mracek [Thu, 30 Jul 2020 16:32:11 +0000 (09:32 -0700)]
[tsan] Fixup for 1260a155: Move variadic-open.cpp test into Darwin/ directory
Hiroshi Yamauchi [Wed, 29 Jul 2020 22:22:13 +0000 (15:22 -0700)]
[PGO] Include the mem ops into the function hash.
To avoid hash collisions when the only difference is in mem ops.
Joel E. Denny [Thu, 30 Jul 2020 16:20:46 +0000 (12:20 -0400)]
[OpenMP][Docs] Mark `present` motion modifier as done
hsmahesha [Thu, 30 Jul 2020 16:09:34 +0000 (21:39 +0530)]
[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic
Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as
follows. The existing heuristic roughly summarizes as below:
* Assume that all the mem op instructions participating in the clustering process load/store the same number of bytes
* If num bytes loaded by each mem op is 4 bytes, then cluster at max 5 mem ops, that is at max 20 bytes
* If num bytes loaded by each mem op is 8 bytes, then cluster at max 3 mem ops, that is at max 24 bytes
* If num bytes loaded by each mem op is 16 bytes, then cluster at max 2 mem ops, that is at max 32 bytes
So, we need to make sure that the new heuristic does not completely deviate from the above one, and that it
properly handles both the sub-word loads and the wide loads.
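The summary of the existing heuristic can be written down as a small lookup, which makes the resulting byte limits easy to compare. This encodes only the three rows listed above; the function names are hypothetical, and returning 0 for sizes the summary does not mention is a placeholder, not AMDGPU behavior. The new `num-clustered-dwords` based heuristic is not modelled here.

```cpp
#include <cassert>

// Max number of mem ops clustered under the old heuristic, per the summary.
// Returns 0 for sizes the summary does not cover (placeholder only).
unsigned oldMaxClusteredOps(unsigned BytesPerOp) {
  switch (BytesPerOp) {
  case 4:
    return 5;
  case 8:
    return 3;
  case 16:
    return 2;
  default:
    return 0;
  }
}

// Total bytes covered by a maximal cluster under the old heuristic.
unsigned oldMaxClusteredBytes(unsigned BytesPerOp) {
  return BytesPerOp * oldMaxClusteredOps(BytesPerOp);
}
```

The limits work out to 20, 24 and 32 bytes for 4-, 8- and 16-byte ops, so the old cap was not a single fixed byte count.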
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D84354
Kuba Mracek [Thu, 30 Jul 2020 16:00:14 +0000 (09:00 -0700)]
[tsan] Fix the open and open64 interceptors to have correct declarations (variadic functions)
Not matching the (real) variadic declaration makes the interceptor take garbage inputs on Darwin/AArch64.
Differential Revision: https://reviews.llvm.org/D84570
Louis Dionne [Fri, 11 Oct 2019 18:42:26 +0000 (14:42 -0400)]
[libc++] Use generator expression in Linker script generation
This is an alternative to the workaround in 34a3b24a90c6.
Differential Revision: https://reviews.llvm.org/D68880
Artem Dergachev [Thu, 30 Jul 2020 01:05:42 +0000 (18:05 -0700)]
[clang-tidy] Fix ODR violation in unittests.
Both tests define clang::tidy::test::TestCheck::registerMatchers().
This is UB and causes the linker to sometimes choose the wrong definition.
Put classes into anonymous namespaces to avoid the problem.
Differential Revision: https://reviews.llvm.org/D84902
Jonas Devlieghere [Thu, 30 Jul 2020 15:46:02 +0000 (08:46 -0700)]
[lldb] Add SBCommandInterpreterRunOptions to LLDB.h
Brendon Cahoon [Thu, 30 Jul 2020 14:50:59 +0000 (09:50 -0500)]
Align store conditional address
In cases where the alignment of the datatype is smaller than
expected by the instruction, the address is aligned. The aligned
address is used for the load, but wasn't used for the store
conditional, which resulted in a run-time alignment exception.
Fangrui Song [Thu, 30 Jul 2020 15:30:06 +0000 (08:30 -0700)]
[X86] Parse and ignore .arch directives
We parse .arch so that some `.arch i386; .code32` code can assemble. It seems
that X86AsmParser does not do a good job tracking what features are needed to
assemble instructions. GNU as's x86 port supports a very wide range of .arch
operands. Ignore the operand for now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D84900
Johannes Doerfert [Mon, 29 Jun 2020 18:14:46 +0000 (13:14 -0500)]
[OpenMP][FIX] Consistently use OpenMPIRBuilder if requested
When we use the OpenMPIRBuilder for the parallel region we need to also
use it to get the thread ID (among other things) in the body. This is
because CGOpenMPRuntime::getThreadID() and
CGOpenMPRuntime::emitUpdateLocation implicitly assume that if they are
called from within a parallel region there is a certain structure to the
code and certain members of the OMPRegionInfo are initialized. It might
make sense to initialize them even if we use the OpenMPIRBuilder but we
would preferably get rid of such state instead.
Bug reported by Anchu Rajendran Sudhakumari.
Depends on D82470.
Reviewed By: anchu-rajendran
Differential Revision: https://reviews.llvm.org/D82822
Johannes Doerfert [Fri, 19 Jun 2020 14:38:04 +0000 (09:38 -0500)]
[OpenMP][IRBuilder] Support allocas in nested parallel regions
We need to keep track of the alloca insertion point (which we already
communicate via the callback to the user) as we place allocas as well.
Reviewed By: fghanim, SouraVX
Differential Revision: https://reviews.llvm.org/D82470
Alexey Bataev [Thu, 30 Jul 2020 15:00:21 +0000 (11:00 -0400)]
[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D84767
Kirill Bobyrev [Thu, 30 Jul 2020 15:12:29 +0000 (17:12 +0200)]
[clangd] NFC: Spell out types in index callback arguments
Alexey Bataev [Thu, 30 Jul 2020 14:57:21 +0000 (10:57 -0400)]
Revert "[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region."
This reverts commit 142d0d3ed8e07aca2476bc4ecc1a12d15577a84a to
investigate undefined behavior revealed by buildbots.
Xiangling Liao [Wed, 29 Jul 2020 18:05:20 +0000 (14:05 -0400)]
[AIX] Temporarily disable IncrementalProcessingTest partially
Temporarily disable IncrementalProcessingTest partially until the static
initialization implementation on AIX is recovered.
Differential Revision: https://reviews.llvm.org/D84880
Momchil Velikov [Thu, 30 Jul 2020 13:43:15 +0000 (14:43 +0100)]
[AArch64] Fix operand definitions of XPACI/XPACD
The operand to these instructions is both input and output.
These are not yet emitted by the compiler and the assembler already
works fine, so can't test in this patch. But D75044 will use XPACI
and provide test coverage for this patch as well.
Differential Revision: https://reviews.llvm.org/D84298
Matt Arsenault [Thu, 30 Jul 2020 13:23:19 +0000 (09:23 -0400)]
AMDGPU: Convert some tests to use new buffer intrinsics
The legacy (neither struct nor raw) buffer intrinsics should now all be
consolidated into the tests specifically for those intrinsics.
Simon Pilgrim [Thu, 30 Jul 2020 13:25:08 +0000 (14:25 +0100)]
Attributor.h - remove unnecessary includes. NFCI.
Fix implicit cpp include dependencies.
Jinsong Ji [Thu, 30 Jul 2020 02:46:39 +0000 (02:46 +0000)]
[PowerPC][AIX] Move the testcase to proper dir
Hans Wennborg [Thu, 30 Jul 2020 14:11:15 +0000 (16:11 +0200)]
Speculative GCC 5 build fix
It's complaining about specializing the template in a different namespace.
Tim Keith [Thu, 30 Jul 2020 14:12:24 +0000 (07:12 -0700)]
[flang] Create HostAssoc symbols for uplevel references
To make it easier for lowering to identify which symbols from the host
are captured by internal subprograms, create HostAssocDetails for them.
In particular, if a symbol is referenced and it is contained in a
subprogram or main program that is not the same as the containing
program unit of the reference, a HostAssocDetails symbol is created
in the current scope.
Differential Revision: https://reviews.llvm.org/D84889
Alexey Bataev [Fri, 24 Jul 2020 21:46:19 +0000 (17:46 -0400)]
[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.
It applies only for global pointers.
Differential Revision: https://reviews.llvm.org/D84767
jasonliu [Wed, 29 Jul 2020 15:06:04 +0000 (15:06 +0000)]
[XCOFF][AIX] Enable -ffunction-sections
Summary:
This patch implements -ffunction-sections on AIX.
This patch focuses on assembly generation.
Follow-on patch needs to handle:
1. -ffunction-sections implication for jump table.
2. Object file generation path and associated testing.
Differential Revision: https://reviews.llvm.org/D83875
Sanjay Patel [Thu, 30 Jul 2020 12:37:02 +0000 (08:37 -0400)]
[ConstantFolding] add tests for abs intrinsic; NFC
David Green [Thu, 30 Jul 2020 13:28:08 +0000 (14:28 +0100)]
[LoopVectorizer] Don't create unused block masks for reductions. NFC
This removes some unneeded block masks when we don't have any
reductions. It should not have any effect on codegen as the values
created are dead anyway.
Differential Revision: https://reviews.llvm.org/D81415
Louis Dionne [Thu, 30 Jul 2020 13:26:34 +0000 (09:26 -0400)]
[libc++] Add XFAIL for <float.h> and <cfloat> tests on older Clangs
Stephan Herhut [Thu, 30 Jul 2020 12:38:12 +0000 (14:38 +0200)]
[mlir][shape] Use memref of index in shape lowering
Now that we can have a memref of index type, we no longer need to materialize shapes in i64 and then index_cast.
Differential Revision: https://reviews.llvm.org/D84938
Christian Sigg [Thu, 30 Jul 2020 08:09:29 +0000 (10:09 +0200)]
[MLIR] Don't pass separate LowerToLLVMOptions when we already pass a LLVMTypeConverter which contains those options already.
This also prevents passing inconsistent options.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D84915
Abhishek Varma [Thu, 30 Jul 2020 12:36:59 +0000 (18:06 +0530)]
[MLIR] Introduce inter-procedural memref layout normalization
-- Introduces a pass that normalizes the affine layout maps to the identity layout map both within and across functions by rewriting function arguments and call operands where necessary.
-- Memref normalization is now implemented entirely in the module pass '-normalize-memrefs' and the limited intra-procedural version has been removed from '-simplify-affine-structures'.
-- Run using -normalize-memrefs.
-- Return ops are not handled yet; they will be handled in subsequent revisions.
Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D84490
Stephan Herhut [Thu, 30 Jul 2020 12:02:46 +0000 (14:02 +0200)]
[mlir] Allow index as element type of memref
Differential Revision: https://reviews.llvm.org/D84934
Jean Perier [Thu, 30 Jul 2020 12:29:24 +0000 (14:29 +0200)]
[flang] Expose specific to generic intrinsic name mapping
The intrinsic lowering facility is based on the generic intrinsic names to avoid
duplicating implementations. Calls to specific intrinsics are rewritten into calls to
the generic versions by the front-end, but this cannot be done when specific intrinsics
are passed as arguments (the rewrite would give illegal/ambiguous unparsed Fortran).
Solve the issue by making the specific-to-generic name mapping accessible to lowering,
where it can later be used to generate the unrestricted intrinsic functions.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D84842
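A minimal sketch of what a specific-to-generic name lookup could look like. This is not flang's actual interface: `SpecificToGeneric`, `kSpecificNames` and `genericNameFor` are hypothetical names, and the table holds only a few specific/generic pairs from the Fortran standard.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical table entry: a specific intrinsic name and its generic name.
struct SpecificToGeneric {
  std::string_view specific;
  std::string_view generic;
};

// A few illustrative pairs; a real table would be much longer.
constexpr std::array<SpecificToGeneric, 4> kSpecificNames{{
    {"alog", "log"},
    {"dabs", "abs"},
    {"dsqrt", "sqrt"},
    {"iabs", "abs"},
}};

// Returns the generic name for a specific intrinsic name, or std::nullopt
// if the name is not in the table (for example, if it is already generic).
std::optional<std::string_view> genericNameFor(std::string_view name) {
  for (const auto &entry : kSpecificNames)
    if (entry.specific == name)
      return entry.generic;
  return std::nullopt;
}
```

With such a lookup, lowering could resolve a specific name passed as an actual argument to the generic implementation it should wrap, without the front-end rewriting the source-level name.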
Florian Hahn [Thu, 30 Jul 2020 12:06:15 +0000 (13:06 +0100)]
Revert "[IPConstProp] Remove and move tests to SCCP."
This reverts commit e77624a3be942c7abba48942b3a8da3462070a3f.
Looks like some clang tests manually invoke -ipconstprop via opt.....
Frederik Gossen [Thu, 30 Jul 2020 11:40:16 +0000 (11:40 +0000)]
[MLIR][Shape] Limit `shape.rank` lowering to its extent tensor variant
When lowering to the standard dialect, we currently support only the extent
tensor variant of the shape.rank operation. This change lets the conversion
pattern fail in a well-defined manner.
Differential Revision: https://reviews.llvm.org/D84852
Florian Hahn [Thu, 30 Jul 2020 09:15:47 +0000 (10:15 +0100)]
[IPConstProp] Remove and move tests to SCCP.
As far as I know, ipconstprop has not been used in years and ipsccp has
been used instead. Keeping the pass around has the potential for confusion and
sometimes leads people to spend time finding & reporting bugs as well as
updating it to work with the latest API changes.
This patch moves the tests over to SCCP. There's one functional difference
I am aware of: ipconstprop propagates for each call-site individually, so
for functions that are called with different constant arguments it can sometimes
produce better results than ipsccp (at much higher compile-time cost). But
IPSCCP can be thought to do so as well for internal functions and, as mentioned
earlier, the pass seems unused in practice (and there are no plans to work
towards enabling it anytime soon).
Also discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143773.html
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84447
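The trade-off for functions called with different constant arguments can be illustrated with a toy lattice. This is a simplified model under stated assumptions, not the code of either pass: `ArgState`, `merge` and `summarizeCallSites` are hypothetical names, and the model tracks a single integer argument only.

```cpp
#include <vector>

// Toy lattice value for one function argument: unknown (no call seen yet),
// a single known constant, or overdefined (conflicting values seen).
struct ArgState {
  enum Kind { Unknown, Constant, Overdefined } kind = Unknown;
  int value = 0;
};

// Merge the constant passed at one more call site into the shared state.
ArgState merge(ArgState state, int incoming) {
  if (state.kind == ArgState::Unknown)
    return {ArgState::Constant, incoming};
  if (state.kind == ArgState::Constant && state.value == incoming)
    return state;
  return {ArgState::Overdefined, 0};
}

// Per-function view: fold the constants from all call sites into one state.
ArgState summarizeCallSites(const std::vector<int> &constantsPassed) {
  ArgState state;
  for (int c : constantsPassed)
    state = merge(state, c);
  return state;
}
```

In this model, call sites passing 2 and 2 leave the argument a known constant, while call sites passing 2 and 3 make it overdefined. An analysis that looks at each call site on its own would still see the constants 2 and 3 separately, at the price of doing more work per call.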
Simon Pilgrim [Thu, 30 Jul 2020 11:11:34 +0000 (12:11 +0100)]
VectorUtils.h - reduce unnecessary includes. NFC.
Replace TargetLibraryInfo.h include with forward declaration and fix implicit dependencies.
Reduce SmallSet.h include to SmallVector.h include.
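As a general illustration of replacing an include with a forward declaration (a single-file sketch, not the actual VectorUtils.h change): `LibraryInfo` and `hasVectorVariant` are hypothetical stand-ins, and the class body is invented for the example.

```cpp
// Forward declaration: enough for declarations that only use pointers or
// references to the type, so the defining header need not be included.
class LibraryInfo; // hypothetical stand-in for a heavyweight class

// Header-style declaration that needs only the incomplete type.
bool hasVectorVariant(const LibraryInfo &info, int functionId);

// What would live in the .cpp file, where the full definition is visible.
class LibraryInfo {
public:
  // Invented behaviour for the sketch: even ids are "supported".
  bool supports(int functionId) const { return functionId % 2 == 0; }
};

bool hasVectorVariant(const LibraryInfo &info, int functionId) {
  return info.supports(functionId);
}
```

The flip side is that any file which relied on the removed include transitively must now include what it uses itself, which is the "fix implicit dependencies" part of such a change.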
Simon Pilgrim [Thu, 30 Jul 2020 10:27:37 +0000 (11:27 +0100)]
[X86][SSE] combineExtractWithShuffle - extend extract(truncate(x),0) for any source vector size
As long as we can extract the lowest 128-bit subvector from the pre-truncated source vector, then we don't care what size it is.
The next stage will be to support non-zero extraction indices, as long as it's still coming from the lowest 128-bit subvector.
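A scalar model of why the source vector size does not matter for extract(truncate(x),0). This sketch uses plain arrays rather than SelectionDAG nodes; the 32-bit to 16-bit lane widths and all function names are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a lane-wise vector truncate from 32-bit to 16-bit lanes.
template <std::size_t N>
std::array<std::uint16_t, N>
truncateLanes(const std::array<std::uint32_t, N> &x) {
  std::array<std::uint16_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<std::uint16_t>(x[i]);
  return out;
}

// extract(truncate(x), 0): truncate every lane, then read lane 0.
template <std::size_t N>
std::uint16_t extractOfTruncate(const std::array<std::uint32_t, N> &x) {
  return truncateLanes(x)[0];
}

// Equivalent form: read lane 0 of the wide source, then truncate the scalar.
// Only the lowest lane of x is touched, whatever the total vector width.
template <std::size_t N>
std::uint16_t truncateOfExtract(const std::array<std::uint32_t, N> &x) {
  return static_cast<std::uint16_t>(x[0]);
}
```

Both forms agree for any lane count `N`, because lane 0 of the truncated vector depends only on lane 0 of the source.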
Kirill Bobyrev [Thu, 30 Jul 2020 10:57:20 +0000 (12:57 +0200)]
[clangd] Implement Relations request for remote index
This is the last missing bit in the core remote index implementation. The only
remaining bits are some API refactorings (replacing Optional with Expected and
being better at reporting errors).
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D84894
Florian Hahn [Thu, 30 Jul 2020 10:17:52 +0000 (11:17 +0100)]
[AArch64] Add machine-combiner tests with instruction level FMFs.
Raphael Isemann [Thu, 30 Jul 2020 09:50:53 +0000 (11:50 +0200)]
[lldb] Don't use static locals for return value storage in some *AsCString functions
Let's just return a std::string to make this safe. formatv seemed overkill for formatting
the return values as they all just append an integer value to a constant string.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D84505
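A minimal sketch of the pattern being removed and its replacement. The function names and the "signal " prefix are hypothetical and not taken from lldb; the point is only the difference between the two return conventions.

```cpp
#include <string>

// Unsafe pattern: every call overwrites the same static buffer, so a pointer
// obtained from an earlier call can change or dangle, and concurrent calls race.
const char *signalNameUnsafe(int signo) {
  static std::string storage;
  storage = "signal " + std::to_string(signo);
  return storage.c_str();
}

// Safer pattern: each call returns its own std::string by value.
std::string signalName(int signo) {
  return "signal " + std::to_string(signo);
}
```

Since each result is just a constant string with an integer appended, `std::to_string` plus concatenation is sufficient here, and no formatting library is needed.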
Esme-Yi [Thu, 30 Jul 2020 10:05:04 +0000 (10:05 +0000)]
[NFC] Failed cases for some patterns defined in DAGCombiner.cpp
Aleksandr Platonov [Thu, 30 Jul 2020 09:45:07 +0000 (12:45 +0300)]
[clangd] findNearbyIdentifier(): fix the word search in the token stream.
Without this patch, the word occurrence search always returns the first token of the file.
Despite that, `findNearbyIdentifier()` returns the correct result (but inefficiently) unless there are several matched tokens with the same value of `floor(log2(<token line> - <word line>))` (e.g. several matched tokens on the same line).
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D84912
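To see why several matched tokens can share the same `floor(log2(...))` value, here is a small sketch of a logarithmic line-distance bucket. It is not clangd's actual cost function: `floorLog2` and `lineBucket` are hypothetical names, and giving same-line tokens bucket 0 and using the absolute distance are assumptions of this sketch.

```cpp
// floor(log2(n)) for n >= 1, computed by counting how often n can be halved.
unsigned floorLog2(unsigned n) {
  unsigned result = 0;
  while (n > 1) {
    n /= 2;
    ++result;
  }
  return result;
}

// Hypothetical distance bucket between a candidate token's line and the
// word's line. Same-line tokens get bucket 0 (an assumption of this sketch);
// otherwise the bucket is 1 + floor(log2(distance in lines)).
unsigned lineBucket(unsigned tokenLine, unsigned wordLine) {
  unsigned distance =
      tokenLine > wordLine ? tokenLine - wordLine : wordLine - tokenLine;
  return distance == 0 ? 0 : 1 + floorLog2(distance);
}
```

Tokens 4 and 5 lines away from the word fall into the same bucket, as do all tokens on the word's own line, so a search that breaks ties by starting position needs to begin at the right token to pick the nearest one.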