Bill Wendling [Tue, 22 Jun 2021 00:41:47 +0000 (17:41 -0700)]
[llvm-diff] Explicitly check ConstantArrays
Global initializers may be ConstantArrays. They need to be checked
explicitly, because different-yet-still-equivalent type names may be
used for each, and/or a GEP instruction may appear in one.
Bill Wendling [Sun, 20 Jun 2021 21:45:12 +0000 (14:45 -0700)]
[llvm-diff] Add support for diffing the callbr instruction
The only wrinkle is that we can't process the "blockaddress" arguments
of the callbr until the blocks have been equated. So we force them to be
"unified" before checking.
This was left out when the callbr instruction was added.
Differential Revision: https://reviews.llvm.org/D104606
Nikita Popov [Tue, 22 Jun 2021 19:17:40 +0000 (21:17 +0200)]
Revert "[compiler-rt] Make use of undefined symbols configurable"
This reverts commit ed7086ad46f99f639b85ea6c8bda7c1a71be7c53.
This reverts commit b9792638b0bfb308e0c7c125ac78f4ebf910c11b.
This breaks cmake with message:
CMake Error at llvm-project/compiler-rt/CMakeLists.txt:449:
Parse error. Expected "(", got newline with text "
Nikita Popov [Tue, 22 Jun 2021 15:20:44 +0000 (17:20 +0200)]
[OpaquePtr] Support changing load type in InstCombine
When the load type is changed to ptr, we need the load pointer type
to also be ptr, because it's not allowed to create a pointer to an
opaque pointer. This is achieved by adjusting the getPointerTo() API
to return an opaque pointer for an opaque pointer base type.
Differential Revision: https://reviews.llvm.org/D104718
Sami Tolvanen [Tue, 22 Jun 2021 19:09:44 +0000 (12:09 -0700)]
Revert "ThinLTO: Fix inline assembly references to static functions with CFI"
This reverts commit 4474958d3a97dede2caa0920f7c4a4dc7aac57d3.
Breaks check-llvm on Mac.
Joseph Huber [Fri, 21 May 2021 18:43:44 +0000 (14:43 -0400)]
[OpenMP] Remove OpenMP CUDA Target Parallel compiler flag
Summary:
The changes introduced in D97680 turn this command line option into a no-op so
it can be removed entirely.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D102940
Christopher Di Bella [Tue, 22 Jun 2021 18:58:30 +0000 (18:58 +0000)]
[libcxx][doc] corrects LWG links in the One Ranges section
Petr Hosek [Tue, 22 Jun 2021 18:58:26 +0000 (11:58 -0700)]
[CMake] Fix the option declaration
This addresses the build issue introduced in
b9792638b0bfb308e0c7c125ac78f4ebf910c11b.
Christopher Di Bella [Fri, 28 May 2021 00:46:49 +0000 (00:46 +0000)]
[libcxx][docs] updates the ranges status paper
* indicates whether work has been started or completed
* consolidates content that was split for dependency reasons (iff
everything has been merged)
* makes things a lot more fine-grained
* turns sub-CSVs into lists
* puts links into description section and removes patch column
* adds links to c++draft on occasion
These changes heavily prioritise the reader of the generated HTML
file, not the source.
Differential Revision: https://reviews.llvm.org/D103295
Petr Hosek [Tue, 22 Jun 2021 18:03:37 +0000 (11:03 -0700)]
[compiler-rt] Make use of undefined symbols configurable
We want to disable the use of undefined symbols on Fuchsia, but there
are cases where it might be desirable, so make it configurable.
Differential Revision: https://reviews.llvm.org/D104728
Petr Hosek [Mon, 21 Jun 2021 02:10:50 +0000 (19:10 -0700)]
[compiler-rt][CMake] Drop flags that are set by default for Fuchsia
-Wl,-z,now is set by the Fuchsia driver, -Wl,-z,relro is the default
in LLD.
Akira Hatanaka [Tue, 22 Jun 2021 18:42:26 +0000 (11:42 -0700)]
[CodeGen] Don't create fake FunctionDecls when generating block/byref
copy/dispose helper functions
We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.
Differential Revision: https://reviews.llvm.org/D104082
Joseph Huber [Tue, 22 Jun 2021 18:40:31 +0000 (14:40 -0400)]
[OpenMP][NFC] Add new optimizations to OpenMPOpt comment header
Summary:
Mentions the new globalization optimizations in the OpenMPOpt comment header.
Joseph Huber [Wed, 16 Jun 2021 19:37:22 +0000 (15:37 -0400)]
[Attributor] Add an option to increase the max number of iterations
Right now the Attributor defaults to 32 fixed point iterations unless it is set
explicitly by a command line flag. This patch allows this to be configured when
the attributor instance is created. The maximum is then increased in OpenMPOpt
if the target is a kernel. This is because the globalization analysis can result
in larger iteration counts due to many dependent instances running at once.
Depends on D102444
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104416
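The idea of a fixed-point driver whose iteration budget is chosen at construction time can be illustrated with a minimal standalone sketch. This is not the Attributor's actual interface: `FixpointDriver`, `DefaultMaxIterations`, and `run` are hypothetical names, and only the default of 32 iterations is taken from the commit message.

```cpp
#include <functional>
#include <optional>

// Hypothetical stand-in for a fixed-point solver. The iteration budget
// defaults to 32 but can be overridden when the instance is created.
class FixpointDriver {
public:
  static constexpr unsigned DefaultMaxIterations = 32;

  explicit FixpointDriver(std::optional<unsigned> maxIterations = std::nullopt)
      : maxIterations_(maxIterations.value_or(DefaultMaxIterations)) {}

  // Calls step() until it reports "no change" (returns false) or the budget
  // is exhausted. Returns true only if a fixed point was reached in time.
  bool run(const std::function<bool()> &step) const {
    for (unsigned i = 0; i < maxIterations_; ++i)
      if (!step())
        return true;
    return false;
  }

  unsigned maxIterations() const { return maxIterations_; }

private:
  unsigned maxIterations_;
};
```

A caller that expects many dependent updates, as the commit describes for kernels, would pass a larger budget such as `FixpointDriver(64)` instead of relying on the default.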
Reid Kleckner [Tue, 22 Jun 2021 18:35:14 +0000 (11:35 -0700)]
Revert "[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC."
This reverts commit e1adf90826a57b674eee79b071fb46c1f5683cd0.
This appears to affect the way that C++ mangled symbols appear in the
import library when using a .def file that names a C++ free function
with no name decoration. I will follow up with a reduced test case
shortly.
Fangrui Song [Tue, 22 Jun 2021 18:20:49 +0000 (11:20 -0700)]
Improve clang -Wframe-larger-than= diagnostic
Match the style in D104667.
This commit is for non-LTO diagnostics, while D104667 is for LTO and llc diagnostics.
Sanjay Patel [Tue, 22 Jun 2021 17:42:44 +0000 (13:42 -0400)]
[InstCombine] reduce code duplication for FP min/max with casts fold; NFC
Sanjay Patel [Tue, 22 Jun 2021 17:42:01 +0000 (13:42 -0400)]
[InstCombine][test] add tests for FP min/max with negated op; NFC
Sanjay Patel [Tue, 22 Jun 2021 17:37:14 +0000 (13:37 -0400)]
[InstCombine][test] add tests for FP min/max with negated op; NFC
Joseph Huber [Mon, 7 Jun 2021 18:31:40 +0000 (14:31 -0400)]
[Attributor] Add interface to emit remarks in Attributor
Summary:
This patch adds support for the Attributor to emit remarks on behalf of some
other pass. The attributor can now optionally take a callback function that
returns an OptimizationRemarkEmitter object when given a Function pointer. If
this is available then a remark will be emitted for the corresponding pass
name.
Depends on D102197
Reviewed By: sstefan1 thegameg
Differential Revision: https://reviews.llvm.org/D102444
David Green [Tue, 22 Jun 2021 18:11:39 +0000 (19:11 +0100)]
[ARM] Change some Gather/Scatter interface types to Instructions. NFC
These returned Values are already cast to an Instruction; this just
cleans up the interface a little to match the expected types.
Raphael Isemann [Tue, 22 Jun 2021 17:49:09 +0000 (19:49 +0200)]
[lldb] Add missing string include to lldb-server's main
Louis Dionne [Tue, 22 Jun 2021 17:47:32 +0000 (13:47 -0400)]
[libc++] NFC: Add missing all.h to the modulemap
Matt Arsenault [Tue, 15 Jun 2021 22:51:06 +0000 (18:51 -0400)]
AMDGPU: Try to eliminate clearing of high bits of 16-bit instructions
These used to consistently be zeroed pre-gfx9, but gfx9 made the
situation complicated since now some still do and some don't. This
also manages to pick up a few cases that the pattern fails to optimize
away.
We handle some cases with instruction patterns, but some get
through. In particular this improves the integer cases.
Arthur O'Dwyer [Tue, 15 Jun 2021 16:57:54 +0000 (12:57 -0400)]
[libc++] Enable `explicit` conversion operators, even in C++03 mode.
C++03 didn't support `explicit` conversion operators;
but Clang's C++03 mode does, as an extension, so we can use it.
This lets us make the conversion explicit in `std::function` (even in '03),
and remove some silly metaprogramming in `std::basic_ios`.
Drive-by improvements to the tests for these operators, in addition
to making sure all these tests also run in `c++03` mode.
Differential Revision: https://reviews.llvm.org/D104682
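To illustrate what an `explicit` conversion operator changes for callers, here is a small standalone sketch. `Handle` and `describe` are hypothetical names, not libc++ code, and the example is written against C++11 or later, where the feature is standard rather than a Clang extension.

```cpp
#include <type_traits>

// Hypothetical type with a boolean state, loosely in the spirit of how
// std::function or std::basic_ios report validity.
class Handle {
public:
  explicit Handle(bool valid) : valid_(valid) {}

  // `explicit` blocks implicit conversions such as `int n = h;` while still
  // allowing contextual conversions like `if (h)` and `h ? a : b`.
  explicit operator bool() const { return valid_; }

private:
  bool valid_;
};

// The condition of ?: is contextually converted to bool, so this compiles
// even though the operator is explicit.
inline int describe(const Handle &h) { return h ? 1 : 0; }

static_assert(!std::is_convertible<Handle, bool>::value,
              "no implicit conversion to bool");
static_assert(std::is_constructible<bool, Handle>::value,
              "explicit conversion is still available");
```

Without `explicit`, expressions such as `h + 1` would compile silently, which is the kind of accident the older metaprogramming workarounds were meant to prevent.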
Matt Arsenault [Wed, 16 Jun 2021 22:55:51 +0000 (18:55 -0400)]
AMDGPU: Add baseline test for instructions zeroing high bits
Joseph Huber [Mon, 7 Jun 2021 17:24:30 +0000 (13:24 -0400)]
[OpenMP] Enable HeapToStack conversion in OpenMPOpt for new RTL globalization calls
Summary:
The changes to globalization introduced in D97680 introduce a large amount of overhead by default. The old globalization method would always ignore globalization code if executing in SPMD mode. This wasn't strictly correct as data sharing is still possible in SPMD mode. The new interface is correct but introduces globalization code even when unnecessary. This optimization will use the existing HeapToStack transformation in the attributor to allow for unneeded globalization to be replaced with thread-private stack memory. This is done using the newly introduced library instances for the RTL functions added in D102087.
Depends on D97818
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102197
Joseph Huber [Mon, 7 Jun 2021 17:11:42 +0000 (13:11 -0400)]
[OpenMP] Add new OpenMP globalization functions to library info
Summary:
The changes to globalization introduced in D97680 created two new functions to
push / pop shareable memory on the GPU, __kmpc_alloc_shared and
__kmpc_free_shared. This patch adds these new runtime functions to the
library info so they can be used by the HeapToStack attributor interface. This
optimization replaces malloc / free pairs with stack memory if legal.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D102087
Patrick Holland [Mon, 21 Jun 2021 02:12:00 +0000 (19:12 -0700)]
[MCA] [In-order pipeline] Fix for 0 latency instruction causing assertion to fail.
0 latency instructions now get processed and retired properly within the in-order pipeline. Had to fix a bug within TimelineView.cpp as well that would show up when a 0 latency instruction was the first instruction in the source.
Differential Revision: https://reviews.llvm.org/D104675
Matt Arsenault [Tue, 15 Jun 2021 21:56:54 +0000 (17:56 -0400)]
AMDGPU: Fix high 16-bit optimization on gfx9
We can do this optimization in the majority of cases, but we currently
don't have a way to do it. We do not track/model which instructions
have which behavior, the control bit to change the high bit behavior,
or making use of preserved bits at all. This is a bit fuzzy since we
don't know precisely how the source instruction will be lowered, but
that only really matters in one case (for fma_mixlo).
We do need to fixup some of these cases after selection, but the
pattern helps eliminate many of these zexts.
LLVM GN Syncbot [Tue, 22 Jun 2021 17:03:46 +0000 (17:03 +0000)]
[gn build] Port 40d6d2c49dd1
Sami Tolvanen [Tue, 22 Jun 2021 16:28:31 +0000 (09:28 -0700)]
ThinLTO: Fix inline assembly references to static functions with CFI
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D104058
zhijian [Tue, 22 Jun 2021 17:01:31 +0000 (13:01 -0400)]
[AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table.
Summary:
generate eh_info when vector registers are saved according to the traceback table.
struct eh_info_t {
  unsigned version;           /* EH info version 0 */
#if defined(64BIT)
  char _pad[4];               /* padding */
#endif
  unsigned long lsda;         /* Pointer to Language Specific Data Area */
  unsigned long personality;  /* Pointer to the personality routine */
};
The values of lsda and personality are zero when the number of saved vector registers is greater than zero and the function has no personality routine.
Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/D103651
Stanislav Mekhanoshin [Tue, 15 Jun 2021 21:51:59 +0000 (14:51 -0700)]
[AMDGPU] Use performOptimizedStructLayout for LDS sort
This gives better packing.
Differential Revision: https://reviews.llvm.org/D104331
Fangrui Song [Tue, 22 Jun 2021 16:55:20 +0000 (09:55 -0700)]
Improve the diagnostic of DiagnosticInfoResourceLimit (and warn-stack-size in particular)
Before: `warning: stack size limit exceeded (888) in main`
After: `warning: stack frame size (888) exceeds limit (100) in function 'main'` (the -Wframe-larger-than limit will be mentioned)
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D104667
zoecarver [Tue, 18 May 2021 23:18:18 +0000 (16:18 -0700)]
[libcxx][ranges] Add `ranges::iter_swap`.
Differential Revision: https://reviews.llvm.org/D102809
Nico Weber [Tue, 22 Jun 2021 16:50:45 +0000 (12:50 -0400)]
[gn build] manually port c747b7d1d9a2 (config.osx_sysroot)
Matt Arsenault [Tue, 15 Jun 2021 21:12:02 +0000 (17:12 -0400)]
AMDGPU: Move zeroed FP high bits optimization to patterns
Hyundeok Park [Tue, 22 Jun 2021 16:37:51 +0000 (12:37 -0400)]
[libc++] Change forward_list::swap to use propagate_on_container_swap for noexcept specification
The current implementation of `std::forward_list::swap` uses
`propagate_on_container_move_assignment` for `noexcept` specification.
This patch changes it to use `propagate_on_container_swap`, as it should.
Fixes https://llvm.org/PR50224.
Differential Revision: https://reviews.llvm.org/D101899
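The difference between the two allocator traits can be seen in a reduced sketch. Everything here is hypothetical (`TinyList`, `ThrowingSwapAlloc`, `swapIsNoexcept`), and the exact `noexcept` condition is an assumption chosen for illustration: swap is treated as non-throwing when the allocator is not propagated on swap, or when swapping the allocator itself cannot throw. It requires C++17 for `std::is_nothrow_swappable`.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical allocator that propagates on swap (but not on move
// assignment) and whose own swap is allowed to throw.
template <class T>
struct ThrowingSwapAlloc {
  using value_type = T;
  using propagate_on_container_swap = std::true_type;
  using propagate_on_container_move_assignment = std::false_type;

  ThrowingSwapAlloc() = default;
  template <class U>
  ThrowingSwapAlloc(const ThrowingSwapAlloc<U> &) {}

  T *allocate(std::size_t n) {
    return static_cast<T *>(::operator new(n * sizeof(T)));
  }
  void deallocate(T *p, std::size_t) { ::operator delete(p); }

  friend void swap(ThrowingSwapAlloc &, ThrowingSwapAlloc &) noexcept(false) {}
};

// Hypothetical minimal container; only swap's exception specification matters.
template <class T, class Alloc = std::allocator<T>>
class TinyList {
  using Traits = std::allocator_traits<Alloc>;

public:
  // Keyed on propagate_on_container_swap, not on
  // propagate_on_container_move_assignment.
  void swap(TinyList &other) noexcept(
      !Traits::propagate_on_container_swap::value ||
      std::is_nothrow_swappable<Alloc>::value) {
    using std::swap;
    swap(head_, other.head_);
    if (Traits::propagate_on_container_swap::value)
      swap(alloc_, other.alloc_);
  }

private:
  T *head_ = nullptr;
  Alloc alloc_;
};

// Reports whether C::swap is declared non-throwing.
template <class C>
constexpr bool swapIsNoexcept() {
  return noexcept(std::declval<C &>().swap(std::declval<C &>()));
}

using DefaultList = TinyList<int>;
using ThrowingList = TinyList<int, ThrowingSwapAlloc<int>>;
```

With `ThrowingSwapAlloc`, a condition keyed on `propagate_on_container_move_assignment` would wrongly report `noexcept`, because that trait is false here while the swap trait is true.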
Alexandru Octavian Butiu [Tue, 22 Jun 2021 16:33:30 +0000 (18:33 +0200)]
[clang][c++20] Fix false warning for unused private fields when a class has only defaulted comparison operators.
Fixes bug 50263
When the "unused-private-field" flag is on and you have a struct with private
members and only defaulted comparison operators, clang will warn about
unused private fields.
If you were to write the comparison operators by hand, no warning would be
produced.
This is a bug, since defaulting a comparison operator uses all private
members.
The fix is simple, in CheckExplicitlyDefaultedFunction just clear the
list of unused private fields if the defaulted function is a comparison
function.
Differential revision: https://reviews.llvm.org/D102186
Shilei Tian [Tue, 22 Jun 2021 16:38:33 +0000 (12:38 -0400)]
[NFC][OpenMP][Offloading] Unified the construction of mapping table entry
This patch unifies the construction of mapping table entries to use `emplace`.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D104580
Joseph Huber [Thu, 20 May 2021 02:57:24 +0000 (22:57 -0400)]
[OpenMP] Internalize functions in OpenMPOpt to improve IPO passes
Summary:
Currently the attributor needs to give up if a function has external linkage.
This means that the optimization introduced in D97818 will only apply to static
functions. This change uses the Attributor to internalize OpenMP device
routines by making a copy of each function with private linkage and replacing
the uses in the module with it. This allows for the optimization to be applied
to any regular function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102824
Steven Wu [Tue, 22 Jun 2021 16:21:12 +0000 (09:21 -0700)]
[llvm] Fix lto tests that require ld64
Since Xcode 13, ld64 requires linking libSystem for all executables.
Fix the tests that need to run ld64 by linking libSystem from the sysroot.
rdar://77332728
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D104332
Fangrui Song [Tue, 22 Jun 2021 16:19:48 +0000 (09:19 -0700)]
[llvm-objcopy] Fix some namespace style issues
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104693
Bill Wendling [Tue, 22 Jun 2021 00:33:53 +0000 (17:33 -0700)]
[llvm-diff] Constify APIs so that there aren't conflicts
Some APIs work with const variables while others don't. This can cause
conflicts when calling one from the other.
This is NFC.
Differential Revision: https://reviews.llvm.org/D104719
Joseph Huber [Mon, 22 Mar 2021 20:35:55 +0000 (16:35 -0400)]
[OpenMP] Replace GPU globalization calls with shared memory in the middle-end
Summary:
The changes introduced in D97680 create a simpler interface to code that needs
to be globalized. This interface is used to simplify the globalization calls in
the middle end. We can check any globalization call that is only called by a
single thread in the team and replace it with a static shared memory buffer.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97818
Joseph Huber [Fri, 18 Jun 2021 19:48:01 +0000 (15:48 -0400)]
[Libomptarget] Improve device runtime implementation for globalized variables.
Currently the runtime implementation of `__kmpc_alloc_shared` is extremely slow because it allocates memory for each thread individually. This patch adds a small buffer for the threads to share data and will greatly improve performance for builds where all globalization could not be optimized out. If the shared buffer is full, then memory will be allocated per-warp rather than per-thread.
Depends on D97680
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104666
Nikita Popov [Mon, 21 Jun 2021 20:32:56 +0000 (22:32 +0200)]
[OpaquePtr] Handle addrspacecasts in InstCombine
This adds support for addrspace casts involving opaque pointers to
InstCombine, as well as the isEliminableCastPair() helper
(otherwise the assertion failure would just move there).
Add PointerType::hasSameElementTypeAs() to hide the element type
details.
Differential Revision: https://reviews.llvm.org/D104668
Meera Nakrani [Mon, 14 Jun 2021 11:39:07 +0000 (11:39 +0000)]
[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier
This is a recommit that fixes unwanted STP generation by checking that
the base register has not been modified or used elsewhere.
Our initial motivating case was memcpy's with alignments > 16. The
loads/stores, to which small memcpy's expand, are kept together in
several places so that we get a sequence like this for a 64 bit copy:
LD w0
LD w1
ST w0
ST w1
The load/store optimiser can generate a LDP/STP w0, w1 from this because
the registers read/written are consecutive. In our case however, the
sequence is optimised during ISel, resulting in:
LD w0
ST w0
LD w0
ST w0
This instruction reordering allows reuse of registers. Since the registers
are no longer consecutive (i.e. they are the same), it inhibits LDP/STP
creation. The approach here is to perform renaming:
LD w0
ST w0
LD w1
ST w1
to enable the folding of the stores into a STP. We do not yet generate
the LDP due to a limitation in the renaming implementation, but plan to
look at that in a follow-up so that we fully support this case. While
this was initially motivated by certain memcpy's, this is a general
approach and thus is beneficial for other cases too, as can be seen
in some test changes.
Differential Revision: https://reviews.llvm.org/D103597
David Spickett [Mon, 21 Jun 2021 15:10:46 +0000 (15:10 +0000)]
[lldb] Remove more redundant SetStatus(eReturnStatusFailed)
Mostly by converting uses of GetErrorStream to AppendError,
so that the call to SetStatus is implicit.
Some remain where it isn't certain that you'll have a message
to set, or you want the output to be on stdout.
One place in CommandObjectWatchpoint previously didn't set
the status to failed at all. However it's pretty obvious
that it should do so.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104697
Jingu Kang [Mon, 7 Jun 2021 11:55:30 +0000 (12:55 +0100)]
[SimpleLoopUnswitch] Fix a bug on ComputeUnswitchedCost with partial unswitch
There was a bug in the cost calculation for partially invariant unswitch.
The costs of non-duplicated blocks are subtracted from the total LoopCost, so
anything that is duplicated should not be counted.
Differential Revision: https://reviews.llvm.org/D103816
Florian Hahn [Tue, 22 Jun 2021 14:35:23 +0000 (15:35 +0100)]
[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC).
Hoist out common recurrence check and sink updating the map, to reduce
the code required to support additional predicates.
Joseph Huber [Mon, 22 Mar 2021 20:34:11 +0000 (16:34 -0400)]
[OpenMP] Simplify GPU memory globalization
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and `__kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.
This patch replaces this functionality with a single allocation command, effectively mimicking an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D97680
Raphael Isemann [Tue, 22 Jun 2021 14:45:48 +0000 (16:45 +0200)]
[lldb][NFC] Remove an outdated comment in HostInfoBase
We should *never* use static local variables in this file as this makes
unittesting the plugin code impossible (and this whole 'testing' thing has
turned out to be rather useful so far).
Rosie Sumpter [Tue, 22 Jun 2021 12:18:26 +0000 (13:18 +0100)]
[SLP][AArch64] Add SLP vectorizer tests for XOR and AND reductions. NFC
These regression tests show missed SLP vectorization opportunities,
which will be fixed in a future commit (see:
https://reviews.llvm.org/D104538).
Differential Revision: https://reviews.llvm.org/D104708
Graham Hunter [Tue, 22 Jun 2021 14:06:42 +0000 (15:06 +0100)]
[clang] Remove unused capture in closure
c6a91ee6aaaa removed uses of IsMonotonic from OpenMP SIMD codegen,
but that left a capture of the variable unused which upset buildbots
using -Werror.
Joseph Huber [Tue, 15 Jun 2021 21:09:50 +0000 (17:09 -0400)]
[Libomptarget] Introduce new globalization runtime calls
Summary:
This patch introduces the new globalization runtime to be used by D97680. These
runtime calls will replace the __kmpc_data_sharing_push_stack and
__kmpc_data_sharing_pop_stack functions.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D102532
Florian Hahn [Tue, 22 Jun 2021 13:48:45 +0000 (14:48 +0100)]
[BitcodeReader] Validate Strtab before accessing.
This fixes a crash with invalid bitcode files that have records
referencing names in Strtab, but Strtab is not present or the index is
out-of-bounds.
This fixes the following clusterfuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=29895
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D95554
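The kind of check described above, validating an (offset, size) reference into a string table before reading it, might look like the following standalone sketch. `lookupName` is a hypothetical helper using `std::string_view`, not the BitcodeReader's actual code; an absent table is modeled as an empty view.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: returns the name stored at [offset, offset + size) in
// strtab, or std::nullopt if the table is missing or the range is out of
// bounds. The comparison is arranged so that offset + size is never computed
// and therefore cannot overflow.
inline std::optional<std::string_view>
lookupName(std::string_view strtab, std::size_t offset, std::size_t size) {
  if (offset > strtab.size() || size > strtab.size() - offset)
    return std::nullopt;
  return strtab.substr(offset, size);
}
```

A reader built this way can report a malformed-file error when `lookupName` returns `std::nullopt` instead of crashing on the out-of-range access.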
Nikita Popov [Tue, 22 Jun 2021 10:00:42 +0000 (12:00 +0200)]
[ConstantFold] Delay fetching pointer element type
Don't do this while stripping pointer casts, instead fetch it at
the end. This improves compatibility with opaque pointers for the
case where the base object is not opaque.
Nikita Popov [Tue, 22 Jun 2021 10:14:31 +0000 (12:14 +0200)]
[ConstantFold] Skip bitcast -> GEP transform for opaque pointers
Same as with the InstCombine transform, this is not possible for
bitcasts involving opaque pointers, as GEP preserves opaqueness.
AndreyChurbanov [Tue, 22 Jun 2021 13:29:01 +0000 (16:29 +0300)]
[OpenMP] libomp: fix dynamic loop dispatcher
Restructured dynamic loop dispatcher code.
Fixed use of dispatch buffers for nonmonotonic dynamic (static_steal) schedule:
- eliminated the possibility of stealing iterations of the wrong loop when the
victim thread changed its buffer to work on another loop;
- fixed a race when the victim thread changed its buffer to work in a nested
parallel region;
- eliminated the "static" property of the schedule, that is, a single thread
can now execute the whole loop.
Differential Revision: https://reviews.llvm.org/D103648
Butygin [Wed, 19 May 2021 19:04:29 +0000 (22:04 +0300)]
[mlir] Fix invalid handling of AllocOp symbolOperands by SimplifyAllocConst.
symbolOperands were completely ignored by SimplifyAllocConst. Also, slightly improved diagnostic message for verifyAllocLikeOp.
Differential Revision: https://reviews.llvm.org/D104260
Pushpinder Singh [Tue, 22 Jun 2021 06:24:19 +0000 (06:24 +0000)]
[AMDGPU][Libomptarget] Move allow_access_to_all_gpu_agents to rtl.cpp
Moving this method helps eliminate a use of g_atl_machine.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D104691
Raphael Isemann [Tue, 22 Jun 2021 11:40:33 +0000 (13:40 +0200)]
[lldb][NFC] Use SubsystemRAII in XcodeSDKModuleTests
Thomas Johnson [Mon, 21 Jun 2021 17:41:51 +0000 (20:41 +0300)]
Add norm sub-target feature to table gen for ARC
This adds the `norm` sub-target feature (without backing implementation for now) to table gen.
Differential Revision: https://reviews.llvm.org/D104558
Muhammad Omair Javaid [Tue, 22 Jun 2021 11:19:48 +0000 (16:19 +0500)]
[LLDB] Skip TestExitDuringExpression on aarch64/linux buildbot
TestExitDuringExpression test_exit_before_one_thread_no_unwind fails
sporadically on both Arm and AArch64 linux buildbots. This seems to
manifest itself on a fully loaded machine. I have not found a reliable
timeout value, so I am marking it as skipped for now.
Stephan Herhut [Mon, 21 Jun 2021 17:33:28 +0000 (19:33 +0200)]
[mlir][memref] Add memref.copy operation
As the name suggests, it copies from one memref to another.
Differential Revision: https://reviews.llvm.org/D104657
Florian Hahn [Mon, 21 Jun 2021 18:11:11 +0000 (19:11 +0100)]
[SCEV] Retain AddExpr flags when subtracting a foldable constant.
Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2.
But we can retain flags under certain conditions:
* Adding a smaller constant is NUW if the original AddExpr was NUW.
* Adding a constant with the same sign and small magnitude is NSW, if the
original AddExpr was NSW.
This can improve results after using `SimplifyICmpOperands`, which may
subtract one in order to use stricter predicates, as is the case for
`isKnownPredicate`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D104319
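The two conditions listed above can be written down as small predicates. This is a sketch under simplifying assumptions, not SCEV's implementation: constants are plain 64-bit integers rather than arbitrary-width values, `oldC` stands for C1, `newC` stands for the folded constant C1 - C2, and the function names are hypothetical.

```cpp
#include <cstdint>

// NUW: if A + oldC does not wrap unsigned and newC <= oldC (unsigned), then
// A + newC <= A + oldC, so it cannot wrap either.
inline bool canKeepNUW(std::uint64_t oldC, std::uint64_t newC) {
  return newC <= oldC;
}

// NSW: if A + oldC does not wrap signed and newC has the same sign as oldC
// with no larger magnitude, then A + newC lies between A and A + oldC.
// Comparing directly per sign avoids computing an absolute value, which
// would overflow for the minimum 64-bit value.
inline bool canKeepNSW(std::int64_t oldC, std::int64_t newC) {
  const bool sameSign = (oldC < 0) == (newC < 0);
  if (!sameSign)
    return false;
  return oldC < 0 ? newC >= oldC : newC <= oldC;
}
```

For example, rewriting `(A + 5)<nsw> - 1` as `A + 4` can keep the flag, whereas `(A + 5)<nsw> - 6` yields `A + (-1)`, whose constant has a different sign, so the flag is dropped.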
Raphael Isemann [Tue, 22 Jun 2021 10:22:14 +0000 (12:22 +0200)]
[lldb] Adjust Clang version requirements for tail_call_frames tests
Those tests are all failing for older Clang versions. This is adding the
respective test decorators for the passing Clang versions to get the recently
revived matrix bot green.
Nico Weber [Tue, 22 Jun 2021 02:29:11 +0000 (22:29 -0400)]
[lld/mac] Add explicit "no unwind info" entries for functions without unwind info
Fixes PR50529. With this, lld-linked Chromium base_unittests passes on arm macs.
Surprisingly, no measurable impact on link time.
Differential Revision: https://reviews.llvm.org/D104681
Raphael Isemann [Tue, 22 Jun 2021 09:55:36 +0000 (11:55 +0200)]
[lldb] Bump Clang version requirement for TestBasicEntryValues.py to 11
The test only passes with Clang>=11 so adjust the decorator.
Failure output for Clang 10 is:
--- FileCheck trace (code=1) ---
FileCheck main.cpp -check-prefix=FUNC1-GNU
FileCheck input:
Address: a.out[0x0000000000401127] (a.out.PT_LOAD[1]..text + 263)
Summary: a.out`func1(int&) + 23 at main.cpp:25:1
Module: file = "functionalities/param_entry_vals/basic_entry_values/BasicEntryValues_GNU.test_dwo/a.out", arch = "x86_64"
CompileUnit: id = {0x00000000}, file = "functionalities/param_entry_vals/basic_entry_values/main.cpp", language = "c++11"
Function: id = {0x400000000000010a}, name = "func1(int&)", mangled = "_Z5func1Ri", range = [0x0000000000401110-0x0000000000401129)
FuncType: id = {0x400000000000010a}, byte-size = 0, decl = main.cpp:13, compiler_type = "void (int &)"
Blocks: id = {0x400000000000010a}, range = [0x00401110-0x00401129)
LineEntry: [0x0000000000401127-0x0000000000401130): functionalities/param_entry_vals/basic_entry_values/main.cpp:25:1
Symbol: id = {0x0000002c}, range = [0x0000000000401110-0x0000000000401129), name="func1(int&)", mangled="_Z5func1Ri"
FileCheck output:
functionalities/param_entry_vals/basic_entry_values/main.cpp:23:16: error: FUNC1-GNU: expected string not found in input
// FUNC1-GNU: name = "sink", type = "int &", location = DW_OP_GNU_entry_value
Martin Storsjö [Mon, 14 Jun 2021 11:43:46 +0000 (14:43 +0300)]
[ADT] Add StringRef consume_front_lower and consume_back_lower
These serve as a convenient combination of consume_front/back and
startswith_lower/endswith_lower, consistent with other existing
case insensitive methods named <operation>_lower.
Differential Revision: https://reviews.llvm.org/D104218
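The behavior described, stripping a prefix or suffix only when it matches case-insensitively, can be approximated outside LLVM with `std::string_view`. The free functions below (`equalsLower`, `consumeFrontLower`, `consumeBackLower`) are hypothetical stand-ins written for illustration and assume ASCII input; they are not the StringRef implementation.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Case-insensitive equality for ASCII text.
inline bool equalsLower(std::string_view a, std::string_view b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (std::tolower(static_cast<unsigned char>(a[i])) !=
        std::tolower(static_cast<unsigned char>(b[i])))
      return false;
  return true;
}

// If s starts with prefix (ignoring case), drop the prefix and return true;
// otherwise leave s untouched and return false.
inline bool consumeFrontLower(std::string_view &s, std::string_view prefix) {
  if (s.size() < prefix.size() ||
      !equalsLower(s.substr(0, prefix.size()), prefix))
    return false;
  s.remove_prefix(prefix.size());
  return true;
}

// Same as above, but for a suffix.
inline bool consumeBackLower(std::string_view &s, std::string_view suffix) {
  if (s.size() < suffix.size() ||
      !equalsLower(s.substr(s.size() - suffix.size()), suffix))
    return false;
  s.remove_suffix(suffix.size());
  return true;
}
```

Combining the test and the mutation in one call keeps callers from repeating the prefix length, which is the convenience the commit message refers to.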
Graham Hunter [Fri, 4 Jun 2021 11:10:37 +0000 (12:10 +0100)]
[Clang][OpenMP] Monotonic does not apply to SIMD
The codegen for simd constructs was affected by the presence (or
absence) of the 'monotonic' schedule modifier for worksharing
loops. The modifier is only intended to apply to the scheduling of
chunks for a thread, not iterations of a loop inside a chunk.
In addition, the monotonic modifier was applied to worksharing loops
by default if no schedule clause was present; the referenced part of
the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the
user specified a schedule clause with a static kind but no modifier.
Without a user-specified schedule clause we should default to
nonmonotonic scheduling.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103793
Nikita Popov [Tue, 22 Jun 2021 09:06:28 +0000 (11:06 +0200)]
[ConstantFolding] Separate conditions in GEP evaluation (NFC)
Handle the gep p, 0-v case separately, and not as part of the loop
that ensures all indices are constant integers. Those two things
are not really related.
Balázs Kéri [Tue, 22 Jun 2021 08:25:55 +0000 (10:25 +0200)]
[clang][Analyzer] Track null stream argument in alpha.unix.Stream.
The checker contains a check for passing a NULL stream argument.
This change should make it easier to identify where the passed pointer
becomes NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104640
Matthias Springer [Tue, 22 Jun 2021 07:49:08 +0000 (16:49 +0900)]
[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect
The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect.
* Rename SubTensorOp -> tensor.extract_slice, SubTensorInsertOp -> tensor.insert_slice.
* Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit.
* Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard
* Remove dialect dependencies: Standard --> Tensor
* Move canonicalization test cases to correct dialect (Tensor/MemRef).
Note: This is a fixed version of https://reviews.llvm.org/D104499, which was reverted due to a missing update to two CMakeLists.txt files.
Differential Revision: https://reviews.llvm.org/D104676
Fraser Cormack [Fri, 18 Jun 2021 15:30:19 +0000 (16:30 +0100)]
[Utils][vim] Add missing highlights for fast-math flags
This patch adds the `afn`, `contract`, and `reassoc` fast-math flags.
It also fixes up `fneg`'s order in the alphabetized list.
Reviewed By: MaskRay, craig.topper
Differential Revision: https://reviews.llvm.org/D104541
Sander de Smalen [Tue, 22 Jun 2021 07:05:55 +0000 (08:05 +0100)]
[GlobalISel] Add scalable property to LLT types.
This patch aims to add the scalable property to LLT. The rest of the
patch-series changes the interfaces to take/return ElementCount and
TypeSize, which both have the ability to represent the scalable property.
The changes are mostly mechanical and aim to be non-functional changes
for fixed-width vectors.
For scalable vectors some unit tests have been added, but no effort has
been put into making any of the GlobalISel algorithms work with scalable
vectors yet. That will be left as future work.
The work is split into a series of 5 patches to make reviews easier.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D104450
Bjorn Pettersson [Mon, 21 Jun 2021 09:22:14 +0000 (11:22 +0200)]
[NewPM] Print passes with params when using "opt -print-passes"
Make sure we also print passes with params when using "opt -print-passes".
Differential Revision: https://reviews.llvm.org/D104625
Fangrui Song [Tue, 22 Jun 2021 06:49:25 +0000 (23:49 -0700)]
[llvm-objcopy] Internalize some symbols
Tobias Gysi [Tue, 22 Jun 2021 06:27:28 +0000 (06:27 +0000)]
[mlir][linalg] Adapt FillOp to use a scalar operand.
Adapt the FillOp definition to use a scalar operand instead of a capture. This patch is a follow-up to https://reviews.llvm.org/D104109. As the input operands come before the output operands, the patch changes the internal operand order of the FillOp. The pretty-printed version of the operation remains unchanged, though. The patch also adapts the linalg-to-standard lowering to ensure the C signature of the FillOp remains unchanged as well.
Differential Revision: https://reviews.llvm.org/D104121
Fangrui Song [Tue, 22 Jun 2021 06:44:07 +0000 (23:44 -0700)]
[llvm-objcopy] Delete empty namespace. NFC
Max Kazantsev [Tue, 22 Jun 2021 05:21:38 +0000 (12:21 +0700)]
Re-land "[LoopDeletion] Handle Phis with similar inputs from different blocks"
Patch was reverted due to a bug that existed before it and was exposed
by it. Returning after the underlying bug has been fixed.
Differential Revision: https://reviews.llvm.org/D103959
Max Kazantsev [Tue, 22 Jun 2021 05:10:58 +0000 (12:10 +0700)]
[LoopDeletion] Require loop to have a predecessor when executing 1st iteration symbolically
Having two predecessors breaks the subsequent logic, and the loop may
reach this optimization in a non-canonicalized state.
Heejin Ahn [Fri, 18 Jun 2021 21:41:01 +0000 (14:41 -0700)]
[WebAssembly] Make tag attribute's encoding uint8
This changes the encoding of the `attribute` field, which currently only
contains the value `0` denoting this tag is for an exception, from
`varuint32` to `uint8`. This field is effectively unused at the moment
and reserved for future use, and it is not likely to need `varuint32`
even in the future.
See https://github.com/WebAssembly/exception-handling/pull/162.
This does not change any encoded binaries because `0` is encoded in the
same way both in `varuint32` and `uint8`.
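As a minimal sketch of why the binary stays the same: `varuint32` is
unsigned LEB128, which stores 7 payload bits per byte and sets the high
bit when more bytes follow, so any value below 128 occupies the same
single byte as a plain `uint8`. The helper names below are illustrative
only and are not taken from the LLVM sources.

```python
def encode_varuint32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as unsigned LEB128."""
    if not 0 <= value < 2**32:
        raise ValueError("value out of range for varuint32")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def encode_uint8(value: int) -> bytes:
    """Encode an unsigned 8-bit integer as a single raw byte."""
    if not 0 <= value < 2**8:
        raise ValueError("value out of range for uint8")
    return bytes([value])


# The only attribute value in use today is 0, which is one zero byte
# under both encodings.
print(encode_varuint32(0) == encode_uint8(0))
```

The two encodings only diverge from 128 upwards, where LEB128 needs a
second byte.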
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104571
Walter Erquinigo [Fri, 30 Apr 2021 04:27:12 +0000 (21:27 -0700)]
Retry of [lldb-vscode] only report long running progress events
This time adding a check that should prevent the crash found in
https://lab.llvm.org/buildbot/#/builders/68/builds/14182/steps/6/logs/stdio
Differential Revision: https://reviews.llvm.org/D101128
Matthias Springer [Tue, 22 Jun 2021 02:37:37 +0000 (11:37 +0900)]
[mlir][linalg] Fusion of PadTensorOp
Note: This commit (and previous ones) implements the same functionality as https://reviews.llvm.org/D103243 (which is abandoned).
Differential Revision: https://reviews.llvm.org/D104683
Walter Erquinigo [Tue, 22 Jun 2021 02:42:34 +0000 (19:42 -0700)]
Revert "[lldb-vscode] only report long running progress events"
This reverts commit 610d474cfd82f11dc4702e2cf1b2485584d7c243.
lldb-vscode is crashing.
Walter Erquinigo [Tue, 22 Jun 2021 02:33:56 +0000 (19:33 -0700)]
[lldb-vscode] Add simple DAP logs dump to investigate flakiness in tests
A few times tests have been flaky, presumably because lldb-vscode
itself crashed. Such crashes can be caught by looking at the DAP logs,
so I'm dumping them when the session ends.
Walter Erquinigo [Fri, 30 Apr 2021 04:27:12 +0000 (21:27 -0700)]
[lldb-vscode] only report long running progress events
When the number of shared libs is massive, there could be hundreds of
thousands of short-lived progress events sent to the IDE, which makes it
unresponsive while it's processing all this data. As these small jobs
take less than a second to process, the user doesn't even see them,
because the IDE only displays the progress of long operations. So it's
better not to send these events.
I'm fixing that by sending only the events that are taking longer than 5
seconds to process.
In a specific run, the number of events dropped from ~500k to 100,
because there was only 1 big lib to parse.
I've tried this on several small and massive targets, and it seems to
work fine.
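The filtering idea can be sketched roughly as follows. This is a
simplified model, not the lldb-vscode implementation: the class and
method names are hypothetical, and timestamps are passed in explicitly
so the behaviour is easy to follow.

```python
class ProgressFilter:
    """Forward progress updates only for operations that run long enough."""

    def __init__(self, threshold_seconds: float = 5.0):
        # 5 seconds is the threshold mentioned in the commit message.
        self.threshold = threshold_seconds
        self._started = {}  # progress id -> time of first update

    def should_report(self, progress_id: int, now: float) -> bool:
        """Return True if this update should be sent to the IDE."""
        start = self._started.setdefault(progress_id, now)
        return now - start > self.threshold

    def finish(self, progress_id: int) -> None:
        """Forget an operation once it has completed."""
        self._started.pop(progress_id, None)


f = ProgressFilter()
f.should_report(1, now=0.0)         # short job starts: suppressed
f.should_report(1, now=0.5)         # still short: suppressed
f.finish(1)
f.should_report(2, now=0.0)         # long job starts: suppressed at first
print(f.should_report(2, now=6.0))  # past the threshold: forwarded
```

Short-lived jobs never cross the threshold, so they produce no traffic
at all; only the few long operations reach the IDE.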
Differential Revision: https://reviews.llvm.org/D101128
Eli Friedman [Mon, 21 Jun 2021 23:34:02 +0000 (16:34 -0700)]
Rename MachineMemOperand::getOrdering -> getSuccessOrdering.
Since this method can apply to cmpxchg operations, make sure it's clear
what value we're actually retrieving. This will help ensure we don't
accidentally ignore the failure ordering of cmpxchg in the future.
We could potentially introduce a getOrdering() method on AtomicSDNode
that asserts the operation isn't cmpxchg, but not sure that's
worthwhile.
Differential Revision: https://reviews.llvm.org/D103338
Vitaly Buka [Sat, 19 Jun 2021 01:34:07 +0000 (18:34 -0700)]
[NFC] Add getUnderlyingObjects test
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D104585
Eli Friedman [Mon, 21 Jun 2021 23:24:16 +0000 (16:24 -0700)]
[ScalarEvolution] Ensure backedge-taken counts are not pointers.
A backedge-taken count doesn't refer to memory; returning a pointer type
is nonsense. So make sure we always return an integer.
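To illustrate why the count is naturally an integer, here is a small
sketch for a loop of the shape `do { ...; p += stride; } while (p != end);`.
It models addresses as plain integers (the role ptrtoint plays); the
function name is made up for this example and is not a SCEV API.

```python
def backedge_taken_count(begin_addr: int, end_addr: int, stride: int) -> int:
    """Number of times the backedge is taken for a pointer-walking loop."""
    distance = end_addr - begin_addr  # difference of two addresses: an integer
    if stride <= 0 or distance <= 0 or distance % stride != 0:
        raise ValueError("loop does not terminate cleanly in this toy model")
    iterations = distance // stride
    return iterations - 1             # the last iteration exits instead


# Walking four 4-byte elements: four iterations, three backedges.
print(backedge_taken_count(0x1000, 0x1010, 4))
```

The result is a count of iterations and has no base object, which is
why a pointer type makes no sense for it.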
The obvious way to do this would be to just convert the operands of the
icmp to integers, but that doesn't quite work out at the moment:
isLoopEntryGuardedByCond currently gets confused by ptrtoint operations.
So we perform the ptrtoint conversion late for lt/gt operations.
The test changes are mostly innocuous. The most interesting changes are
more complex SCEV expressions of the form "((-1 * (ptrtoint i8* %ptr to
i64)) + %ptr)". This is expected: we can't fold this to zero because we
need to preserve the pointer base.
The call to isLoopEntryGuardedByCond in howFarToZero is less precise
because of ptrtoint operations; this shows up in the function
pr46786_c26_char in ptrtoint.ll. Fixing it here would require more
complex refactoring. It should eventually be fixed by future
improvements to isImpliedCond.
See https://bugs.llvm.org/show_bug.cgi?id=46786 for context.
Differential Revision: https://reviews.llvm.org/D103656
Rob Suderman [Mon, 21 Jun 2021 23:09:04 +0000 (16:09 -0700)]
[mlir][tosa] Enable tosa.div for TosaMakeBroadcastable
TosaMakeBroadcastable needs to include tosa.div, which was added later in the
specification.
Reviewed By: sjarus, NatashaKnk
Differential Revision: https://reviews.llvm.org/D104157
Greg Clayton [Fri, 18 Jun 2021 23:19:04 +0000 (16:19 -0700)]
Clarify the "env" launch configuration setting.
A few users recently tried to set environment values when using lldb-vscode and were unsure of the format of the "env" launch configuration setting. Clarify the exact format in the help string, since users see this string in the IDE when they add the "env" launch config setting.
Differential Revision: https://reviews.llvm.org/D104578
Nick Desaulniers [Mon, 21 Jun 2021 22:09:23 +0000 (15:09 -0700)]
[IR] convert warn-stack-size from module flag to fn attr
Otherwise, this causes issues when building with LTO for object files
that use different values.
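A toy model of the problem, assuming nothing about the real LLVM linker
beyond what is stated above: a module-level flag holds one value per
module, so merging modules built with different thresholds has no single
right answer, whereas a per-function attribute travels with each
function. All names and values here are hypothetical.

```python
def merge_module_flag(value_a: int, value_b: int) -> int:
    """Merge one module-wide setting; differing values cannot be reconciled."""
    if value_a != value_b:
        raise ValueError(f"conflicting values: {value_a} vs {value_b}")
    return value_a


def merge_function_attrs(module_a: dict, module_b: dict) -> dict:
    """Merge per-function settings; each function keeps its own value."""
    merged = dict(module_a)
    merged.update(module_b)
    return merged


# Two object files compiled with different stack-size warning thresholds.
print(merge_function_attrs({"foo": 1024}, {"bar": 2048}))
```

With the per-function form, `foo` and `bar` keep their own thresholds
after the merge; the module-level form would have to reject or
arbitrarily pick between 1024 and 2048.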
Link: https://github.com/ClangBuiltLinux/linux/issues/1395
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D104342
Andrew Browne [Fri, 18 Jun 2021 19:21:29 +0000 (12:21 -0700)]
[DFSan][NFC] Refactor Origin Address Alignment code.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104565
Rong Xu [Mon, 21 Jun 2021 21:11:31 +0000 (14:11 -0700)]
[SampleFDO] Make FSDiscriminator flag part of function parameters
Add an IsFSDiscriminator parameter to the function
getBaseDiscriminatorFromDiscriminator().
This function currently checks the internal flag
--enable-fs-discriminator. This is not good because we might
change the default value of the internal flag.
Note that the new parameter has a default value. This is only
because create_afdo_tool has a call site for this function.
I will remove the default value in a later patch.
Differential Revision: https://reviews.llvm.org/D104584
Eli Friedman [Mon, 21 Jun 2021 21:24:31 +0000 (14:24 -0700)]
[ARM] Make sure we don't transform unaligned store to stm on Thumb1.
This isn't likely to come up in practice; the combination of compiler
flags required to hit this issue should be rare. Found by inspection.
Fangrui Song [Mon, 21 Jun 2021 21:32:25 +0000 (14:32 -0700)]
[AArch64][X86] Allow 64-bit label differences to lower to IMAGE_REL_*_REL32
`IMAGE_REL_ARM64_REL64/IMAGE_REL_AMD64_REL64` do not exist and `.quad a - .` is
currently not representable.
For instrumentation, `.quad a - .` is useful for representing a cross-section
reference in a metadata section, to allow ELF medium/large code models. The COFF
limitation makes such generic instrumentation inconvenient. I plan to make a
PGO/coverage metadata section field relative in D104556.
Differential Revision: https://reviews.llvm.org/D104564