Valentin Clement [Fri, 28 Oct 2022 06:30:53 +0000 (08:30 +0200)]
[flang] Carry polymorphic dynamic type when using coordinate_of polymorphic array
The dynamic type of a polymorphic array element was retrieved by finding the
coordinate operation and using the base array. This patch removes this hack and uses
the new PolymorphicValue to carry the dynamic type together with the element.
The patch also rearranges some tests in `allocatable-polymorphic.f90`.
Depends on D136824
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D136857
Advenam Tacet [Fri, 28 Oct 2022 05:14:14 +0000 (22:14 -0700)]
[1b/3][ASan][compiler-rt] API for annotating objects memory
This revision is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in std::vector, to std::string and std::deque collections. These changes allow ASan to detect cases when the instrumented program accesses memory which is internally allocated by the collection but is still not in-use (accesses before or after the stored elements for std::deque, or between the size and capacity bounds for std::string).
The motivation for the research and those changes was a bug, found by Trail of Bits, in real code where an out-of-bounds read could happen when two strings were compared via a std::equal call that took iter1_begin, iter1_end, iter2_begin iterators (with a custom comparison function). When the iter1 range was longer than the iter2 range, an out-of-bounds read on iter2 could happen. Container sanitization would detect it.
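A minimal sketch of the kind of comparison described above, not the affected code itself. The function names and the case-insensitive predicate are hypothetical. The three-iterator `std::equal` overload assumes the second range is at least as long as the first, while the four-iterator overload (C++14) also takes the end of the second range.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical custom comparison: case-insensitive character equality.
inline bool sameCharIgnoringCase(char a, char b) {
  return std::tolower(static_cast<unsigned char>(a)) ==
         std::tolower(static_cast<unsigned char>(b));
}

// Risky form: reads past the end of `rhs` whenever `lhs` is longer.
// Only call this when lhs.size() <= rhs.size().
inline bool equal_unchecked(const std::string &lhs, const std::string &rhs) {
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(), sameCharIgnoringCase);
}

// Safer form: the four-iterator overload never reads beyond rhs.end()
// and returns false when the two ranges differ in length.
inline bool equal_checked(const std::string &lhs, const std::string &rhs) {
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                    sameCharIgnoringCase);
}
```

With `equal_checked("hello world", "hello")` the result is simply `false`; the unchecked form would read beyond the shorter string in that case.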
This revision extends the compiler-rt ASan sanitization API function __sanitizer_annotate_contiguous_container, used to sanitize/annotate containers like std::vector, to support different allocators and situations when granules are shared between objects. Those changes are necessary to support annotating an object's own memory (in contrast to annotating memory allocated by an object), as in a short std::basic_string (with short string optimization). That also allows use of non-standard memory allocators, as the alignment requirement is no longer necessary.
This also updates an API function to verify if a double ended contiguous container is correctly annotated (__sanitizer_verify_contiguous_container).
If you have any questions, please email:
advenam.tacet@trailofbits.com
disconnect3d@trailofbits.com
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D132522
Carlos Alberto Enciso [Fri, 28 Oct 2022 06:20:15 +0000 (07:20 +0100)]
[llvm-debuginfo-analyzer] (08/09) - ELF Reader - Disable test.
Disable the test case: 06-dwarf-full-logical-view.test
It produces incorrect data on ARM:
https://lab.llvm.org/buildbot/#/builders/182/builds/4232
https://lab.llvm.org/buildbot/#/builders/187/builds/9483
Expected:
189 (100.00%) : [0x000000000b][001] {CompileUnit}
110 ( 58.20%) : [0x000000002a][002] 2 {Function}
27 ( 14.29%) : [0x0000000071][003] {Block}
Generated:
3432 ( 0.00%) : [0x000000000b][001] {CompileUnit}
3351 ( 0.00%) : [0x000000002a][002] 2 {Function}
3234 ( 0.00%) : [0x0000000071][003] {Block}
Alexander Shaposhnikov [Fri, 28 Oct 2022 05:30:19 +0000 (05:30 +0000)]
[clang-tidy] Skip template ctors in modernize-use-equals-default
Skip template ctors in modernize-use-equals-default,
such constructors may be enabled/disabled via SFINAE,
it is not safe to make them "= default".
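A small sketch of the pattern this check now skips, assuming a default constructor template that is enabled via SFINAE. `Box` and `NoDefault` are hypothetical names. The empty body looks like a candidate for `= default`, but the template parameters are what switch the constructor on and off.

```cpp
#include <type_traits>

// Hypothetical type without a default constructor.
struct NoDefault {
  explicit NoDefault(int) {}
};

template <typename T>
struct Box {
  // Constructor template with an empty body. SFINAE removes it from
  // overload resolution when T is not default-constructible.
  template <typename U = T,
            typename = std::enable_if_t<std::is_default_constructible<U>::value>>
  Box() {}

  explicit Box(const T &v) : value(v) {}

  T value;
};

// The constructor is available for int but disabled for NoDefault.
static_assert(std::is_default_constructible<Box<int>>::value, "");
static_assert(!std::is_default_constructible<Box<NoDefault>>::value, "");
```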
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D136797
Wael Yehia [Thu, 27 Oct 2022 18:37:26 +0000 (14:37 -0400)]
[PGO] Simplify InstrProfilingRuntime.cpp
Differential Revision: https://reviews.llvm.org/D136192
Vitaly Buka [Fri, 28 Oct 2022 04:35:08 +0000 (21:35 -0700)]
[test] Disable the test with asan
There is a memory leak.
See comments in https://reviews.llvm.org/D125783
Ye Luo [Thu, 27 Oct 2022 19:01:18 +0000 (14:01 -0500)]
[DeviceRTL] Fix incremental build
Need both add_custom_command to resolve file-level dependency and add_custom_target to resolve target-level dependency.
From CMake add_custom_command doc:
Do not list the output in more than one independent target that may build in parallel or the two instances of the rule may conflict (instead use the add_custom_target() command to drive the command and make the other targets depend on that one).
${CMAKE_CURRENT_BINARY_DIR}/${bclib_name} is used by multiple targets and thus requires a custom target to avoid racing.
Differential Revision: https://reviews.llvm.org/D136911
LLVM GN Syncbot [Fri, 28 Oct 2022 02:40:48 +0000 (02:40 +0000)]
[gn build] Port 23f02693ec58
Freddy Ye [Fri, 28 Oct 2022 01:43:38 +0000 (09:43 +0800)]
[X86] Add AVX-VNNI-INT8 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D135938
Valery Pykhtin [Tue, 25 Oct 2022 18:07:13 +0000 (20:07 +0200)]
[AMDGPU] Refactor debug printing routines for GCNRPTracker
Use Printable to enhance syntax, remove duplication, unify.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D136704
LLVM GN Syncbot [Fri, 28 Oct 2022 01:53:34 +0000 (01:53 +0000)]
[gn build] Port 0e720e6adad1
Emilio Cota [Fri, 28 Oct 2022 01:40:33 +0000 (21:40 -0400)]
[mlir][spirv] fix Bazel build of Passes.h
I cannot repro this with CMake, but on Bazel this is failing with
```
error: incomplete result type 'mlir::spirv::TargetEnvAttr' in function definition
[...]
note: in instantiation of member function 'std::function<mlir::spirv::TargetEnvAttr (mlir::spirv::ModuleOp)>::function' requested here
createUnifyAliasedResourcePass(GetTargetEnvFn getTargetEnv = nullptr);
```
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D136909
Freddy Ye [Fri, 28 Oct 2022 01:11:29 +0000 (09:11 +0800)]
[X86] Add AVX-IFMA instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D135932
Carl Ritson [Fri, 28 Oct 2022 00:26:50 +0000 (09:26 +0900)]
[AMDGPU] Add pseudo wavemode to optimize strict_wqm
Strict WQM does not require a WQM transition if it occurs within
an existing WQM section.
This occurs heavily in GFX11 pixel shaders with LDS_PARAM_LOAD,
which leads to unnecessary EXEC mask manipulation.
To avoid these transitions, detect WQM -> Strict WQM -> WQM
and substitute new ENTER_PSEUDO_WM/EXIT_PSEUDO_WM markers instead.
These are treated similarly by the WWM register pre-allocation pass,
but do not manipulate EXEC or use registers to save EXEC state.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D136813
Katherine Rasmussen [Thu, 27 Oct 2022 00:15:08 +0000 (17:15 -0700)]
[flang] Add atomic_fetch_xor to list of intrinsics
Add the atomic subroutine, atomic_fetch_xor, to the list of
intrinsic subroutines, add its last dummy argument to a check
for coindexed-object, and update test.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D136804
Fangrui Song [Thu, 27 Oct 2022 23:58:22 +0000 (23:58 +0000)]
[mlir][arith] Fix -Wunused-but-set-variable
Jason Molenda [Thu, 27 Oct 2022 23:46:55 +0000 (16:46 -0700)]
Remove compile-time and runtime checks for SPI in HostInfoMacOSX
There are conditionalized calls to an SPI in HostInfoMacOSX.mm
to test if lldb is being built against a pre-macOS 10.12 SDK,
or being run on a pre-macOS 10.12 system. macOS 10.12 was released
six years ago, and I don't know of any active users of this system
so let's remove the checks.
Differential Revision: https://reviews.llvm.org/D136900
rdar://101652340
zhongyunde [Thu, 27 Oct 2022 23:52:24 +0000 (07:52 +0800)]
[AArch64] Optimize memcmp when the result is tested for [in]equality with 0
Fixes 1st issue of https://github.com/llvm/llvm-project/issues/58061
Reviewed By: dmgreen, efriedma
Differential Revision: https://reviews.llvm.org/D136244
Zequan Wu [Tue, 18 Oct 2022 22:58:07 +0000 (15:58 -0700)]
[LLDB][NativePDB] Fix parameter size for member functions LF_MFUNCTION
Fix the problem that member functions were treated as non-member functions
when trying to get the parameter size. This caused some non-parameter variables
to show up in the function signature. Surprisingly,
`cantFail(TypeDeserializer::deserializeAs<ProcedureRecord>(...))` just silently
parsed it without error and gave the wrong result.
It's hard to test. This only causes a problem when `params_remaining`
is larger than the real parameter size. If it's smaller, we also check
each local variable's attribute to see whether it's a parameter. When I tried to
come up with a test, the parameter size was always 0 when parsing LF_MFUNCTION as
LF_PROCEDURE.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D136209
Alexey Bataev [Tue, 28 Dec 2021 13:23:10 +0000 (05:23 -0800)]
[SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.
Metric: SLP.NumVectorInstructions
Program SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 6380.00 6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 6380.00 6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test 2023.00 2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test 148.00 146.00 -1.4%
Generated more vector instructions.
Differential Revision: https://reviews.llvm.org/D127531
Nikolas Klauser [Thu, 27 Oct 2022 18:01:13 +0000 (20:01 +0200)]
[libc++][math.h] Remove unnecessary uses of __promote
Removes __promote when it's just the identity.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, michaelplatings
Differential Revision: https://reviews.llvm.org/D136868
Diego Caballero [Wed, 26 Oct 2022 01:18:15 +0000 (01:18 +0000)]
[mlir][Vector] Introduce the `vector.mask` operation lowering
This patch introduces the lowering for xfer ops masked with `vector.mask`.
Vector reductions are not lowered yet because new LLVM intrinsics are needed
in the LLVM dialect.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D136741
Diego Caballero [Wed, 26 Oct 2022 00:58:56 +0000 (00:58 +0000)]
[mlir][Vector] Introduce the MaskingOpInterface
This MaskingOpInterface provides masking capabilities to those
operations that implement it. For now it is only implemented by the `vector.mask`
operation and it's used to break the dependency between the Vector
dialect (where the `vector.mask` op lives) and operations implementing
the MaskableOpInterface.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D136734
Alex Zinenko [Fri, 21 Oct 2022 00:53:05 +0000 (00:53 +0000)]
[mlir] ODS: emit interface model method at the end of the header
Previously, ODS interface generator was placing implementations of the
interface's internal "Model" class template immediately after the class
definitions in the header. This doesn't allow this implementation, and
consequently the interface itself, to return an instance of another
interface if its class definition is emitted below. This creates
undesired ordering effects and makes it impossible for two or more
interfaces to return instances of each other. Change the interface
generator to place the implementations of these methods after all
interface classes.
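The ordering problem can be illustrated with plain C++ classes. This is a sketch with hypothetical class names, not ODS-generated code: member functions are only declared inside the classes, and their bodies come after every class definition, so each body sees the other type as complete.

```cpp
// Forward declaration: enough to declare a function returning InterfaceB.
class InterfaceB;

class InterfaceA {
public:
  explicit InterfaceA(int id) : id(id) {}
  // InterfaceB is still incomplete here, so only a declaration is possible.
  InterfaceB getB() const;
  int id;
};

class InterfaceB {
public:
  explicit InterfaceB(int id) : id(id) {}
  InterfaceA getA() const;
  int id;
};

// Definitions placed after all class definitions: both types are complete,
// so the two classes can return instances of each other.
inline InterfaceB InterfaceA::getB() const { return InterfaceB(id + 1); }
inline InterfaceA InterfaceB::getA() const { return InterfaceA(id + 1); }
```

Defining `InterfaceA::getB` inline, directly inside `InterfaceA`, would fail because `InterfaceB` is incomplete at that point.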
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D136322
Amy Huang [Thu, 27 Oct 2022 22:41:49 +0000 (22:41 +0000)]
Fix documentation error in e8433a2b06d5
Amy Huang [Tue, 18 Oct 2022 17:56:41 +0000 (17:56 +0000)]
Update docs for -fuse-ctor-homing
Update docs to reflect the fact that this flag is on by default now.
Differential Revision: https://reviews.llvm.org/D136188
Alexey Bataev [Thu, 27 Oct 2022 21:42:07 +0000 (14:42 -0700)]
Revert "[SLP]Improve analysis of same/alternate code ops and scheduling."
This reverts commit dad64448c66975054d3d968232652a56eb93b451 to fix
a crash in https://lab.llvm.org/buildbot/#/builders/74/builds/14584
Jorge Gorbe Moya [Thu, 27 Oct 2022 21:48:44 +0000 (14:48 -0700)]
[lldb-vscode] Don't call SBValue.GetError after generating a summary.
On some occasions, SBValue::GetError can invalidate its cached
`m_summary_str` member. This in turn invalidates any StringRef variables
pointing to it.
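A generic sketch of the hazard, using a hypothetical `FakeValue` class rather than the real SBValue API. A pointer into a cached string is only valid until a later call resets the cache. The commit avoids the later call; the sketch shows another common way to stay safe, which is to take an owning copy first.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for an object that caches its summary string.
class FakeValue {
public:
  explicit FakeValue(std::string summary) : cachedSummary(std::move(summary)) {}

  // Returns a pointer into the cache; it dangles once the cache is reset.
  const char *getSummary() const { return cachedSummary.c_str(); }

  // Stand-in for a query that resets the cache as a side effect.
  bool hasError() {
    cachedSummary.clear();
    cachedSummary.shrink_to_fit();
    return false;
  }

private:
  std::string cachedSummary;
};

// Takes an owning copy of the summary before making the call that may
// invalidate the cache, so no non-owning view outlives the cached string.
inline std::string summaryThenError(FakeValue &value) {
  std::string summary = value.getSummary();
  if (value.hasError())
    return "<error>";
  return summary;
}
```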
Differential Revision: https://reviews.llvm.org/D136890
Aaron Siddhartha Mondal [Wed, 26 Oct 2022 17:10:07 +0000 (19:10 +0200)]
[Bazel] Add missing C++ style Clang headers and modulemap
Reviewed By: chapuni
Differential Revision: https://reviews.llvm.org/D136452
Emilio Cota [Thu, 27 Oct 2022 20:45:14 +0000 (16:45 -0400)]
[mlir] Fix typo s/utilties/utilities/ (including in file name)
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D136887
Xing Xue [Thu, 27 Oct 2022 21:06:18 +0000 (17:06 -0400)]
[libc++abi][AIX] Use reserved slot in stack to pass the address of exception object
Summary:
The existing implementation of the personality for legacy IBM xlclang++ compiler generated code passes the address of exception object in r14 for the landing pad to retrieve with a call to __xlc_exception_handle(). This clobbers the content of r14 in user code (and potentially, when running cleanup actions, the address of another exception object being passed). This patch changes to use the stack slot reserved for compilers to pass the address. It has been confirmed that xlclang++-generated code does not use this slot.
This is a follow-on of the original patch below with a change in comments.
https://reviews.llvm.org/rGa499051f10a2d0150b60c14493558476039f701a
Reviewed by: hubert.reinterpretcast, cebowleratibm
Differential Revision: https://reviews.llvm.org/D136257
Peiming Liu [Wed, 26 Oct 2022 19:07:25 +0000 (19:07 +0000)]
[mlir][sparse] code refactoring, move <tid, loop id> -> dim map to Merger.
To address unresolved comments in D136185
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136780
Kevin Athey [Thu, 27 Oct 2022 20:57:25 +0000 (13:57 -0700)]
Revert "[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info"
This reverts commit 3d0e9edd8e53fb72e85084f4170513159212839a.
Breaking HWASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/236/builds/786
Shown by targeted builds breaking at this patch:
Built at this patch: https://lab.llvm.org/buildbot/#/builders/236/builds/803
Built at prior patch: https://lab.llvm.org/buildbot/#/builders/236/builds/804
Aart Bik [Thu, 27 Oct 2022 19:43:39 +0000 (12:43 -0700)]
[mlir][sparse] fix typo "admissable" -> "admissible"
Reviewed By: wrengr, Peiming
Differential Revision: https://reviews.llvm.org/D136878
David Blaikie [Thu, 27 Oct 2022 20:41:22 +0000 (20:41 +0000)]
Clang: Add release note for defaulted-special-members-POD GCC ABI fix
Follow-up to 7846d590033e8d661198f4c00f56f46a4993c526
David Blaikie [Thu, 27 Oct 2022 19:35:04 +0000 (19:35 +0000)]
[BinaryFormat:Dwarf] Add a couple of DW_IDX_GCC extensions
Seen here:
https://github.com/gcc-mirror/gcc/blob/16e2427f50c208dfe07d07f18009969502c25dc8/include/dwarf2.def#L805-L806
Hongtao Yu [Mon, 17 Oct 2022 17:07:18 +0000 (10:07 -0700)]
[PseudoProbe] Replace relocation with offset for entry probe.
Currently pseudo probe encoding for a function is like:
- For the first probe, a relocation from it to its physical position in the code body
- For subsequent probes, an incremental offset from the current probe to the previous probe
The relocation could potentially cause relocation overflow during link time. I'm now replacing it with an offset from the first probe to the function start address.
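The offset-based scheme described above can be sketched as follows. This is a simplified model with hypothetical names (`EncodedProbes`, `encodeProbes`, `decodeProbes`); it assumes a non-empty list of probe addresses in ascending order and ignores the real binary format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of one function's probe stream (hypothetical layout).
struct EncodedProbes {
  std::uint64_t firstOffset = 0;      // first probe, relative to function start
  std::vector<std::uint64_t> deltas;  // each later probe, relative to the previous
};

// Encodes absolute probe addresses without any absolute address (and so
// without a relocation): only offsets survive in the output.
inline EncodedProbes encodeProbes(std::uint64_t funcStart,
                                  const std::vector<std::uint64_t> &addrs) {
  assert(!addrs.empty() && addrs[0] >= funcStart);
  EncodedProbes out;
  out.firstOffset = addrs[0] - funcStart;
  for (std::size_t i = 1; i < addrs.size(); ++i) {
    assert(addrs[i] >= addrs[i - 1]);  // ascending order is an assumption here
    out.deltas.push_back(addrs[i] - addrs[i - 1]);
  }
  return out;
}

// Decoding needs the function start address, which the decoder looks up
// from the enclosing binary function.
inline std::vector<std::uint64_t> decodeProbes(std::uint64_t funcStart,
                                               const EncodedProbes &enc) {
  std::vector<std::uint64_t> addrs;
  std::uint64_t current = funcStart + enc.firstOffset;
  addrs.push_back(current);
  for (std::uint64_t delta : enc.deltas) {
    current += delta;
    addrs.push_back(current);
  }
  return addrs;
}
```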
A source function could be lowered into multiple binary functions due to outlining (e.g., coro-split). Since those binary functions have independent link-time layout, to really avoid relocations from .pseudo_probe sections to .text sections, the offset to replace with should really be the offset from the probe's enclosing binary function, rather than from the entry of the source function. This requires some changes to the previous section-based emission scheme, which now switches to be function-based. The assembly form of the pseudo probe directive is also changed correspondingly, i.e., reflecting the binary function name.
Most of the source functions end up with only one binary function. For those that don't, a sentinel probe is emitted for each of the binary functions with a different name from the source. The sentinel probe indicates the binary function name to differentiate subsequent probes from the ones from a different binary function. For example, given the source function
```
Foo() {
…
Probe 1
…
Probe 2
}
```
If it is transformed into two binary functions:
```
Foo:
…
Foo.outlined:
…
```
The encoding for the two binary functions will be separate:
```
GUID of Foo
Probe 1
GUID of Foo
Sentinel probe of Foo.outlined
Probe 2
```
Then Probe 1 will be decoded against binary `Foo`'s address, and Probe 2 will be decoded against `Foo.outlined`. The sentinel probe of `Foo.outlined` makes sure there's no accidental relocation from `Foo.outlined`'s probes to `Foo`'s entry address.
On the BOLT side, to be minimally intrusive, the pseudo probe re-encoding sticks with the old encoding format. This is fine since, unlike the linker, BOLT processes the pseudo probe section as a whole and it is free from relocation overflow issues.
The change is downwards compatible as long as there's no mixed use of the old encoding and the new encoding.
Reviewed By: wenlei, maksfb
Differential Revision: https://reviews.llvm.org/D135912
Differential Revision: https://reviews.llvm.org/D135914
Differential Revision: https://reviews.llvm.org/D136394
Valentin Clement [Thu, 27 Oct 2022 20:23:40 +0000 (22:23 +0200)]
[flang] Remove debug flag added in D136824
wlei [Thu, 27 Oct 2022 20:13:04 +0000 (13:13 -0700)]
Use getCanonicalFnName for callee name
Ji, Jinsong [Thu, 27 Oct 2022 19:56:49 +0000 (12:56 -0700)]
[docs] Fix old path for clang-format
The path in vimrc was old, replace it with <path-to-this-file> to be
consistent with above.
Jason Molenda [Thu, 27 Oct 2022 20:11:20 +0000 (13:11 -0700)]
Handle an unknown binary platform type in debugserver
debugserver parses the Mach-O header & load commands of
binaries; if it does this with a binary whose LC_BUILD
platform enum it does not recognize, it will currently crash.
This patch changes MachProcess::GetPlatformString to return
an optional platform string, and updates the callers to
do the right thing when this optional could not be
provided.
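The general shape of such a change can be sketched with `std::optional` (C++17). The function names, platform numbers, and strings below are illustrative assumptions, not debugserver's actual code.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical lookup: returns a name for known platform values and
// std::nullopt for anything unrecognized, instead of crashing.
inline std::optional<std::string> platformString(std::uint32_t platform) {
  switch (platform) {
  case 1: // illustrative value
    return std::string("macos");
  case 2: // illustrative value
    return std::string("ios");
  default:
    return std::nullopt;
  }
}

// A caller decides what "the right thing" is when no string is available;
// here it falls back to a placeholder.
inline std::string describePlatform(std::uint32_t platform) {
  return platformString(platform).value_or("unknown");
}
```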
Differential Revision: https://reviews.llvm.org/D136719
rdar://100452994
Troy Johnson [Thu, 27 Oct 2022 19:31:34 +0000 (12:31 -0700)]
[clang][Sema][NFC] Remove redundant isTypeValid
These isTypeValid calls are redundant because the isDerivedFrom call performs the same checks.
Differential Revision: https://reviews.llvm.org/D136190
Joe Loser [Thu, 27 Oct 2022 15:05:08 +0000 (09:05 -0600)]
[ADT] Simplify hashing for tuples
Instead of using `std::index_sequence` with a helper function template to access
each element in the tuple, leverage `std::apply` from C++17 to do the heavy
lifting for us.
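A minimal sketch of the idea, not the ADT implementation: `std::apply` expands the tuple into a parameter pack, and a fold expression visits each element. The `mixInto` combiner is a hypothetical stand-in for a real hash-combining routine.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <type_traits>

// Hypothetical combiner; any reasonable mixing step would do here.
inline void mixInto(std::size_t &seed, std::size_t h) {
  seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hashes every element of a tuple. std::apply unpacks the tuple into the
// lambda's parameter pack, so no index_sequence helper is needed.
template <typename... Ts>
std::size_t hashTuple(const std::tuple<Ts...> &t) {
  return std::apply(
      [](const auto &...elems) {
        std::size_t seed = 0;
        (mixInto(seed, std::hash<std::decay_t<decltype(elems)>>{}(elems)), ...);
        return seed;
      },
      t);
}
```

For example, `hashTuple(std::make_tuple(1, std::string("a")))` hashes the `int` and the `std::string` in order and combines the two results.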
Differential Revision: https://reviews.llvm.org/D136850
Sanjay Patel [Thu, 27 Oct 2022 19:24:02 +0000 (15:24 -0400)]
[InstCombine] improve efficiency of sub demanded bits; NFC
There's no reason to shrink a constant or simplify
an operand in 2 steps.
This matches what we currently do for 'add' (although that
seems like it should be altered to handle the commutative
case).
Dmitry Sidorov [Thu, 27 Oct 2022 18:06:22 +0000 (14:06 -0400)]
[IR] Allow typed pointers to be used in vector types
Reviewed By: nikic, jcranmer-intel
Differential Revision: https://reviews.llvm.org/D136768
Jordan Rupprecht [Thu, 27 Oct 2022 19:23:22 +0000 (12:23 -0700)]
[NFC] Remove unused variables
Kristof Beyls [Thu, 27 Oct 2022 19:14:32 +0000 (21:14 +0200)]
Community calendar: more clearly document how to add events
Xing Xue [Thu, 27 Oct 2022 19:11:06 +0000 (15:11 -0400)]
[libc++abi][AIX] Use reserved slot in stack to pass the address of exception object
Summary:
The existing implementation of the personality for legacy IBM xlclang++ compiler generated code passes the address of exception object in r14 for the landing pad to retrieve with a call to __xlc_exception_handle(). This clobbers the content of r14 in user code (and potentially, when running cleanup actions, the address of another exception object being passed). This patch changes to use the stack slot reserved for compilers to pass the address. It has been confirmed that xlclang++-generated code does not use this slot.
Reviewed by: hubert.reinterpretcast, cebowleratibm
Jeff Niu [Tue, 25 Oct 2022 16:28:53 +0000 (09:28 -0700)]
[mlir] Add `parseSymbolName` that doesn't take an attribute list
This patch adds a version of `parseSymbolName` and
`parseOptionalSymbolName` to AsmParser that don't take an attribute name
and attribute list.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D136696
Valentin Clement [Thu, 27 Oct 2022 18:56:39 +0000 (20:56 +0200)]
[flang] Carry dynamic type when emboxing polymorphic pointer
In order to be passed as the passed-object in dynamic dispatch,
polymorphic pointer entities are emboxed. In this process, the dynamic
type must be preserved and passed to fir.embox as the tdesc operand. This
patch introduces a new ExtendedValue that allows carrying over the
dynamic type when the value is unboxed.
Depends on D136820
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D136824
Valentin Clement [Thu, 27 Oct 2022 18:55:43 +0000 (20:55 +0200)]
[flang] Lower allocate for polymorphic pointer
Lowering of the allocate statement for polymorphic pointers is a bit
different than for allocatables. A call to `PointerNullifyDerived`
runtime function is done instead of `AllocatableInitDerived`.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D136820
Lei Zhang [Thu, 27 Oct 2022 18:31:21 +0000 (14:31 -0400)]
[mlir][spirv] Add target control to UnifyAliasedResourcePass
The UnifyAliasedResourcePass is actually only necessary for
targeting Apple GPUs via MoltenVK, where we need to translate
SPIR-V into MSL. The translation has limitations--no support
of aliased resources. So introducing a control to disable
this pass if targeting other platforms.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D136869
Alexey Bataev [Tue, 28 Dec 2021 13:23:10 +0000 (05:23 -0800)]
[SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.
Metric: SLP.NumVectorInstructions
Program SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 6380.00 6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 6380.00 6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test 2023.00 2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test 148.00 146.00 -1.4%
Generated more vector instructions.
Differential Revision: https://reviews.llvm.org/D127531
David Sherwood [Thu, 27 Oct 2022 18:24:51 +0000 (18:24 +0000)]
David Sherwood [Tue, 25 Oct 2022 11:06:51 +0000 (11:06 +0000)]
[AArch64][SVE2] Add the SVE2.1 pext and ptrue predicate-as-counter instructions
This patch adds the assembly/disassembly for the following instructions:
pext (predicate) : Set predicate from predicate-as-counter
ptrue (predicate-as-counter) : Initialise predicate-as-counter to all active
This patch also introduces the predicate-as-counter registers pn8, etc.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D136678
Aart Bik [Wed, 26 Oct 2022 22:07:18 +0000 (15:07 -0700)]
[mlir][sparse] add a cursor to sparse storage scheme
This prepares a subsequent revision that will generalize
the insertion code generation. Similar to the support lib,
insertions become much easier to perform with some "cursor"
bookkeeping. Note that we, in the long run, could perhaps
avoid storing the "cursor" permanently and use some
restricted-scope solution (alloca?) instead. However,
that puts harder restrictions on insertion-chain operations,
so for now we follow the more straightforward approach.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136800
Kevin Sala [Thu, 27 Oct 2022 18:01:16 +0000 (18:01 +0000)]
[OpenMP][libomptarget] New plugin infrastructure and new CUDA plugin
This patch adds a new infrastructure for OpenMP target plugins. It also implements the CUDA and GenericELF64bit plugins under this new infrastructure. We place the sources in a separate directory named plugins-nextgen, and we build the new plugins as different plugin libraries. The original plugins, which remain untouched, will be used by default. However, the user can change this behavior at run-time through the boolean envar LIBOMPTARGET_NEXTGEN_PLUGINS. If enabled, the libomptarget will try to load the NextGen version of each plugin, falling back to the original if they are not present or valid.
The idea of this new plugin infrastructure is to implement the common parts of target plugins in generic classes (defined in files inside the plugins-nextgen/common/PluginInterface folder), and then each specific plugin defines its own specific classes inheriting from the common ones. In this way, most logic remains in the common interface while reducing the plugin-specific source code. It is also beneficial in the sense that now most code and behavior are the same across the different plugins. As an example, we define classes for a plugin, a device, a device image, a stream manager, etc. The plugin object (a single instance per plugin library) holds different device objects (i.e., one per available device), while the latter are responsible for managing their own resources.
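The layering described above can be sketched roughly as below. All class and method names are hypothetical and do not mirror the actual PluginInterface classes. The shared logic lives in the generic base classes, and a concrete plugin only overrides a few hooks.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Generic device: the common behavior is implemented once, here.
class GenericDevice {
public:
  explicit GenericDevice(int id) : id(id) {}
  virtual ~GenericDevice() = default;

  // Common logic shared by every plugin's devices.
  std::string describe() const {
    return backendName() + " device " + std::to_string(id);
  }

protected:
  // Plugin-specific hook.
  virtual std::string backendName() const = 0;

private:
  int id;
};

// Generic plugin: one instance per plugin library, owning one device
// object per available device.
class GenericPlugin {
public:
  virtual ~GenericPlugin() = default;

  void init(int numAvailable) {
    for (int i = 0; i < numAvailable; ++i)
      devices.push_back(createDevice(i));
  }
  std::size_t numDevices() const { return devices.size(); }
  const GenericDevice &device(std::size_t i) const { return *devices.at(i); }

protected:
  // Plugin-specific hook.
  virtual std::unique_ptr<GenericDevice> createDevice(int id) = 0;

private:
  std::vector<std::unique_ptr<GenericDevice>> devices;
};

// A toy backend: only the hooks are plugin-specific.
class ToyDevice : public GenericDevice {
public:
  using GenericDevice::GenericDevice;

protected:
  std::string backendName() const override { return "toy"; }
};

class ToyPlugin : public GenericPlugin {
protected:
  std::unique_ptr<GenericDevice> createDevice(int id) override {
    return std::unique_ptr<GenericDevice>(new ToyDevice(id));
  }
};
```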
Most code on this patch is based on the changes made by @jdoerfert (Johannes Doerfert)
Reviewed By: jhuber6, jdoerfert
Differential Revision: https://reviews.llvm.org/D134396
Bill Wendling [Thu, 20 Oct 2022 23:10:31 +0000 (16:10 -0700)]
[clang] Implement -fstrict-flex-arrays=3
The -fstrict-flex-arrays=3 level is the most restrictive setting for flexible array members (FAMs).
No number, including 0, is allowed in the FAM. In the cases where a "0"
is used, the resulting size is the same as if a zero-sized object were
substituted.
This is needed for proper _FORTIFY_SOURCE coverage in the Linux kernel,
among other reasons. So while the only reason for specifying a
zero-length array at the end of a structure is to specify a FAM,
treating it as such will cause _FORTIFY_SOURCE not to work correctly;
__builtin_object_size will report -1 instead of 0 for a destination
buffer size to keep any kernel internals from using the deprecated
members as fake FAMs.
For example:
    struct broken {
        int foo;
        int fake_fam[0];
        struct something oops;
    };
There have been bugs where the above struct was created because "oops"
was added after "fake_fam" by someone not realizing. Under
_FORTIFY_SOURCE, doing:
memcpy(p->fake_fam, src, len);
raises no warnings when __builtin_object_size(p->fake_fam, 1) returns -1
and may stomp on "oops."
Omitting a warning when using the (invalid) zero-length array is how GCC
treats -fstrict-flex-arrays=3. A warning in that situation is likely an
irritant, because requesting this option level is explicitly requesting
this behavior.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
Differential Revision: https://reviews.llvm.org/D134902
Peiming Liu [Thu, 27 Oct 2022 17:12:20 +0000 (17:12 +0000)]
[mlir][sparse] fix crash when sparsifying broadcast operations.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136866
Craig Topper [Thu, 27 Oct 2022 17:24:35 +0000 (10:24 -0700)]
[RISCV] Fix an obvious CSE opportunity in LSR test case. NFC
rkayaith [Thu, 20 Oct 2022 04:51:06 +0000 (00:51 -0400)]
[mlir][CAPI] Allow specifying pass manager anchor
This adds a new function for creating pass managers that takes an
argument for the anchor string.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D136404
Chris Bieneman [Wed, 26 Oct 2022 17:25:19 +0000 (12:25 -0500)]
[ObjectYAML] Add support for DXContainer HASH
DXContainer files contain a part that has an MD5 of the generated
shader. This adds support to the ObjectYAML tooling to expand the hash
part data and hash itself in preparation for adding hashing support to
DirectX code generation.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D136632
Michael Jones [Wed, 26 Oct 2022 20:42:39 +0000 (13:42 -0700)]
[libc] add fgets
This adds the fgets function and its unit tests.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D136785
eopXD [Wed, 28 Sep 2022 18:01:23 +0000 (11:01 -0700)]
[LSR] Drop LSR solution if it is less profitable than baseline
LSR may suggest a less profitable transformation of the loop. This
patch adds a check to prevent LSR from generating worse code than what
we already have.
Since LSR affects nearly all targets, the patch is guarded by the
option 'lsr-drop-solution', which is disabled by default for now.
The next step should be extending a TTI interface to allow target(s)
to enable this enhancement.
A debug log message is added to notify the user when the LSR
solution is skipped.
Reviewed By: Meinersbur, #loopoptwg
Differential Revision: https://reviews.llvm.org/D126043
rkayaith [Thu, 20 Oct 2022 20:40:32 +0000 (16:40 -0400)]
[mlir][python] Include pipeline parse errors in exception message
Currently any errors during pipeline parsing are reported to stderr.
This adds a new pipeline parsing function to the C api that reports
errors through a callback, and updates the python bindings to use it.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D136402
Fangrui Song [Thu, 27 Oct 2022 16:25:21 +0000 (09:25 -0700)]
[llvm-readelf] --section-details: display SHF_COMPRESSED headers
readelf --section-details displays ch_type/ch_size/ch_addralign for
a SHF_COMPRESSED section. Port the feature. There is a small difference
that readelf doesn't display `[<corrupt>]` for an empty section while
we do.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D136636
Alexander Belyaev [Thu, 27 Oct 2022 16:02:32 +0000 (18:02 +0200)]
[mlir] Fix asan issue in Vectorization.cpp of Linalg.
Differential Revision: https://reviews.llvm.org/D136852
Roman Lebedev [Thu, 27 Oct 2022 16:08:43 +0000 (19:08 +0300)]
[NFC][PhaseOrdering] Add one more test for SROA after partial unroll
https://reviews.llvm.org/D136806
Craig Topper [Thu, 27 Oct 2022 15:52:05 +0000 (08:52 -0700)]
[LegalizeVectorOps][X86][RISCV] Expand vector S/USHLSAT instead of unrolling.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D136478
Dave Lee [Thu, 27 Oct 2022 15:48:12 +0000 (08:48 -0700)]
[lldb][test] Remove explicit mydir definitions (NFC)
Kito Cheng [Thu, 27 Oct 2022 15:45:59 +0000 (23:45 +0800)]
[RISCV] Drop single letter b extension support
It was split into several zb* extensions, and `b` was dropped after
0.93, so it is time to retire it like the other non-ratified zb* extensions.
Currently clang accepts it with a warning:
$ clang -target riscv64-elf ~/hello.c -S -march=rv64gcb
'+b' is not a recognized feature for this target (ignoring feature)
'+b' is not a recognized feature for this target (ignoring feature)
'+b' is not a recognized feature for this target (ignoring feature)
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D136812
Dave Lee [Thu, 27 Oct 2022 05:05:41 +0000 (22:05 -0700)]
[lldb][test] Remove empty setUp/tearDown methods (NFC)
Alexandros Lamprineas [Tue, 25 Oct 2022 15:45:18 +0000 (16:45 +0100)]
[FuncSpec] Do not overestimate the specialization bonus for users inside loops.
When calculating the specialization bonus for a given function argument,
we recursively traverse the chain of (certain) users, accumulating the
instruction costs. Then we exponentially increase the bonus to account
for loop nests. This is problematic for two reasons: (a) the users might
not themselves be inside the loop nest, (b) if they are, we are accounting
for it multiple times. Instead we should be adjusting the bonus before
traversing the user chain.
This reduces the instruction count for CTMark (newPM-O3) when Function
Specialization is enabled without actually reducing the number of
specializations performed (geomean: -0.001% non-LTO, -0.406% LTO).
Differential Revision: https://reviews.llvm.org/D136692
Sanjay Patel [Thu, 27 Oct 2022 12:54:07 +0000 (08:54 -0400)]
[InstCombine] improve demanded bits for Sub operand 0
This is copying the code that was added for 'add' with D130075.
(That patch removed a fallthrough in the cases, but we can
probably still share at least some code again as a follow-up
cleanup, but I didn't want to risk it here.)
The reasoning is similar to the carry propagation for 'add':
if we don't demand low bits of the subtraction and the
subtrahend (aka RHS or operand 1) is known zero in those low
bits, then there can't be any borrowing required from the
higher bits of operand 0, so the low bits don't matter.
Also, the no-wrap flags can be propagated (and I think that
should be true for add too).
Here's an attempt to prove that in Alive2:
https://alive2.llvm.org/ce/z/xqh7Pa
(can add nsw or nuw to src and tgt, and it should still pass)
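The borrow argument above can be checked exhaustively on 8-bit values. The following is only an illustrative sketch, not code from the patch: the helper names `highBitsOfSub` and `lowBitsOfMinuendAreDead` are hypothetical, and `k` stands for the number of low bits that are not demanded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the demanded (high) bits of the 8-bit result a - b,
// where the low k bits are not demanded.
uint8_t highBitsOfSub(uint8_t a, uint8_t b, unsigned k) {
  const unsigned highMask = 0xFFu & ~((1u << k) - 1u);
  return static_cast<uint8_t>((unsigned(a) - unsigned(b)) & highMask);
}

// Returns true if, for every 8-bit a and every 8-bit b whose low k bits are
// zero, clearing the low k bits of a leaves the demanded bits of a - b
// unchanged. Valid for k in [0, 8].
bool lowBitsOfMinuendAreDead(unsigned k) {
  const unsigned lowMask = (1u << k) - 1u;
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      if (b & lowMask)
        continue; // the subtrahend must be known zero in the low bits
      const uint8_t cleared = static_cast<uint8_t>(a & ~lowMask);
      if (highBitsOfSub(static_cast<uint8_t>(a), static_cast<uint8_t>(b), k) !=
          highBitsOfSub(cleared, static_cast<uint8_t>(b), k))
        return false;
    }
  }
  return true;
}
```

The precondition on the subtrahend matters: with `k = 1`, `a = 1`, `b = 1`, clearing the low bit of `a` changes the high bits of the result, because a borrow is then required.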
Differential Revision: https://reviews.llvm.org/D136788
Liming Liu [Thu, 27 Oct 2022 13:27:02 +0000 (06:27 -0700)]
[P0857R0 Part-B] Allows `requires` clauses appearing in
template-template parameters. Although it affects whether a template can be
used as an argument for another template, the constraint seems not to
be checked, and other major implementations (GCC, MSVC, et al.) do not check it either.
Additionally, Part-A of the document seems to have been implemented.
So mark P0857R0 as completed.
Differential Revision: https://reviews.llvm.org/D134128
gonglingqin [Thu, 27 Oct 2022 12:45:57 +0000 (20:45 +0800)]
[LoongArch] Add codegen support for cmpxchg on LA64
Differential Revision: https://reviews.llvm.org/D135948
John Brawn [Thu, 27 Oct 2022 13:14:57 +0000 (14:14 +0100)]
[MachineCSE] Allow PRE of instructions that read physical registers
Currently MachineCSE forbids PRE when the instruction reads a physical
register. Relax this so that it's allowed when the value being read is
the same as what would be read in the place the instruction would be
hoisted to.
This is being done in preparation for adding FPCR handling to the
AArch64 backend, in order to prevent it from worsening the
generated code, but for targets that already have a similar register
it should improve things.
This patch affects code generation in several tests. The new code
looks better except for in Thumb2/LowOverheadLoops/memcall.ll where
we perform PRE but the LowOverheadLoops transformation then undoes
it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse,
but actually the function as a whole is better (as a MOV is PRE'd).
Differential Revision: https://reviews.llvm.org/D136675
LLVM GN Syncbot [Thu, 27 Oct 2022 12:53:31 +0000 (12:53 +0000)]
[gn build] Port
b51b90d6e25c
LLVM GN Syncbot [Thu, 27 Oct 2022 12:53:30 +0000 (12:53 +0000)]
[gn build] Port
17059753f133
Nico Weber [Thu, 27 Oct 2022 12:47:12 +0000 (08:47 -0400)]
[gn build] semi-automatically port
4f06d46f465c (LogicalView input files)
Michał Górny [Thu, 27 Oct 2022 12:30:02 +0000 (14:30 +0200)]
Revert "Harmonize cmake_policy() across standalone builds of all projects"
This reverts commit
88d7508dc479210f07abccb17f0194b66264b125.
It's reported to break builds when symlinking other projects inside
the `tools` directory.
Jakob Johnson [Wed, 26 Oct 2022 22:48:56 +0000 (15:48 -0700)]
[intelpt] Update Python tests to account for new errors
Update the Python tests (i.e. tests run via `lldb-dotest -p TestTrace`) to
handle the new error introduced in D136610.
Test Plan:
`lldb-dotest -p TestTrace`
Differential Revision: https://reviews.llvm.org/D136801
Carlos Alberto Enciso [Thu, 27 Oct 2022 11:04:27 +0000 (12:04 +0100)]
[llvm-debuginfo-analyzer] (08/09) - ELF Reader
The fix for the unittest case introduced a dependency on the
MC library causing a failure in:
https://lab.llvm.org/buildbot/#/builders/121/builds/24567
clang-ppc64le-multistage/stage1
undefined reference to symbol 'llvm::TargetRegistry::lookupTarget'
Added:
- MC to the LLVM_LINK_COMPONENTS list.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D136837
Alexander Belyaev [Thu, 27 Oct 2022 11:01:22 +0000 (13:01 +0200)]
[mlir] Fix printing when linalg.map has no inputs.
Differential Revision: https://reviews.llvm.org/D136836
Momchil Velikov [Thu, 27 Oct 2022 11:23:03 +0000 (12:23 +0100)]
Recommit: [FuncSpec] Fix specialisation based on literals
[fixed test to work with reverse iteration]
The `FunctionSpecialization` pass has support for specialising
functions, which are called with literal arguments. This functionality
is disabled by default and is enabled with the option
`-function-specialization-for-literal-constant` . There are a few
issues with the implementation, though:
* even with the default, the pass will still specialise based on
floating-point literals
* even when it's enabled, the pass will specialise only for the `i1`
type (or `i2` if all of the possible 4 values occur, or `i3` if all
of the possible 8 values occur, etc)
The reason for this is an incorrect check of the lattice value of the
function formal parameter. The lattice value is `overdefined` when the
constant range of the possible arguments is the full set, and this is
the reason for the specialisation to trigger. However, if the set of
the possible arguments is not the full set, that must not prevent the
specialisation.
This patch changes the pass to NOT consider a formal parameter when
specialising a function if the lattice value for that parameter is:
* unknown or undef
* a constant
* a constant range with a single element
on the basis that specialisation is pointless for those cases.
It also changes the criteria for picking up an actual argument to
specialise if the argument is:
* an LLVM IR constant
* has `constant` lattice value
* has `constantrange` lattice value with a single element.
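The two criteria above can be read as a pair of predicates. The sketch below is a simplified model and not the pass's actual code: `LatticeKind`, `LatticeValue`, `rangeSize`, `isFormalWorthSpecialising` and `isActualUsable` are all hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a lattice value.
enum class LatticeKind { Unknown, Undef, Constant, ConstantRange, Overdefined };

struct LatticeValue {
  LatticeKind kind;
  unsigned rangeSize; // number of elements; only meaningful for ConstantRange
};

// A formal parameter is skipped when its lattice value is unknown/undef,
// a constant, or a constant range with a single element.
bool isFormalWorthSpecialising(const LatticeValue &v) {
  switch (v.kind) {
  case LatticeKind::Unknown:
  case LatticeKind::Undef:
  case LatticeKind::Constant:
    return false;
  case LatticeKind::ConstantRange:
    return v.rangeSize != 1;
  case LatticeKind::Overdefined:
    return true;
  }
  return false;
}

// An actual argument is picked up when it is an IR constant, has a constant
// lattice value, or has a single-element constant range.
bool isActualUsable(bool isIRConstant, const LatticeValue &v) {
  return isIRConstant || v.kind == LatticeKind::Constant ||
         (v.kind == LatticeKind::ConstantRange && v.rangeSize == 1);
}
```

Under this model a formal parameter whose range is not the full set (for example a `ConstantRange` with three elements) is still considered, which is the case the old check rejected.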
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D135893
Change-Id: Iea273423176082ec51339aa66a5fe9fea83557ee
Michał Górny [Mon, 24 Oct 2022 04:31:37 +0000 (06:31 +0200)]
Harmonize cmake_policy() across standalone builds of all projects
Move `cmake_policy()` settings from `llvm/CMakeLists.txt` into a shared
`cmake/modules/CMakePolicy.cmake`. Include it from all relevant
projects that support standalone builds, in order to ensure that
the policies are consistently set whether they are built in-tree
or stand-alone.
Differential Revision: https://reviews.llvm.org/D136572
Oleg Shyshkov [Thu, 27 Oct 2022 11:32:52 +0000 (13:32 +0200)]
[mlir] Fix `AffineMap.dropResults`.
`AffineMap.dropResult` erases one result from the array, which changes the indexing of the remaining results. Calling `dropResult` in a loop with increasing indexes therefore does not produce the desired result.
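The index shift is easy to reproduce with a plain `std::vector`. This is a sketch of the general pitfall only, not the MLIR code or the fix that was committed; `dropOneByOne` and `dropAllAtOnce` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Buggy pattern: erase one position at a time in increasing order. After the
// first erase, every later element has moved one slot to the left, so the
// remaining positions no longer refer to the intended elements.
std::vector<int> dropOneByOne(std::vector<int> results,
                              const std::vector<std::size_t> &positions) {
  for (std::size_t pos : positions)
    if (pos < results.size())
      results.erase(results.begin() + static_cast<std::ptrdiff_t>(pos));
  return results;
}

// One possible fix: decide per original index whether to keep the element,
// so no position is ever interpreted against a shifted array.
std::vector<int> dropAllAtOnce(const std::vector<int> &results,
                               const std::vector<std::size_t> &positions) {
  std::vector<int> kept;
  for (std::size_t i = 0; i < results.size(); ++i)
    if (std::find(positions.begin(), positions.end(), i) == positions.end())
      kept.push_back(results[i]);
  return kept;
}
```

Dropping positions 0 and 1 from `{10, 20, 30, 40}` gives `{20, 40}` with the first function and the intended `{30, 40}` with the second.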
Differential Revision: https://reviews.llvm.org/D136833
Max Kazantsev [Thu, 27 Oct 2022 10:54:15 +0000 (17:54 +0700)]
Fix iterator corruption in splitBasicBlockBefore
We should not delete block predecessors (via replacing successors
of terminators) while iterating them, otherwise we may skip some
of them. Instead, save predecessors to a separate vector and iterate
over it.
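A small model of the problem, assuming an index-based list rather than LLVM's actual predecessor iterators: `Block`, `redirect`, `redirectWhileIterating` and `redirectFromSnapshot` are hypothetical names used only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a block that only tracks the ids of its predecessors.
struct Block {
  std::vector<int> preds;
};

// Redirecting predecessor p away from the block removes it from the list.
void redirect(Block &b, int p) {
  b.preds.erase(std::remove(b.preds.begin(), b.preds.end(), p), b.preds.end());
}

// Buggy pattern: the list shrinks under the loop, so every other
// predecessor is skipped.
int redirectWhileIterating(Block b) {
  int redirected = 0;
  for (std::size_t i = 0; i < b.preds.size(); ++i) {
    redirect(b, b.preds[i]);
    ++redirected;
  }
  return redirected;
}

// Fix described above: copy the predecessors into a separate vector first
// and iterate over the copy.
int redirectFromSnapshot(Block b) {
  const std::vector<int> snapshot = b.preds;
  int redirected = 0;
  for (int p : snapshot) {
    redirect(b, p);
    ++redirected;
  }
  return redirected;
}
```

With four predecessors, the first function redirects only two of them, while the snapshot version redirects all four.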
Nikita Popov [Thu, 27 Oct 2022 10:38:55 +0000 (12:38 +0200)]
[FunctionAttrs] Add additional tests with operand bundles (NFC)
Matthias Springer [Thu, 27 Oct 2022 10:24:02 +0000 (12:24 +0200)]
[mlir][tensor][bufferize] Support memory_space for tensor.pad
This change adds memory space support to tensor.pad. (tensor.generate and tensor.from_elements do not support memory spaces yet.)
The memory space is inferred from the buffer of the source tensor.
Instead of lowering tensor.pad to tensor.generate + tensor.insert_slice, it is now lowered to bufferization.alloc_tensor (with the correct memory space) + linalg.map + tensor.insert_slice.
Memory space support for the remaining two tensor ops is left for a later point, as this requires some more design discussions.
Differential Revision: https://reviews.llvm.org/D136265
Phoebe Wang [Thu, 27 Oct 2022 10:23:05 +0000 (18:23 +0800)]
Fix buildbot fail
Matthias Springer [Thu, 27 Oct 2022 10:20:05 +0000 (12:20 +0200)]
[mlir][tensor] Fix build: Add missing line break to test case
This should have been part of D136767.
Matthias Springer [Thu, 27 Oct 2022 09:54:01 +0000 (11:54 +0200)]
[mlir][tensor][bufferize] Lower tensor.generate to linalg.map
There is no memref equivalent of tensor.generate. The purpose of this change is to avoid creating scf.parallel loops during bufferization.
Differential Revision: https://reviews.llvm.org/D136767
Nikita Popov [Thu, 27 Oct 2022 10:00:30 +0000 (12:00 +0200)]
[BasicAA] Remove redundant libcall handling
The writeonly attribute for memset_pattern16 (and other referenced
libcalls) is being added by InferFunctionAttrs nowadays. No need
to special-case it here.
Utkarsh Saxena [Thu, 27 Oct 2022 09:50:43 +0000 (11:50 +0200)]
[clang] Do not hide base member using-decls with different template head.
Fixes: https://github.com/llvm/llvm-project/issues/50886
**Adding a requires clause to the template head** or **constraining the template parameter type** is ineffective because, even though it creates a non-equivalent template head [temp.over.link#6](https://eel.is/c++draft/temp.over.link#6) that is hence eligible for overload resolution, `Derived::foo` still [hides any previous using decl](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaOverload.cpp#L1283-L1301,).
Clang diverges from GCC here, which can be seen more clearly in this example:
```
struct base {
template <int N, int M>
int foo() { return 1; };
};
struct bar : public base {
using base::foo;
template <int N>
int foo() { return 2; };
};
int main() {
bar f;
f.foo<10, 10>(); // clang previously errored while GCC does not.
}
```
https://godbolt.org/z/v5hnh6czq. We see that `bar::foo` hides `base::foo` because it only differs in the head.
Adding a trailing `requires` to the definition was a nice find. In this case, clang considers them [overloads](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaOverload.cpp#L1148-L1152) because of [mismatching requires clause.](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaOverload.cpp#L1390-L1405). So both of them make it to the overload resolution (where constrained Derived::foo is rejected then).
---
In this patch, we do not ignore matching the template head (template parameters, type constraints and trailing requires) while considering whether the using decl of base member should be hidden. The return type of a templated function is still not considered, as different return types would create ambiguous candidates.
The changed tests look reasonable and also match GCC behaviour: https://godbolt.org/z/8KqPEThrY
Note: We are now able to create an ambiguity in the case where both base member and derived member specialisations satisfy the constraints (when the constraints are not the same). Ideally, a using-decl should not create ambiguity. I plan to fix this later if it gathers more attention.
Reviewed By: ilya-biryukov, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D136440
Matthias Springer [Thu, 27 Oct 2022 09:45:15 +0000 (11:45 +0200)]
[mlir] Fix circular dialect initialization
This change fixes a bug where a dialect is initialized multiple times. This triggers an assertion when the ops of the dialect are registered (`error: operation named ... is already registered`).
This bug can be triggered as follows:
1. Dialect A depends on dialect B (as per ADialect.td).
2. Somewhere there is an extension of dialect B that depends on dialect A (e.g., it defines external models that create ops from dialect A). E.g.:
```
registry.addExtension(+[](MLIRContext *ctx, BDialect *dialect) {
BDialectOp::attachInterface ...
ctx->loadDialect<ADialect>();
});
```
3. When dialect A is loaded, its `initialize` function is called twice:
```
ADialect::ADialect()
| |
| v
| ADialect::initialize()
v
getOrLoadDialect<BDialect>()
|
v
(load extension of BDialect)
|
v
ctx->loadDialect<ADialect>() // user wrote this in the extension
|
v
getOrLoadDialect<ADialect>() // the dialect is not "fully" loaded yet
|
v
ADialect::ADialect()
|
v
ADialect::initialize()
```
An example of a dialect extension that depends on other dialects is `Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp`. That particular dialect extension does not trigger this bug. (It would trigger this bug if the SCF dialect depended on the Tensor dialect.)
This change introduces a new dialect state: dialects that are currently being loaded. Same as dialects that were already fully loaded (and initialized), dialects that are in the process of being loaded are not loaded a second time.
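The idea of an "in progress" state can be sketched outside MLIR as a re-entrancy guard. This is a simplified model, not the MLIRContext implementation: `LoadState`, `Registry`, `initializers`, `initCount` and `initCountWithCycle` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical states: a name absent from the map has not been seen yet.
enum class LoadState { Loading, Loaded };

struct Registry {
  // Per-dialect initializer; it may call load() again, directly or through
  // an extension of another dialect.
  std::map<std::string, std::function<void(Registry &)>> initializers;
  std::map<std::string, int> initCount;
  std::map<std::string, LoadState> state;

  void load(const std::string &name) {
    // Loading or Loaded: in both cases, do not start a second load.
    if (state.count(name))
      return;
    state[name] = LoadState::Loading;
    auto it = initializers.find(name);
    if (it != initializers.end())
      it->second(*this);
    ++initCount[name];
    state[name] = LoadState::Loaded;
  }
};

// Reproduces the cycle described above: A loads B, and an extension of B
// loads A again while A is still being loaded.
int initCountWithCycle() {
  Registry r;
  r.initializers["A"] = [](Registry &reg) { reg.load("B"); };
  r.initializers["B"] = [](Registry &reg) { reg.load("A"); };
  r.load("A");
  return r.initCount["A"];
}
```

In this model the nested `load("A")` returns immediately because A is already marked as loading, so A is initialized once.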
Differential Revision: https://reviews.llvm.org/D136685
Phoebe Wang [Thu, 27 Oct 2022 09:08:49 +0000 (17:08 +0800)]
[X86][1/2] SUPPORT RAO-INT
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Initially authored by Liu Chen (@LiuChen3)
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D135951
Matthias Springer [Thu, 27 Oct 2022 08:59:52 +0000 (10:59 +0200)]
[mlir][vector][bufferize] Implement DestinationStyleOpInterface on TransferWriteOp
This simplifies the BufferizableOpInterface implementation of vector.transfer_write.
Differential Revision: https://reviews.llvm.org/D136348
Carlos Alberto Enciso [Thu, 27 Oct 2022 08:47:09 +0000 (09:47 +0100)]
[llvm-debuginfo-analyzer] (08/09) - ELF Reader
The unittest and test cases are platform dependent (x86_64)
causing failures in:
https://lab.llvm.org/buildbot/#/builders/245/builds/146
https://lab.llvm.org/buildbot/#/builders/188/builds/21397
No available targets are compatible with triple "x86_64-unknown-unknown".
Added:
- ';REQUIRES: x86-registered-target' to the LIT tests.
- Code to check if the target 'Triple::x86_64' is supported to
the unittest case.
eopXD [Thu, 27 Oct 2022 07:42:31 +0000 (00:42 -0700)]
Pre-commit test case for D136784
This is a pre-commit for the fix in D136784.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D136783