Amaury Séchet [Fri, 5 Aug 2022 19:46:26 +0000 (19:46 +0000)]
[NFC] Regenerates X86's win64-bool.ll
Slava Zakharin [Thu, 4 Aug 2022 19:07:35 +0000 (12:07 -0700)]
[flang] Lower MOD to Fortran runtime call.
This change removes dependency on pgmath mod, and also allows
Fortran runtime to issue a diagnostic message in case of zero
denominator.
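As a rough illustration of the semantics involved (not the Fortran runtime code itself), the sketch below models integer MOD as a - INT(a/p)*p with truncation toward zero, and raises an error for a zero denominator. The name `fortran_mod` and the error message are hypothetical.

```python
def fortran_mod(a: int, p: int) -> int:
    """Illustrative model of Fortran MOD(a, p) for integers."""
    if p == 0:
        # Stand-in for the runtime diagnostic mentioned above (hypothetical text).
        raise ZeroDivisionError("MOD with zero denominator")
    # INT(a/p) truncates toward zero, unlike Python's floor division.
    q = abs(a) // abs(p)
    if (a < 0) != (p < 0):
        q = -q
    return a - q * p
```

The result takes the sign of the first argument, which is where this differs from Python's `%` operator.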
Differential Revision: https://reviews.llvm.org/D131192
Craig Topper [Fri, 5 Aug 2022 19:41:28 +0000 (12:41 -0700)]
[RISCV] Don't use li+sh3add for constants that can use lui+add.
If we're adding a constant that can't use addi we try a few tricks,
one of which is using li+sh3add. We should not do this if lui+add
would work. For example adding 8192. Using sh3add prevents folding
a sext.w to form addw, thus increasing instruction count.
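A simplified sketch of the selection order described above, assuming three hypothetical predicates; the real backend logic is more involved. It only shows that a constant such as 8192 qualifies for both lui+add and li+sh3add, so the order of the checks matters.

```python
def fits_simm12(c: int) -> bool:
    # addi takes a 12-bit signed immediate.
    return -2048 <= c <= 2047

def fits_lui(c: int) -> bool:
    # lui materializes a sign-extended 32-bit value whose low 12 bits are zero.
    return c % 4096 == 0 and -(1 << 31) <= c < (1 << 31)

def fits_li_sh3add(c: int) -> bool:
    # Hypothetical model: c == 8 * k where k can be loaded by a single li (addi).
    return c % 8 == 0 and fits_simm12(c // 8)

def pick_add_strategy(c: int) -> str:
    if fits_simm12(c):
        return "addi"
    if fits_lui(c):          # checked before the sh3add trick
        return "lui+add"
    if fits_li_sh3add(c):
        return "li+sh3add"
    return "other"
```

With this ordering, `pick_add_strategy(8192)` selects lui+add even though `fits_li_sh3add(8192)` also holds.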
Tobias Hieta [Fri, 5 Aug 2022 19:44:56 +0000 (21:44 +0200)]
[llvm][macos] Fix usage of std::shared_mutex on old macOS SDK versions
When setting CMAKE_CXX_STANDARD to 17 and targeting a macOS version
under 10.12 the ifdefs would try to use std::shared_mutex because
of the C++ standard. This should also check the targeted SDK.
See discussion in: https://reviews.llvm.org/D130689
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D131063
Rashmi Mudduluru [Fri, 5 Aug 2022 19:35:44 +0000 (12:35 -0700)]
fixes clang-tidy/checks/list.rst: a line was accidentally removed in
95a92995d45fc6fada43ecd91eba3e7aea90487a
Ben Langmuir [Tue, 2 Aug 2022 20:26:01 +0000 (13:26 -0700)]
[clang][modules] Don't depend on sharing FileManager during module build
Sharing the FileManager between the importer and the module build should
only be an optimization. Add a cc1 option -fno-modules-share-filemanager
to allow us to test this. Fix the path to modulemap files, which
previously depended on the shared FileManager when using path mapped to
an external file in a VFS.
Differential Revision: https://reviews.llvm.org/D131076
Ben Langmuir [Fri, 5 Aug 2022 17:56:56 +0000 (10:56 -0700)]
[clang] Fix redirection behaviour for cached FileEntryRef
In 6a79e2ff1989b we changed FileManager::getEntryRef() to return the
redirecting FileEntryRef instead of looking through the redirection.
This commit fixes the case when looking up a cached file path to also
return the redirecting FileEntryRef. This mainly affects the behaviour
of calling getNameAsRequested() on the resulting entry ref.
Differential Revision: https://reviews.llvm.org/D131273
Jack Kirk [Fri, 5 Aug 2022 18:41:47 +0000 (11:41 -0700)]
[CUDA] Fixed sm version constraint for __bmma_m8n8k128_mma_and_popc_b1.
As stated in
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma:
".and operation in single-bit wmma requires sm_80 or higher."
tra@: Fixed a bug in builtins-nvptx-mma.py test generator and regenerated the tests.
Differential Revision: https://reviews.llvm.org/D131265
Philip Reames [Fri, 5 Aug 2022 19:08:03 +0000 (12:08 -0700)]
[RISCVInsertVSETVLI] Remove an unsound optimization
This fixes a bug reported privately by @craig.topper. Here's an example which illustrates the problem:
vsetivli a1, a0, e32, m1, ta, mu # both DefInfo and PrevInfo
vsetivli a2, a1, e32, m4, ta, mu
With the unsound result being:
vsetivli a1, a0, e32, m1, ta, mu
vsetivli a2, a0, e32, m4, ta, mu
Consider the case where this is running on a machine with VLEN=512. For this case, the VLMAXs are 16 and 64 respectively.
Consider a0 = 33. The correct result is: a1 = 16, and a2 = 16
After the unsound optimization: a1 = 16 and a2 = 33
This particular example used VLMAXs which differed by more than a power of two. With a difference of only one power of two, there's another form of this bug which involves the AVL < 2 x VLMAX special case, but that one's more complicated to construct as many examples turn out accidentally sound.
This patch takes the approach of simply removing the unsound optimization, but there are multiple sound sub-cases of it. I plan to return to at least a couple of them, but figured it was cleaner to remove the unsound optimization (for ease of backporting), and then review the new optimizations on their own.
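The numbers in the example can be reproduced with a small model. This is a sketch that assumes vl = min(AVL, VLMAX); that holds here because 33 >= 2 x 16, but it ignores the implementation-defined choices the ISA permits when VLMAX < AVL < 2 x VLMAX. The function names are hypothetical.

```python
def vlmax(vlen: int, sew: int, lmul: int) -> int:
    # VLMAX = (VLEN / SEW) * LMUL for integer LMUL.
    return vlen // sew * lmul

def vset_vl(avl: int, vlmax_: int) -> int:
    # Simplified rule: the granted vl never exceeds VLMAX.
    return min(avl, vlmax_)

VLEN = 512
a0 = 33

# Correct sequence: the second instruction consumes a1.
a1 = vset_vl(a0, vlmax(VLEN, 32, 1))
a2_correct = vset_vl(a1, vlmax(VLEN, 32, 4))

# Unsound rewrite: the second instruction consumes a0 directly.
a2_unsound = vset_vl(a0, vlmax(VLEN, 32, 4))
```

The two values of a2 differ (16 versus 33), which is why forwarding the AVL across a change in VLMAX is not safe in general.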
Differential Revision: https://reviews.llvm.org/D131264
Zhaoshi Zheng [Fri, 8 Jul 2022 18:48:44 +0000 (11:48 -0700)]
[WinEH][ARM64] Split Unwind Info for Functions Larger than 1MB
Create function segments and emit unwind info for them.
A segment must be less than 1MB and no prolog or epilog is split between two
segments.
This patch should generate correct, though not optimal, unwind info for large
functions. Currently it generates packed info (.pdata) only for functions
that are less than 1MB (single-segment functions). This is NFC from before this
patch.
The next step is to enable (.pdata) only unwind info for the first segment or
segments that have neither prolog nor epilog in a multi-segment function.
Another future work item is to further split segments that require more than 255
code words or have more than 65535 epilogs.
Reference:
https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling#function-fragments
Differential Revision: https://reviews.llvm.org/D130049
Slava Zakharin [Wed, 20 Jul 2022 03:39:58 +0000 (20:39 -0700)]
[flang] Propagate lowering options from driver.
This commit addresses concerns raised in D129497.
Propagate lowering options from driver to expression lowering
via AbstractConverter instance. A single use case so far is
using optimized TRANSPOSE lowering with O1/O2/O3.
bbc does not support optimization level switches, so it uses
default LoweringOptions (e.g. optimized TRANSPOSE lowering
is enabled by default, but an engineering -opt-transpose=false
option can still override this).
Differential Revision: https://reviews.llvm.org/D130204
Jonas Devlieghere [Fri, 5 Aug 2022 18:17:18 +0000 (11:17 -0700)]
[lldb] Improve EXC_RESOURCE exception reason
Jason noted that the stop message we print for a memory high water mark
notification (EXC_RESOURCE) could be clearer. Currently, the stop
reason looks like this:
* thread #3, queue = 'com.apple.CFNetwork.LoaderQ', stop reason =
EXC_RESOURCE RESOURCE_TYPE_MEMORY (limit=14 MB, unused=0x0)
It's hard to read the message because the exception and the type
(EXC_RESOURCE RESOURCE_TYPE_MEMORY) blend together. Additionally, the
"unused=0x0" should not be printed for memory limit exceptions.
I wanted to continue to include the resource type from
<kern/exc_resource.h> while also explaining what it actually is. I used
the wording from the comments in the header. With this patch, the stop
reason now looks like this:
* thread #5, stop reason = EXC_RESOURCE (RESOURCE_TYPE_MEMORY: high
watermark memory limit exceeded) (limit=14 MB)
rdar://40466897
Differential revision: https://reviews.llvm.org/D131130
Jeff Bailey [Fri, 5 Aug 2022 06:24:43 +0000 (06:24 +0000)]
[libc] Update look and feel of libc.llvm.org
This design is borrowed from the lldb folks (thank you!) to declutter
the page.
* The version number at the top is removed.
* Links are pushed over to a sidebar
* The sidebar has headings
There are other minor changes:
* The warning about this project not being ready is now an RST "warning"
* Links to the Bug Reports and the Source Code are added
* Refer to this project as either "The LLVM C Library" or "The libc"
Tested:
Built locally
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D131242
Jim Ingham [Wed, 27 Jul 2022 16:27:51 +0000 (09:27 -0700)]
Reapply the commits to enable accurate hit-count detection for watchpoints.
This commit combines the initial commit (7c240de609af), a fix for x86_64 Linux
(3a0581501e76) and a fix for a thinko in a last-minute rewrite that I really
should have run the testsuite on.
Also, make sure that all the "I need to step over watchpoint" plans execute
before we call a public stop. Otherwise, e.g. if you have N watchpoints and
a Signal, the signal stop info will get us to stop with the watchpoints in a
half-done state.
Differential Revision: https://reviews.llvm.org/D130674
Eugene Zhulenev [Fri, 5 Aug 2022 17:35:39 +0000 (10:35 -0700)]
[mlir] Use SymbolUserOpInterface in LLVM::AddressOfOp verifier
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D131271
Lei Zhang [Fri, 5 Aug 2022 16:24:14 +0000 (12:24 -0400)]
[mlir][spirv] Add default Vulkan memory space to storage class mapping
Reviewed By: ThomasRaoux, kuhar
Differential Revision: https://reviews.llvm.org/D131128
Lei Zhang [Fri, 5 Aug 2022 16:10:38 +0000 (12:10 -0400)]
[mlir][spirv] Add a pass to map memref memory space
MemRef types now can carry an attribute to represent the memory
space. Still, upper layers in the compilation stack mostly use
numeric values. They don't mean much (other than differentiating
separate memory domains) in MLIR's multi-level settings. Those
numeric memory spaces inside MemRef types need to be translated
into concrete SPIR-V storage classes during lowering to pin down
to concrete memory types.
Thus far we have been hardcoding an arbitrary mapping from memory
space to storage class for converting MemRef types. This works fine
for only targeting Vulkan; it falls apart if we want to target other
SPIR-V consumers like OpenCL, as different consumers might want
different storage classes for the buffer/variable of the same
lifetime. For example, StorageBuffer in Vulkan vs. CrossWorkgroup
in OpenCL.
So this puts up a new pass to let the user control how to map
MemRef memory spaces into SPIR-V storage classes. This provides
more flexibility and can address the awkwardness in the current
SPIR-V type converter. This pass should be the preliminary step
towards lowering MemRef related types/ops into SPIR-V.
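The idea of a client-controlled mapping can be sketched as a lookup table per SPIR-V consumer. Everything below is illustrative: the numeric memory spaces, the table contents, and the function name are assumptions, not the mapping the pass ships with.

```python
# Hypothetical per-consumer tables: numeric memory space -> SPIR-V storage class.
VULKAN_MAPPING = {0: "StorageBuffer", 3: "Workgroup"}
OPENCL_MAPPING = {0: "CrossWorkgroup", 3: "Workgroup"}

def map_memory_space(space: int, mapping: dict) -> str:
    """Translate a numeric memory space using a caller-supplied table."""
    try:
        return mapping[space]
    except KeyError:
        # Refuse to guess: an unmapped space is a conversion failure.
        raise ValueError(f"no storage class for memory space {space}") from None
```

The same numeric space resolves to a different storage class depending on which table the caller passes, which is the flexibility a hardcoded mapping cannot offer.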
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D130317
Sanjay Patel [Fri, 5 Aug 2022 15:51:59 +0000 (11:51 -0400)]
[InstSimplify] make uses of isImpliedCondition more efficient (NFCI)
As suggested in the post-commit comments for 019d76196f79fcff3c148,
this makes the usage symmetric with the 'and' patterns and should
be more efficient.
Paul Walker [Fri, 29 Jul 2022 17:49:10 +0000 (18:49 +0100)]
[SVE] Expand DUPM patterns to handle all integer vector types.
NOTE: i8 vector splats are ignored because the immediate range of
DUP already has full coverage.
Differential Revision: https://reviews.llvm.org/D131078
Than McIntosh [Fri, 5 Aug 2022 12:16:17 +0000 (08:16 -0400)]
tsan: fix bug in shadow reset introduced in D128909
Correct a bug in the code that resets shadow memory introduced as part
of a previous change for the Go race detector (D128909). The bug was
that only the most recently added shadow segment was being reset, as
opposed to the entire extent of the segment created so far. This
fixes a bug identified in Google internal testing (b/240733951).
Differential Revision: https://reviews.llvm.org/D131256
Sanjay Patel [Fri, 5 Aug 2022 14:59:09 +0000 (10:59 -0400)]
[InstSimplify] use isImpliedCondition() instead of semi-duplicated code
We get a couple of improvements from recognizing swapped
operand patterns that were not handled by the replicated
code.
This should also enable simplifying larger patterns as
seen in issue #56653 and issue #56654, but that requires
enhancements to isImpliedCondition() itself.
Filipp Zhinkin [Fri, 5 Aug 2022 14:20:59 +0000 (10:20 -0400)]
[x86] add tests for bitwise logic of funnel shifts; NFC
Baseline tests for D130994
Nikita Popov [Tue, 2 Aug 2022 13:34:42 +0000 (15:34 +0200)]
Revert "[compiler-rt][CMake] Enable TF intrinsics on powerpc32 Linux"
As mentioned in https://reviews.llvm.org/D121379#3690593, this
change broke the build of compiler-rt targeting powerpc using GCC.
The 32-bit powerpc target is not supposed to emit 128-bit libcalls
-- if it does, then that's a backend bug and needs to be fixed there.
This reverts commit 8f24a56a3a9363f353c8da318d97491a6818781d.
Differential Revision: https://reviews.llvm.org/D130988
Tue Ly [Mon, 1 Aug 2022 13:57:29 +0000 (09:57 -0400)]
[libc] Implement sincosf function correctly rounded to all rounding modes.
Refactor common range reductions and evaluations for sinf, cosf, and
sincosf. Added exhaustive tests for sincosf.
Performance before the patch:
```
System LIBC reciprocal throughput : 30.205
LIBC reciprocal throughput : 30.533
System LIBC latency : 67.961
LIBC latency : 61.564
```
Performance after the patch:
```
System LIBC reciprocal throughput : 30.409
LIBC reciprocal throughput : 20.273
System LIBC latency : 67.527
LIBC latency : 61.959
```
Reviewed By: orex
Differential Revision: https://reviews.llvm.org/D130901
Mirko Brkusanin [Thu, 4 Aug 2022 17:24:31 +0000 (19:24 +0200)]
[AMDGPU] Remove unused MIMG tablegen variants
There are no AMDGPUSampleVariant versions for _G16, it is treated more like a
modifier for derivatives (_D) (also for intrinsics where it is overloaded type
instead of part of intrinsic name) so we ended up making more variants for
these instructions than we actually needed.
32-bit derivatives need 6 dwords at most, while 16-bit need 4 at most. Using
same AMDGPUSampleVariant for both, we ended up creating 2 extra variants per
instruction than were necessary.
In total this deletes 260 unused tablegen records.
Differential Revision: https://reviews.llvm.org/D131252
Aaron Ballman [Fri, 5 Aug 2022 13:16:37 +0000 (09:16 -0400)]
Removing redundant code; NFC
The same predicate is checked on line 12962 just above the removed code.
Alexander Belyaev [Fri, 5 Aug 2022 12:53:35 +0000 (14:53 +0200)]
Revert "[mlir] Extract offsets-sizes-strides computation from `makeTiledShape(s)`."
This reverts commit 56d94b3b902e21ff79b1ce9a6fb606a3f7c1c4db.
Dawid Jurczak [Thu, 4 Aug 2022 16:26:18 +0000 (18:26 +0200)]
[NFC] Add SmallVector constructor to allow creation of SmallVector<T> from ArrayRef of items convertible to type T
Extracted from https://reviews.llvm.org/D129781 and address comment:
https://reviews.llvm.org/D129781#3655571
Differential Revision: https://reviews.llvm.org/D130268
David Green [Fri, 5 Aug 2022 10:19:37 +0000 (11:19 +0100)]
[ConstProp] Don't fall through for poison constants on vctp and active_lane_mask.
Given a poison constant as input, the dyn_cast to a ConstantInt would
fail so we would fall through to the generic code that attempts to fold
each element of the input vectors. The inputs to these intrinsics are
not vectors though, leading to a compile time crash. Instead bail out
properly for poison values by returning nullptr. This doesn't try to
define what poison means for these intrinsics.
Fixes #56945
David Spickett [Wed, 20 Jul 2022 15:55:36 +0000 (15:55 +0000)]
[llvm][IROutliner] Account for return void in sort comparator
This fixes 69 llvm tests that failed when EXPENSIVE_CHECKS was enabled.
llvm/test/Transforms/IROutliner/outlining-commutative-operands-opposite-order.ll
is one example.
When we have EXPENSIVE_CHECKS, _GLIBCXX_DEBUG is defined. This means
that libstdc++ will call the compare function to check if it is
implemented correctly (that !(a < a) is true).
This happens even if there is only one item and here, we expect
to see one return void or multiple return constant integer.
Don't sort if we have 1 item, but do assert that it is the 1
ret void we expect. In the comparator, assert that neither
Value is a nullptr in case one ended up in the list somehow.
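The shape of that fix can be sketched as follows. This is an analogy rather than the IROutliner code: the function name is hypothetical, and the explicit irreflexivity check stands in for what a debug-mode standard library does on its own.

```python
import functools

def sort_return_values(values, less):
    """Sort with a strict 'less than' predicate, skipping trivial inputs."""
    # With zero or one item there is nothing to order, so the comparator
    # is never invoked (e.g. the lone 'ret void' case).
    if len(values) <= 1:
        return list(values)
    assert all(v is not None for v in values), "unexpected null value"
    # A valid strict ordering must satisfy !(a < a) for every element.
    assert all(not less(v, v) for v in values), "comparator is not irreflexive"

    def compare(a, b):
        if less(a, b):
            return -1
        if less(b, a):
            return 1
        return 0

    return sorted(values, key=functools.cmp_to_key(compare))
```

A comparator that cannot handle the single-item case is therefore never exercised on it.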
Reviewed By: AndrewLitteken
Differential Revision: https://reviews.llvm.org/D130230
Phoebe Wang [Fri, 5 Aug 2022 08:36:33 +0000 (01:36 -0700)]
[X86] Move getting module flag into `runOnMachineFunction` to reduce compile-time. NFCI
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D131245
Dimitry Andric [Thu, 4 Aug 2022 19:04:35 +0000 (21:04 +0200)]
[CMake] Find python before searching for python modules
In the top-level llvm `CMakeLists.txt`, we need to call
`find_package(Python3)` *before* including `config-ix.cmake`, otherwise
the latter will not be able to successfully search for python modules
using `find_python_module()`. Also set `LLVM_MINIMUM_PYTHON_VERSION`
before calling `find_package(Python3)`, moving it to `CMakeLists.txt`
from `HandleLLVMOptions.cmake`.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D131191
Chuanqi Xu [Fri, 5 Aug 2022 08:44:38 +0000 (16:44 +0800)]
[NFC] Requires x86-registered-target for test/pr56919.cpp
Balázs Kéri [Fri, 5 Aug 2022 08:05:34 +0000 (10:05 +0200)]
[clang][analyzer] Add more wide-character functions to CStringChecker
Support for functions wmempcpy, wmemmove, wmemcmp is added to the checker.
The same tests are copied that exist for the non-wide versions, with
non-wide functions and character types changed to the wide version.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D130470
Nathan James [Fri, 5 Aug 2022 07:42:51 +0000 (08:42 +0100)]
[clangd] Change the url for clang-tidy check documentation
In https://github.com/llvm/llvm-project/commit/6e566bc5523f743bc34a7e26f050f1f2b4d699a8, the directory structure of the documentation for clang-tidy checks was changed, however clangd wasn't updated.
Now all the links generated will point to old dead pages.
This updates clangd to use the new page structure.
Reviewed By: sammccall, kadircet
Differential Revision: https://reviews.llvm.org/D128379
wanglei [Fri, 5 Aug 2022 06:48:37 +0000 (14:48 +0800)]
[LoongArch] Implement more of the ABI
According to the description of the LoongArch ABI documentation,
(https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html)
the calling convention of LoongArch is almost the same as the RISCV's
(except for the vector part), so we borrow the implementation of RISCV.
This patch only guarantees the correctness of lp64d, because only the
part of lp64d is described in detail in the documentation.
Differential Revision: https://reviews.llvm.org/D130249
David Green [Fri, 5 Aug 2022 07:11:57 +0000 (08:11 +0100)]
[AArch64] Tone down the number of repeated fmov N2 scheduling tests. NFC
Chuanqi Xu [Fri, 5 Aug 2022 06:47:59 +0000 (14:47 +0800)]
[Coroutines] Remove lifetime intrinsics for spilled allocas in coroutine frames
Closing https://github.com/llvm/llvm-project/issues/56919
It is meaningless to preserve the lifetime markers for the spilled
allocas in the coroutine frames and it would block some optimizations
too.
David Green [Fri, 5 Aug 2022 06:48:42 +0000 (07:48 +0100)]
[AArch64][GlobalISel] Recognise some CCMPri
This is a simple addition to emitConditionalComparison, to match CCMP
with immediates using getIConstantVRegValWithLookThrough, letting it
select the CCMPri variants of the instructions.
Differential Revision: https://reviews.llvm.org/D131073
Xiang Li [Fri, 5 Aug 2022 06:05:46 +0000 (23:05 -0700)]
[NFC][HLSL] Fix build error caused by a missing typo update.
setHLSLFnuctionAttributes to setHLSLFunctionAttributes.
Differential Revision: https://reviews.llvm.org/D131240
Xiang Li [Fri, 5 Aug 2022 06:05:46 +0000 (23:05 -0700)]
[NFC][HLSL] Fix typo in CGHLSLRuntime.
Change setHLSLFnuctionAttributes to setHLSLFunctionAttributes.
Differential Revision: https://reviews.llvm.org/D131238
Austin Kerbow [Fri, 5 Aug 2022 05:51:39 +0000 (22:51 -0700)]
[AMDGPU] Pre-commit tests for D130797
Timm Bäder [Thu, 4 Aug 2022 10:53:06 +0000 (12:53 +0200)]
[clang] Consider array filler in MaybeElementDependentArrayFiller()
Any InitListExpr may have an array filler and since we may be evaluating
the array filler as well, we need to take into account that the array
filler expression might make the InitListExpr element dependent.
Fixes https://github.com/llvm/llvm-project/issues/56016
Differential Revision: https://reviews.llvm.org/D131155
Timm Bäder [Wed, 3 Aug 2022 15:05:07 +0000 (17:05 +0200)]
[clang][sema] Fix collectConjunctionTerms()
Consider:
A == 5 && A != 5
If A is 5, the old collectConjunctionTerms() would call itself again for
the LHS (which it ignores), then the RHS (which it also ignores) and
then just return without ever adding anything to the Terms array.
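The intended behaviour, flattening a tree of && operators into its leaf terms, can be sketched like this. It is not the clang implementation: expressions are modelled as nested tuples, and the function name is only borrowed for readability.

```python
def collect_conjunction_terms(expr, terms):
    """Append every leaf of a nested ('and', lhs, rhs) tree to terms."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "and":
        collect_conjunction_terms(expr[1], terms)
        collect_conjunction_terms(expr[2], terms)
    else:
        # A leaf is always recorded, so no operand is silently dropped.
        terms.append(expr)

terms = []
collect_conjunction_terms(("and", "A == 5", "A != 5"), terms)
```

After the call, `terms` holds both comparisons, so a later pass can see that they contradict each other.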
Differential Revision: https://reviews.llvm.org/D131070
Xiang Li [Mon, 2 May 2022 07:04:00 +0000 (00:04 -0700)]
[HLSL] clang codeGen for HLSLShaderAttr.
Translate HLSLShaderAttr to IR level.
1. Skip mangle for hlsl entry functions.
2. Add function attribute for hlsl entry functions.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D124752
Shilei Tian [Fri, 5 Aug 2022 03:54:07 +0000 (23:54 -0400)]
[NFC] Fix wrong header in `LibC.cpp`
Paul Kirth [Fri, 5 Aug 2022 01:39:01 +0000 (01:39 +0000)]
[llvm][ir] Add missing license to ProfDataUtils
We failed to add these in D128860 or D128858
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D131226
Florian Mayer [Thu, 4 Aug 2022 22:02:52 +0000 (15:02 -0700)]
[libunwind] undef NDEBUG for assert.h in tests.
This makes sure the assertions also get verified in optimized builds.
This matches what is already done in bad_unwind_info.pass.cpp.
Reviewed By: #libunwind, MaskRay
Differential Revision: https://reviews.llvm.org/D131210
Jeff Bailey [Fri, 5 Aug 2022 02:44:02 +0000 (02:44 +0000)]
[libc] Trivial implementation of std::optional
This class has only the minimum functionality in it to provide what the
TZ variable parsing needs. In particular, the standard makes guarantees
about how trivial the destructors are, throws an exception if it's used
incorrectly, etc. There are also missing features.
Tested:
Trivial testsuite added, and use in development.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D129920
jacquesguan [Wed, 3 Aug 2022 06:58:05 +0000 (14:58 +0800)]
[mlir][Math] Add constant folder for Atan2Op.
This patch adds a constant folder for Atan2Op which only supports single- and double-precision floating-point types.
Differential Revision: https://reviews.llvm.org/D131050
Phoebe Wang [Fri, 5 Aug 2022 01:58:34 +0000 (09:58 +0800)]
Reland "[X86][MC] Always emit `rep` prefix for `bsf`"
`BMI` new instruction `tzcnt` has better performance than `bsf` on new
processors. Its encoding has a mandatory prefix '0xf3' compared to
`bsf`. If we force emit `rep` prefix for `bsf`, we will gain better
performance when the same code runs on new processors.
GCC already does it this way: https://c.godbolt.org/z/6xere6fs1
Fixes #34191
Reviewed By: craig.topper, skan
Differential Revision: https://reviews.llvm.org/D130956
Tue Ly [Thu, 4 Aug 2022 20:04:14 +0000 (16:04 -0400)]
[libc] Add subtraction for UInt<N> class.
Add subtraction operators (-, -=) for UInt<N> class.
Reviewed By: michaelrj, orex
Differential Revision: https://reviews.llvm.org/D131196
Walter Erquinigo [Thu, 4 Aug 2022 23:24:37 +0000 (16:24 -0700)]
[trace][intel pt] Support a new kernel section in LLDB’s trace bundle schema
Add a new "kernel" section with following schema.
```
"kernel": {
"loadAddress"?: decimal | hex string | string decimal
# This is optional. If it's not specified, use default address 0xffffffff81000000.
"file": string
# path to the kernel image
}
```
Here are more details of the diff:
- If the "kernel" section exists, it means the current tracing mode is //KernelMode//.
- If tracing mode is //KernelMode//, the "processes" section must be empty and the "kernel" and "cpus" section must be provided. This is tested with `TestTraceLoad`.
- "kernel" section is parsed and turned into a new process with a single module which is the kernel image. The kernel process has N fake threads, one for each cpu.
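The rules above can be sketched as a small validator. The field names and the default address come from the schema; the function name, the error messages, and the decision to return a plain dict are assumptions for illustration, not LLDB's parser.

```python
DEFAULT_KERNEL_LOAD_ADDRESS = 0xFFFFFFFF81000000

def parse_kernel_section(bundle: dict):
    """Return {'file', 'loadAddress'} for kernel mode, or None otherwise."""
    kernel = bundle.get("kernel")
    if kernel is None:
        return None  # not kernel mode
    if bundle.get("processes"):
        raise ValueError('"processes" must be empty in kernel mode')
    if not bundle.get("cpus"):
        raise ValueError('"cpus" must be provided in kernel mode')
    raw = kernel.get("loadAddress", DEFAULT_KERNEL_LOAD_ADDRESS)
    # Accept an integer, a hex string ("0x...") or a decimal string.
    load_address = int(raw, 0) if isinstance(raw, str) else int(raw)
    return {"file": kernel["file"], "loadAddress": load_address}
```

A bundle that omits "loadAddress" falls back to the default address quoted in the schema.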
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D130805
Ellis Hoag [Fri, 29 Jul 2022 21:49:44 +0000 (14:49 -0700)]
[InstrProf][attempt 2] Add new format for -fprofile-list=
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.
Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.
This was originally landed as https://reviews.llvm.org/D130808 but was
reverted due to a Windows test failure.
Differential Revision: https://reviews.llvm.org/D131195
Richard Smith [Fri, 5 Aug 2022 00:07:43 +0000 (17:07 -0700)]
Fix parsing of comma fold-expressions as the operand of a C-style cast.
Xiang Li [Sat, 23 Apr 2022 05:56:15 +0000 (22:56 -0700)]
[HLSL] Support -E option for HLSL.
-E option will set entry function for hlsl.
The format is -E entry_name.
To avoid conflict with existing option with name 'E', add an extra prefix '--'.
A new field HLSLEntry is added to TargetOption.
To share code with HLSLShaderAttr, the entry function will get the HLSLShaderAttr attribute added too.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D124751
Arthur Eubanks [Thu, 4 Aug 2022 23:32:34 +0000 (16:32 -0700)]
[test][llvm-reduce] Use opaque pointers in tests
Siva Chandra Reddy [Wed, 27 Jul 2022 21:24:07 +0000 (21:24 +0000)]
[libc][NFC] Add a few compiler warning flags.
A bunch of cleanup to suppress the new warnings is also done.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D130723
Kevin Gleason [Thu, 4 Aug 2022 23:22:50 +0000 (23:22 +0000)]
[MLIR] Fix arith.cmpi assembly syntax in the doc to match the implementation (NFC)
Kevin Gleason [Thu, 4 Aug 2022 23:15:07 +0000 (23:15 +0000)]
[MLIR] Fix arith.cmpf assembly syntax in the doc to match the implementation (NFC)
Matt Arsenault [Thu, 4 Aug 2022 02:44:51 +0000 (22:44 -0400)]
AMDGPU/clang: Remove dead code
The order has to be a constant and should be enforced by the builtin
definition. The fallthrough behavior would have been broken anyway.
There's still an existing issue/assert if you try to use garbage for the
ordering. The IRGen should be broken, but we also hit another assert
before that.
Fixes issue 56832
Leonard Chan [Thu, 4 Aug 2022 22:56:32 +0000 (22:56 +0000)]
Revert "[clang][Darwin] Always set the default C++ Standard Library to libc++"
This reverts commit c5ccb78ade8136134e0ca9dde64de97f913f0f8c.
We're seeing darwin-stdlib.cpp fail on our linux, mac, and windows
builders:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8806821020552676065/overview
Alex Langford [Thu, 4 Aug 2022 18:53:28 +0000 (11:53 -0700)]
Re-submit "[lldb] Filter DIEs based on qualified name where possible"
This reverts commit 967df65a3610f98a3bc0ec0f2303641d7bad176c.
This fixes test/Shell/SymbolFile/NativePDB/find-functions.cpp. When
looking up functions with the PDB plugins, if we are looking for a
full function name, we should use `GetName` to populate the `name`
field instead of `GetLookupName` since `GetName` has the more
complete information.
Fangrui Song [Thu, 4 Aug 2022 22:16:51 +0000 (15:16 -0700)]
[TTI] Change new getVectorInstrCost overload to use const reference after D131114
A const reference is preferred over a non-null const pointer.
`Type *` is kept as is to match the other overload.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D131197
Corentin Jabot [Thu, 4 Aug 2022 21:02:17 +0000 (23:02 +0200)]
[Clang] Fix capture of values initialized by bitfields
This fixes a regression introduced in 127bf44
Differential Revision: https://reviews.llvm.org/D131202
Ben Langmuir [Thu, 4 Aug 2022 21:20:44 +0000 (14:20 -0700)]
[orc-rt] Fix swift protocol metadata registration
The __swift5_proto and __swift5_protos sections had their meaning
inverted. Fix, and rename the arrays so it is more obvious which is
which.
Differential Revision: https://reviews.llvm.org/D131206
Sanjay Patel [Thu, 4 Aug 2022 21:41:19 +0000 (17:41 -0400)]
[ValueTracking] improve readability in isImpliedCond helper functions; NFC
This matches the caller code naming scheme and avoids the
potentially confusing transition from left/right to A/B.
Craig Topper [Thu, 4 Aug 2022 21:29:14 +0000 (14:29 -0700)]
[RISCV] Relax another one use restriction in performSRACombine.
When folding (sra (add (shl X, 32), C1), 32 - C) -> (shl (sext_inreg (add X, C1), i32), C)
it's possible that the add is used by multiple sras. We should
allow the combine if all the SRAs will eventually be updated.
After transforming all of the sras, the shls will share a single
(sext_inreg (add X, C1), i32).
This pattern occurs if an sra with 32 is used as index in multiple
GEPs with different scales. The shl from the GEPs will be combined
with the sra before we get a chance to match the sra pattern.
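The arithmetic behind the fold can be checked with a 64-bit model. This sketch makes two assumptions that the summary above leaves implicit: C1 has its low 32 bits clear, and the constant on the right-hand side is C1 >> 32. The helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def to_signed(value: int, bits: int) -> int:
    # Interpret the low 'bits' bits of value as a two's complement number.
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def before_fold(x: int, c1: int, c: int) -> int:
    # (sra (add (shl X, 32), C1), 32 - C)
    t = ((x << 32) + c1) & MASK64
    return (to_signed(t, 64) >> (32 - c)) & MASK64

def after_fold(x: int, c1: int, c: int) -> int:
    # (shl (sext_inreg (add X, C1 >> 32), i32), C)
    s = to_signed(x + (c1 >> 32), 32)
    return (s << c) & MASK64
```

Both forms agree for 0 <= C < 32 under those assumptions, because the shifted sum is an exact multiple of 2^32.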
Sanjay Patel [Thu, 4 Aug 2022 20:23:09 +0000 (16:23 -0400)]
[ValueTracking] reduce code in isImpliedCondICmps; NFC
This copies the implementation of the subsequent match with constants.
Louis Dionne [Thu, 4 Aug 2022 20:48:10 +0000 (16:48 -0400)]
[compiler-rt] Don't build builtins beyond macOS 10.7
It's not supported anyways, and now Clang complains about it since
we didn't support -stdlib=libc++ back then.
Aart Bik [Thu, 4 Aug 2022 18:49:00 +0000 (11:49 -0700)]
[mlir][sparse] fix bug in complex zero detection
We were checking real-part twice, not real/imag-part.
The new test only passes after the bug fix.
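A minimal sketch of this class of bug, with hypothetical function names; it is not the MLIR code, only the shape of the mistake.

```python
def is_complex_zero_buggy(re: float, im: float) -> bool:
    # Bug: the real part is tested twice and the imaginary part is ignored.
    return re == 0.0 and re == 0.0

def is_complex_zero(re: float, im: float) -> bool:
    # Fix: a complex value is zero only if both parts are zero.
    return re == 0.0 and im == 0.0
```

A purely imaginary value such as 0 + 1i is treated as zero by the buggy check but not by the fixed one, which is the kind of input a regression test needs.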
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D131190
Arthur Eubanks [Wed, 3 Aug 2022 03:23:03 +0000 (20:23 -0700)]
[InstrProf] Set prof global variables to internal linkage if adding a comdat
COFF has a verifier check that private global variables don't have a comdat of the same name.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D131043
Mingming Liu [Wed, 3 Aug 2022 21:20:30 +0000 (14:20 -0700)]
[AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation.
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.
Differential Revision: https://reviews.llvm.org/D131114
Mats Petersson [Fri, 29 Apr 2022 14:24:37 +0000 (15:24 +0100)]
Prepare for inlining of SUM intrinsic
Find calls to FortranASum{Real8,Integer4}, check for dim and mask
arguments being absent - then produce an inlineable simple
version of the sum function.
(No longer a prototype, please review for push to llvm/main - not sure how to make Phabricator update the review with actual commit message)
Reviewed By: peixin, awarzynski
Differential Revision: https://reviews.llvm.org/D125407
David Green [Thu, 4 Aug 2022 19:52:26 +0000 (20:52 +0100)]
[AArch64] Add some extra GlobalISel CCMP tests coverage. NFC
Johannes Doerfert [Thu, 4 Aug 2022 18:13:40 +0000 (13:13 -0500)]
[Attributor][FIX] Deal with implicit `undef` in AAPotentialConstantValues.
In contrast to AAPotentialValues, the constant values version can
contain implicit `undef` in the set. We had an assertion that could
misfire before. Handle it properly now.
Krzysztof Drewniak [Wed, 6 Jul 2022 20:37:30 +0000 (20:37 +0000)]
[mlir][AMDGPU] Explicitly truncate memory addresses in buffer ops
As a precaution, truncate memory addresses passed to kernels to 48 bits,
since bits 48-63 of the buffer descriptor are used for the stride field
and, on gfx10, to control swizzling.
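The truncation itself is a simple mask. The sketch below shows only the arithmetic; the constant and function names are hypothetical, and the descriptor layout is taken from the description above.

```python
ADDRESS_BITS = 48
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1

def truncate_address(address: int) -> int:
    # Keep bits 0-47 and clear bits 48-63 so they cannot leak into
    # the fields that share the upper part of the descriptor word.
    return address & ADDRESS_MASK
```

An address that already fits in 48 bits passes through unchanged.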
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D131016
Nico Weber [Thu, 4 Aug 2022 19:33:26 +0000 (15:33 -0400)]
[gn build] port 976f37050dbd more
Follow-up to commit 51d84737b5a.
Marc Auberer [Thu, 4 Aug 2022 19:14:12 +0000 (19:14 +0000)]
[Docs] Fix missing docs strings for CallingConv.h
Replaces
```
//
```
with
```
///
```
for some code lines to make it visible in the auto-generated documentation.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D131152
Tue Ly [Wed, 3 Aug 2022 19:42:05 +0000 (15:42 -0400)]
[libc] Prevent overflow from intermediate results when adding UInt<N> values.
Prevent overflow from intermediate results when adding UInt<N> values.
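One common way to avoid intermediate overflow in multi-word addition is to detect the carry after each wrapped step, instead of forming x + y + carry in a single fixed-width operation. The sketch below emulates 64-bit words in Python; it assumes little-endian word order and is not the libc `UInt<N>` code.

```python
WORD_MASK = (1 << 64) - 1

def add_words(a, b):
    """Add two equal-length little-endian lists of 64-bit words."""
    result = []
    carry = 0
    for x, y in zip(a, b):
        partial = (x + y) & WORD_MASK         # wraps like a uint64_t add
        carry_xy = partial < x                # wrapped, so a carry occurred
        total = (partial + carry) & WORD_MASK
        carry_in = total < partial            # adding the old carry wrapped
        result.append(total)
        carry = 1 if (carry_xy or carry_in) else 0
    return result, carry
```

At most one of the two checks can fire for a given word, so the outgoing carry is always 0 or 1.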
Reviewed By: orex
Differential Revision: https://reviews.llvm.org/D131095
Sanjay Patel [Sun, 31 Jul 2022 21:42:58 +0000 (17:42 -0400)]
[InstSimplify] add tests for or-of-icmps; NFC
Alex Langford [Thu, 4 Aug 2022 18:51:47 +0000 (11:51 -0700)]
Revert "[lldb] Filter DIEs based on qualified name where possible"
This reverts commit befa77e59a7760d8c4fdd177b234e4a59500f61c.
Looks like this broke a SymbolFileNativePDB test. I'll investigate and
resubmit with a fix soon.
Shilei Tian [Thu, 4 Aug 2022 18:48:07 +0000 (14:48 -0400)]
[OpenMP] Fix the test case issue that printf cannot be used in target region for AMDGPU
Mehdi Amini [Thu, 4 Aug 2022 18:47:28 +0000 (18:47 +0000)]
Revert "[mlir][test] Fix IR/AttributeTest.cpp compilation on Solaris"
This reverts commit 07aaa35f74d845a20d48e644671dce150ebf7748.
This breaks the Windows bot, and while the fix addressed the
`char`/`int8_t` case, it does not make sense for other cases like
`float`.
Fangrui Song [Thu, 4 Aug 2022 18:47:52 +0000 (11:47 -0700)]
[ELF] Parallelize input section initialization
This implements the last step of
https://discourse.llvm.org/t/parallel-input-file-parsing/60164 for the ELF port.
For an ELF object file, we previously did: parse, (parallel) initializeLocalSymbols, (parallel) postParseObjectFile.
Now we do: parse, (parallel) initSectionsAndLocalSyms, (parallel) postParseObjectFile.
initSectionsAndLocalSyms does most of input section initialization.
The sequential `parse` does SHT_ARM_ATTRIBUTES/SHT_RISCV_ATTRIBUTES/SHT_GROUP initialization for now.
Performance linking some programs with --threads=8 (glibc 2.33 malloc and mimalloc):
* clang: 1.05x as fast with glibc malloc, 1.03x as fast with mimalloc
* chrome: 1.04x as fast with glibc malloc, 1.03x as fast with mimalloc
* internal search program: 1.08x as fast with glibc malloc, 1.05x as fast with mimalloc
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D130810
Shilei Tian [Thu, 4 Aug 2022 18:37:47 +0000 (14:37 -0400)]
[OpenMP][DeviceRTL] Implement libc function `memcmp`
We will add simple implementations of some libc functions starting with
this patch. The first one is `memcmp`, which is reported in #56929. Note that
`malloc` and `free` are not included in this patch because of the use of
`declare variant`. In the near future we will implement the two functions without
using any vendor-provided function.
This fixes #56929.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D131182
LLVM GN Syncbot [Thu, 4 Aug 2022 18:26:38 +0000 (18:26 +0000)]
[gn build] Port 4038c859e58c
Craig Topper [Thu, 4 Aug 2022 18:16:27 +0000 (11:16 -0700)]
[RISCV] Relax a one-use restriction in performSRACombine
When folding (sra (add (shl X, 32), C1), 32 - C) -> (shl (sext_inreg (add X, C1)), C),
ignore the use count on the (shl X, 32).
The sext_inreg after the transform is free. So we're only making
2 new instructions, the add and the shl. So we only need to be
concerned with replacing the original sra+add. The original shl
can have other uses. This helps if there are multiple different
constants being added to the same shl.
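As a sanity check on the fold above, the sketch below compares both sides on 64-bit integers. It assumes that C1 has its low 32 bits clear, so the constant added on the right-hand side is C1 >> 32; the commit message does not spell this out, and it is an assumption of this example. The helper names `sra`, `beforeFold`, and `afterFold` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right written so it does not depend on how the compiler
// shifts negative values: for negative v, ~v is non-negative.
inline std::int64_t sra(std::int64_t v, unsigned s) {
  return v < 0 ? ~(~v >> s) : v >> s;
}

// Left-hand side: (sra (add (shl X, 32), C1), 32 - C), for 0 <= c < 32.
inline std::int64_t beforeFold(std::uint64_t x, std::uint64_t c1, unsigned c) {
  const std::uint64_t sum = (x << 32) + c1;  // wraps modulo 2^64
  return sra(static_cast<std::int64_t>(sum), 32 - c);
}

// Right-hand side: add the shifted-down constant, sign-extend the low
// 32 bits (sext_inreg), then shift left by C.
// Assumption of this sketch: the low 32 bits of c1 are zero.
inline std::int64_t afterFold(std::uint64_t x, std::uint64_t c1, unsigned c) {
  const std::uint32_t low = static_cast<std::uint32_t>(x + (c1 >> 32));
  const std::int64_t sext = static_cast<std::int32_t>(low);
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(sext) << c);
}
```

Under that assumption the two functions agree for every X, because (X << 32) + C1 equals (X + (C1 >> 32)) << 32 modulo 2^64.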
Alex Langford [Tue, 12 Jul 2022 15:51:30 +0000 (08:51 -0700)]
[lldb] Filter DIEs based on qualified name where possible
Context:
When setting a breakpoint by name, we invoke Module::FindFunctions to
find the function(s) in question. However, we use a Module::LookupInfo
to first process the user-provided name and figure out exactly what
we're looking for. When we actually perform the function lookup, we
search for the basename. After performing the search, we then filter out
the results using Module::LookupInfo::Prune. For example, given
a::b::foo we would first search for all instances of foo and then filter
out the results to just names that have a::b::foo in them. As one can
imagine, this involves a lot of debug info processing that we do not
necessarily need to be doing. Instead of doing one large post-processing
step after finding each instance of `foo`, we can filter them as we go
to save time.
Some numbers:
Debugging LLDB and placing a breakpoint on
llvm::itanium_demangle::StringView::begin without this change takes
approximately 70 seconds and resolves 31,920 DIEs. With this change,
placing the breakpoint takes around 30 seconds and resolves 8 DIEs.
Differential Revision: https://reviews.llvm.org/D129682
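The difference between the two strategies described above can be sketched with a toy index of qualified names. This is not LLDB code: `LookupResult`, `baseNameOf`, `searchThenPrune`, and `filterWhileSearching` are hypothetical, and the `resolved` counter stands in for the expensive debug-info processing done per DIE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct LookupResult {
  std::vector<std::string> matches;
  std::size_t resolved = 0;  // how many entries were "resolved" (the costly step)
};

// Last component of a qualified name, e.g. "a::b::foo" -> "foo".
inline std::string baseNameOf(const std::string &qualified) {
  const std::size_t pos = qualified.rfind("::");
  return pos == std::string::npos ? qualified : qualified.substr(pos + 2);
}

// Old approach: resolve every entry whose basename matches, then prune.
inline LookupResult searchThenPrune(const std::vector<std::string> &index,
                                    const std::string &wanted) {
  LookupResult result;
  const std::string base = baseNameOf(wanted);
  std::vector<std::string> candidates;
  for (const std::string &name : index) {
    if (baseNameOf(name) == base) {
      ++result.resolved;
      candidates.push_back(name);
    }
  }
  for (const std::string &name : candidates)
    if (name.find(wanted) != std::string::npos)
      result.matches.push_back(name);
  return result;
}

// New approach: reject entries by qualified name before resolving them.
inline LookupResult filterWhileSearching(const std::vector<std::string> &index,
                                         const std::string &wanted) {
  LookupResult result;
  const std::string base = baseNameOf(wanted);
  for (const std::string &name : index) {
    if (baseNameOf(name) != base)
      continue;
    if (name.find(wanted) == std::string::npos)
      continue;  // filtered out without paying the resolution cost
    ++result.resolved;
    result.matches.push_back(name);
  }
  return result;
}
```

Both functions return the same matches; only the number of resolved entries differs, which is the effect the timing numbers above describe.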
Arjun P [Thu, 4 Aug 2022 17:59:35 +0000 (18:59 +0100)]
[MLIR][Presburger] SlowMPInt::gcd: fix crash when sizes differ
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D131186
Arjun P [Thu, 4 Aug 2022 17:42:51 +0000 (18:42 +0100)]
[MLIR][Presburger] fourier-motzkin: check if all LCMs are 1 using a bool instead of by multiplying them
The product can easily overflow, and such unsigned overflows can produce incorrect results.
For example, the two LCMs could be 641 and 6700417, which multiply to 2^32 + 1, which overflows to 1.
Unsigned overflows already occur in the existing tests.
Also, when switching to arbitrary-precision arithmetic, this results in many
large-integer multiplications, causing a significant slowdown.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D131184
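The overflow example above can be reproduced with 32-bit unsigned arithmetic. The sketch below is illustrative only and is not the Presburger code; `allOnesByProduct` and `allOnesByFlag` are hypothetical names for the two ways of checking that every LCM is 1.

```cpp
#include <cassert>
#include <cstdint>

// Product-based check: multiply the LCMs and compare with 1.
// The multiplication wraps modulo 2^32, so it can report a false positive.
inline bool allOnesByProduct(std::uint32_t lcmA, std::uint32_t lcmB) {
  return lcmA * lcmB == 1u;
}

// Flag-based check: test each LCM directly, with no multiplication at all.
inline bool allOnesByFlag(std::uint32_t lcmA, std::uint32_t lcmB) {
  return lcmA == 1u && lcmB == 1u;
}
```

For 641 and 6700417 the product is 2^32 + 1, which wraps to 1, so the product-based check wrongly succeeds while the flag-based check correctly fails.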
Fangrui Song [Thu, 4 Aug 2022 18:09:40 +0000 (11:09 -0700)]
[ELF] Add makeThreadLocal/makeThreadLocalN and remove InputFile::localSymStorage
makeThreadLocal/makeThreadLocalN are moved from D130810 ([ELF] Parallelize input
section initialization) here to make D130810 more focused on the refactor:
* COFF has some needs for multiple linker contexts. D108850 partially removed
global states from lldCommon but left the global variable `lctx`.
* To the best of my knowledge, all multiple-linker-context feature requests for
ELF are motivated by user convenience, with no very strong argument.
* In practice, it is very difficult to remove global state from the ELF port
without introducing significant performance regressions or hurting code readability.
* Per-thread allocators from D122922/D123879 are too expensive and will not
really benefit ELF.
This patch adds a simple thread_local based makeThreadLocal to
lld/Common/Memory.h. It will enable further optimization in ELF.
Louis Dionne [Thu, 4 Aug 2022 18:06:31 +0000 (14:06 -0400)]
[libc++] Clarify comment in CI pipeline definition
This partially reverts commit 7d855bb8e133. The comments were actually
not outdated; they were simply unclear.
Louis Dionne [Thu, 4 Aug 2022 18:03:16 +0000 (14:03 -0400)]
[libc++][NFC] Remove outdated comment in CI pipeline definition
Konstantin Varlamov [Thu, 4 Aug 2022 17:57:58 +0000 (10:57 -0700)]
[libc++] Fix a hard error in `contiguous_iterator<NoOperatorArrowIter>`.
Evaluating `contiguous_iterator` on an iterator that satisfies all the
constraints except the `to_address` constraint and doesn't have
`operator->` defined results in a hard error. This is because
instantiating `to_address` ends up instantiating templates
dependent on the given type which might lead to a hard error even
in a SFINAE context.
Differential Revision: https://reviews.llvm.org/D130835
Nikolas Klauser [Thu, 4 Aug 2022 17:54:13 +0000 (10:54 -0700)]
[libc++][ranges] Implement `ranges::is_permutation`
Co-authored-by: Konstantin Varlamov <varconst@apple.com>
Differential Revision: https://reviews.llvm.org/D127194
Zakk Chen [Thu, 4 Aug 2022 17:34:05 +0000 (17:34 +0000)]
[RISCV][Clang] Support policy function for all vector segment loads.
We will switch all UndefValue to PoisonValue in follow up patches.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126750
Sam Estep [Thu, 4 Aug 2022 17:45:30 +0000 (17:45 +0000)]
[clang][dataflow] Analyze method bodies
This patch adds the ability to context-sensitively analyze method bodies, by moving `ThisPointeeLoc` from `DataflowAnalysisContext` to `Environment`, and adding code in `pushCall` to set it.
Reviewed By: ymandel, sgatev, xazax.hun
Differential Revision: https://reviews.llvm.org/D131170
lorenzo chelini [Tue, 19 Jul 2022 14:13:22 +0000 (16:13 +0200)]
[MLIR] TilingInterface: Avoid map when tile divides iteration domain
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D131080
Petr Hosek [Thu, 21 Jul 2022 08:12:08 +0000 (08:12 +0000)]
[clang-doc] Default to Standalone executor and improve documentation
This should provide a more intuitive usage consistent with other tools.
Differential Revision: https://reviews.llvm.org/D130226