Matthias Braun [Wed, 16 Feb 2022 23:01:37 +0000 (15:01 -0800)]
PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs
The `SplitIndirectBrCriticalEdges` function was originally designed for
`CodeGenPrepare` and skipped splitting of edges when the destination
block didn't contain any `PHI` instructions. This only makes sense when
the goal is to reduce COPYs, as in `CodeGenPrepare`. In the case of
`PGOInstrumentation` or `GCOVProfiling` it would result in missed
counters and wrong results in functions with computed goto.
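For context, a function with computed goto looks roughly like the sketch below. It relies on the GNU labels-as-values extension (accepted by Clang and GCC); the name `runOps` and the opcode numbering are made up for illustration. Each `goto *` typically lowers to an `indirectbr` whose successors are all three labels, and each label has several predecessors, so the edges between them are critical edges of the kind the instrumentation passes need split.

```cpp
// Illustrative only: a tiny interpreter using computed goto.
// Opcodes (hypothetical): 0 = increment, 1 = decrement, 2 = halt.
int runOps(const int *ops) {
  // Table of label addresses (GNU extension: &&label yields a void *).
  static void *const handlers[] = {&&inc, &&dec, &&halt};
  int acc = 0;
  goto *handlers[*ops++]; // indirect branch with three possible targets
inc:
  ++acc;
  goto *handlers[*ops++];
dec:
  --acc;
  goto *handlers[*ops++];
halt:
  return acc;
}
```

A profile that skips counters on these edges cannot tell how often `inc` was reached from `dec` as opposed to from itself.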
Differential Revision: https://reviews.llvm.org/D120096
Matthias Braun [Thu, 17 Feb 2022 21:57:06 +0000 (13:57 -0800)]
Simplify/cleanup BasicBlockUtilsTest
Clean up BasicBlockUtilsTest using C++ raw string literals, remove
duplicated block functions, and make smaller style changes.
Differential Revision: https://reviews.llvm.org/D120095
Fangrui Song [Thu, 24 Feb 2022 00:08:25 +0000 (16:08 -0800)]
[asan] Allow -fsanitize-address-globals-dead-stripping with -fno-data-sections for ELF
-fdata-sections decides whether global variables go into different sections.
This is orthogonal to whether we place their metadata (`.data` or `asan_globals`) into different sections.
With -fno-data-sections, `-fsanitize-address-globals-dead-stripping` can still:
* deduplicate COMDAT `asan.module_ctor` and `asan.module_dtor`
* (with ld --gc-sections): for a data section (e.g. `.data`), if all global variables defined relative to it are unreferenced, discard them and associated `asan_globals` sections (rare but no need to exclude this case)
Similar to c7b90947bd0179d914fea56b52be545c8f60f20a for PE/COFF.
Reviewed By: #sanitizers, kstoimenov, vitalybuka
Differential Revision: https://reviews.llvm.org/D120394
Amir Ayupov [Wed, 23 Feb 2022 06:54:15 +0000 (22:54 -0800)]
[BOLT][NFC] Fix undefined behavior in encodeAnnotationImm
Fix UBSan-reported issue in MCPlusBuilder::encodeAnnotationImm (left shift of a
negative value).
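As a reminder of why this is undefined behavior: before C++20, left-shifting a negative signed value has no defined result. A common way to avoid it, sketched here with a hypothetical helper (not the actual BOLT change), is to do the shift on the unsigned representation and convert back.

```cpp
#include <cstdint>

// Hypothetical helper: shift a possibly negative value left without
// invoking signed-shift undefined behavior. The shift happens on the
// unsigned bit pattern, where overflow simply discards high bits.
// Assumes amount < 64 and a two's complement conversion back to int64_t
// (guaranteed from C++20, and what mainstream compilers do before that).
constexpr int64_t shiftLeftSigned(int64_t value, unsigned amount) {
  return static_cast<int64_t>(static_cast<uint64_t>(value) << amount);
}
```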
Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/AArch64/MCPlusBuilderTester.Annotation/0 (1 of 140)
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.Annotation/0 (131 of 134)
```
Reviewed By: maksfb, yota9
Differential Revision: https://reviews.llvm.org/D120260
Owen Anderson [Fri, 18 Feb 2022 08:15:25 +0000 (00:15 -0800)]
Teach the AArch64 backend to instruction select the BCAX instruction.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D120112
Nico Weber [Wed, 23 Feb 2022 23:41:20 +0000 (18:41 -0500)]
Reland "unbreak Modules/cxx20-export-import.cpp with LLVM_APPEND_VC_REV after 32b73bc6ab82"
This reverts commit 5086cff04eec4327acc22a90466854ad4d89d570.
32b73bc6ab82 relanded in 1592d88aa7bc.
Dave Lee [Wed, 2 Feb 2022 18:45:19 +0000 (10:45 -0800)]
[FormatVariadic] Mark index as required in docstring
After looking at the formatv docstring, I thought the index was optional (as it
is in other languages). This changes the header docs to show `index` instead of
`[index]`, to indicate that it is required.
Differential Revision: https://reviews.llvm.org/D118833
Vladimir Vereschaka [Thu, 17 Feb 2022 01:20:39 +0000 (17:20 -0800)]
[CMake] Use CMAKE_SYSROOT to build libs for Win to ARM cross toolchain. NFC.
Provide CMAKE_SYSROOT for the libc++/libc++abi/libunwind libraries
instead of specific <foo>_SYSROOT for each of them.
Fixed passing some CMake arguments for the runtimes.
Referenced Differentials:
* https://reviews.llvm.org/D119836
* https://reviews.llvm.org/D112155
* https://reviews.llvm.org/D111672
Differential Revision: https://reviews.llvm.org/D120383
Nikolas Klauser [Wed, 23 Feb 2022 23:09:18 +0000 (00:09 +0100)]
[libc++] Add empty line in ReleaseNotes.rst
Joseph Huber [Wed, 23 Feb 2022 22:40:29 +0000 (17:40 -0500)]
[OpenMP][NFC] Address warnings and lint messages in CGOpenMPRuntime
Summary:
This patch addressed the warnings and linting messages for the
CGOpenMPRuntime.cpp file. This was causing some -Werror builds to fail.
Zahira Ammarguellat [Tue, 19 Oct 2021 16:12:57 +0000 (09:12 -0700)]
Add support for floating-point option `ffp-eval-method` and for
`pragma clang fp eval_method`.
Differential Revision: https://reviews.llvm.org/D109239
Vitaly Buka [Wed, 23 Feb 2022 22:29:09 +0000 (14:29 -0800)]
[NFC][hwasan] Check _GLIBCXX_RELEASE in test
Differential Revision: https://reviews.llvm.org/D119161
Michael Kruse [Tue, 22 Feb 2022 22:34:04 +0000 (16:34 -0600)]
[opt] Pin region viewer passes to legacy PM.
The RegionPrinter, RegionOnlyPrinter, RegionViewer and RegionOnlyViewer passes have not yet been ported to the new pass manager.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D119897
Vitaly Buka [Wed, 23 Feb 2022 22:05:00 +0000 (14:05 -0800)]
[HWASan] Use hwasan_memalign for aligned new.
Aligned new does not require size to be a multiple of alignment, so
memalign is the correct choice instead of aligned_alloc.
Fixes false reports for unaligned sizes.
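The situation can be reproduced with a request whose size is not a multiple of its alignment. The sketch below (C++17, hypothetical function name) makes such a request through aligned `operator new` and checks that the returned pointer is still suitably aligned.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Allocates `size` bytes with the given alignment through aligned
// operator new, reports whether the pointer honors the alignment,
// and frees the memory again. `size` need not be a multiple of
// `alignment`, which is the case the commit message describes.
bool alignedNewHonorsAlignment(std::size_t size, std::size_t alignment) {
  void *p = ::operator new(size, static_cast<std::align_val_t>(alignment));
  const bool aligned = reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
  ::operator delete(p, static_cast<std::align_val_t>(alignment));
  return aligned;
}
```

For example, `alignedNewHonorsAlignment(10, 64)` is a valid request even though 10 is not a multiple of 64.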
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D119161
minglotus-6 [Tue, 22 Feb 2022 20:07:43 +0000 (12:07 -0800)]
[SampleProf][Inliner] Add an option to turn off inliner in sample-profile pass.
Use case is offline evaluation (for inliner effectiveness) or debugging.
Differential Revision: https://reviews.llvm.org/D120344
Vitaly Buka [Wed, 23 Feb 2022 22:04:51 +0000 (14:04 -0800)]
[NFC][hwasan] Clang-format the file
Aaron Ballman [Wed, 23 Feb 2022 22:11:34 +0000 (17:11 -0500)]
Use function prototypes when appropriate; NFC
Reid Kleckner [Wed, 23 Feb 2022 22:05:26 +0000 (14:05 -0800)]
Fix more unused lambda capture warnings, NFC
Nikolas Klauser [Wed, 23 Feb 2022 22:05:22 +0000 (23:05 +0100)]
[libc++] Granularize chrono includes
Reviewed By: Quuxplusone, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D120141
Florian Mayer [Tue, 15 Feb 2022 19:41:52 +0000 (11:41 -0800)]
[HWASan] add test for debug info of allocas that don't need padding.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D119873
Reid Kleckner [Wed, 23 Feb 2022 22:01:01 +0000 (14:01 -0800)]
Fix unused lambda capture warning, NFC
Nicolas Miller [Tue, 22 Feb 2022 22:50:42 +0000 (14:50 -0800)]
[NVPTX] Add ex2.approx.f16/f16x2 support
This patch adds builtins and intrinsics for the f16 and f16x2 variants of the ex2
instruction.
These two variants were added in PTX7.0, and are supported by sm_75 and above.
Note that this isn't wired with the exp2 llvm intrinsic because the ex2
instruction is only available in its approx variant.
Running ptxas on the assembly generated by the test f16-ex2.ll works as
expected.
Differential Revision: https://reviews.llvm.org/D119157
Jakub Chlanda [Tue, 22 Feb 2022 22:45:19 +0000 (14:45 -0800)]
[NVPTX] Add more FMA intrinsics/builtins
This patch adds builtins/intrinsics for the following variants of FMA:
- f16, f16x2
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2
  - rn
  - rn_relu
ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly.
Differential Revision: https://reviews.llvm.org/D118977
Jakub Chlanda [Tue, 22 Feb 2022 22:42:15 +0000 (14:42 -0800)]
[NVPTX] Expose float tys min, max, abs, neg as builtins
Adds support for the following builtins:
- abs, neg:
  - .bf16,
  - .bf16x2
- min, max
  - {.ftz}{.NaN}{.xorsign.abs}.f16
  - {.ftz}{.NaN}{.xorsign.abs}.f16x2
  - {.NaN}{.xorsign.abs}.bf16
  - {.NaN}{.xorsign.abs}.bf16x2
  - {.ftz}{.NaN}{.xorsign.abs}.f32
Differential Revision: https://reviews.llvm.org/D117887
Joseph Huber [Wed, 23 Feb 2022 21:53:30 +0000 (16:53 -0500)]
[Clang][Docs] Add '-fopenmp-offload-mandatory' to command line reference
Philip Reames [Wed, 23 Feb 2022 21:48:03 +0000 (13:48 -0800)]
[SLP] Fastpath instructions not in block being scheduled [nfc]
Joseph Huber [Tue, 22 Feb 2022 21:15:34 +0000 (16:15 -0500)]
[OpenMP] Add option to make offloading mandatory
Currently when we generate OpenMP offloading code we always make
fallback code for the CPU. This is necessary for implementing features
like conditional offloading and ensuring that unhandled pragmas don't
result in missing symbols. However, this is problematic for a few cases.
For offloading tests we can silently fail to the host without realizing
that offloading failed. Additionally, this makes it impossible to
provide interoperability to other offloading schemes like HIP or CUDA
because those methods do not provide any such host fallback guarantee.
This patch adds the `-fopenmp-offload-mandatory` flag to prevent
generating the fallback symbol on the CPU and instead replaces the
function with a dummy global and the failed branch with 'unreachable'.
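A minimal target region of the kind affected might look like the sketch below; the function name is hypothetical. With the default host fallback, this code still computes the sum when no device is available (and also when built without OpenMP, where the pragma is ignored). Under `-fopenmp-offload-mandatory`, as described above, no host version of the region would be generated.

```cpp
// Sums `n` integers inside an OpenMP target region. The map clauses copy
// the input to the device and bring the reduction result back.
int sumOnDevice(const int *data, int n) {
  int sum = 0;
#pragma omp target teams distribute parallel for reduction(+ : sum) \
    map(to : data[0 : n]) map(tofrom : sum)
  for (int i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```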
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D120353
Philip Reames [Wed, 23 Feb 2022 21:42:49 +0000 (13:42 -0800)]
[SLP] Replace an impossible branch condition with an assert [NFC]
An entire bundle must be inside the scheduling window. Assert that this property holds as opposed to checking it at runtime.
Fangrui Song [Wed, 23 Feb 2022 21:35:22 +0000 (13:35 -0800)]
[sanitizer][sancov] Use pc-1 for s390x
The stack trace addresses may be odd (normally addresses should be even), but
this seems a good compromise when the instruction length (2,4,6) cannot be detected
easily.
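The reasoning can be checked with a small worked example. The types and names below are hypothetical, not sanitizer code: whatever the length of the call instruction (2, 4, or 6 bytes), the return address minus one falls inside that instruction, whereas the return address itself belongs to the next one.

```cpp
#include <cstdint>

// Half-open address range [begin, end) occupied by one instruction.
struct InstructionRange {
  uint64_t begin;
  uint64_t end;
};

constexpr bool contains(InstructionRange range, uint64_t pc) {
  return range.begin <= pc && pc < range.end;
}

// Address used for lookup: one byte before the return address. It may be
// odd, but it always lands inside the preceding (call) instruction.
constexpr uint64_t lookupPc(uint64_t returnAddress) {
  return returnAddress - 1;
}

// A 6-byte call at 0x1000 returns to 0x1006; 0x1005 is inside the call.
static_assert(contains({0x1000, 0x1006}, lookupPc(0x1006)), "pc-1 hits call");
static_assert(!contains({0x1000, 0x1006}, 0x1006), "return pc is past call");
```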
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D120432
Arthur Eubanks [Wed, 23 Feb 2022 21:26:32 +0000 (13:26 -0800)]
[clang] Remove Address::deprecated() from CGClass.cpp
Philip Reames [Wed, 23 Feb 2022 21:28:29 +0000 (13:28 -0800)]
[SLP] Make it clear ScheduleDataMap is keyed by instructions [NFC]
Fangrui Song [Wed, 23 Feb 2022 21:29:21 +0000 (13:29 -0800)]
[ELF][test] Fix edata-etext.s
Snehasish Kumar [Thu, 17 Feb 2022 22:44:49 +0000 (14:44 -0800)]
[instrprof] Rename the profile kind types to be more descriptive.
Based on the discussion in D115393, I've updated the names to be more
descriptive.
Reviewed By: ellis, MaskRay
Differential Revision: https://reviews.llvm.org/D120092
Philip Reames [Wed, 23 Feb 2022 21:08:18 +0000 (13:08 -0800)]
Revert "[SLP] Remove cap on schedule window size"
This reverts commit 6adf4b039e095224edbbecda5972e5e3353b53b6. Reverting while investigating https://github.com/llvm/llvm-project/issues/54029
Philip Reames [Wed, 23 Feb 2022 21:08:10 +0000 (13:08 -0800)]
Revert "[SLP] Simplify extendSchedulingRegion"
This reverts commit 8c85f3a0523070ef656e30e368df0a679c1400cd.
Shilei Tian [Wed, 23 Feb 2022 21:10:35 +0000 (16:10 -0500)]
[OpenMP][Offloading] Change N back to 256 in bug49334.cpp
Martin Storsjö [Tue, 25 Jan 2022 09:38:41 +0000 (09:38 +0000)]
[libcxx] [test] Fix time.get.byname get_one for Glibc and Windows
This matches the fixes for the wchar version in f081cc50372f9415ef4fa2204a4b7f54153af455.
Differential Revision: https://reviews.llvm.org/D120283
Craig Topper [Wed, 23 Feb 2022 20:35:06 +0000 (12:35 -0800)]
[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable.
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.
This patch passes them by value which is consistent with the vast
majority of code.
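The rationale is the usual one for small handle types. The sketch below uses a made-up `ValueHandle` (a pointer plus an index, loosely in the spirit of `SDValue` but not its real definition) to show why passing by value costs nothing extra for such a type.

```cpp
#include <type_traits>

struct Node; // opaque; only pointers to it are used here

// Hypothetical handle type: a node pointer plus a result index.
struct ValueHandle {
  Node *node = nullptr;
  unsigned resultNo = 0;
};

// Cheap to copy: trivially copyable and no larger than two pointers,
// so it can be passed by value (typically in registers).
static_assert(std::is_trivially_copyable<ValueHandle>::value, "cheap copy");
static_assert(sizeof(ValueHandle) <= 2 * sizeof(void *), "fits two words");

// By-value parameters also make it obvious the callee cannot modify
// the caller's handles.
bool refersToSameResult(ValueHandle a, ValueHandle b) {
  return a.node == b.node && a.resultNo == b.resultNo;
}
```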
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120420
Paweł Bylica [Wed, 23 Feb 2022 18:26:48 +0000 (19:26 +0100)]
[DAGCombine] Extend combineCarryDiamond()
In combineCarryDiamond() use getAsCarry() to find more candidates for being a carry flag.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D118362
Haojian Wu [Wed, 23 Feb 2022 20:34:20 +0000 (21:34 +0100)]
[pseudo] fix an out-of-bound error in LRTable.
Fix Windows debug build.
Jonas Devlieghere [Wed, 23 Feb 2022 19:52:20 +0000 (11:52 -0800)]
[lldb] Fix (unintentional) recursion in CommandObjectRegexCommand
Jim noticed that the regex command is unintentionally recursive. Let's
use the following command regex as an example:
(lldb) com regex humm 's/([^ ]+) ([^ ]+)/p %1 %2 %1 %2/'
If we call it with arguments foo bar, things behave as expected:
(lldb) humm foo bar
(...)
foo bar foo bar
However, if we include %2 in the arguments, things break down:
(lldb) humm fo%2o bar
(...)
fobaro bar fobaro bar
The problem is that the implementation of the substitution is too naive.
It substitutes the %1 token into the target template in place, then does
the %2 substitution starting with the resultant string. So if the
previous substitution introduced a %2 token, it would get processed in
the second sweep, etc.
This patch addresses the issue by walking the command once and
substituting the % variables in place.
(lldb) humm fo%2o bar
(...)
fo%2o bar fo%2o bar
Furthermore, this patch also reports an error if not enough variables
were provided and adds support for substituting %0.
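The single-pass idea can be sketched as follows. This is an illustration with a hypothetical helper, not lldb's implementation: it handles only single-digit tokens and leaves out-of-range tokens untouched instead of reporting an error. Substituted text is appended to the output and never rescanned, so a `%2` that arrives via a capture stays literal.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Replaces each %N in `tmpl` with captures[N] in one left-to-right walk.
// captures[0] is the whole match, captures[1..] are the groups.
std::string substituteOnce(const std::string &tmpl,
                           const std::vector<std::string> &captures) {
  std::string out;
  for (std::size_t i = 0; i < tmpl.size(); ++i) {
    const bool isToken =
        tmpl[i] == '%' && i + 1 < tmpl.size() &&
        std::isdigit(static_cast<unsigned char>(tmpl[i + 1])) != 0;
    if (!isToken) {
      out += tmpl[i];
      continue;
    }
    const std::size_t index = static_cast<std::size_t>(tmpl[i + 1] - '0');
    if (index < captures.size()) {
      out += captures[index]; // appended text is never scanned again
    } else {
      out += tmpl[i]; // unknown token: keep it verbatim in this sketch
      out += tmpl[i + 1];
    }
    ++i; // skip the digit
  }
  return out;
}
```

With the template `p %1 %2 %1 %2` and captures `fo%2o` and `bar`, the result is `p fo%2o bar fo%2o bar`, matching the corrected behavior shown above.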
rdar://81236994
Differential revision: https://reviews.llvm.org/D120101
Philip Reames [Wed, 23 Feb 2022 19:57:56 +0000 (11:57 -0800)]
[SLP] Rearrange fields in ScheduleData for density [NFC]
Stefan Pintilie [Wed, 23 Feb 2022 15:18:19 +0000 (09:18 -0600)]
[NFC][PowerPC] Fix the check-cpu.ll test case.
This test doesn't work because the CHECK-NOT line is actually checking
something that only exists on stderr and not stdout.
Changed the test so that we now check both stderr and stdout.
Changed the test so that we check pwr9, pwr10, and future. The cpu names of
power9 or power10 are not supported in the llc backend.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D120349
Emilio Cota [Wed, 23 Feb 2022 15:51:36 +0000 (10:51 -0500)]
[mlir] Add sectionMemoryMapper to ExecutionEngineOptions
By specifying a sectionMemoryMapper, users can control how
memory for JIT code is allocated.
In particular, I need this in order to use a named memory
region so that profilers such as perf(1) can correctly label
execution cycles coming from JIT'ed code.
Reviewed-by: ezhulenev
Differential Revision: https://reviews.llvm.org/D120415
Fangrui Song [Wed, 23 Feb 2022 19:51:30 +0000 (11:51 -0800)]
[Driver] Add -fno-sanitize-address-globals-dead-stripping
It's customary for these options to have the -fno- form which is sometimes
handy to work around issues. Using the supported driver option is preferred over
the internal cl::opt option `-mllvm -asan-globals-live-support=0`
Reviewed By: kstoimenov, vitalybuka
Differential Revision: https://reviews.llvm.org/D120391
Philip Reames [Wed, 23 Feb 2022 19:40:03 +0000 (11:40 -0800)]
[SLP] Remove SchedulingPriority from ScheduleData [NFC]
First step in trying to shrink the memory footprint of ScheduleData to improve cache locality.
Martin Liska [Wed, 23 Feb 2022 19:25:17 +0000 (11:25 -0800)]
[PATCH] ASAN: Align declaration with definition of a fn
Fixes:
https://bugs.llvm.org/show_bug.cgi?id=51641
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D115447
Xu Mingjie [Wed, 23 Feb 2022 19:15:57 +0000 (11:15 -0800)]
[TSan][NFC] fixup for comment of Shadow
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
Vitaly Buka [Wed, 23 Feb 2022 19:21:04 +0000 (11:21 -0800)]
Revert "[TSan][NFC] fixup for comment of Shadow"
Wrong author.
This reverts commit 6bff092e3ed4ae1f21b290f88cf7152cb331aa48.
Philip Reames [Wed, 23 Feb 2022 16:51:33 +0000 (08:51 -0800)]
[SLP] Simplify extendSchedulingRegion
This change uses instruction's comesBefore method to simplify the code significantly. There's little compile time concern here because getSpillCost already calls comesBefore on every basic block which contains a vectorization candidate. The only additional times we'll build basic block ordering is when we can't schedule a vector candidate anywhere in the containing block.
Differential Revision: https://reviews.llvm.org/D120364
Jinsong Ji [Wed, 23 Feb 2022 18:43:27 +0000 (13:43 -0500)]
[libc++][AIX] Fix trivial_abi return tests for unique_ptr/weak_ptr
The unique_ptr_ret and weak_ptr_ret tests are not expected to pass on
AIX. These tests check that unique_ptr and weak_ptr are returned by
value, but on AIX, all structs are always returned by reference.
```
3.9.6 Function Return Values
...
Note: Structures of any length and character strings longer than four
bytes are returned in a storage buffer allocated by the caller. The
address of this buffer is passed as a hidden first argument in GPR3,
which causes the first explicit argument word to be passed in GPR4. This
hidden argument is treated as a formal argument and corresponds to the
first word of the argument area.
```
Reviewed By: #powerpc, daltenty, #libc, Quuxplusone, philnik
Differential Revision: https://reviews.llvm.org/D119952
Augie Fackler [Fri, 11 Feb 2022 23:32:38 +0000 (18:32 -0500)]
AttributorAttributes: avoid a crashing on bad alignments
Prior to this change, LLVM would attempt to optimize an
aligned_alloc(33, ...) call to the stack. This flunked an assertion when
trying to emit the alloca, which crashed LLVM. Avoid that with extra
checks.
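The problem with `aligned_alloc(33, ...)` is that 33 is not a power of two, which an `alloca` alignment must be. A check along the following lines would reject it; the helper name is hypothetical and the actual checks in the patch may differ or include further conditions.

```cpp
#include <cstdint>

// Hypothetical predicate: an alignment is usable only if it is a nonzero
// power of two. A power of two has exactly one bit set, so clearing the
// lowest set bit (a & (a - 1)) must leave zero.
constexpr bool isUsableAlignment(uint64_t alignment) {
  return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

static_assert(!isUsableAlignment(33), "33 is not a power of two");
static_assert(isUsableAlignment(32), "32 is a power of two");
```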
Differential Revision: https://reviews.llvm.org/D119604
Arjun P [Tue, 22 Feb 2022 17:28:21 +0000 (17:28 +0000)]
[MLIR][Presburger] PresburgerSet::subtract: automatically restore state on return
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D120339
Vitaly Buka [Wed, 23 Feb 2022 19:15:57 +0000 (11:15 -0800)]
[TSan][NFC] fixup for comment of Shadow
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
William S. Moses [Wed, 16 Feb 2022 16:26:12 +0000 (11:26 -0500)]
[MLIR][Arith] Canonicalize cmpf(int to fp) to cmpi
Given a cmpf of either uitofp or sitofp and a constant, attempt to canonicalize it to a cmpi.
This PR rewrites equivalent code within LLVM to now apply to MLIR arith.
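A concrete instance of the idea, written as plain C++ instead of MLIR: comparing a converted integer against the constant 3.5 gives the same answer as comparing the integer against 4. The function names are made up, and the check is restricted to a small range where the int-to-float conversion is exact.

```cpp
#include <cstdint>

// Floating-point form: convert the integer, then compare (cmpf of sitofp).
constexpr bool lessViaFloat(int32_t x) { return static_cast<float>(x) < 3.5f; }

// Integer form: x < 3.5 holds exactly when x < 4 (cmpi).
constexpr bool lessViaInt(int32_t x) { return x < 4; }

// Confirms both forms agree for every x in [-1000, 1000].
constexpr bool formsAgree() {
  for (int32_t x = -1000; x <= 1000; ++x)
    if (lessViaFloat(x) != lessViaInt(x))
      return false;
  return true;
}
```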
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117257
Valentin Clement [Wed, 23 Feb 2022 18:48:07 +0000 (19:48 +0100)]
[flang] Lower function and subroutine calls
This patch introduces basic function/subroutine calls.
Because of the state of lowering, only simple scalar arguments
can be used in the calls. This will be enhanced in follow-up
patches with arrays, allocatable, pointer and so on.
```
subroutine sub1()
end
subroutine sub2()
call sub1()
end
```
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120419
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Simon Pilgrim [Wed, 23 Feb 2022 18:45:17 +0000 (18:45 +0000)]
[X86] combineX86ShufflesRecursively - pull out repeated getValueType/getSimpleValueType calls.
Jessica Paquette [Wed, 23 Feb 2022 18:35:52 +0000 (10:35 -0800)]
Revert "[MachineOutliner][AArch64] NFC: Split MBBs into "outlinable ranges""
This reverts commit d97f997eb79d91b2872ac13619f49cb3a7120781.
This commit was not NFC.
(See: https://reviews.llvm.org/rGd97f997eb79d91b2872ac13619f49cb3a7120781)
Eugene Zhulenev [Thu, 17 Feb 2022 18:22:18 +0000 (10:22 -0800)]
[mlir] Async: update condition for dispatching block-aligned compute function
+ compare block size with the unrollable inner dimension
+ reduce nesting in the code and simplify IR building a bit
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D120075
Fangrui Song [Wed, 23 Feb 2022 18:15:42 +0000 (10:15 -0800)]
[ELF] Check COMMON symbols for PROVIDE and don't redefine COMMON symbols edata/end/etext
In GNU ld, the definition precedence is: regular symbol assignment > relocatable object definition > `PROVIDE` symbol assignment.
GNU ld's internal linker scripts define the non-reserved (by C and C++)
edata/end/etext with `PROVIDE` so the relocatable object definition takes
precedence. This makes sense because `int end;` is valid.
We currently redefine such symbols if they are COMMON, but not if they are
regular definitions, so `int end;` with -fcommon is essentially UB in ld.lld.
Fix this (also improve consistency and match GNU ld) by using the
`isDefined` code path for `isCommon`. In GNU ld, reserved identifiers like
`__ehdr_start` do not use `PROVIDE`, while we treat them all as `PROVIDE`, this
seems fine.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D120389
wlei [Tue, 22 Feb 2022 20:09:01 +0000 (12:09 -0800)]
[llvm-profgen] Support symbol loading for debug fission
Support loading debug info from split DWARF files, such as .dwo and .dwp files. Leverage the `getNonSkeletonUnitDIE(false)` API to achieve this.
Add a test case to make sure all the ranges are correctly retrieved by the loader.
Reviewed By: ayermolo, hoy, wenlei
Differential Revision: https://reviews.llvm.org/D115973
Arthur Eubanks [Wed, 23 Feb 2022 17:37:53 +0000 (09:37 -0800)]
[clang] Remove getPointerElementType() in EmitVTableTypeCheckedLoad()
Simon Pilgrim [Wed, 23 Feb 2022 17:29:41 +0000 (17:29 +0000)]
[X86] combineX86ShufflesRecursively - don't bother widening inputs before calling combineX86ShuffleChain
combineX86ShuffleChain no longer has to assume that the shuffle inputs are the right size, so don't create unnecessary nodes messing up oneuse limits as detailed on Issue #45319
LLVM GN Syncbot [Wed, 23 Feb 2022 17:12:13 +0000 (17:12 +0000)]
[gn build] Port 7c1ee5e95f31
Sanjay Patel [Wed, 23 Feb 2022 16:25:01 +0000 (11:25 -0500)]
[DAG] try to convert multiply to shift via demanded bits
This is a fix for a regression discussed in:
https://github.com/llvm/llvm-project/issues/53829
We cleared more high multiplier bits with 995d400,
but that can lead to worse codegen because we would fail
to recognize the now disguised multiplication by neg-power-of-2
as a shift-left. The problem exists independently of the IR
change in the case that the multiply already had cleared high
bits. We also convert shl+sub into mul+add in instcombine's
negator.
This patch fills in the high-bits to see the shift transform
opportunity. Alive2 attempt to show correctness:
https://alive2.llvm.org/ce/z/GgSKVX
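The arithmetic behind the "disguised" multiply can be illustrated in isolation. Assume only the low 8 bits of the result are demanded: the multiplier `0xF0` is then indistinguishable from `-16`, so the multiply is a negated shift-left by 4. The sketch below only checks that identity; it is not the DAG combine itself, and the names are made up.

```cpp
#include <cstdint>

// With only the low 8 bits demanded, x * 0xF0 equals -(x << 4), because
// 0xF0 is congruent to -16 modulo 256.
constexpr bool mulMatchesNegatedShift(uint32_t x) {
  const uint32_t demanded = 0xFFu;
  const uint32_t viaMul = (x * 0xF0u) & demanded;
  const uint32_t viaShift = (0u - (x << 4)) & demanded;
  return viaMul == viaShift;
}

// Exhaustive check over all byte values.
constexpr bool holdsForAllBytes() {
  for (uint32_t x = 0; x < 256; ++x)
    if (!mulMatchesNegatedShift(x))
      return false;
  return true;
}
```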
The AArch64, RISCV, and MIPS diffs look like clear wins. The
x86 code requires an extra move register in the minimal examples,
but it's still an improvement to get rid of the multiply on all
CPUs that I am aware of (because multiply is never as fast as a
shift).
There's a potential follow-up noted by the TODO comment. We
should already convert that pattern into shl+add in IR, so
it's probably not common:
https://alive2.llvm.org/ce/z/7QY_Ga
Fixes #53829
Differential Revision: https://reviews.llvm.org/D120216
Arthur Eubanks [Wed, 23 Feb 2022 17:08:03 +0000 (09:08 -0800)]
Revert "AttributorAttributes: avoid a crashing on bad alignments"
This reverts commit 70ff6fbeb9b5acb4995dc42286954b762d0937fd.
Breaks bots, e.g. http://45.33.8.238/linux/69375/step_12.txt.
Valentin Clement [Wed, 23 Feb 2022 17:04:15 +0000 (18:04 +0100)]
[flang][NFC] Clean up ConvertType
This patch removes unused or obsolete code in
the ConvertType.h and ConvertType.cpp files. These
files were landed together with the initial flang
upstreaming. This cleanup will help future upstreaming
effort from fir-dev and keep only used code.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D120405
Daniel Resnick [Tue, 22 Feb 2022 17:41:48 +0000 (10:41 -0700)]
[MLIR][Pass] Have PassRegistryEntry own pass strings
This eliminates the requirement that pass-related strings outlive pass
instances, which will facilitate future work enabling dynamic passes
written in other languages.
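The lifetime issue can be pictured with a small sketch. The `OwningEntry` type here is hypothetical and much simpler than the real `PassRegistryEntry`: because it copies its strings, the buffers it was built from may be destroyed right after registration, which is what a dynamically created pass would need.

```cpp
#include <string>
#include <string_view>

// Hypothetical registry entry that owns copies of its strings.
struct OwningEntry {
  OwningEntry(std::string_view arg, std::string_view desc)
      : argument(arg), description(desc) {}
  std::string argument;
  std::string description;
};

// The local strings die when this function returns; the entry stays
// valid because it holds its own copies.
OwningEntry makeEntryFromTemporaries() {
  std::string arg = "my-pass";
  std::string desc = "An example pass";
  return OwningEntry(arg, desc);
}
```

Had the entry stored `std::string_view` members instead, the returned object would refer to destroyed storage.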
Differential Revision: https://reviews.llvm.org/D120341
Valentin Clement [Wed, 23 Feb 2022 17:01:58 +0000 (18:01 +0100)]
[flang] Lower complex constant
Add ability to lower complex constant.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D120402
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Sam McCall [Mon, 7 Feb 2022 18:11:16 +0000 (19:11 +0100)]
[Pseudo] Token/TokenStream, PP directive parser.
The TokenStream class is the representation of the source code that will
be fed into the GLR parser.
This patch allows a "raw" TokenStream to be built by reading source code.
It also supports scanning a TokenStream to find the directive structure.
Next steps (with placeholders in the code): heuristically choosing a
path through #ifs, preprocessing the code by stripping directives and comments.
These will produce a suitable stream to feed into the parser proper.
Differential Revision: https://reviews.llvm.org/D119162
Augie Fackler [Fri, 11 Feb 2022 23:32:38 +0000 (18:32 -0500)]
AttributorAttributes: avoid a crashing on bad alignments
Prior to this change, LLVM would attempt to optimize an
aligned_alloc(33, ...) call to the stack. This flunked an assertion when
trying to emit the alloca, which crashed LLVM. Avoid that with extra
checks.
Differential Revision: https://reviews.llvm.org/D119604
Philip Reames [Wed, 23 Feb 2022 04:00:43 +0000 (20:00 -0800)]
[SLP] Remove cap on schedule window size
This cap was first added in 848c1aa45 (back in 2015). Per the original commit message, the purpose was to avoid a compile time explosion in long basic blocks. The algorithmic problem in scheduling has now been fixed in 0539a26d.
In the meantime, the code has rotted fairly badly. Some intermediate refactoring caused the size to only be incremented if *both* iterators advance in the window search. This causes the size to be badly undercounted when near one end of a basic block. We no longer have any test which exercises the logic in an intentional way; there's one test which differs with this change, but the changes appear fairly orthogonal to the purpose of the test file.
Unfortunately, we no longer have the original motivating example, so it's possible that it also hits some other issue. I tested locally with a large example, but even at its worst, that one doesn't demonstrate anything too extreme even without the algorithmic fix. It's clearly faster with, but only by ~20% which doesn't seem in line with the original commit message. If regressions with this patch are seen, please file a bug and I'll try to fix any other algorithmic problems which fall out.
Philipp Stephani [Wed, 23 Feb 2022 16:00:04 +0000 (17:00 +0100)]
clang-format.el: Make clang-format work in indirect buffers.
In an indirect buffer, buffer-file-name is nil, so check the base buffer
instead. This works fine in direct buffers where buffer-base-buffer returns
nil.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D120408
Malhar Jajoo [Wed, 2 Feb 2022 15:52:13 +0000 (15:52 +0000)]
[LAA] Add remarks for unbounded array access
Adds new optimization remarks when loop vectorization fails due to
the compiler being unable to find the bounds of an array access inside
a loop.
Differential Revision: https://reviews.llvm.org/D115873
Simon Pilgrim [Wed, 23 Feb 2022 15:43:34 +0000 (15:43 +0000)]
[X86] combineX86ShuffleChainWithExtract - don't bother widening inputs after peeking through ISD::EXTRACT_SUBVECTOR nodes
combineX86ShuffleChain no longer has to assume that the shuffle inputs are the right size, so don't create unnecessary nodes messing up oneuse limits as detailed on Issue #45319
Removing widening from combineX86ShufflesRecursively will be the next step, followed by removing combineX86ShuffleChainWithExtract entirely
Nikita Popov [Wed, 23 Feb 2022 15:32:40 +0000 (16:32 +0100)]
[Bitcode] Store function type IDs rather than function types
This resolves one of the type ID propagation TODOs.
Emilio Cota [Wed, 23 Feb 2022 03:27:54 +0000 (22:27 -0500)]
[mlir][NFC] Use options struct in ExecutionEngine::create
Its number of optional parameters has grown too large,
which makes adding new optional parameters quite a chore.
Fix this by using an options struct.
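The general pattern looks like the sketch below. All names and fields are invented for illustration and are not the actual `ExecutionEngineOptions` members: defaults live in one struct, callers override only what they need, and adding a field later does not change the factory's signature.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical options struct: every field has a default.
struct EngineOptions {
  int optLevel = 2;
  bool enableObjectCache = true;
  std::vector<std::string> sharedLibPaths;
};

// Hypothetical engine that just records how it was configured.
struct Engine {
  int optLevel;
  bool objectCache;
  std::size_t numSharedLibs;
};

// One parameter instead of a growing list of optional arguments.
Engine createEngine(const EngineOptions &options = {}) {
  return Engine{options.optLevel, options.enableObjectCache,
                options.sharedLibPaths.size()};
}
```

A caller that only cares about the optimization level sets `options.optLevel = 3` and leaves everything else at its default.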
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D120380
Nikita Popov [Wed, 23 Feb 2022 15:10:29 +0000 (16:10 +0100)]
[Clang][OpenMP] Remove use of getPointerElementType()
This new pointer element type use snuck in via D118632.
Arjun P [Wed, 23 Feb 2022 15:00:17 +0000 (15:00 +0000)]
[MLIR][Presburger] unittests: use an MLIRContext declared in parsePoly
Use an `MLIRContext` declared in a single place in the `parsePoly` function that almost all Presburger unit tests use for parsing sets. This function is only used in tests.
This saves us from having to declare and pass a new `MLIRContext` in every test.
Reviewed By: bondhugula, mehdi_amini
Differential Revision: https://reviews.llvm.org/D119251
Jay Foad [Wed, 23 Feb 2022 13:35:34 +0000 (13:35 +0000)]
[AMDGPU] Split fp min/max atomics test. NFC.
Split out f32 buffer, f64 buffer and image atomics. This just makes
it easier to test subtargets that only have some of these
instructions.
Differential Revision: https://reviews.llvm.org/D120407
Nikita Popov [Wed, 23 Feb 2022 14:49:12 +0000 (15:49 +0100)]
[InstCombine] Support min/max intrinsics in udiv->lshr fold
This complements the existing fold for selects. This fold is a bit
more conservative, requiring one-use. The other folds here should
probably also be subjected to a one-use restriction.
https://alive2.llvm.org/ce/z/Q9eCDU
https://alive2.llvm.org/ce/z/8YK2CJ
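In scalar terms the fold rests on a simple identity: dividing by the minimum of two powers of two is the same as shifting right by the minimum of their exponents (and likewise for the maximum). The helpers below are illustrative only, with made-up names, and assume both exponents are below 32.

```cpp
#include <algorithm>
#include <cstdint>

// x udiv umin(1 << a, 1 << b)
constexpr uint32_t udivByMinPow2(uint32_t x, unsigned a, unsigned b) {
  return x / std::min(uint32_t{1} << a, uint32_t{1} << b);
}

// x lshr umin(a, b): the same value without a division.
constexpr uint32_t lshrByMinLog2(uint32_t x, unsigned a, unsigned b) {
  return x >> std::min(a, b);
}

static_assert(udivByMinPow2(1000, 3, 5) == lshrByMinLog2(1000, 3, 5),
              "1000 / 8 == 1000 >> 3");
```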
Nikita Popov [Wed, 23 Feb 2022 14:44:37 +0000 (15:44 +0100)]
[InstCombine] Add tests for udiv->lshr fold with min/max intrinsics (NFC)
Stefan Pintilie [Tue, 22 Feb 2022 21:17:18 +0000 (15:17 -0600)]
[PowerPC] Add the Power10 LXVKQ instruction.
Add the Power 10 instruction LXVKQ.
This patch was taken from an original patch by: Yi-Hong Lyu
Reviewed By: lei
Differential Revision: https://reviews.llvm.org/D117507
Jan Svoboda [Wed, 23 Feb 2022 13:51:40 +0000 (14:51 +0100)]
[clang][deps] Return the whole TU command line
The dependency scanner already generates canonical -cc1 command lines that can be used to compile discovered modular dependencies.
For translation unit command lines, the scanner only generates additional driver arguments the build system is expected to append to the original command line.
While this works most of the time, there are situations where that's not the case. For example with `-Wunused-command-line-argument`, Clang will complain about the `-fmodules-cache-path=` argument that's not being used in explicit modular builds. Combine that with `-Werror` and the build outright fails.
To prevent such failures, this patch changes the dependency scanner to return the full driver command line to compile the original translation unit. This gives us more opportunities to massage the arguments into something reasonable.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D118986
Jan Svoboda [Wed, 23 Feb 2022 14:18:00 +0000 (15:18 +0100)]
[clang][deps] NFC: Update documentation
In D113473, the dependency scanner stopped emitting "-fmodule-map-file=" arguments. Potential build systems are expected to not add any such arguments on their own. This commit removes mentions of such arguments to avoid confusion.
Rainer Orth [Wed, 23 Feb 2022 14:43:12 +0000 (15:43 +0100)]
[MC][ELF] Use SHF_SUNW_NODISCARD instead of SHF_GNU_RETAIN on Solaris
As requested in D107955 <https://reviews.llvm.org/D107955>, this patch
splits off the `MC` and `CodeGen` parts and adds a testcase.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D120318
Rainer Orth [Wed, 23 Feb 2022 14:41:43 +0000 (15:41 +0100)]
[ELF] Use SHF_SUNW_NODISCARD instead of SHF_GNU_RETAIN on Solaris
Instead of the GNU extension `SHF_GNU_RETAIN`, Solaris provides equivalent
functionality with `SHF_SUNW_NODISCARD`. This patch implements the necessary
support.
Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D107955
Nikita Popov [Wed, 23 Feb 2022 14:24:44 +0000 (15:24 +0100)]
[InstCombine] Further simplify udiv -> lshr folding
Rather than queuing up actions, have one function that does the
log2() fold in the obvious way, but with a flag that allows us
to check whether the fold will succeed without actually performing
it.
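As an illustration of the "check without performing" flag described above, here is a minimal self-contained sketch. It does not use LLVM's IR classes: `Node`, `makeConst`, `makeSelect`, and this `takeLog2` are hypothetical stand-ins invented for the example, and the real InstCombine helper may differ in signature and in the patterns it handles.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy expression: a constant, or a select between two sub-expressions.
// (Hypothetical type for illustration; not an LLVM class.)
struct Node {
  enum Kind { Const, Select } K = Const;
  uint64_t Value = 0;                       // payload for Const
  std::unique_ptr<Node> TrueArm, FalseArm;  // payload for Select
};

std::unique_ptr<Node> makeConst(uint64_t V) {
  auto N = std::make_unique<Node>();
  N->Value = V;
  return N;
}

std::unique_ptr<Node> makeSelect(std::unique_ptr<Node> T,
                                 std::unique_ptr<Node> F) {
  auto N = std::make_unique<Node>();
  N->K = Node::Select;
  N->TrueArm = std::move(T);
  N->FalseArm = std::move(F);
  return N;
}

// Rewrites N in place to log2(N) when DoFold is true.
// When DoFold is false, only reports whether the rewrite would succeed.
bool takeLog2(Node &N, bool DoFold) {
  if (N.K == Node::Const) {
    // Only non-zero powers of two have an exact log2.
    if (N.Value == 0 || (N.Value & (N.Value - 1)) != 0)
      return false;
    if (DoFold) {
      uint64_t Log = 0;
      for (uint64_t T = N.Value; T > 1; T >>= 1)
        ++Log;
      N.Value = Log;
    }
    return true;
  }
  // Select: check both arms first without mutating, so a failure in one arm
  // cannot leave the other arm already rewritten.
  if (!takeLog2(*N.TrueArm, /*DoFold=*/false) ||
      !takeLog2(*N.FalseArm, /*DoFold=*/false))
    return false;
  if (DoFold) {
    takeLog2(*N.TrueArm, /*DoFold=*/true);
    takeLog2(*N.FalseArm, /*DoFold=*/true);
  }
  return true;
}
```

A caller would typically invoke `takeLog2(N, /*DoFold=*/false)` as a precondition and, only if that succeeds, call it again with `DoFold` set to true. The same code path answers both questions, so the check and the transform cannot drift apart.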
Aaron Ballman [Wed, 23 Feb 2022 14:11:34 +0000 (09:11 -0500)]
Silence some "not all control paths return a value" warnings; NFC
Sanjay Patel [Wed, 23 Feb 2022 14:06:11 +0000 (09:06 -0500)]
[InstSimplify] remove shift that is redundant with part of funnel shift
In D111530, I suggested that we add some relatively basic pattern-matching
folds for shifts and funnel shifts and avoid a more specialized solution
if possible.
We can start by implementing at least one of these in IR because it's
easier to write the code and verify with Alive2:
https://alive2.llvm.org/ce/z/qHpmNn
This will need to be adapted/extended for SDAG to handle the motivating
bug (#49541) because the patterns only appear later with that example
(added some tests: bb850d422b64).
This can be extended within InstSimplify to handle cases where we 'and'
with a shift too (in that case, kill the funnel shift).
We could also handle patterns where the shift and funnel shift directions
are inverted, but I think it's better to canonicalize that instead to
avoid pattern-match case explosion.
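The arithmetic behind this kind of fold can be checked with a small sketch on 32-bit integers. It assumes the pattern in question has the shape `or (shl X, C), (fshl X, Y, C)`; that shape is an assumption made for illustration, and the helper names `fshl32` and `orWithRedundantShl` are hypothetical. Since a left funnel shift already contains `X << C` in its result, OR-ing that shift in again changes nothing.

```cpp
#include <cassert>
#include <cstdint>

// Left funnel shift on 32 bits: concatenate X:Y, shift left by C (mod 32),
// and keep the high 32 bits.
constexpr uint32_t fshl32(uint32_t X, uint32_t Y, uint32_t C) {
  return (C % 32) == 0 ? X : (X << (C % 32)) | (Y >> (32 - (C % 32)));
}

// The unsimplified pattern: a plain shift OR-ed with the funnel shift.
// Only meaningful for C < 32, since a wider shift is not a valid shl.
constexpr uint32_t orWithRedundantShl(uint32_t X, uint32_t Y, uint32_t C) {
  return (X << C) | fshl32(X, Y, C);
}

// The shift contributes no bits that the funnel shift does not already have.
static_assert(orWithRedundantShl(0x12345678u, 0x9ABCDEF0u, 8) ==
                  fshl32(0x12345678u, 0x9ABCDEF0u, 8),
              "shl is redundant with the high part of fshl");
```

The mirrored case follows the same reasoning: a right funnel shift already contains `Y >> C`.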
Differential Revision: https://reviews.llvm.org/D120253
Aaron Ballman [Wed, 23 Feb 2022 14:07:54 +0000 (09:07 -0500)]
Remove unused function; NFC
Jez Ng [Wed, 23 Feb 2022 13:57:54 +0000 (08:57 -0500)]
[lld-macho][nfc] Refactor MarkLive
This mirrors the code structure in `lld/ELF`. It also paves the way for
an upcoming diff where I templatize things.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D120376
Jez Ng [Wed, 23 Feb 2022 13:57:52 +0000 (08:57 -0500)]
[lld-macho][nfc] Move ICF-specific logic into ICF.cpp
This mirrors the code organization in `lld/ELF`.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D120378
Stanislav Gatev [Wed, 23 Feb 2022 13:38:51 +0000 (13:38 +0000)]
Revert "Revert "[clang][dataflow] Add support for global storage values""
This reverts commit 169e1aba55bed9f7ffa000f9f170ab2defbc40b2.
It also fixes an incorrect assumption in `initGlobalVars`.
Nikita Popov [Wed, 23 Feb 2022 13:52:56 +0000 (14:52 +0100)]
[InstCombine] Simplify udiv -> lshr folding
What we're really doing here is converting Op0 udiv Op1 into
Op0 lshr log2(Op1), so phrase it in that way. Actually pushing
the lshr into the log2(Op1) expression should be seen as a separate
transform.
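The identity being relied on here, `Op0 udiv Op1 == Op0 lshr log2(Op1)` when `Op1` is a power of two, can be sketched on plain 32-bit integers. The helper names below are hypothetical and chosen only for this example; they are not the InstCombine functions.

```cpp
#include <cassert>
#include <cstdint>

// True if V is a non-zero power of two.
constexpr bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Exact log2 of a power of two: the index of its single set bit.
constexpr unsigned log2OfPow2(uint32_t V) {
  return V <= 1 ? 0 : 1 + log2OfPow2(V >> 1);
}

// Computes Op0 udiv Op1 as Op0 lshr log2(Op1).
// Precondition: isPowerOf2(Op1).
constexpr uint32_t udivAsLshr(uint32_t Op0, uint32_t Op1) {
  return Op0 >> log2OfPow2(Op1);
}

static_assert(isPowerOf2(8) && !isPowerOf2(6), "power-of-two check");
static_assert(udivAsLshr(100, 8) == 100 / 8, "udiv by 8 equals lshr by 3");
```

Framing the fold as "take log2 of the divisor, then shift" keeps the division rewrite separate from any later simplification of the `log2(Op1)` expression itself.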
Pavel Labath [Wed, 23 Feb 2022 13:51:55 +0000 (14:51 +0100)]
Fix HostProcessWindows for D120321
Jan Svoboda [Wed, 23 Feb 2022 13:15:47 +0000 (14:15 +0100)]
[clang][modules] Infer framework modules in explicit builds
This patch enables inferring framework modules in explicit builds in all contexts. Until now, inferring framework modules only worked with `-fimplicit-module-maps` due to this block of code:
```
// HeaderSearch::loadFrameworkModule
  case LMM_InvalidModuleMap:
    // Try to infer a module map from the framework directory.
    if (HSOpts->ImplicitModuleMaps)
      ModMap.inferFrameworkModule(Dir, IsSystem, /*Parent=*/nullptr);
    break;
```
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D113880
Timm Bäder [Mon, 21 Feb 2022 15:01:13 +0000 (16:01 +0100)]
[clang][driver][wasm] Fix libstdc++ target-dependent include dir
The triple goes after the gcc version, not before. Also add the
/backward version.
Differential Revision: https://reviews.llvm.org/D120251
serge-sans-paille [Wed, 23 Feb 2022 13:28:56 +0000 (14:28 +0100)]
Add missing <ctime> include
As a follow-up to eb4c8608115c1c9af0fc8cb5b1e9f2bc960014ef.
Should fix http://45.33.8.238/win/53749/step_4.txt
Related to https://reviews.llvm.org/D120195
Pavel Labath [Mon, 21 Feb 2022 14:08:23 +0000 (15:08 +0100)]
[lldb] Simplify HostThreadMacOSX
The class is using an incredibly elaborate setup to create and destroy
an NSAutoreleasePool object. We can do it in a much simpler way by
making those calls inside our thread startup function.
The only effect of this patch is that the pool gets released at the end
of the ThreadCreateTrampoline function, instead of slightly later, when
pthreads begin thread-specific cleanup. However, the key destruction
order is unspecified, so nothing should be relying on that.
I didn't find a specific reason in the git history for why this had to
be done that way. It seems that before D5198, thread-specific keys were
the only way an OS implementation (in Host::ThreadCreated) could attach
some value to a thread.
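The simpler shape described above, where the pool is created and released inside the thread startup function, can be sketched in portable C++ with a scoped object. `ScopedPool` is a hypothetical stand-in for the `NSAutoreleasePool` handling, and `threadTrampoline` here is a simplified illustration rather than LLDB's actual function; a log vector is used only to make the ordering observable.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an autorelease pool: records when it is
// created and when it is released.
class ScopedPool {
  std::vector<std::string> &Log;

public:
  explicit ScopedPool(std::vector<std::string> &L) : Log(L) {
    Log.push_back("pool created");
  }
  ~ScopedPool() { Log.push_back("pool released"); }
};

// The function a new thread would start in. The pool lives exactly as long
// as this function, so it is released when the trampoline returns rather
// than during later thread-specific cleanup.
void threadTrampoline(const std::function<void()> &Body,
                      std::vector<std::string> &Log) {
  ScopedPool Pool(Log);
  Body();
}

// Runs the trampoline once on the calling thread and returns the event log.
std::vector<std::string> runTrampolineOnce() {
  std::vector<std::string> Log;
  threadTrampoline([&Log] { Log.push_back("body ran"); }, Log);
  return Log;
}
```

The sketch calls the trampoline directly instead of spawning a thread, since the point is only the lifetime of the pool relative to the thread body.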
Differential Revision: https://reviews.llvm.org/D120322