Aart Bik [Tue, 15 Mar 2022 02:31:23 +0000 (19:31 -0700)]
[mlir][sparse] more test cases for linalg.index
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D121660
Roman Lebedev [Tue, 15 Mar 2022 15:36:39 +0000 (18:36 +0300)]
[X86] Fix AMD Znver3 model checks
While `-march=` is correctly detected as `znver3` for the cpu,
apparently the model check is incorrect:
```
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5950X 16-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 0
Frequency boost: disabled
CPU max MHz: 6017.8462
CPU min MHz: 2200.0000
BogoMIPS: 8050.07
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse
3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_p
state ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbn
oinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 8 MiB (16 instances)
L3: 64 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling
Srbds: Not affected
Tsx async abort: Not affected
```
Model is 33 (0x21), while the code was expecting it to be `0x00 .. 0x1F`.
https://github.com/torvalds/linux/blob/v5.17-rc8/drivers/hwmon/k10temp.c#L432-L453 agrees.
I'm not sure if other ranges listed here should also be accepted.
I noticed this while implementing CPU model detection
for halide (https://github.com/halide/Halide/pull/6648)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121708
Yi Kong [Tue, 15 Mar 2022 17:17:17 +0000 (01:17 +0800)]
Revert "[clang][driver] Emit a warning if -xc/-xc++ is after the last input file"
This reverts commit
1c1a4b9556db8579f7428605ac2c351bddde9ad5.
Some builders failed.
Yi Kong [Tue, 15 Mar 2022 10:50:06 +0000 (18:50 +0800)]
[clang][driver] Emit a warning if -xc/-xc++ is after the last input file
This follows the same warning GCC produces.
Differential Revision: https://reviews.llvm.org/D121683
Joe Nash [Tue, 15 Mar 2022 16:36:51 +0000 (12:36 -0400)]
[AMDGPU] Fix typo consecutive in GCNNSAReassign
Jonas Devlieghere [Tue, 15 Mar 2022 16:44:33 +0000 (09:44 -0700)]
Revert "[lldb] Synchronize output through the IOHandler"
This reverts commit
242c574dc03e4b90e992cc8d07436efc3954727f because it
breaks the following tests on the bots:
- TestGuiExpandThreadsTree.py
- TestBreakpointCallbackCommandSource.py
Igor Kudrin [Tue, 15 Mar 2022 16:46:22 +0000 (20:46 +0400)]
[llvm-cov gcov] Fix calculating coverage of template functions
Template functions share the same lines in source files, so the common
container of lines' properties cannot be used to calculate the coverage
statistics of individual functions.
> cat tmpl.cpp
template <int N> int test() { return N; }
int main() { return test<1>() + test<2>(); }
> clang++ --coverage tmpl.cpp -o tmpl
> ./tmpl
> llvm-cov gcov tmpl.cpp -f
...
Function '_Z4testILi1EEiv'
Lines executed:100.00% of 1
Function '_Z4testILi2EEiv'
Lines executed:-nan% of 0
...
> llvm-cov-patched gcov tmpl.cpp -f
...
Function '_Z4testILi1EEiv'
Lines executed:100.00% of 1
Function '_Z4testILi2EEiv'
Lines executed:100.00% of 1
...
Differential Revision: https://reviews.llvm.org/D121390
Shafik Yaghmour [Tue, 15 Mar 2022 16:34:12 +0000 (09:34 -0700)]
[LLDB] Fixing DWARFExpression handling of ValueType::FileAddress case for DW_OP_deref_size
Currently DW_OP_deref_size just drops the ValueType::FileAddress case and does
not attempt to handle it. This adds support for this case and a test that
verifies this support.
I did a little refactoring since DW_OP_deref and DW_OP_deref_size have some
overlap in code.
Also see: rdar://
66870821
Differential Revision: https://reviews.llvm.org/D121408
Andrzej Warzynski [Tue, 15 Mar 2022 13:16:47 +0000 (13:16 +0000)]
[flang][lowering] Add support for lowering of the `ibits` intrinsic
This patch adds support for lowering of the `ibits` intrinsic from Fortran
to the FIR dialect of MLIR.
This is part of the upstreaming effort from the `fir-dev` branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Differential Revision: https://reviews.llvm.org/D121693
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Jonas Devlieghere [Tue, 15 Mar 2022 16:29:39 +0000 (09:29 -0700)]
[lldb] Make the PlatformMacOSX unit test Apple specific
I thought that x86GetSupportedArchitectures would always return
x86_64-apple-macosx as a compatible architecture, regardless of the host
achitecture, but the Debian bot disagrees with that.
Jonas Devlieghere [Tue, 15 Mar 2022 16:13:57 +0000 (09:13 -0700)]
[lldb] Synchronize output through the IOHandler
Add synchronization to the IOHandler to prevent multiple threads from
writing concurrently to the output or error stream.
A scenario where this could happen is when a thread (the default event
thread for example) is using the debugger's asynchronous stream. We
would delegate this operation to the IOHandler which might be running on
another thread. Until this patch there was nothing to synchronize the
two at the IOHandler level.
Differential revision: https://reviews.llvm.org/D121500
Andrzej Warzynski [Tue, 15 Mar 2022 12:31:20 +0000 (12:31 +0000)]
[flang][lowering] Add support for lowering the `dim` intrinsic
This patch adds support for lowering of the `dim` intrinsic from Fortran
to the FIR dialect of MLIR.
This is part of the upstreaming effort from the `fir-dev` branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Differential Revision: https://reviews.llvm.org/D121689
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Siva Chandra Reddy [Tue, 15 Mar 2022 08:26:41 +0000 (08:26 +0000)]
[libc] Add implementation of POSIX lseek function.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D121676
Jonas Devlieghere [Tue, 15 Mar 2022 15:50:33 +0000 (08:50 -0700)]
[lldb] Fix platform selection on Apple Silicon (again)
This patch is another attempt to fix platform selection on Apple
Silicon. It partially undoes D117340 which tried to fix the issue by
always instantiating a remote-ios platform for "iPhone and iPad Apps on
Apple Silicon Macs".
While the previous patch worked for attaching, it broke launching and
everything else that expects the remote platform to be connected. I made
an attempt to work around that, but quickly found out that there were
just too may places that had this assumption baked in.
This patch takes a different approach and reverts back to marking the
host platform compatible with iOS triples. This brings us back to the
original situation where platform selection was broken for remote iOS
debugging on Apple Silicon. To fix that, we now look at the process'
host architecture to differentiate between iOS binaries running remotely
and iOS binaries running locally.
I tested the following scenarios, which now all uses the desired
platform:
- Launching an iOS binary on macOS: uses the host platform
- Attaching to an iOS binary on macOS: uses the host platform
- Attaching to a remote iOS binary: uses the remote-ios platform
rdar://
89840215
Differential revision: https://reviews.llvm.org/D121444
Andrzej Warzynski [Tue, 15 Mar 2022 10:58:50 +0000 (10:58 +0000)]
[flang][lowering] Add support for lowering the `dot_product` intrinsic
This patch adds support for lowering the `dot_product` intrinsic from
Fortran to the FIR dialect of MLIR.
This is part of the upstreaming effort from the `fir-dev` branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Differential Revision: https://reviews.llvm.org/D121684
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Mark Leair <leairmark@gmail.com>
Bixia Zheng [Mon, 14 Mar 2022 22:43:12 +0000 (15:43 -0700)]
[mlir][sparse][taco] Reorder a class.
Define IndexExpr before IndexVar. This is to prepare for the next change
to support the use of index values in tensor expressions.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121649
Craig Topper [Tue, 15 Mar 2022 15:27:38 +0000 (08:27 -0700)]
[LegalizeTypes][RISCV][WebAssembly] Expand ABS in PromoteIntRes_ABS if it will expand to sra+xor+sub later.
If we promote the ABS and then Expand in LegalizeDAG, then both the
sra and the xor will have their inputs sign extended. This generates
extra code on RISCV which lacks an i8 or i16 sign extend instructon.
If we expand during type legalization, then only the sra will get its
input sign extended. RISCV is able to combine this with the sra by
doing a shift left followed by an sra.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D121664
Craig Topper [Tue, 15 Mar 2022 15:23:55 +0000 (08:23 -0700)]
[DAGCombiner][RISCV] Adjust (aext (and (trunc x), cst)) -> (and x, cst) to sext cst based on target preference
RISCV strong prefers i32 values be sign extended to i64. This combine
was always zero extending the constant using APInt methods.
This adjusts the code so that it calls getNode using ISD::ANY_EXTEND instead.
getNode will call TLI.isSExtCheaperThanZExt to decide how to handle
the constant.
Tests were copied from D121598 where I noticed that we were creating
constants that were hard to materialize.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D121650
Pavel Labath [Tue, 15 Mar 2022 15:23:43 +0000 (16:23 +0100)]
Revert "[lldb/test] Make category-skipping logic "platform"-independent"
This reverts commit
dddf4ce034a8e06cc1351492dceece3fa2344c14. It breaks
a couple of tests on macos.
Craig Topper [Tue, 15 Mar 2022 15:17:21 +0000 (08:17 -0700)]
[RISCV] Remove lowerSPLAT_VECTOR
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.
We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.
This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121673
Alex Brachet [Tue, 15 Mar 2022 15:18:41 +0000 (15:18 +0000)]
[libc] Fix exit not calling new handlers registered from a call to atexit in atexit handler
Alex Brachet [Tue, 15 Mar 2022 15:11:57 +0000 (15:11 +0000)]
[libc][BlockStore] Add back, pop_back and empty methods
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D121656
Yitzhak Mandelbaum [Tue, 15 Mar 2022 12:57:26 +0000 (12:57 +0000)]
[clang][dataflow] Allow disabling built-in transfer functions for CFG terminators
Terminators are handled specially in the transfer functions so we need an
additional check on whether the analysis has disabled built-in transfer
functions.
Differential Revision: https://reviews.llvm.org/D121694
Sanjay Patel [Tue, 15 Mar 2022 14:34:48 +0000 (10:34 -0400)]
[InstCombine] try harder to propagate 'nsz' through fneg-of-select
This can be viewed as swapping the select arms:
https://alive2.llvm.org/ce/z/jUvFMJ
...so we don't have the 'nsz' problem with the more general fold.
This unlocks other folds for the motivating fabs example.
This was discussed in issue #38828.
Sanjay Patel [Tue, 15 Mar 2022 12:50:06 +0000 (08:50 -0400)]
[InstCombine] add tests for fneg-of-select with FMF; NFC
Louis Dionne [Mon, 7 Mar 2022 21:58:16 +0000 (16:58 -0500)]
[libc++] Overhaul all tests for assertions and debug mode
Prior to this patch, there was no distinction between tests that check
basic assertions and tests that check full-fledged iterator debugging
assertions. Both were disabled when support for the debug mode is not
provided in the dylib, which is stronger than it needs to be.
Furthermore, all of the tests using "debug_macros.h" that contain more
than one assertion in them were broken -- any code after the first
assertion would never be executed.
This patch refactors all of our assertion-related tests to:
1. Be enabled whenever they can, i.e. basic assertions tests are run
even when the debug mode is disabled.
2. Use the superior `check_assertion.h` (previously `debug_mode_helper.h`)
instead of `debug_macros.h`, which allows multiple assertions in the
same program.
3. Coalesce some tests into the same file to make them more readable.
4. Use consistent naming for test files -- no more db{1,2,3,...,10} tests.
This is a large but mostly mechanical patch.
Differential Revision: https://reviews.llvm.org/D121462
Simon Moll [Tue, 15 Mar 2022 13:03:36 +0000 (14:03 +0100)]
[VE] strided v256.23 isel and tests
ISel for experimental.vp.strided.load|store for v256.32 types via
lowering to vvp_load|store SDNodes.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D121616
Sam Carroll [Tue, 15 Mar 2022 14:22:31 +0000 (15:22 +0100)]
[mlir] Fix --convert-func-to-llvm=emit-c-wrappers argument and result attribute handling
When using `--convert-func-to-llvm=emit-c-wrappers` the attribute arguments of the wrapper would not be created correctly in some cases.
This patch fixes that and introduces a set of tests for (hopefully) all corner cases.
See https://github.com/llvm/llvm-project/issues/53503
Author: Sam Carroll <sam.carroll@lmns.com>
Co-Author: Laszlo Kindrat <laszlo.kindrat@lmns.com>
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119895
Tue Ly [Mon, 14 Mar 2022 13:43:33 +0000 (09:43 -0400)]
[libc] Implement expm1f function that is correctly rounded for all rounding modes.
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision
gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a
single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121574
Simon Pilgrim [Tue, 15 Mar 2022 14:17:38 +0000 (14:17 +0000)]
[InstCombine] Add general constant support to eq/ne icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold
A further extension for Issue #32161
For eq/ne comparisons - the sign mismatch and bounds constraints are redundant, so if the that fold fails, fallback and just fold the constants directly.
https://alive2.llvm.org/ce/z/cdodNQ
The loop rotation test change looks mostly benign - the backend doesn't seem to suffer? https://gcc.godbolt.org/z/dErMY78To
Differential Revision: https://reviews.llvm.org/D121551
Simon Pilgrim [Tue, 15 Mar 2022 14:13:28 +0000 (14:13 +0000)]
[JITLink] Fix -Wparentheses warning in R_RISCV_SUB6 case.
Perform the mask inside parentheses before applying the offset
Ties Stuij [Tue, 15 Mar 2022 13:24:32 +0000 (13:24 +0000)]
[AARCH64] ssbs should be enabled by default for cortex-x1, cortex-x1c, cortex-a77
Reviewed By: amilendra
Differential Revision: https://reviews.llvm.org/D121206
Arnamoy Bhattacharyya [Tue, 15 Mar 2022 13:41:04 +0000 (09:41 -0400)]
[MLIR][OpenMP] Add support for basic SIMD construct
Patch adds a new operation for the SIMD construct. The op is designed to be very similar to the existing `wsloop` operation, so that the `CanonicalLoopInfo` of `OpenMPIRBuilder` can be used.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D118065
Steven Wu [Tue, 15 Mar 2022 13:13:51 +0000 (06:13 -0700)]
[ASAN] Fix darwin-interface test
Fix darwin interface test after D121464. asan_rtl_x86_64.S is not
available on Darwin.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D121636
LLVM GN Syncbot [Tue, 15 Mar 2022 13:08:09 +0000 (13:08 +0000)]
[gn build] Port
7262eacd4199
Wael Yehia [Tue, 15 Mar 2022 13:01:00 +0000 (13:01 +0000)]
Revert "Load pass plugins during option processing, so that plugin options are registered and live."
This reverts commit
5e8700ce8bf58bdf0a59eef99c85185a74177555.
Simon Pilgrim [Tue, 15 Mar 2022 13:01:22 +0000 (13:01 +0000)]
Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
Pavel Labath [Tue, 15 Mar 2022 12:48:27 +0000 (13:48 +0100)]
Remove a top-level "using namespace" in TargetTransformInfoImpl.h
Avoids polluting the namespace of all files including the header.
Wael Yehia [Mon, 14 Mar 2022 02:16:21 +0000 (22:16 -0400)]
Load pass plugins during option processing, so that plugin options are registered and live.
Pavel Labath [Mon, 14 Mar 2022 15:32:51 +0000 (16:32 +0100)]
[lldb/test] Make category-skipping logic "platform"-independent
The decision which categories are relevant for a particular test run
happen very early in the test setup process. They use the SBPlatform
object to determine which categories should be skipped. The platform
object created for this purpose transcends individual test runs.
This setup is not compatible with the direction discussed in
<https://discourse.llvm.org/t/multiple-platforms-with-the-same-name/59594>
-- when platform objects are tied to a specific (SB)Debugger, they need
to be created alongside it, which currently happens in the test setUp
method.
This patch is the first step in that direction -- it rewrites the
category skipping logic to avoid depending on a global SBPlatform
object. Fortunately, the skipping logic is fairly simple (and I believe
it outght to stay that way) and mainly consists of comparing the
platform name against some hardcoded lists. This patch bases this
comparison on the platform name instead of the os part of the triple (as
reported by the platform).
Differential Revision: https://reviews.llvm.org/D121605
Florian Hahn [Tue, 15 Mar 2022 12:32:06 +0000 (12:32 +0000)]
[BasicAA] Add test showing incorrect noalias result with wrapping.
@mul_may_overflow_var_nonzero_minabsvarindex_one_index shows BasicAA
incorrectly determining noalias for (%gep.917, i8* %gep.idx).
If %v ==
10581764700698480926, %idx == 917 and the GEPs alias.
https://alive2.llvm.org/ce/z/yzDgnn
Matthias Springer [Tue, 15 Mar 2022 12:22:21 +0000 (21:22 +0900)]
[mlir][bufferize] Extract buffer hoisting into separate function
This improves the modularity of the bufferization.
From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist.
As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit no longer needed and can be deleted.
Differential Revision: https://reviews.llvm.org/D121519
Marek Kurdej [Tue, 15 Mar 2022 12:16:08 +0000 (13:16 +0100)]
[clang-format] Correctly format variable templates.
Fixes https://github.com/llvm/llvm-project/issues/54257.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D121456
Simon Pilgrim [Tue, 15 Mar 2022 12:16:11 +0000 (12:16 +0000)]
[X86] combineSelect - canonicalize (vXi1 bitcast(iX Cond)) with combineToExtendBoolVectorInReg before legalization
This replaces the attempt in
20af71f8ec47319d375a871db6fd3889c2487cbd to use combineToExtendBoolVectorInReg to create X86ISD::BLENDV masks directly, instead we use it to canonicalize the iX bitcast to a sign-extended mask and then truncate it back to vXi1 prior to legalization breaking it apart.
Fixes #53760
Marek Kurdej [Tue, 15 Mar 2022 11:00:37 +0000 (12:00 +0100)]
[clang-format] Add regression tests for function ref qualifiers on operator definition. NFC.
Fixes https://github.com/llvm/llvm-project/issues/54374.
Florian Hahn [Tue, 15 Mar 2022 11:13:18 +0000 (11:13 +0000)]
[LV] Make reduction-order.ll test independent of instruction naming.
Also update test to not use branch on undef.
Dmitry Makogon [Tue, 15 Mar 2022 10:10:41 +0000 (17:10 +0700)]
[NFC] Add LazyValueInfo::clear method
This method just calls LazyValueInfoImpl::clear
Marek Kurdej [Mon, 14 Mar 2022 11:00:19 +0000 (12:00 +0100)]
[clang-format] Correctly recognize arrays in template parameter list.
Fixes https://github.com/llvm/llvm-project/issues/54245.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D121584
Ivan Butygin [Sun, 13 Mar 2022 10:56:59 +0000 (13:56 +0300)]
[mlir][gpu] Introduce gpu.global_id op
Introduce OpenCL-style global_id op and corresponding spirv lowering.
Differential Revision: https://reviews.llvm.org/D121548
Ivan Butygin [Mon, 14 Mar 2022 13:24:08 +0000 (16:24 +0300)]
[mlir][spirv] Add AssumeTrueKHROp
Differential Revision: https://reviews.llvm.org/D121601
Matthias Springer [Tue, 15 Mar 2022 08:50:09 +0000 (17:50 +0900)]
[mlir][bufferize][NFC] Deallocate all buffers at the end of bufferization
This makes bufferization more modular. This is in preparation of future refactorings.
Differential Revision: https://reviews.llvm.org/D121362
Nikita Popov [Thu, 3 Mar 2022 10:55:35 +0000 (11:55 +0100)]
[OpenMPOpt] Avoid pointer element type access during region merging
Hardcode the function type as ParallelTask, which is the guaranteed
pointee type of this runtime function argument (if pointee types
exist). The elimination of the callee bitcast is left for InstCombine.
Differential Revision: https://reviews.llvm.org/D120885
Matthias Springer [Tue, 15 Mar 2022 08:34:46 +0000 (17:34 +0900)]
[mlir][bufferize][NFC] Split BufferizationState into AnalysisState/BufferizationState
Differential Revision: https://reviews.llvm.org/D121361
Jean Perier [Tue, 15 Mar 2022 08:32:43 +0000 (09:32 +0100)]
[flang] fulfill -Msave/-fno-automatic in main programs too
`semantics::IsSaved()` was not applying -Msave/-fno-automatic for main programs.
This caused issues since lowering relies on it to allocate static
variables. This did not match nvfortran/gfortran behaviors where
-fno-automatic/-Msave control the static allocation of scalars in
main programs.
Some program may rely on main program scalars to be statically allocated in
bss (and therefore initialized to zero) with -Msave/-fno-automatic
flags.
Differential Revision: https://reviews.llvm.org/D121603
Matthias Springer [Tue, 15 Mar 2022 08:25:56 +0000 (17:25 +0900)]
[mlir][bufferize] Fix config not passed to greedy rewriter
Also add a TODO to switch to a custom walk instead of the GreedyPatternRewriter, which should be more efficient. (The bufferization pattern is guaranteed to apply only a single time for every op, so a simple walk should suffice.)
We currently specify a top-to-bottom walk order. This is important because other walk orders could introduce additional casts and/or buffer copies. These canonicalize away again, but it is more efficient to never generate them in the first place.
Note: A few of these canonicalizations are not yet implemented.
Differential Revision: https://reviews.llvm.org/D121518
Siva Chandra Reddy [Tue, 15 Mar 2022 08:29:58 +0000 (08:29 +0000)]
[libc][Obvious] Fix typo in CMake file.
Jean Perier [Tue, 15 Mar 2022 08:23:50 +0000 (09:23 +0100)]
[flang] Hanlde COMPLEX 2/3/10 in runtime TypeCode(cat, kind)
Type codes for COMPLEX kinds 2, 3, and 10 were added in https://reviews.llvm.org/D117336
but handling for these kinds in TypeCode(cat, kind) has not been added
yet.
Differential Revision: https://reviews.llvm.org/D121587
Fangrui Song [Tue, 15 Mar 2022 08:24:01 +0000 (01:24 -0700)]
[MachineLICM] Simplify code and avoid adding nullptr values to ParentMap. NFC
Florian Hahn [Tue, 15 Mar 2022 08:22:30 +0000 (08:22 +0000)]
[LV] Remove LoopVectorBody from InnerLoopVectorizer. (NFCI)
Update places still referencing LoopVectorBody to use the vector loop to
get the vector loop header. This is needed to move vector loop
code-generation to VPlan completely, which in turn is needed to model
pre-header & exit blocks in VPlan as well.
River Riddle [Mon, 7 Mar 2022 09:33:58 +0000 (01:33 -0800)]
[mlir] Remove the deprecated ODS Op verifier/parser/printer code blocks
These have been deprecated for ~1 month now and can be removed.
Differential Revision: https://reviews.llvm.org/D121090
Stanislav Gatev [Mon, 14 Mar 2022 14:52:35 +0000 (14:52 +0000)]
[clang][dataflow] Model the behavior of non-standard optional constructors
Model nullopt, inplace, value, and conversion constructors.
Reviewed-by: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D121602
Adrian Kuegel [Tue, 15 Mar 2022 08:04:07 +0000 (09:04 +0100)]
[mlir][Bazel] Adjust build file to account for new td files.
Qiu Chaofan [Tue, 15 Mar 2022 07:52:24 +0000 (15:52 +0800)]
[PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it
brings performance penalty in hot loop around vectors. For example, in
following loop sharing the same mask:
%v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
%v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes 20%+ performance
penalty.
The original attempt to resolve this is to pre-record masks of every
shufflevector operation in DAG, but that is somewhat complex and brings
unnecessary computation (to scan all nodes) in optimization. Here we
disable it by default. There're indeed some cases becoming worse after
this, which will be fixed in a more careful way in future patches.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D121082
River Riddle [Sat, 12 Mar 2022 03:02:53 +0000 (19:02 -0800)]
[mlir] Refactor how parser/printers are specified for AttrDef/TypeDef
There is currently an awkwardly complex set of rules for how a
parser/printer is generated for AttrDef/TypeDef. It can change depending on if a
mnemonic was specified, if there are parameters, if using the assemblyFormat, if
individual parser/printer code blocks were specified, etc. This commit refactors
this to make what the attribute/type wants more explicit, and to better align
with how formats are specified for operations.
Firstly, the parser/printer code blocks are removed in favor of a
`hasCustomAssemblyFormat` bit field. This aligns with the operation format
specification (and is nice to remove code blocks from ODS).
This commit also adds a requirement to explicitly set `assemblyFormat` or
`hasCustomAssemblyFormat` when the mnemonic is set and the attr/type
has no parameters. This removes the weird implicit matrix of behavior,
and also encourages the author to make a conscious choice of either C++
or declarative format instead of implicitly opting them into the C++
format (we should be pushing towards declarative when possible).
Differential Revision: https://reviews.llvm.org/D121505
River Riddle [Thu, 17 Feb 2022 06:53:09 +0000 (22:53 -0800)]
[mlir] Rewrite and modernize the documentation for defining Attributes/Types
The current documentation is super old, crusty, and at times wrong. This commit
rewrites the documentation to focus on the TableGen declarative definition,
expounds on various components, and moves the doc out of Tutorials/ and into
a new top level `AttributesAndTypes.md` doc. As part of this, the AttrDef/TypeDef
documentation in OpDefinitions.md is removed.
Differential Revision: https://reviews.llvm.org/D120011
River Riddle [Fri, 28 Jan 2022 05:58:31 +0000 (21:58 -0800)]
[mlir] Split out AttrDef/TypeDef and pattern constructs from OpBase.td
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.
Differential Revision: https://reviews.llvm.org/D118636
Mogball [Tue, 15 Mar 2022 07:12:37 +0000 (07:12 +0000)]
[mlir][ods] Add support for custom directive in attr/type formats
This patch adds support for custom directives in attribute and type formats. Custom directives dispatch calls to user-defined parser and printer functions.
For example, the assembly format "custom<Foo>($foo, ref($bar))" expects a function with the signature
```
LogicalResult parseFoo(AsmParser &parser, FailureOr<FooT> &foo, BarT bar);
void printFoo(AsmPrinter &printer, FooT foo, BarT bar);
```
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120944
esmeyi [Tue, 15 Mar 2022 06:40:50 +0000 (02:40 -0400)]
[NFC][XCOFF] Refactor and format XCOFFObjectWriter.cpp.
Reviewed By: jhenderson, DiggerLin
Differential Revision: https://reviews.llvm.org/D120858
Fangrui Song [Tue, 15 Mar 2022 06:15:15 +0000 (23:15 -0700)]
[llvm-objcopy] Simplify CompressedSection creation. NFC
Remove Expected<CompressedSection> factory functions in favor of constructors
now that zlib::compress returns void (D121512).
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121644
Fangrui Song [Tue, 15 Mar 2022 05:33:08 +0000 (22:33 -0700)]
[MC][test] Add more .loc directives to improve portability with older zlib
Make .debug_line so larger so that MC will more assuredly compress .debug_line
(it doesn't compress a section if compressed content is not smaller).
Chris Lattner [Tue, 15 Mar 2022 04:56:14 +0000 (21:56 -0700)]
Revert "[mlirTranslateMain] Add a customization callback."
This reverts commit
f18d6af7e972ed0e2215ad098b4c5f52ccb68b5f.
This patch is a more controversial than I expected, it is
better to revert while the discussion continues. xref this
thread:
https://discourse.llvm.org/t/doc-mlir-translate-mlir-opt/60751/
xref this phab patch: https://reviews.llvm.org/D120970
Differential Revision: https://reviews.llvm.org/D121668
Thomas Raoux [Tue, 15 Mar 2022 04:55:07 +0000 (04:55 +0000)]
[mlir][nvvm] Fix bug in ldmatrix intrinsic conversion
The ldmatrix intrinsic trans option was inverted.
Bug found by @christopherbate!
Differential Revision: https://reviews.llvm.org/D121666
Jonas Devlieghere [Tue, 15 Mar 2022 04:54:07 +0000 (21:54 -0700)]
[lldb] Cleanup MacOSX platform headers (NFC)
While working on
dde487e54782 I noticed that the MacOSX platforms were
in need of some love. This patch cleans up the headers:
- Move platforms into the lldb_private namespace.
- Remove lldb_private:: prefixes to improve readability.
- Fix header includes and use forward declarations (iwyu).
- Fix formatting
Keith Smiley [Sat, 26 Feb 2022 05:12:56 +0000 (21:12 -0800)]
[clang] Fix DIFile directory root on Windows
On unix systems this logic would not separate the file and directory of
the DIFile unless they shared more components at the start than just the
root path character. The logic to do this was unix specific so it didn't
work on Windows. Now we check if the entire root_path is the same as
what you were going to set as the Dir and use the full filepath in that
case.
Differential Revision: https://reviews.llvm.org/D111579
Keith Smiley [Sat, 26 Feb 2022 04:40:21 +0000 (20:40 -0800)]
[test] Add lit helper for windows paths
This adds 2 new lit helpers `%{fs-src-root}` and `%{fs-sep}`, these
allow writing tests that correctly handle slashes on Windows. In the
case of tests like clang/test/CodeGen/debug-prefix-map.c, these are
unable to correctly test behavior on both platforms, unless they fork
and add OS requirements, because the relevant logic hits host specific
codepaths like checking if paths are absolute.
Differential Revision: https://reviews.llvm.org/D111457
Sam Clegg [Tue, 15 Mar 2022 02:57:12 +0000 (19:57 -0700)]
[WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349
Ruiling Song [Tue, 11 Jan 2022 13:18:12 +0000 (21:18 +0800)]
AMDGPU: Use removeAllRegUnitsForPhysReg()
I met the issue here when working on something else.
Actually we have already reserved EXEC, but it looks
like the register coalescer is causing the sub-register
of EXEC appears in LiveIntervals. I have not looked
deeper why register coalscer have such behavior, but
removeAllRegUnitsForPhysReg() is the right way.
Reviewed By: critson, foad, arsenm
Differential Revision: https://reviews.llvm.org/D117014
Petr Hosek [Tue, 15 Mar 2022 02:01:26 +0000 (19:01 -0700)]
[CMake][Fuchsia] Use correct architecture for iossim
We should be building iossim for x86_64, not arm64.
Differential Revision: https://reviews.llvm.org/D121659
Jez Ng [Tue, 15 Mar 2022 01:51:15 +0000 (21:51 -0400)]
[lld-macho] -flat_namespace for dylibs should make all externs interposable
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.
ld64 makes all extern symbols interposable when linking with
`-flat_namespace`.
TODO 1: Support `-interposable` and `-interposable_list`, which should
just be a matter of parsing those CLI flags and setting the
`Defined::interposable` bit.
TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info
(we are currently not setting it at all, so we're erring on the
conservative side, but we should help the LTO backend generate more
optimal code.)
Reviewed By: modimo, MaskRay
Differential Revision: https://reviews.llvm.org/D119294
Jez Ng [Tue, 15 Mar 2022 01:51:11 +0000 (21:51 -0400)]
[lld-macho][nfc] Allow Defined symbols to be placed in binding sections
Previously, we only allowed this for DylibSymbols. However, in order to
properly support `-flat_namespace` as well as `-interposable`, we need
to allow this for Defined symbols too. Therefore we hoist the
`lazyBindOffset` and the `stubsHelperIndex` into the parent Symbol
class.
The actual change to support interposition under `-flat_namespace` is in
{D119294}; the NFC changes here have been split out for easier review.
Perf regression isn't stat sig on my 3.2 GHz 16-Core Intel Xeon W linking
chromium_framework:
base diff difference (95% CI)
sys_time 1.227 ± 0.021 1.234 ± 0.031 [ -0.3% .. +1.5%]
user_time 3.665 ± 0.036 3.674 ± 0.035 [ -0.2% .. +0.7%]
wall_time 4.596 ± 0.055 4.609 ± 0.064 [ -0.3% .. +0.9%]
samples 34 47
Max RSS regression is barely stat sig:
base diff difference (95% CI)
time
1003664356.324 ±
15404053.912
1010380403.613 ±
10578309.455 [ +0.0% .. +1.3%]
samples 37 31
Reviewed By: modimo
Differential Revision: https://reviews.llvm.org/D121351
Owen Pan [Mon, 14 Mar 2022 07:09:48 +0000 (00:09 -0700)]
[clang-format] Don't unwrap lines preceded by line comments
Fixes #53495
Differential Revision: https://reviews.llvm.org/D121576
Joseph Huber [Tue, 15 Mar 2022 01:51:38 +0000 (21:51 -0400)]
[OpenMP][Fix] Fix test failing after patch
Joseph Huber [Tue, 15 Mar 2022 01:16:45 +0000 (21:16 -0400)]
[OpenMP][Fix] Add offloading kind to AMDGPU libraries
Summary:
A previous patch added the offloading kind to the triple format we used.
I forgot to update the line where we add the AMDGPU libraries.
LLVM GN Syncbot [Tue, 15 Mar 2022 00:51:57 +0000 (00:51 +0000)]
[gn build] Port
9c542a5a4e1b
Julian Lettner [Wed, 9 Mar 2022 20:33:59 +0000 (12:33 -0800)]
Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be
removed in the future.
Differential Revision: https://reviews.llvm.org/D121327
Joseph Huber [Thu, 3 Mar 2022 19:56:13 +0000 (14:56 -0500)]
[CUDA] Add CUDA fatbinary magic
Nvidia uses fatbinaries to bundle all of their device code. This patch
adds the magic number "0x50ed55ba" used in their propeitary format to
the list of magic identifies. This is technically undocumented and could
unlikely be changed by Nvidia in the future.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D120932
Joseph Huber [Thu, 3 Mar 2022 18:07:44 +0000 (13:07 -0500)]
[OpenMP][NFC] Refactor new driver to be more general
This path refactors the new driver to be less dependent on OpenMP. This
is done in preparation for the new driver to be able to handle other
offloading kinds and compile them together.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120934
Joseph Huber [Mon, 21 Feb 2022 23:55:51 +0000 (18:55 -0500)]
[Clang] Add offload kind to embedded offload object
This patch adds the offload kind to the embedded section name in
preparation for offloading to different kinda like CUDA or HIP.
Depends on D120288
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120271
Joseph Huber [Mon, 21 Feb 2022 15:08:06 +0000 (10:08 -0500)]
[OpenMP] Implement dense map info for device file
This patch implements a DenseMap info struct for the device file type.
This is used to help grouping device files that have the same triple and
architecture. Because of this the filename, which will always be unique
for each file, is not used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120288
Joseph Huber [Mon, 21 Feb 2022 15:07:37 +0000 (10:07 -0500)]
[OpenMP] Try to embed offloading objects after codegen
Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120270
Stanislav Mekhanoshin [Mon, 14 Mar 2022 19:39:52 +0000 (12:39 -0700)]
[AMDGPU] gfx940: disable OP_SEL on V_DOT instructions
Differential Revision: https://reviews.llvm.org/D121634
Vy Nguyen [Mon, 14 Mar 2022 22:41:57 +0000 (18:41 -0400)]
Reland "[lld-macho] Avoid using bump-alloc in TrieBuider""
This reverts commit
ee7a286cd3e4364d2f7d5b6ba4b8a6fc0d524854.
Stanislav Mekhanoshin [Thu, 10 Mar 2022 22:59:47 +0000 (14:59 -0800)]
[AMDGPU] Add symbolic names for gfx940 HWREGs
The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper error messages, i.e. to issue a "specified
hardware register is not supported on this GPU" message.
This may need a major refactoring in the future.
Differential Revision: https://reviews.llvm.org/D121418
Andrew Browne [Mon, 14 Mar 2022 20:59:23 +0000 (13:59 -0700)]
[DFSan] Remove trampolines to unblock opaque pointers. (Reland with fix)
https://github.com/llvm/llvm-project/issues/54172
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D121250
Stanislav Mekhanoshin [Thu, 10 Mar 2022 22:04:16 +0000 (14:04 -0800)]
[AMDGPU] Support for gfx940 flat lds opcodes
Differential Revision: https://reviews.llvm.org/D121414
Stanislav Mekhanoshin [Thu, 10 Mar 2022 19:51:57 +0000 (11:51 -0800)]
[AMDGPU] Support gfx940 v_lshl_add_u64 instruction
Differential Revision: https://reviews.llvm.org/D121401
Ryan Prichard [Mon, 14 Mar 2022 22:43:50 +0000 (15:43 -0700)]
[ARM] __cxa_end_cleanup: avoid clobbering r4
The fix for D111703 clobbered r4 both to:
- Save/restore the original lr.
- Load the address of _Unwind_Resume for LIBCXXABI_BAREMETAL.
This patch saves and restores lr without clobbering any extra
registers.
For LIBCXXABI_BAREMETAL, it is still necessary to clobber one extra
register to hold the address of _Unwind_Resume, but it seems better to
use ip/r12 (intended for linker veneers/trampolines) than r4 for this
purpose.
The function also clobbers r0 for the _Unwind_Resume function's
parameter, but that is unavoidable.
Reviewed By: danielkiss, logan, MaskRay
Differential Revision: https://reviews.llvm.org/D121432
Dávid Bolvanský [Mon, 14 Mar 2022 22:39:52 +0000 (23:39 +0100)]
[Clang] noinline stmt attribute - emit warnings rather than errors
Compatible behaviour with always_inline stmt attribute
Jim Ingham [Mon, 14 Mar 2022 21:37:38 +0000 (14:37 -0700)]
Don't report memory return values on MacOS_arm64 of SysV_arm64 ABI's.
They don't require that the memory return address be restored prior to
function exit, so there's no guarantee the value is correct. It's better
to return nothing that something that's not accurate.
Differential Revision: https://reviews.llvm.org/D121348
Stanislav Mekhanoshin [Tue, 8 Mar 2022 21:33:48 +0000 (13:33 -0800)]
[AMDGPU] flat scratch SVS addressing mode for gfx940
Both VADDR and SADDR are used in SVS mode.
Differential Revision: https://reviews.llvm.org/D121254