Alfsonso Gregory [Sun, 3 Oct 2021 02:37:12 +0000 (08:07 +0530)]
[LLVM][IR] Fixed input arguments for Verifier getter
ParameterABIAttributes functions work with unsigned integers as the index, so having the getter be signed makes no sense. Additionally, for this reason, the loop vars that were signed were changed to unsigned too.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D110344
Takafumi Arakaki [Sun, 3 Oct 2021 01:31:59 +0000 (21:31 -0400)]
Re-apply the fix on DwarfEHPrepare and add a test
This patch re-introduces the fix in the commit https://github.com/llvm/llvm-project/commit/66b0cebf7f736 by @yrnkrn
> In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling
>
> pointer to a dead function. To make sure it's valid, doFinalization nullptrs
> RewindFunction just like the constructor and so it will be found on next run.
>
> llvm-svn: 217737
It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`.
This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with
```
-- Testing: 1 tests, 1 workers --
FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1)
******************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S | /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll
--
Exit Code: 2
Command Output (stderr):
--
Referencing function in another module!
call void @_Unwind_Resume(i8* %ehptr) #1
; ModuleID = '<stdin>'
void (i8*)* @_Unwind_Resume
; ModuleID = '<stdin>'
in function simple_cleanup_catch
LLVM ERROR: Broken function found, compilation aborted!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'Module Verifier' on function '@simple_cleanup_catch'
#0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0
#1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0
#2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0
#3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
#4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
#5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0
#6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0
#7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0
#8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528)
#9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll
--
********************
********************
Failed Tests (1):
LLVM :: CodeGen/X86/dwarf-eh-prepare.ll
Testing Time: 0.22s
Failed: 1
```
Reviewed By: loladiro
Differential Revision: https://reviews.llvm.org/D110979
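The dangling-pointer hazard quoted above can be illustrated outside LLVM. The following is a minimal sketch, not the code of `DwarfEHPrepareLegacyPass`: `Module`, `Function`, and `CachingPass` are hypothetical stand-ins invented for this example. The pass caches a pointer into the module it ran on, and `doFinalization` resets the cache so that the next run looks the function up again.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; these are not LLVM's classes.
struct Function {
  std::string Name;
};

struct Module {
  std::map<std::string, std::unique_ptr<Function>> Functions;

  // Return the function with this name, creating it on first use.
  Function *getOrInsert(const std::string &Name) {
    auto &Slot = Functions[Name];
    if (!Slot)
      Slot = std::make_unique<Function>(Function{Name});
    return Slot.get();
  }
};

class CachingPass {
  Function *Rewind = nullptr; // cached between calls to run()

public:
  Function *run(Module &M) {
    if (!Rewind)
      Rewind = M.getOrInsert("_Unwind_Resume");
    return Rewind;
  }

  // Reset the cache so that a later run on another module does not reuse
  // a pointer into the previous (possibly destroyed) module.
  void doFinalization() { Rewind = nullptr; }
};
```

If `doFinalization` left `Rewind` untouched, a second `run` on a different module would return a function owned by the first one, which is the kind of cross-module reference the verifier reports in the log above.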
Arthur O'Dwyer [Mon, 27 Sep 2021 04:58:56 +0000 (00:58 -0400)]
[libc++] [ranges] Uncomment operator<=> in transform and iota iterators.
The existing tests for transform_view::iterator weren't quite right,
and can be simplified now that we have more of C++20 available to us.
Having done that, let's use the same pattern for iota_view::iterator
as well.
Differential Revision: https://reviews.llvm.org/D110774
Mehdi Amini [Sat, 2 Oct 2021 23:55:25 +0000 (23:55 +0000)]
Fix memory leak in MLIR SPIRV ModuleCombiner
Mehdi Amini [Sat, 2 Oct 2021 23:53:02 +0000 (23:53 +0000)]
Fix/disable more MLIR tests exposing leaks in ASAN builds (NFC)
Mehdi Amini [Sat, 2 Oct 2021 23:16:35 +0000 (23:16 +0000)]
Fix multiple memory leaks in mlir-cpu-runner tests (NFC)
Mehdi Amini [Sat, 2 Oct 2021 23:07:39 +0000 (23:07 +0000)]
Fix memory leak in mlir-cpu-runner/sgemm_naive_codegen.mlir (NFC)
Mehdi Amini [Sat, 2 Oct 2021 21:28:28 +0000 (21:28 +0000)]
Fix Undefined Behavior in MLIR Diagnostic: don't call memcpy with a nullptr source
This happens when streaming an empty Twine as part of a diagnostic.
Differential Revision: https://reviews.llvm.org/D111002
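A minimal sketch of the usual guard for this class of bug, assuming a hypothetical helper `copyBytes` (not the function changed by the patch): under the C and C++ standards current at the time of this commit, passing a null pointer to `memcpy` is undefined even when the size is zero, so the call is skipped for empty input.

```cpp
#include <cstddef>
#include <cstring>

// Copy Size bytes from Src to Dst. For empty input the memcpy call is
// skipped, so a null Src paired with Size == 0 never reaches memcpy.
inline void copyBytes(char *Dst, const char *Src, std::size_t Size) {
  if (Size == 0)
    return;
  std::memcpy(Dst, Src, Size);
}
```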
Mehdi Amini [Sat, 2 Oct 2021 21:31:17 +0000 (21:31 +0000)]
Fix memory leaks in MLIR unit-tests (NFC)
Mehdi Amini [Sat, 2 Oct 2021 21:05:22 +0000 (21:05 +0000)]
Fix memory leaks in mlir/unittests/MLIRTableGenTests
Trying to get MLIR ASAN-clean.
Philip Reames [Sat, 2 Oct 2021 19:38:50 +0000 (12:38 -0700)]
[SCEV] Split isSCEVExprNeverPoison reasoning explicitly into scope and mustexecute parts [NFC]
Inspired by the needs of D111001 and D109845. The separation of concerns also makes it easier to reason about correctness and completeness.
Kazu Hirata [Sat, 2 Oct 2021 19:06:29 +0000 (12:06 -0700)]
[Target] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.
Lang Hames [Sat, 2 Oct 2021 18:28:14 +0000 (11:28 -0700)]
[llvm-jitlink] Sink getPageSize call in Session::Create.
The page size for the host process is only needed in the in-process use case.
Simon Pilgrim [Fri, 1 Oct 2021 20:53:00 +0000 (21:53 +0100)]
[X86][Atom] Fix BSR/BSF uops + port usage
Both ports are required for BitScan ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.
Craig Topper [Sat, 2 Oct 2021 17:44:05 +0000 (10:44 -0700)]
Revert "[RISCV] Add an GPR def to the Zvlseg SPILL/RELOAD pseudos"
This reverts commit 1f161919065fbfa2b39b8f373553a64b89f826f8.
We're seeing some issues with this internally. It seems that when
the spill is created by register allocation, the GPR doesn't get
allocated and an assertion fires during virtual register rewriting.
The .mir test case contains the spill before register allocation so
register allocation sees it as any other instruction.
mydeveloperday [Sat, 2 Oct 2021 17:04:32 +0000 (18:04 +0100)]
[clang-format] NFC 1% improvement in the overall clang-formatted status
Mehdi Amini [Sat, 2 Oct 2021 05:16:44 +0000 (05:16 +0000)]
Free memory leak on duplicate interface registration
I guess this is why we should use unique_ptr as much as possible.
Also fix the InterfaceAttachmentTest.cpp test.
Differential Revision: https://reviews.llvm.org/D110984
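The remark about `unique_ptr` can be made concrete with a small sketch. `Interface`, `CountedInterface`, and `InterfaceRegistry` are hypothetical names for this example and do not mirror MLIR's registration API. The point shown is that when a duplicate registration is rejected, the rejected object is still owned by the `unique_ptr` parameter and is freed when it goes out of scope.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical types for illustration; not MLIR's interface machinery.
struct Interface {
  virtual ~Interface() = default;
};

// Instrumented implementation that tracks how many instances are alive.
struct CountedInterface : Interface {
  int &Live;
  explicit CountedInterface(int &L) : Live(L) { ++Live; }
  ~CountedInterface() override { --Live; }
};

class InterfaceRegistry {
  std::map<std::string, std::unique_ptr<Interface>> Entries;

public:
  // Returns true if Impl was registered. On a duplicate name, try_emplace
  // leaves Impl untouched, so the rejected object is destroyed when the
  // parameter goes out of scope instead of leaking.
  bool registerInterface(const std::string &Name,
                         std::unique_ptr<Interface> Impl) {
    return Entries.try_emplace(Name, std::move(Impl)).second;
  }

  std::size_t size() const { return Entries.size(); }
};
```

With a raw owning pointer the rejected object would have to be deleted by hand on the duplicate path, which is easy to forget.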
Simon Pilgrim [Sat, 2 Oct 2021 14:30:58 +0000 (15:30 +0100)]
[X86][SSE] Fix typo + infinite-loop in HOP(HOP'(X,X),HOP'(Y,Y)) fold (PR52040)
PR52040 identified several issues with the HOP(HOP'(X,X),HOP'(Y,Y)) -> HOP(PERMUTE(HOP'(X,Y)),PERMUTE(HOP'(X,Y)) slow-HOP fold.
Not only was there a copy+paste typo when accessing the inner HOP operands, but the (unnecessary) ReplaceAllUsesOfValueWith call was missing one-use checks.
Now that we have better shuffle combines of HOPs we can just return a new HOP() sequence and not use ReplaceAllUsesOfValueWith at all - this actually improved pair_sum_v8i32_v4i32 codegen as it kicks off further shuffle combines.
Josh Learn [Sat, 2 Oct 2021 12:22:49 +0000 (13:22 +0100)]
[clang-format] Constructor initializer lists format with pp directives
Currently constructor initializer lists sometimes format incorrectly
when there is a preprocessor directive in the middle of the list.
This patch fixes the issue when parsing the initializer list by
ignoring the preprocessor directive when checking if a block is
part of an initializer list.
rdar://82554274
Reviewed By: MyDeveloperDay, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D109951
mydeveloperday [Sat, 2 Oct 2021 12:18:00 +0000 (13:18 +0100)]
[clang-format] [docs] [NFC] improve clarity in the QualifierAlignment warning
Improve the clarity and guidance of the warning about using code-modifying options in clang-format; see {D69764}
Reviewed By: HazardyKnusperkeks, curdeius
Differential Revision: https://reviews.llvm.org/D110801
Mark de Wever [Sat, 2 Oct 2021 11:47:27 +0000 (13:47 +0200)]
[NFC][libc++] Use TEST_HAS_NO_EXCEPTIONS in tests.
Mark de Wever [Sat, 2 Oct 2021 11:41:05 +0000 (13:41 +0200)]
[libc++][doc] Update format status.
Updated based on recent commits, new reviews and work continuing for
P2216.
Simon Pilgrim [Fri, 1 Oct 2021 17:53:02 +0000 (18:53 +0100)]
[X86] decomposeMulByConstant - decompose legal vXi32 multiplies on SlowPMULLD targets and all vXi64 multiplies
X86's decomposeMulByConstant never permits mul decomposition to shift+add/sub if the vector multiply is legal.
Unfortunately this isn't great for SSE41+ targets, which have PMULLD for vXi32 multiplies but where it is often quite slow. This patch proposes to allow decomposition if the target has the SlowPMULLD flag (i.e. Silvermont). We also always decompose legal vXi64 multiplies - even the latest IceLake has really poor latencies for PMULLQ.
Differential Revision: https://reviews.llvm.org/D110588
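As a scalar illustration of the shift+add/sub idea (a sketch only, not the X86 lowering code; `ShiftAddSub`, `decomposeMul`, and `applyDecomposed` are hypothetical names): a multiply by a constant C can be replaced by one shift and one add when C - 1 is a power of two, or by one shift and one subtract when C + 1 is a power of two.

```cpp
#include <cstdint>
#include <optional>

// x * C == (x << Shift) + x   when IsSub is false
// x * C == (x << Shift) - x   when IsSub is true
struct ShiftAddSub {
  unsigned Shift;
  bool IsSub;
};

static bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Floor of log2 for a non-zero value.
static unsigned log2u(uint32_t V) {
  unsigned N = 0;
  while (V >>= 1)
    ++N;
  return N;
}

// Returns a decomposition for C, or nullopt if neither C - 1 nor C + 1 is
// a power of two. Constants up to 2 are left alone (0, 1 and plain powers
// of two need no add/sub), and UINT32_MAX is skipped to avoid a 32-bit shift.
std::optional<ShiftAddSub> decomposeMul(uint32_t C) {
  if (C > 2 && isPowerOf2(C - 1))
    return ShiftAddSub{log2u(C - 1), false};
  if (C > 2 && C != UINT32_MAX && isPowerOf2(C + 1))
    return ShiftAddSub{log2u(C + 1), true};
  return std::nullopt;
}

uint32_t applyDecomposed(uint32_t X, ShiftAddSub D) {
  uint32_t Shifted = X << D.Shift;
  return D.IsSub ? Shifted - X : Shifted + X;
}
```

For example, a multiply by 9 becomes `(x << 3) + x` and a multiply by 7 becomes `(x << 3) - x`, while 10 has no single shift+add/sub form under this scheme.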
Simon Pilgrim [Thu, 30 Sep 2021 11:28:02 +0000 (12:28 +0100)]
[X86] Atom SSE shift-by-variable take 2uops/3uops not 1uop
Based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.
Roman Lebedev [Sat, 2 Oct 2021 10:40:09 +0000 (13:40 +0300)]
[X86][Costmodel] Load/store i8 Stride=4 VF=32 interleaving costs
While we already model this tuple, the load cost is divergent from reality, so fix it.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/zWMhhnPYa - for intels `Block RThroughput: =56.0`; for ryzens, `Block RThroughput: <=24.0`
So pick cost of `56`.
For store we have:
https://godbolt.org/z/vnqqjWx51 - for intels `Block RThroughput: =12.0`; for ryzens, `Block RThroughput: <=4.0`
So pick cost of `12`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110971
Roman Lebedev [Sat, 2 Oct 2021 10:40:09 +0000 (13:40 +0300)]
[X86][Costmodel] Load/store i8 Stride=4 VF=16 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/TrGW7cKsE - for intels `Block RThroughput: =24.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `24`.
For store we have:
https://godbolt.org/z/Mh7qaqEfe - for intels `Block RThroughput: =8.0`; for ryzens, `Block RThroughput: <=4.0`
So pick cost of `8`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110970
Roman Lebedev [Sat, 2 Oct 2021 10:40:04 +0000 (13:40 +0300)]
[X86][Costmodel] Load/store i8 Stride=4 VF=8 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/v7746Wcf7 - for intels `Block RThroughput: =12.0`; for ryzens, `Block RThroughput: <=6.0`
So pick cost of `12`.
For store we have:
https://godbolt.org/z/aEeEohEbP - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110969
Roman Lebedev [Sat, 2 Oct 2021 10:39:58 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=4 VF=4 interleaving costs
While we already model this tuple, the store cost is divergent from reality, so fix it.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/1n4bPh7Tn - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
For store we have:
https://godbolt.org/z/r8K9sveqo - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110968
Roman Lebedev [Sat, 2 Oct 2021 10:39:54 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=4 VF=2 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/KP6nn36zs - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
For store we have:
https://godbolt.org/z/ov95zhrq6 - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110966
Roman Lebedev [Sat, 2 Oct 2021 10:39:15 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=3 VF=32 interleaving costs
For VF=16, costs are correct.
For VF=32, load cost is divergent.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/qKjevqf4W - for intels `Block RThroughput: <=14.0`; for ryzens, `Block RThroughput: <=4.5`
So pick cost of `14`.
For store we have:
https://godbolt.org/z/xTssTq319 - for intels `Block RThroughput: =13.0`; for ryzens, `Block RThroughput: <=5.5`
So pick cost of `13`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110961
Roman Lebedev [Sat, 2 Oct 2021 10:39:15 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=3 VF=8 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/1jeocxj55 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `6`.
For store we have:
https://godbolt.org/z/fr7xfa3K5 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `6`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110960
Roman Lebedev [Sat, 2 Oct 2021 10:39:10 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=3 VF=4 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/obWz3PrfK - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=1.5`
So pick cost of `3`.
For store we have:
https://godbolt.org/z/orjPshn3h - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110958
Roman Lebedev [Sat, 2 Oct 2021 10:39:05 +0000 (13:39 +0300)]
[X86][Costmodel] Load/store i8 Stride=3 VF=2 interleaving costs
While we already model this tuple, the values are divergent from reality, so fix them.
The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3
For load we have:
https://godbolt.org/z/WYscYMcW4 - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=1.5`
So pick cost of `3`.
For store we have:
https://godbolt.org/z/e9qvYdbbs - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110956
Mark de Wever [Tue, 25 May 2021 18:32:38 +0000 (20:32 +0200)]
[libc++][format] Implement Unicode support.
This adds the width estimation functions to the std-format-spec.
Implements parts of:
- P0645 Text Formatting
- P1868 width: clarifying units of width and precision in std::format
Reviewed By: #libc, ldionne, vitaut
Differential Revision: https://reviews.llvm.org/D103413
Tomasz Miąsko [Sat, 2 Oct 2021 05:58:54 +0000 (07:58 +0200)]
[llvm-cxxfilt] Replace isalnum with isAlnum from StringExtras
D104366 introduced a new llvm-cxxfilt test with non-ASCII characters,
which caused a failure on llvm-clang-x86_64-expensive-checks-win
builder, with a stack trace suggesting issue in a call to isalnum.
The argument to isalnum should be either EOF or a value that is
representable in the type unsigned char. The llvm-cxxfilt does not
perform a cast from char to unsigned char before the call, so the
value might be out of valid range.
Replace the call to isalnum with isAlnum from StringExtras, which takes
a char as the argument. This also makes the check independent of the
current locale.
Differential Revision: https://reviews.llvm.org/D110986
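The two ways of avoiding the out-of-range argument can be sketched as follows. `isAsciiAlnum` and `isAlnumViaCast` are hypothetical helpers written for this example; the first only resembles the StringExtras function in that it takes a plain `char` and ignores the locale.

```cpp
#include <cctype>

// Locale-independent check that accepts a plain char, so bytes with the
// high bit set (negative where char is signed) are handled safely.
inline bool isAsciiAlnum(char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9');
}

// Alternative: keep std::isalnum but convert to unsigned char first, which
// keeps the argument in the range the C library requires. The result still
// depends on the current locale.
inline bool isAlnumViaCast(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) != 0;
}
```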
Amara Emerson [Sat, 2 Oct 2021 04:51:46 +0000 (21:51 -0700)]
[AArch64][GlobalISel] Lower G_SMULH/G_UMULH unless it's one of the supported types.
s32 was also incorrectly marked as a supported type, and was causing fallbacks
because we don't support it.
Alexey Lapshin [Thu, 23 Sep 2021 09:26:25 +0000 (12:26 +0300)]
[DWARF][NFC] add ParentIdx and SiblingIdx to DWARFDebugInfoEntry for faster navigation.
This patch implements a suggestion made while reviewing D102634. It adds two fields:
ParentIdx and SiblingIdx. These fields allow fast navigation to a DIE's parent and
sibling. They are set at the moment the DIEs are loaded.
dsymutil works 2% faster with this patch (run on the clang binary).
Differential Revision: https://reviews.llvm.org/D110363
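The idea of storing parent and sibling indices in a flat array can be sketched as below. This is an assumption-laden illustration, not the layout of `DWARFDebugInfoEntry`: the `Entry` struct and `loadEntries` are hypothetical, and the input is simplified to a pre-order list of depths that starts at 0 and never grows by more than one per step.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical flat-array entry; field names echo the commit message only.
struct Entry {
  uint32_t Depth;
  std::optional<uint32_t> ParentIdx;
  std::optional<uint32_t> SiblingIdx;
};

// Build entries from depths given in pre-order, filling both indices in a
// single pass while "loading".
std::vector<Entry> loadEntries(const std::vector<uint32_t> &Depths) {
  std::vector<Entry> Entries;
  std::vector<uint32_t> Open; // indices on the path from the root to here
  for (uint32_t I = 0; I < Depths.size(); ++I) {
    uint32_t D = Depths[I];
    // Close every entry that is not an ancestor of entry I. The one that
    // sat at the same depth as I gets I as its next sibling.
    while (Open.size() > D) {
      uint32_t Closed = Open.back();
      Open.pop_back();
      if (Open.size() == D)
        Entries[Closed].SiblingIdx = I;
    }
    Entry E{D, std::nullopt, std::nullopt};
    if (!Open.empty())
      E.ParentIdx = Open.back();
    Entries.push_back(E);
    Open.push_back(I);
  }
  return Entries;
}
```

Once the indices are filled in, moving to a parent or to the next sibling is a single array lookup instead of a scan over the entries in between.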
Mehdi Amini [Sat, 2 Oct 2021 04:45:40 +0000 (04:45 +0000)]
Fix memory leaks in mlir/test/CAPI/ir.c
Mehdi Amini [Sat, 2 Oct 2021 04:06:17 +0000 (04:06 +0000)]
Add a `check-mlir-build-only` build target that only builds the dependencies of the `check-mlir` test target (NFC)
Nimish Mishra [Thu, 30 Sep 2021 19:11:57 +0000 (00:41 +0530)]
[flang][OpenMP] Added OpenMP 5.0 specification based semantic checks for sections construct and test case for simd construct
According to OpenMP 5.0 spec document, the following semantic restrictions have been dealt with in this patch.
1. [sections construct] Orphaned section directives are prohibited. That is, the section directives must appear within the sections construct and must not be encountered elsewhere in the sections region.
Semantic checks for the following are not necessary, since use of orphaned section construct (i.e. without an enclosing sections directive) throws parser errors and control flow never reaches the semantic checking phase. Added a test case for the same.
2. [sections construct] Must be a structured block
Added test case and made changes to branching logic
3. [simd construct] Must be a structured block / A program that branches in or out of a function with declare simd is non conforming
4. Fixed !$omp do's handling of unlabeled CYCLEs
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D108904
Shivam Gupta [Sat, 2 Oct 2021 02:05:15 +0000 (07:35 +0530)]
[libc++][Docs] Update benchmark doc wrt monorepo
It seems this section has not been updated since we transitioned to the llvm-project monorepo.
At the start, we build libcxx under the monorepo configuration, but later try to make a separate configuration for building libcxx
and running the benchmark.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D110722
LLVM GN Syncbot [Sat, 2 Oct 2021 00:21:42 +0000 (00:21 +0000)]
[gn build] Port 657f02d45804
Daniel Rodríguez Troitiño [Fri, 1 Oct 2021 21:30:21 +0000 (14:30 -0700)]
Revert "Extract LC_CODE_SIGNATURE related implementation out of LLD"
This reverts commit cc8229603b67763e77a46894f88f7d3ddd04de34.
As discussed in the review of https://reviews.llvm.org/D109972, this was not
the right approach, so we are reverting to start with a different approach.
Differential Revision: https://reviews.llvm.org/D110974
Philip Reames [Fri, 1 Oct 2021 23:39:23 +0000 (16:39 -0700)]
[test] add coverage for a SCEVUnknown scoped value in isSCEVExprNeverPoison
Note that a couple of the "negative" tests also end up showing miscompiles due to D109845 which is not yet fixed.
Philip Reames [Fri, 1 Oct 2021 23:30:44 +0000 (16:30 -0700)]
[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes
This fixes a violation of the wrap flag rules introduced in
c4048d8f. This was also noted in the (very old) PR23527.
The issue being fixed is that we assume the inbound flag on any GEP assumes that all users of *any* gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true.
In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when its legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics.
The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow-up ideas mentioned in PR51817.
It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before nor after exploits that.
Differential Revision: https://reviews.llvm.org/D109789
Philip Reames [Fri, 1 Oct 2021 22:57:37 +0000 (15:57 -0700)]
[SCEV] Remove invariant requirement from isSCEVExprNeverPoison
This code is attempting to prove that I must execute if we enter the defining scope of the SCEV which will be created from I. In the case where it found a defining addrec scope, it had a rather odd restriction that all of the other operands must be loop invariant in that addrec's loop.
As near as I can tell here, we really only need an upper bound on the defining scope. If we can prove the stronger property, then we must also have proven the property on the exact defining scope as well.
In practice, the actual effect of this change is narrow. The compile time restriction at the top of the routine basically limits us to I being an arithmetic instruction in some loop L with both an addrec operand in L and an unknown operand in L. Possible to demonstrate, but the main value of the change is removing unneeded code.
Differential Revision: https://reviews.llvm.org/D110892
Philip Reames [Fri, 1 Oct 2021 22:34:58 +0000 (15:34 -0700)]
[test] split flags-from-poison.ll to allow ease of autogen update
Jessica Paquette [Fri, 1 Oct 2021 16:22:51 +0000 (09:22 -0700)]
[AArch64][GlobalISel] Change G_ANYEXT fed by scalar G_ICMP to G_ZEXT
This is a common pattern:
```
%icmp:_(s32) = G_ICMP intpred(eq), ...
%ext:_(s64) = G_ANYEXT %icmp(s32)
%and:_(s64) = G_AND %ext, 1
```
Here's an example: https://godbolt.org/z/T13f6o8zE
This pattern appears because of the following combine in the
LegalizationArtifactCombiner:
```
// zext(trunc x) - > and (aext/copy/trunc x), mask
```
Which kicks in when we widen the result of G_ICMP from 1 bit to 32 bits.
We know that, on AArch64, a scalar G_ICMP will produce 0 or 1. So the result
of `%ext` will always be 0 or 1 as well.
We have some KnownBits combines which eliminate redundant G_ANDs with masks.
These combines don't kick in with G_ANYEXT.
So, if we replace the G_ANYEXT with G_ZEXT in this situation, the KnownBits
based combines can remove the redundant G_AND.
I wasn't sure if it would be more appropriate to
* Take this route
* Put this in the LegalizationArtifactCombiner.
* Allow 64 bit G_ICMP destinations
I decided on this route because
1) It's simple
2) I'm not sure if philosophically-speaking, we should be handling non-artifact
instructions + target-specific details like TargetBooleanContents in the
LegalizationArtifactCombiner
3) There is a lot of existing code which assumes we only have 32 bit G_ICMP
destinations. So, adding support for 64-bit destinations seems rather invasive
right now. I think that adding support for 64-bit destinations, or modelling
G_ICMP as ADDS/SUBS/etc is probably cleaner long term though.
This gives minor code size savings on all CTMark benchmarks.
Differential Revision: https://reviews.llvm.org/D110959
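Why the G_AND only becomes removable after switching to G_ZEXT can be shown with a toy known-bits model. This is a sketch under simplifying assumptions: `KnownZero64` and the helper functions are hypothetical and track only known-zero bits, unlike LLVM's KnownBits.

```cpp
#include <cstdint>

// Toy model: a set bit in Zero means that bit of the value is known to be 0.
struct KnownZero64 {
  uint64_t Zero;
};

constexpr uint64_t Low32 = 0xFFFFFFFFull;

// Zero extension: the upper 32 bits are known to be zero.
KnownZero64 zext32To64(uint32_t KnownZeroLow) {
  return {KnownZeroLow | ~Low32};
}

// Any extension: nothing is known about the upper 32 bits.
KnownZero64 anyext32To64(uint32_t KnownZeroLow) {
  return {KnownZeroLow};
}

// x & Mask is redundant when every bit the mask would clear is already
// known to be zero.
bool andIsRedundant(KnownZero64 K, uint64_t Mask) {
  return (~Mask & ~K.Zero) == 0;
}
```

For a compare result that is 0 or 1, bits 1..31 are known zero (`0xFFFFFFFE`). After `zext32To64` the AND with 1 is redundant in this model; after `anyext32To64` it is not, because the upper 32 bits are unknown.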
Stefan Pintilie [Fri, 1 Oct 2021 21:46:46 +0000 (16:46 -0500)]
[NFC][PowerPC] Add test case for byval store.
Added a test case for situations where a struct of size 1-7 bytes is
passed by value.
Daniil Suchkov [Fri, 1 Oct 2021 21:49:38 +0000 (21:49 +0000)]
Revert "[DomTree] Assert that blocks in queries aren't from another function"
This reverts commit 86046516e4f4527213c595c154c9971d81a49601.
This assertion fails on https://lab.llvm.org/buildbot/#/builders/98/builds/6690
Reverting it for now.
Amy Kwan [Fri, 1 Oct 2021 21:38:20 +0000 (16:38 -0500)]
Revert "tsan: fix and test detection of TLS races"
This reverts commit b4c1e5cb73bd26e5853af77c2a235ca9f35e2577.
Reverting this as it contains a test that is currently failing on the PPC BE bots.
Amy Kwan [Fri, 1 Oct 2021 21:35:15 +0000 (16:35 -0500)]
Revert "tsan: fix tls_race3 test on darwin"
This reverts commit ade5023c54cffcbefe0557b5473d55b06e40809b.
Reverting this commit as it is dependent on a test breaking the PPC BE bots.
Amy Kwan [Fri, 1 Oct 2021 21:32:32 +0000 (16:32 -0500)]
Revert "tsan: print a meaningful frame for stack races"
This reverts commit ccc83ac7c501c8e117753af0729414350aa9c117.
Reverting this commit as it is dependent on additional commits breaking the
PPC BE bots.
Zequan Wu [Fri, 1 Oct 2021 21:37:09 +0000 (14:37 -0700)]
[Profile] Add a warning when lock file failed in __llvm_profile_set_file_object with continuous mode
Daniil Suchkov [Tue, 28 Sep 2021 23:44:50 +0000 (23:44 +0000)]
[DomTree] Assert that blocks in queries aren't from another function
This assertion should help us catch cases when DT is used in a way that
doesn't make much sense and usually indicates usage errors. In D110752
you can see a test on which this assertion catches a miscompile.
The assertion is added to getNode since all queries seem to be
routed through that function for all non-trivial cases.
Reviewed By: aeubanks, MaskRay
Differential Revision: https://reviews.llvm.org/D110751
Daniil Suchkov [Tue, 28 Sep 2021 23:51:15 +0000 (23:51 +0000)]
[SimpleLoopUnswitch] Don't unswitch constant conditions
Added an additional check for constants after simplification of
"select _, true, false" pattern. We need to prevent attempts to unswitch constant
conditions for two reasons:
a) Doing that doesn't make any sense, in the best case it will just burn
some compile time.
b) SimpleLoopUnswitch isn't designed to unswitch constant conditions
(due to (a)), so attempting that can cause miscompiles. The attached
testcase is an example of such miscompile.
Also added an assertion that'll make sure we aren't trying to replace
constants, so it will help us prevent such bugs in future. The assertion
from D110751 is another layer of protection against such cases.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D110752
Daniil Suchkov [Tue, 28 Sep 2021 23:30:50 +0000 (23:30 +0000)]
[Test] Add a test exposing a miscompile in SimpleLoopUnswitch.
The miscompile was introduced by 6b4b1dc6ec6f0bf0a1bb414fbe751ccab99d41a0.
wren romano [Fri, 1 Oct 2021 00:51:42 +0000 (17:51 -0700)]
[mlir][sparse] Sharing calls to adaptor.getOperands()[0]
This is preliminary work towards D110790. Depends On D110883.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110884
wren romano [Fri, 1 Oct 2021 00:47:43 +0000 (17:47 -0700)]
[mlir][sparse] Factoring out allocaIndices()
This is preliminary work towards D110790. Depends On D110882.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110883
wren romano [Fri, 1 Oct 2021 00:25:32 +0000 (17:25 -0700)]
[mlir][sparse] Factoring out getZero() and avoiding unnecessary Type params
This is preliminary work towards D110790
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110882
Nikita Popov [Fri, 1 Oct 2021 20:53:55 +0000 (22:53 +0200)]
[BasicAA] Make test more robust (NFC)
When taking into account the fact that GEP indices are truncated
to 32-bits in this test, the "path dependence" goes away, so
inferring MustAlias for all pointers would be correct. As this
goes against the spirit of the test, change it to extend from
i16 instead.
Nikita Popov [Fri, 1 Oct 2021 20:02:32 +0000 (22:02 +0200)]
[BasicAA] Add additional truncation tests (NFC)
These show that the known bits and non-zero heuristics are incorrect
when truncation is involved.
Daniel Resnick [Fri, 1 Oct 2021 00:14:00 +0000 (18:14 -0600)]
[mlir][capi] Add TypeID to MLIR C-API
Exposes mlir::TypeID to the C API as MlirTypeID along with various accessors
and helper functions.
Differential Revision: https://reviews.llvm.org/D110897
LLVM GN Syncbot [Fri, 1 Oct 2021 20:14:30 +0000 (20:14 +0000)]
[gn build] Port c8c2b4629f75
Tomasz Miąsko [Fri, 1 Oct 2021 00:00:00 +0000 (00:00 +0000)]
[Demangle][Rust] Parse non-ASCII identifiers
Rust allows use of non-ASCII identifiers, which in Rust mangling scheme
are encoded using Punycode.
The encoding deviates from the standard by using an underscore as the
separator between ASCII part and a base-36 encoding of non-ASCII
characters (avoiding hyphen-minus in the symbol name). Other than that,
the encoding follows the standard, and the decoder implemented here in
turn follows the one given in RFC 3492.
To avoid an extra intermediate memory allocation while decoding
Punycode, the interface of OutputStream is extended with an insert
method.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104366
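A compact sketch of the decoding scheme described above, following the RFC 3492 decoding procedure with the underscore as delimiter. This is not the demangler's implementation: `decodeUnderscorePunycode` is a hypothetical name, the output is a `std::u32string` rather than an OutputStream, and treating input without any underscore as consisting only of encoded digits is an assumption of this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Punycode digit values: a-z map to 0..25, 0-9 map to 26..35.
static std::optional<uint32_t> digitValue(char C) {
  if (C >= 'a' && C <= 'z')
    return static_cast<uint32_t>(C - 'a');
  if (C >= '0' && C <= '9')
    return static_cast<uint32_t>(26 + (C - '0'));
  return std::nullopt;
}

// Bias adaptation from RFC 3492 (base 36, tmin 1, tmax 26, skew 38, damp 700).
static uint32_t adapt(uint64_t Delta, uint64_t NumPoints, bool First) {
  Delta = First ? Delta / 700 : Delta / 2;
  Delta += Delta / NumPoints;
  uint32_t K = 0;
  while (Delta > ((36 - 1) * 26) / 2) {
    Delta /= 36 - 1;
    K += 36;
  }
  return K + static_cast<uint32_t>((36 * Delta) / (Delta + 38));
}

// Decode "<ascii>_<digits>" into code points; nullopt on malformed input.
std::optional<std::u32string> decodeUnderscorePunycode(std::string_view In) {
  std::u32string Out;
  std::string_view Digits = In;
  std::size_t Pos = In.rfind('_');
  if (Pos != std::string_view::npos) {
    for (char C : In.substr(0, Pos)) {
      if (static_cast<unsigned char>(C) >= 0x80)
        return std::nullopt; // the basic part must be ASCII
      Out.push_back(static_cast<char32_t>(C));
    }
    Digits = In.substr(Pos + 1);
  }
  uint64_t N = 128, I = 0;
  uint32_t Bias = 72;
  std::size_t Idx = 0;
  while (Idx < Digits.size()) {
    uint64_t OldI = I, W = 1;
    // Read one variable-length integer (a "delta").
    for (uint32_t K = 36;; K += 36) {
      if (Idx == Digits.size())
        return std::nullopt; // truncated integer
      std::optional<uint32_t> D = digitValue(Digits[Idx++]);
      if (!D)
        return std::nullopt;
      I += *D * W;
      if (I > 0xFFFFFFFFull)
        return std::nullopt; // overflow guard
      uint32_t T = K <= Bias ? 1 : (K >= Bias + 26 ? 26 : K - Bias);
      if (*D < T)
        break;
      W *= 36 - T;
      if (W > 0xFFFFFFFFull)
        return std::nullopt; // overflow guard
    }
    uint64_t Len = Out.size() + 1;
    Bias = adapt(I - OldI, Len, OldI == 0);
    N += I / Len;
    I %= Len;
    if (N > 0x10FFFF)
      return std::nullopt; // not a valid code point
    Out.insert(Out.begin() + static_cast<std::ptrdiff_t>(I),
               static_cast<char32_t>(N));
    ++I;
  }
  return Out;
}
```

With this sketch, `gdel_5qa` decodes to "gödel" and `bcher_kva` to "bücher"; the standard Punycode spellings would use `-` where the underscore appears.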
Simon Pilgrim [Fri, 1 Oct 2021 20:07:26 +0000 (21:07 +0100)]
[DAG] scalarizeExtractedVectorLoad - replace getABITypeAlign with allowsMemoryAccess (PR45116)
One of the cases identified in PR45116 - we don't need to limit extracted loads to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory loads by checking allowsMisalignedMemoryAccesses as a fallback.
I've also cleaned up the alignment calculation code - if we have a constant extraction index then the alignment can be based on an offset from the original vector load alignment, but for non-constant indices we should assume the worst (single element alignment only).
Differential Revision: https://reviews.llvm.org/D110486
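The alignment rule described above can be illustrated with a small standalone sketch. It does not use the LLVM API: `commonAlign` and `extractedLoadAlign` are hypothetical helpers, alignments and element sizes are plain byte counts, and the vector alignment is assumed to be a power of two.

```cpp
#include <cstdint>
#include <optional>

// Largest power of two that divides both Align and Offset.
// Align is assumed to be a power of two.
inline uint64_t commonAlign(uint64_t Align, uint64_t Offset) {
  if (Offset == 0)
    return Align;
  uint64_t OffsetAlign = Offset & (~Offset + 1); // lowest set bit of Offset
  return OffsetAlign < Align ? OffsetAlign : Align;
}

// Hypothetical helper: alignment usable for a scalar load of one element
// extracted from a vector load with alignment VecAlign.
inline uint64_t extractedLoadAlign(uint64_t VecAlign, uint64_t EltBytes,
                                   std::optional<uint64_t> Index) {
  // Constant index: the element sits at a known byte offset from the vector.
  if (Index)
    return commonAlign(VecAlign, *Index * EltBytes);
  // Variable index: assume the worst case, a single element's alignment.
  return commonAlign(VecAlign, EltBytes);
}
```

For a 16-byte-aligned vector of 4-byte elements, index 2 sits at offset 8 and keeps 8-byte alignment, while index 1 and any variable index fall back to 4 bytes.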
Jay Foad [Fri, 1 Oct 2021 16:12:14 +0000 (17:12 +0100)]
[TwoAddressInstruction] Tweak constraining of tied operands
In collectTiedOperands, when handling an undef use that is tied to a
def, constrain the dst reg with the actual register class of the src
reg, instead of with the register class from the instruction's
MCInstrDesc. This makes a difference in some AMDGPU test cases like
this, before:
%16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1
After, without this patch:
undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
This fails machine verification if you force it to run after
TwoAddressInstruction (currently it is disabled) with:
*** Bad machine code: Invalid register class for subregister index ***
- function: s_load_constant_v3i32_align4
- basic block: %bb.0 (0xa011a88)
- instruction: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
- operand 0: undef %16.sub0_sub1:sgpr_96
Register class SGPR_96 does not fully support subreg index 4
After, with this patch:
undef %16.sub0_sub1:sgpr_96_with_sub0_sub1 = COPY killed %11:sreg_64_xexec
See also svn r159120 which introduced the code to handle tied undef
uses.
Differential Revision: https://reviews.llvm.org/D110944
Jay Foad [Fri, 1 Oct 2021 18:36:54 +0000 (19:36 +0100)]
[TwoAddressInstruction] Pre-commit a test case for D110944
Roman Lebedev [Fri, 1 Oct 2021 19:46:51 +0000 (22:46 +0300)]
[NFC][X86][Codegen] Add test coverage for interleaved i8 load/store stride=4
Roman Lebedev [Fri, 1 Oct 2021 19:41:11 +0000 (22:41 +0300)]
[NFC][X86][LV] Improve costmodel test coverage for interleaved i8 load/store stride=4
Jinsong Ji [Fri, 1 Oct 2021 19:12:00 +0000 (19:12 +0000)]
[AIX] Don't pass namedsects in LTO mode
LTO doesn't need the binder option, so don't pass it in LTO mode.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110955
Nikita Popov [Fri, 1 Oct 2021 19:18:24 +0000 (21:18 +0200)]
[BasicAA] Add additional 32-bit truncation test (NFC)
This is a variant with a variable index, in which case the pointer
size adjustment is not performed.
Valentin Clement [Fri, 1 Oct 2021 19:14:14 +0000 (21:14 +0200)]
[fir][NFC] Move fir.global printer to cpp file
All sufficiently large parsers, printers and verifiers are moved to the cpp file.
This is one of the last ones to be moved.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110929
ZijunZhao [Fri, 1 Oct 2021 18:52:36 +0000 (18:52 +0000)]
revert tsan part for investigation
Lang Hames [Fri, 1 Oct 2021 18:36:11 +0000 (11:36 -0700)]
[ORC] Fix LLVM modulemap after removal of ORC RPC in 33dd98e9e49.
Michał Górny [Fri, 1 Oct 2021 18:33:29 +0000 (20:33 +0200)]
[lldb] [Host] Sync TerminalState::Data to struct type
Sanjay Patel [Fri, 1 Oct 2021 17:30:44 +0000 (13:30 -0400)]
[InstCombine] fold (trunc (X>>C1)) << C to shift+mask directly
This is no-externally-visible-functional-difference-intended.
That is, the test diffs show identical instructions other than
name changes (those are included specifically to verify the logic).
The existing transforms created extra instructions and relied
on subsequent folds to get to the final result, but that could
conflict with other transforms like the proposed D110170 (and
caused that patch to be reverted twice so far because of infinite
combine loops).
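The bit-level equivalence behind a fold of this shape can be checked numerically. The sketch below is not the InstCombine code: it assumes a logical right shift, a 16-bit X truncated to 8 bits, and a shift+mask form (one shift by the difference of the two amounts, then a mask clearing the low C bits) that is my reading of "shift+mask"; all function names are hypothetical.

```cpp
#include <cstdint>

// Original pattern: (trunc (X >> C1)) << C, with a 16-bit X and 8-bit result.
inline uint8_t truncShrThenShl(uint16_t X, unsigned C1, unsigned C) {
  uint8_t Narrow = static_cast<uint8_t>(X >> C1); // lshr, then trunc
  return static_cast<uint8_t>(Narrow << C);       // shl in the narrow type
}

// Assumed shift+mask form: a single shift by the difference of the amounts,
// a trunc, and a mask that clears the low C bits.
inline uint8_t shiftThenMask(uint16_t X, unsigned C1, unsigned C) {
  uint32_t Wide = X; // widened so the left shift cannot overflow
  uint32_t Shifted = C >= C1 ? Wide << (C - C1) : Wide >> (C1 - C);
  uint8_t Mask = static_cast<uint8_t>(0xFFu << C);
  return static_cast<uint8_t>(static_cast<uint8_t>(Shifted) & Mask);
}

// Compares both forms for every 16-bit X and every in-range shift amount.
inline bool formsAgree() {
  for (unsigned C1 = 0; C1 < 16; ++C1)
    for (unsigned C = 0; C < 8; ++C)
      for (uint32_t X = 0; X <= 0xFFFF; ++X)
        if (truncShrThenShl(static_cast<uint16_t>(X), C1, C) !=
            shiftThenMask(static_cast<uint16_t>(X), C1, C))
          return false;
  return true;
}
```

For example, with X = 0xABCD, C1 = 4 and C = 2, both forms produce 0xF0.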
LLVM GN Syncbot [Fri, 1 Oct 2021 18:18:21 +0000 (18:18 +0000)]
[gn build] Port 33dd98e9e499
Lang Hames [Fri, 1 Oct 2021 17:07:03 +0000 (10:07 -0700)]
[ORC] Remove ORC RPC.
With the removal of OrcRPCExecutorProcessControl and OrcRPCTPCServer in
6aeed7b19c4 the ORC RPC library no longer has any in-tree users.
Clients needing serialization for ORC should move to Simple Packed
Serialization (usually by adopting SimpleRemoteEPC for remote JITing).
Lei Zhang [Fri, 1 Oct 2021 18:12:54 +0000 (14:12 -0400)]
[mlir][linalg] Include InitTensorOp in tiling canonicalization
Tiling can create dim ops and those dim ops can take `InitTensorOp`
as input. Including it in the tiling canonicalization patterns
allows us to fold those dim ops away.
Also sorted the existing ops along the way.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110876
Arthur Eubanks [Thu, 30 Sep 2021 20:57:55 +0000 (13:57 -0700)]
[NFC][AttributeList] Replace index_begin/end with an iterator
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110885
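The idea of hiding the unsigned wrap-around behind an iterator can be sketched as below. This is a standalone illustration, not the AttributeList implementation: `IndexRangeSketch` and `collectIndexes` are hypothetical names, and the sketch assumes the function index is ~0U, followed by the return index 0 and the argument indexes 1, 2, and so on.

```cpp
#include <vector>

// Assumption for this sketch: the function index is ~0U, so incrementing it
// wraps around to 0 (the return index), then 1, 2, ... (the arguments).
constexpr unsigned FunctionIndex = ~0U;

// Hypothetical stand-in for a list holding N attribute sets.
class IndexRangeSketch {
  unsigned NumSets;

public:
  explicit IndexRangeSketch(unsigned N) : NumSets(N) {}

  struct Iterator {
    unsigned Index;
    unsigned operator*() const { return Index; }
    // Unsigned overflow is well defined: ~0U + 1 == 0.
    Iterator &operator++() { ++Index; return *this; }
    bool operator!=(const Iterator &Other) const { return Index != Other.Index; }
  };

  // Callers see only a range; the wrap-around stays an implementation detail.
  Iterator begin() const { return Iterator{FunctionIndex}; }
  Iterator end() const { return Iterator{NumSets - 1}; }
};

// Collects every index visited by a range-based for loop.
inline std::vector<unsigned> collectIndexes(unsigned NumSets) {
  std::vector<unsigned> Result;
  for (unsigned Index : IndexRangeSketch(NumSets))
    Result.push_back(Index);
  return Result;
}
```

With three attribute sets, the loop visits ~0U, 0 and 1 in that order; with none, begin and end compare equal and the loop body never runs.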
Jay Foad [Wed, 29 Sep 2021 12:18:01 +0000 (13:18 +0100)]
[MachineLoopInfo] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110703
Jay Foad [Wed, 29 Sep 2021 12:09:22 +0000 (13:09 +0100)]
[LiveVariables] Skip verification of kills inside bundles
LiveVariables does not examine the contents of bundles, so
MachineVerifier should not expect it to know about kill flags on
operands of instructions inside a bundle.
With this fix we can enable machine verification after running the
LiveVariables analysis. Doing this does not show any problems in
check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110700
Jay Foad [Wed, 29 Sep 2021 11:57:03 +0000 (12:57 +0100)]
[UnreachableMachineBlockElim] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110697
Jay Foad [Wed, 29 Sep 2021 11:44:32 +0000 (12:44 +0100)]
[ProcessImplicitDefs] Enable machine verification after this pass
Enabling this does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110695
Jay Foad [Wed, 29 Sep 2021 08:55:15 +0000 (09:55 +0100)]
[DetectDeadLanes] Enable machine verification after this pass
Machine verification after DetectDeadLanes has been disabled since the
pass was first added in D18427, but I guess this was just due to copy-
and-paste. Enabling it does not show any problems in check-llvm in an
LLVM_ENABLE_EXPENSIVE_CHECKS build.
Differential Revision: https://reviews.llvm.org/D110689
Arthur O'Dwyer [Fri, 1 Oct 2021 16:59:24 +0000 (12:59 -0400)]
[libc++] Revert the part of my b82683b that affected <version>.
This reverts part of commit b82683b2eb3601f6e8970861b94ad7b37393aa90.
I hadn't intended to remove the `// -*- C++ -*-` comment line
from `libcxx/include/version`, only from the generated tests.
Thanks to Raul Tambre for the catch.
Lang Hames [Fri, 1 Oct 2021 16:34:16 +0000 (09:34 -0700)]
[ORC] Remove OrcRPCExecutorProcessControl and OrcRPCTPCServer.
All in-tree tools have moved to SimpleRemoteEPC.
Kazu Hirata [Fri, 1 Oct 2021 16:57:40 +0000 (09:57 -0700)]
[Transforms] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.
zhijian [Fri, 1 Oct 2021 16:37:51 +0000 (12:37 -0400)]
[AIX] Implement --syms and print "symbol index and qualname" for --syms --symbol-description in llvm-objdump for XCOFF
Summary:
For XCOFF:
Implement getSymbolFlag and getSymbolType() for the --syms option.
With llvm-objdump --syms, if the symbol is a label, also print the section containing the symbol.
With llvm-objdump --syms --symbol-description, print the symbol index and qualname for each symbol.
For example, with --symbol-description:
00000000000000c0 l .text (csect: (idx: 2) .foov[PR]) (idx: 3) .foov
and without --symbol-description:
00000000000000c0 l .text (csect: .foov) .foov
Reviewers: James Henderson, Esme Yi
Differential Revision: https://reviews.llvm.org/D109452
Roman Lebedev [Fri, 1 Oct 2021 16:34:57 +0000 (19:34 +0300)]
[NFC][Codegen][X86] Drop unused check prefixes in newly added tests
Michał Górny [Fri, 1 Oct 2021 16:23:25 +0000 (18:23 +0200)]
[lldb] [Host] Fix flipped logic in TerminalState::Save()
Arthur O'Dwyer [Fri, 1 Oct 2021 16:13:03 +0000 (12:13 -0400)]
[libc++] [test] Remove filenames from copyright headers. NFCI.
Discussed in D110794.
Anna Thomas [Fri, 1 Oct 2021 15:49:25 +0000 (11:49 -0400)]
[TrivialDeadness] Update function comment
isInstructionTriviallyDead also works for certain side-effecting
instructions.
Update incorrect comment (as suggested in D109917).
Peyton, Jonathan L [Mon, 20 Sep 2021 18:24:55 +0000 (13:24 -0500)]
[OpenMP][host runtime] Introduce kmp_cpuinfo_flags_t to replace integer flags
Store CPUID support flags as bits instead of using entire integers.
Differential Revision: https://reviews.llvm.org/D110091
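The change from whole integers to bits can be illustrated with a bit-field struct. This is only a sketch: the struct names, the `sse2` and `rtm` fields, and `make_flags` are hypothetical and are not taken from the patch, and the exact size of a bit-field struct is implementation-defined.

```cpp
// Before (hypothetical layout): one full int per CPUID support flag.
struct cpuinfo_ints_sketch_t {
  int sse2;
  int rtm;
};

// After (hypothetical layout): one bit per flag, packed into a single word.
struct cpuinfo_flags_sketch_t {
  unsigned sse2 : 1;
  unsigned rtm : 1;
  unsigned reserved : 30; // room for further flags
};

// Builds a flag set; each bool is stored in a single bit.
inline cpuinfo_flags_sketch_t make_flags(bool sse2, bool rtm) {
  cpuinfo_flags_sketch_t flags{};
  flags.sse2 = sse2;
  flags.rtm = rtm;
  return flags;
}
```

Reads and writes keep the same `flags.name` syntax as with integer members, so call sites that only test or set a flag need no change in shape.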
Peyton, Jonathan L [Fri, 1 Oct 2021 16:06:58 +0000 (11:06 -0500)]
[OpenMP][testing] increase threshold for omp_get_wtime test
Arthur O'Dwyer [Thu, 30 Sep 2021 19:43:38 +0000 (15:43 -0400)]
[libc++] Remove "// -*- C++ -*-" comments from all .cpp files. NFCI.
Even if these comments have a benefit in .h files (for editors that
care about language but can't be configured to treat .h as C++ code),
they certainly have no benefit for files with the .cpp extension.
Discussed in D110794.
Arthur O'Dwyer [Thu, 30 Sep 2021 19:40:45 +0000 (15:40 -0400)]
[libc++] [test] Remove "// -*- C++ -*-" comments from generated .cpp files.
Even if these comments have a benefit in .h files (for editors that
care about language but can't be configured to treat .h as C++ code),
they certainly have no benefit for files with the .cpp extension.
Discussed in D110794.
Lang Hames [Fri, 1 Oct 2021 00:25:20 +0000 (17:25 -0700)]
[llvm-jitlink] Fix a FIXME.
ORC errors preserve the SymbolStringPool since 6fe2e9a9cc8, so we can stop
bailing out early.
Roman Lebedev [Fri, 1 Oct 2021 15:47:09 +0000 (18:47 +0300)]
[NFC][X86][Codegen] Add test coverage for interleaved i8 load/store stride=3