platform/upstream/llvm.git
2 years ago[PowerPC] Fix vector equality comparison for v2i64 pre-Power8
Nemanja Ivanovic [Tue, 21 Dec 2021 20:28:41 +0000 (14:28 -0600)]
[PowerPC] Fix vector equality comparison for v2i64 pre-Power8

The current code makes the assumption that equality
comparison can be performed with a word comparison
instruction. While this is true if the entire 64-bit
results are used, it does not generally work. It is
possible that the low order words and high order
words produce different results and a user of only
one will get the wrong result.

This patch adds an AND of the result words so that
each word has the result of the comparison of the
entire doubleword that contains it.
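
A minimal Python sketch of the problem and the fix described above. It models each 64-bit lane as two 32-bit words. The helper names are illustrative assumptions and are not taken from the PowerPC backend.

```python
MASK32 = 0xFFFFFFFF

def words(dword):
    """Split a 64-bit value into its (high, low) 32-bit words."""
    return (dword >> 32) & MASK32, dword & MASK32

def word_compare(x, y):
    """Model a per-word equality compare: all-ones where the words match."""
    result = []
    for dx, dy in zip(x, y):
        (hx, lx), (hy, ly) = words(dx), words(dy)
        result.append((MASK32 if hx == hy else 0, MASK32 if lx == ly else 0))
    return result

def doubleword_compare(x, y):
    """AND the two word results so each word reflects the whole doubleword."""
    return [(hi & lo, hi & lo) for hi, lo in word_compare(x, y)]

# The first lane differs only in its low word; the second lane is equal.
x = [0x0000000100000002, 5]
y = [0x0000000100000003, 5]
naive = word_compare(x, y)
fixed = doubleword_compare(x, y)
```

In `naive`, a user that reads only the high word of the first lane sees "equal" even though the doublewords differ. In `fixed`, both words of that lane are zero.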

Differential revision: https://reviews.llvm.org/D115678

2 years ago[PowerPC] Do not increase cost for getUserCost with MMA types
Nemanja Ivanovic [Tue, 21 Dec 2021 19:35:17 +0000 (13:35 -0600)]
[PowerPC] Do not increase cost for getUserCost with MMA types

Commit 150681f increases
cost of producing MMA types (vector pair and quad).
However, it increases the cost for getUserCost() which is
used in unrolling. As a result, loops that contain these
types already (from the user code) cannot be unrolled
(even with the user's unroll pragma). This was an unintended
side effect. Reverting that portion of the commit to allow
unrolling such loops.

Differential revision: https://reviews.llvm.org/D115424

2 years ago[libc] Show average runtime for math single-input-single-output performance tests.
Tue Ly [Tue, 21 Dec 2021 15:31:42 +0000 (10:31 -0500)]
[libc] Show average runtime for math single-input-single-output performance tests.

Run performance tests in denormal and normal ranges separately and show more detailed results.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D116112

2 years ago[gn build] Port f78d49e06883
LLVM GN Syncbot [Tue, 21 Dec 2021 18:57:55 +0000 (18:57 +0000)]
[gn build] Port f78d49e06883

2 years agotsan: remove old vector clocks
Dmitry Vyukov [Wed, 6 Oct 2021 06:56:42 +0000 (08:56 +0200)]
tsan: remove old vector clocks

They are unused in the new tsan runtime.

Depends on D112604.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112605

2 years agotsan: remove hacky call
Dmitry Vyukov [Sun, 10 Oct 2021 06:18:15 +0000 (08:18 +0200)]
tsan: remove hacky call

It's unused in the new tsan runtime.

Depends on D112603.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112604

2 years agotsan: reduce shadow ranges
Dmitry Vyukov [Thu, 11 Nov 2021 17:49:38 +0000 (18:49 +0100)]
tsan: reduce shadow ranges

The new tsan runtime has 2x more compact shadow.
Adjust shadow ranges accordingly.

Depends on D112603.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D113751

2 years agotsan: remove unused variable
Dmitry Vyukov [Tue, 16 Nov 2021 07:44:09 +0000 (08:44 +0100)]
tsan: remove unused variable

Depends on D113983.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113984

2 years agotsan: use VReport instead of VPrintf in background thread
Dmitry Vyukov [Tue, 16 Nov 2021 07:41:36 +0000 (08:41 +0100)]
tsan: use VReport instead of VPrintf in background thread

If there are multiple processes, it's hard to understand
what output comes from what process.
VReport prepends pid to the output. Use it.

Depends on D113982.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113983

2 years agotsan: better maintain current time in the background thread
Dmitry Vyukov [Tue, 16 Nov 2021 07:39:30 +0000 (08:39 +0100)]
tsan: better maintain current time in the background thread

Update now after long operations so that we don't use
a stale value in subsequent computations.

Depends on D113981.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113982

2 years ago[ELF] --gc-sections: Work around SHT_PROGBITS .init_array
Fangrui Song [Tue, 21 Dec 2021 18:44:29 +0000 (10:44 -0800)]
[ELF] --gc-sections: Work around SHT_PROGBITS .init_array

Older Go cmd/link used SHT_PROGBITS for .init_array.
Work around the lack of https://golang.org/cl/373734 for a while.
It does not generate .fini_array or .preinit_array.

2 years ago[libc++][NFC] Reformatting in random_device.h and random.cpp
Louis Dionne [Tue, 21 Dec 2021 16:27:19 +0000 (11:27 -0500)]
[libc++][NFC] Reformatting in random_device.h and random.cpp

2 years ago[LTO][WPD] Ignore unreachable function by analyzing IR.
minglotus-6 [Mon, 20 Dec 2021 20:44:05 +0000 (20:44 +0000)]
[LTO][WPD] Ignore unreachable function by analyzing IR.

In regular LTO, analyze IR and discard unreachable functions when finding virtual call targets.

Differential Revision: https://reviews.llvm.org/D116056

2 years ago[ELF] Optimize RelocationSection<ELFT>::writeTo
Fangrui Song [Tue, 21 Dec 2021 17:43:44 +0000 (09:43 -0800)]
[ELF] Optimize RelocationSection<ELFT>::writeTo

When linking a 1.2G output (nearly no debug info, 2846621 dynamic relocations) using `--threads=8`, I measured

```
9.131462 Total ExecuteLinker
1.449913 Total Write output file
1.445784 Total Write sections
0.657152 Write sections {"detail":".rela.dyn"}
```

This change decreases the .rela.dyn time to 0.25, leading to a 4% speedup in the total time.

* The parallelSort is slow because of expensive r_sym/r_offset computation. Cache the values.
* The iteration is slow. Move r_sym/r_addend computation ahead of time and parallelize it.

With the change, the new encodeDynamicReloc is cheap (0.05s). So no need to parallelize it.
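
The caching idea from the first bullet can be sketched in Python. This is only an illustration under assumptions: `expensive_key` is a hypothetical stand-in for the costly r_sym/r_offset computation, and the tuple layout of a relocation is invented for the example. It is not lld's implementation.

```python
from functools import cmp_to_key

calls = {"count": 0}

def expensive_key(reloc):
    """Hypothetical stand-in for a costly per-relocation computation."""
    calls["count"] += 1
    sym, offset = reloc
    return (sym, offset)

def sort_recomputing(relocs):
    """Recompute both keys inside every comparison."""
    def cmp(a, b):
        ka, kb = expensive_key(a), expensive_key(b)
        return (ka > kb) - (ka < kb)
    return sorted(relocs, key=cmp_to_key(cmp))

def sort_cached(relocs):
    """Compute each key once up front, then sort on the cached values."""
    keyed = [(expensive_key(r), r) for r in relocs]
    keyed.sort(key=lambda pair: pair[0])
    return [r for _, r in keyed]

relocs = [(3, 0x10), (1, 0x30), (2, 0x20), (1, 0x08)]

calls["count"] = 0
slow = sort_recomputing(relocs)
slow_calls = calls["count"]

calls["count"] = 0
fast = sort_cached(relocs)
fast_calls = calls["count"]
```

Both functions produce the same order, but the cached variant evaluates the key exactly once per relocation, while the comparison-based variant evaluates it twice per comparison.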

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D115993

2 years ago[libcxx] [test] Don't rerun supportsVerify for each individual test
Martin Storsjö [Sun, 19 Dec 2021 14:16:15 +0000 (16:16 +0200)]
[libcxx] [test] Don't rerun supportsVerify for each individual test

We can't just memoize _supportsVerify in place in format.py, as it
previously was executed in each of the individual processes.

Instead use hasCompileFlag() and add a feature flag for it instead,
which can be used both by tests (that already have such a flag,
locally for one set of tests) and for the testing framework itself.

By using hasCompileFlag(), this also implicitly fixes two other issues:
Previously, _supportsVerify called subprocess.call() directly, which can
interpret command line quoting differently than lit.TestRunner.

(In particular, TestRunner handles arguments quoted by a single quote,
while launching Windows processes with subprocess.call() only supports
double quotes. This allows using shlex.quote(), which uses single quotes,
everywhere - as all commands now go through TestRunner. This should make
41d7909368bebc897467a75860a524a5f172564f redundant.)

Secondly, the old _supportsVerify method didn't include %{flags} or
%{compile_flags}.
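
The quoting difference mentioned above can be seen with the Python standard library alone. In this small sketch, `shlex.quote()` produces POSIX-style single quotes, while `subprocess.list2cmdline()` (the helper CPython uses to build Windows command lines) produces double quotes. The argument value is made up for illustration.

```python
import shlex
import subprocess

# Hypothetical argument containing a space, so that it needs quoting.
arg = "-DNAME=hello world"

posix_style = shlex.quote(arg)                  # wraps the argument in single quotes
windows_style = subprocess.list2cmdline([arg])  # wraps the argument in double quotes
```

A command string built with `shlex.quote()` is therefore only safe to hand to something that understands single quotes, which is the role the commit message assigns to lit's TestRunner.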

Differential Revision: https://reviews.llvm.org/D116010

2 years ago[funcattrs] Infer access attributes for vararg arguments
Philip Reames [Tue, 21 Dec 2021 17:15:54 +0000 (09:15 -0800)]
[funcattrs] Infer access attributes for vararg arguments

This change allows us to infer access attributes (readnone, readonly) on arguments passed to vararg functions. Since there isn't a formal argument corresponding to the parameter, they'll never be considered part of the speculative SCC, but they can still benefit from attributes on the call site or the callee function.

The main motivation here is just to simplify the code, and remove some special casing. Previously, an indirect vararg call could return more precise results than a direct vararg call, which is just weird.

Differential Revision: https://reviews.llvm.org/D115964

2 years ago[funcattrs] Fix incorrect readnone/readonly inference on captured arguments
Philip Reames [Tue, 21 Dec 2021 17:09:54 +0000 (09:09 -0800)]
[funcattrs] Fix incorrect readnone/readonly inference on captured arguments

This fixes a bug where we would infer readnone/readonly for a function which passed a value to a function which could capture it. With the value captured in memory, the function could reload the value from memory after the call, and write to it. Inferring the argument as readnone or readonly is unsound.

@jdoerfert apparently noticed this about two years ago, and tests were checked in with 76467c4, but the issue appears to have never gotten fixed.

Since it seems like this issue should break everything, let me explain why the case is actually fairly narrow. The main inference loop over the argument SCCs only analyzes nocapture arguments. As such, we can only hit this when constructing the partial SCCs. Due to that restriction, we can only hit this when we have either a) a function declaration with a manually annotated argument, or b) an immediately self-recursive call.

It's also worth highlighting that we do have cases where we can validly infer readonly/readnone on a capturing argument. The easiest example is a function which simply returns its argument without ever accessing it.
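
As a loose analogy only (Python objects rather than LLVM IR, with invented function names), the following sketch shows why a captured argument cannot be treated as read-only, and why merely returning the argument is harmless.

```python
captured = []

def capture(buf):
    """Callee that captures its argument by storing a reference to it."""
    captured.append(buf)

def looks_readonly(buf):
    """Never writes through `buf` directly, yet the caller's data changes."""
    capture(buf)
    alias = captured[-1]   # reload the captured reference after the call
    alias[0] = 42          # write through the reloaded alias

def returns_argument(buf):
    """Lets the argument escape via the return value but never accesses it."""
    return buf

data = [0]
looks_readonly(data)
```

After the call, `data` holds `[42]`, so treating the parameter of `looks_readonly` as read-only would be wrong. `returns_argument` never touches the contents, which matches the valid case described above.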

Differential Revision: https://reviews.llvm.org/D115961

2 years agoSimplify WPD test case for hybrid LTO and ThinLTO.
minglotus-6 [Tue, 21 Dec 2021 02:59:22 +0000 (02:59 +0000)]
Simplify WPD test case for hybrid LTO and ThinLTO.

1) remove verbose information (function linkage types, alignment, TBAA) 2) remove unused element or replace irrelevant element with null (as placeholders) in virtual table, remove unused definitions of deleted elements accordingly.

Differential Revision: https://reviews.llvm.org/D116071

2 years ago[Clang] Fix build by restricting debug-info-objname.cpp test to x86.
Alexandre Ganea [Tue, 21 Dec 2021 17:22:25 +0000 (12:22 -0500)]
[Clang] Fix build by restricting debug-info-objname.cpp test to x86.

See: https://lab.llvm.org/buildbot/#/builders/188/builds/7188

2 years ago[Clang] debug-info-objname.cpp test: explicitly encode an x86 target when using %clang_...
Alexandre Ganea [Tue, 21 Dec 2021 16:53:53 +0000 (11:53 -0500)]
[Clang] debug-info-objname.cpp test: explicitly encode an x86 target when using %clang_cl to avoid falling back to a native CPU triple.

2 years ago[clang-format] Remove unnecessary qualifications. NFC.
Marek Kurdej [Tue, 21 Dec 2021 16:51:10 +0000 (17:51 +0100)]
[clang-format] Remove unnecessary qualifications. NFC.

2 years ago[Hexagon] Add ELF flags for Hexagon v69
Krzysztof Parzyszek [Tue, 21 Dec 2021 16:39:59 +0000 (08:39 -0800)]
[Hexagon] Add ELF flags for Hexagon v69

2 years ago[AArch64] Add a tablegen pattern for UZP2.
Alexandros Lamprineas [Tue, 21 Dec 2021 15:41:06 +0000 (15:41 +0000)]
[AArch64] Add a tablegen pattern for UZP2.

Converts concat_vectors((trunc (lshr)), (trunc (lshr))) to UZP2
when the shift amount is half the width of the vector element.
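
A small Python model of why the two forms agree. It assumes 32-bit source elements, a shift of 16, and little-endian lane order when the vector is reinterpreted as 16-bit lanes. The function names are illustrative, not LLVM APIs.

```python
def as_u16_lanes(vec32):
    """View 32-bit elements as 16-bit lanes, low half first (little-endian)."""
    lanes = []
    for elem in vec32:
        lanes += [elem & 0xFFFF, (elem >> 16) & 0xFFFF]
    return lanes

def uzp2(a, b):
    """Take the odd-indexed lanes of a, followed by the odd-indexed lanes of b."""
    return a[1::2] + b[1::2]

def concat_trunc_lshr(x, y, shift=16):
    """Model concat_vectors(trunc(lshr x, shift), trunc(lshr y, shift))."""
    return [(e >> shift) & 0xFFFF for e in x] + [(e >> shift) & 0xFFFF for e in y]

x = [0x11112222, 0x33334444, 0x55556666, 0x77778888]
y = [0x9999AAAA, 0xBBBBCCCC, 0xDDDDEEEE, 0xFFFF0000]

via_shift = concat_trunc_lshr(x, y)
via_uzp2 = uzp2(as_u16_lanes(x), as_u16_lanes(y))
```

Shifting right by half the element width and truncating keeps the high half of every element, which is exactly the set of odd-indexed narrow lanes that UZP2 selects.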

Differential Revision: https://reviews.llvm.org/D116021

2 years ago[clangd] Return error for textdocument/outgoingCalls rather than success
Kadir Cetinkaya [Tue, 21 Dec 2021 16:06:40 +0000 (17:06 +0100)]
[clangd] Return error for textdocument/outgoingCalls rather than success

2 years ago[clang-format] Remove unnecessary qualifications. NFC.
Marek Kurdej [Tue, 21 Dec 2021 15:58:08 +0000 (16:58 +0100)]
[clang-format] Remove unnecessary qualifications. NFC.

2 years agoAMDGPU/GlobalISel: Regenerate test checks
Matt Arsenault [Tue, 21 Dec 2021 15:20:26 +0000 (10:20 -0500)]
AMDGPU/GlobalISel: Regenerate test checks

2 years ago[clang-format] Fix SplitEmptyRecord affecting SplitEmptyFunction.
Marek Kurdej [Tue, 21 Dec 2021 15:44:44 +0000 (16:44 +0100)]
[clang-format] Fix SplitEmptyRecord affecting SplitEmptyFunction.

Fixes https://github.com/llvm/llvm-project/issues/50051.

Given the style:
```
BraceWrapping:
  AfterFunction: true
  SplitEmptyFunction: true
  SplitEmptyRecord: false
...
```

The code that should be like:
```
void f(int aaaaaaaaaaaaaaaaaaaaaaaaaaaa,
       int bbbbbbbbbbbbbbbbbbbbbbbb)
{
}
```

gets the braces merged together:
```
void f(int aaaaaaaaaaaaaaaaaaaaaaaaaaaa,
       int bbbbbbbbbbbbbbbbbbbbbbbb)
{}
```

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D116049

2 years agotsan: fix failures after multi-threaded fork
Dmitry Vyukov [Tue, 21 Dec 2021 07:52:19 +0000 (08:52 +0100)]
tsan: fix failures after multi-threaded fork

Creating threads after a multi-threaded fork is semi-supported:
we don't give particular guarantees, but we try not to fail
on simple cases, and we have the die_after_fork=0 flag that enables
not dying on creation of threads after a multi-threaded fork.
This flag is used in the wild:
https://github.com/mongodb/mongo/blob/23c052e3e321dbab90f1863d4d5539d7c1a1cf44/SConstruct#L3599

fork_multithreaded.cpp test started hanging in debug mode
after the recent "tsan: fix deadlock during race reporting" commit,
which added proactive ThreadRegistryLock check in SlotLock.

But the test broke earlier after "tsan: remove quadratic behavior in pthread_join"
commit which made tracking of alive threads based on pthread_t stricter
(CHECK-fail on 2 threads with the same pthread_t, or joining a non-existent thread).
When we start a thread after a multi-threaded fork, the new pthread_t
can actually match one of existing values (for threads that don't exist anymore).
Thread creation started CHECK-failing on this, but the test simply
ignored this CHECK failure in the child thread and "passed".
But after "tsan: fix deadlock during race reporting" the test started hanging dead,
because CHECK failures recursively lock thread registry.

Fix this by purging all alive threads from thread registry on fork.

Also the thread registry mutex somehow lost the internal deadlock detector id
and was excluded from deadlock detection. If it had the id, the CHECK
wouldn't hang because of the nested CHECK failure due to the deadlock.
But then again the test would have silently ignored this error as well
and the bugs wouldn't have been noticed.
Add the deadlock detector id to the thread registry mutex.

Also extend the test to check more cases and detect more bugs.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116091

2 years ago[libc++][NFC] Fix links to https://llvm.org/PR20183 in the tests
Louis Dionne [Tue, 21 Dec 2021 15:33:57 +0000 (10:33 -0500)]
[libc++][NFC] Fix links to https://llvm.org/PR20183 in the tests

2 years ago[Clang] Disable debug-info-objname.cpp test on Unix until I sort out the issue.
Alexandre Ganea [Tue, 21 Dec 2021 15:32:35 +0000 (10:32 -0500)]
[Clang] Disable debug-info-objname.cpp test on Unix until I sort out the issue.

2 years ago[clang][NFC] Refactor coroutine_traits lookup
Nathan Sidwell [Tue, 21 Dec 2021 14:45:25 +0000 (09:45 -0500)]
[clang][NFC] Refactor coroutine_traits lookup

To allow transition from the TS-specified
std::experimental::coroutine_traits to the C++20-specified
std::coroutine_traits, we lookup in both places and provide helpful
diagnostics. This refactors the code to avoid separate paths to
std::experimental lookups.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D116029

2 years ago[LV] Ensure WidenCanonicalIVRecipe is always created in header (NFC).
Florian Hahn [Tue, 21 Dec 2021 15:13:18 +0000 (15:13 +0000)]
[LV] Ensure WidenCanonicalIVRecipe is always created in header (NFC).

The VPWidenCanonicalIVRecipe must always be created in the phi section
of the header block. Use that block as insert point.

2 years ago[libcxx][test] Verify customization point object properties
Joe Loser [Wed, 15 Dec 2021 02:19:10 +0000 (21:19 -0500)]
[libcxx][test] Verify customization point object properties

Add test for various customization point object properties as defined by
the Standard. Test various CPOs from `<ranges>`, `<iterator>`,
`<concepts>`, etc.

The test is mostly from https://reviews.llvm.org/D107036 and split up
into this.

Differential Revision: https://reviews.llvm.org/D115588

2 years ago[Debugify] Use WeakWH map collected before Pass when checking loc drop
Djordje Todorovic [Tue, 21 Dec 2021 14:52:55 +0000 (15:52 +0100)]
[Debugify] Use WeakWH map collected before Pass when checking loc drop

This fixes a typo/bug when checking for pointer reuse when testing
DI location preservation in the Debugify original mode (when
checking -g generated Debug Info).

Differential Revision: https://reviews.llvm.org/D115621

2 years ago[CodeGen] Avoid more pointer element type accesses
Nikita Popov [Tue, 21 Dec 2021 14:15:23 +0000 (15:15 +0100)]
[CodeGen] Avoid more pointer element type accesses

2 years ago[CodeView] Emit S_OBJNAME record
Alexandre Ganea [Tue, 21 Dec 2021 14:26:17 +0000 (09:26 -0500)]
[CodeView] Emit S_OBJNAME record

Thanks to @zturner for the initial patch!

Differential Revision: https://reviews.llvm.org/D43002

2 years ago[clang-format] NFC use recently added Style.isJavaScript()
mydeveloperday [Tue, 21 Dec 2021 14:24:12 +0000 (14:24 +0000)]
[clang-format] NFC use recently added Style.isJavaScript()

Improve the readability of these if(Style==FormatStyle::LK_JavaScript) clauses

2 years agoAlignConsecutiveDeclarations not working for 'const' keyword in JavaScript
mydeveloperday [Tue, 21 Dec 2021 13:57:43 +0000 (13:57 +0000)]
AlignConsecutiveDeclarations not working for 'const' keyword in JavaScript

https://github.com/llvm/llvm-project/issues/49846

Fixes #49846

AlignConsecutiveDeclarations is not working for "let" and "const" in JavaScript

let letVariable     = 5;
const constVariable = 10;

Reviewed By: owenpan, HazardyKnusperkeks, curdeius

Differential Revision: https://reviews.llvm.org/D115990

2 years ago[CodeGen] Accept Address in CreateLaunderInvariantGroup
Nikita Popov [Tue, 21 Dec 2021 13:40:33 +0000 (14:40 +0100)]
[CodeGen] Accept Address in CreateLaunderInvariantGroup

Add an overload that accepts and returns an Address, as we
generally just want to replace the pointer with a laundered one,
while retaining remaining information.

2 years ago[libc++] Rename __s1/__s2 to __dest/__source in __copy_constexpr. NFC.
Arthur O'Dwyer [Tue, 21 Dec 2021 13:33:30 +0000 (08:33 -0500)]
[libc++] Rename __s1/__s2 to __dest/__source in __copy_constexpr. NFC.

This consistently completes the renaming started in D115986.

2 years ago[mlir][memref] ReinterpretCast: allow static sizes/strides/offset where affine map...
Butygin [Thu, 28 Oct 2021 16:04:35 +0000 (19:04 +0300)]
[mlir][memref] ReinterpretCast: allow static sizes/strides/offset where affine map expects dynamic

* There is no reason to forbid that case
* Also, the user will get a very unfriendly error like `expected result type with offset = -9223372036854775808 instead of 1`

Differential Revision: https://reviews.llvm.org/D114678

2 years ago[clangd] Fix typo in test. NFC
Sam McCall [Tue, 21 Dec 2021 13:16:54 +0000 (14:16 +0100)]
[clangd] Fix typo in test. NFC

2 years ago[X86] getTargetVShiftNode - remove shift-by-constant handling.
Simon Pilgrim [Tue, 21 Dec 2021 13:16:41 +0000 (13:16 +0000)]
[X86] getTargetVShiftNode - remove shift-by-constant handling.

Move shift-by-constant handling into its only user (VSHIFT intrinsics lowering).

This is some prep-work for getTargetVShiftNode to no longer take a scalar shift amount - we're introducing temporary ISD::EXTRACT_VECTOR_ELT nodes via SelectionDAG::getSplatValue to accommodate this which can cause various issues, including unnecessary scalarization and xmm->gpr->xmm transfers, and causes problems for 32-bit codegen if we fail to remove an (illegal) i64 scalar extracted from a (legal) vXi64 vector.

2 years ago[CodeGen] Avoid some pointer element type accesses
Nikita Popov [Mon, 20 Dec 2021 15:17:27 +0000 (16:17 +0100)]
[CodeGen] Avoid some pointer element type accesses

This avoids some pointer element type accesses when compiling
C++ code.

2 years ago[gn build] (semiautomatically) port 9b4f179bf8d3
Nico Weber [Fri, 17 Dec 2021 12:26:34 +0000 (07:26 -0500)]
[gn build] (semiautomatically) port 9b4f179bf8d3

2 years ago[libc++] Allow __move_constexpr to work with unrelated pointers
Nikolas Klauser [Tue, 21 Dec 2021 10:22:50 +0000 (11:22 +0100)]
[libc++] Allow __move_constexpr to work with unrelated pointers

Allow `__move_constexpr` to work with unrelated pointers and `_LIBCPP_ASSERT` that `__copy_constexpr`, `__move_constexpr` and `__assign_constexpr` are only run during constant evaluation

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D115986

2 years ago[SVE] Reintroduce -scalable-vectorization=preferred as an alias to "on".
Paul Walker [Tue, 21 Dec 2021 12:51:40 +0000 (12:51 +0000)]
[SVE] Reintroduce -scalable-vectorization=preferred as an alias to "on".

Some buildbots still rely on the experimental flag, so let's keep
it until everything has been migrated to the new "on by default"
state.

2 years agotsan: always handle closing of file descriptors
Dmitry Vyukov [Tue, 21 Dec 2021 09:30:01 +0000 (10:30 +0100)]
tsan: always handle closing of file descriptors

If we miss both the close of a file descriptor and a subsequent open
of the same file descriptor number, we report false positives
between operations on the old and on the new descriptors.

There are lots of ways to create new file descriptors, but for closing
there is mostly just the close call. So we try to handle at least that.
However, if the close happens in an ignored library, we miss it
and start reporting false positives.

Handle closing of file descriptors always, even in ignored libraries
(as we do for malloc/free and other critical functions).
But don't imitate memory accesses on close for ignored libraries.

FdClose checks validity of the fd (fd >= 0) itself,
so remove the excessive checks in the callers.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116095

2 years ago[GlobalISel] Verify operand types for G_SHL, G_LSHR, G_ASHR
Jay Foad [Thu, 16 Dec 2021 09:52:48 +0000 (09:52 +0000)]
[GlobalISel] Verify operand types for G_SHL, G_LSHR, G_ASHR

Differential Revision: https://reviews.llvm.org/D115868

2 years ago[X86] LowerRotate - enable vXi32 splat handling
Simon Pilgrim [Tue, 21 Dec 2021 11:19:15 +0000 (11:19 +0000)]
[X86] LowerRotate - enable vXi32 splat handling

Pull out the "rotl(x,y) --> (unpack(x,x) << zext(splat(y % bw))) >> bw" special case from vXi8 lowering so we can reuse it for vXi32 types as well.

There are still some regressions with vXi16 to handle before this becomes entirely general.

It also allows us to remove the now unnecessary hack for handling amount-modulo before splatting.
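
The identity "rotl(x,y) --> (unpack(x,x) << zext(splat(y % bw))) >> bw" can be checked on scalars with a short Python sketch. It models a single lane only: `unpack(x, x)` is represented as two copies of x in a lane twice as wide. The function names are illustrative.

```python
def rotl_reference(x, y, bw):
    """Plain rotate-left of a bw-bit value by y (taken modulo bw)."""
    mask = (1 << bw) - 1
    s = y % bw
    return ((x << s) | (x >> (bw - s))) & mask

def rotl_via_unpack(x, y, bw):
    """Rotate by shifting a doubled-width copy and keeping its high half."""
    wide_mask = (1 << (2 * bw)) - 1
    doubled = (x << bw) | x                       # unpack(x, x) in a 2*bw-bit lane
    shifted = (doubled << (y % bw)) & wide_mask   # shift within the wide lane
    return shifted >> bw                          # the high half is the rotated value

# Compare the two forms exhaustively for 8-bit lanes.
mismatches = [
    (x, y)
    for x in range(256)
    for y in range(20)
    if rotl_reference(x, y, 8) != rotl_via_unpack(x, y, 8)
]
```

The low copy of x supplies the bits that wrap around, so one wide shift replaces the pair of opposite shifts and the OR.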

2 years ago[DAG] Constify SelectionDAG::isSplatValue()
Simon Pilgrim [Sun, 19 Dec 2021 17:03:53 +0000 (17:03 +0000)]
[DAG] Constify SelectionDAG::isSplatValue()

This doesn't generate any nodes so should be usable by methods with const SelectionDAG &.

2 years ago[FuncSpec] Rename internal option. NFC.
Sjoerd Meijer [Tue, 21 Dec 2021 10:19:49 +0000 (10:19 +0000)]
[FuncSpec] Rename internal option. NFC.

Rename option MaxConstantsThreshold to MaxClonesThreshold. Not only is this
more descriptive, this is also in preparation of introducing another threshold
to analyse more than just 1 constant argument as we currently do, and to better
distinguish these options/thresholds.

2 years ago[llvm-mca] Compare multiple files
Djordje Todorovic [Tue, 21 Dec 2021 10:53:21 +0000 (11:53 +0100)]
[llvm-mca] Compare multiple files

The script (llvm-mca-compare.py) uses the llvm-mca tool to print
statistics to the console for multiple files.
It requires the --llvm-mca-binary option, which specifies the
relative path to the llvm-mca binary.
The options --args [="-option1=<arg> -option2=<arg> ..."], -v, and -h can also be used.

The script is used as follows:
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca

Patch by Milica Matic <Milica.Matic@syrmia.com>

Differential Revision: https://reviews.llvm.org/D115138

2 years ago[mlir][Support] Avoid multiplication in floorDiv / ceilDiv
Stephan Herhut [Tue, 21 Dec 2021 10:31:28 +0000 (11:31 +0100)]
[mlir][Support] Avoid multiplication in floorDiv / ceilDiv

Using comparisons instead avoids potential overflow.
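
A hedged Python sketch of the idea: decide the rounding adjustment from sign comparisons rather than from the sign of a product. Python integers do not overflow, so the overflow itself is only illustrated against an assumed 64-bit limit. The actual MLIR helpers may be written differently.

```python
INT64_MAX = 2**63 - 1

def trunc_div(lhs, rhs):
    """Division rounding toward zero, as in C and C++."""
    q = abs(lhs) // abs(rhs)
    return q if (lhs < 0) == (rhs < 0) else -q

def has_remainder(lhs, rhs):
    """True when rhs does not divide lhs exactly."""
    return abs(lhs) % abs(rhs) != 0

def ceil_div(lhs, rhs):
    """Round up: adjust only inexact quotients whose true value is positive."""
    same_sign = (lhs < 0) == (rhs < 0)
    q = trunc_div(lhs, rhs)
    return q + 1 if has_remainder(lhs, rhs) and same_sign else q

def floor_div(lhs, rhs):
    """Round down: adjust only inexact quotients whose true value is negative."""
    same_sign = (lhs < 0) == (rhs < 0)
    q = trunc_div(lhs, rhs)
    return q - 1 if has_remainder(lhs, rhs) and not same_sign else q

# A product-based sign test would need lhs * rhs, which can leave the int64 range.
product_overflows = (2**62) * 4 > INT64_MAX
```

Comparing `lhs < 0` with `rhs < 0` yields the same sign information as the product without ever forming a value larger than the operands.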

Differential Revision: https://reviews.llvm.org/D116096

2 years ago[CodeGen] Avoid pointee type access during global var declaration
Nikita Popov [Tue, 21 Dec 2021 10:44:45 +0000 (11:44 +0100)]
[CodeGen] Avoid pointee type access during global var declaration

All callers pass in a GlobalVariable, so we can conveniently fetch
the type from there.

2 years ago[AArch64][SVE] Lower shuffles to permute instructions: zip1/2, uzp1/2, trn1/2
Andrew Wei [Tue, 21 Dec 2021 10:14:21 +0000 (18:14 +0800)]
[AArch64][SVE] Lower shuffles to permute instructions: zip1/2, uzp1/2, trn1/2

Attempt to lower a shuffle as a permute instruction (zip/uzp/trn) for fixed-length SVE.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D113376

2 years agotsan: remove unused ReportMutex::destroyed
Dmitry Vyukov [Tue, 16 Nov 2021 08:02:59 +0000 (09:02 +0100)]
tsan: remove unused ReportMutex::destroyed

Depends on D113980.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113981

2 years agotsan: change ReportMutex::id type to int
Dmitry Vyukov [Tue, 16 Nov 2021 08:09:14 +0000 (09:09 +0100)]
tsan: change ReportMutex::id type to int

We used to use u64 as mutex id because it was some
tricky identifier built from address and reuse count.
Now it's just the mutex index in the report (0, 1, 2...),
so use int to represent it.

Depends on D112603.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113980

2 years agoRevert "[NFC] [C++20] [Modules] Add tests for template instantiation in transitively...
Chuanqi Xu [Tue, 21 Dec 2021 10:33:29 +0000 (18:33 +0800)]
Revert "[NFC] [C++20] [Modules] Add tests for template instantiation in transitively imported module"

This reverts commit 4f103e956157515dd800951f73ed550b1a0477f4.

The tests couldn't pass on Windows.

2 years agotsan: optimize __tsan_read/write16
Dmitry Vyukov [Thu, 25 Nov 2021 14:44:19 +0000 (15:44 +0100)]
tsan: optimize __tsan_read/write16

These callbacks are used for SSE vector accesses.
In some computational programs these accesses dominate.
Currently we do 2 uninlined 8-byte accesses to handle them.
Inline and optimize them similarly to unaligned accesses.
This reduces the vector access benchmark time from 8 to 3 seconds.

Depends on D112603.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114594

2 years ago[NFC] [C++20] [Modules] Add tests for template instantiation in transitively imported...
Chuanqi Xu [Tue, 21 Dec 2021 09:36:50 +0000 (17:36 +0800)]
[NFC] [C++20] [Modules] Add tests for template instantiation in transitively imported module

This commit adds two tests for template class instantiation in a
transitively imported module. They are used as pre-commit tests for
successive patches.

2 years ago[ELF] Remove unneeded SectionBase::repl indirection
Fangrui Song [Tue, 21 Dec 2021 08:39:16 +0000 (00:39 -0800)]
[ELF] Remove unneeded SectionBase::repl indirection

sec->repl equals sec after rL371216.

2 years ago[VE] FADD,FSUB,FMUL,FDIV v256f32|f64 isel and tests
Simon Moll [Tue, 21 Dec 2021 08:15:23 +0000 (09:15 +0100)]
[VE] FADD,FSUB,FMUL,FDIV v256f32|f64 isel and tests

Depends on D115940 for the `Binary_rv_vr_vv` pattern class op isel
fragment used for divisions.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D116035

2 years ago[ConstantFold][GlobalOpt] Don't create x86_mmx null value
Nikita Popov [Tue, 21 Dec 2021 08:11:41 +0000 (09:11 +0100)]
[ConstantFold][GlobalOpt] Don't create x86_mmx null value

This fixes the assertion failure reported at
https://reviews.llvm.org/D114889#3198921 with a straightforward
check, until the cleaner fix in D115924 can be reapplied.

2 years ago[InstCombine] Drop outdated alignment comment (NFC)
Nikita Popov [Tue, 21 Dec 2021 07:58:48 +0000 (08:58 +0100)]
[InstCombine] Drop outdated alignment comment (NFC)

Loads always have an alignment now, so this is no longer relevant.

2 years ago[VE] U|SDIV v256i32|64 isel and tests
Simon Moll [Tue, 21 Dec 2021 07:43:31 +0000 (08:43 +0100)]
[VE] U|SDIV v256i32|64 isel and tests

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115940

2 years ago[llvm] Construct SmallVector with iterator ranges (NFC)
Kazu Hirata [Tue, 21 Dec 2021 07:43:23 +0000 (23:43 -0800)]
[llvm] Construct SmallVector with iterator ranges (NFC)

2 years ago[ARM] Use range-based for loops (NFC)
Kazu Hirata [Tue, 21 Dec 2021 07:06:47 +0000 (23:06 -0800)]
[ARM] Use range-based for loops (NFC)

2 years ago[RISCV] Precommit tests for override hasAndNotCompare.
jacquesguan [Mon, 20 Dec 2021 03:44:09 +0000 (11:44 +0800)]
[RISCV] Precommit tests for override hasAndNotCompare.

Precommit tests for D115922.

Differential Revision: https://reviews.llvm.org/D116013

2 years ago[AMDGPU][NFC] Update DWARF extension for locations on the stack
Tony Tye [Tue, 21 Dec 2021 04:47:26 +0000 (04:47 +0000)]
[AMDGPU][NFC] Update DWARF extension for locations on the stack

- Improve extension description.
- Rename "What is DWARF?" section to better reflect what it is
  describing.

Reviewed By: kzhuravl

Differential Revision: https://reviews.llvm.org/D116077

2 years agoBPF: report better error message for BTF_TYPE_ID_REMOTE relo failure
Yonghong Song [Mon, 20 Dec 2021 00:58:12 +0000 (16:58 -0800)]
BPF: report better error message for BTF_TYPE_ID_REMOTE relo failure

Matteo Croce reported a bpf backend fatal error in
https://github.com/llvm/llvm-project/issues/52779

A simplified case looks like:
  $ cat bug.c
  extern int do_smth(int);
  int test() {
    return __builtin_btf_type_id(*(typeof(do_smth) *)do_smth, 1);
  }
  $ clang -target bpf -O2 -g -c bug.c
  fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc
  ...

The reason for the fatal error is that the relocation is against
a DISubroutineType like type 13 below:
  !10 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
  !11 = !{}
  !12 = !DILocation(line: 3, column: 10, scope: !7)
  !13 = !DISubroutineType(types: !14)
  !14 = !{!10, !10}

The DISubroutineType doesn't have a name and there is no way for
downstream bpfloader/kernel to do proper relocation for it.

But we can improve the error message to be more specific for this case.
The patch improves the error message to be:
  fatal error: error in backend: SubroutineType not supported for BTF_TYPE_ID_REMOTE reloc

Differential Revision: https://reviews.llvm.org/D116063

2 years ago[PowerPC][llvm-objdump] enable --symbolize-operands for PowerPC ELF/XCOFF.
Esme-Yi [Tue, 21 Dec 2021 04:17:57 +0000 (04:17 +0000)]
[PowerPC][llvm-objdump] enable --symbolize-operands for PowerPC ELF/XCOFF.

Summary: When disassembling, symbolize a branch target operand
to print a label instead of a real address.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D114492

2 years ago[tsan] Disable test from D115759 on Darwin
Vitaly Buka [Tue, 21 Dec 2021 03:37:34 +0000 (19:37 -0800)]
[tsan] Disable test from D115759 on Darwin

2 years ago[NFC] Fix clang-tidy issues in CalcSpillWeights.cpp
Mircea Trofin [Tue, 21 Dec 2021 03:21:31 +0000 (19:21 -0800)]
[NFC] Fix clang-tidy issues in CalcSpillWeights.cpp

2 years ago[memprof][NFC] Fix mismatched-new-delete in memprof tests
Xu Mingjie [Tue, 21 Dec 2021 02:35:08 +0000 (18:35 -0800)]
[memprof][NFC] Fix mismatched-new-delete in memprof tests

Fix mismatched-new-delete in memprof test_new_load_store.cpp and test_terse.cpp

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D116024

2 years agoPort __sanitizer::StopTheWorld to Windows
Clemens Wasser [Tue, 21 Dec 2021 02:25:53 +0000 (18:25 -0800)]
Port __sanitizer::StopTheWorld to Windows

This also makes the sanitizer_stoptheworld_test cross-platform by using the STL, rather than pthread.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D115204

2 years ago[LTO] Fix incomplete optimization remarks for dead functions when PreOptModuleHook...
Xu Mingjie [Tue, 21 Dec 2021 02:16:09 +0000 (18:16 -0800)]
[LTO] Fix incomplete optimization remarks for dead functions when PreOptModuleHook or PostInternalizeModuleHook is defined

In 20a895c4be01769a37dfffb3c6b513a7bc9b8d17, we introduced `finalizeOptimizationRemarks()` to make sure we flush the diagnostic remarks file in case the linker doesn't call the global destructors before exiting.
In https://reviews.llvm.org/D73597, we added optimization remarks for removed functions, for debugging or for detecting dead code.
However, if PreOptModuleHook or PostInternalizeModuleHook is defined (e.g. `--plugin-opt=emit-llvm` is passed to the linker), we do not call `finalizeOptimizationRemarks()`, and therefore we get an incomplete optimization remarks file.
This patch makes sure we flush the diagnostic remarks file when PreOptModuleHook or PostInternalizeModuleHook is defined.

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D115417

2 years ago[sanitizer] Fix compress_stack_depot.cpp test on Darwin
Vitaly Buka [Tue, 21 Dec 2021 02:13:27 +0000 (18:13 -0800)]
[sanitizer] Fix compress_stack_depot.cpp test on Darwin

All platforms which can start the thread should stop it as well.

2 years ago[DSE] Remove calls with known writes to dead memory
Philip Reames [Tue, 21 Dec 2021 02:10:23 +0000 (18:10 -0800)]
[DSE] Remove calls with known writes to dead memory

This is a reapply of a8a51fe5, which was reverted in 1ba99e due to a failing compiler-rt test. That test was a false positive because it was checking for asan failures without accounting for the fact that the call could be validly optimized out. I hopefully managed to stabilize that test in 9b955f. (That's a speculative fix, because the disk consumption needed to build compiler-rt tests locally is absurd.)

Original commit message follows.

The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as if it were a store of unknown size.

Merging the code in this way unblocks DSE in the store-to-dead-memory code paths. In theory, it should also enable classic DSE of such calls, but the code does not yet appear to know how to use object sizes to refine unknown access bounds.

In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.
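
As an illustration of the kind of call this concerns, here is a minimal source-level sketch. The function name and buffer are hypothetical and are not taken from the patch or its tests. The only effect of the call is a write to memory that is never read afterwards, so an optimizer is allowed to drop it. Whether a particular compiler version and flag set actually does so is not something this sketch asserts.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical example: `scratch` is a local buffer that is never read.
int sum_with_dead_write(int a, int b) {
  char scratch[64];
  // The call's only analyzable effect is a write through its first argument.
  // Nothing reads `scratch` before the function returns, so the memory is
  // dead and removing the call does not change observable behaviour.
  std::memset(scratch, 0, sizeof(scratch));
  return a + b;
}
```

The result of the function is the same with or without the call, which is why a test that relies on the call still being present after optimization can be fragile.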

Differential Revision: https://reviews.llvm.org/D115904

2 years agoAttempt to stabilize compiler-rt/test/asan/TestCases/strncpy-overflow.cpp
Philip Reames [Tue, 21 Dec 2021 01:51:11 +0000 (17:51 -0800)]
Attempt to stabilize compiler-rt/test/asan/TestCases/strncpy-overflow.cpp

This attempts to adjust the test to still exercise the expected codepath after D115904. This test is fundamentally rather fragile.

Unfortunately, I have not been able to confirm whether this workaround works. Attempting check-all with compiler-rt consumes an additional 30GB of disk space, which exceeds the local disk space available to my build config.

2 years agoRevert "[LTO] Add a function `LTOCodeGenerator::getMergedModule`"
Shilei Tian [Tue, 21 Dec 2021 01:34:04 +0000 (20:34 -0500)]
Revert "[LTO] Add a function `LTOCodeGenerator::getMergedModule`"

This reverts commit 75a5eaf7c6d60e95b11ea572b78fdb8788d15ddc.

2 years ago[tsan] Fix Darwin crash after D115759
Vitaly Buka [Mon, 20 Dec 2021 23:41:17 +0000 (15:41 -0800)]
[tsan] Fix Darwin crash after D115759

Remove global constructor which may or may not be needed for Android,
as it breaks Darwin now.

2 years agodocs: Clarify licensing rules for the project
Tom Stellard [Tue, 21 Dec 2021 01:03:06 +0000 (17:03 -0800)]
docs: Clarify licensing rules for the project

Reviewed By: lattner, kristof.beyls

Differential Revision: https://reviews.llvm.org/D113427

2 years ago[LTO] Add a function `LTOCodeGenerator::getMergedModule`
Shilei Tian [Tue, 21 Dec 2021 01:01:37 +0000 (20:01 -0500)]
[LTO] Add a function `LTOCodeGenerator::getMergedModule`

One of the uses of `LTOCodeGenerator` is as a combined middle end and back end. Sometimes
it is very helpful to access the optimized module, especially to get information from it.
If the information can change during optimization, it cannot be obtained before the
module is added to `LTOCodeGenerator`. This patch adds a function
`LTOCodeGenerator::getMergedModule` to access the `MergedModule`.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D114201

2 years agoFix clang-tidy issues in mlir/ (NFC)
Mehdi Amini [Fri, 17 Dec 2021 18:45:10 +0000 (18:45 +0000)]
Fix clang-tidy issues in mlir/ (NFC)

Differential Revision: https://reviews.llvm.org/D115956

2 years agoAArch64/GlobalISel: Fix memory type in test
Matt Arsenault [Sat, 18 Dec 2021 19:32:39 +0000 (14:32 -0500)]
AArch64/GlobalISel: Fix memory type in test

2 years agoAMDGPU/GlobalISel: Stop using NarrowScalar/FewerElements for unaligned splitting
Matt Arsenault [Mon, 26 Jul 2021 21:22:01 +0000 (17:22 -0400)]
AMDGPU/GlobalISel: Stop using NarrowScalar/FewerElements for unaligned splitting

These actions should only be used for adjusting the register types
(and the memory type as needed to satisfy the register
type). Unaligned accesses should be split as a type of lowering.

This has the effect of improving the code in many cases since now we
produce zextloads instead of separate loads with ands. The load/store
legality rules still seem far more complicated than necessary though.

2 years ago[Analysis] fix cast in ValueTracking to allow constant expression
Sanjay Patel [Mon, 20 Dec 2021 21:13:55 +0000 (16:13 -0500)]
[Analysis] fix cast in ValueTracking to allow constant expression

The test would crash because a non-instruction negate op made it in here.

Fixes #51506

2 years ago[mlir][arith] Clean up ExpandOps pass
Mogball [Mon, 20 Dec 2021 21:58:39 +0000 (21:58 +0000)]
[mlir][arith] Clean up ExpandOps pass

2 years ago[docs]LLVM Tutorial: fix the typo in Cpu0 URL
Jinsong Ji [Mon, 20 Dec 2021 21:45:51 +0000 (21:45 +0000)]
[docs]LLVM Tutorial: fix the typo in Cpu0 URL

jonathan2251.github.com/lbd/ returns a 404. Update the URL to the .io one, according
to https://github.com/Jonathan2251/lbd/blob/master/README.md.

2 years ago[mlir] Add `mlir/unittests/BUILD.bazel`
Mogball [Mon, 20 Dec 2021 21:36:50 +0000 (21:36 +0000)]
[mlir] Add `mlir/unittests/BUILD.bazel`

Unit tests are not getting built as part of bazel runs.

Reviewed By: mehdi_amini, GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D116046

2 years ago[mlir][ods] Fix incorrect comments in PassGen (NFC)
Mogball [Mon, 20 Dec 2021 21:10:53 +0000 (21:10 +0000)]
[mlir][ods] Fix incorrect comments in PassGen (NFC)

Also fix the incorrect command option description for `-gen-pass-decls`.

2 years ago[Clang] Add __builtin_function_start
Sami Tolvanen [Tue, 10 Aug 2021 17:03:01 +0000 (10:03 -0700)]
[Clang] Add __builtin_function_start

Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.

This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.
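
A minimal usage sketch follows. The handler function, the helper, and the `HAVE_BUILTIN_FUNCTION_START` macro are hypothetical names introduced for illustration. The sketch assumes a Clang that provides the builtin and falls back to the ordinary function address on compilers that do not. Without CFI enabled, both paths are expected to yield the same address.

```cpp
#include <cassert>
#include <cstdint>

// Detect the builtin without breaking compilers that lack __has_builtin.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_function_start)
#    define HAVE_BUILTIN_FUNCTION_START 1
#  endif
#endif

// Hypothetical low-level handler whose body address is needed.
extern "C" void irq_handler() {}

// Returns the address to install for the handler. With the builtin this is
// the address of the function body; otherwise it is the plain function
// address, which under CFI may refer to a jump table entry instead.
inline std::uintptr_t handler_address() {
#ifdef HAVE_BUILTIN_FUNCTION_START
  return reinterpret_cast<std::uintptr_t>(__builtin_function_start(irq_handler));
#else
  return reinterpret_cast<std::uintptr_t>(&irq_handler);
#endif
}
```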

Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Depends on D108478

Reviewed By: pcc, rjmccall

Differential Revision: https://reviews.llvm.org/D108479

2 years ago[llvm][IR] Add no_cfi constant
Sami Tolvanen [Tue, 10 Aug 2021 17:02:17 +0000 (10:02 -0700)]
[llvm][IR] Add no_cfi constant

With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.

For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.

This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Reviewed By: nickdesaulniers, pcc

Differential Revision: https://reviews.llvm.org/D108478

2 years agoFix clang-tidy issues in mlir/ (NFC)
Mehdi Amini [Mon, 20 Dec 2021 19:45:05 +0000 (19:45 +0000)]
Fix clang-tidy issues in mlir/ (NFC)

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115956

2 years agoSilence warning with MSVC2019
Alexandre Ganea [Mon, 20 Dec 2021 00:28:31 +0000 (19:28 -0500)]
Silence warning with MSVC2019

This prevents "warning C4551: function call missing argument list"

2 years ago[lit] Support relative path arguments
Geoffrey Martin-Noble [Fri, 10 Dec 2021 01:17:00 +0000 (17:17 -0800)]
[lit] Support relative path arguments

Currently the behavior with relative paths is pretty broken. It differs
between external shell and internal shell because the path resolution
is done with a different working directory. With the internal shell,
it's resolved relative to the directory from which lit is executed,
whereas with the external shell it's resolved relative to where the
test case is executed. To make matters worse, using the internal shell
the filepath to binaries looked up with `which` is returned relative
to the directory from which lit is executed, but then executed from
the test execution directory. That means that relative paths with the
internal shell give a `[Errno 2] No such file or directory` error
instead of the expected `command not found`.

To address these issues this patch makes lit interpret relative paths
as relative to the directory from which lit was invoked and modifies
`which` to return absolute paths, matching the behavior of its
namesake unix function.
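
The resolution rule described above can be sketched as follows. lit itself is written in Python, so this C++ fragment is only an illustration of the rule, not lit's code, and `resolve_from_invocation_dir` is a hypothetical name. It requires C++17 for `std::filesystem`.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: interpret `arg` relative to the directory the tool was
// invoked from, and always hand back an absolute, normalized path so that the
// result means the same thing regardless of the later working directory.
// `invocation_dir` is assumed to be absolute.
fs::path resolve_from_invocation_dir(const fs::path &invocation_dir,
                                     const fs::path &arg) {
  if (arg.is_absolute())
    return arg.lexically_normal();
  return (invocation_dir / arg).lexically_normal();
}
```

Because the returned path is absolute, executing it later from the test execution directory finds the same file that was looked up at invocation time.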

See https://groups.google.com/g/llvm-dev/c/KzMWlOXR98Y/m/QJoqn0U5HAAJ

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D115486

2 years ago[OpenMP][libomp] Add use-all syntax to KMP_HW_SUBSET
Jonathan Peyton [Wed, 15 Dec 2021 20:36:44 +0000 (14:36 -0600)]
[OpenMP][libomp] Add use-all syntax to KMP_HW_SUBSET

This patch allows the user to request all resources of a particular
layer (or core-attribute). The syntax of KMP_HW_SUBSET is modified
so the number of units requested is optional or can be replaced with an
'*' character.

e.g., KMP_HW_SUBSET=c:intel_atom@3 will use all the cores after offset 3
e.g., KMP_HW_SUBSET=*c:intel_core will use all the big cores
e.g., KMP_HW_SUBSET=*s,*c,1t will use all the sockets, all cores per
      socket and 1 thread per core.
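
To make the three example values above concrete, here is a simplified parsing sketch. It is not libomp's parser: the `SubsetItem` struct and both functions are hypothetical, only the forms shown in the examples are handled, and no validation is done.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// One comma-separated item, as understood by this sketch only.
struct SubsetItem {
  bool use_all = false;  // count omitted or written as '*'
  int count = 0;         // meaningful only when use_all is false
  std::string layer;     // e.g. "s", "c", "t"
  std::string attr;      // text after ':', empty if absent
  int offset = 0;        // number after '@', 0 if absent
};

SubsetItem parse_item(const std::string &text) {
  SubsetItem item;
  std::size_t i = 0;
  if (i < text.size() && text[i] == '*') {
    item.use_all = true;  // explicit use-all
    ++i;
  } else if (i < text.size() &&
             std::isdigit(static_cast<unsigned char>(text[i]))) {
    while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i])))
      item.count = item.count * 10 + (text[i++] - '0');
  } else {
    item.use_all = true;  // count omitted
  }
  // The layer name runs until an attribute or offset marker.
  while (i < text.size() && text[i] != ':' && text[i] != '@')
    item.layer += text[i++];
  if (i < text.size() && text[i] == ':') {
    ++i;
    while (i < text.size() && text[i] != '@')
      item.attr += text[i++];
  }
  if (i < text.size() && text[i] == '@')
    item.offset = std::stoi(text.substr(i + 1));
  return item;
}

std::vector<SubsetItem> parse_subset(const std::string &value) {
  std::vector<SubsetItem> items;
  std::size_t start = 0;
  while (start <= value.size()) {
    std::size_t comma = value.find(',', start);
    if (comma == std::string::npos)
      comma = value.size();
    items.push_back(parse_item(value.substr(start, comma - start)));
    start = comma + 1;
  }
  return items;
}
```

Under this reading, both an omitted count and '*' set `use_all`, while `1t` yields an explicit count of 1.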

Differential Revision: https://reviews.llvm.org/D115826

2 years ago[flang] Add a semantics test for co_max
Damian Rouson [Wed, 1 Dec 2021 23:55:06 +0000 (15:55 -0800)]
[flang] Add a semantics test for co_max

Test a range of acceptable forms of co_max calls, including
combinations of keyword and non-keyword actual arguments of
numeric types.  Also test that several invalid forms of
co_max call generate the correct error messages.

Reviewed By: ktras

Differential Revision: https://reviews.llvm.org/D113083

2 years ago[Support] Revert posix_fallocate in resize_file
Fangrui Song [Mon, 20 Dec 2021 19:16:03 +0000 (11:16 -0800)]
[Support] Revert posix_fallocate in resize_file

This reverts 3816c53f040cc6aa06425978dd504b0bd5b7899c and removes follow-up
fixups.

The original intention was to report errors for ld.lld earlier (at posix_fallocate
time) rather than later, but it appears to cause some problems which mean it is not free.

* FreeBSD ZFS: EINVAL, not too bad.
* FreeBSD UFS: according to khng "devastatingly slow on freebsd because UFS on freebsd does not have preallocation support like illumos. It zero-fills."
* NetBSD: maybe EOPNOTSUPP
* Linux tmpfs: unless tmpfs is set up to use huge pages (requires CONFIG_TRANSPARENT_HUGE_PAGECACHE=y), I can consistently demonstrate ~300ms delay for a 1.4GiB output.
* Linux ext4: I don't measure any benefit, either backed by a hard disk or by a file in tmpfs.
* The current code organization of `defined(HAVE_POSIX_FALLOCATE)` costs us a macro dispatch for AIX.

I think we should just remove it. If posix_fallocate ever shows a demonstrable benefit,
it is likely to be Linux-specific, will not need HAVE_POSIX_FALLOCATE, and may be opt-in for specific programs.

In a filesystem with CoW and compression, the ENOSPC benefit may be lost as well.
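
The trade-off can be sketched with the two POSIX calls involved. This is not the code from resize_file: the function names are hypothetical and the sketch assumes a POSIX system. `ftruncate` only changes the file size, so running out of space may surface later at write time. `posix_fallocate` reserves the blocks eagerly, so it can report ENOSPC up front, at the costs listed above.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Resize by changing the file size only. Returns 0 or an errno value.
int resize_sparse(int fd, off_t size) {
  return ::ftruncate(fd, size) == 0 ? 0 : errno;
}

// Resize by eagerly allocating storage for [0, size). posix_fallocate
// returns the error number directly instead of setting errno, and it can
// only grow a file, never shrink it.
int resize_preallocated(int fd, off_t size) {
  return ::posix_fallocate(fd, 0, size);
}

// Helper for checking the result: current file size, or -1 on error.
off_t file_size(int fd) {
  struct stat st;
  return ::fstat(fd, &st) == 0 ? st.st_size : -1;
}
```

A caller that wants the early error would use `resize_preallocated` and then has to decide what to do with results such as EINVAL or EOPNOTSUPP. Using `resize_sparse` alone avoids that dispatch.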

Reviewed By: khng300

Differential Revision: https://reviews.llvm.org/D115957