platform/upstream/llvm.git
3 years agoRevert "[InlineCost] Enable the cost benefit analysis on FDO"
Nico Weber [Thu, 25 Mar 2021 20:41:32 +0000 (16:41 -0400)]
Revert "[InlineCost] Enable the cost benefit analysis on FDO"

This reverts commit ef69aa961d12dee2141a79b05c9637d8cc9c0c74.
Makes clang assert in PGO builds, see repro tgz in
https://bugs.chromium.org/p/chromium/issues/detail?id=1192783#c6

3 years ago[clang][driver] Support HWASan in the Fuchsia toolchain
Leonard Chan [Thu, 25 Mar 2021 18:30:44 +0000 (11:30 -0700)]
[clang][driver] Support HWASan in the Fuchsia toolchain

These contain clang driver changes for supporting HWASan on Fuchsia.
This includes hwasan multilibs and the dylib path change.

Differential Revision: https://reviews.llvm.org/D99361

3 years ago[NFCI][SimplifyCFG] Don't pay for a Small{Map,Set}Vector when plain SmallSet will...
Roman Lebedev [Thu, 25 Mar 2021 19:58:10 +0000 (22:58 +0300)]
[NFCI][SimplifyCFG] Don't pay for a Small{Map,Set}Vector when plain SmallSet will suffice

This *only* changes the cases where we *really* don't care
about the iteration order of the underlying container,
namely when we will use the values from it to form DTU updates.

3 years ago[IR] Lift attribute handling for assume bundles into CallBase
Nikita Popov [Wed, 24 Mar 2021 16:56:23 +0000 (17:56 +0100)]
[IR] Lift attribute handling for assume bundles into CallBase

Rather than special-casing assume in BasicAA getModRefBehavior(),
do this one level higher, in the attribute handling of CallBase.

For assumes with operand bundles, the inaccessiblememonly attribute
applies regardless of operand bundles.

3 years ago[PowerPC] auto-generate complete testchecks; NFC
Sanjay Patel [Thu, 25 Mar 2021 18:58:51 +0000 (14:58 -0400)]
[PowerPC] auto-generate complete testchecks; NFC

The full checks demonstrate a problem that comes up in:
https://llvm.org/PR49610

3 years ago[flang] fix spurious runtime crash on TRIM('')
peter klausler [Thu, 25 Mar 2021 18:03:32 +0000 (11:03 -0700)]
[flang] fix spurious runtime crash on TRIM('')

The standard interoperability routine CFI_establish() does not
accept a zero-length CHARACTER type.  Since these can be valid
results of intrinsic function references, work around the design
of CFI_establish() in the wrapper routine that calls it.

Differential Revision: https://reviews.llvm.org/D99296

3 years ago[Support][Windows] Make sure only executables are found by sys::findProgramByName
Markus Böck [Thu, 25 Mar 2021 19:26:20 +0000 (20:26 +0100)]
[Support][Windows] Make sure only executables are found by sys::findProgramByName

The function utilizes Windows' SearchPathW function, which, as I found out today, may also return directories. After looking at the Unix implementation of the function I found that it contains a check whether the found path is also executable. While fixing the Windows implementation, I also learned that sys::fs::access returns successfully when querying whether directories are executable, which the Unix version does not.

This patch makes both of these functions equivalent to their Unix implementation and ensures that any path returned by sys::findProgramByName on Windows is executable, just like in the Unix implementation.

The Unix counterparts of the checks I have added to the Windows implementation are here:
sys::findProgramByName: https://github.com/llvm/llvm-project/blob/39ecfe614350fa5db7b8f13f81212f8e3831a390/llvm/lib/Support/Unix/Program.inc#L90
sys::fs::access: https://github.com/llvm/llvm-project/blob/c2a84771bb63947695ea50b89160c02b36fb634d/llvm/lib/Support/Unix/Path.inc#L608

I encountered this issue when running the LLVM testsuite. Commands of the form not test ... would fail to correctly execute test.exe, which is part of GnuWin32, as it actually tried to execute a folder called test, which happened to be in a directory on my PATH.

Differential Revision: https://reviews.llvm.org/D99357

3 years ago[NFC] Module::getInstructionCount() is const
Mircea Trofin [Thu, 25 Mar 2021 19:28:47 +0000 (12:28 -0700)]
[NFC] Module::getInstructionCount() is const

3 years ago[CUDA][HIP] add __builtin_get_device_side_mangled_name
Yaxun (Sam) Liu [Wed, 24 Mar 2021 21:28:56 +0000 (17:28 -0400)]
[CUDA][HIP] add __builtin_get_device_side_mangled_name

Add builtin function __builtin_get_device_side_mangled_name
to get device side mangled name for functions and global
variables, which can be used to get symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D99301

3 years ago[AMDGPU] Refactoring mfma intrinsic definitions. NFC.
Stanislav Mekhanoshin [Thu, 25 Mar 2021 19:04:57 +0000 (12:04 -0700)]
[AMDGPU] Refactoring mfma intrinsic definitions. NFC.

Differential Revision: https://reviews.llvm.org/D99366

3 years ago[lld-macho][nfc] Removed unnecessary static_cast
Vy Nguyen [Thu, 25 Mar 2021 18:59:54 +0000 (14:59 -0400)]
[lld-macho][nfc] Removed unnecessary static_cast

Differential Revision: https://reviews.llvm.org/D99365

3 years ago[flang][driver] Fix typos and inconsistent comments (nfc)
Andrzej Warzynski [Thu, 25 Mar 2021 18:59:48 +0000 (18:59 +0000)]
[flang][driver] Fix typos and inconsistent comments (nfc)

3 years ago[Hexagon] Limit virtual register reuse range in FI elimination
Krzysztof Parzyszek [Thu, 25 Mar 2021 18:43:55 +0000 (13:43 -0500)]
[Hexagon] Limit virtual register reuse range in FI elimination

3 years ago[lld-macho] Add support for --threads
Jez Ng [Thu, 25 Mar 2021 18:39:45 +0000 (14:39 -0400)]
[lld-macho] Add support for --threads

Code and test are largely identical to the LLD-ELF equivalents.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99312

3 years ago[lld-macho] Add more TimeTraceScopes
Jez Ng [Thu, 25 Mar 2021 18:39:44 +0000 (14:39 -0400)]
[lld-macho] Add more TimeTraceScopes

I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome://tracing`:

https://gist.githubusercontent.com/int3/236c723cbb4b6fa3b2d340bb6395c797/raw/ef5e8234f3fdf609bf93b50f54f4e0d9bd439403/tracing.png

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D99311

3 years ago[lld-macho] Fix typo in diagnostic message
Jez Ng [Wed, 24 Mar 2021 18:43:09 +0000 (14:43 -0400)]
[lld-macho] Fix typo in diagnostic message

3 years ago[JITLink][MachO/x86-64] Remove stale commented-out code.
Lang Hames [Thu, 25 Mar 2021 18:45:30 +0000 (11:45 -0700)]
[JITLink][MachO/x86-64] Remove stale commented-out code.

This commented-out code was accidentally left in during the transition from
MachO-specific to generic x86-64 edge kinds (ecf6466f01c).

3 years agoRemove unused function, fix warning (NFC)
Mehdi Amini [Thu, 25 Mar 2021 18:36:33 +0000 (18:36 +0000)]
Remove unused function, fix warning (NFC)

The `mayNotHaveTerminator` was initially on Block but moved to the
verifier before landing and wasn't removed from its original place
where it is unused.

3 years ago[clang] Pass option directly to command. NFC
Shoaib Meenai [Thu, 25 Mar 2021 07:20:01 +0000 (00:20 -0700)]
[clang] Pass option directly to command. NFC

This code was written back when LLVM's minimum required CMake version
was 2.8.8, and I assume ExternalProject_Add_Step didn't take this option
at that point. It does now though, so we should just use the option.
Setting the _EP_* property is entirely equivalent (and is in fact how
these commands behave internally), but that also feels like an internal
implementation detail we shouldn't be relying on.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D99322

3 years ago[clang] Always execute multi-stage install steps
Shoaib Meenai [Thu, 25 Mar 2021 07:16:47 +0000 (00:16 -0700)]
[clang] Always execute multi-stage install steps

We want installs to be executed even if binaries haven't changed, e.g.
so that we can install to multiple places. This is consistent with how
non-multi-stage install targets (e.g. the regular install-distribution
target) behave.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D99321

3 years ago[flang] Fix error compiling std::min on macos
Tim Keith [Thu, 25 Mar 2021 18:18:39 +0000 (11:18 -0700)]
[flang] Fix error compiling std::min on macos

On macos, `size_t` is `unsigned long` while `size_t - int64_t` is
`unsigned long long` so std::min requires an explicit type to compile.
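
A minimal sketch of the kind of call this affects, assuming a hypothetical helper `clampToRemaining` (the name and parameters are illustrative and not taken from the flang sources). Naming the template argument avoids deducing one type from two differently typed operands:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many of `want` elements still fit in a buffer of
// `capacity` elements when `used` of them are already taken.
std::size_t clampToRemaining(std::size_t want, std::size_t capacity,
                             std::int64_t used) {
  assert(used >= 0 && static_cast<std::uint64_t>(used) <= capacity);
  // `capacity - used` mixes size_t and int64_t. On platforms where the result
  // type differs from size_t, std::min(want, capacity - used) cannot deduce a
  // single T, so T is spelled out explicitly.
  return std::min<std::size_t>(want, capacity - used);
}
```

For example, `clampToRemaining(10, 16, 12)` yields 4 because only four slots remain.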

Differential Revision: https://reviews.llvm.org/D99340

3 years ago[clang][Syntax] Optimize expandedTokens for token ranges.
Utkarsh Saxena [Mon, 22 Mar 2021 14:40:37 +0000 (15:40 +0100)]
[clang][Syntax] Optimize expandedTokens for token ranges.

`expandedTokens(SourceRange)` used to do a binary search to get the
expanded tokens belonging to a source range. Each binary search uses
`isBeforeInTranslationUnit` to order two source locations. This is
inherently very slow.
By profiling clangd we found out that users like clangd::SelectionTree
spend 95% of time in `isBeforeInTranslationUnit`. Also it is worth
noting that users of `expandedTokens(SourceRange)` mostly use ranges
provided by AST to query this function. The ranges provided by AST are
token ranges (starting at the beginning of a token and ending at the
beginning of another token).

Therefore we can avoid the binary search in the majority of cases by
maintaining an index of ExpandedToken by their SourceLocations. We still
do binary search for ranges which are not token ranges but such
instances are quite rare.
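
The idea can be sketched as follows. This is a simplified model, not the clang implementation: `Loc`, `TokenIndex` and `tokensIn` are hypothetical names, and source locations are plain integers so ordering them is cheap here, unlike `isBeforeInTranslationUnit`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for a source location (assumption: plain integers).
using Loc = unsigned;

class TokenIndex {
public:
  // S holds the start location of every token, in increasing order.
  explicit TokenIndex(std::vector<Loc> S) : Starts(std::move(S)) {
    assert(std::is_sorted(Starts.begin(), Starts.end()));
    for (std::size_t I = 0; I < Starts.size(); ++I)
      ByLoc[Starts[I]] = I;
  }

  // Returns the half-open range [first, last) of token positions whose start
  // location lies in [Begin, End].
  std::pair<std::size_t, std::size_t> tokensIn(Loc Begin, Loc End) const {
    auto B = ByLoc.find(Begin);
    auto E = ByLoc.find(End);
    // Fast path: both ends are token starts, so no ordered search is needed.
    if (B != ByLoc.end() && E != ByLoc.end() && B->second <= E->second)
      return std::make_pair(B->second, E->second + 1);
    // Slow path: fall back to binary search for non-token ranges.
    auto First = std::lower_bound(Starts.begin(), Starts.end(), Begin);
    auto Last = std::upper_bound(First, Starts.end(), End);
    return std::make_pair(static_cast<std::size_t>(First - Starts.begin()),
                          static_cast<std::size_t>(Last - Starts.begin()));
  }

private:
  std::vector<Loc> Starts;
  std::unordered_map<Loc, std::size_t> ByLoc;
};
```

Both paths return the same answer for a token range; the hash lookup only saves the repeated location comparisons.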

Performance:
`~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp`
Before: Took 2:10s to complete.
Now: Took 1:13s to complete.

Differential Revision: https://reviews.llvm.org/D99086

3 years ago[flang] fold LOGICAL intrinsic calls
Jean Perier [Thu, 25 Mar 2021 17:36:06 +0000 (18:36 +0100)]
[flang] fold LOGICAL intrinsic calls

Folding of the LOGICAL intrinsic procedure was missing in the front-end, causing
a crash when using it in parameter expressions.
Simply fold LOGICAL calls to evaluate::Convert<T>.

Differential Revision: https://reviews.llvm.org/D99346

3 years ago[clangd] Fix a use-after-free
Kadir Cetinkaya [Thu, 25 Mar 2021 10:04:35 +0000 (11:04 +0100)]
[clangd] Fix a use-after-free

Clangd was storing a reference to a possibly-dead string in compiled
config. This patch fixes the issue by copying suppression strings from
fragments into compiled Config.
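
A minimal sketch of the copying approach, using hypothetical `Fragment` and `Config` types that only mimic the shape of the problem (they are not clangd's real classes):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins: a short-lived parsed fragment and the long-lived
// compiled configuration built from it.
struct Fragment {
  std::vector<std::string> Suppress;
};

struct Config {
  // Owning strings, so entries stay valid after the fragment is destroyed.
  std::set<std::string> Suppress;
};

void apply(const Fragment &F, Config &C) {
  for (const std::string &S : F.Suppress)
    C.Suppress.insert(S); // copies the string into the config
}
```

Had `Config` stored non-owning views into the fragment's strings instead, any lookup after the fragment went out of scope would read freed memory.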

Fixes https://github.com/clangd/clangd/issues/724.

Differential Revision: https://reviews.llvm.org/D99326

3 years ago[Analyzer] Infer 0 value when the divisible is 0 (bug fix)
Gabor Marton [Thu, 25 Mar 2021 14:29:41 +0000 (15:29 +0100)]
[Analyzer] Infer 0 value when the divisible is 0 (bug fix)

Currently, we infer 0 if the dividend of the modulo op is 0:
  int a = x < 0; // a can be 0
  int b = a % y; // b is either 1 % sym or 0
However, we don't when the op is / :
  int a = x < 0; // a can be 0
  int b = a / y; // b is either 1 / sym or 0 / sym

This commit fixes the discrepancy.
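
The rule can be illustrated with a toy evaluator. This is a hypothetical model (`SymVal` and `evalDivOrRem` are made-up names, not analyzer APIs) in which a value is either a known constant or an unknown symbol:

```cpp
#include <cassert>

// Toy symbolic value: either a known constant or an unknown symbol.
struct SymVal {
  bool Known;      // true if Value is a concrete constant
  long long Value; // meaningful only when Known
};

// Evaluates LHS / RHS or LHS % RHS under the model above.
SymVal evalDivOrRem(char Op, SymVal LHS, SymVal RHS) {
  assert(Op == '/' || Op == '%');
  const SymVal Unknown = {false, 0};
  // Division by a known zero is undefined; do not infer anything.
  if (RHS.Known && RHS.Value == 0)
    return Unknown;
  // 0 / y and 0 % y are both 0 for every non-zero y, even if y is a symbol.
  if (LHS.Known && LHS.Value == 0)
    return SymVal{true, 0};
  if (LHS.Known && RHS.Known)
    return SymVal{true, Op == '/' ? LHS.Value / RHS.Value
                                  : LHS.Value % RHS.Value};
  return Unknown;
}
```

With this rule the `/` and `%` cases behave the same way when the left operand is known to be 0.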

Differential Revision: https://reviews.llvm.org/D99343

3 years ago[libc++] [C++2b] [P2162] Allow inheritance from std::variant.
Marek Kurdej [Thu, 25 Mar 2021 17:09:11 +0000 (18:09 +0100)]
[libc++] [C++2b] [P2162] Allow inheritance from std::variant.

This patch changes the variant even in pre-C++2b.
It should not break anything, only allow use cases that didn't work previously.
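
A sketch of the use case P2162 is about, assuming a standard library that implements the paper. `Value` and `describe` are hypothetical names chosen for illustration:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// A user type that publicly inherits from a std::variant specialization.
struct Value : std::variant<int, std::string> {
  explicit Value(int I) : variant(I) {}
  explicit Value(std::string S) : variant(std::move(S)) {}
};

// std::visit is applied to the derived type directly; with P2162 it is
// visited as its std::variant base.
std::string describe(const Value &V) {
  struct Visitor {
    std::string operator()(int I) const { return "int:" + std::to_string(I); }
    std::string operator()(const std::string &S) const {
      return "string:" + S;
    }
  };
  return std::visit(Visitor{}, V);
}
```

On a library without P2162 support the `std::visit` call may fail to compile, which is the situation this change addresses for libc++.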

Notes:
 `__as_variant` is used in `__visitation::__variant::__visit_alt`, but I haven't used it in `__visitation::__variant::__visit_alt_at`.
That's because it is used only in `__visit_value_at`, which in turn is always used on variant specializations (that's in comparison operators).

* https://wg21.link/P2162

Reviewed By: ldionne, #libc, Quuxplusone

Differential Revision: https://reviews.llvm.org/D97394

3 years ago[mlir][linalg] Add output tensor args folding for linalg.tiled_loop.
Alexander Belyaev [Thu, 25 Mar 2021 17:08:30 +0000 (18:08 +0100)]
[mlir][linalg] Add output tensor args folding for linalg.tiled_loop.

Folds away TiledLoopOp output tensors when the following conditions are met:
* result of `linalg.tiled_loop` has no uses
* output tensor is the argument of `linalg.yield`

Example:

```
%0 = linalg.tiled_loop ...  outs (%out, %out_buf:tensor<...>, memref<...>) {
  ...
  linalg.yield %out : tensor ...
}
```

Becomes

```
linalg.tiled_loop ...  outs (%out_buf:memref<...>) {
  ...
  linalg.yield
}
```

Differential Revision: https://reviews.llvm.org/D99333

3 years ago[flang][driver] Add options for -std=f2018
Arnamoy Bhattacharyya [Thu, 25 Mar 2021 17:02:05 +0000 (13:02 -0400)]
[flang][driver] Add options for -std=f2018

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D97119

3 years agoRevert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existin...
Uday Bondhugula [Thu, 25 Mar 2021 11:23:45 +0000 (16:53 +0530)]
Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants."

This reverts commit 361b7d125b438cda13fa45f13790767a62252be9 by Chris
Lattner <clattner@nondot.org> dated Fri Mar 19 21:22:15 2021 -0700.

The change to the greedy rewriter driver picking a different order was
made without adequate analysis of the trade-offs and experimentation. A
change like this has far reaching consequences on transformation
pipelines, and a major impact upstream and downstream. For example, one
can’t be sure that it doesn’t slow down a large number of cases by small
amounts or create other issues. More discussion here:
https://llvm.discourse.group/t/speeding-up-canonicalize/3015/25

Reverting this so that improvements to the traversal order can be made
on a clean slate, in bigger steps, and with a higher bar.

Differential Revision: https://reviews.llvm.org/D99329

3 years ago[ARM] Revert WhileLoopStartLR to DoLoopStart
David Green [Thu, 25 Mar 2021 16:44:15 +0000 (16:44 +0000)]
[ARM] Revert WhileLoopStartLR to DoLoopStart

If a WhileLoopStartLR is reverted due to calls in the preheader, we may
still be able to instead create a DoLoopStart, preserving the low
overhead loop. This adds code for that, only reverting the
WhileLoopStartLR to a Br/Cmp, leaving the rest of the low overhead loop
in place.

Differential Revision: https://reviews.llvm.org/D98413

3 years ago[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffffffff).
Craig Topper [Thu, 25 Mar 2021 06:23:16 +0000 (23:23 -0700)]
[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffffffff).

We look for this pattern frequently in isel patterns so it's a
good idea to try to preserve it.

This also lets us remove our special isel handling for srliw
and use a direct pattern match of (srl (and X, 0xffffffff), C)
since no bits will be removed from the and mask.
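
Why the direct pattern match is sound can be checked with a small arithmetic model. The two functions below are illustrative stand-ins (not LLVM code): one models the RV64 `srliw` instruction, the other the generic `(srl (and X, 0xffffffff), C)` expression.

```cpp
#include <cassert>
#include <cstdint>

// Models RV64 SRLIW: logical right shift of the low 32 bits of X, with the
// 32-bit result sign-extended to 64 bits.
std::uint64_t srliw(std::uint64_t X, unsigned Shamt) {
  assert(Shamt < 32);
  std::uint32_t Low = static_cast<std::uint32_t>(X) >> Shamt;
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(Low)));
}

// Models the generic DAG expression (srl (and X, 0xffffffff), C).
std::uint64_t srlOfMaskedLow32(std::uint64_t X, unsigned Shamt) {
  assert(Shamt < 32);
  return (X & 0xffffffffULL) >> Shamt;
}
```

For any shift amount from 1 to 31 the shifted 32-bit value has bit 31 clear, so the sign extension is a no-op and both functions agree. They differ only for a shift of 0 with bit 31 set, a case a real shift node would not normally reach because a shift by zero folds away.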

Differential Revision: https://reviews.llvm.org/D99042

3 years agoFix: Reordering parameters in getFile and getFileOrSTDIN
Abhina Sreeskantharajan [Thu, 25 Mar 2021 15:55:30 +0000 (11:55 -0400)]
Fix: Reordering parameters in getFile and getFileOrSTDIN

There was a new getFileOrSTDIN call added recently which was not included in my patch. https://reviews.llvm.org/D99110
I reordered the args to match the new order.

Reviewed By: tunz

Differential Revision: https://reviews.llvm.org/D99349

3 years ago[SLP] Fix crash in reduction for integer min/max
Yevgeny Rouban [Thu, 25 Mar 2021 14:32:55 +0000 (21:32 +0700)]
[SLP] Fix crash in reduction for integer min/max

The SCEV commit b46c085d2b6d1 [NFCI] SCEVExpander:
    emit intrinsics for integral {u,s}{min,max} SCEV expressions
seems to reveal a new crash in SLPVectorizer.
SLP crashes expecting a SelectInst as an externally used value
but umin() call is found.

The patch relaxes the assumption to make the IR flag propagation safe.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D99328

3 years ago[clang-tidy] Fix mpi checks when running multiple TUs per clang-tidy process
Nathan James [Thu, 25 Mar 2021 14:38:35 +0000 (14:38 +0000)]
[clang-tidy] Fix mpi checks when running multiple TUs per clang-tidy process

Both the mpi-type-mismatch and mpi-buffer-deref checks make use of a static MPIFunctionClassifier object.
This causes issues as the classifier is initialized with the first ASTContext that produces a match.
If the check is enabled on multiple translation units in a single clang-tidy process, this classifier won't be reinitialized for each TU. I'm not an expert in the MPIFunctionClassifier but I'd imagine this is a source of UB.
It is suspected that this bug may result in the crash reported here: https://bugs.llvm.org/show_bug.cgi?id=48985. However, even if that is not the case, this should still be addressed.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D98275

3 years agoReuse `os` variable in AllocateTarget; NFC
Sven van Haastregt [Thu, 25 Mar 2021 14:38:02 +0000 (14:38 +0000)]
Reuse `os` variable in AllocateTarget; NFC

3 years agoadd print-change diff modes that do not use colour
Jamie Schmeiser [Thu, 25 Mar 2021 14:32:13 +0000 (10:32 -0400)]
add print-change diff modes that do not use colour

Summary:
The colour characters currently added to the output of -print-changed=diff
and -print-changed=diff-quiet cause difficulties when capturing the output
and examining it in an editor. Change the function to not have the colour
characters and add 2 new choices (-print-changed=cdiff and
-print-changed=cdiff-quiet) to retain the existing functionality of adding
the colour characters.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D97398

3 years ago[libc++] Eliminate <compare>'s dependency on <array>.
Arthur O'Dwyer [Wed, 24 Mar 2021 23:14:51 +0000 (19:14 -0400)]
[libc++] Eliminate <compare>'s dependency on <array>.

This refactor is not only a good idea, but is in fact required by the standard,
in the sense that <array> is mandated to include <compare>.
So <compare> shouldn't have a circular dependency on <array>!

Differential Revision: https://reviews.llvm.org/D99307

3 years ago[libc++] [P1032] Misc constexpr bits in <iterator>, <string_view>, <tuple>, <utility>.
Arthur O'Dwyer [Wed, 10 Feb 2021 00:12:16 +0000 (19:12 -0500)]
[libc++] [P1032] Misc constexpr bits in <iterator>, <string_view>, <tuple>, <utility>.

This completes the implementation of P1032's changes to <iterator>,
<string_view>, <tuple>, and <utility> in C++20.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1032r1.html

Drive-by fix a couple of unintended rvalues in "*iterators*/*.fail.cpp".

Differential Revision: https://reviews.llvm.org/D96385

3 years ago[SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads
Kerry McLaughlin [Thu, 25 Mar 2021 13:37:02 +0000 (13:37 +0000)]
[SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads

D95598 added a cost model for broadcast shuffle, which should enable loops
such as the following to vectorize, where the load of b[42] is invariant
and can be done using a scalar load + splat:

  for (int i=0; i<n; ++i)
    a[i] = b[i] + b[42];

This patch adds tests to verify that we can vectorize such loops.
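
As a scalar analogy for "scalar load + splat", the invariant load can be hoisted out of the loop. The sketch below uses hypothetical function names and assumes that `a` and `b` do not overlap and that `b` has at least 43 elements; without the no-overlap assumption the two forms are not equivalent.

```cpp
#include <cassert>

// The loop as written in the commit message: b[42] is reloaded every iteration.
void addInvariantNaive(int *a, const int *b, int n) {
  for (int i = 0; i < n; ++i)
    a[i] = b[i] + b[42];
}

// Hoisted form: one scalar load before the loop. A vectorizer would then
// broadcast (splat) `inv` across all vector lanes.
void addInvariantHoisted(int *a, const int *b, int n) {
  assert(n >= 0);
  const int inv = b[42];
  for (int i = 0; i < n; ++i)
    a[i] = b[i] + inv;
}
```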

Reviewed By: joechrisellis

Differential Revision: https://reviews.llvm.org/D98506

3 years ago[HWASan] Use page aliasing on x86_64.
Matt Morehouse [Thu, 25 Mar 2021 13:34:25 +0000 (06:34 -0700)]
[HWASan] Use page aliasing on x86_64.

Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses.  This should enable
easier experimentation with HWASan on x86_64 platforms.

Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.

Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes.  This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.

Reviewed By: vitalybuka, eugenis

Differential Revision: https://reviews.llvm.org/D98875

3 years ago[NFC] Reordering parameters in getFile and getFileOrSTDIN
Abhina Sreeskantharajan [Thu, 25 Mar 2021 13:47:25 +0000 (09:47 -0400)]
[NFC] Reordering parameters in getFile and getFileOrSTDIN

In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used.

```
  static ErrorOr<std::unique_ptr<MemoryBuffer>>
  getFile(const Twine &Filename, bool IsText = false,
          bool RequiresNullTerminator = true, bool IsVolatile = false);

  static ErrorOr<std::unique_ptr<MemoryBuffer>>
  getFileOrSTDIN(const Twine &Filename, bool IsText = false,
                 bool RequiresNullTerminator = true);

  static ErrorOr<std::unique_ptr<MB>>
  getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset,
             bool IsText, bool RequiresNullTerminator, bool IsVolatile);

  static ErrorOr<std::unique_ptr<WritableMemoryBuffer>>
  getFile(const Twine &Filename, bool IsVolatile = false);
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D99182

3 years agofix readability-braces-around-statements Stmt type dependency
Alexander Lanin [Thu, 25 Mar 2021 13:44:41 +0000 (09:44 -0400)]
fix readability-braces-around-statements Stmt type dependency

Replaces Token based approach to identify EndLoc of Stmt with AST traversal.
This also improves handling of macros.

Fixes Bugs 22785, 25970 and 35754.

3 years ago[SystemZ][z/OS] csv files should be text files
Abhina Sreeskantharajan [Thu, 25 Mar 2021 13:18:49 +0000 (09:18 -0400)]
[SystemZ][z/OS] csv files should be text files

This patch sets the OF_Text flag correctly for the csv file.

Reviewed By: anirudhp

Differential Revision: https://reviews.llvm.org/D99285

3 years ago[SLP]Improve and simplify extendSchedulingRegion.
Alexey Bataev [Wed, 24 Mar 2021 14:13:58 +0000 (07:13 -0700)]
[SLP]Improve and simplify extendSchedulingRegion.

We do not need to scan further if the upper end or lower end of the
basic block is reached already and the instruction is not found. It
means that the instruction is definitely in the lower part of the basic
block or in the upper part, respectively.
This should improve compile time for very big basic blocks.

Differential Revision: https://reviews.llvm.org/D99266

3 years ago[Debugify] Expose original debug info preservation check as CC1 option
Djordje Todorovic [Thu, 11 Mar 2021 14:55:13 +0000 (06:55 -0800)]
[Debugify] Expose original debug info preservation check as CC1 option

In order to test the preservation of the original Debug Info metadata
in your projects, a front end option could be very useful, since users
usually report that a concrete entity (e.g. variable x, or function fn2())
is missing debug info. The [0] is an example of running the utility
on GDB Project.

This depends on: D82546 and D82545.

Differential Revision: https://reviews.llvm.org/D82547

3 years ago[X86][SSE] Add pmulh tests where the source ops are not generated from sign/zero...
Simon Pilgrim [Thu, 25 Mar 2021 12:12:04 +0000 (12:12 +0000)]
[X86][SSE] Add pmulh tests where the source ops are not generated from sign/zero-extends

3 years ago[X86][SSE] Rename pmulh tests to show they're from sign/zero-extends
Simon Pilgrim [Thu, 25 Mar 2021 11:52:28 +0000 (11:52 +0000)]
[X86][SSE] Rename pmulh tests to show they're from sign/zero-extends

I'm intending to add additional coverage based off computeKnownBits/ComputeNumSignBits as suggested by PR45897

3 years ago[RISCV] Optimize select-like vector shuffles
Fraser Cormack [Wed, 24 Mar 2021 14:54:20 +0000 (14:54 +0000)]
[RISCV] Optimize select-like vector shuffles

This patch adds a small optimization for vector shuffle lowering,
detecting shuffles which can be re-expressed as vector selects.
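
One plausible reading of "select-like" is that every result lane `I` takes lane `I` of either the first or the second source vector. A sketch of that check, under this assumption and with a hypothetical function name:

```cpp
#include <cassert>
#include <vector>

// Mask uses the usual two-source numbering: indices 0..N-1 pick from the
// first vector, N..2N-1 from the second. Negative entries mean "undefined".
bool isSelectLikeMask(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  for (int I = 0; I < N; ++I) {
    if (Mask[I] < 0)
      continue; // an undefined lane can be satisfied by either source
    if (Mask[I] != I && Mask[I] != I + N)
      return false; // this lane moves an element to a different position
  }
  return true;
}
```

For a four-element shuffle, the mask `{0, 5, 2, 7}` is select-like (lanes 1 and 3 come from the second source), while `{1, 0, 3, 2}` is a genuine permutation.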

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99270

3 years ago[PowerPC][NFC] Provide legacy names for VSX loads and stores
Nemanja Ivanovic [Thu, 25 Mar 2021 11:32:12 +0000 (06:32 -0500)]
[PowerPC][NFC] Provide legacy names for VSX loads and stores

Before we unified the names of the builtins across all the
compilers, there were a number of synonyms between them. There
is code out there that uses XL naming for some of these loads and
stores. This just adds those names.

3 years ago[NewPM] Disable non-trivial loop-unswitch on targets with divergence
Sameer Sahasrabuddhe [Thu, 25 Mar 2021 11:27:10 +0000 (11:27 +0000)]
[NewPM] Disable non-trivial loop-unswitch on targets with divergence

Unswitching a loop on a non-trivial divergent branch is expensive
since it serializes the execution of both versions of the
loop. But identifying a divergent branch needs divergence analysis,
which is a function level analysis.

The legacy pass manager handles this dependency by isolating such a
loop transform and rerunning the required function analyses. This
functionality is currently missing in the new pass manager, and there
is no safe way for the SimpleLoopUnswitch pass to depend on
DivergenceAnalysis. So we conservatively assume that all non-trivial
branches are divergent if the target has divergence.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D98958

3 years ago[RISCV] Pre-commit shuffle test cases for D99270
Fraser Cormack [Wed, 24 Mar 2021 14:50:21 +0000 (14:50 +0000)]
[RISCV] Pre-commit shuffle test cases for D99270

3 years ago[RISCV] Optimize BUILD_VECTOR sequences that reveal hidden splats
Fraser Cormack [Tue, 23 Mar 2021 12:28:35 +0000 (12:28 +0000)]
[RISCV] Optimize BUILD_VECTOR sequences that reveal hidden splats

This patch adds further optimization techniques to RVV BUILD_VECTOR
lowering. It teaches the compiler to find splats of larger vector
element types "hidden" in smaller ones. For example, a v4i8 build_vector
(0x1, 0x2, 0x1, 0x2) could be splat as v2i16 0x0201. This is generally
more optimal than the dominant-element BUILD_VECTORs and so takes
priority.
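
The v4i8 example can be reproduced with a short sketch. `findHiddenSplat16` is a hypothetical helper that handles only the 8-bit to 16-bit case and assumes little-endian element order; it is not the lowering code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true and sets Splat if the byte elements form a repetition of their
// first two elements, i.e. a splat of one 16-bit value.
bool findHiddenSplat16(const std::vector<std::uint8_t> &Elts,
                       std::uint16_t &Splat) {
  if (Elts.size() < 2 || Elts.size() % 2 != 0)
    return false;
  for (std::size_t I = 2; I < Elts.size(); I += 2)
    if (Elts[I] != Elts[0] || Elts[I + 1] != Elts[1])
      return false;
  // Element 0 becomes the low byte, element 1 the high byte.
  Splat = static_cast<std::uint16_t>(Elts[0] | (Elts[1] << 8));
  return true;
}
```

For the elements `(0x1, 0x2, 0x1, 0x2)` this yields the 16-bit splat value `0x0201`, matching the example above.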

This optimization is currently limited to all-constant-or-undef
BUILD_VECTORs as those were found to be the most common. There's no
reason this couldn't be extended to other BUILD_VECTORs, but the
additional bit-manipulation instructions may require more sophisticated
heuristics.

There are some cases where the materialization of the larger constant
takes more scalar instructions than it does to build the vector with
vector instructions. We could add heuristics to try and catch this.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99195

3 years ago[X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets
Simon Pilgrim [Thu, 25 Mar 2021 10:34:34 +0000 (10:34 +0000)]
[X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets

Until AVX512 we don't have any vector truncation instructions, and always lower using shuffles instead.

combineVectorTruncation performs this earlier than lowering as it makes it easier to use any sign/zero-extended bits in the truncated bits with PACKSS/PACKUS to perform the shuffle.

We currently don't attempt to use combineVectorTruncation on AVX2 targets as in the past 256-bit PACKSS/PACKUS tended to cause 128-bit lane shuffle regressions - but these should now be all resolved with combineHorizOpWithShuffle and in all cases we now reduce the amount of cross-lane shuffling and variable shuffle mask usage.

Differential Revision: https://reviews.llvm.org/D96609

3 years ago[X86][AVX] splitIntVSETCC - handle separate (canonicalized) SETCC operands
Simon Pilgrim [Thu, 25 Mar 2021 10:01:52 +0000 (10:01 +0000)]
[X86][AVX] splitIntVSETCC - handle separate (canonicalized) SETCC operands

LowerVSETCC calls splitIntVSETCC after canonicalizing certain patterns, in particular (X & CPow2 != 0) -> (X & CPow2 == CPow2).

Unfortunately if we're splitting for AVX1/non-AVX512BW cases, we lose these canonicalizations as we call the split with the original SetCC node, and when the split nodes are later lowered in LowerVSETCC the patterns are lost behind extract_subvector etc. But if we pass the canonicalized operands for splitting we retain the optimizations.

Differential Revision: https://reviews.llvm.org/D99256

3 years ago[clang-format] Fix ObjC method indent after f7f9f94b
Krasimir Georgiev [Mon, 22 Mar 2021 09:52:19 +0000 (10:52 +0100)]
[clang-format] Fix ObjC method indent after f7f9f94b

Commit
https://github.com/llvm/llvm-project/commit/f7f9f94b2e2b4c714bac9036f6b73a3df42daaff
changed the indent of ObjC method arguments from +4 to +2, if the method
occurs after a block statement.  I believe this was unintentional and there
was insufficient ObjC test coverage to catch this.

Example: `clang-format -style=google test.mm`

before:
```
void aaaaaaaaaaaaaaaaaaaaa(int c) {
  if (c) {
    f();
  }
  [dddddddddddddddddddddddddddddddddddddddddddddddddddddddd
      eeeeeeeeeeeeeeeeeeeeeeeeeeeee:^(fffffffffffffff gggggggg) {
        f(SSSSS, c);
      }];
}
```

after:
```
void aaaaaaaaaaaaaaaaaaaaa(int c) {
  if (c) {
    f();
  }
  [dddddddddddddddddddddddddddddddddddddddddddddddddddddddd
    eeeeeeeeeeeeeeeeeeeeeeeeeeeee:^(fffffffffffffff gggggggg) {
      f(SSSSS, c);
    }];
}
```

Differential Revision: https://reviews.llvm.org/D99063

3 years ago[lldb] Fix TestVSCode.test_progress_events on Linux due to vdso
Raphael Isemann [Thu, 25 Mar 2021 09:44:20 +0000 (10:44 +0100)]
[lldb] Fix TestVSCode.test_progress_events on Linux due to vdso

This currently fails when we get the module for `[vdso]` which doesn't have
any parsing event associated with it as it's just created from memory.

3 years agoTrivial change to fix builds
Kiran Chandramohan [Thu, 25 Mar 2021 09:31:04 +0000 (09:31 +0000)]
Trivial change to fix builds

Pass the context while creating the Patternslist.

3 years ago[mlir] Support MemRefType with multiple AffineMaps in getStridesAndOffset
Vladislav Vinogradov [Tue, 23 Mar 2021 10:30:30 +0000 (13:30 +0300)]
[mlir] Support MemRefType with multiple AffineMaps in getStridesAndOffset

Compose multiple AffineMaps into single map before strides extraction.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D99166

3 years ago[mlir] Translate global initializers after creating all LLVM IR globals
Jean Perier [Thu, 25 Mar 2021 08:40:42 +0000 (09:40 +0100)]
[mlir] Translate global initializers after creating all LLVM IR globals

In case an operation in a global initializer region refers to another
global variable defined afterwards in the module or to itself, translation
to LLVM IR crashed because it could not find the LLVM IR global
when going through the initializer block.

To solve this problem, split global conversion to LLVM IR into two passes. A
first pass that creates LLVM IR global variables, and a second one that converts
the initializer, if any, and adds it to the llvm global.
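
The two-pass scheme can be sketched with simplified stand-in types. `SrcGlobal`, `OutGlobal` and `translateGlobals` are hypothetical names, and an initializer is reduced to a single reference to another global:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Source-level global: a name plus the name of the global its initializer
// refers to (empty if there is no such reference).
struct SrcGlobal {
  std::string Name;
  std::string InitRef;
};

// Translated global: its initializer points at another translated global.
struct OutGlobal {
  const OutGlobal *Init = nullptr;
};

// Returns false if an initializer refers to a global that does not exist.
bool translateGlobals(const std::vector<SrcGlobal> &In,
                      std::map<std::string, OutGlobal> &Out) {
  // Pass 1: create every global first, without initializers.
  for (const SrcGlobal &G : In)
    Out[G.Name];
  // Pass 2: resolve initializers; forward and self references now succeed.
  for (const SrcGlobal &G : In) {
    if (G.InitRef.empty())
      continue;
    auto It = Out.find(G.InitRef);
    if (It == Out.end())
      return false;
    Out[G.Name].Init = &It->second;
  }
  return true;
}
```

A single pass that resolved each initializer as soon as its global was created would fail on a reference to a global defined later in the module.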

Differential Revision: https://reviews.llvm.org/D99246

3 years agoRevert "[libcxxabi] Use cxx-headers target to consume libcxx headers"
Petr Hosek [Thu, 25 Mar 2021 08:50:11 +0000 (01:50 -0700)]
Revert "[libcxxabi] Use cxx-headers target to consume libcxx headers"

This reverts commit 72728e12806ae4f85c7ab79b92f2d1c20981d596
which broke libcxxabi tests under the runtimes build.

3 years ago[libcxx] [test] Quote env variables that are set with a shell "export" in ssh.py
Martin Storsjö [Thu, 4 Mar 2021 08:37:02 +0000 (10:37 +0200)]
[libcxx] [test] Quote env variables that are set with a shell "export" in ssh.py

This safeguards against cases where some of the env vars contain chars
that are problematic for shells, e.g. if called with --env "X=Y;Z".

(In cases of cross testing for windows, the PATH variable can end up
specified with semicolon separators - even if specifying a PATH when
cross testing in such differing environments might not make sense or
do anything - but this makes ssh.py not break on such a variable.)
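
ssh.py itself is a Python script; the following C++ sketch only models the quoting idea, with a hypothetical `shellQuote` helper that wraps a value in single quotes for a POSIX shell:

```cpp
#include <cassert>
#include <string>

// Wraps S in single quotes so a POSIX shell treats it as one literal word.
// An embedded single quote is emitted as '\'' (close, escaped quote, reopen).
std::string shellQuote(const std::string &S) {
  std::string Out = "'";
  for (char C : S) {
    if (C == '\'')
      Out += "'\\''";
    else
      Out += C;
  }
  Out += "'";
  return Out;
}
```

With this, `X=Y;Z` becomes `'X=Y;Z'`, so the semicolon is no longer parsed as a command separator in an `export` line.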

Differential Revision: https://reviews.llvm.org/D99242

3 years ago[LLD] Fix probing a MSYS based 'tar' in a Windows Container
Martin Storsjö [Wed, 24 Mar 2021 21:58:54 +0000 (23:58 +0200)]
[LLD] Fix probing a MSYS based 'tar' in a Windows Container

Don't run the 'tar' tool in a cleared environment with only the
LANG variable set, just set LANG on top of the existing environment.

If the 'tar' tool is an MSYS based tool, running it in a Windows
Container hangs if all environment variables are cleared - in
particular, the USERPROFILE variable needs to be kept intact.
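
A minimal sketch of the difference, assuming the probe is done from Python
with subprocess. The helper name probe_environment and the value "C" for LANG
are assumptions made for illustration.

```python
import os


def probe_environment(base=None):
    """Return a copy of the environment with only LANG overridden.

    Everything else (for example USERPROFILE) is kept, instead of starting
    from an empty environment that contains nothing but LANG.
    """
    env = dict(os.environ if base is None else base)
    env["LANG"] = "C"  # assumed value, chosen for stable tool output
    return env


# Cleared environment (the old approach):  env={"LANG": "C"}
# Layered environment (the new approach):  env=probe_environment()
```

The result could then be passed as the env argument of a call such as
subprocess.run(["tar", "--version"], env=probe_environment()).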

This is the same issue as was fixed in other places in
9de63b2e051cb3e79645cc20b83b4d33d132cba0, but contrary to running
the actual tests, running with an as-cleared-as-possible environment
here is less important.

Differential Revision: https://reviews.llvm.org/D99304

3 years ago[RISCV] Add more tests that can be improved by D99042.
Craig Topper [Thu, 25 Mar 2021 06:55:58 +0000 (23:55 -0700)]
[RISCV] Add more tests that can be improved by D99042.

3 years ago[lld] add context-sensitive PGO options for COFF.
Yolanda Chen [Thu, 25 Mar 2021 02:55:18 +0000 (19:55 -0700)]
[lld] add context-sensitive PGO options for COFF.

Add lld CSPGO (Context-Sensitive PGO) options for COFF target.

Reference the ELF options from https://reviews.llvm.org/D56675

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D98763

3 years ago[libcxx] updates regular_invocable test to actually test regular_invocable
Christopher Di Bella [Wed, 24 Mar 2021 23:13:02 +0000 (23:13 +0000)]
[libcxx] updates regular_invocable test to actually test regular_invocable

The test wasn't previously testing this concept, but its base.

Differential Revision: https://reviews.llvm.org/D99306

3 years ago[Driver] Add -fno-split-stack
Chuanqi Xu [Wed, 24 Mar 2021 08:50:39 +0000 (16:50 +0800)]
[Driver] Add -fno-split-stack

Summary: Add -fno-split-stack and rename CC1 option from `-split-stacks`
to `-fsplit-stack`.

Test Plan: check-all

Differential Revision: https://reviews.llvm.org/D99245

3 years ago[GlobalISel] Fix crash in RBS with a non-generic IMPLICIT_DEF.
Amara Emerson [Wed, 24 Mar 2021 18:28:40 +0000 (11:28 -0700)]
[GlobalISel] Fix crash in RBS with a non-generic IMPLICIT_DEF.

This may occur when swifterror codegen in the translator generates these,
but we shouldn't try to handle them since they should have regclasses anyway.

rdar://75784009

Differential Revision: https://reviews.llvm.org/D99287

3 years agoAdd missing cases in RISCVMCExpr::getVariantKindName
Serge Pavlov [Fri, 19 Mar 2021 07:50:30 +0000 (14:50 +0700)]
Add missing cases in RISCVMCExpr::getVariantKindName

Differential Revision: https://reviews.llvm.org/D98929

3 years ago[RISCV] Add some 32-bit ctlz and cttz idiom tests to rv64zbb.ll. NFC
Craig Topper [Thu, 25 Mar 2021 04:38:53 +0000 (21:38 -0700)]
[RISCV] Add some 32-bit ctlz and cttz idiom tests to rv64zbb.ll. NFC

This implements various idioms using ctlz/cttz like Log2, Log2_Ceil,
findFirstSetBit, etc.

Some of these demonstrate that we fail to use clzw because the
idiom breaks the isel patterns we use. The isel pattern we use
is (add (ctlz (and X, 0xffffffff)), -32). Some of the idioms
cause the constant on the add to be different.
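
The arithmetic behind a count-leading-zeros pattern of this shape can be
checked with a small Python model. The helper names are hypothetical and the
code only models integer semantics, not the isel patterns themselves.

```python
def ctlz(x, bits):
    """Count leading zeros of x viewed as an unsigned `bits`-wide integer."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def clz32_via_ctlz64(x):
    """32-bit leading-zero count written as 64-bit ctlz of the zero-extended value, minus 32."""
    return ctlz(x & 0xFFFFFFFF, 64) - 32


def log2_32_via_ctlz64(x):
    """Log2 idiom: 31 - clz32(x) folds to 63 - ctlz64(zext(x)).

    After folding, the constant combined with the ctlz result is 63 rather
    than -32, so a pattern keyed on "add ..., -32" no longer matches.
    """
    return 63 - ctlz(x & 0xFFFFFFFF, 64)
```

For any 32-bit x, clz32_via_ctlz64(x) agrees with ctlz(x, 32), including
x = 0, where both give 32.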

3 years agoDefine a `NoTerminator` trait that allows operations with a single block region...
Mehdi Amini [Thu, 11 Mar 2021 23:58:02 +0000 (23:58 +0000)]
Define a `NoTerminator` trait that allows operations with a single block region to not provide a terminator

In particular for Graph Regions, the need for a terminator is just a
historical artifact of the generalization of MLIR from CFG regions.
Operations like Module don't need a terminator, and before Module
migrated to be an operation with a region, none was needed.

To validate the feature, the ModuleOp is migrated to use this trait and
the ModuleTerminator operation is deleted.

This patch is likely to break clients, if you're in this case:

- you may iterate on a ModuleOp with `getBody()->without_terminator()`,
  the solution is simple: just remove the ->without_terminator!
- you created a builder with `Builder::atBlockTerminator(module_body)`,
  just use `Builder::atBlockEnd(module_body)` instead.
- you were handling ModuleTerminator: it isn't needed anymore.
- for generic code, a `Block::mayNotHaveTerminator()` may be used.

Differential Revision: https://reviews.llvm.org/D98468

3 years ago[RISCV] Remove duplicate DebugLoc variables from cases in ReplaceNodeResults. NFC
Craig Topper [Thu, 25 Mar 2021 03:21:29 +0000 (20:21 -0700)]
[RISCV] Remove duplicate DebugLoc variables from cases in ReplaceNodeResults. NFC

We already created a DebugLoc at the top of the function. We can
just use that one.

3 years ago[lldb/ObjC] Make the NonPointerIsaCache initialization lazy
Fred Riss [Fri, 13 Mar 2020 23:17:38 +0000 (16:17 -0700)]
[lldb/ObjC] Make the NonPointerIsaCache initialization lazy

The objc_debug_isa_class_mask magic value that the objc runtime vends
is now initialized using a static initializer instead of a constant
value. The runtime plugin itself will be initialized before the value
is computed and as a result, the cache will get the wrong value.

Making the creation of the NonPointerIsaCache fully lazy fixes this.

3 years ago[lldb] Format AppleObjCRuntimeV2 (NFC)
Jonas Devlieghere [Thu, 25 Mar 2021 01:59:21 +0000 (18:59 -0700)]
[lldb] Format AppleObjCRuntimeV2 (NFC)

3 years ago[Polly] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds
Fangrui Song [Thu, 25 Mar 2021 02:56:43 +0000 (19:56 -0700)]
[Polly] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds

3 years ago[dfsan] test flush on only x86
Jianzhou Zhao [Thu, 25 Mar 2021 02:45:10 +0000 (02:45 +0000)]
[dfsan] test flush on only x86

3 years ago[Driver] Use -dynamic-linker /lib/ld-musl-i386.so.1 for i?86-linux-musl
Fangrui Song [Thu, 25 Mar 2021 02:44:53 +0000 (19:44 -0700)]
[Driver] Use -dynamic-linker /lib/ld-musl-i386.so.1 for i?86-linux-musl

Noticed by Khem Raj

3 years ago[flang][fir] Add the pre-code gen rewrite pass and codegen ops.
Eric Schweitz [Fri, 26 Feb 2021 02:44:02 +0000 (18:44 -0800)]
[flang][fir] Add the pre-code gen rewrite pass and codegen ops.

Before the conversion to LLVM-IR dialect and ultimately LLVM IR, FIR is
partially rewritten into a codegen form.  This patch adds that pass, the
fircg dialect, and the small set of Ops in the fircg (sub) dialect.
Fircg is not part of the FIR dialect and should never be used outside of
the (closed) conversion to LLVM IR.

Authors: Eric Schweitz, Jean Perier, Rajan Walia, et al.

Differential Revision: https://reviews.llvm.org/D98063

3 years ago[RISCV] Fix mcount name
Nathan Chancellor [Tue, 23 Mar 2021 22:16:50 +0000 (15:16 -0700)]
[RISCV] Fix mcount name

GCC's name for this symbol is _mcount, which the Linux kernel expects in
a few different places:

  $ echo 'int main(void) { return 0; }' | riscv32-linux-gcc -c -pg -o tmp.o -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                          0000000c:  R_RISCV_CALL _mcount

  $ echo 'int main(void) { return 0; }' | riscv64-linux-gcc -c -pg -o tmp.o -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                  000000000000000c:  R_RISCV_CALL _mcount

  $ echo 'int main(void) { return 0; }' | clang -c -pg -o tmp.o --target=riscv32-linux-gnu -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                          0000000a:  R_RISCV_CALL_PLT     mcount

  $ echo 'int main(void) { return 0; }' | clang -c -pg -o tmp.o --target=riscv64-linux-gnu -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                  000000000000000a:  R_RISCV_CALL_PLT     mcount

Set MCountName to "_mcount" in RISCVTargetInfo then prevent it from
getting overridden in certain OSTargetInfo constructors.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98881

Signed-off-by: Nathan Chancellor <nathan@kernel.org>

3 years ago[llvm-cov] Check path emptiness in path-equivalence after removing dots.
Zequan Wu [Thu, 25 Mar 2021 00:54:26 +0000 (17:54 -0700)]
[llvm-cov] Check path emptiness in path-equivalence after removing dots.

3 years agoPlumb TLI through isSafeToExecuteUnconditionally [NFC]
Philip Reames [Thu, 25 Mar 2021 00:51:06 +0000 (17:51 -0700)]
Plumb TLI through isSafeToExecuteUnconditionally [NFC]

Split from D95815 to reduce patch size.  Isn't (yet) used for anything, only the client side is wired up.

3 years ago[Utils][NFC] Fix regex substitution for update test checks
Giorgis Georgakoudis [Sun, 21 Mar 2021 20:08:44 +0000 (13:08 -0700)]
[Utils][NFC] Fix regex substitution for update test checks

Relates to: https://reviews.llvm.org/D97107

3 years ago[mlir][tosa] Add tosa.bitwise_not lowering to constant and xor
Rob Suderman [Tue, 23 Mar 2021 22:31:07 +0000 (15:31 -0700)]
[mlir][tosa] Add tosa.bitwise_not lowering to constant and xor

Lower bitwise_not to the linalg dialect using an xor operation with an
all-bits-one constant.
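
The identity behind this lowering is that, at a fixed bit width, NOT x equals
x XOR all-ones. A small Python check, with a hypothetical helper name:

```python
def bitwise_not_via_xor(x, bits):
    """Compute the `bits`-wide bitwise NOT of x as x XOR all-ones."""
    all_ones = (1 << bits) - 1
    return (x & all_ones) ^ all_ones


# 8-bit example: the low nibble is cleared and the high nibble is set.
example = bitwise_not_via_xor(0b00001111, 8)  # 0b11110000
```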

Differential Revision: https://reviews.llvm.org/D99221

3 years ago[dfsan] Test dfsan_flush with origins
Jianzhou Zhao [Wed, 24 Mar 2021 19:08:15 +0000 (19:08 +0000)]
[dfsan] Test dfsan_flush with origins

This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D99295

3 years ago[deref] Implement initial set of inference rules for deref-at-point
Philip Reames [Wed, 24 Mar 2021 23:18:09 +0000 (16:18 -0700)]
[deref] Implement initial set of inference rules for deref-at-point

This implements a subset of the initial set of inference rules proposed in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree". The noalias one got moved to a separate review as there were some concerns raised that require further discussion.

Differential Revision: https://reviews.llvm.org/D99135

3 years agoRevert "[HWASan] Use page aliasing on x86_64."
Matt Morehouse [Wed, 24 Mar 2021 23:17:38 +0000 (16:17 -0700)]
Revert "[HWASan] Use page aliasing on x86_64."

This reverts commit 63f73c3eb9716256ab8dbb868e16d08a88636cba due to
breakage on aarch64 without TBI.

3 years ago[InlineCost] Make cost-benefit decision explicit
Wenlei He [Wed, 24 Mar 2021 21:33:45 +0000 (14:33 -0700)]
[InlineCost] Make cost-benefit decision explicit

With cost-benefit analysis for inlining, we bypass the cost-threshold by returning the inline result from the call analyzer early.

However, the cost and threshold are still available from the call analyzer, and when the cost is actually higher than the threshold, we incorrectly set the reason.

The change makes the decision from cost-benefit analysis explicit. It's mostly NFC, except that it allows the priority-based sample loader inliner used by CSSPGO to use the cost-benefit heuristic.

Differential Revision: https://reviews.llvm.org/D99302

3 years ago[Clang][Sema] Implement GCC -Wcast-function-type
Yuanfang Chen [Wed, 24 Mar 2021 23:03:13 +0000 (16:03 -0700)]
[Clang][Sema] Implement GCC -Wcast-function-type

```
Warn when a function pointer is cast to an incompatible function
pointer. In a cast involving function types with a variable argument
list only the types of initial arguments that are provided are
considered. Any parameter of pointer-type matches any other
pointer-type. Any benign differences in integral types are ignored, like
int vs. long on ILP32 targets. Likewise type qualifiers are ignored. The
function type void (*) (void) is special and matches everything, which
can be used to suppress this warning. In a cast involving pointer to
member types this warning warns whenever the type cast is changing the
pointer to member type. This warning is enabled by -Wextra.
```

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D97831

3 years ago[InlineCost] Enable the cost benefit analysis on FDO
Kazu Hirata [Wed, 24 Mar 2021 22:36:48 +0000 (15:36 -0700)]
[InlineCost] Enable the cost benefit analysis on FDO

This patch enables the cost-benefit-analysis-based inliner by default
if we have an instrumentation profile.

- SPEC CPU 2017 shows a 0.4% improvement.

- An internal large benchmark shows a 0.9% reduction in the cycle
  count along with 14.6% reduction in the number of call instructions
  executed.

Differential Revision: https://reviews.llvm.org/D98213

3 years ago[libc++] Match declaration for non-member function std::swap(std::packaged_task)...
jasonliu [Wed, 24 Mar 2021 22:31:58 +0000 (22:31 +0000)]
[libc++] Match declaration for non-member function std::swap(std::packaged_task) with what the standard specifies

The standard specifies:
```
template<class R, class... ArgTypes>
  void swap(packaged_task<R(ArgTypes...)>& x, packaged_task<R(ArgTypes...)>& y) noexcept;
```

Differential Revision: https://reviews.llvm.org/D99102

3 years ago[Driver] Bring back "Clean up Debian multiarch /usr/include/<triplet> madness"
Fangrui Song [Wed, 24 Mar 2021 22:25:36 +0000 (15:25 -0700)]
[Driver] Bring back "Clean up Debian multiarch /usr/include/<triplet> madness"

This reverts commit aae84b8e3939e815bbc1e64b3b30c0f10b055be4.

The chromium goma folks want to use a Debian sysroot without
lib/x86_64-linux-gnu to perform `clang -c` but no link action. The previous
commit removed the D.getVFS().exists check to make such usage work.

3 years ago[Driver] Linux.cpp: delete unneeded D.getVFS().exists checks
Fangrui Song [Wed, 24 Mar 2021 22:19:01 +0000 (15:19 -0700)]
[Driver] Linux.cpp: delete unneeded D.getVFS().exists checks

Not only can this save unneeded filesystem stats, it can make `clang
--sysroot=/path/to/debian-sysroot -c a.cc` work (get `-internal-isystem
$sysroot/usr/include/x86_64-linux-gnu`) even without `lib/x86_64-linux-gnu/`.
This should make thakis happy.

3 years ago[mlir][linalg] Fold fill -> tensor_reshape chain
Lei Zhang [Wed, 24 Mar 2021 21:52:14 +0000 (17:52 -0400)]
[mlir][linalg] Fold fill -> tensor_reshape chain

For such op chains, we can create new linalg.fill ops
with the result type of the linalg.tensor_reshape op.

Differential Revision: https://reviews.llvm.org/D99116

3 years ago[mlir][linalg] Support dropping unit dimensions for init tensors
Lei Zhang [Wed, 24 Mar 2021 21:51:44 +0000 (17:51 -0400)]
[mlir][linalg] Support dropping unit dimensions for init tensors

Init tensor operands also have indexing maps and generally follow
the same constraints we expect for non-init-tensor operands.

Differential Revision: https://reviews.llvm.org/D99115

3 years ago[mlir][linalg] Allow controlling folding unit dim reshapes
Lei Zhang [Wed, 24 Mar 2021 21:51:14 +0000 (17:51 -0400)]
[mlir][linalg] Allow controlling folding unit dim reshapes

This commit exposes an option to the pattern
FoldWithProducerReshapeOpByExpansion to allow
folding unit dim reshapes. This gives callers
more fine-grained control.

Differential Revision: https://reviews.llvm.org/D99114

3 years ago[mlir][affine] Add canonicalization to merge affine min/max ops
Lei Zhang [Wed, 24 Mar 2021 21:50:39 +0000 (17:50 -0400)]
[mlir][affine] Add canonicalization to merge affine min/max ops

This identifies a pattern where the producer affine min/max op
is bound to a dimension/symbol that is used as a standalone
expression in the consumer affine op's map. In that case the
producer affine min/max op can be merged into its consumer.

For example, a pattern like the following:

```
  %0 = affine.min affine_map<()[s0] -> (s0 + 16, s0 * 8)> ()[%sym1]
  %1 = affine.min affine_map<(d0)[s0] -> (s0 + 4, d0)> (%0)[%sym2]
```

Can be turned into:

```
  %1 = affine.min affine_map<
         ()[s0, s1] -> (s0 + 4, s1 + 16, s1 * 8)> ()[%sym2, %sym1]
```
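
The rewrite relies on min being associative: the minimum of a value and an
inner min equals the flat minimum over all expressions. A quick Python check
of the example above, with plain integers standing in for %sym1 and %sym2 and
hypothetical function names:

```python
def before_merge(sym1, sym2):
    """Two chained affine.min ops, as in the first snippet."""
    v0 = min(sym1 + 16, sym1 * 8)
    return min(sym2 + 4, v0)


def after_merge(sym1, sym2):
    """The single merged affine.min op, as in the second snippet."""
    return min(sym2 + 4, sym1 + 16, sym1 * 8)
```

Both functions return the same value for every pair of inputs.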

Differential Revision: https://reviews.llvm.org/D99016

3 years ago[mlir][affine] Deduplicate affine min/max op expressions
Lei Zhang [Wed, 24 Mar 2021 21:50:08 +0000 (17:50 -0400)]
[mlir][affine] Deduplicate affine min/max op expressions

If there are multiple identical expressions in an affine
min/max op's map, we can just keep one.
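
A small sketch of the idea, treating the map results as a list of expression
strings and treating textual equality as "identical". Both are simplifications
for illustration, and dedup_exprs is a hypothetical name.

```python
def dedup_exprs(exprs):
    """Drop repeated expressions while keeping first-occurrence order."""
    return list(dict.fromkeys(exprs))


# Results of a hypothetical affine.min map with a repeated expression.
results = ["s0 + 4", "d0", "s0 + 4", "d0 * 2"]
unique = dedup_exprs(results)  # ["s0 + 4", "d0", "d0 * 2"]
```

Dropping the duplicates does not change the value of the op, since
min(a, b, a) == min(a, b), and likewise for max.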

Differential Revision: https://reviews.llvm.org/D99015

3 years ago[mlir][linalg] Fuse producers with non-permutation indexing maps
Lei Zhang [Wed, 24 Mar 2021 21:49:58 +0000 (17:49 -0400)]
[mlir][linalg] Fuse producers with non-permutation indexing maps

Until now, Linalg fusion only allowed fusing producers whose operands
all have permutation indexing maps. That makes it easier to deduce the
subtensor/subview, but it is an unnecessary constraint: in tiling
we have more advanced logic to deduce the subranges even when the
operand does not have a permutation indexing map, e.g., the input operand
for convolution ops.

This patch uses the logic on tiling side to deduce subranges for
fusion. This enables fusing convolution with its consumer ops
when possible.

Along the way, we are now generating proper affine.min ops to guard
against size boundaries, if we cannot be certain they won't be
out of bounds.
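
The role of such a min guard can be shown with a one-dimensional tiling
sketch: each tile's size is clamped so the last tile does not run past the end
of the dimension. The function name and the (offset, size) representation are
illustrative only, not the Linalg implementation.

```python
def tile_ranges(dim_size, tile_size):
    """Return (offset, size) pairs; size is min(tile_size, remaining extent)."""
    return [
        (offset, min(tile_size, dim_size - offset))
        for offset in range(0, dim_size, tile_size)
    ]


# A dimension of 10 tiled by 4: the last tile is clamped to 2 elements.
ranges = tile_ranges(10, 4)  # [(0, 4), (4, 4), (8, 2)]
```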

Differential Revision: https://reviews.llvm.org/D99014

3 years ago[mlir][linalg] NFC: Move makeTiledShapes into Utils.{h|cpp}
Lei Zhang [Wed, 24 Mar 2021 21:49:31 +0000 (17:49 -0400)]
[mlir][linalg] NFC: Move makeTiledShapes into Utils.{h|cpp}

This is a preparation step to reuse makeTiledShapes in tensor
fusion. Along the way, did some lightweight cleanups.

Differential Revision: https://reviews.llvm.org/D99013

3 years ago[libc++][AIX] Initial patch to unblock the libc++ build on AIX
jasonliu [Wed, 24 Mar 2021 17:26:52 +0000 (17:26 +0000)]
[libc++][AIX] Initial patch to unblock the libc++ build on AIX

This patch unblocks the build of the libc++ library on AIX:
1. Add _AIX guard for _LIBCPP_HAS_THREAD_API_PTHREAD
2. Use uselocale to actually take the locale setting
   into account.
3. Modify extract_mtime and extract_atime for AIX. The stat
   structure on AIX uses an internal structure, st_timespec, to store
   time for binary compatibility reasons, so we need to convert it
   back to timespec here.
4. Do not build cxa_thread_atexit.cpp for libcxxabi on AIX.

Differential Revision: https://reviews.llvm.org/D97558

3 years ago[ValueTracking] peek through min/max to find isKnownToBeAPowerOfTwo
Sanjay Patel [Wed, 24 Mar 2021 21:51:29 +0000 (17:51 -0400)]
[ValueTracking] peek through min/max to find isKnownToBeAPowerOfTwo

This is similar to the select logic just ahead of the new code.
Min/max choose exactly one value from the inputs, so if both of
those are a power-of-2, then the result must be a power-of-2.
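
The reasoning can be spot-checked in Python. The power-of-two test below is
the usual x & (x - 1) trick and only models the property, not the
ValueTracking code.

```python
def is_power_of_two(x):
    """True if x is a positive integer with exactly one bit set."""
    return x > 0 and (x & (x - 1)) == 0


def min_max_stay_powers_of_two(a, b):
    """min/max return one of their inputs, so the property carries over."""
    return is_power_of_two(min(a, b)) and is_power_of_two(max(a, b))
```

If only one input is a power of two, nothing can be concluded: min(4, 3) is 3.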

This might help with D98152, but we likely still need other
pieces of the puzzle to avoid regressions.

The change in PatternMatch.h is needed to build with clang.
It's possible there is a better way to deal with the 'const'
incompatibilities.

Differential Revision: https://reviews.llvm.org/D99276