platform/upstream/llvm.git
3 years agofix typos to cycle bots
Nico Weber [Wed, 2 Dec 2020 01:27:33 +0000 (20:27 -0500)]
fix typos to cycle bots

3 years ago[mlir][PDL] Add append specialization for ByteCode OpCode to fix GCC5 build
River Riddle [Wed, 2 Dec 2020 01:08:38 +0000 (17:08 -0800)]
[mlir][PDL] Add append specialization for ByteCode OpCode to fix GCC5 build

3 years ago[msan] Replace 8 by kShadowTLSAlignment
Jianzhou Zhao [Mon, 23 Nov 2020 05:53:03 +0000 (05:53 +0000)]
[msan] Replace 8 by kShadowTLSAlignment

Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D92275

3 years ago[lld] Use -1 as tombstone value for discarded code ranges
Eric Leese [Wed, 2 Dec 2020 00:01:33 +0000 (16:01 -0800)]
[lld] Use -1 as tombstone value for discarded code ranges

Under existing behavior discarded functions are relocated to have the start pc
0. This causes problems when debugging as they typically overlap the first
function and lldb symbol resolution frequently chooses a discarded function
instead of the correct one. Using the value -1 or -2 (depending on which DWARF
section we are writing) is sufficient to prevent lldb from resolving to these
symbols.
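
The overlap problem can be illustrated with a toy address lookup. This is only a sketch: `FuncRange` and `resolve` are hypothetical names, and the "last match wins" policy is an assumption made for illustration, not lldb's actual resolution logic.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a function's address range.
struct FuncRange {
  std::string name;
  uint64_t lowPc; // start address
  uint64_t size;  // length in bytes
};

// Returns the name of the last listed range that contains pc, or an empty
// string if no range contains it.
std::string resolve(const std::vector<FuncRange> &funcs, uint64_t pc) {
  std::string found;
  for (const FuncRange &f : funcs)
    if (pc >= f.lowPc && pc - f.lowPc < f.size)
      found = f.name;
  return found;
}
```

With a discarded function relocated to start pc 0, a lookup near the start of the code can land on it instead of the real first function. With an all-ones start pc (the unsigned form of -1), the `pc >= f.lowPc` check fails for every realistic address, so the discarded entry is never chosen.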

Reviewed By: MaskRay, yurydelendik, sbc100

Differential Revision: https://reviews.llvm.org/D91803

3 years agoRecommit "[clang][Fuchsia] Add relative-vtables multilib"
Leonard Chan [Wed, 2 Dec 2020 00:59:04 +0000 (16:59 -0800)]
Recommit "[clang][Fuchsia] Add relative-vtables multilib"

This recommits fdbd84c6c819d4462546961f6086c1524d5d5ae8 whose initial
build issues were fixed in 19bdc8e5a307f6eb209d4f91620d70bd2f80219e.

3 years agoFix typo in testcase runline that got there because I have very bad hands
Jessica Paquette [Wed, 2 Dec 2020 00:55:25 +0000 (16:55 -0800)]
Fix typo in testcase runline that got there because I have very bad hands

llvm/test/CodeGen/AArch64/GlobalISel/speculative-hardening-brcond.mir had a
slash in its runline.

3 years ago[NFC] Disable new test from D92428 on PPC TSAN
Vitaly Buka [Wed, 2 Dec 2020 00:53:18 +0000 (16:53 -0800)]
[NFC] Disable new test from D92428 on PPC TSAN

3 years ago[WebAssembly] Rename --lto-no-new-pass-manager to --no-lto-new-pass-manager
Fangrui Song [Wed, 2 Dec 2020 00:52:37 +0000 (16:52 -0800)]
[WebAssembly] Rename --lto-no-new-pass-manager to --no-lto-new-pass-manager

In addition, disallow `-lto-new-pass-manager` (see D79371).

Note: the ELF port has also adopted --no-lto-new-pass-manager

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D92422

3 years ago[AArch64][GlobalISel] Don't write to WZR in non-flag-setting G_BRCOND case
Jessica Paquette [Mon, 2 Nov 2020 17:47:51 +0000 (09:47 -0800)]
[AArch64][GlobalISel] Don't write to WZR in non-flag-setting G_BRCOND case

We are avoiding writing to WZR just about everywhere else.

Also update the code to use MachineIRBuilder for the sake of consistency.

We also didn't have a GlobalISel testcase for this path, so add a simple one
now.

Differential Revision: https://reviews.llvm.org/D90626

3 years agoRemove CXXBasePaths::found_decls and simplify and modernize its only
Richard Smith [Wed, 2 Dec 2020 00:28:46 +0000 (16:28 -0800)]
Remove CXXBasePaths::found_decls and simplify and modernize its only
caller.

This function did not satisfy its documented contract: it only
considered the first lookup result on each base path, not all lookup
results. It also performed unnecessary memory allocations.

This change results in a minor change to our representation: we now
include overridden methods that are found by any derived-to-base path
(not involving another override) in the list of overridden methods for a
function, rather than filtering out functions from bases that are both
direct virtual bases and indirect virtual bases for which the indirect
virtual base path contains another override for the function. (That
filtering rule is part of the class-scope name lookup rules, and doesn't
really have much to do with enumerating overridden methods.) The users
of the list of overridden methods do not appear to rely on this
filtering having happened, and it's simpler to not do it.

3 years agogithub actions: Update branch_sync to push to main
Tom Stellard [Wed, 2 Dec 2020 00:22:30 +0000 (16:22 -0800)]
github actions: Update branch_sync to push to main

3 years ago[sanitizer] Make DTLS_on_tls_get_addr signal safer
Vitaly Buka [Tue, 1 Dec 2020 21:14:18 +0000 (13:14 -0800)]
[sanitizer] Make DTLS_on_tls_get_addr signal safer

Avoid relocating DTV table and use linked list of mmap-ed pages.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D92428

3 years ago[NFC] Extract ForEachDVT
Vitaly Buka [Tue, 1 Dec 2020 05:04:04 +0000 (21:04 -0800)]
[NFC] Extract ForEachDVT

3 years ago[RISCVAsmParser] Allow a SymbolRef operand to be a complex expression
Fangrui Song [Wed, 2 Dec 2020 00:08:09 +0000 (16:08 -0800)]
[RISCVAsmParser] Allow a SymbolRef operand to be a complex expression

So that instructions like `lla a5, (0xFF + end) - 4` (supported by GNU as) can
be parsed.

Add a missing test that an operand like `foo + foo` is not allowed.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D92293

3 years ago[lld/mac] Fix issues around thin archives
Nico Weber [Fri, 20 Nov 2020 15:14:57 +0000 (10:14 -0500)]
[lld/mac] Fix issues around thin archives

- most importantly, fix a use-after-free when using thin archives,
  by putting the archive unique_ptr to the arena allocator. This
  ports D65565 to MachO

- correctly demangle symbol names from archives in diagnostics

- add a test for thin archives -- it finds this UaF, but only when
  running it under asan (it also finds the demangling fix)

- make forceLoadArchive() use addFile() with a bool to have the archive
  loading code in fewer places. no behavior change; matches COFF port a
  bit better

Differential Revision: https://reviews.llvm.org/D92360

3 years ago[llvm] Fix for failing test from fdbd84c6c819d4462546961f6086c1524d5d5ae8
Leonard Chan [Tue, 1 Dec 2020 23:45:07 +0000 (15:45 -0800)]
[llvm] Fix for failing test from fdbd84c6c819d4462546961f6086c1524d5d5ae8

When handling a DSOLocalEquivalent operand change:

- Remove assertion checking that the `To` type and current type are the
  same type. This is not always a requirement.
- Add a missing bitcast from an old DSOLocalEquivalent to the type of
  the new one.

3 years ago[AArch64][GlobalISel] Select Bcc when it's better than TB(N)Z
Jessica Paquette [Tue, 1 Dec 2020 01:21:21 +0000 (17:21 -0800)]
[AArch64][GlobalISel] Select Bcc when it's better than TB(N)Z

Instead of falling back to selecting TB(N)Z when we fail to select an
optimized compare against 0, select Bcc instead.

Also simplify selectCompareBranch a little while we're here, because the logic
was kind of hard to follow.

At -O0, this is a 0.1% geomean code size improvement for CTMark.

A simple example of where this can kick in is here:
https://godbolt.org/z/4rra6P

In the example above, GlobalISel currently produces a subs, cset, and tbnz.
SelectionDAG, on the other hand, just emits a compare and b.le.

Differential Revision: https://reviews.llvm.org/D92358

3 years ago[NFC][AMDGPU] AMDGPU code object V4 ABI documentation
Tony [Tue, 1 Dec 2020 04:12:04 +0000 (04:12 +0000)]
[NFC][AMDGPU] AMDGPU code object V4 ABI documentation

- Documentation for AMDGPU code object V4.
- Documentation clarification for code object V2 and V3.
- Documentation for the clang-offload-bundler.
- Numerous other documentation clarifications.

Change-Id: I338b327cc9e75da6c987b7e081b496402a5a020e

Differential Revision: https://reviews.llvm.org/D92434

3 years ago[gn build] Port 3fcb0eeb152
LLVM GN Syncbot [Tue, 1 Dec 2020 23:11:06 +0000 (23:11 +0000)]
[gn build] Port 3fcb0eeb152

3 years ago[gn build] Format all gn files
Arthur Eubanks [Tue, 1 Dec 2020 23:07:16 +0000 (15:07 -0800)]
[gn build] Format all gn files

$ git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format

3 years ago[gn build] Manually port 8fee2ee9
Arthur Eubanks [Tue, 1 Dec 2020 23:02:50 +0000 (15:02 -0800)]
[gn build] Manually port 8fee2ee9

3 years ago[ms] [llvm-ml] Support command-line defines
Eric Astor [Tue, 1 Dec 2020 22:48:49 +0000 (17:48 -0500)]
[ms] [llvm-ml] Support command-line defines

Enable command-line defines as textmacros

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90059

3 years ago[mlir][PDL] Add support for PDL bytecode and expose PDL support to OwningRewritePatte...
River Riddle [Tue, 1 Dec 2020 22:30:18 +0000 (14:30 -0800)]
[mlir][PDL] Add support for PDL bytecode and expose PDL support to OwningRewritePatternList

PDL patterns are now supported via a new `PDLPatternModule` class. This class contains a ModuleOp with the pdl::PatternOp operations representing the patterns, as well as a collection of registered C++ functions for native constraints/creations/rewrites/etc. that may be invoked via the pdl patterns. Instances of this class are added to an OwningRewritePatternList in the same fashion as C++ RewritePatterns, i.e. via the `insert` method.

The PDL bytecode is an in-memory representation of the PDL interpreter dialect that can be efficiently interpreted/executed. The representation of the bytecode boils down to a code array(for opcodes/memory locations/etc) and a memory buffer(for storing attributes/operations/values/any other data necessary). The bytecode operations are effectively a 1-1 mapping to the PDLInterp dialect operations, with a few exceptions in cases where the in-memory representation of the bytecode can be more efficient than the MLIR representation. For example, a generic `AreEqual` bytecode op can be used to represent AreEqualOp, CheckAttributeOp, and CheckTypeOp.
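
As a rough illustration of the "code array plus memory buffer" idea, here is a toy interpreter. It is a sketch under heavy assumptions: the opcodes, operand layout, and the `run` function are invented for this example and do not reflect the actual PDL bytecode encoding. It does no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using ByteCodeField = uint16_t;

// Hypothetical opcodes for the toy interpreter.
enum OpCode : ByteCodeField {
  kAreEqual, // operands: lhs index, rhs index, then-target, else-target
  kSucceed,  // no operands
  kFail      // no operands
};

// Executes `code` against `memory`. Operands of kAreEqual are indices into
// `memory`; jump targets are indices into `code`. Returns true if execution
// reaches kSucceed.
bool run(const std::vector<ByteCodeField> &code,
         const std::vector<const void *> &memory) {
  std::size_t pc = 0;
  while (pc < code.size()) {
    switch (code[pc]) {
    case kAreEqual: {
      // One generic comparison on opaque pointers; the caller decides
      // whether the slots hold attributes, types, or something else.
      const void *lhs = memory[code[pc + 1]];
      const void *rhs = memory[code[pc + 2]];
      pc = (lhs == rhs) ? code[pc + 3] : code[pc + 4];
      break;
    }
    case kSucceed:
      return true;
    default: // kFail or an unknown opcode
      return false;
    }
  }
  return false;
}
```

Because the comparison works on opaque memory slots, a single opcode can stand in for several distinct checks, which is the kind of saving the message describes for `AreEqual`.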

The execution of the bytecode is split into two phases: matching and rewriting. When matching, all of the matched patterns are collected to avoid the overhead of re-running parts of the matcher. These matched patterns are then considered alongside the native C++ patterns, which rewrite immediately in-place via `RewritePattern::matchAndRewrite`,  for the given root operation. When a PDL pattern is matched and has the highest benefit, it is passed back to the bytecode to execute its rewriter.

Differential Revision: https://reviews.llvm.org/D89107

3 years ago[lld-macho] Add isCodeSection()
Jez Ng [Tue, 1 Dec 2020 22:45:13 +0000 (14:45 -0800)]
[lld-macho] Add isCodeSection()

This is the same logic that ld64 uses to determine which sections
contain functions. This was added so that we could determine which
STABS entries should be N_FUN.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92430

3 years ago[lld-macho] Flesh out STABS implementation
Jez Ng [Tue, 1 Dec 2020 22:45:12 +0000 (14:45 -0800)]
[lld-macho] Flesh out STABS implementation

This addresses a lot of the comments in {D89257}. Ideally it'd have been
done in the same diff, but the commits in between make that difficult.

This diff:
* Implements N_GSYM and N_STSYM, the STABS for global and static symbols
* Has the STABS reflect the section IDs of their referent symbols
* Ensures we don't fail when encountering absolute symbols or files with
  no debug info
* Sorts STABS symbols by file to minimize the number of N_OSO entries

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92366

3 years ago[lld-macho] Add archive name and file modtime to STABS output
Jez Ng [Tue, 1 Dec 2020 22:45:11 +0000 (14:45 -0800)]
[lld-macho] Add archive name and file modtime to STABS output

We should also set the modtime when running LTO. That will be done in a
future diff, together with support for the `-object_path_lto` flag.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D91318

3 years ago[lld-macho] Emit empty string as first entry of string table
Jez Ng [Tue, 1 Dec 2020 22:45:10 +0000 (14:45 -0800)]
[lld-macho] Emit empty string as first entry of string table

ld64 emits string tables which start with a space and a zero byte. We
match its behavior here since some tools depend on it.

Similar rationale as {D89561}.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D89639

3 years ago[lld-macho] Emit local symbols in symtab; record metadata in LC_DYSYMTAB
Jez Ng [Tue, 1 Dec 2020 22:45:09 +0000 (14:45 -0800)]
[lld-macho] Emit local symbols in symtab; record metadata in LC_DYSYMTAB

Symbols of the same type must be laid out contiguously: following ld64's
lead, we choose to emit all local symbols first, then external symbols,
and finally undefined symbols. For each symbol type, the LC_DYSYMTAB
load command will record the range (start index and total number) of
those symbols in the symbol table.
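
A minimal sketch of that layout, assuming a simplified symbol model. `Sym`, `SymKind`, `Range`, and `layoutSymbols` are hypothetical names for illustration and are not lld's types. The ranges correspond in spirit to the `ilocalsym`/`nlocalsym`, `iextdefsym`/`nextdefsym`, and `iundefsym`/`nundefsym` fields of the load command.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical symbol classification, ordered the way the groups are emitted.
enum class SymKind { Local, External, Undefined };

struct Sym {
  std::string name;
  SymKind kind;
};

// Start index and number of symbols of one group.
struct Range {
  uint32_t start = 0;
  uint32_t count = 0;
};

struct DysymtabRanges {
  Range locals, externals, undefineds;
};

// Reorders `syms` so that each kind is contiguous (locals, then externals,
// then undefined symbols) and returns the index range of each group.
DysymtabRanges layoutSymbols(std::vector<Sym> &syms) {
  // stable_sort keeps the relative order of symbols within a group.
  std::stable_sort(syms.begin(), syms.end(), [](const Sym &a, const Sym &b) {
    return static_cast<int>(a.kind) < static_cast<int>(b.kind);
  });

  DysymtabRanges r;
  for (const Sym &s : syms) {
    switch (s.kind) {
    case SymKind::Local:
      ++r.locals.count;
      break;
    case SymKind::External:
      ++r.externals.count;
      break;
    case SymKind::Undefined:
      ++r.undefineds.count;
      break;
    }
  }
  r.locals.start = 0;
  r.externals.start = r.locals.count;
  r.undefineds.start = r.locals.count + r.externals.count;
  return r;
}
```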

This work was motivated by the fact that LLDB won't search for debug
info if LC_DYSYMTAB says there are no local symbols (since STABS symbols
are all local symbols). With this change, LLDB is now able to display
the source lines at a given breakpoint when debugging our binaries.

Some tests had to be updated due to local symbol names now appearing in
`llvm-objdump`'s output.

Reviewed By: #lld-macho, smeenai, clayborg

Differential Revision: https://reviews.llvm.org/D89285

3 years ago[lld-macho] Emit STABS symbols for debugging, and drop debug sections
Jez Ng [Tue, 1 Dec 2020 22:45:01 +0000 (14:45 -0800)]
[lld-macho] Emit STABS symbols for debugging, and drop debug sections

Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.

With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.

Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one for the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.

Additionally, this current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sections as a proxy,
but that is only accurate if `.subsections_with_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:

1. We can split up subsections by symbol even if `.subsections_with_symbols`
   is not set, but include constraints to ensure those subsections retain
   their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
   and I'm more inclined toward it, but I'm not sure if there are use cases
   that it doesn't handle well. As such I'm punting on the decision for now.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D89257

3 years ago[gn build] (manually) port 8fee2ee9a68
Nico Weber [Tue, 1 Dec 2020 23:02:18 +0000 (18:02 -0500)]
[gn build] (manually) port 8fee2ee9a68

3 years ago[clang-format] Add new option PenaltyIndentedWhitespace
Mark Nauwelaerts [Sat, 31 Oct 2020 13:15:38 +0000 (14:15 +0100)]
[clang-format] Add new option PenaltyIndentedWhitespace

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D90534

3 years ago[lld][WebAssembly] Feedback from D92038. NFC
Sam Clegg [Tue, 1 Dec 2020 22:04:21 +0000 (14:04 -0800)]
[lld][WebAssembly] Feedback from D92038. NFC

Differential Revision: https://reviews.llvm.org/D92429

3 years ago[libc++abi] Don't try calling __libcpp_aligned_free when aligned allocation is disabled
Louis Dionne [Tue, 1 Dec 2020 22:43:33 +0000 (17:43 -0500)]
[libc++abi] Don't try calling __libcpp_aligned_free when aligned allocation is disabled

See https://reviews.llvm.org/rGa78aaa1ad512#962077 for details.

3 years ago[ms] [llvm-ml] Introduce command-line compatibility for ml.exe and ml64.exe
Eric Astor [Mon, 30 Nov 2020 20:15:18 +0000 (15:15 -0500)]
[ms] [llvm-ml] Introduce command-line compatibility for ml.exe and ml64.exe

Switch to OptParser for command-line handling

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90058

3 years agoAvoid redundant inline with LLVM_ATTRIBUTE_ALWAYS_INLINE
James Park [Tue, 1 Dec 2020 22:28:46 +0000 (14:28 -0800)]
Avoid redundant inline with LLVM_ATTRIBUTE_ALWAYS_INLINE

Fix MSVC warning when __forceinline is paired with inline.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D85264

3 years ago[lld-macho] Extend PIE option handling
Jez Ng [Tue, 1 Dec 2020 05:07:16 +0000 (21:07 -0800)]
[lld-macho] Extend PIE option handling

* Enable PIE by default if targeting 10.6 or above on x86-64. (The
  manpage says 10.7, but that actually applies only to i386, and in
  general varies based on the target platform. I didn't update the
  manpage because listing all the different behaviors would make for a
  pretty long description.)
* Add support for `-no_pie`
* Remove `HelpHidden` from `-pie`

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92362

3 years agoRevert "[FastISel] Flush local value map on ever instruction" and dependent patches
David Blaikie [Tue, 1 Dec 2020 21:23:30 +0000 (13:23 -0800)]
Revert "[FastISel] Flush local value map on ever instruction" and dependent patches

This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1.

This change caused several regressions in the gdb test suite - at least
a sample of which was due to line zero instructions making breakpoints
un-lined. I think they're worth investigating/understanding more (&
possibly addressing) before moving forward with this change.

Revert "[FastISel] NFC: Clean up unnecessary bookkeeping"
This reverts commit 3fd39d3694d32efa44242c099e923a7f4d982095.

Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option"
This reverts commit a474657e30edccd9e175d92bddeefcfa544751b2.

Revert "Remove static function unused after cf1c774."
This reverts commit dc35368ccf17a7dca0874ace7490cc3836fb063f.

Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction""
This reverts commit 53a14a47ee89dadb8798ca8ed19848f33f4551d5.

3 years ago[lldb] [test] Reenable two passing tests on FreeBSD
Michał Górny [Tue, 1 Dec 2020 22:00:54 +0000 (23:00 +0100)]
[lldb] [test] Reenable two passing tests on FreeBSD

Reenable TestReproducerAttach and TestThreadSpecificBpPlusCondition
on FreeBSD -- both seem to pass correctly now.

3 years agoMake offset field optional in RegisterInfo packet for Arm64
Muhammad Omair Javaid [Tue, 1 Dec 2020 22:09:14 +0000 (03:09 +0500)]
Make offset field optional in RegisterInfo packet for Arm64

This patch carries forward our aim to remove the offset field from qRegisterInfo
packets and the XML register description. I have created a new function which
returns whether offset fields are dynamic, meaning the client can calculate
offsets on its own based on the register number sequence and register size. For
now this function only returns true for NativeRegisterContextLinux_arm64, but we
can test this for other architectures and make it standard later.

As a consequence we do not send the offset field from lldb-server (arm64 for now),
while other stubs don't have an offset field, so it won't affect them for now.
On the client side we have replaced previous offset calculation algorithm
with a new scheme, where we sort all primary registers in increasing
order of remote regnum and then calculate offset incrementally.
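
The client-side scheme can be sketched as follows. `RegInfo` and `assignOffsets` are hypothetical names, and the sketch assumes the input already contains only primary registers.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical register description; field names are illustrative only.
struct RegInfo {
  uint32_t regnum;   // remote register number
  uint32_t byteSize; // register size in bytes
  uint32_t offset;   // filled in by assignOffsets
};

// Sorts registers by increasing remote register number and gives each one
// the running sum of the sizes of the registers that precede it.
void assignOffsets(std::vector<RegInfo> &regs) {
  std::sort(regs.begin(), regs.end(), [](const RegInfo &a, const RegInfo &b) {
    return a.regnum < b.regnum;
  });
  uint32_t next = 0;
  for (RegInfo &r : regs) {
    r.offset = next;
    next += r.byteSize;
  }
}
```

Since each offset depends only on register numbers and sizes, the stub does not need to transmit it.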

This commit also includes a test to verify all of the above functionality
on Arm64.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D91241

3 years agoRegisterInfoPOSIX_arm64 remove unused bytes from g/G packet
Muhammad Omair Javaid [Tue, 1 Dec 2020 22:09:14 +0000 (03:09 +0500)]
RegisterInfoPOSIX_arm64 remove unused bytes from g/G packet

This came up while putting together our new strategy to create g/G packets
in compliance with GDB RSP protocol where register offsets are calculated in
increasing order of register numbers without any unused spacing.

RegisterInfoPOSIX_arm64::GPR size was being calculated after alignment
correction to 8 bytes, which meant there were 4 unused bytes between the
last GPR (cpsr) and the first vector register V. We have put the LLVM_PACKED_START
decorator on RegisterInfoPOSIX_arm64::GPR to make sure single-byte
alignment is enforced. Moreover, we are now going to use the arm64 user_pt_regs
struct defined in ptrace.h for accessing ptrace user registers.
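
The effect of packing can be shown with a small sketch. The structs below are illustrative stand-ins, not the actual LLDB definition, and `#pragma pack` is used here in place of the LLVM_PACKED_START macro so the example needs no LLVM headers.

```cpp
#include <cstdint>

// Illustrative layout only: 33 eight-byte registers followed by a
// four-byte status register.
struct GPRDefault {
  uint64_t x[31];
  uint64_t sp;
  uint64_t pc;
  uint32_t cpsr;
};

#pragma pack(push, 1)
struct GPRPacked {
  uint64_t x[31];
  uint64_t sp;
  uint64_t pc;
  uint32_t cpsr;
};
#pragma pack(pop)

// With single-byte alignment the size is exactly the sum of the members.
static_assert(sizeof(GPRPacked) == 33 * 8 + 4, "no tail padding expected");
// The default layout may round the size up to the struct's alignment.
static_assert(sizeof(GPRDefault) >= sizeof(GPRPacked), "padding never shrinks");
```

On typical 64-bit targets the default layout is padded to a multiple of 8, which is where the 4 unused bytes after cpsr come from.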

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D92063

3 years ago[OpenMP51][DOCS] Claim "add present modifier in defaultmap clause", NFC.
cchen [Tue, 1 Dec 2020 22:07:00 +0000 (16:07 -0600)]
[OpenMP51][DOCS] Claim "add present modifier in defaultmap clause", NFC.

3 years agoReland [CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
Arthur Eubanks [Wed, 25 Nov 2020 04:40:47 +0000 (20:40 -0800)]
Reland [CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/

This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.

Rename it internally to LLVM_ENABLE_NEW_PASS_MANAGER.

The #define for it is now in llvm-config.h.

The initial land accidentally set the value of
LLVM_ENABLE_NEW_PASS_MANAGER to the string
ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER instead of its value.

Reviewed By: rnk, hans

Differential Revision: https://reviews.llvm.org/D92072

3 years ago[libc++] NFC: Remove unused macros in <__config>
Louis Dionne [Tue, 1 Dec 2020 21:49:48 +0000 (16:49 -0500)]
[libc++] NFC: Remove unused macros in <__config>

3 years ago[LLD][ELF][NewPM] Add option to force legacy PM
Arthur Eubanks [Tue, 1 Dec 2020 19:51:04 +0000 (11:51 -0800)]
[LLD][ELF][NewPM] Add option to force legacy PM

In preparation for the NPM switch.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92417

3 years ago[MLIR] Fix genTypeInterfaceMethods() to work correctly with InferTypeOpInterface
Rahul Joshi [Tue, 1 Dec 2020 19:19:59 +0000 (11:19 -0800)]
[MLIR] Fix genTypeInterfaceMethods() to work correctly with InferTypeOpInterface

- Change InferTypeOpInterface::inferResultTypes to use fully qualified types matching
  the ones generated by genTypeInterfaceMethods, so the redundancy can be detected.
- Move genTypeInterfaceMethods() before genOpInterfaceMethods() so that the
  inferResultTypes method generated by genTypeInterfaceMethods() takes precedence
  over the declaration that might be generated by genOpInterfaceMethods()
- Modified an op in the test dialect to exercise this (the modified op would fail to
  generate valid C++ code due to duplicate inferResultTypes methods).

Differential Revision: https://reviews.llvm.org/D92414

3 years agoRevert "[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/"
Arthur Eubanks [Tue, 1 Dec 2020 21:12:12 +0000 (13:12 -0800)]
Revert "[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/"

The new pass manager was accidentally enabled by default with this change.

This reverts commit a36bd4c90dcca82be9b64f65dbd22e921b6485ef.

3 years agoFix erroneous edit in https://github.com/llvm/llvm-project/actions/runs/394499364
Zahira Ammarguellat [Tue, 1 Dec 2020 20:34:18 +0000 (12:34 -0800)]
Fix erroneous edit in https://github.com/llvm/llvm-project/actions/runs/394499364

3 years ago[LTO][wasm][NewPM] Allow using new pass manager for wasm LTO
Arthur Eubanks [Tue, 1 Dec 2020 20:22:27 +0000 (12:22 -0800)]
[LTO][wasm][NewPM] Allow using new pass manager for wasm LTO

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D92150

3 years ago[OpenMP] Add support for Intel's umonitor/umwait
Terry Wilmarth [Tue, 1 Dec 2020 20:03:40 +0000 (14:03 -0600)]
[OpenMP] Add support for Intel's umonitor/umwait

These changes add support for Intel's umonitor/umwait usage in wait
code, for architectures that support those intrinsic functions. Usage of
umonitor/umwait is off by default, but can be turned on by setting the
KMP_USER_LEVEL_MWAIT environment variable.

Differential Revision: https://reviews.llvm.org/D91189

3 years ago[MLIR][LLVM] Fix a tiny typo in the dialect docs.
ergawy [Tue, 1 Dec 2020 20:06:33 +0000 (20:06 +0000)]
[MLIR][LLVM] Fix a tiny typo in the dialect docs.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D92333

3 years ago[clang-scan-deps] Improve argument parsing to find target object file path.
Sylvain Audi [Mon, 30 Nov 2020 16:56:37 +0000 (11:56 -0500)]
[clang-scan-deps] Improve argument parsing to find target object file path.

Support the joined version of -o (-ofilepath), and ensure we use the last provided -o option.
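
A simplified sketch of that behaviour, assuming a plain argument vector. `findOutputPath` is a hypothetical helper, not the function in clang-scan-deps, and it does not know about other options that begin with `-o` or that consume the following argument.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the path given by the last -o option, accepting both the separate
// form ("-o", "file") and the joined form ("-ofile"). Returns an empty
// string if no -o option is present.
std::string findOutputPath(const std::vector<std::string> &args) {
  std::string result;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &arg = args[i];
    if (arg == "-o") {
      // Separate form: the path is the next argument, if there is one.
      if (i + 1 < args.size())
        result = args[++i];
    } else if (arg.size() > 2 && arg.compare(0, 2, "-o") == 0) {
      // Joined form: the path follows the option directly.
      result = arg.substr(2);
    }
  }
  return result;
}
```

Scanning the whole list and overwriting `result` on each hit is what makes the last `-o` win.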

Differential Revision: https://reviews.llvm.org/D92330

3 years ago[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
Arthur Eubanks [Wed, 25 Nov 2020 04:40:47 +0000 (20:40 -0800)]
[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/

This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.

The #define for it is now in llvm-config.h.

Reviewed By: rnk, hans

Differential Revision: https://reviews.llvm.org/D92072

3 years ago[gn build] sync script: try to make sync script even clearer
Nico Weber [Tue, 1 Dec 2020 19:35:21 +0000 (14:35 -0500)]
[gn build] sync script: try to make sync script even clearer

Turns out startswith() takes an optional start parameter :)

No behavior change.

3 years ago[DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT
Layton Kifer [Tue, 1 Dec 2020 19:09:04 +0000 (22:09 +0300)]
[DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D92246

3 years agoAPINotes: constify `dump` methods (NFC)
Saleem Abdulrasool [Mon, 30 Nov 2020 23:54:08 +0000 (23:54 +0000)]
APINotes: constify `dump` methods (NFC)

This simply marks the functions as const as they do not mutate the
value.  This is useful for debugging iterations during development.
NFCI.

3 years agoArgument dependent lookup with class argument is recursing into base
Zahira Ammarguellat [Fri, 6 Nov 2020 14:38:22 +0000 (06:38 -0800)]
Argument dependent lookup with class argument is recursing into base
classes that haven't been instantiated. This is generating an assertion
in DeclTemplate.h. Fix for Bug25668.

3 years agostatic const char *const foo => const char foo[]
Fangrui Song [Tue, 1 Dec 2020 18:33:18 +0000 (10:33 -0800)]
static const char *const foo => const char foo[]

By default, a non-template variable of non-volatile const-qualified type
having namespace-scope has internal linkage, so no need for `static`.
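
A small sketch of the two forms; the variable names are hypothetical.

```cpp
// At namespace scope, a const-qualified (non-volatile, non-template)
// variable has internal linkage even without `static`.
const char kToolName[] = "example-tool";

// The pointer form needs the pointer itself to be const to get the same
// effect; `const char *p` alone would declare a mutable pointer with
// external linkage.
const char *const kToolNamePtr = "example-tool";

// The array form also keeps the length available at compile time.
static_assert(sizeof(kToolName) == 13, "12 characters plus the terminator");
```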

3 years ago[ELF][test] Fix lto/version-script2.ll
Fangrui Song [Tue, 1 Dec 2020 18:22:32 +0000 (10:22 -0800)]
[ELF][test] Fix lto/version-script2.ll

3 years ago[LTO][NewPM] Run verifier when doing LTO
Arthur Eubanks [Tue, 1 Dec 2020 18:14:38 +0000 (10:14 -0800)]
[LTO][NewPM] Run verifier when doing LTO

This matches the legacy PM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D92138

3 years agoRevert "[LV] Epilogue Vectorization with Optimal Control Flow"
Bardia Mahjour [Tue, 1 Dec 2020 17:48:36 +0000 (12:48 -0500)]
Revert "[LV] Epilogue Vectorization with Optimal Control Flow"

This reverts commit 9c5504adceb544d9954ddb8ff3035a414f4b1423.
Reverting to investigate build failure in http://lab.llvm.org:8011/#/builders/98/builds/1461/steps/9

3 years ago[libc++] Optimize the number of assignments in std::exclusive_scan
Louis Dionne [Tue, 24 Nov 2020 17:29:08 +0000 (12:29 -0500)]
[libc++] Optimize the number of assignments in std::exclusive_scan

Reported in https://twitter.com/blelbach/status/1169807347142676480

Differential Revision: https://reviews.llvm.org/D67273

3 years agoLet .llvm_bb_addr_map section use the same unique id as its associated .text section.
Rahman Lavaee [Tue, 1 Dec 2020 17:20:34 +0000 (09:20 -0800)]
Let .llvm_bb_addr_map section use the same unique id as its associated .text section.

Currently, `llvm_bb_addr_map` sections are generated per section name because we use
the `LinkedToSymbol` argument of getELFSection. This causes the address map tables of functions
to be grouped into the same section when `-function-sections=true -unique-section-names=false`, which is not
the intended behaviour. This patch lets the unique id of every `.text` section propagate to the associated
`.llvm_bb_addr_map` section.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92113

3 years ago[BasicAA] Add test for suboptimal result with unknown sizes (NFC)
Nikita Popov [Tue, 1 Dec 2020 17:19:40 +0000 (18:19 +0100)]
[BasicAA] Add test for suboptimal result with unknown sizes (NFC)

3 years ago[NFC][clang-tidy] Port rename_check.py to Python3
Roman Lebedev [Tue, 1 Dec 2020 16:50:56 +0000 (19:50 +0300)]
[NFC][clang-tidy] Port rename_check.py to Python3

3 years agoclang/darwin: Use response files with ld64.lld.darwinnew
Nico Weber [Tue, 1 Dec 2020 16:46:15 +0000 (11:46 -0500)]
clang/darwin: Use response files with ld64.lld.darwinnew

The new MachO lld just grew support for response files in D92149, so let
the clang driver use it.

Differential Revision: https://reviews.llvm.org/D92399

3 years ago[LV] Epilogue Vectorization with Optimal Control Flow
Bardia Mahjour [Tue, 1 Dec 2020 16:57:16 +0000 (11:57 -0500)]
[LV] Epilogue Vectorization with Optimal Control Flow

This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.

Similar to D88819, this patch achieves epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D89566

3 years ago[ELF] Error for undefined foo@v1
Fangrui Song [Tue, 1 Dec 2020 16:59:54 +0000 (08:59 -0800)]
[ELF] Error for undefined foo@v1

If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).

GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).

This patch implements the error.

Depends on D92258

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92260

3 years ago[MemCpyOpt] Port to MemorySSA
Nikita Popov [Fri, 2 Oct 2020 19:41:19 +0000 (21:41 +0200)]
[MemCpyOpt] Port to MemorySSA

This is a straightforward port of MemCpyOpt to MemorySSA following
the approach of D26739. MemDep queries are replaced with MSSA queries
without changing the overall structure of the pass. Some care has
to be taken to account for differences between these APIs
(MemDep also returns reads, MSSA doesn't).

Differential Revision: https://reviews.llvm.org/D89207

3 years ago[ELF] Make foo@@v1 resolve undefined foo@v1
Fangrui Song [Tue, 1 Dec 2020 16:54:01 +0000 (08:54 -0800)]
[ELF] Make foo@@v1 resolve undefined foo@v1

The symbol resolution rules for versioned symbols are:

* foo@@v1 (default version) resolves both undefined foo and foo@v1
* foo@v1 (non-default version) resolves undefined foo@v1

Note, foo@@v1 must be defined (the assembler errors if attempting to
create an undefined foo@@v1).

For defined foo@@v1 in a shared object, we call `SymbolTable::addSymbol` twice,
one for foo and the other for foo@v1. We don't do the same for object files, so
foo@@v1 defined in one object file incorrectly does not resolve a foo@v1
reference in another object file.

This patch fixes the issue by reusing the --wrap code to redirect symbols in
object files. This has to be done after processing input files because
foo and foo@v1 are two separate symbols if we haven't seen foo@@v1.

Add a helper `Symbol::getVersionSuffix` to retrieve the optional trailing
`@...` or `@@...` from the possibly truncated symbol name.
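
The two resolution rules listed above can be modelled with a small sketch. The helpers `version_suffix`, `split_versioned`, and `resolves` are hypothetical Python stand-ins written for illustration; they do not mirror lld's data structures, and they cover only the rules stated in this message.

```python
def version_suffix(name):
    """Return the trailing '@...' or '@@...' part of a symbol name, or ''."""
    idx = name.find("@")
    return "" if idx == -1 else name[idx:]


def split_versioned(name):
    """Split a name into (base, version, is_default)."""
    suffix = version_suffix(name)
    base = name[: len(name) - len(suffix)]
    return base, suffix.lstrip("@"), suffix.startswith("@@")


def resolves(defined, undefined):
    """Does the definition `defined` satisfy the reference `undefined`?"""
    dbase, dver, ddefault = split_versioned(defined)
    ubase, uver, _ = split_versioned(undefined)
    if dbase != ubase:
        return False
    if uver == "":
        # An unversioned reference binds to an unversioned definition
        # or to the default version (foo@@v1).
        return dver == "" or ddefault
    # A versioned reference needs a definition with the same version.
    return dver == uver


print(resolves("foo@@v1", "foo"))     # True
print(resolves("foo@@v1", "foo@v1"))  # True
print(resolves("foo@v1", "foo"))      # False
```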

Depends on D92258

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D92259

3 years ago[ELF][test] Add some tests for versioned symbols in object files
Fangrui Song [Tue, 1 Dec 2020 16:49:14 +0000 (08:49 -0800)]
[ELF][test] Add some tests for versioned symbols in object files

Test the symbol resolution related to

* defined foo@@v1 and foo@v1 in object files/shared objects
* undefined foo@v1
* weak foo@@v1 and foo@v1
* visibility
* interaction with --wrap.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92258

3 years ago[X86] Support modifier @PLTOFF for R_X86_64_PLTOFF64
Fangrui Song [Tue, 1 Dec 2020 16:39:00 +0000 (08:39 -0800)]
[X86] Support modifier @PLTOFF for R_X86_64_PLTOFF64

`gcc -mcmodel=large` can emit @PLTOFF.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92294

3 years ago[InstSimplify] Add tests that fold instructions with poison operands (NFC)
Juneyoung Lee [Tue, 1 Dec 2020 16:01:42 +0000 (01:01 +0900)]
[InstSimplify] Add tests that fold instructions with poison operands (NFC)

3 years ago[MergeICmps] Fix missing split.
Clement Courbet [Tue, 1 Dec 2020 08:44:23 +0000 (09:44 +0100)]
[MergeICmps] Fix missing split.

We were not correctly splitting blocks for chains of length 1.

Before this change, additional instructions for blocks in chains of
length 1 were not split off from the block before removing it (this was
done correctly for longer chains).
If this first block contained an instruction referenced elsewhere,
deleting the block would invalidate the produced value.

This caused a miscompile which motivated D92297 (before D17993,
nonnull and dereferenceable attributes were not added, so MergeICmps was
not triggered). The new test gep-references-bb.ll demonstrates the issue.

The regression was introduced in
rG0efadbbcdeb82f5c14f38fbc2826107063ca48b2.

This supersedes D92364.

Test case by MaskRay (Fangrui Song).

Differential Revision: https://reviews.llvm.org/D92375

3 years ago[HIP] Fix static-lib test CHECK bug
Aaron En Ye Shi [Tue, 1 Dec 2020 15:46:19 +0000 (15:46 +0000)]
[HIP] Fix static-lib test CHECK bug

Fix hip test failures that were introduced by
previous changes to hip-toolchain-rdc-static-lib.hip
test. The .*lld.* is matching a longer string than
expected.

Differential Revision: https://reviews.llvm.org/D92342

3 years ago[x86] adjust cost model values for minnum/maxnum with fast-math-flags
Sanjay Patel [Tue, 1 Dec 2020 15:35:24 +0000 (10:35 -0500)]
[x86] adjust cost model values for minnum/maxnum with fast-math-flags

Without FMF, we lower these intrinsics into something like this:

vmaxsd %xmm0, %xmm1, %xmm2
vcmpunordsd %xmm0, %xmm0, %xmm0
vblendvpd %xmm0, %xmm1, %xmm2, %xmm0

But if we can ignore NANs, the single min/max instruction is enough
because there is no need to fix up the x86 logic that corresponds to
X > Y ? X : Y.
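
To make the NaN fix-up concrete, here is a small sketch that models the x86 max instruction as `X > Y ? X : Y` (so it returns its second operand whenever the comparison is false, including when either input is NaN). The function names are hypothetical and the operand order is an assumption chosen to match the three-instruction sequence above; signed zeros are not modelled.

```python
import math


def x86_max(src1, src2):
    # Models "src1 > src2 ? src1 : src2": any NaN makes the comparison
    # false, so the second operand is returned unchanged.
    return src1 if src1 > src2 else src2


def maxnum_with_fixup(x, y):
    # max, then select y when x is NaN (the cmpunord + blend step).
    t = x86_max(y, x)
    return y if math.isnan(x) else t


def maxnum_no_nans(x, y):
    # With the no-NaNs fast-math flag the single max is enough.
    return x86_max(y, x)


nan = float("nan")
print(maxnum_with_fixup(nan, 1.0))  # 1.0
print(maxnum_with_fixup(1.0, nan))  # 1.0
print(maxnum_no_nans(2.0, 3.0))     # 3.0
```

The bare max alone would return NaN for `x86_max(1.0, nan)`, which is why the extra compare and blend are needed when NaNs cannot be ruled out.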

We probably want to make other adjustments for FP intrinsics with FMF
to account for specialized codegen (for example, FSQRT).

Differential Revision: https://reviews.llvm.org/D92337

3 years ago[DAG] Remove unused variable. NFC.
Benjamin Kramer [Tue, 1 Dec 2020 15:29:02 +0000 (16:29 +0100)]
[DAG] Remove unused variable. NFC.

3 years ago[ARM] Mark select and selectcc of MVE vector operations as expand.
David Green [Tue, 1 Dec 2020 15:05:55 +0000 (15:05 +0000)]
[ARM] Mark select and selectcc of MVE vector operations as expand.

We already expand select and select_cc in codegenprepare, but they can
still be generated under some situations. Explicitly mark them as expand
to ensure they are not produced, as that would lead to a failure to
select the nodes.

Differential Revision: https://reviews.llvm.org/D92373

3 years ago[InstCombine] canonicalize sign-bit-shift of difference to ext(icmp)
Sanjay Patel [Tue, 1 Dec 2020 13:51:19 +0000 (08:51 -0500)]
[InstCombine] canonicalize sign-bit-shift of difference to ext(icmp)

icmp is the preferred spelling in IR because icmp analysis is
expected to be better than any other analysis. This should
lead to more follow-on folding potential.

It's difficult to say exactly what we should do in codegen to
compensate. For example on AArch64, which of these is preferred:
sub w8, w0, w1
lsr w0, w8, #31

vs:
cmp w0, w1
cset w0, lt

If there are perf regressions, then we should deal with those in
codegen on a case-by-case basis.

A possible motivating example for better optimization is shown in:
https://llvm.org/PR43198 but that will require other transforms
before anything changes there.

Alive proof:
https://rise4fun.com/Alive/o4E

  Name: sign-bit splat
  Pre: C1 == (width(%x) - 1)
  %s = sub nsw %x, %y
  %r = ashr %s, C1
  =>
  %c = icmp slt %x, %y
  %r = sext %c

  Name: sign-bit LSB
  Pre: C1 == (width(%x) - 1)
  %s = sub nsw %x, %y
  %r = lshr %s, C1
  =>
  %c = icmp slt %x, %y
  %r = zext %c
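
As an independent sanity check of the two rewrites above, the following sketch brute-forces them for an 8-bit type. The function name `check_sign_bit_folds` is hypothetical; pairs where `sub nsw` would overflow are skipped because the result is poison there.

```python
def check_sign_bit_folds(width=8):
    """Exhaustively compare both folds against their icmp form."""
    mask = (1 << width) - 1
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            diff = x - y
            if not lo <= diff <= hi:
                continue  # nsw violated: poison, nothing to check
            ashr = (diff >> (width - 1)) & mask   # sign-bit splat
            lshr = (diff & mask) >> (width - 1)   # sign-bit LSB
            is_less = x < y                       # icmp slt %x, %y
            if ashr != (mask if is_less else 0):  # sext i1
                return False
            if lshr != (1 if is_less else 0):     # zext i1
                return False
    return True


print(check_sign_bit_folds())  # True
```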

3 years ago[lldb][NFC] Modernize and cleanup TestClassTemplateParameterPack
Raphael Isemann [Tue, 1 Dec 2020 14:49:51 +0000 (15:49 +0100)]
[lldb][NFC] Modernize and cleanup TestClassTemplateParameterPack

* Un-inline the test.
* Use expect_expr everywhere and also check all involved types.
* Clang-format the test sources.
* Explain what we're actually testing with the 'C' and 'D' templates.
* Split out the non-template-parameter-pack part of the test into its own small test.

3 years ago[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111)
Simon Pilgrim [Tue, 1 Dec 2020 14:21:22 +0000 (14:21 +0000)]
[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111)

Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.
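
A minimal sketch of why such a select is equivalent to an unsigned saturating subtract, checked exhaustively for 8-bit lanes. The exact operand order of the compare is an assumption here (select on `x <u y`), and the function names are hypothetical.

```python
def select_form(x, y, width=8):
    mask = (1 << width) - 1
    # select (x <u y), 0, (x - y) with wrapping subtraction
    return 0 if x < y else (x - y) & mask


def usubsat(x, y):
    # Unsigned saturating subtract clamps at zero.
    return max(x - y, 0)


print(all(select_form(x, y) == usubsat(x, y)
          for x in range(256) for y in range(256)))  # True
```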

3 years ago[ConstraintElimination] Decompose GEP %ptr, ZEXT(SHL()).
Florian Hahn [Mon, 30 Nov 2020 15:43:39 +0000 (15:43 +0000)]
[ConstraintElimination] Decompose GEP %ptr, ZEXT(SHL()).

Add support to decompose a GEP with a ZEXT(SHL()) operand.

3 years agolld/ELF: Make three rarely-used flags work with --reproduce
Nico Weber [Tue, 1 Dec 2020 00:54:04 +0000 (19:54 -0500)]
lld/ELF: Make three rarely-used flags work with --reproduce

All three use readFile() for their argument so their argument file is
already copied to the tar, but we weren't rewriting the argument to
point to the path used in the tar file.

No test because the change is trivial (several other flags in
createResponseFile() also aren't tested, likely for the same reason.)

Differential Revision: https://reviews.llvm.org/D92356

3 years ago[RISCV][crt] support building without init_array
Alexey Baturo [Tue, 1 Dec 2020 12:58:31 +0000 (15:58 +0300)]
[RISCV][crt] support building without init_array

Reviewed By: luismarques, phosek, kito-cheng

Differential Revision: https://reviews.llvm.org/D87997

3 years ago[VE] Add vmul and vdiv intrinsic instructions
Kazushi (Jam) Marukawa [Tue, 1 Dec 2020 11:08:22 +0000 (20:08 +0900)]
[VE] Add vmul and vdiv intrinsic instructions

Add vmul and vdiv intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92377

3 years ago[X86] Add PR48223 usubsat test case
Simon Pilgrim [Tue, 1 Dec 2020 13:51:27 +0000 (13:51 +0000)]
[X86] Add PR48223 usubsat test case

3 years ago[InstCombine] Optimize away the unnecessary multi-use sign-extend
Bhramar Vatsa [Tue, 1 Dec 2020 13:35:04 +0000 (16:35 +0300)]
[InstCombine] Optimize away the unnecessary multi-use sign-extend

Cf. https://bugs.llvm.org/show_bug.cgi?id=47765

Added a case for handling the sign-extend (Shl+AShr) for multiple uses,
to optimize it away for an individual use,
when the demanded bits aren't affected by sign-extend.
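
The underlying observation can be checked with a short sketch: a shl+ashr pair by the same amount `c` only changes the top `c` bits, so a use that demands only the low `width - c` bits can read the original value instead. The names `sext_in_reg` and `low_bits_unchanged` are hypothetical, and this models only the bit-level fact, not the InstCombine code.

```python
def sext_in_reg(x, c, width=8):
    """Compute ashr(shl(x, c), c) on a `width`-bit value."""
    mask = (1 << width) - 1
    shifted = (x << c) & mask
    if shifted & (1 << (width - 1)):
        shifted -= 1 << width  # reinterpret as signed for the ashr
    return (shifted >> c) & mask


def low_bits_unchanged(width=8):
    for c in range(width):
        demanded = (1 << (width - c)) - 1  # bits untouched by the extend
        for x in range(1 << width):
            if (sext_in_reg(x, c, width) & demanded) != (x & demanded):
                return False
    return True


print(low_bits_unchanged())  # True
```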

https://rise4fun.com/Alive/lgf

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D91343

3 years ago[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2
Roman Lebedev [Tue, 1 Dec 2020 12:48:32 +0000 (15:48 +0300)]
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2

If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.

3 years ago[OpenMP] libomp: add UNLIKELY hints to rarely executed branches
AndreyChurbanov [Tue, 1 Dec 2020 13:53:21 +0000 (16:53 +0300)]
[OpenMP] libomp: add UNLIKELY hints to rarely executed branches

Added UNLIKELY hint to one-time or rarely executed branches.
This improves performance of the library on some tasking benchmarks.

Differential Revision: https://reviews.llvm.org/D92322

3 years ago[InstCombine] add tests for sign-bit-shift-of-sub; NFC
Sanjay Patel [Tue, 1 Dec 2020 12:37:06 +0000 (07:37 -0500)]
[InstCombine] add tests for sign-bit-shift-of-sub; NFC

3 years agoRemove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now
Hans Wennborg [Tue, 1 Dec 2020 12:50:49 +0000 (13:50 +0100)]
Remove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now

3 years agoRevert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc())))...
Roman Lebedev [Tue, 1 Dec 2020 12:47:04 +0000 (15:47 +0300)]
Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold"

It seems I have missed checklines; temporarily reverting,
will reland momentarily.

This reverts commit aa1aa135097ecfab6d9917a435142030eff0a226.

3 years ago[NFC][InstCombine] sext.ll: @test9: avoid only differently-cased names for values...
Roman Lebedev [Tue, 1 Dec 2020 12:33:12 +0000 (15:33 +0300)]
[NFC][InstCombine] sext.ll: @test9: avoid only differently-cased names for values and block names

3 years ago[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold
Roman Lebedev [Tue, 1 Dec 2020 12:11:14 +0000 (15:11 +0300)]
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold

If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.

3 years ago[NFC][InstCombine] Improve vector undef test coverage for sext(ashr(shl(trunc())...
Roman Lebedev [Tue, 1 Dec 2020 12:04:40 +0000 (15:04 +0300)]
[NFC][InstCombine] Improve vector undef test coverage for sext(ashr(shl(trunc()))) fold

3 years ago[InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide...
Roman Lebedev [Tue, 1 Dec 2020 12:00:15 +0000 (15:00 +0300)]
[InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide type (PR48343)

It is not correct to compute that new shift amount in its narrow type
and only then extend it into the wide type:

----------------------------------------
Optimization: PR48343 good
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sext %Y
  %n1 = sub width(%o0), %n0
  %n2 = sub width(%X), %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 2016
Optimization is correct!

----------------------------------------
Optimization: PR48343 bad
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sub width(%o0), %Y
  %n1 = sub width(%X), %n0
  %n2 = sext %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i9 %r

Example:
%X i9 = 0x000 (0)
%Y i4 = 0x3 (3)
%o0 i4 = 0x0 (0)
%o1 i4 = 0x0 (0)
%o2 i4 = 0x0 (0)
%n0 i4 = 0x1 (1)
%n1 i4 = 0x8 (8, -8)
%n2 i9 = 0x1F8 (504, -8)
%n3 i9 = 0x000 (0)
Source value: 0x000 (0)
Target value: undef

I.e. we should be computing it in the wide type from the beginning.
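
The counterexample above (i9 %X, i4 %Y) can be replayed with a small bit-level model. All function names are hypothetical; shifts by an amount at or beyond the bit width are modelled as returning `None` to stand in for the undefined result.

```python
NARROW, WIDE = 4, 9


def to_signed(value, bits):
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def sext(value, from_bits, to_bits):
    return to_signed(value, from_bits) & ((1 << to_bits) - 1)


def ashr(value, amount, bits):
    return (to_signed(value, bits) >> amount) & ((1 << bits) - 1)


def source(x, y):
    nmask = (1 << NARROW) - 1
    o1 = ((x & nmask) << y) & nmask          # trunc, then shl
    return sext(ashr(o1, y, NARROW), NARROW, WIDE)


def good_amount(y):
    # Everything is computed in the wide type.
    return WIDE - (NARROW - y)


def bad_amount(y):
    # Computed in the narrow type (wrapping), then sign-extended.
    nmask = (1 << NARROW) - 1
    n0 = (NARROW - y) & nmask
    n1 = (WIDE - n0) & nmask
    return sext(n1, NARROW, WIDE)


def target(x, amount):
    if amount >= WIDE:
        return None  # over-wide shift: no defined result
    n3 = (x << amount) & ((1 << WIDE) - 1)
    return ashr(n3, amount, WIDE)


print(good_amount(3), bad_amount(3))  # 8 504
print(all(source(x, y) == target(x, good_amount(y))
          for x in range(1 << WIDE) for y in range(NARROW)))  # True
```

With `%Y = 3` the narrow computation wraps to 8, which is -8 as an i4 and sign-extends to 504, matching `%n2` in the example.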

Fixes https://bugs.llvm.org/show_bug.cgi?id=48343

3 years ago[NFC][InstCombine] Add PR48343 miscompiled testcase
Roman Lebedev [Tue, 1 Dec 2020 11:49:28 +0000 (14:49 +0300)]
[NFC][InstCombine] Add PR48343 miscompiled testcase

3 years ago[NFC][InstCombine] Autogenerate sext.ll test checklines
Roman Lebedev [Tue, 1 Dec 2020 11:48:46 +0000 (14:48 +0300)]
[NFC][InstCombine] Autogenerate sext.ll test checklines

3 years ago[SimplifyCFG] FoldBranchToCommonDest: don't require that cmp of br is last instruction
Roman Lebedev [Tue, 1 Dec 2020 08:07:28 +0000 (11:07 +0300)]
[SimplifyCFG] FoldBranchToCommonDest: don't require that cmp of br is last instruction

There is no correctness need for that, and since we allow live-out
uses, this could theoretically happen, because currently nothing
will move the cond to right before the branch in those tests.
But regardless, lifting that restriction even makes the transform
easier to understand.

This makes the transform happen in 81 more cases (+0.55%).

3 years ago[NFC][SimplifyCFG] fold-branch-to-common-dest: add tests with cond of br not being...
Roman Lebedev [Tue, 1 Dec 2020 07:59:08 +0000 (10:59 +0300)]
[NFC][SimplifyCFG] fold-branch-to-common-dest: add tests with cond of br not being the last op

3 years ago[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111)
Simon Pilgrim [Tue, 1 Dec 2020 11:56:12 +0000 (11:56 +0000)]
[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111)

Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds.
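
A minimal sketch of the equivalence behind this fold, checked exhaustively for 8-bit lanes. The compare shape used here (the wrapped sum being unsigned-less-than an operand signals overflow) is an assumption for illustration, and the function names are hypothetical.

```python
def select_form(x, y, width=8):
    mask = (1 << width) - 1
    total = (x + y) & mask
    # select (total <u x), -1, total: a wrapped sum is smaller than x
    return mask if total < x else total


def uaddsat(x, y, width=8):
    # Unsigned saturating add clamps at the all-ones value.
    return min(x + y, (1 << width) - 1)


print(all(select_form(x, y) == uaddsat(x, y)
          for x in range(256) for y in range(256)))  # True
```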

The SSE42 test diffs are relatively benign - it's avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away.

Differential Revision: https://reviews.llvm.org/D91876