platform/upstream/llvm.git
4 years ago[X86] Enable commuting of EVEX VCMP for all immediate values during isel.
Craig Topper [Tue, 17 Sep 2019 04:40:58 +0000 (04:40 +0000)]
[X86] Enable commuting of EVEX VCMP for all immediate values during isel.

llvm-svn: 372065

4 years ago[test] Disable reproducer dump test on Windows
Jonas Devlieghere [Tue, 17 Sep 2019 03:58:32 +0000 (03:58 +0000)]
[test] Disable reproducer dump test on Windows

llvm-svn: 372064

4 years agoFix reliance on -flax-vector-conversions in AVX intrinsics headers and
Richard Smith [Tue, 17 Sep 2019 03:56:30 +0000 (03:56 +0000)]
Fix reliance on -flax-vector-conversions in AVX intrinsics headers and
corresponding tests.

llvm-svn: 372063

4 years agoFix reliance on lax vector conversions in tests for x86 intrinsics.
Richard Smith [Tue, 17 Sep 2019 03:56:28 +0000 (03:56 +0000)]
Fix reliance on lax vector conversions in tests for x86 intrinsics.

llvm-svn: 372062

4 years agoRemove reliance on lax vector conversions from altivec.h in VSX mode.
Richard Smith [Tue, 17 Sep 2019 03:56:26 +0000 (03:56 +0000)]
Remove reliance on lax vector conversions from altivec.h in VSX mode.

llvm-svn: 372061

4 years ago[ScriptInterpreter] Initialize globals when loading a scripting module.
Jonas Devlieghere [Tue, 17 Sep 2019 03:55:58 +0000 (03:55 +0000)]
[ScriptInterpreter] Initialize globals when loading a scripting module.

The LoadScriptingModule used by command script import wasn't
initializing the LLDB global variables (things like `lldb.frame` and
`lldb.debugger`). They would get initialized however when running the
interactive script interpreter or running a single script line (e.g.
`script print(lldb.frame)`). This patch fixes that by properly
initializing the globals when loading a Python module.

Differential revision: https://reviews.llvm.org/D67644

llvm-svn: 372060

4 years ago[ELF][Hexagon] Allow PT_LOAD to have overlapping p_offset ranges on EM_HEXAGON
Fangrui Song [Tue, 17 Sep 2019 02:45:38 +0000 (02:45 +0000)]
[ELF][Hexagon] Allow PT_LOAD to have overlapping p_offset ranges on EM_HEXAGON

Port the D64906 technique to EM_HEXAGON. This concludes the patch series.

Differential Revision: https://reviews.llvm.org/D67605

llvm-svn: 372059

4 years agoPush lambda scope earlier when transforming lambda expression
Nicholas Allegra [Tue, 17 Sep 2019 01:43:33 +0000 (01:43 +0000)]
Push lambda scope earlier when transforming lambda expression

Differential Revision: https://reviews.llvm.org/D66067

llvm-svn: 372058

4 years agoRevert "[lldb][NFC] Make ApplyObjcCastHack less scary"
Jim Ingham [Tue, 17 Sep 2019 00:44:48 +0000 (00:44 +0000)]
Revert "[lldb][NFC] Make ApplyObjcCastHack less scary"

This reverts commit 21641a2f6dbac22653befd03496e0850537882ff.

It was causing the following test failures:

lldb-Suite.lang/objc/objc-class-method.TestObjCClassMethod.py
lldb-Suite.lang/objc/foundation.TestObjCMethodsString.py
lldb-Suite.lang/objc/foundation.TestConstStrings.py
lldb-Suite.lang/objc/radar-9691614.TestObjCMethodReturningBOOL.py
lldb-Suite.lang/objc/foundation.TestObjCMethodsNSArray.py

llvm-svn: 372057

4 years ago[libFuzzer] Always print DSO map on Fuchsia libFuzzer launch
Jake Ehrlich [Tue, 17 Sep 2019 00:34:41 +0000 (00:34 +0000)]
[libFuzzer] Always print DSO map on Fuchsia libFuzzer launch

Fuchsia doesn't have /proc/id/maps, so it relies on the kernel logging system
to provide the DSO map to be able to symbolize in the context of ASLR. The DSO
map is logged automatically on Fuchsia when encountering a crash or writing to
the sanitizer log for the first time in a process. There are several cases
where libFuzzer doesn't encounter a crash, e.g. on timeouts, OOMs, and when
configured to print new PCs as they become covered, to name a few. Therefore,
this change always writes to the sanitizer log on startup to ensure the DSO map
is available in the log.

Author: aarongreen
Differential Revision: https://reviews.llvm.org/D66233

llvm-svn: 372056

4 years ago[OPENMP] Fix the test, NFC
Alexey Bataev [Tue, 17 Sep 2019 00:08:50 +0000 (00:08 +0000)]
[OPENMP] Fix the test, NFC

llvm-svn: 372055

4 years agollvm-reduce: Clean out previous test temp/output dir, since it was a dir and now...
David Blaikie [Mon, 16 Sep 2019 23:56:26 +0000 (23:56 +0000)]
llvm-reduce: Clean out previous test temp/output dir, since it was a dir and now it's used as just a single file

llvm-svn: 372054

4 years agollvm-reduce: Remove some string copies
David Blaikie [Mon, 16 Sep 2019 23:54:57 +0000 (23:54 +0000)]
llvm-reduce: Remove some string copies

llvm-svn: 372053

4 years ago[test] Fail gracefully if the regex doesn't match
Jonas Devlieghere [Mon, 16 Sep 2019 23:49:42 +0000 (23:49 +0000)]
[test] Fail gracefully if the regex doesn't match

This test is failing on the Fedora bot (staging). Rather than failing
with an IndexError, we should trigger an assert and dump the log when
the regex doesn't match.

llvm-svn: 372052

4 years agoRevert r372035: "[lit] Make internal diff work in pipelines"
Joel E. Denny [Mon, 16 Sep 2019 23:47:46 +0000 (23:47 +0000)]
Revert r372035: "[lit] Make internal diff work in pipelines"

This breaks a Windows bot.

llvm-svn: 372051

4 years ago[GlobalISel] Partially revert r371901.
Amara Emerson [Mon, 16 Sep 2019 23:46:03 +0000 (23:46 +0000)]
[GlobalISel] Partially revert r371901.

r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.

llvm-svn: 372050

4 years agollvm-reduce: Make tests shell-independent by passing the interpreter on the command...
David Blaikie [Mon, 16 Sep 2019 23:41:19 +0000 (23:41 +0000)]
llvm-reduce: Make tests shell-independent by passing the interpreter on the command line rather than using #! in the test file

llvm-svn: 372049

4 years agoAdd libc to path mappings in git-llvm.
David L. Jones [Mon, 16 Sep 2019 23:36:35 +0000 (23:36 +0000)]
Add libc to path mappings in git-llvm.

llvm-svn: 372048

4 years agoFix swig python package path
Haibo Huang [Mon, 16 Sep 2019 23:31:16 +0000 (23:31 +0000)]
Fix swig python package path

Summary:
The path defined in CMakeLists.txt doesn't match the path generated in
our python script. This change fixes that.

LLVM_LIBRARY_OUTPUT_INTDIR is defined as:

${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX})

On the other hand, the path of site-package is generaged in
get_framework_python_dir_windows() in finishSwigPythonLLDB.py as:
(Dispite its name, the function is used for everything other than xcode)

prefix/cmakeBuildConfiguration/distutils.sysconfig.get_python_lib()

From lldb/CMakeLists.txt, we can see that:
prefix=${CMAKE_BINARY_DIR},
cmakeBuildConfiguration=${CMAKE_CFG_INTDIR}

And from python source code, we can see get_python_lib() always returns
lib/pythonx.y/site-packages for posix, or Lib/site-packages for windows:
https://github.com/python/cpython/blob/3.8/Lib/distutils/sysconfig.py#L128

We should make them match each other.

Subscribers: mgorny, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D67583

llvm-svn: 372047

4 years ago[Reproducer] Implement dumping packets.
Jonas Devlieghere [Mon, 16 Sep 2019 23:31:06 +0000 (23:31 +0000)]
[Reproducer] Implement dumping packets.

This patch completes the dump functionality by adding support for
dumping a reproducer's GDB remote packets.

Differential revision: https://reviews.llvm.org/D67636

llvm-svn: 372046

4 years agoFix warning: lambda capture 'temp_file_path' is not used
Jonas Devlieghere [Mon, 16 Sep 2019 22:55:49 +0000 (22:55 +0000)]
Fix warning: lambda capture 'temp_file_path' is not used

llvm-svn: 372044

4 years ago[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Nemanja Ivanovic [Mon, 16 Sep 2019 22:54:52 +0000 (22:54 +0000)]
[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32

Add the missing piece of r372029.
Somehow when the patch for review D61961 was committed, only the test case
went in and the code didn't. This of course caused all kinds of build bot
breaks.
This patch just adds the code for that patch.

Author: Lei Huang
Differential revision: https://reviews.llvm.org/D61961

llvm-svn: 372043

4 years ago[Remarks] Allow remarks::Format::YAML to take a string table
Francis Visoiu Mistrih [Mon, 16 Sep 2019 22:45:17 +0000 (22:45 +0000)]
[Remarks] Allow remarks::Format::YAML to take a string table

It should be allowed to take a string table in case all the strings in
the remarks point there, but it shouldn't use it during serialization.

llvm-svn: 372042

4 years ago[test] Clean up previous raw profile before merging into it
Vedant Kumar [Mon, 16 Sep 2019 22:32:18 +0000 (22:32 +0000)]
[test] Clean up previous raw profile before merging into it

This fixes a test failure in instrprof-set-file-object-merging.c which
seems to have been caused by reuse of stale data in old raw profiles.

llvm-svn: 372041

4 years ago[OPENMP]Fix the test, NFC.
Alexey Bataev [Mon, 16 Sep 2019 22:17:10 +0000 (22:17 +0000)]
[OPENMP]Fix the test, NFC.

llvm-svn: 372040

4 years ago[Modules][Objective-C] Use complete decl from module when diagnosing missing import
Bruno Cardoso Lopes [Mon, 16 Sep 2019 22:00:29 +0000 (22:00 +0000)]
[Modules][Objective-C] Use complete decl from module when diagnosing missing import

Summary:
Otherwise the definition (first found) for ObjCInterfaceDecl's might
precede the module one, which will eventually lead to crash, since
diagnoseMissingImport needs one coming from a module.

This behavior changed after Richard's r342018, which started to look
into the definition of ObjCInterfaceDecls.

rdar://problem/49237144

Reviewers: rsmith, arphaman

Subscribers: jkorous, dexonsmith, ributzka, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66982

llvm-svn: 372039

4 years ago[compiler-rt][crt] make test case nontrivial in check_cxx_section_exists
Jian Cai [Mon, 16 Sep 2019 21:47:47 +0000 (21:47 +0000)]
[compiler-rt][crt]  make test case nontrivial in check_cxx_section_exists

Summary:
.init_array gets optimized away when building with -O2 and as a result,
check_cxx_section_exists failed to pass -DCOMPILER_RT_HAS_INITFINI_ARRAY
when building crtbegin.o and crtend.o, which causes binaries linked with
them encounter segmentation fault. See https://crbug.com/855759 for
details. This change prevents .init_array section to be optimized away
even with -O2 or higher optimization level.

Subscribers: dberris, mgorny, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D67628

llvm-svn: 372038

4 years ago[clang-tidy] add checks to bugprone-posix-return
Jian Cai [Mon, 16 Sep 2019 21:43:56 +0000 (21:43 +0000)]
[clang-tidy] add checks to bugprone-posix-return

This check now also checks if any calls to pthread_* functions expect negative return values. These functions return either 0 on success or an errno on failure, which is positive only.

llvm-svn: 372037

4 years agoAdd a director, along with README.txt and LICENSE.txt, for libc.
David L. Jones [Mon, 16 Sep 2019 21:39:08 +0000 (21:39 +0000)]
Add a director, along with README.txt and LICENSE.txt, for libc.

llvm-svn: 372036

4 years ago[lit] Make internal diff work in pipelines
Joel E. Denny [Mon, 16 Sep 2019 21:22:29 +0000 (21:22 +0000)]
[lit] Make internal diff work in pipelines

When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:

```
 # RUN: program | diff file -
 # RUN: not diff file1 file2 | FileCheck %s
```

Such cases exist now, in `clang/test/Analysis` for example.  We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` cannot
currently be used in pipelines and doesn't recognize `-` as a
command-line option.

To enable pipelines, this patch moves lit's `diff` implementation into
an out-of-process script, similar to lit's `cat` implementation.  A
follow-up patch will implement `-` to mean stdin.

Reviewed By: probinson, stella.stamenova

Differential Revision: https://reviews.llvm.org/D66574

llvm-svn: 372035

4 years agoRevert "Implement std::condition_variable via pthread_cond_clockwait() where available"
Dan Albert [Mon, 16 Sep 2019 21:20:32 +0000 (21:20 +0000)]
Revert "Implement std::condition_variable via pthread_cond_clockwait() where available"

This reverts commit 5e37d7f9ff257ec62d733d3d94b11f03e0fe51ca.

llvm-svn: 372034

4 years ago[NFC] Test commit access
Bardia Mahjour [Mon, 16 Sep 2019 20:44:15 +0000 (20:44 +0000)]
[NFC] Test commit access

llvm-svn: 372033

4 years ago[Docs] Bug fix for docs homepage
DeForest Richards [Mon, 16 Sep 2019 20:29:56 +0000 (20:29 +0000)]
[Docs] Bug fix for docs homepage

Removes reference to non-existent Reference Documentation page.

llvm-svn: 372032

4 years ago[Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage
DeForest Richards [Mon, 16 Sep 2019 20:19:32 +0000 (20:19 +0000)]
[Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage

Adds a section for Getting Started/Tutorials and Reference topics to the LLVM docs homepage.

llvm-svn: 372031

4 years ago[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Lei Huang [Mon, 16 Sep 2019 20:04:15 +0000 (20:04 +0000)]
[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32

This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext of v2f32 to v2f64 from
extract_subvector we currently generate on P9 the following:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lower it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

llvm-svn: 372029

4 years ago[NFC] Move dumping into GDBRemotePacket
Jonas Devlieghere [Mon, 16 Sep 2019 20:02:57 +0000 (20:02 +0000)]
[NFC] Move dumping into GDBRemotePacket

This moves the dumping logic from the GDBRemoteCommunicationHistory
class into the GDBRemotePacket so that it can be reused from the
reproducer command object.

llvm-svn: 372028

4 years agoOpen fstream files in O_CLOEXEC mode when possible.
Dan Albert [Mon, 16 Sep 2019 19:26:41 +0000 (19:26 +0000)]
Open fstream files in O_CLOEXEC mode when possible.

Reviewers: EricWF, mclow.lists, ldionne

Reviewed By: ldionne

Subscribers: smeenai, dexonsmith, christof, ldionne, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D59839

llvm-svn: 372027

4 years agodo not emit -Wunused-macros warnings in -frewrite-includes mode (PR15614)
Lubos Lunak [Mon, 16 Sep 2019 19:18:37 +0000 (19:18 +0000)]
do not emit -Wunused-macros warnings in -frewrite-includes mode (PR15614)

-frewrite-includes calls PP.SetMacroExpansionOnlyInDirectives() to avoid
macro expansions that are useless in that mode, but this can lead
to -Wunused-macros false positives. As -frewrite-includes does not emit
normal warnings, block -Wunused-macros too.

Differential Revision: https://reviews.llvm.org/D65371

llvm-svn: 372026

4 years ago[Coverage] Speed up file-based queries for coverage info, NFC
Vedant Kumar [Mon, 16 Sep 2019 19:08:44 +0000 (19:08 +0000)]
[Coverage] Speed up file-based queries for coverage info, NFC

Speed up queries for coverage info in a file by reducing the amount of
time spent determining whether a function record corresponds to a file.

This gives a 36% speedup when generating a coverage report for `llc`.
The reduction is entirely in user time.

rdar://54758110

Differential Revision: https://reviews.llvm.org/D67575

llvm-svn: 372025

4 years ago[Coverage] Assert that filenames in a TU are unique, NFC
Vedant Kumar [Mon, 16 Sep 2019 19:08:41 +0000 (19:08 +0000)]
[Coverage] Assert that filenames in a TU are unique, NFC

llvm-svn: 372024

4 years ago[lld] Update lld driver to use new LTO APIs to handle libcall symbols
Steven Wu [Mon, 16 Sep 2019 18:49:57 +0000 (18:49 +0000)]
[lld] Update lld driver to use new LTO APIs to handle libcall symbols

NFC. Remove duplicated code in ELF/COFF driver and libLTO legacy
interfaces.

llvm-svn: 372022

4 years ago[LTO][Legacy] Add new C inferface to query libcall functions
Steven Wu [Mon, 16 Sep 2019 18:49:54 +0000 (18:49 +0000)]
[LTO][Legacy] Add new C inferface to query libcall functions

Summary:
This is needed to implemented the same approach as lld (implemented in r338434)
for how to handling symbols that can be generated by LTO code generator
but not present in the symbol table for linker that uses legacy C APIs.

libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.

rdar://problem/52853974

Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

Reviewed By: tejohnson

Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67568

llvm-svn: 372021

4 years ago[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
Reid Kleckner [Mon, 16 Sep 2019 18:49:09 +0000 (18:49 +0000)]
[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups

This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.

In general, instrumentation happens very early, and optimization and
inlining happens afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.

For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.

I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.

This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.

Reviewers: xur, hans

Differential Revision: https://reviews.llvm.org/D67579

llvm-svn: 372020

4 years ago[ARM][Codegen] Autogenerate arm-cgp-casts.ll test.
Roman Lebedev [Mon, 16 Sep 2019 18:28:22 +0000 (18:28 +0000)]
[ARM][Codegen] Autogenerate arm-cgp-casts.ll test.

Apparently it got broken by r372009 while i thought it was r372012.

llvm-svn: 372019

4 years ago[lldb] Remove SetCount/ClearCount from Flags
Raphael Isemann [Mon, 16 Sep 2019 18:02:49 +0000 (18:02 +0000)]
[lldb] Remove SetCount/ClearCount from Flags

Summary:
These functions are only used in tests where we should test the actual flag values instead of counting all bits for an approximate check.
Also these popcount implementation aren't very efficient and doesn't seem to be optimised to anything fast.

Reviewers: davide, JDevlieghere

Reviewed By: davide, JDevlieghere

Subscribers: abidh, JDevlieghere, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D67540

llvm-svn: 372018

4 years ago[lldb][NFC] Make ApplyObjcCastHack less scary
Raphael Isemann [Mon, 16 Sep 2019 18:02:21 +0000 (18:02 +0000)]
[lldb][NFC] Make ApplyObjcCastHack less scary

llvm-svn: 372017

4 years agoImplement std::condition_variable via pthread_cond_clockwait() where available
Dan Albert [Mon, 16 Sep 2019 17:57:48 +0000 (17:57 +0000)]
Implement std::condition_variable via pthread_cond_clockwait() where available

std::condition_variable is currently implemented via
pthread_cond_timedwait() on systems that use pthread. This is
problematic, since that function waits by default on CLOCK_REALTIME
and libc++ does not provide any mechanism to change from this
default.

Due to this, regardless of if condition_variable::wait_until() is
called with a chrono::system_clock or chrono::steady_clock parameter,
condition_variable::wait_until() will wait using CLOCK_REALTIME. This
is not accurate to the C++ standard as calling
condition_variable::wait_until() with a chrono::steady_clock parameter
should use CLOCK_MONOTONIC.

This is particularly problematic because CLOCK_REALTIME is a bad
choice as it is subject to discontinuous time adjustments, that may
cause condition_variable::wait_until() to immediately timeout or wait
indefinitely.

This change fixes this issue with a new POSIX function,
pthread_cond_clockwait() proposed on
http://austingroupbugs.net/view.php?id=1216. The new function is
similar to pthread_cond_timedwait() with the addition of a clock
parameter that allows it to wait using either CLOCK_REALTIME or
CLOCK_MONOTONIC, thus allowing condition_variable::wait_until() to
wait using CLOCK_REALTIME for chrono::system_clock and CLOCK_MONOTONIC
for chrono::steady_clock.

pthread_cond_clockwait() is implemented in glibc (2.30 and later) and
Android's bionic (Android API version 30 and later).

This change additionally makes wait_for() and wait_until() with clocks
other than chrono::system_clock use CLOCK_MONOTONIC.<Paste>

llvm-svn: 372016

4 years ago[Clang][Codegen] Disable arm_acle.c test.
Roman Lebedev [Mon, 16 Sep 2019 17:46:08 +0000 (17:46 +0000)]
[Clang][Codegen] Disable arm_acle.c test.

This test is broken by design. Clang codegen tests should not depend
on llvm middle-end behaviour, they should *only* test clang codegen.
Yet this test runs whole optimization pipeline.
I've really tried to fix it, but there isn't just a few things
that depend on passes, but everything there does.

llvm-svn: 372015

4 years ago[Clang][Codegen] Relax available-externally-suppress.c test
Roman Lebedev [Mon, 16 Sep 2019 17:46:01 +0000 (17:46 +0000)]
[Clang][Codegen] Relax available-externally-suppress.c test

That test is broken by design.
It depends on llvm middle-end behavior.
No clang codegen test should be doing that.
This one is salvageable by relaxing check lines.

llvm-svn: 372014

4 years ago[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Simon Pilgrim [Mon, 16 Sep 2019 17:30:33 +0000 (17:30 +0000)]
[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands

Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.

llvm-svn: 372013

4 years ago[ARM] A predicate cast of a predicate cast is a predicate cast
David Green [Mon, 16 Sep 2019 17:29:07 +0000 (17:29 +0000)]
[ARM] A predicate cast of a predicate cast is a predicate cast

The adds some very basic folding of PREDICATE_CASTS, removing cases when they
are chained together. These would already be removed eventually, as these are
lowered to copies. This just allows it to happen earlier, which can help other
simplifications.

Differential Revision: https://reviews.llvm.org/D67591

llvm-svn: 372012

4 years ago[OPENMP]Fix parsing/sema for function templates with declare simd.
Alexey Bataev [Mon, 16 Sep 2019 17:06:31 +0000 (17:06 +0000)]
[OPENMP]Fix parsing/sema for function templates with declare simd.

Need to return original declaration group with FunctionTemplateDecl, not
the inner FunctionDecl, to correctly handle parsing of directives with
the templates parameters.

llvm-svn: 372011

4 years ago[SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB...
Roman Lebedev [Mon, 16 Sep 2019 16:18:24 +0000 (16:18 +0000)]
[SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost

Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and other one had no instructions,
even thought that would have total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of it's inputs are available: https://godbolt.org/z/zgHePf
So i don't see why the existing behavior is the correct one.

Also, let's add it's own `cl::opt` for this threshold,
with default=4, so it is not stricter than the previous threshold:
will allow to fold when there are 2 BB's each with cost=2.
And since the logic has changed, it will also allow to fold when
one BB has cost=3 and other cost=1, or there is only one BB with cost=4.

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 â€“ Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has great effect on
target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

llvm-svn: 372009

4 years ago[clangd] Simplify semantic highlighting visitor
Ilya Biryukov [Mon, 16 Sep 2019 16:16:03 +0000 (16:16 +0000)]
[clangd] Simplify semantic highlighting visitor

Summary:
- Functions to compute highlighting kinds for things are separated from
  the ones that add highlighting tokens.
  This keeps each of them more focused on what they're doing: getting
  locations and figuring out the kind of the entity, correspondingly.

- Less special cases in visitor for various nodes.

This change is an NFC.

Reviewers: hokein

Reviewed By: hokein

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67341

llvm-svn: 372008

4 years ago[InstCombine] remove unneeded one-use checks for icmp fold
Sanjay Patel [Mon, 16 Sep 2019 16:15:25 +0000 (16:15 +0000)]
[InstCombine] remove unneeded one-use checks for icmp fold

Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

llvm-svn: 372007

4 years ago[InstCombine] move tests for icmp+add; NFC
Sanjay Patel [Mon, 16 Sep 2019 15:33:40 +0000 (15:33 +0000)]
[InstCombine] move tests for icmp+add; NFC

llvm-svn: 372004

4 years ago[ARM] Add patterns for BSWAP intrinsic on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:20:10 +0000 (15:20 +0000)]
[ARM] Add patterns for BSWAP intrinsic on MVE

BSWAP can use the VREV instruction on MVE to produce better results than
expanding.

llvm-svn: 372002

4 years ago[ARM] Add patterns for bitreverse intrinsic on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:20:03 +0000 (15:20 +0000)]
[ARM] Add patterns for bitreverse intrinsic on MVE

BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.

llvm-svn: 372001

4 years ago[ARM] Lower CTTZ on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:19:56 +0000 (15:19 +0000)]
[ARM] Lower CTTZ on MVE

Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and
count the leading zeros, equivalent to a count trailing zeros (CTTZ).

llvm-svn: 372000

4 years ago[ARM] Add patterns for CTLZ on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:19:49 +0000 (15:19 +0000)]
[ARM] Add patterns for CTLZ on MVE

CTLZ intrinsic can use the VCLS instruction on MVE, which produces
better results than expanding.

llvm-svn: 371999

4 years ago[ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 15:19:11 +0000 (15:19 +0000)]
[ExecutionEngine] Don't dereference a dyn_cast result. NFCI.

The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.

llvm-svn: 371998

4 years ago[libFuzzer] Remove unused version of FuzzedDataProvider.h.
Max Moroz [Mon, 16 Sep 2019 15:00:21 +0000 (15:00 +0000)]
[libFuzzer] Remove unused version of FuzzedDataProvider.h.

Summary: The actual version lives in compiler-rt/include/fuzzer/.

Reviewers: Dor1s

Reviewed By: Dor1s

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D67623

llvm-svn: 371997

4 years ago[LV] Add ARM MVE tail-folding tests
Sjoerd Meijer [Mon, 16 Sep 2019 14:56:26 +0000 (14:56 +0000)]
[LV] Add ARM MVE tail-folding tests

Now that the vectorizer can do tail-folding (rL367592), and the ARM backend
understands MVE masked loads/stores (rL371932), it's time to add the MVE
tail-folding equivalent of the X86 tests that I added.

llvm-svn: 371996

4 years ago[SystemZ] Call erase() on the right MBB in SystemZTargetLowering::emitSelect()
Jonas Paulsson [Mon, 16 Sep 2019 14:49:36 +0000 (14:49 +0000)]
[SystemZ]  Call erase() on the right MBB in SystemZTargetLowering::emitSelect()

Since MBB was split *before* MI, the MI(s) will reside in JoinMBB (MBB) at
the point of erasing them, so calling StartMBB->erase() is actually wrong,
although it is "working" by all appearances.

Review: Ulrich Weigand
llvm-svn: 371995

4 years ago[NFC] remove unused functions
Guillaume Chatelet [Mon, 16 Sep 2019 14:48:58 +0000 (14:48 +0000)]
[NFC] remove unused functions

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67616

llvm-svn: 371994

4 years agoAMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
Matt Arsenault [Mon, 16 Sep 2019 14:26:14 +0000 (14:26 +0000)]
AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source

This was producing an illegal copy which would hit an assert
later. Error on selection for now until this is implemented.

llvm-svn: 371993

4 years agoAMDGPU/GlobalISel: Fix some broken run lines
Matt Arsenault [Mon, 16 Sep 2019 14:14:40 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Fix some broken run lines

llvm-svn: 371992

4 years agoAMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
Matt Arsenault [Mon, 16 Sep 2019 14:14:37 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL

llvm-svn: 371991

4 years agoAMDGPU/GlobalISel: Remove another illegal select test
Matt Arsenault [Mon, 16 Sep 2019 14:14:31 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Remove another illegal select test

llvm-svn: 371990

4 years ago[X86][NFC] Add a `use-aa` feature.
Clement Courbet [Mon, 16 Sep 2019 14:05:28 +0000 (14:05 +0000)]
[X86][NFC] Add a `use-aa` feature.

Summary:
This allows enabling useaa on the command-line and will allow enabling the
feature on a per-CPU basis where benchmarking shows improvements.

This is modelled after the ARM/AArch64 target.

Reviewers: RKSimon, andreadb, craig.topper

Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67266

llvm-svn: 371989

4 years ago[InstCombine] add/move tests for icmp with add operand; NFC
Sanjay Patel [Mon, 16 Sep 2019 14:05:19 +0000 (14:05 +0000)]
[InstCombine] add/move tests for icmp with add operand; NFC

llvm-svn: 371988

4 years ago[clangd][vscode] update the development doc.
Haojian Wu [Mon, 16 Sep 2019 14:03:06 +0000 (14:03 +0000)]
[clangd][vscode] update the development doc.

llvm-svn: 371986

4 years agoMove some definitions from Sema to Basic to fix shared libs build
Erich Keane [Mon, 16 Sep 2019 13:58:59 +0000 (13:58 +0000)]
Move some definitions from Sema to Basic to fix shared libs build

r371875 moved some functionality around to a Basic header file, but
didn't move its definitions as well.  This patch moves some things
around so that shared library building can work.

llvm-svn: 371985

4 years ago[docs][llvm-strings] Write llvm-strings documentation
James Henderson [Mon, 16 Sep 2019 13:56:12 +0000 (13:56 +0000)]
[docs][llvm-strings] Write llvm-strings documentation

Previously we only had a stub document.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67554

llvm-svn: 371984

4 years ago[docs][llvm-size] Write llvm-size documentation
James Henderson [Mon, 16 Sep 2019 13:20:37 +0000 (13:20 +0000)]
[docs][llvm-size] Write llvm-size documentation

Previously we only had a stub document.

Reviewed by: serge-sans-paille, MaskRay

Differential Revision: https://reviews.llvm.org/D67555

llvm-svn: 371983

4 years ago[ARM] Fold VCMP into VPT
David Green [Mon, 16 Sep 2019 13:02:41 +0000 (13:02 +0000)]
[ARM] Fold VCMP into VPT

MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577

llvm-svn: 371982

4 years ago[InstCombine] remove unneeded one-use checks for icmp fold
Sanjay Patel [Mon, 16 Sep 2019 12:54:34 +0000 (12:54 +0000)]
[InstCombine] remove unneeded one-use checks for icmp fold

This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

llvm-svn: 371981

4 years ago[clangd] Bump vscode-clangd v0.0.17
Haojian Wu [Mon, 16 Sep 2019 12:51:07 +0000 (12:51 +0000)]
[clangd] Bump vscode-clangd v0.0.17

CHANGELOG:
- added semantic highlighting support (under the clangd.semanticHighlighting
  flag);
- better error message when clangd fails to execute refactoring-like
  actions;
- improved the readme doc;

llvm-svn: 371980

4 years ago[InstCombine] add icmp tests with extra uses; NFC
Sanjay Patel [Mon, 16 Sep 2019 12:19:18 +0000 (12:19 +0000)]
[InstCombine] add icmp tests with extra uses; NFC

llvm-svn: 371979

4 years ago[InstCombine] fix comments to match code; NFC
Sanjay Patel [Mon, 16 Sep 2019 12:12:05 +0000 (12:12 +0000)]
[InstCombine] fix comments to match code; NFC

This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

llvm-svn: 371978

4 years agogn build: Merge r371976
Nico Weber [Mon, 16 Sep 2019 11:33:54 +0000 (11:33 +0000)]
gn build: Merge r371976

llvm-svn: 371977

4 years agoImplement semantic selections.
Utkarsh Saxena [Mon, 16 Sep 2019 11:29:35 +0000 (11:29 +0000)]
Implement semantic selections.

Summary:
For a given cursor position, it returns ranges that are interesting to the user.
Currently the semantic ranges correspond to the nodes of the syntax trees.

Subscribers: mgorny, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67358

llvm-svn: 371976

4 years ago[VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 11:22:44 +0000 (11:22 +0000)]
[VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.

The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.

llvm-svn: 371975

4 years ago[SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference...
Simon Pilgrim [Mon, 16 Sep 2019 10:48:16 +0000 (10:48 +0000)]
[SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI.

llvm-svn: 371974

4 years ago[SLPVectorizer] Don't dereference a dyn_cast result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 10:35:09 +0000 (10:35 +0000)]
[SLPVectorizer] Don't dereference a dyn_cast result. NFCI.

The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.

llvm-svn: 371973

4 years agoAdded return statement to fix compile and build warning:
Sjoerd Meijer [Mon, 16 Sep 2019 10:30:37 +0000 (10:30 +0000)]
Added return statement to fix compile and build warning:

llvm-rtdyld.cpp:966:7: warning: variable â€˜Result’ set but not used

llvm-svn: 371972

4 years ago[clangd] Fix a crash when renaming operator.
Haojian Wu [Mon, 16 Sep 2019 10:16:56 +0000 (10:16 +0000)]
[clangd] Fix a crash when renaming operator.

Summary:
The renamelib uses a tricky way to calculate the end location by relying
on decl name, this is incorrect for the overloaded operator (the name is
"operator++" instead of "++"), which will cause out-of-file offset.

We also disable renaming operator symbol, this case is tricky, and
renamelib doesnt handle it properly.

Reviewers: ilya-biryukov

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67607

llvm-svn: 371971

4 years ago[ELF][ARM] Fix -Werror buildbots NFC.
Peter Smith [Mon, 16 Sep 2019 10:07:53 +0000 (10:07 +0000)]
[ELF][ARM] Fix -Werror buildbots NFC.

Provide a missing initializer to get rid of warning provoking buildbot
failures.

error: missing field 'rel' initializer
[-Werror,-Wmissing-field-initializers]

llvm-svn: 371970

4 years agoChange signature of __builtin_rotateright64 back to unsigned
Karl-Johan Karlsson [Mon, 16 Sep 2019 09:52:23 +0000 (09:52 +0000)]
Change signature of __builtin_rotateright64 back to unsigned

The signature of __builtin_rotateright64 was by misstake changed from
unsigned to signed in r360863, this patch will change it back to
unsigned as intended.

This fixes pr43309

Reviewers: efriedma, hans

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D67606

llvm-svn: 371969

4 years agoFix the rst doc, unbreak buildbot.
Haojian Wu [Mon, 16 Sep 2019 09:46:53 +0000 (09:46 +0000)]
Fix the rst doc, unbreak buildbot.

llvm-svn: 371968

4 years ago[SVE][Inline-Asm] Add constraints for SVE predicate registers
Kerry McLaughlin [Mon, 16 Sep 2019 09:45:27 +0000 (09:45 +0000)]
[SVE][Inline-Asm] Add constraints for SVE predicate registers

Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

llvm-svn: 371967

4 years agogn build: Merge r371965
Nico Weber [Mon, 16 Sep 2019 09:43:26 +0000 (09:43 +0000)]
gn build: Merge r371965

llvm-svn: 371966

4 years ago[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417
Peter Smith [Mon, 16 Sep 2019 09:38:38 +0000 (09:38 +0000)]
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417

The --fix-cortex-a8 option implements a linker workaround for the
coretex-a8 erratum 657417. A summary of the erratum conditions is:
- A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two
4KiB regions.
- The destination of the branch is to the first 4KiB region.
- The instruction before the branch is a 32-bit Thumb-2 non-branch
instruction.

The linker fix is to redirect the branch to a patch not in the first
4KiB region. The patch forwards the branch on to its target.

The cortex-a8, is an old CPU, with the first implementation of this
workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in
early Android Phones and there are some critical applications that still
need to run on a cortex-a8 that have the erratum. The patch is applied
roughly 10 times on LLD and 20 on Clang when they are built with
--fix-cortex-a8 on an Arm system.

The formal erratum description is avaliable in the ARM Core Cortex-A8
(AT400/AT401) Errata Notice document. This is available from Arm on
request but it seems to be findable via a web search.

Differential Revision: https://reviews.llvm.org/D67284

llvm-svn: 371965

4 years ago[clang-tidy] performance-inefficient-vector-operation: Support proto repeated field
Haojian Wu [Mon, 16 Sep 2019 08:54:10 +0000 (08:54 +0000)]
[clang-tidy] performance-inefficient-vector-operation: Support proto repeated field

Summary:
Finds calls that add element to protobuf repeated field in a loop
without calling Reserve() before the loop. Calling Reserve() first can avoid
unnecessary memory reallocations.

A new option EnableProto is added to guard this feature.

Patch by Cong Liu!

Reviewers: gribozavr, alexfh, hokein, aaron.ballman

Reviewed By: hokein

Subscribers: lebedev.ri, xazax.hun, Eugene.Zelenko, cfe-commits

Tags: #clang, #clang-tools-extra

Differential Revision: https://reviews.llvm.org/D67135

llvm-svn: 371963

4 years ago[test] Add -z separate-code to fix tests that ae sensitive to exact addresses after...
Fangrui Song [Mon, 16 Sep 2019 07:52:30 +0000 (07:52 +0000)]
[test] Add -z separate-code to fix tests that ae sensitive to exact addresses after r371958

llvm-svn: 371962

4 years agogn build: Merge r371959
Nico Weber [Mon, 16 Sep 2019 07:34:23 +0000 (07:34 +0000)]
gn build: Merge r371959

llvm-svn: 371961

4 years ago[AArch64] Some more FP16 FMA pattern matching
Sjoerd Meijer [Mon, 16 Sep 2019 07:32:13 +0000 (07:32 +0000)]
[AArch64] Some more FP16 FMA pattern matching

After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.

Differential Revision: https://reviews.llvm.org/D67576

llvm-svn: 371960

4 years ago[SystemZ] Merge the SystemZExpandPseudo pass into SystemZPostRewrite.
Jonas Paulsson [Mon, 16 Sep 2019 07:29:37 +0000 (07:29 +0000)]
[SystemZ]  Merge the SystemZExpandPseudo pass into SystemZPostRewrite.

SystemZExpandPseudo:s only job was to expand LOCRMux instructions into jump
sequences. This needs to be done if expandLOCRPseudo() or expandSELRPseudo()
fails to find a legal opcode (all registers "high" or "low"). This task has
now been moved to SystemZPostRewrite while removing the SystemZExpandPseudo
pass.

It is in fact preferred to expand these pseudos directly after register
allocation in SystemZPostRewrite since the hinted register combinations are
then not subject to later optimizations.

Review: Ulrich Weigand
https://reviews.llvm.org/D67432

llvm-svn: 371959

4 years ago[ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_X86_64
Fangrui Song [Mon, 16 Sep 2019 07:05:34 +0000 (07:05 +0000)]
[ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_X86_64

Port the D64906 technique to EM_X86_64.

Differential Revision: https://reviews.llvm.org/D67482

llvm-svn: 371958

4 years ago[ELF] Map the ELF header at imageBase
Fangrui Song [Mon, 16 Sep 2019 07:04:16 +0000 (07:04 +0000)]
[ELF] Map the ELF header at imageBase

If there is no readonly section, we map:

* The ELF header at imageBase+maxPageSize
* Program headers at imageBase+maxPageSize+sizeof(Ehdr)
* The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)

Due to the interaction between Writer<ELFT>::fixSectionAlignments and
LinkerScript::allocateHeaders,
`alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`.
The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal:

```
// PHDR at 0x401034, should be 0x400034
  PHDR           0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R   0x4
// R PT_LOAD contains just Ehdr and program headers.
// At 0x401000, should be 0x400000
  LOAD           0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R   0x1000
  LOAD           0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000
```

* createPhdrs allocates the headers to the R PT_LOAD.
* fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text
* allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text)
* allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize`

The main observation is that when the SECTIONS command is not used, we
don't have to call allocateHeaders. This requires an assumption that
the presence of PT_PHDR and addresses of headers can be decided
regardless of address information.

This may seem natural because dot is not manipulated by a linker script.
The other thing is that we have to drop the special rule for -T<section>
in `getInitialDot`. If -Ttext is smaller than the image base, the headers
will not be allocated with the old behavior (allocateHeaders is called)
but always allocated with the new behavior.

The behavior change is not a problem. Whether and where headers are
allocated can vary among linkers, or ld.bfd across different versions
(--enable-separate-code or not). It is thus advised to use a linker
script with the PHDRS command to have a consistent behavior across
linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler
alternative.

Differential Revision: https://reviews.llvm.org/D67325

llvm-svn: 371957