platform/upstream/llvm.git
3 years ago[CUDA][HIP] Fix linkage for -fgpu-rdc
Yaxun (Sam) Liu [Wed, 28 Oct 2020 14:44:21 +0000 (10:44 -0400)]
[CUDA][HIP] Fix linkage for -fgpu-rdc

Currently for explicit template function instantiation in CUDA/HIP device
compilation clang emits instantiated kernel with external linkage
and instantiated device function with internal linkage.

This is fine for -fno-gpu-rdc since there is only one TU.

However, this causes duplicate symbols for kernels with -fgpu-rdc if
the same instantiation happens in multiple TUs, or missing symbols
if a device function calls an explicitly instantiated template function
in a different TU.

To make explicit template function instantiation work for
-fgpu-rdc we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
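
As a point of comparison, ordinary host C++ already works this way: clang emits an explicit instantiation definition as a weak_odr symbol, so a caller in another TU can link against it. A minimal sketch of the source pattern follows; the names `scale` and `use_scale` are hypothetical and not taken from the patch, and the device-side attributes are omitted so the example compiles as plain C++.

```cpp
#include <cassert>

// Hypothetical function template standing in for a device function template.
template <typename T>
T scale(T x, T factor) {
  return x * factor;
}

// Explicit instantiation definition. In host C++ clang emits this symbol with
// weak_odr linkage, which is the behavior the patch adopts for -fgpu-rdc.
template int scale<int>(int, int);

// A caller that could just as well live in a different TU and see only a
// declaration of the template; it then relies on the symbol emitted above.
int use_scale(int x) { return scale<int>(x, 3); }
```

With internal linkage on the instantiation, a caller in another TU would have nothing to link against, which is the missing-symbol case described above.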

Differential Revision: https://reviews.llvm.org/D90311

3 years ago[InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator
Roman Lebedev [Tue, 3 Nov 2020 12:32:31 +0000 (15:32 +0300)]
[InstCombine] Perform  C-(X+C2) --> (C-C2)-X  transform before using Negator

In particular, it makes it fire for C=0, because negator doesn't want
to perform that fold since in general it's not beneficial.
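
The algebra behind the fold can be checked with wrapping 32-bit arithmetic. This is only an illustrative sketch: the function names `before` and `after` are hypothetical, and unsigned C++ integers stand in for LLVM's two's-complement `sub`/`add`.

```cpp
#include <cassert>
#include <cstdint>

// Original form: C - (X + C2), evaluated with wrapping 32-bit arithmetic.
constexpr std::uint32_t before(std::uint32_t c, std::uint32_t x, std::uint32_t c2) {
  return c - (x + c2);
}

// Folded form: (C - C2) - X. The constant part (C - C2) can be computed at
// compile time, leaving a single subtraction.
constexpr std::uint32_t after(std::uint32_t c, std::uint32_t x, std::uint32_t c2) {
  return (c - c2) - x;
}

// The C=0 case mentioned in the commit message.
static_assert(before(0, 5, 3) == after(0, 5, 3), "fold must preserve the value");
```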

3 years ago[InstCombine] Negator: - (C - %x) --> %x - C (PR47997)
Roman Lebedev [Tue, 3 Nov 2020 10:34:38 +0000 (13:34 +0300)]
[InstCombine] Negator: - (C - %x) --> %x - C (PR47997)

This relaxes one-use restriction on that `sub` fold,
since apparently the addition of Negator broke
preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.

3 years ago[NFC][InstCombine] Negator: add test coverage for `(?? - (%y + C))` pattern (PR47997)
Roman Lebedev [Tue, 3 Nov 2020 11:06:42 +0000 (14:06 +0300)]
[NFC][InstCombine] Negator: add test coverage for `(?? - (%y + C))` pattern (PR47997)

3 years ago[NFC][InstCombine] Negator: add test coverage for `(?? - (C - %y))` pattern (PR47997)
Roman Lebedev [Tue, 3 Nov 2020 11:04:36 +0000 (14:04 +0300)]
[NFC][InstCombine] Negator: add test coverage for `(?? - (C - %y))` pattern (PR47997)

3 years ago[NFC][InstCombine] Add test coverage for PR47997
Roman Lebedev [Tue, 3 Nov 2020 10:18:20 +0000 (13:18 +0300)]
[NFC][InstCombine] Add test coverage for PR47997

3 years ago[SCCP] Handle bitcast of vector constants.
Florian Hahn [Tue, 3 Nov 2020 10:32:38 +0000 (10:32 +0000)]
[SCCP] Handle bitcast of vector constants.

Vectors where all elements have the same known constant range are treated as a
single constant range in the lattice. When bitcasting such vectors, there is a
mis-match between the width of the lattice value (single constant range) and
the original operands (vector). Go to overdefined in that case.

Fixes PR47991.

3 years ago[ARM] Remove unused variable. NFC
David Green [Tue, 3 Nov 2020 12:58:10 +0000 (12:58 +0000)]
[ARM] Remove unused variable. NFC

3 years ago[OpenMP][libomptarget][Tests] fix failing test
Joachim Protze [Tue, 3 Nov 2020 11:31:05 +0000 (12:31 +0100)]
[OpenMP][libomptarget][Tests] fix failing test

D88149 updated `omp_get_initial_device` behavior to conform with OpenMP 5.1.
omp_get_initial_device() == omp_get_num_devices()

3 years ago[OpenMP][OMPT][NFC] Fix flaky test
Joachim Protze [Mon, 2 Nov 2020 15:43:12 +0000 (16:43 +0100)]
[OpenMP][OMPT][NFC] Fix flaky test

As reported by @ronlieb, the test fails intermittently.
The test failed if the dependent task had already finished by the time the
depending task was created. We have other tests that check for the dependences pair.

3 years ago[OpenMP][Tool] Handle detached tasks in Archer
Joachim Protze [Fri, 30 Oct 2020 08:36:07 +0000 (09:36 +0100)]
[OpenMP][Tool] Handle detached tasks in Archer

Since detached tasks are supported by clang and the OpenMP runtime, Archer
must expect to receive the corresponding callbacks.

This patch adds support to interpret the synchronization semantics of
omp_fulfill_event and cleans up the handling of task switches.

3 years agoRevert "[CodeGen] [WinException] Only produce handler data at the end of the function...
Hans Wennborg [Tue, 3 Nov 2020 12:01:55 +0000 (13:01 +0100)]
Revert "[CodeGen] [WinException] Only produce handler data at the end of the function if needed"

This caused an explosion in ICF times during linking on Windows when libfuzzer
instrumentation is enabled. For a small binary we see ICF time go from ~0 to
~10 s. For a large binary it goes from ~1 s to forever (I gave up after 30
minutes).

See comment on the code review.

> If we are going to write handler data (that is written as variable
> length data following after the unwind info in .xdata), we need to
> emit the handler data immediately, but for cases where no such
> info is going to be written, skip emitting it right away. (Unwind
> info for all remaining functions that hasn't gotten it emitted
> directly is emitted at the end.)
>
> This does slightly change the ordering of sections (triggering a
> bunch of updates to DebugInfo/COFF tests), but the change should be
> benign.
>
> This also matches GCC's assembly output, which doesn't output
> .seh_handlerdata unless it actually is needed.
>
> For ARM64, the unwind info can be packed into the runtime function
> entry itself (leaving no data in the .xdata section at all), but
> that can only be done if there's no follow-on data in the .xdata
> section. If emission of the unwind info is triggered via
> EmitWinEHHandlerData (or the .seh_handlerdata directive), which
> implicitly switches to the .xdata section, there's a chance of the
> caller wanting to pass further data there, so the packed format
> can't be used in that case.
>
> Differential Revision: https://reviews.llvm.org/D87448

This reverts commit 36c64af9d7f97414d48681b74352c9684077259b.

3 years ago[JITLink][ELF] Implement R_X86_64_PLT32 relocations
Stefan Gränitz [Tue, 3 Nov 2020 12:05:38 +0000 (12:05 +0000)]
[JITLink][ELF] Implement R_X86_64_PLT32 relocations

Basic implementation for call and jmp branches with 32 bit offset. Branches to local targets produce
Branch32 edges that are resolved like regular PCRel32 relocations. Branches to external (undefined)
targets produce Branch32ToStub edges and go through a PLT entry by default. If the target happens to
get resolved within the 32 bit range from the callsite, the edge is relaxed during post-allocation
optimization. There is a test for each of these cases.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D90331

3 years ago[clang-tidy] adding "--config-file=<file-path>" to specify custom config file.
Hiral Oza [Tue, 3 Nov 2020 10:11:44 +0000 (10:11 +0000)]
[clang-tidy] adding "--config-file=<file-path>" to specify custom config file.

Let clang-tidy read config from the specified file.
Example:
$ clang-tidy --config-file=/some/path/myTidyConfig --list-checks --
...this will read config from '/some/path/myTidyConfig'.

ClangTidyMain.cpp reads ConfigFile into a string and then assigns the data to 'Config', i.e. it follows the same code flow as '--config' internally.

May speed up clang-tidy runtime, since it now just looks up <file-path>
instead of searching for ".clang-tidy" in parent directories.

Directly specifying the config path helps with setting up build dependencies.

Thanks to @DmitryPolukhin for the valuable suggestion. This patch now proposes
changes only in ClangTidyMain.cpp.

Reviewed By: DmitryPolukhin

Differential Revision: https://reviews.llvm.org/D89936

3 years agoFix 'default label in switch which covers all enumeration values' warning
serge-sans-paille [Tue, 3 Nov 2020 11:57:44 +0000 (12:57 +0100)]
Fix 'default label in switch which covers all enumeration values' warning

3 years ago[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops
David Green [Tue, 3 Nov 2020 11:53:09 +0000 (11:53 +0000)]
[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops

If an instruction will be lowered to a call, there is no advantage in
using a low overhead loop, as the LR register will need to be spilled and
reloaded around the call, and the low overhead will end up being
reverted. This teaches our hardware loop lowering that these memory
intrinsics will be calls under certain situations.

Differential Revision: https://reviews.llvm.org/D90439

3 years ago[ARM] Low overhead loop memcpy lowering test. NFC
David Green [Tue, 3 Nov 2020 11:44:50 +0000 (11:44 +0000)]
[ARM] Low overhead loop memcpy lowering test. NFC

3 years ago[AArch64][SVE] NFC: Guard all SVE tests for TypeSize warnings.
Sander de Smalen [Tue, 3 Nov 2020 10:25:06 +0000 (10:25 +0000)]
[AArch64][SVE] NFC: Guard all SVE tests for TypeSize warnings.

This patch adds a bunch of CHECK lines to guard against implicit
conversions of TypeSize -> uint64_t occurring in code-paths that previously
were safe for scalable vectors.

3 years ago[MLIR] Added test operations to replace linalg dependency for
Alexander Bosch [Tue, 13 Oct 2020 14:02:21 +0000 (16:02 +0200)]
[MLIR] Added test operations to replace linalg dependency for
BufferizeTests.

Summary:
Added test operations to replace the LinalgDialect dependency in tests
which use the buffer-deallocation, buffer-hoisting,
buffer-loop-hoisting, promote-buffers-to-stack,
buffer-placement-preparation-allowed-memref-results and
buffer-placement-preparation passes. Adapted the corresponding test cases
and TestBufferPlacement.cpp.

Differential Revision: https://reviews.llvm.org/D90037

3 years agoMake the implicit nesting behavior of the PassManager user-controllable and default...
Mehdi Amini [Tue, 3 Nov 2020 11:17:39 +0000 (11:17 +0000)]
Make the implicit nesting behavior of the PassManager user-controllable and default to false

This behavior is error-prone: I frequently have ~20 min debugging sessions when I hit
an unexpected implicit nesting. This default makes the C++ API safer for users.

Depends On D90669

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D90671

3 years agoHandle the verifier at run() time in the PassManager instead of build time
Mehdi Amini [Tue, 3 Nov 2020 11:17:01 +0000 (11:17 +0000)]
Handle the verifier at run() time in the PassManager instead of build time

This simplifies a few parts of the pass manager. In particular, we no longer add as many
verifier passes as there are passes in the pipeline, and we can now enable/disable the
verifier after the fact on an already built PassManager.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D90669

3 years agoChange the PrintOpStatsPass to operate on any operation instead of just ModuleOp
Mehdi Amini [Tue, 3 Nov 2020 05:00:43 +0000 (05:00 +0000)]
Change the PrintOpStatsPass to operate on any operation instead of just ModuleOp

This allows using it on other operations, such as a GPUModule.

3 years agoRemove mlir-c/Core.h which is superseded by the new API in mlir-c/IR.h
Mehdi Amini [Mon, 2 Nov 2020 21:27:28 +0000 (21:27 +0000)]
Remove mlir-c/Core.h which is superseded by the new API in mlir-c/IR.h

This header was an early attempt at a crude C API for bindings,
but it isn't used and is redundant with the new API. At this point it only
contributes to more confusion.

Differential Revision: https://reviews.llvm.org/D90643

3 years agoAdd test missing from previous commit
Stephen Kelly [Mon, 2 Nov 2020 23:30:52 +0000 (23:30 +0000)]
Add test missing from previous commit

3 years ago[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel...
Simon Pilgrim [Tue, 3 Nov 2020 10:49:33 +0000 (10:49 +0000)]
[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts

The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.
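
For readers unfamiliar with the pattern, a guarded funnel shift in source form can be sketched as below. This is an illustration under stated assumptions, not the code the pass matches: `fshl32` and `guarded_fshl32` are hypothetical names, the width is fixed at 32 bits, and the shift amount is assumed to be less than 32. The guard exists because shifting a 32-bit value by 32 is undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of a 32-bit funnel shift left: concatenate hi:lo,
// shift the 64-bit value left by (amt % 32), and keep the upper 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, std::uint32_t amt) {
  return static_cast<std::uint32_t>(
      (((static_cast<std::uint64_t>(hi) << 32) | lo) << (amt % 32)) >> 32);
}

// Guarded source pattern (valid for amt < 32). The amt == 0 branch avoids the
// undefined shift `lo >> 32`. When hi == lo this is the rotate special case.
constexpr std::uint32_t guarded_fshl32(std::uint32_t hi, std::uint32_t lo, std::uint32_t amt) {
  return amt == 0 ? hi : (hi << amt) | (lo >> (32 - amt));
}

static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au, "fshl by 8");
```

The two functions agree for every shift amount below 32, which is why the guarded form can be replaced by a single funnel shift.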

Differential Revision: https://reviews.llvm.org/D90625

3 years ago[mlir] Convert `memref_reshape` to LLVM.
Alexander Belyaev [Tue, 3 Nov 2020 10:36:14 +0000 (11:36 +0100)]
[mlir] Convert `memref_reshape` to LLVM.

https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15

Differential Revision: https://reviews.llvm.org/D90377

3 years ago[SLP] Pass VecPred argument to getCmpSelInstrCost.
Florian Hahn [Tue, 3 Nov 2020 09:55:47 +0000 (09:55 +0000)]
[SLP] Pass VecPred argument to getCmpSelInstrCost.

Check if all compares in VL have the same predicate and pass it to
getCmpSelInstrCost, to improve cost-modeling on targets that only
support compare/select combinations for certain uniform predicates.

This leads to additional vectorization in some cases

```
Same hash: 217 (filtered out)
Remaining: 19
Metric: SLP.NumVectorInstructions

Program                                        base    slp2    diff
 test-suite...marks/SciMark2-C/scimark2.test    11.00   26.00  136.4%
 test-suite...T2006/445.gobmk/445.gobmk.test    79.00  135.00  70.9%
 test-suite...ediabench/gsm/toast/toast.test    54.00   71.00  31.5%
 test-suite...telecomm-gsm/telecomm-gsm.test    54.00   71.00  31.5%
 test-suite...CI_Purple/SMG2000/smg2000.test   426.00  542.00  27.2%
 test-suite...ch/g721/g721encode/encode.test    30.00   24.00  -20.0%
 test-suite...000/186.crafty/186.crafty.test   116.00  138.00  19.0%
 test-suite...ications/JM/ldecod/ldecod.test   697.00  765.00   9.8%
 test-suite...6/464.h264ref/464.h264ref.test   822.00  886.00   7.8%
 test-suite...chmarks/MallocBench/gs/gs.test   154.00  162.00   5.2%
 test-suite...nsumer-lame/consumer-lame.test   621.00  651.00   4.8%
 test-suite...lications/ClamAV/clamscan.test   223.00  231.00   3.6%
 test-suite...marks/7zip/7zip-benchmark.test   680.00  695.00   2.2%
 test-suite...CFP2000/177.mesa/177.mesa.test   2121.00 2129.00  0.4%
 test-suite...:: External/Povray/povray.test   2406.00 2412.00  0.2%
 test-suite...TimberWolfMC/timberwolfmc.test   634.00  634.00   0.0%
 test-suite...CFP2006/433.milc/433.milc.test   1036.00 1036.00  0.0%
 test-suite.../Benchmarks/nbench/nbench.test   321.00  321.00   0.0%
 test-suite...ctions-flt/Reductions-flt.test    NaN      5.00   nan%
```
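
The uniform-predicate check described in this commit can be sketched as follows. The enum `Pred` and the function `commonPredicate` are hypothetical stand-ins chosen for illustration; they are not the types or helpers used in the SLP vectorizer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical predicate enum; Invalid means "no single predicate applies".
enum class Pred { Invalid, EQ, NE, SLT, SGT };

// Returns the predicate shared by every compare in the bundle, or
// Pred::Invalid if the bundle is empty or the predicates differ.
Pred commonPredicate(const std::vector<Pred> &bundle) {
  if (bundle.empty())
    return Pred::Invalid;
  for (Pred p : bundle)
    if (p != bundle.front())
      return Pred::Invalid;
  return bundle.front();
}
```

A cost query would then receive the shared predicate when one exists and fall back to the invalid value otherwise.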

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D90124

3 years ago[lld] missing doc entry for error handling script
serge-sans-paille [Tue, 3 Nov 2020 10:14:38 +0000 (11:14 +0100)]
[lld] missing doc entry for error handling script

Fix http://lab.llvm.org:8011/#/builders/69/builds/67

3 years ago[AArch64] Redundant masks in downcast long multiply
Nicholas Guy [Thu, 22 Oct 2020 12:41:05 +0000 (13:41 +0100)]
[AArch64] Redundant masks in downcast long multiply

Adds patterns to catch masks preceding a long multiply
and generate a single umull/smull instruction instead.

Differential revision: https://reviews.llvm.org/D89956

3 years agoProvide a hook to customize missing library error handling
serge-sans-paille [Mon, 19 Oct 2020 11:19:52 +0000 (13:19 +0200)]
Provide a hook to customize missing library error handling

Make it possible for lld users to provide a custom script that would help to
find missing libraries. A possible scenario could be:

    % clang /tmp/a.c -fuse-ld=lld -loauth -Wl,--error-handling-script=/tmp/addLibrary.py
    unable to find library -loauth
    looking for relevant packages to provides that library

        liboauth-0.9.7-4.el7.i686
        liboauth-devel-0.9.7-4.el7.i686
        liboauth-0.9.7-4.el7.x86_64
        liboauth-devel-0.9.7-4.el7.x86_64
        pix-1.6.1-3.el7.x86_64

Here addLibrary.py would be called with the missing library name as its first
argument (in this case, addLibrary.py oauth).

Differential Revision: https://reviews.llvm.org/D87758

3 years ago[CostModel] Make target intrinsics cheap by default
David Green [Tue, 3 Nov 2020 09:58:28 +0000 (09:58 +0000)]
[CostModel] Make target intrinsics cheap by default

This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.

Differential Revision: https://reviews.llvm.org/D90597

3 years ago[NFCI] Add StackOffset class and base classes for ElementCount, TypeSize.
Sander de Smalen [Mon, 2 Nov 2020 21:32:57 +0000 (21:32 +0000)]
[NFCI] Add StackOffset class and base classes for ElementCount, TypeSize.

This patch adds a linear polynomial base class, called LinearPolyBase, which
serves as a base class for StackOffset. It tries to represent a linear
polynomial like:

  c0 * scale0 + c1 * scale1 + ... + cK * scaleK

where the scale is implicit, meaning that only the coefficients are
encoded.

This patch also adds a univariate linear polynomial, which serves as
a base class for ElementCount and TypeSize. This tries to represent a
linear polynomial where only one dimension can be set at any one time,
i.e. a TypeSize is either fixed-sized, or scalable-sized, but cannot be
a combination of the two.

  class LinearPolyBase
     ^
     |
     +---- class StackOffset  (dimensions = 2 (fixed/scalable), type = int64_t)

  class UnivariateLinearPolyBase
     |
     |
     +---- class LinearPolySize (dimensions = 2 (fixed/scalable))
                  ^
                  |
                  +-------- class ElementCount  (type = unsigned)
                  |
                  |
                  +-------- class TypeSize      (type = uint64_t)
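
A minimal sketch of the two-dimensional case, assuming the implicit scales are 1 for the fixed dimension and a runtime multiplier (called `vscale` here) for the scalable one. `OffsetSketch` and `resolve` are hypothetical names for illustration and do not reproduce the interface of the StackOffset class added by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Only the coefficients are stored; the scales (1 and vscale) are implicit.
struct OffsetSketch {
  std::int64_t fixed;     // coefficient of the fixed dimension (scale = 1)
  std::int64_t scalable;  // coefficient of the scalable dimension (scale = vscale)
};

// Addition is component-wise, as for any linear polynomial.
OffsetSketch operator+(OffsetSketch a, OffsetSketch b) {
  return OffsetSketch{a.fixed + b.fixed, a.scalable + b.scalable};
}

// Evaluate c0 * 1 + c1 * vscale once the runtime multiplier is known.
std::int64_t resolve(OffsetSketch o, std::int64_t vscale) {
  return o.fixed + o.scalable * vscale;
}
```

The univariate case used for ElementCount and TypeSize would differ in that only one of the two coefficients may be non-zero at a time.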

Reviewed By: ctetreau, david-arm

Differential Revision: https://reviews.llvm.org/D88982

3 years ago[LLDB][NFC] treat Lua error codes in a more explicit manner
Pedro Tammela [Sun, 1 Nov 2020 16:23:36 +0000 (16:23 +0000)]
[LLDB][NFC] treat Lua error codes in a more explicit manner

This patch is a minor suggestion to not rely on the fact
that the `LUA_OK` macro is 0.

This assumption could change in future versions of the C API.

Differential Revision: https://reviews.llvm.org/D90556

3 years ago[mlir] Add to shape.is_broadcastable description
Tres Popp [Tue, 3 Nov 2020 09:23:54 +0000 (10:23 +0100)]
[mlir] Add to shape.is_broadcastable description

3 years ago[mlir] Add partial lowering of shape.cstr_broadcastable.
Tres Popp [Tue, 13 Oct 2020 15:56:45 +0000 (17:56 +0200)]
[mlir] Add partial lowering of shape.cstr_broadcastable.

Because cstr operations allow more instruction reordering than asserts, we only
lower cstr_broadcastable to std ops with cstr_require. This ensures that the
more drastic lowering to asserts happens only when the user explicitly requests it.

Differential Revision: https://reviews.llvm.org/D89325

3 years ago[lldb] [Plugins/FreeBSDRemote] Disable GetMemoryRegionInfo()
Michał Górny [Mon, 2 Nov 2020 22:48:55 +0000 (23:48 +0100)]
[lldb] [Plugins/FreeBSDRemote] Disable GetMemoryRegionInfo()

Disable GetMemoryRegionInfo() in order to unbreak expression parsing.
For some reason, the presence of a non-stub function causes LLDB to fail
to detect system libraries correctly.  Because LLDB is then unable to find
mmap() and allocate memory, the expression parser ends up
broken.

The issue is non-trivial and it is going to require more time debugging.
On the other hand, the downsides of missing the function are minimal
(2 failing tests), and the benefit of working expression parser
justifies disabling it temporarily.  Furthermore, the old FreeBSD plugin
did not implement it anyway, so it allows us to switch to the new plugin
without major regressions.

The really curious part is that the respective code in the NetBSD plugin
yields very similar results, yet does not seem to break the expression
parser.

Differential Revision: https://reviews.llvm.org/D90650

3 years ago[lldb] [Process/FreeBSDRemote] Remove GetSharedLibraryInfoAddress override
Michał Górny [Mon, 2 Nov 2020 16:01:40 +0000 (17:01 +0100)]
[lldb] [Process/FreeBSDRemote] Remove GetSharedLibraryInfoAddress override

Remove the NetBSD-specific override of GetSharedLibraryInfoAddress(),
restoring the generic implementation from NativeProcessELF.

Differential Revision: https://reviews.llvm.org/D90620

3 years ago[lldb] [Process/FreeBSDRemote] Fix attaching via lldb-server
Michał Górny [Sat, 31 Oct 2020 09:30:13 +0000 (10:30 +0100)]
[lldb] [Process/FreeBSDRemote] Fix attaching via lldb-server

Fix two bugs that caused attaching to a process in a pre-connected
lldb-server to fail.  These are:

1. Prematurely reporting status in NativeProcessFreeBSD::Attach().
   The SetState() call defaulted to notify the process, and LLGS tried
   to send the stopped packet before the process instance was assigned
   to it.  While at it, add an assert for that in LLGS.

2. Duplicate call to ReinitializeThreads() (via SetupTrace()) that
   overwrote the stopped status in threads.  Now SetupTrace() is called
   directly by NativeProcessFreeBSD::Attach() (not the Factory) in place
   of ReinitializeThreads().

This fixes at least commands/process/attach/TestProcessAttach.py
and python_api/hello_world/TestHelloWorld.py.

Differential Revision: https://reviews.llvm.org/D90525

3 years ago[lldb] [Host/{free,net}bsd] Fix process matching by name
Michał Górny [Fri, 30 Oct 2020 11:00:45 +0000 (12:00 +0100)]
[lldb] [Host/{free,net}bsd] Fix process matching by name

Fix process matching by name to make 'process attach -n ...' work.

The process finding code has an optimization that defers getting
the process name and executable format until after the numeric (PID, UID...)
parameters are tested.  However, the ProcessInstanceInfoMatch.Matches()
method has been matching the process name against the incomplete process
information as well, so effectively no process ever matched.

In order to fix this, create a copy of ProcessInstanceInfoMatch, set
it to ignore the process name and use this copy for the initial match.
The same fix applies to FreeBSD and NetBSD host code.
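
The two-phase match can be sketched as follows. All names here (`ProcInfoSketch`, `MatchSketch`, `findMatch`) are hypothetical and greatly simplified; they only illustrate the idea of matching numeric fields against a name-ignoring copy first, and do not mirror the LLDB classes.

```cpp
#include <cassert>
#include <string>

struct ProcInfoSketch {
  int pid;
  int uid;
  std::string name;  // empty until the (expensive) name lookup has run
};

struct MatchSketch {
  int uid;           // required user id
  std::string name;  // required process name; empty means "any name"

  bool matches(const ProcInfoSketch &p) const {
    return p.uid == uid && (name.empty() || p.name == name);
  }
};

// Phase 1 tests only the numeric fields using a copy of the matcher that
// ignores the name; phase 2 runs the full match once the name is known.
bool findMatch(const MatchSketch &m, int pid, int uid, const std::string &fetchedName) {
  MatchSketch numericOnly = m;
  numericOnly.name.clear();
  ProcInfoSketch partial{pid, uid, std::string()};
  if (!numericOnly.matches(partial))
    return false;
  ProcInfoSketch full{pid, uid, fetchedName};
  return m.matches(full);
}
```

Matching the original `m` against `partial` would always fail whenever a name is requested, which is the bug described above.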

Differential Revision: https://reviews.llvm.org/D90454

3 years ago[lldb] [Process/FreeBSDRemote] Implement thread GetName()
Michał Górny [Wed, 28 Oct 2020 11:40:45 +0000 (12:40 +0100)]
[lldb] [Process/FreeBSDRemote] Implement thread GetName()

Implement NativeThreadFreeBSD::GetName().  This is based
on the equivalent code in the legacy FreeBSD plugin, except it is
modernized a bit to use llvm::Optional and std::vector for data storage.

Differential Revision: https://reviews.llvm.org/D90298

3 years ago[llvm-readobj/libObject] - Allow dumping objects that has a broken SHT_SYMTAB_SHNDX...
Georgii Rymar [Tue, 13 Oct 2020 13:17:04 +0000 (16:17 +0300)]
[llvm-readobj/libObject] - Allow dumping objects that has a broken SHT_SYMTAB_SHNDX section.

Currently it is impossible to create an instance of ELFObjectFile when the
SHT_SYMTAB_SHNDX section can't be read. We error out when we fail to parse the
SHT_SYMTAB_SHNDX section in the factory method.

This change delays reading of the SHT_SYMTAB_SHNDX section entries,
so that llvm-readobj is now able to work with such inputs.

Differential revision: https://reviews.llvm.org/D89379

3 years agoAMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner
Petar Avramovic [Tue, 3 Nov 2020 08:23:56 +0000 (09:23 +0100)]
AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner

Change match/apply functions into methods of a new target-specific combiner
helper class. Use the reference to MachineIRBuilder from the helper instead of
constructing a new MachineIRBuilder each time a new instruction needs to be made.
This allows correct tracking of newly created instructions.

Differential Revision: https://reviews.llvm.org/D90623

3 years ago[clang] Fix the fsanitize.c testcase after eaae6fdf67e1f. NFC.
Martin Storsjö [Tue, 3 Nov 2020 08:20:36 +0000 (10:20 +0200)]
[clang] Fix the fsanitize.c testcase after eaae6fdf67e1f. NFC.

After that commit, the vptr sanitizer is enabled for mingw targets.

3 years ago[NFC] Refactor code in IndVars, preparing for further improvement
Max Kazantsev [Tue, 3 Nov 2020 07:59:04 +0000 (14:59 +0700)]
[NFC] Refactor code in IndVars, preparing for further improvement

3 years ago[clang] [MinGW] Allow using the vptr sanitizer
Martin Storsjö [Sun, 1 Nov 2020 20:49:49 +0000 (22:49 +0200)]
[clang] [MinGW] Allow using the vptr sanitizer

Differential Revision: https://reviews.llvm.org/D90572

3 years ago[compiler-rt] [ubsan] Use the itanium type info lookup for mingw targets
Martin Storsjö [Sun, 1 Nov 2020 20:52:08 +0000 (22:52 +0200)]
[compiler-rt] [ubsan] Use the itanium type info lookup for mingw targets

Differential Revision: https://reviews.llvm.org/D90571

3 years ago[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Esme-Yi [Tue, 3 Nov 2020 07:44:11 +0000 (07:44 +0000)]
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.

Summary: This patch depends on D89846. We have patterns to fold two RLWINMs in ppc-mi-peephole, but some RLWINMs are generated after RA, for example rGc4690b007743. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.

Reviewed By: shchenz, steven.zhang

Differential Revision: https://reviews.llvm.org/D89855

3 years ago[libcxx] [test] Use error_code::default_error_condition to check errors against the...
Martin Storsjö [Thu, 29 Oct 2020 10:40:13 +0000 (12:40 +0200)]
[libcxx] [test] Use error_code::default_error_condition to check errors against the expected codes

error_code returned from functions might not be of the generic category,
but of the system category, which can have different error code values.
Use default_error_condition() to remap errors to the generic category
where possible, to allow comparing them to the expected values.

Use the ErrorIs() helper instead of a direct comparison against
an expected value.
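
The remapping idea can be sketched with the standard `<system_error>` facilities. The helper name `matchesErrc` is hypothetical and is not the ErrorIs() helper from the test suite; it only shows how comparing through `default_error_condition()` avoids depending on the category of the returned `error_code`.

```cpp
#include <cassert>
#include <system_error>

// True when `ec`, whatever category it belongs to, maps to the generic
// condition `expected`.
bool matchesErrc(const std::error_code &ec, std::errc expected) {
  return ec.default_error_condition() == std::make_error_condition(expected);
}
```

A test would then call `matchesErrc(ec, std::errc::no_such_file_or_directory)` rather than comparing `ec.value()` against a raw number.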

Differential Revision: https://reviews.llvm.org/D90602

3 years ago[libcxx] Avoid double frees of file descriptors in the fallback ifstream/ofstream...
Martin Storsjö [Mon, 2 Nov 2020 08:19:42 +0000 (10:19 +0200)]
[libcxx] Avoid double frees of file descriptors in the fallback ifstream/ofstream codepath

So far, most actual uses of libc++ std::filesystem probably use
the sendfile or fcopyfile implementations.

Differential Revision: https://reviews.llvm.org/D90601

3 years ago[libcxx] [test] Create symlink_to_dir as the right kind, as a directory symlink
Martin Storsjö [Fri, 30 Oct 2020 23:01:28 +0000 (01:01 +0200)]
[libcxx] [test] Create symlink_to_dir as the right kind, as a directory symlink

This was missed in 5c39eebc126d.

Differential Revision: https://reviews.llvm.org/D90600

3 years ago[libcxx] [test] Avoid an unused variable in non-libcpp cases in path.append
Martin Storsjö [Thu, 22 Oct 2020 10:28:35 +0000 (13:28 +0300)]
[libcxx] [test] Avoid an unused variable in non-libcpp cases in path.append

Differential Revision: https://reviews.llvm.org/D89947

3 years ago[libcxx] [test] Fix the fs.op.absolute test to cope with windows paths
Martin Storsjö [Wed, 14 Oct 2020 10:16:03 +0000 (13:16 +0300)]
[libcxx] [test] Fix the fs.op.absolute test to cope with windows paths

Prepend the root path on the already_absolute testcase, and construct
a path ending with the preferred separator for the test reference for
"foo/".

Differential Revision: https://reviews.llvm.org/D89944

3 years ago[libcxxabi] Build all of libcxxabi with _LIBCPP_BUILDING_LIBRARY defined
Martin Storsjö [Fri, 30 Oct 2020 17:11:23 +0000 (19:11 +0200)]
[libcxxabi] Build all of libcxxabi with _LIBCPP_BUILDING_LIBRARY defined

Various definitions from libcxx need to be set in the same way
as if building libcxx itself.

Differential Revision: https://reviews.llvm.org/D90476

3 years ago[scan-build] Fix clang++ pathname again
Stephan Bergmann [Thu, 15 Oct 2020 16:10:22 +0000 (18:10 +0200)]
[scan-build] Fix clang++ pathname again

e00629f777d9d62875730f40d266727df300dbb2 "[scan-build] Fix clang++ pathname" had
removed the -MAJOR.MINOR suffix, but since presumably LLVM 7 the suffix is only
-MAJOR, so ClangCXX (i.e., the CLANG_CXX environment variable passed to
clang/tools/scan-build/libexec/ccc-analyzer) now contained a non-existing
/path/to/clang-12++ (which apparently went largely unnoticed as
clang/tools/scan-build/libexec/ccc-analyzer falls back to just 'clang++' if the
executable denoted by CLANG_CXX does not exist).

For the new clang/test/Analysis/scan-build/cxx-name.test to be effective,
%scan-build must now take care to pass the clang executable's resolved pathname
(i.e., ending in .../clang-MAJOR rather than just .../clang) to --use-analyzer.

Differential Revision: https://reviews.llvm.org/D89481

3 years ago[NFC] Split lambda into 2 parts for further reuse
Max Kazantsev [Tue, 3 Nov 2020 07:13:28 +0000 (14:13 +0700)]
[NFC] Split lambda into 2 parts for further reuse

3 years ago[RISCV] Remove isel patterns for fshl/fshr with same inputs. NFC
Craig Topper [Tue, 3 Nov 2020 06:54:20 +0000 (22:54 -0800)]
[RISCV] Remove isel patterns for fshl/fshr with same inputs. NFC

These were being selected to ROL/ROR, but DAG combine should
canonicalize fshl/fshr with same inputs to rotl/rotr which we
also have patterns for.

3 years ago[IndVars] Use knowledge about execution on last iteration when removing checks
Max Kazantsev [Tue, 3 Nov 2020 06:18:46 +0000 (13:18 +0700)]
[IndVars] Use knowledge about execution on last iteration when removing checks

If we know that some check will not be executed on the last iteration, we can use this
fact to eliminate its check.

Differential Revision: https://reviews.llvm.org/D88210
Reviewed By: ebrevnov

3 years ago[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Esme-Yi [Tue, 3 Nov 2020 06:28:56 +0000 (06:28 +0000)]
[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.

Summary: We have patterns to fold two RLWINMs in ppc-mi-peephole, but some RLWINMs are generated after RA, for example D88274. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.
This is an NFC patch that moves the folding patterns to PPCInstrInfo. Follow-up work will call it in pre-emit-peephole and expand the patterns to handle more cases.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D89846

3 years ago[Flang][OpenMP][NFC][1/2] Reorder OmpStructureChecker and simplify it.
Sameeran joshi [Wed, 28 Oct 2020 18:13:49 +0000 (23:43 +0530)]
[Flang][OpenMP][NFC][1/2] Reorder OmpStructureChecker and simplify it.

`OmpStructureChecker` has too much boilerplate code in its source file.
It was not easy to figure out the separation between clauses inside 'OmpClause' and
the ones which had a separate node in parse-tree.h.

This patch:
1. Removes the boilerplate by defining a few macros.
2. Separates constructs, directives and clauses (subclasses are separated).
3. Macros could have been shared between OMP and OACC; template specializations might have
   been costly, hence macros were used.
Follows the same strategy used for `AccStructureChecker`.

The next patch in the series to simplify OmpStructureChecker will try to reduce
the boilerplate inside the functions, either by creating abstractions or by using
whatever is available inside check-directive-structure.h.

Reviewed By: kiranchandramohan, clementval

Differential Revision: https://reviews.llvm.org/D90324

3 years agoPut back the test pragma-fp-exc.cpp
Serge Pavlov [Tue, 3 Nov 2020 05:24:22 +0000 (12:24 +0700)]
Put back the test pragma-fp-exc.cpp

This test was removed in 5963e028e714 because it failed on cores where
support of constrained intrinsics was limited. Now this test is enabled
only on x86.

3 years ago[Libomptarget][NFC] Move global Libomptarget state to a struct
Atmn Patel [Fri, 30 Oct 2020 05:04:34 +0000 (01:04 -0400)]
[Libomptarget][NFC] Move global Libomptarget state to a struct

Presently, there are a number of global variables in libomptarget (devices,
RTLs, tables, mutexes, etc.) that are not placed within a struct. This
patch places them into a struct ``PluginManager``. All of the functions
that act on this data remain free.
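
The shape of this refactoring can be sketched as follows. The struct `PluginManagerSketch`, its members, and the function `addDevice` are hypothetical and only illustrate grouping former globals into one struct while keeping the functions free; they are not the fields or functions of the real ``PluginManager``.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// State that would previously have been separate global variables.
struct PluginManagerSketch {
  std::vector<std::string> Devices;
  std::mutex DevicesMtx;
};

// The function stays free; it now takes the state explicitly and returns the
// index of the newly registered device.
std::size_t addDevice(PluginManagerSketch &PM, const std::string &Name) {
  std::lock_guard<std::mutex> Lock(PM.DevicesMtx);
  PM.Devices.push_back(Name);
  return PM.Devices.size() - 1;
}
```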

Differential Revision: https://reviews.llvm.org/D90519

3 years ago[docs] Fix clang/docs/UsersManual.rst after D87528 & D88446
Fangrui Song [Tue, 3 Nov 2020 05:07:15 +0000 (21:07 -0800)]
[docs] Fix clang/docs/UsersManual.rst after D87528 & D88446

3 years ago[polly] Fix -Wunused-lambda-capture and -Wunused-variable
Fangrui Song [Tue, 3 Nov 2020 04:35:26 +0000 (20:35 -0800)]
[polly] Fix -Wunused-lambda-capture and -Wunused-variable

3 years ago[sanitizer] Cleanup -Wnon-virtual-dtor warnings
Vitaly Buka [Tue, 3 Nov 2020 01:33:22 +0000 (17:33 -0800)]
[sanitizer] Cleanup -Wnon-virtual-dtor warnings

3 years ago[CodeGen] Fix regression from D83655
Jessica Clarke [Tue, 3 Nov 2020 03:57:46 +0000 (03:57 +0000)]
[CodeGen] Fix regression from D83655

Arm EHABI has a null LSDASection as it does its own thing, so we should
continue to return null in that case rather than try and cast it.

3 years ago[RISCV] Only return DestSourcePair from isCopyInstrImpl for registers
Jessica Clarke [Tue, 3 Nov 2020 03:55:47 +0000 (03:55 +0000)]
[RISCV] Only return DestSourcePair from isCopyInstrImpl for registers

ADDI often has a frameindex in operand 1, but consumers of this
interface, such as MachineSink, tend to call getReg() on the Destination
and Source operands, leading to the following crash when building
FreeBSD after this implementation was added in 8cf6778d30:

```
clang: llvm/include/llvm/CodeGen/MachineOperand.h:359: llvm::Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
 #0 0x00007f4286f9b4d0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) llvm/lib/Support/Unix/Signals.inc:563:0
 #1 0x00007f4286f9b587 PrintStackTraceSignalHandler(void*) llvm/lib/Support/Unix/Signals.inc:630:0
 #2 0x00007f4286f9926b llvm::sys::RunSignalHandlers() llvm/lib/Support/Signals.cpp:71:0
 #3 0x00007f4286f9ae52 SignalHandler(int) llvm/lib/Support/Unix/Signals.inc:405:0
 #4 0x00007f428646ffd0 (/lib/x86_64-linux-gnu/libc.so.6+0x3efd0)
 #5 0x00007f428646ff47 raise /build/glibc-2ORdQG/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #6 0x00007f42864718b1 abort /build/glibc-2ORdQG/glibc-2.27/stdlib/abort.c:81:0
 #7 0x00007f428646142a __assert_fail_base /build/glibc-2ORdQG/glibc-2.27/assert/assert.c:89:0
 #8 0x00007f42864614a2 (/lib/x86_64-linux-gnu/libc.so.6+0x304a2)
 #9 0x00007f428d4078e2 llvm::MachineOperand::getReg() const llvm/include/llvm/CodeGen/MachineOperand.h:359:0
#10 0x00007f428d8260e7 attemptDebugCopyProp(llvm::MachineInstr&, llvm::MachineInstr&) llvm/lib/CodeGen/MachineSink.cpp:862:0
#11 0x00007f428d826442 performSink(llvm::MachineInstr&, llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::SmallVectorImpl<llvm::MachineInstr*>&) llvm/lib/CodeGen/MachineSink.cpp:918:0
#12 0x00007f428d826e27 (anonymous namespace)::MachineSinking::SinkInstruction(llvm::MachineInstr&, bool&, std::map<llvm::MachineBasicBlock*, llvm::SmallVector<llvm::MachineBasicBlock*, 4u>, std::less<llvm::MachineBasicBlock*>, std::allocator<std::pair<llvm::MachineBasicBlock* const, llvm::SmallVector<llvm::MachineBasicBlock*, 4u> > > >&) llvm/lib/CodeGen/MachineSink.cpp:1073:0
#13 0x00007f428d824a2c (anonymous namespace)::MachineSinking::ProcessBlock(llvm::MachineBasicBlock&) llvm/lib/CodeGen/MachineSink.cpp:410:0
#14 0x00007f428d824513 (anonymous namespace)::MachineSinking::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/CodeGen/MachineSink.cpp:340:0
```

Thus, check that operand 1 is also a register in the condition.
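
The guard can be illustrated with a toy model. This is a sketch and not the RISCV backend code: `Operand`, `Instr`, and `isCopyInstr` are simplified hypothetical types, and the sketch assumes an ADDI with a zero immediate is what gets treated as a copy.

```cpp
#include <optional>
#include <vector>

// Toy operand: a register, a frame index, or an immediate.
struct Operand {
  enum Kind { Register, FrameIndex, Immediate } K;
  int Value;
  bool isReg() const { return K == Register; }
  bool isImm() const { return K == Immediate; }
};

struct Instr {
  enum Opcode { ADDI, OTHER } Op;
  std::vector<Operand> Ops; // Ops[0] = destination, Ops[1] = source, Ops[2] = immediate
};

struct DestSourcePair {
  int Dest;
  int Source;
};

// Report a copy only when operand 1 really is a register, so that callers
// can safely read register numbers from both members of the pair.
std::optional<DestSourcePair> isCopyInstr(const Instr &MI) {
  if (MI.Op == Instr::ADDI && MI.Ops.size() == 3 && MI.Ops[1].isReg() &&
      MI.Ops[2].isImm() && MI.Ops[2].Value == 0)
    return DestSourcePair{MI.Ops[0].Value, MI.Ops[1].Value};
  return std::nullopt;
}
```

With the `isReg()` test in place, an ADDI whose operand 1 is a frame index is simply not reported as a copy, so no consumer ever asks it for a register.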

Reviewed By: arichardson, luismarques

Differential Revision: https://reviews.llvm.org/D89090

3 years ago[crashlog] Remove commented out code (NFC)
Jonas Devlieghere [Tue, 3 Nov 2020 03:46:43 +0000 (19:46 -0800)]
[crashlog] Remove commented out code (NFC)

Remove commented out code and print statements.

3 years ago[crashlog] Turn crash log parsing modes into a Python 'enum' (NFC)
Jonas Devlieghere [Tue, 3 Nov 2020 03:18:51 +0000 (19:18 -0800)]
[crashlog] Turn crash log parsing modes into a Python 'enum' (NFC)

Python doesn't support enums before PEP 435, but using a class with
constants is how it's commonly emulated. It can be converted into a real
Enum (in Python 3.4 and later) by extending the Enum class:

  from enum import Enum

  class CrashLogParseMode(Enum):
      NORMAL = 0
      THREAD = 1
      IMAGES = 2
      THREGS = 3
      SYSTEM = 4
      INSTRS = 5

3 years ago[PowerPC] Skip IEEE 128-bit FP type in FastISel
Qiu Chaofan [Tue, 3 Nov 2020 03:17:11 +0000 (11:17 +0800)]
[PowerPC] Skip IEEE 128-bit FP type in FastISel

Vector types, quadword integers and f128 currently cannot be handled in
FastISel. We did not skip f128 type in lowering arguments, which causes
a crash. This patch will fix it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D90206

3 years ago[PowerPC] [NFC] Rename VCMPo to VCMP_rec
Qiu Chaofan [Tue, 3 Nov 2020 02:53:35 +0000 (10:53 +0800)]
[PowerPC] [NFC] Rename VCMPo to VCMP_rec

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D90581

3 years ago[crashlog] Fix and simplify the way we import lldb
Jonas Devlieghere [Tue, 3 Nov 2020 02:56:24 +0000 (18:56 -0800)]
[crashlog] Fix and simplify the way we import lldb

Don't try to guess the location of LLDB.framework but use xcrun to ask
the command line driver for the location of the lldb module.

3 years ago[Syntax] DeclaratorList is a List
Sam McCall [Sat, 31 Oct 2020 20:20:27 +0000 (21:20 +0100)]
[Syntax] DeclaratorList is a List

I think this was just an oversight.

Differential Revision: https://reviews.llvm.org/D90541

3 years agoAdd textual header PPCTypes.def to module Clang_Basic after D81508
Fangrui Song [Tue, 3 Nov 2020 02:23:26 +0000 (18:23 -0800)]
Add textual header PPCTypes.def to module Clang_Basic after D81508

3 years ago[LICM] Add assert of AST/MSSA exclusiveness.
Alina Sbirlea [Tue, 3 Nov 2020 01:50:53 +0000 (17:50 -0800)]
[LICM] Add assert of AST/MSSA exclusiveness.

The API `canSinkOrHoistInst` may be called by LoopSink. Add assert to
avoid having two analyses passed in.

3 years ago[sanitizer] Make destructors protected
Vitaly Buka [Tue, 3 Nov 2020 00:42:20 +0000 (16:42 -0800)]
[sanitizer] Make destructors protected

3 years agoRemove unused parameter
Akira Hatanaka [Tue, 3 Nov 2020 01:40:06 +0000 (17:40 -0800)]
Remove unused parameter

3 years ago[mlir][Affine] Remove single iteration affine.for ops in AffineLoopNormalize
Diego Caballero [Tue, 27 Oct 2020 20:21:00 +0000 (13:21 -0700)]
[mlir][Affine] Remove single iteration affine.for ops in AffineLoopNormalize

This patch renames AffineParallelNormalize to AffineLoopNormalize to make it
more generic and be able to hold more loop normalization transformations in
the future for affine.for and affine.parallel ops. Eventually, it could also be
extended to support scf.for and scf.parallel. As a starting point for affine.for,
the patch also adds support for removing single iteration affine.for ops to the
pass.

Differential Revision: https://reviews.llvm.org/D90267

3 years agoAdd parallelTransformReduce and parallelForEachError
Reid Kleckner [Mon, 2 Nov 2020 18:15:34 +0000 (10:15 -0800)]
Add parallelTransformReduce and parallelForEachError

parallelTransformReduce is modelled on the C++17 pstl API of
std::transform_reduce, except our wrappers do not use execution policy
parameters.
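
For reference, the standard algorithm this is modelled on can be called without an execution policy as well. The sketch below uses the serial C++17 overload of std::transform_reduce; `totalLength` is a hypothetical example function, not part of the patch.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Transform each string to its length, then reduce the lengths with +.
std::size_t totalLength(const std::vector<std::string> &Names) {
  return std::transform_reduce(
      Names.begin(), Names.end(), std::size_t{0}, std::plus<>{},
      [](const std::string &S) { return S.size(); });
}
```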

parallelForEachError allows loops that contain potentially failing
operations to propagate errors out of the loop. This was one of the
major challenges I encountered while parallelizing PDB type merging in
LLD. Parallelizing a loop with parallelForEachError is not behavior
preserving: the loop will no longer stop on the first error, it will
continue working and report all errors it encounters in a list.
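
The change in behaviour can be shown with a deliberately serial sketch. The helper names and the `MaybeError` alias are hypothetical; the real helper works with LLVM's error types and runs the loop body in parallel.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// An empty optional means success; a string carries the error message.
using MaybeError = std::optional<std::string>;
using Step = std::function<MaybeError(int)>;

// Plain loop behaviour: return as soon as one item fails.
std::vector<std::string> stopAtFirstError(const std::vector<int> &Items,
                                          const Step &Fn) {
  for (int Item : Items)
    if (MaybeError E = Fn(Item))
      return {*E};
  return {};
}

// Behaviour described in the commit message: visit every item and keep
// every error that was produced.
std::vector<std::string> collectAllErrors(const std::vector<int> &Items,
                                          const Step &Fn) {
  std::vector<std::string> Errors;
  for (int Item : Items)
    if (MaybeError E = Fn(Item))
      Errors.push_back(*E);
  return Errors;
}
```

A caller that relied on later items being skipped after a failure would observe a difference between the two functions, which is why the conversion is not behavior preserving.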

I plan to use this to propagate errors out of LLD's
coff::TpiSource::remapTpiWithGHashes, which currently stores an
error in the TpiSource object.

Differential Revision: https://reviews.llvm.org/D90639

3 years ago[PowerPC] Parse and ignore .machine ppc64
Fangrui Song [Tue, 3 Nov 2020 00:49:56 +0000 (16:49 -0800)]
[PowerPC] Parse and ignore .machine ppc64

In the wild, kexec-tools purgatory/arch/ppc64/v2wrap.S and hvcall.S
use this directive.

3 years ago[mlir][Linalg] Add more utility functions to LinalgDependenceGraph.
MaheshRavishankar [Tue, 3 Nov 2020 00:04:14 +0000 (16:04 -0800)]
[mlir][Linalg] Add more utility functions to LinalgDependenceGraph.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D90582

3 years ago[darwin] add support for __isPlatformVersionAtLeast check for if (@available)
Alex Lorenz [Thu, 29 Oct 2020 05:48:59 +0000 (22:48 -0700)]
[darwin] add support for __isPlatformVersionAtLeast check for if (@available)

The __isPlatformVersionAtLeast routine is an implementation of `if (@available)` check
that uses the _availability_version_check API on Darwin that's supported on
macOS 10.15, iOS 13, tvOS 13 and watchOS 6.

Differential Revision: https://reviews.llvm.org/D90367

3 years ago[libc++] Fix invalid parsing of ints in a <random> test
Louis Dionne [Tue, 3 Nov 2020 00:18:37 +0000 (19:18 -0500)]
[libc++] Fix invalid parsing of ints in a <random> test

The strings were concatenated together without adding spaces between
numbers, which led to numbers that wouldn't fit in an unsigned int.
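
A small sketch of this class of bug, assuming the strings were adjacent string literals and that unsigned int is 32 bits wide. The values and the helper `allFitUnsigned` are made up for illustration and are not taken from the test.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Returns true if every whitespace-separated token in Text fits in unsigned int.
// Tokens are expected to contain decimal digits only.
bool allFitUnsigned(const std::string &Text) {
  std::istringstream In(Text);
  std::string Token;
  while (In >> Token) {
    unsigned long long V = std::stoull(Token);
    if (V > std::numeric_limits<unsigned>::max())
      return false;
  }
  return true;
}

// Adjacent string literals are joined by the compiler with nothing in between.
const std::string Broken = "4294967295" "1";  // one token: 42949672951
const std::string Fixed = "4294967295 " "1";  // two tokens: 4294967295 and 1
```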

Thanks to Casey Carter for the find.

3 years ago[scudo][standalone] Code tidying (NFC)
Kostya Kortchinsky [Mon, 2 Nov 2020 22:27:11 +0000 (14:27 -0800)]
[scudo][standalone] Code tidying (NFC)

- we have clutter-reducing helpers for relaxed atomics that were barely
  used, use them everywhere we can
- clang-format everything with a recent version

Differential Revision: https://reviews.llvm.org/D90649

3 years ago[cc1as] Close MCAsmParser before MCStreamer
Fangrui Song [Mon, 2 Nov 2020 23:56:56 +0000 (15:56 -0800)]
[cc1as] Close MCAsmParser before MCStreamer

This is a pre-existing problem exposed by D90511 (asan failures with
clang/test/Driver/relax.s and a few other tests).

3 years ago[NFC] Use [MC]Register in Live-ness tracking
Gaurav Jain [Fri, 23 Oct 2020 05:15:56 +0000 (22:15 -0700)]
[NFC] Use [MC]Register in Live-ness tracking

Differential Revision: https://reviews.llvm.org/D90611

3 years ago[pstl] Replace direct use of assert() with _PSTL_ASSERT
Thomas Rodgers [Mon, 2 Nov 2020 23:11:41 +0000 (18:11 -0500)]
[pstl] Replace direct use of assert() with _PSTL_ASSERT

Standard libraries may (libstdc++ in particular) forbid direct use of
assert()/<cassert> in library code.
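
The usual way to get this kind of indirection is a configuration macro. Below is a minimal sketch with a hypothetical `MYLIB_ASSERT`; the actual `_PSTL_ASSERT` definition is not reproduced here.

```cpp
#include <cassert>

// A host standard library may define MYLIB_ASSERT to its own assertion
// facility before this point; plain assert() is only the fallback.
#ifndef MYLIB_ASSERT
#define MYLIB_ASSERT(Cond) assert(Cond)
#endif

// Library code uses only the macro and never calls assert() directly.
int checkedDivide(int Num, int Den) {
  MYLIB_ASSERT(Den != 0);
  return Num / Den;
}
```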

Differential Revision: https://reviews.llvm.org/D60249

3 years agoReland - [Clang] Add the ability to map DLL storage class to visibility
Ben Dunbobbin [Mon, 2 Nov 2020 23:24:04 +0000 (23:24 +0000)]
Reland - [Clang] Add the ability to map DLL storage class to visibility

415f7ee883 had LIT test failures on any build where the clang executable
was not called "clang". I have adjusted the LIT CHECKs to remove the
binary name to fix this.

Original commit message:

For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.

To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.

Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:

https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)

This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
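
Conceptually the translation is a configurable lookup from storage class to visibility. The sketch below is hypothetical throughout: the enum names, the `VisibilityMapping` struct, and its default values are illustrative and are not the patch's actual options or defaults.

```cpp
enum class DLLStorage { None, Import, Export };
enum class Visibility { Default, Hidden, Protected };

// Illustrative, customizable mapping; the defaults here are assumptions.
struct VisibilityMapping {
  Visibility ForExport = Visibility::Default;
  Visibility ForImport = Visibility::Default;
  Visibility ForNone = Visibility::Hidden;
};

// Pick the ELF-style visibility for a symbol from its DLL storage class.
Visibility mapToVisibility(DLLStorage S, const VisibilityMapping &M) {
  switch (S) {
  case DLLStorage::Export:
    return M.ForExport;
  case DLLStorage::Import:
    return M.ForImport;
  case DLLStorage::None:
    return M.ForNone;
  }
  return M.ForNone; // not reached for valid enumerators
}
```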

Differential Revision: https://reviews.llvm.org/D89970

3 years ago[MLIR] Remove unnecessary CHECK's from tests for which we do not run FileCheck.
Rahul Joshi [Mon, 2 Nov 2020 23:02:38 +0000 (15:02 -0800)]
[MLIR] Remove unnecessary CHECK's from tests for which we do not run FileCheck.

Differential Revision: https://reviews.llvm.org/D90651

3 years agoRevert "[CUDA] Allow local static variables with target attributes."
Artem Belevich [Mon, 2 Nov 2020 23:08:26 +0000 (15:08 -0800)]
Revert "[CUDA] Allow local static variables with target attributes."

This reverts commit f38a9e51178add132d2c8ae160787fb2175a48a4,
which triggered assertions.

3 years ago[MachO] Also recognize __swift_ast as a debug info section
Jonas Devlieghere [Mon, 2 Nov 2020 22:41:18 +0000 (14:41 -0800)]
[MachO] Also recognize __swift_ast as a debug info section

Address post-commit review from Adrian.

3 years ago[mlir] Optimize Op definitions and registration to optimize for code size
River Riddle [Mon, 2 Nov 2020 22:21:02 +0000 (14:21 -0800)]
[mlir] Optimize Op definitions and registration to optimize for code size

This revision refactors the base Op/AbstractOperation classes to reduce the amount of generated code when defining a new operation. The current scheme involves taking the address of functions defined directly on Op and Trait classes. This is problematic because even when these functions are empty/unused, they still end up defined in the main executable. In this revision, we switch to using SFINAE and template type filtering to remove functions that are not needed/used. For example, if an operation does not define a custom `print` method we shouldn't define a templated `printAssembly` method for it. The same applies to parsing/folding/verification/etc. This dropped MLIR code size for a large downstream library by ~10% (~1 MB in an opt build).
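
The general idea can be sketched outside MLIR as follows. This is an illustration and not MLIR's code: `HasPrint`, `printThunk`, `getPrintFn`, `AddOp`, and `NopOp` are hypothetical, and the sketch uses the std::void_t detection idiom together with `if constexpr` in place of whatever filtering the revision actually implements.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Detect whether T has a const member function print().
template <typename T, typename = void>
struct HasPrint : std::false_type {};
template <typename T>
struct HasPrint<T, std::void_t<decltype(std::declval<const T &>().print())>>
    : std::true_type {};

using PrintFn = std::string (*)(const void *);

// Type-erased thunk; it is only instantiated for types that need it.
template <typename T>
std::string printThunk(const void *Op) {
  return static_cast<const T *>(Op)->print();
}

// Take the address of printThunk<T> only when T defines print(). For other
// types the discarded branch is never instantiated, so no code is emitted.
template <typename T>
PrintFn getPrintFn() {
  if constexpr (HasPrint<T>::value)
    return &printThunk<T>;
  else
    return nullptr;
}

struct AddOp {
  std::string print() const { return "add"; }
};
struct NopOp {};
```

Registering `NopOp` then stores a null function pointer and emits no per-type print function for it.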

Differential Revision: https://reviews.llvm.org/D90196

3 years ago[CUDA] Allow local static variables with target attributes.
Artem Belevich [Fri, 25 Sep 2020 23:25:27 +0000 (16:25 -0700)]
[CUDA] Allow local static variables with target attributes.

While CUDA documentation claims that such variables are not allowed[1], NVCC has
been accepting them since CUDA-10.0[2] and some headers in CUDA-11 rely on this
working.

1. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#static-variables-function
2. https://godbolt.org/z/zsodzc

Differential Revision: https://reviews.llvm.org/D88345

3 years ago[AsmPrinter] Split up .gcc_except_table
Fangrui Song [Mon, 2 Nov 2020 22:36:25 +0000 (14:36 -0800)]
[AsmPrinter] Split up .gcc_except_table

MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table:

* if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat`
* otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits`

This ensures that (a) non-prevailing copies are discarded and (b)
.gcc_except_table associated to discarded text sections can be discarded by a
.gcc_except_table-aware linker (GNU ld, but not gold or LLD)

This patch matches the GCC behavior. If -fno-unique-section-names is
specified, we don't append the suffix. If -ffunction-sections is additionally specified,
use `.section ...,unique`.
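
The naming rule described so far can be summarized in a small sketch. The function `exceptTableSectionName` and its parameters are hypothetical; it covers only the section name and ignores flags, group signatures, and the `unique` handling.

```cpp
#include <string>

// Section name for a function's exception table, following the rule in the
// commit message: append the function's symbol name when the function is in
// a comdat or -ffunction-sections is in effect, unless unique section names
// are disabled.
std::string exceptTableSectionName(const std::string &FuncSymbol, bool InComdat,
                                   bool FunctionSections,
                                   bool UniqueSectionNames) {
  std::string Name = ".gcc_except_table";
  if ((InComdat || FunctionSections) && UniqueSectionNames)
    Name += "." + FuncSymbol;
  return Name;
}
```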

Note, if clang driver communicates that the linker is LLD and we know it
is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table
costs, at least in the -fno-unique-section-names case. We cannot use it on GNU
ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER &
non-SHF_LINK_ORDER components in an output section
https://sourceware.org/bugzilla/show_bug.cgi?id=26256

For RISC-V -mrelax, this patch additionally fixes an assembler-linker
interaction problem: because a section is shrinkable, the length of a call-site
code range is not a constant. Relocations referencing the associated text
section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a
discarded section group member from outside the group is disallowed by the ELF
specification (PR46675):

```
// a.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int main() { return comdat(); }

// b.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int foo() { return comdat(); }

clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax
ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section:
```

-fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations.

Reviewed By: jrtc27, rahmanl

Differential Revision: https://reviews.llvm.org/D83655

3 years ago[LoopFusion] Regenerate test checks (NFC)
Nikita Popov [Mon, 2 Nov 2020 21:42:06 +0000 (22:42 +0100)]
[LoopFusion] Regenerate test checks (NFC)

3 years ago[GWP-ASan] Stub out backtrace/signal functions on Fuchsia
Kostya Kortchinsky [Sat, 31 Oct 2020 18:09:29 +0000 (11:09 -0700)]
[GWP-ASan] Stub out backtrace/signal functions on Fuchsia

The initial version of GWP-ASan on Fuchsia doesn't support crash and
signal handlers, so this just adds empty stubs to be able to compile
the project on the platform.

Differential Revision: https://reviews.llvm.org/D90537

3 years ago[MLIR] Work around an ICE in GCC 7.
Benjamin Kramer [Mon, 2 Nov 2020 21:46:19 +0000 (22:46 +0100)]
[MLIR] Work around an ICE in GCC 7.

Looks like we have a blind spot in the testing matrix.

AsyncRegionRewriter.cpp: In member function ‘virtual void {anonymous}::GpuAsyncRegionPass::runOnFunction()’:
AsyncRegionRewriter.cpp:113:16: internal compiler error: in replace_placeholders_r, at cp/tree.c:2804
   if (getFunction()
       ~~~~~~~~~~~~~
           .getRegion()
           ~~~~~~~~~~~~
           .walk(Callback{OpBuilder{&getContext()}})
           ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3 years ago[MLIR] Introduce std.global_memref and std.get_global_memref operations.
Rahul Joshi [Mon, 2 Nov 2020 19:21:29 +0000 (11:21 -0800)]
[MLIR] Introduce std.global_memref and std.get_global_memref operations.

- Add standard dialect operations to define global variables with memref types and to
  retrieve the memref for a named global variable
- Extend unit tests to test verification for these operations.

Differential Revision: https://reviews.llvm.org/D90337

3 years ago[flang] Fix actual argument character length and length error reporting
peter klausler [Fri, 30 Oct 2020 20:28:10 +0000 (13:28 -0700)]
[flang] Fix actual argument character length and length error reporting

Ensure that character length is properly calculated for
actual arguments to intrinsics, and that source provenance
information is available when expression analysis calls
folding in cases where the length is invalid.

Differential revision: https://reviews.llvm.org/D90636

3 years ago[NFC][AMDGPU] Restructure the AMDGPU memory model description
Tony [Sat, 31 Oct 2020 04:43:55 +0000 (04:43 +0000)]
[NFC][AMDGPU] Restructure the AMDGPU memory model description

Separate the AMDGPU memory model description into separate sections
for each architecture.

Differential Revision: https://reviews.llvm.org/D90548

3 years ago[IndVars] Regenerate test checks (NFC)
Nikita Popov [Mon, 2 Nov 2020 21:30:51 +0000 (22:30 +0100)]
[IndVars] Regenerate test checks (NFC)