review.tizen.org Git - platform/upstream/llvm.git/log

[llvm-readelf] - Simplify the implementation of getSectionTypeString() helper. NFCI.

It is used for printing section headers in the GNU style
and the implementation can be simplified.

Differential revision: https://reviews.llvm.org/D84330

[Analyzer][StreamChecker] Use BugType::SuppressOnSink at resource leak report.

Summary:
Use the built-in functionality BugType::SuppressOnSink
instead of a manual solution in StreamChecker.

Differential Revision: https://reviews.llvm.org/D83120

[DebugInfo] Attempt to fix regression test failure after 59a76d957a2603ee0

Test case `test/CodeGen/WebAssembly/stackified-debug.ll`
was failing due to malformed DwarfExpression.

This failure has been seen on a lot of bots, for instance in:
http://lab.llvm.org:8011/builders/lld-x86_64-ubuntu-fast/builds/18794

: 'RUN: at line 1'
/home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/build/bin/llc
/home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/build/bin/FileCheck /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/test/CodeGen/WebAssembly/stackified-debug.ll
home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/test/CodeGen/WebAssembly/stackified-debug.ll:26:10: error: CHECK: expected string not found in input
CHECK: .int16 4 # Loc expr size
^
<stdin>:34:2: note: scanning from here
.int16 3 # Loc expr size

Differential Revision: https://reviews.llvm.org/D83560

[mlir] Loop bounds inference in linalg.generic op improved to support bounds for convolution

Loop bound inference is currently very limited: it supports only permutation maps, so
it is impossible to implement convolution with linalg.generic, which requires more advanced
loop bound inference. This commit solves it for the convolution case.

Depends On D83158

Differential Revision: https://reviews.llvm.org/D83191

Re-apply: "Emit DW_OP_implicit_value for Floating point constants"

This patch was reverted in 9d2da6759b4d due to assertion failure seen
in `test/DebugInfo/Sparc/subreg.ll`. Assertion failure was happening
due to malformed/unhandled DwarfExpression.

Differential Revision: https://reviews.llvm.org/D83560

[Reduce] Rewrite runDeltaPass() workloop: do reduce a single and/or last target

Summary:
If there was a single target to begin with, we couldn't increase granularity,
because a single target can only occupy a single chunk, and would
immediately give up.

Likewise, if we started with multiple targets and ended up with
a single target, we wouldn't finish reducing it; it would always
end up being "interesting".

Reviewers: dblaikie, nickdesaulniers, diegotf

Reviewed By: dblaikie

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84318

Temporarily Revert "Reland [lldb] Unify type name matching in FormattersContainer"
as it breaks bots due to m_valid being an unused class member
except in assert builds.

This reverts commit 074b121642b286afb16adeebda5ec8236f7b8ea9.

[compiler-rt][sanitizers] Fix Solaris madvise declaration

A last-minute silent change in D84046 <https://reviews.llvm.org/D84046> broke the Solaris buildbots (Solaris/sparcv9 <http://lab.llvm.org:8014/builders/clang-solaris11-sparcv9/builds/6772>, Solaris/amd64 <http://lab.llvm.org:8014/builders/clang-solaris11-amd64/builds/5434>):

  [2/3679] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.sparc.dir/sanitizer_posix_libcdep.cpp.o
  FAILED: projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.sparc.dir/sanitizer_posix_libcdep.cpp.o
  /opt/llvm-buildbot/bin/c++  -DHAVE_RPC_XDR_H=1 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iprojects/compiler-rt/lib/sanitizer_common -I/opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/compiler-rt/lib/sanitizer_common -Iinclude -I/opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/llvm/include -I/opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/llvm/include/llvm/Support/Solaris -I/opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/compiler-rt/lib/sanitizer_common/.. -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -std=c++14 -Wno-unused-parameter -O3     -m32 -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fvisibility=hidden -fno-lto -O3 -g -Wno-variadic-macros -Wno-non-virtual-dtor -fno-rtti -Wframe-larger-than=570 -UNDEBUG -std=c++14 -MD -MT projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.sparc.dir/sanitizer_posix_libcdep.cpp.o -MF projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.sparc.dir/sanitizer_posix_libcdep.cpp.o.d -o projects/compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.sparc.dir/sanitizer_posix_libcdep.cpp.o -c /opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
  /opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:50:16: error: conflicting declaration of C function ‘int madvise(caddr_t, std::size_t, int)’
   extern "C" int madvise(caddr_t, size_t, int);
                  ^~~~~~~
  In file included from /opt/llvm-buildbot/home/solaris11-sparcv9/clang-solaris11-sparcv9/llvm/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:32:0:
  /usr/include/sys/mman.h:232:12: note: previous declaration ‘int madvise(void*, std::size_t, int)’
   extern int madvise(void *, size_t, int);
              ^~~~~~~

This patch undoes that change.

Tested on `amd64-pc-solaris2.11` (Solaris 11.4 and OpenIndiana).

Differential Revision: https://reviews.llvm.org/D84388

[mlir] [VectorOps] Improve scatter/gather CPU performance

Replaced the linearized address with the proper LLVM way of
defining vector of base + indices in SIMD style. This yields
much better code. Some prototype results with microbenchmarking
sparse matrix x vector with 50% sparsity (about 2-3x faster):

         LINEARIZED     IMPROVED
GFLOPS  sdot  saxpy     sdot saxpy
16x16    1.6   1.4       4.4  2.1
32x32    1.7   1.6       5.8  5.9
64x64    1.7   1.7       6.4  6.4
128x128  1.7   1.7       5.9  5.9
256x256  1.6   1.6       6.1  6.0
512x512  1.4   1.4       4.9  4.7

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D84368

[Windows] Fix limit on command line size

This reapplies commit d4020ef7c474, reverted in ac0edc55887b because it
broke build of LLDB. This commit contains appropriate changes for LLDB.
The original commit message is below.

Documentation on CreateProcessW states that the maximal size of the command line
is 32767 characters including the terminating null character. In the
function llvm::sys::commandLineFitsWithinSystemLimits this limit was set
to 32768. As a result, if the command line was exactly 32768 characters long,
a response file was not created and CreateProcessW was called with
a command line that was too long.
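
The off-by-one can be sketched as below. This is only an illustration, not the body of llvm::sys::commandLineFitsWithinSystemLimits: the constant name and the helper `fitsCreateProcessLimit` are hypothetical, and the length is assumed to be counted in characters without the terminator.

```cpp
#include <cstddef>

// CreateProcessW documents a maximum command line of 32767 characters,
// including the terminating null character.
constexpr std::size_t MaxCommandLineWithNull = 32767;

// Hypothetical helper: Length is the number of characters in the command
// line, not counting the terminating null.
constexpr bool fitsCreateProcessLimit(std::size_t Length) {
  return Length + 1 <= MaxCommandLineWithNull;
}

// The longest accepted command line has 32766 visible characters.
static_assert(fitsCreateProcessLimit(32766), "fits with its terminator");
static_assert(!fitsCreateProcessLimit(32767), "terminator would not fit");
```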

Differential Revision: https://reviews.llvm.org/D83772

Reland D84057 [PGO][PGSO] Remove a temporary flag used for gradual rollout.

The revert was a misfire.

Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used
for the staged rollout. This is a cleanup (NFC) as it's now false by default.

Differential Revision: https://reviews.llvm.org/D84057

Revert "[DebugInfo] Emit DW_OP_implicit_value for Floating point constants"

This reverts commit 6b55a95898e98664164caae4aba7c5e24fd1a05e.
Temporary revert due to a failing assertion in a test case in the Sparc backend,
`test/DebugInfo/Sparc/subreg.ll`.
Seen on a lot of bots, for instance in:
`http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/24679`

Revert "[OpenMP] Wait for kernel prior to memory deallocation"

This reverts commit 9b2832c0897c1d39846eee0ad84bf787f05d2d4b.

[OpenMP] Wait for kernel prior to memory deallocation

Summary:
In the function `target`, memory deallocation and `target_data_end` are called
immediately after returning from the kernel launch. This can cause a race
condition in which the corresponding memory is still being used by the kernel,
or in which the data a kernel requires has already been deallocated by the time
the kernel starts to execute, especially when multiple kernels run concurrently.
Since we block the thread issuing the target offloading at the end of the target
anyway, we just move the synchronization ahead a little bit to ensure
correctness.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D84381

[DWARFYAML] Refactor range list table to hold more data structures.

This patch refactors the range list table to hold both the range list
table and the location list table.

Reviewed By: jhenderson, labath

Differential Revision: https://reviews.llvm.org/D84239

[DebugInfo] Emit DW_OP_implicit_value for Floating point constants

Summary:
llvm is missing support for DW_OP_implicit_value operation.
DW_OP_implicit_value op is indispensable for cases such as
optimized out long double variables.

For intro refer: DWARFv5 Spec Pg: 40 2.6.1.1.4 Implicit Location Descriptions

Consider the following example:
```
#include <stdio.h>

int main() {
        long double ld = 3.14;
        printf("dummy\n");
        ld *= ld;
        return 0;
}
```
when compiled with trunk `clang` as
`clang test.c -g -O1` produces following location description
of variable `ld`:
```
DW_AT_location        (0x00000000:
                     [0x0000000000201691, 0x000000000020169b): DW_OP_constu 0xc8f5c28f5c28f800, DW_OP_stack_value, DW_OP_piece 0x8, DW_OP_constu 0x4000, DW_OP_stack_value, DW_OP_bit_piece 0x10 0x40, DW_OP_stack_value)
                  DW_AT_name    ("ld")
```
Here one may notice that this representation is incorrect (the DWARF4
stack can only hold integers, and only up to the size of an address).
Here the variable size itself is `128` bits.
GDB and LLDB confirm this:
```
(gdb) p ld
$1 = <invalid float value>
(lldb) frame variable ld
(long double) ld = <extracting data from value failed>
```

GCC represents/uses DW_OP_implicit_value in these sorts of situations.
Based on the discussion with Jakub Jelinek regarding GCC's motivation
for using this, I concluded that DW_OP_implicit_value is most appropriate
in this case.

Link: https://gcc.gnu.org/pipermail/gcc/2020-July/233057.html
GDB seems happy after this patch (LLDB doesn't have support
for DW_OP_implicit_value):
```
(gdb) p ld
p ld
$1 = 3.14000000000000012434
```

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D83560

[PGO] Don't call calloc(0, sizeof(ValueProfNode *))

A malloc implementation may return a pointer to some allocated space. It is
undefined for libclang_rt.profile- to access the object - which actually happens
in instrumentTargetValueImpl, where ValueCounters[CounterIndex] may access a
ValueProfNode (from another allocated object) and crashes when the code accesses
the object referenced by CurVNode->Next.
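
A minimal sketch of the kind of guard the title suggests, assuming the fix is simply to skip the allocation when there is nothing to allocate. The helper name `allocateCounters` is hypothetical and `ValueProfNode` is left opaque; this is not the actual compiler-rt code.

```cpp
#include <cstddef>
#include <cstdlib>

struct ValueProfNode; // opaque here; only pointers to it are allocated

// Hypothetical helper: returns a zeroed array of NumCounters node pointers,
// or nullptr without calling calloc when NumCounters is zero, so no caller
// can be handed a pointer to an object it must not access.
ValueProfNode **allocateCounters(std::size_t NumCounters) {
  if (NumCounters == 0)
    return nullptr;
  return static_cast<ValueProfNode **>(
      std::calloc(NumCounters, sizeof(ValueProfNode *)));
}
```

Callers then need to treat a null result for a zero count as "no counters" rather than as an allocation failure.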

[flang][OpenMP] Added support for lowering OpenMP taskyield construct

Summary:
This patch lowers the `!OMP TASKYIELD` construct from PFT to
OpenMPDialect operations.
The construct is lowered in conformance with the OpenMP 5.0 spec.

The patch is carved out of the following merged PR:
https://github.com/flang-compiler/f18-llvm-project/pull/297

Reviewed: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D84350

[flang][openacc] Skeleton for OpenACC construct lowering

Summary:
This patch introduces the basic infrastructure to be able to lower
OpenACC constructs to the future OpenACC dialect.

Reviewers: schweitz, kiranchandramohan, DavidTruby, sscalpone, jdoerfert, ichoyjx

Reviewed By: ichoyjx

Subscribers: ichoyjx, SouraVX, mgorny, jfb, sstefan1, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D84195

[flang][openmp] Required clauses are allowed

Summary:
This patch fixes a problem where clauses needed to be in the allowed set even
if they were in the required set. A required clause is obviously allowed. This
allows removing the duplicates in OMP.td.

Reviewers: kiranchandramohan, DavidTruby, richard.barton.arm, jdoerfert, sscalpone, kiranktp, ichoyjx

Reviewed By: kiranchandramohan

Subscribers: yaxunl, guansong, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84353

[OpenMPOpt] Regression test for hiding latency of H2D mem transfers

[flang] Add runtime I/O APIs for COMPLEX formatted input

It turns out that COMPLEX formatted input needs its own runtime APIs
so that null values in list-directed input skip the entire COMPLEX
datum rather than just a real or imaginary part thereof.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D84370

Reapply "Try enabling -Wsuggest-override again, using add_compile_options instead of add_compile_definitions for disabling it in unittests/ directories."

add_compile_options is more sensitive to its location in the file than add_definitions--it only takes effect for sources that are added after it. This updated patch ensures that the add_compile_options is done before adding any source files that depend on it.

Using add_definitions caused the flag to be passed to rc.exe on Windows and thus broke Windows builds.

[X86] Remove the DeprecatedMPX feature flag.

We deprecated mpx feature in 10.0. I left this feature flag
in case someone still had IR files containing the feature
in a target-feature attribute. At the time I think I thought it
would fail the test if the feature couldn't be found. Further
review suggests that at worst it prints a message to
stderr about ignoring the feature.

[Symbolize][PDB] Switch llvm-symbolizer to use PDB_ReaderType::Native.

Since native PDB reading has been implemented for symbolizing,
switch to using the native PDB reader by default, unless
LLVM_ENABLE_DIA_SDK is on.

Bug: https://bugs.llvm.org/show_bug.cgi?id=41795

Differential Revision: https://reviews.llvm.org/D84286

[lldb] Fix LLDB_DEFAULT_TEST_ARCH for standalone builds

LLVM_TARGET_ARCH is not exported by LLVM so we can't use it from
standalone builds. Default to the architecture in LLVM_HOST_TRIPLE when
no LLDB_DEFAULT_TEST_ARCH was specified.

[X86] Rework the "sahf" feature flag to only apply to 64-bit mode.

SAHF/LAHF instructions are always available in 32-bit mode. Early
64-bit capable CPUs made these undefined opcodes in 64-bit mode. This
was changed on later CPUs.

We have a feature flag to control our usage of these instructions.
This feature flag is hooked up to a clang command line option
-msahf/-mno-sahf specifically to give control of the 64-bit mode
behavior.

In the backend X86Subtarget constructor we were explicitly forcing
+sahf into the feature flag string if we were not compiling for
64-bit mode. This was intended to make the predicates always allow
the instructions outside of 64-bit mode. Unfortunately, the way
it was placed into the string allowed -mno-sahf from clang to disable
SAHF instructions in 32-bit mode. This causes an assertion to fire
if you compile a floating point comparison with something like
"-march=pentium -mno-sahf" as our floating point comparison
handling on CPUs that don't support FCOMI/FUCOMI instructions
requires SAHF.

To fix this, this commit restricts the feature flag to only apply to
64-bit mode by ignoring the flag outside 64-bit mode in
X86Subtarget::hasLAHFSAHF(). This way we don't need to mess with
the feature string at all.
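
The resulting predicate can be sketched as follows. `SubtargetSketch` and its member names are stand-ins chosen for this illustration, not the real X86Subtarget definition; only the logic (the flag is ignored outside 64-bit mode) comes from the description above.

```cpp
// Hypothetical stand-in for the relevant part of X86Subtarget.
struct SubtargetSketch {
  bool In64BitMode;   // true when compiling for 64-bit mode
  bool HasLAHFSAHF64; // state of the "sahf" feature flag

  // SAHF/LAHF are always available outside 64-bit mode, so the feature
  // flag is only consulted when compiling for 64-bit mode.
  bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !In64BitMode; }
};
```

With this shape, a 32-bit compile with the flag turned off still reports the instructions as available, which is what the floating point comparison lowering relies on.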

[DFSan] Handle fast16labels for all API functions.

Summary:
Support fast16labels in `dfsan_has_label`, and print an error for all
other API functions.

Reviewers: kcc, vitalybuka, pcc

Reviewed By: kcc

Subscribers: jfb, llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D84215

[gn build] Port 13ad00be98e

[ORC] Add a TargetProcessControl-based dynamic library search generator.

TPCDynamicLibrarySearchGenerator uses a TargetProcessControl instance to
load libraries and search for symbol addresses in a target process. It
can be used in place of a DynamicLibrarySearchGenerator to enable
target-process agnostic lookup.

[gn build] Port 27650ec5541

Revert D81682 "[PGO] Extend the value profile buckets for mem op sizes."

This reverts commit 4a539faf74b9b4c25ee3b880e4007564bd5139b0.

There is a __llvm_profile_instrument_range related crash in PGO-instrumented clang:

```
(gdb) bt
llvm::ConstantRange const&, llvm::APInt const&, unsigned int, bool) ()
llvm::ScalarEvolution::getRangeForAffineAR(llvm::SCEV const*, llvm::SCEV
const*, llvm::SCEV const*, unsigned int) ()
```

(The body of __llvm_profile_instrument_range is inlined, so we can only find __llvm_profile_instrument_target in the trace)

```
0x000055555dba0961 <+65>:    nopw   %cs:0x0(%rax,%rax,1)
0x000055555dba096b <+75>:    nopl   0x0(%rax,%rax,1)
0x000055555dba0970 <+80>:    mov    %rsi,%rbx
0x000055555dba0973 <+83>:    mov    0x8(%rsi),%rsi  # %rsi=-1 -> SIGSEGV
0x000055555dba0977 <+87>:    cmp    %r15,(%rbx)
0x000055555dba097a <+90>:    je     0x55555dba0a76 <__llvm_profile_instrument_target+342>
```

[PowerPC][Power10] Fix vins*vlx instructions to have i32 arguments.

Previously, the vins*vlx instructions were incorrectly defined with i64 as the
second argument. This patch fixes this issue by correcting the second argument
of the vins*vlx instructions/intrinsics to be i32.

Differential Revision: https://reviews.llvm.org/D84277

[X86] Remove a couple temporary std::string for CPU names that I don't need to exist.

The input to these functions is a StringRef. We then convert it
to a std::string. Then maybe replace with "generic". I think we
can just overwrite the incoming StringRef with "generic" if needed
and then pass it along without creating any std::string.
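
A rough sketch of the idea, using std::string_view as a stand-in for llvm::StringRef so the example is self-contained. The function name `normalizeCPUName` is hypothetical, and the condition for falling back to "generic" (an empty name) is an assumption; the commit message only says the name is "maybe" replaced.

```cpp
#include <string_view>

// Reassigning the by-value view is enough: the string literal "generic" has
// static storage, so no temporary std::string is needed to keep it alive.
std::string_view normalizeCPUName(std::string_view CPU) {
  if (CPU.empty()) // assumed fallback condition
    CPU = "generic";
  return CPU;
}
```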

[NFC] Simplify `splitLiteralAndReplacement` function

- Eliminate `From`, which is 0 most of the time.
- Replace `find_first_of('{') != 0` with `front() != '{'`.
- Simplify the loop body given that it executes only when `front() == '}'`.
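
The second bullet relies on the two checks agreeing for any non-empty string. A small sketch of that equivalence, with std::string_view standing in for llvm::StringRef and with hypothetical function names:

```cpp
#include <cassert>
#include <string_view>

// Old form: searches the whole string just to learn whether the first
// character is '{'.
bool startsWithLiteralOld(std::string_view Fmt) {
  return Fmt.find_first_of('{') != 0;
}

// New form: looks at the first character only. Fmt must not be empty.
bool startsWithLiteralNew(std::string_view Fmt) {
  assert(!Fmt.empty() && "front() requires a non-empty string");
  return Fmt.front() != '{';
}
```

The only difference is the empty string, where the old form returns true and the new form must not be called.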

Differential Revision: https://reviews.llvm.org/D84178

[LLVM] Update formatv() documentation to clarify no escape for `}`

- Update documentation to clarify that `}` does not need to be doubled up.
- Update `EscapedBrace` test case to test this behavior

Differential Revision: https://reviews.llvm.org/D83888

[libc] Implements strnlen.
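
For reference, the semantics of strnlen can be sketched as below. This is an illustrative loop under a different name (`strnlenSketch`), not the code added to LLVM libc by this commit.

```cpp
#include <cassert>
#include <cstddef>

// Returns the number of characters before the first null terminator,
// examining at most MaxLen characters of Src.
std::size_t strnlenSketch(const char *Src, std::size_t MaxLen) {
  assert(Src != nullptr && "Src must point to a character array");
  std::size_t I = 0;
  while (I < MaxLen && Src[I] != '\0')
    ++I;
  return I;
}
```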

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D84247

[SVE] Remove calls to VectorType::getNumElements from Analysis

Reviewers: efriedma, fpetrogalli, c-rhodes, asbirlea, RKSimon

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81504

Revert "Try enabling -Wsuggest-override again, using add_compile_options instead of add_compile_definitions for disabling it in unittests/ directories."

This reverts commit 388c9fb1af48b059d8b65cb2e002e0992d147aa5.

[PGO] Supporting code for always instrumenting entry block

This patch includes the supporting code that enables always
instrumenting the function entry block by default.

This patch will NOT change the default behavior.

It adds a variant bit in the profile version, adds new directives in
text profile format, and changes llvm-profdata tool accordingly.

This patch is a split of D83024 (https://reviews.llvm.org/D83024)
Many test changes from D83024 are also included.

Differential Revision: https://reviews.llvm.org/D84261

[clang][test] Fix test for external assemblers

This test depends on using the integrated assembler, so make it
explicit by specifying -fintegrated-as.

[mlir][VectorOps] Expose SuperVectorizer as a utility

This patch refactors a small part of the Super Vectorizer code to
a utility so that it can be used independently from the pass. This
aligns vectorization with other utilities that we already have for loop
transformations, such as fusion, interchange, tiling, etc.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D84289

Revert D84057 "[PGO][PGSO] Remove a temporary flag used for gradual rollout."

This reverts commit e64afefdf88d2607c476f13de05193c0f8991976. It caused
a PGO bootstrapped clang to crash on many source files.

`__llvm_profile_instrument_range` seems to trigger a null pointer dereference.

Call stack:
__llvm_profile_instrument_range
llvm::APInt::udiv(llvm::APInt const&) const
getRangeForAffineARHelper

[MVT] Fix getTypeForEVT for v64f16 and v128f16

Summary: These should have half float as the element type

Reviewers: cameron.mcinally, efriedma, sdesmalen, paulwalker-arm

Reviewed By: paulwalker-arm

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84211

Try enabling -Wsuggest-override again, using add_compile_options instead of add_compile_definitions for disabling it in unittests/ directories.

Using add_compile_definitions caused the flag to be passed to rc.exe on Windows and thus broke Windows builds.

DebugInfo: Use debug_line.dwo for debug_macro.dwo

This is an alternative proposal to D81476 (and D82084) - the details were sufficiently confusing to me it seemed easier to write some code and see how it looks.

Reviewers: SouraVX

Differential Revision: https://reviews.llvm.org/D84278

[lldb] Eliminate unneeded value parameters in Utility (NFC)

Eliminates value parameter for types that are not trivially copyable.
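
As a generic illustration of this kind of change (not code taken from lldb's Utility library; `countSpaces` is a made-up function), a parameter of a type that is not trivially copyable is taken by const reference instead of by value when the callee only reads it:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// std::string owns heap storage, so copying it is not a plain memcpy.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string is not trivially copyable");

// Before: std::size_t countSpaces(std::string Text) copied the argument on
// every call. After: the callee only reads Text, so a const reference works.
std::size_t countSpaces(const std::string &Text) {
  std::size_t Count = 0;
  for (char C : Text)
    if (C == ' ')
      ++Count;
  return Count;
}
```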

[Polly] Run polly-update-format. NFC.

For PR46800, implement the GCC __builtin_complex builtin.

glibc's implementation of the CMPLX macro uses it (with -fgnuc-version
set to 4.7 or later).

[gn build] Remove something I missed in 1afd889d0

Temporarily revert D83903 "[PGO] Enable the extended value profile buckets for mem op sizes."

`__llvm_profile_instrument_memop` transitively calls calloc, thus calloc
should not be instrumented.

I saw a
`calloc -> __llvm_profile_instrument_memop -> calloc -> __llvm_profile_instrument_memop -> ...`
infinite loop leading to stack overflow
when the malloc implementation (e.g. tcmalloc) is built and instrumented along with the application.

We should figure out the library calls which may be instrumented and disable
their instrumentation before rolling out this change.

Reviewed By: yamauchi

Differential Revision: https://reviews.llvm.org/D84358

lldb fix for b198de67e0bab462217db50814b1434796fa7caf (PCH/modular codegen refactor)

[SCCP] Add additional multi-edge + phi tests (NFC)

[SCCP] Regenerate test checks (NFC)

And adjust the indbrtest4 test to actually test what it's supposed
to. BB1 is supposed to be eliminated here, but isn't, because
BB0 still branches to it. This was lost due to the incomplete CHECK
lines.

[libc++] Make sure we only consider _GNUC_VER_NEW when the compiler is GCC

When the compiler is Clang, _GNUC_VER_NEW is 0, which messes up the logic.

[llvm][NFC] const-ed MachineBlockFrequencyInfo::isIrrLoopHeader

asan_device_setup's wrapper scripts not handling args with spaces correctly

Summary: Came up in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1103108#c21

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D84237

Merge some of the PCH object support with modular codegen

I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed.

The D48426 implementation worked by setting a module level flag; then, when code generating contents from the PCH, a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag.

This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module.

One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test used to decide if a decl should be emitted, "is the module where this decl is serialized the current main file", has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj.

Otherwise the whole thing is basically streamlined down to the modular code generation path.

This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior.

[If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?]

Reviewers: llunak, hans

Differential Revision: https://reviews.llvm.org/D83652

[ARM] Fix missing MVE_VMUL_qr predicate

This was missed out of 1030e82598da, but hopefully fixes the issues
reported with NEON accidentally generating MVE instructions.

[mlir][linalg] Add vectorization transform for CopyOp

CopyOp gets vectorized to vector.transfer_read followed by vector.transfer_write.

Differential Revision: https://reviews.llvm.org/D83739

[libc++] Workaround broken support for C++17 in GCC 5

[flang] Fix an assert when RESHAPE() is called on empty strings

Summary:
When a constant array of empty strings goes through constant folding, the result
is something that contains no bytes. If this array is passed to the intrinsic
function `RESHAPE()`, we were not handling things correctly. I fixed this by
checking for an empty destination when calling the function `CopyFrom()` on an
array of strings.

I also added a test with a couple of different examples that trigger the
problem.

Reviewers: klausler, tskeith, DavidTruby

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84352

[CGP] Add Pass Dependencies

Add pass dependencies:
  - TargetTransformInfoWrapperPass
  - TargetPassConfig
  - LoopInfoWrapperPass
  - TargetLibraryInfoWrapperPass

To fix inconsistencies when passes are added to the pipeline.

Reviewers: efriedma, kmclaughlin, paquette

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84346

[libc++] Add static_assert to make sure rate limiter doesn't use locks

We want to be sure that atomic<size_t> is always lock-free, or the code
will be much slower than expected (and could even conceivably fail if
the lock implementation somehow calls back into libc++abi).
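
A minimal sketch of such a check, assuming a C++17 compiler (where `is_always_lock_free` is available). The `allowEvent` helper is a hypothetical counter-based limiter added only to give the assertion some context; it is not the libc++ code.

```cpp
#include <atomic>
#include <cstddef>

// Fail the build, rather than silently fall back to a lock, if the
// platform cannot provide a lock-free atomic<size_t>.
static_assert(std::atomic<std::size_t>::is_always_lock_free,
              "the rate limiter relies on a lock-free atomic<size_t>");

// Hypothetical limiter: lets the first Limit events through and rejects
// the rest, using a single atomic increment per event.
bool allowEvent(std::atomic<std::size_t> &Counter, std::size_t Limit) {
  return Counter.fetch_add(1, std::memory_order_relaxed) < Limit;
}
```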

[libc++] Build the dylib with C++17 to allow aligned new/delete

This allows simplifying the implementation of barriers.

This is a re-commit of 1ac403bd145d, which had to be reverted in
64a9c944fc45 because the minimum CMake version wasn't high enough.
Now that we've upgraded, we can do this.

Differential Revision: https://reviews.llvm.org/D75243

[gn build] Port 418121c30a8

[lldb] Use std::make_unique<DynamicRegisterInfo> (NFC)

[SCCP] Add multi-edge switch + phi test case (NFC)

[PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation

The implementation of the xvtlsbb builtins/intrinsics was not correct, as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to be i32 instead, as having the second
argument as an i1 can lead to issues in the backend.

Differential Revision: https://reviews.llvm.org/D84291

DwarfCompileUnit.cpp - remove duplicate includes that already exist in DwarfCompileUnit.h. NFC.

Also remove DIE.h include from DwarfCompileUnit.h and replace with forward declarations.

CodeViewDebug.cpp - remove duplicate includes that already exist in CodeViewDebug.h. NFC.

[CMake] Bump CMake minimum version to 3.13.4

This upgrade should be friction-less because we've already been ensuring
that CMake >= 3.13.4 is used.

This is part of the effort discussed on llvm-dev here:

http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html

Differential Revision: https://reviews.llvm.org/D78648

Revert "Enable -Wsuggest-override in the LLVM build" and the follow-ups.

After lots of follow-up fixes, there are still problems, such as
-Wno-suggest-override getting passed to the Windows Resource Compiler
because it was added with add_definitions in the CMake file.

Rather than piling on another fix, let's revert so this can be re-landed
when there's a proper fix.

This reverts commit 21c0b4c1e8d6a171899b31d072a47dac27258fc5.
This reverts commit 81d68ad27b29b1e6bc93807c6e42b14e9a77eade.
This reverts commit a361aa5249856e333a373df90947dabf34cd6aab.
This reverts commit fa42b7cf2949802ff0b8a63a2e111a2a68711067.
This reverts commit 955f87f947fda3072a69b0b00ca83c1f6a0566f6.
This reverts commit 8b16e45f66e24e4c10e2cea1b70d2b85a7ce64d5.
This reverts commit 308a127a38d1111f3940420b98ff45fc1c17715f.
This reverts commit 274b6b0c7a8b584662595762eaeff57d61c6807f.
This reverts commit 1c7037a2a5576d0bb083db10ad947a8308e61f65.

[llvm][NFC] Remove definition from build system of LLVM_HAVE_TF_AOT

We can just use the definition from config.h. This means we need to move
a few lines around in CMakeLists.txt - the TF_AOT detection needs to be
before the spot we process the config.h.cmake files.

Differential Revision: https://reviews.llvm.org/D84349

AArch64: Use Register

GlobalISel: Don't use virtual for distinguishing arg handlers

There's no reason to involve the hassle of a virtual method targets
have to override for a simple boolean.
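
The pattern can be sketched like this. The class and member names are hypothetical and do not mirror the actual GlobalISel handler hierarchy; the sketch only shows a constructor-supplied boolean replacing a virtual query.

```cpp
// Base class stores the answer once instead of asking subclasses through a
// virtual method that every target would have to override.
class ValueHandlerSketch {
public:
  explicit ValueHandlerSketch(bool IsIncoming) : IsIncoming(IsIncoming) {}
  bool isIncomingArgumentHandler() const { return IsIncoming; }

private:
  bool IsIncoming;
};

// Subclasses pass the constant up to the base constructor.
struct IncomingHandlerSketch : ValueHandlerSketch {
  IncomingHandlerSketch() : ValueHandlerSketch(true) {}
};

struct OutgoingHandlerSketch : ValueHandlerSketch {
  OutgoingHandlerSketch() : ValueHandlerSketch(false) {}
};
```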

Not sure exactly what's going on with Mips, but it seems to define its
own totally separate handler classes.

[gn build] (manually) port 746b5fad5b

[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)

This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier. The previous patch in this series implements Clang
front end support. See that patch summary for behaviors that are not
yet supported.

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D83062

Fix Windows build

AMDGPU: Don't assert on f16 inv2pi immediates pre-gfx8

v_cvt_f32_f16 can still accept this value as a literal constant. This
showed up in GlobalISel since it doesn't have constant folding for
G_FPEXT.

[clangd] Disable -Wsuggest-override for unittests/

[mlir][Vector] Vectorize integer matmuls

The underlying infrastructure supports this already, just add the
pattern matching for linalg.generic.

Differential Revision: https://reviews.llvm.org/D84335

[libcxx] Fix default argument for merge_archives.py -L flag

If we use the default of None, we get a python exception in
find_and_diagnose_missing() instead of printing a sensible error message.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D84342

GlobalISel: Restructure argument lowering loop in handleAssignments

This was structured in a way that implied every split argument is in
memory, or in registers. It is possible to pass an original argument
partially in registers, and partially in memory. Transpose the logic
here to only consider a single piece at a time. Every individual
CCValAssign should be treated independently, and any merge to original
value needs to be handled later.

This is in preparation for merging some preprocessing hacks in the
AMDGPU calling convention lowering into the generic code.

I'm also not sure what the correct behavior is for memlocs where the
promoted size is larger than the original value. I've opted to clamp
the memory access size to not exceed the value register to avoid the
explicit trunc/extend/vector widen/vector extract instruction. This
happens for AMDGPU for i8 arguments that end up stack passed, which
are promoted to i16 (I think this is a preexisting DAG bug though, and
they should not really be promoted when in memory).

AMDGPU: Add IntrWillReturn to llvm.amdgcn.atomic.csub

[Sanitizers] Add interceptor for xdrrec_create

For now, xdrrec_create is only intercepted on Linux as its signature
is different on Solaris.

The method of intercepting xdrrec_create isn't super ideal but I
couldn't think of a way around it: Using an AddrHashMap combined
with wrapping the userdata field.

We can't just allocate a handle on the heap in xdrrec_create and leave
it at that, since there'd be no way to free it later. This is because it
doesn't seem to be possible to access handle from the XDR struct, which
is the only argument to xdr_destroy.
On the other hand, the callbacks don't have a way to get at the
x_private field of XDR, which is what I chose for the HashMap key. So we
need to wrap the handle parameter of the callbacks. But we can't just
pass x_private as handle (as it hasn't been set yet). We can't put the
wrapper struct into the HashMap and pass its pointer as handle, as the
key we need (x_private again) hasn't been set yet.

So I allocate the wrapper struct on the heap, pass its pointer as
handle, and put it into the HashMap so xdr_destroy can find it later and
destroy it.

Differential Revision: https://reviews.llvm.org/D83358

[profile][test] Add -fuse-ld=bfd to make instrprof-lto-pgogen.c robust

Otherwise if 'ld' is an older system LLD (FreeBSD; or if someone adds 'ld' to
point to an LLD from a different installation) which does not support the
current ModuleSummaryIndex::BitCodeSummaryVersion, the test will fail.

Add lit feature 'binutils_lto'. GNU ld is more common than GNU gold, so
we can just require 'is_binutils_lto_supported' to additionally support GNU ld.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D84133

AMDGPU/GlobalISel: Fix translation of indirect calls

[WebAssembly] Autogenerate checks in simd-offset.ll

Implementing new functionality tested in this file requires adding new
tests for many IR addressing patterns, which can be a large
maintenance burden. This patch makes adding tests easier by switching
to using autogenerated checks. This patch also removes the testing
mode that has simd128 disabled because it would produce very large
checks and is not particularly interesting.

Differential Revision: https://reviews.llvm.org/D84288

Reapply "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"

(This reverts commit a5e0194709c40212694370e0ea789a1ca14548b5, and
corrects author).

Rename the pass to be able to extend it to function properties other than inliner features.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D82044

Only enable -Wsuggest-override if it doesn't suggest adding override to functions that are already final

A previous patch added -Wsuggest-override using a simple add_flag_if_supported(). This causes lots of warnings in LLVM when building with older GCC versions (< 9.2) which suggest adding override to functions that are only marked final. The current flags in both GCC >=9.2 and Clang accept plain final as equivalent to override final.

This patch adds logic to detect versions of -Wsuggest-override that warn on `void foo() final` and disables them to avoid warning spam in builds using older GCCs. This has the added minor benefit of getting rid of the useless C_SUPPORTS_SUGGEST_OVERRIDE_FLAG CMake cache variable which was set by add_flag_if_supported().
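
For illustration, the kind of declaration at issue looks like the sketch below. The type and function names are hypothetical and are not taken from the patch or from its CMake probe; the snippet only shows a virtual function that is marked `final` without `override`, which is the case older GCC versions are described as warning about.

```cpp
#include <type_traits>

struct Base {
  virtual ~Base() = default;
  virtual int foo() { return 0; }
};

struct Derived : Base {
  // Marked `final` only, with no `override`. Per the message above, GCC
  // versions before 9.2 suggest adding `override` here under
  // -Wsuggest-override, while newer GCC and Clang accept plain `final`.
  int foo() final { return 1; }
};

// Sanity check that Derived really takes part in virtual dispatch.
static_assert(std::is_polymorphic<Derived>::value,
              "Derived overrides a virtual function");
```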

Differential Revision: https://reviews.llvm.org/D84292

[gn build] Port a5e0194709c

[gn build] Port 2a6c871596c

[lldb] Cleanup CommandObject registration (NFC)

- Remove the spurious argument to `CommandObjectScript`.
- Use make_shared instead of bare `new`.
- Move code duplication behind a macro.

Differential revision: https://reviews.llvm.org/D84336

[gn build] Handle X86InstCombineIntrinsic.cpp in 2a6c871596ce

[MSAN] Instrument libatomic load/store calls

These calls are neither intercepted by compiler-rt nor is libatomic.a
naturally instrumented.

This patch uses the existing libcall mechanism to detect a call
to atomic_load or atomic_store, and instruments them much like
the preexisting instrumentation for atomics.

Calls to _load are modified to have at least Acquire ordering, and
calls to _store at least Release ordering. Because this needs to be
converted at runtime, msan injects a LUT (implemented as a vector
with extractelement).
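
The ordering adjustment described above can be pictured as a small lookup table. The sketch below is not the code from the patch: the enum and helper names are hypothetical, and the table contents are an assumption about how "at least Acquire" and "at least Release" would be rounded up. Only the numeric values follow the C ABI memory-order constants (relaxed = 0 through seq_cst = 5).

```cpp
#include <cstdint>

// C ABI memory-order values as passed to the __atomic_* libcalls.
enum Order : std::uint32_t {
  Relaxed = 0, Consume = 1, Acquire = 2, Release = 3, AcqRel = 4, SeqCst = 5
};

// Indexed by the incoming ordering; each entry is that ordering
// strengthened so that it includes Acquire semantics (assumed mapping).
constexpr Order kAddAcquire[6] = {Acquire, Acquire, Acquire,
                                  AcqRel,  AcqRel,  SeqCst};

// Same idea for stores: each entry includes Release semantics.
constexpr Order kAddRelease[6] = {Release, AcqRel, AcqRel,
                                  Release, AcqRel, SeqCst};

// The ordering is only known at run time in the instrumented program,
// hence a table lookup rather than a compile-time choice.
constexpr Order addAcquire(Order o) { return kAddAcquire[o]; }
constexpr Order addRelease(Order o) { return kAddRelease[o]; }
```

An ordering that is already strong enough maps to itself, and mixing in the missing half yields AcqRel; for example, a Release load ordering becomes AcqRel.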

Differential Revision: https://reviews.llvm.org/D83337

Revert "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"

This reverts commit 44a6bda19b40f2dfcbe92fc3d58bb6276c71ef78. I forgot
to correctly attribute it to tarinduj. Fixing and resubmitting.

[gn build] Port 2a6c871596ce & 44a6bda19b40

[ARM] Add predicated add reduction patterns

Given a vecreduce.add(select(p, x, 0)), we can convert that to a
predicated vaddv, as the else value for the select is the identity
value, a zero. That is what this patch does for the vaddv, vaddva,
vaddlv and vaddlva instructions, copying the existing patterns to also
handle predication through a select.
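
The equivalence the patterns rely on can be checked with a scalar model. This is only a sketch with hypothetical helper names, using plain 32-bit lanes rather than actual MVE instructions: summing select(p, x, 0) gives the same result as a reduction that skips inactive lanes, because 0 is the identity for addition.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Reference form: vecreduce.add(select(p, x, 0)).
template <std::size_t N>
constexpr std::int32_t reduceAddOfSelect(const std::array<bool, N> &p,
                                         const std::array<std::int32_t, N> &x) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    sum += p[i] ? x[i] : 0;  // inactive lanes contribute the identity, 0
  return sum;
}

// Predicated form: inactive lanes are skipped instead of being zeroed.
template <std::size_t N>
constexpr std::int32_t predicatedReduceAdd(const std::array<bool, N> &p,
                                           const std::array<std::int32_t, N> &x) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (p[i])
      sum += x[i];
  return sum;
}
```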

Differential Revision: https://reviews.llvm.org/D84101

[Sema][AArch64] Add semantics for arm_sve_vector_bits attribute

Summary:
This patch implements semantics for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT).

Implemented in this patch is the behaviour described in section 3.7.3.2
and minimal parts of sections 3.7.3.3 and 3.7.3.4. This includes:

    * Defining VLST globals, structs, unions, and local variables
    * Implicit casting between VLAT <=> VLST.
    * Diagnosis of ill-formed conditional expressions of the form:

        C ?  E1 : E2

      where E1 is a VLAT type and E2 is a VLST, or vice-versa. This
      avoids any ambiguity about the nature of the result type (i.e. is
      it sized or sizeless).
    * For vectors:
        * sizeof(VLST) == N/8
        * alignof(VLST) == 16
    * For predicates:
        * sizeof(VLST) == N/64
        * alignof(VLST) == 2
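
The size rules above can be worked through for concrete values of N, the vector length in bits given to arm_sve_vector_bits. The helpers below are hypothetical and only restate the arithmetic from the list; they do not use the attribute itself. A predicate holds one bit per byte of vector data, which is why its size is N/64 bytes.

```cpp
#include <cstddef>

// N is the vector length in bits (hypothetical helper names).
constexpr std::size_t vlstVectorSizeBytes(std::size_t N) { return N / 8; }
constexpr std::size_t vlstPredicateSizeBytes(std::size_t N) { return N / 64; }

// Alignments from the list above; they do not depend on N.
constexpr std::size_t kVlstVectorAlign = 16;
constexpr std::size_t kVlstPredicateAlign = 2;

// Worked example for N = 512: 64-byte vectors, 8-byte predicates.
static_assert(vlstVectorSizeBytes(512) == 64, "512-bit vector is 64 bytes");
static_assert(vlstPredicateSizeBytes(512) == 8, "512-bit predicate is 8 bytes");
```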

VLSTs have the same representation as VLATs in the AST but are wrapped
with a TypeAttribute. Scalable types are currently emitted in the IR for
uses such as globals and structs which don't support these types; this
is addressed in the next patch with codegen, where VLSTs are lowered to
sized arrays for globals, structs / unions and arrays.

Not implemented in this patch is the behaviour guarded by the feature
macros:

    * __ARM_FEATURE_SVE_VECTOR_OPERATORS
    * __ARM_FEATURE_SVE_PREDICATE_OPERATORS

As such, the GNU __attribute__((vector_size)) extension is not available
and operators such as binary '+' are not supported for VLSTs. Support
for this is intended to be addressed by later patches.

[1] https://developer.arm.com/documentation/100987/latest

This is patch 2/4 of a patch series.

Reviewers: sdesmalen, rsandifo-arm, efriedma, cameron.mcinally, ctetreau, rengolin, aaron.ballman

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D83551