platform/upstream/llvm.git
4 years ago[mlir][Linalg] Mostly NFC - Refactor Linalg patterns and transformations.
Nicolas Vasilache [Sat, 2 May 2020 05:03:37 +0000 (01:03 -0400)]
[mlir][Linalg] Mostly NFC - Refactor Linalg patterns and transformations.

Linalg transformations are currently exposed as DRRs.
Unfortunately RewriterGen does not play well with the line of work on named linalg ops which require variadic operands and results.
Additionally, DRR is arguably not the right abstraction to expose compositions of such patterns that don't rely on SSA use-def semantics.

This revision abandons DRRs and exposes manually written C++ patterns.

Refactorings and cleanups are performed to uniformize APIs.
This refactoring will allow replacing the currently manually specified Linalg named ops.

A collateral victim of this refactoring is the `tileAndFuse` DRR, and the one associated test, which will be revived at a later time.

Lastly, the following 2 tests do not add value and are altered:
- a dot_perm tile + interchange test does not test anything new and is removed
- a dot tile + lower to loops does not need 2-D tiling and is trimmed.

4 years ago[ELF] Don't advance sh_offset for an empty section whose PT_LOAD is removed (due...
Fangrui Song [Fri, 1 May 2020 16:50:37 +0000 (09:50 -0700)]
[ELF] Don't advance sh_offset for an empty section whose PT_LOAD is removed (due to p_memsz=0)

removeEmptyPTLoad() removes empty (p_memsz=0) PT_LOAD segments.  In
assignFileOffsets(), setFileOffset() unnecessarily advances file offsets
for the empty sections contained in those segments.

This is exposed by arm Linux kernel's multi_v5_defconfig
(see https://bugs.llvm.org/show_bug.cgi?id=45632)

```
ld.lld (max-page-size=65536):
  [34] .init.data        PROGBITS        c0c24000 c34000 0128ac 00  WA  0   0 4096
  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1
  [37] .data             PROGBITS        c0c38000 c58000 0647a0 00  WA  0   0 32

arm-linux-gnueabi-ld (max-page-size=65536):
  [23] .init.data        PROGBITS        c0c12000 c22000 0128ac 00  WA  0   0 4096
  [24] .text_itcm        PROGBITS        fffe0000 ca2558 000000 00   W  0   0  1
  [25] .data_dtcm        PROGBITS        fffe8000 ca2558 000000 00   W  0   0  1
  [26] .data             PROGBITS        c0c26000 c36000 0647a0 00  WA  0   0 32
```

This patch clears OutputSection::ptLoad if ptLoad is removed by
removeEmptyPTLoad(). Conceptually this removes "dangling" references.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D79254
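
The "dangling reference" idea can be sketched as follows. This is a minimal illustration, not lld's code: `Segment`, `Section`, and `removeEmptySegments` are hypothetical stand-ins for the PT_LOAD program header, `OutputSection`, and `removeEmptyPTLoad()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a PT_LOAD program header.
struct Segment {
  std::uint64_t memSize;  // plays the role of p_memsz
};

// Hypothetical stand-in for an output section that remembers its segment.
struct Section {
  Segment* ptLoad;
};

// Drops empty segments and clears every section pointer that referred to one,
// so later passes cannot consult a segment that is no longer in the list.
void removeEmptySegments(std::vector<Segment*>& segments,
                         std::vector<Section>& sections) {
  for (Section& sec : sections)
    if (sec.ptLoad != nullptr && sec.ptLoad->memSize == 0)
      sec.ptLoad = nullptr;

  segments.erase(std::remove_if(segments.begin(), segments.end(),
                                [](const Segment* seg) {
                                  return seg->memSize == 0;
                                }),
                 segments.end());
}
```

In this sketch, a later offset-assignment pass can test `ptLoad == nullptr` instead of acting on a segment that has already been removed.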

4 years ago[libc++] Translate compiler-identification Lit features to the new DSL
Louis Dionne [Fri, 17 Apr 2020 18:00:39 +0000 (14:00 -0400)]
[libc++] Translate compiler-identification Lit features to the new DSL

4 years ago[flang] Fixed a crash
Pete Steinfeld [Sat, 2 May 2020 03:47:59 +0000 (20:47 -0700)]
[flang] Fixed a crash

Summary:
I found a small test case that caused a crash when derived type
definitions have parameters without definitions.

Reviewers: tskeith, klausler, DavidTruby

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79282

4 years agoFix LABEL match for test case for D72841 #pragma float_control
Melanie Blower [Mon, 4 May 2020 14:25:23 +0000 (07:25 -0700)]
Fix LABEL match for test case for D72841 #pragma float_control

4 years ago[InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476)
Simon Pilgrim [Mon, 4 May 2020 14:21:52 +0000 (15:21 +0100)]
[InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476)

This patch adds support for discarding integer absolutes (abs + nabs variants) from self-multiplications.

ABS Alive2: http://volta.cs.utah.edu:8080/z/rwcc8W
NABS Alive2: http://volta.cs.utah.edu:8080/z/jZXUwQ

This is an InstCombine version of D79304 - I'm not sure yet if we'll need that after this.

Reviewed By: @lebedev.ri and @xbolva00

Differential Revision: https://reviews.llvm.org/D79319
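
As a sanity check alongside the Alive2 links, the fold can be verified exhaustively for 8-bit values. This is a sketch under assumptions: it models plain wrapping arithmetic only (no nsw/nuw flags or poison), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping 8-bit multiply, mirroring a `mul i8` without overflow flags.
std::uint8_t mulWrap(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(static_cast<unsigned>(a) * b);
}

// Wrapping absolute value: the minimum signed value maps to itself.
std::uint8_t absWrap(std::uint8_t x) {
  return (x & 0x80u) ? static_cast<std::uint8_t>(0u - x) : x;
}

// Negated absolute value ("nabs").
std::uint8_t nabsWrap(std::uint8_t x) {
  return static_cast<std::uint8_t>(0u - absWrap(x));
}

// Returns true if mul(abs(x),abs(x)) and mul(nabs(x),nabs(x)) both equal
// mul(x,x) for every 8-bit x.
bool foldHoldsForAllI8() {
  for (unsigned v = 0; v < 256; ++v) {
    const std::uint8_t x = static_cast<std::uint8_t>(v);
    const std::uint8_t expected = mulWrap(x, x);
    if (mulWrap(absWrap(x), absWrap(x)) != expected) return false;
    if (mulWrap(nabsWrap(x), nabsWrap(x)) != expected) return false;
  }
  return true;
}
```

The check passes because abs(x) and nabs(x) are each either x or -x, and (-x)*(-x) equals x*x in modular arithmetic.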

4 years ago[X86][SSE] Move some VZEXT_MOVL combines into combineTargetShuffle. NFC.
Simon Pilgrim [Mon, 4 May 2020 14:03:50 +0000 (15:03 +0100)]
[X86][SSE] Move some VZEXT_MOVL combines into combineTargetShuffle. NFC.

Minor cleanup of combineShuffle by moving some of the low hanging fruit (load + scalar_to_vector folds).

4 years ago[libc++] NFC: Print Lit available features in sorted order
Louis Dionne [Mon, 4 May 2020 14:11:49 +0000 (10:11 -0400)]
[libc++] NFC: Print Lit available features in sorted order

This makes it easier to diff them between bot runs.

4 years ago[MLIR] Add complex numbers to standard dialect
Frederik Gossen [Mon, 4 May 2020 12:19:04 +0000 (12:19 +0000)]
[MLIR] Add complex numbers to standard dialect

Add `CreateComplexOp`, `ReOp`, and `ImOp` to the standard dialect.
This is the first step to support complex numbers.

Differential Revision: https://reviews.llvm.org/D79159

4 years ago[COFF] Avoid allocating temporary vectors during ICF
Reid Kleckner [Sat, 2 May 2020 20:28:56 +0000 (13:28 -0700)]
[COFF] Avoid allocating temporary vectors during ICF

Heap profiling with ETW shows that LLD performs 4,053,721 heap
allocations over its lifetime, and ~800,000 of them come from
assocEquals. These vectors are created just to do a comparison, so fuse
the comparison into the loop and avoid the allocation.

ICF is overall a small portion of the time spent linking, and I did not
measure overall throughput improvements from this change above the noise
threshold. However, these show up in the heap profiler, and the work is
done, so we might as well land it if the code is clear enough.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D79297
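
The general shape of the change, sketched outside of LLD: instead of building two filtered vectors and comparing them, walk both inputs in lockstep. `Child` and its `ignored` flag are hypothetical; the real filter in assocEquals is not described in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an associated child section.
struct Child {
  int id;
  bool ignored;  // hypothetical: entries skipped during comparison
};

// Allocating version: builds two temporary vectors just to compare them.
bool equalsWithTemporaries(const std::vector<Child>& a,
                           const std::vector<Child>& b) {
  std::vector<int> fa, fb;
  for (const Child& c : a)
    if (!c.ignored) fa.push_back(c.id);
  for (const Child& c : b)
    if (!c.ignored) fb.push_back(c.id);
  return fa == fb;
}

// Fused version: the same comparison with no heap allocation.
bool equalsFused(const std::vector<Child>& a, const std::vector<Child>& b) {
  std::size_t i = 0, j = 0;
  for (;;) {
    // Skip filtered-out entries on both sides.
    while (i < a.size() && a[i].ignored) ++i;
    while (j < b.size() && b[j].ignored) ++j;
    // Equal only if both sides run out at the same time.
    if (i == a.size() || j == b.size())
      return i == a.size() && j == b.size();
    if (a[i].id != b[j].id) return false;
    ++i;
    ++j;
  }
}
```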

4 years ago[SelectionDAGBuilder] Stop setting alignment to one for hidden sret values
Alex Richardson [Fri, 29 Mar 2019 12:13:56 +0000 (12:13 +0000)]
[SelectionDAGBuilder] Stop setting alignment to one for hidden sret values

We allocated a suitably aligned frame index so we know that all the values
have ABI alignment.
For MIPS this avoids using a pair of lwl + lwr instructions instead of a
single lw. I found this when compiling CHERI pure capability code where
we can't use the lwl/lwr unaligned loads/stores and were falling
back to a byte load + shift + or sequence.

This should save a few instructions for MIPS and possibly other backends
that don't have fast unaligned loads/stores.
It also improves code generation for CodeGen/X86/pr34653.ll and
CodeGen/WebAssembly/offset.ll since they can now use aligned loads.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78999

4 years ago[MIPS] Add a baseline test showing current inefficient hidden sret lowering
Alex Richardson [Fri, 29 Mar 2019 12:06:40 +0000 (12:06 +0000)]
[MIPS] Add a baseline test showing current inefficient hidden sret lowering

SelectionDAGBuilder currently doesn't propagate the known alignment of
the sret parameter. This is inefficient for MIPS and highly inefficient for
our out-of-tree CHERI-extended MIPS since we don't have lwl/lwr so fall back
to byte loads for align == 1.

4 years ago[AMDGPU] Enable carry out ADD/SUB operations divergence driven instruction selection.
alex-t [Thu, 23 Apr 2020 17:55:36 +0000 (20:55 +0300)]
[AMDGPU] Enable carry out ADD/SUB operations divergence driven instruction selection.

Summary: This change enables all kinds of carry out ISD opcodes to be selected according to the node divergence.

Reviewers: rampitec, arsenm, vpykhtin

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78091

4 years ago[lldb/DWARF] Remove dead code in DWARFDebugInfoEntry
Pavel Labath [Mon, 4 May 2020 13:18:59 +0000 (15:18 +0200)]
[lldb/DWARF] Remove dead code in DWARFDebugInfoEntry

The dumping code is not used by anyone, and is a source of
inconsistencies with the llvm dwarf parser, as dumping is implemented at
a different level (DWARFDie) there.

4 years ago[ELF] Move SHF_LINK_ORDER till OutputSection addresses are known
Peter Smith [Wed, 22 Apr 2020 19:28:52 +0000 (20:28 +0100)]
[ELF] Move SHF_LINK_ORDER till OutputSection addresses are known

Sections with the SHF_LINK_ORDER flag must be ordered in the same relative
order as the Sections they have a link to. When using a linker script an
arbitrary expression may be used for the virtual address of the
OutputSection. In some cases the virtual address does not monotonically
increase as the OutputSection index increases, so if we base the ordering
of the SHF_LINK_ORDER sections on the index then we can get the order
wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have
created OutputSection virtual addresses.

Differential Revision: https://reviews.llvm.org/D79286
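
A minimal sketch of why the ordering key matters, assuming hypothetical `OutputSec` and `LinkOrderSec` types (not lld's classes): sorting by the linked section's address gives the right order even when addresses do not increase with the section index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an output section placed by a linker script.
struct OutputSec {
  std::uint64_t addr;  // virtual address, known only after address assignment
  unsigned index;      // position in the output section list
};

// Hypothetical stand-in for a section carrying SHF_LINK_ORDER.
struct LinkOrderSec {
  int id;
  const OutputSec* linked;  // the section it has a link to
};

// Orders SHF_LINK_ORDER sections by the address of the section they link to.
void sortByLinkedAddress(std::vector<LinkOrderSec>& secs) {
  std::stable_sort(secs.begin(), secs.end(),
                   [](const LinkOrderSec& a, const LinkOrderSec& b) {
                     return a.linked->addr < b.linked->addr;
                   });
}
```

With a script that places index 0 at 0x2000 and index 1 at 0x1000, an index-based sort would keep the section linked to index 0 first, while the address-based sort above puts the one linked to 0x1000 first.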

4 years ago[libc++] Define a few Lit features using the new DSL
Louis Dionne [Sun, 3 May 2020 17:26:23 +0000 (13:26 -0400)]
[libc++] Define a few Lit features using the new DSL

This commit migrates some of the Lit features from config.py to the new
DSL. This simplifies config.py and is a first step towards defining all
the features using the DSL instead of the complex logic in config.py.

Differential Revision: https://reviews.llvm.org/D78382

4 years ago[AArch64] Add NVIDIA Carmel support
Raul Tambre [Mon, 4 May 2020 10:45:35 +0000 (11:45 +0100)]
[AArch64] Add NVIDIA Carmel support

Summary:
NVIDIA's Carmel ARM64 cores are used in Tegra194 chips found in Jetson AGX Xavier, DRIVE AGX Xavier and DRIVE AGX Pegasus.

References:
* https://devblogs.nvidia.com/nvidia-jetson-agx-xavier-32-teraops-ai-robotics/#h.huq9xtg75a5e
* NVIDIA Xavier Series System-on-Chip Technical Reference Manual 1.3 (https://developer.nvidia.com/embedded/downloads#?search=Xavier%20Series%20SoC%20Technical%20Reference%20Manual)

Reviewers: sdesmalen, paquette

Reviewed By: sdesmalen

Subscribers: llvm-commits, ianshmean, kristof.beyls, hiraditya, jfb, danielkiss, cfe-commits, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D77940

4 years agoReapply "Add support for #pragma float_control" with buildbot fixes
Melanie Blower [Fri, 1 May 2020 17:32:06 +0000 (10:32 -0700)]
Reapply "Add support for #pragma float_control" with buildbot fixes
Add support for #pragma float_control

Reviewers: rjmccall, erichkeane, sepavloff

Differential Revision: https://reviews.llvm.org/D72841

This reverts commit fce82c0ed310174fe48e2402ac731b6340098389.

4 years ago[mlir] Removed tight coupling of BufferPlacement pass to Alloc and Dealloc.
Marcel Koester [Tue, 21 Apr 2020 13:12:10 +0000 (15:12 +0200)]
[mlir] Removed tight coupling of BufferPlacement pass to Alloc and Dealloc.

The current BufferPlacement implementation tries to find Alloc and Dealloc
operations in order to move them. However, this is a tight coupling to
standard-dialect ops which has been removed in this CL.

Differential Revision: https://reviews.llvm.org/D78993

4 years ago[SVE][Codegen] Lower legal min & max operations
Kerry McLaughlin [Mon, 4 May 2020 10:18:50 +0000 (11:18 +0100)]
[SVE][Codegen] Lower legal min & max operations

Summary:
This patch adds AArch64ISD nodes for [S|U]MIN_PRED
and [S|U]MAX_PRED, and lowers both SVE intrinsics and
IR operations for min and max to these nodes.

There are two forms of these instructions for SVE: a predicated
form and an immediate (unpredicated) form. The patterns
which existed for the latter have been updated to match a
predicated node with an immediate and map this
to the immediate instruction.

Reviewers: sdesmalen, efriedma, dancgr, rengolin

Reviewed By: efriedma

Subscribers: huihuiz, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79087

4 years ago[SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func
Jay Foad [Thu, 30 Apr 2020 15:53:20 +0000 (16:53 +0100)]
[SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func

optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.

Based on a patch by Tim Renouf.

Differential Revision: https://reviews.llvm.org/D68231
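
The point that these folds never emit a new pow call can be illustrated with a small sketch. The function name and the exact set of exponents are assumptions for illustration; the message itself only names `pow(x,2.0) -> x*x` "etc".

```cpp
#include <cassert>

// Hypothetical helper: for a few special exponents, computes the result
// without calling pow at all. Returns false when no such expansion applies,
// in which case `result` is left untouched.
bool expandPowWithoutLibcall(double x, double exponent, double& result) {
  if (exponent == 2.0) {
    result = x * x;
    return true;
  }
  if (exponent == 1.0) {
    result = x;
    return true;
  }
  if (exponent == -1.0) {
    result = 1.0 / x;
    return true;
  }
  return false;  // a real pow would be needed here
}
```

Because no path in the sketch depends on a library pow being present, the expansion stays valid on a target without a math library.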

4 years ago[InstCombine] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476)
Simon Pilgrim [Mon, 4 May 2020 09:23:35 +0000 (10:23 +0100)]
[InstCombine] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476)

Includes abs() and nabs() variants

4 years ago[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC).
Florian Hahn [Mon, 4 May 2020 09:17:24 +0000 (10:17 +0100)]
[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC).

There's no need to duplicate the logic to push to the different
work-lists.

4 years agoFix building with GCC5 after e64f99c51a8e
Hans Wennborg [Mon, 4 May 2020 09:04:48 +0000 (11:04 +0200)]
Fix building with GCC5 after e64f99c51a8e

It was failing with:

  /work/llvm.monorepo/clang-tools-extra/clangd/ClangdServer.cpp: In lambda function:
  /work/llvm.monorepo/clang-tools-extra/clangd/ClangdServer.cpp:374:75:
  error: could not convert ‘(const char*)""’ from ‘const char*’ to ‘llvm::StringLiteral’
                                                  trace::Metric::Distribution);
                                                                             ^

4 years agoPrecommit test updates for D68231.
Jay Foad [Mon, 4 May 2020 08:26:56 +0000 (09:26 +0100)]
Precommit test updates for D68231.

4 years ago[mlir][rocdl] add rocdl.barrier op.
Wen-Heng (Jack) Chung [Mon, 4 May 2020 08:32:16 +0000 (10:32 +0200)]
[mlir][rocdl] add rocdl.barrier op.

- Add rocdl.barrier op.
- Lower gpu.barrier to rocdl.barrier in -convert-gpu-to-rocdl.

Differential Revision: https://reviews.llvm.org/D79126

4 years ago[mlir][vector] add tests for type_cast taking non-zero addrspace
Wen-Heng (Jack) Chung [Fri, 1 May 2020 10:15:57 +0000 (12:15 +0200)]
[mlir][vector] add tests for type_cast taking non-zero addrspace

Add tests for vector.type_cast that takes memrefs on non-zero
addrspaces.

Differential Revision: https://reviews.llvm.org/D79099

4 years ago[VE][NFC] formatting VEISD enum
Simon Moll [Mon, 4 May 2020 07:50:27 +0000 (09:50 +0200)]
[VE][NFC] formatting VEISD enum

4 years ago[llvm-dwarfdump][Stats] Clean up
Djordje Todorovic [Thu, 23 Apr 2020 10:14:13 +0000 (12:14 +0200)]
[llvm-dwarfdump][Stats] Clean up

This addresses:
  -Clean up the source code
  -Refactor the JSON fields
  -Fix the test cases
  -Improve the docs for the stats output

Differential Revision: https://reviews.llvm.org/D77789

4 years ago[X86] Simplify some code in combineTruncatedArithmetic. NFC
Craig Topper [Mon, 4 May 2020 06:53:08 +0000 (23:53 -0700)]
[X86] Simplify some code in combineTruncatedArithmetic. NFC

We haven't promoted AND/OR/XOR to vXi64 types for a while. So
there's no reason to use isOperationLegalOrPromote. So we can
just use isOperationLegal by merging with ADD handling.

4 years ago[X86] Custom legalize v16i64->v16i8 truncate with avx512.
Craig Topper [Mon, 4 May 2020 05:35:32 +0000 (22:35 -0700)]
[X86] Custom legalize v16i64->v16i8 truncate with avx512.

Default legalization will create two v8i64 truncs to v8i32, concat
them to v16i32, and then truncate the rest of the way to v16i8.

Instead we can truncate directly from v8i64 to v8i8 in the lower
half of an xmm. Then concat the two halves to use vpunpcklqdq.
This is the same number of uops, but the dependency chain through
the uops is better since the halves are merged at the end.

I had to add SimplifyDemandedBits support for VTRUNC to prevent
a regression on vector-trunc-math.ll. combineTruncatedArithmetic
no longer gets a chance to shrink vXi64 mul so we were producing
the v8i64 multiply sequence using multiple PMULUDQs. With the
demanded bits fix we are able to prune out the extra ops leaving
just two PMULUDQs, one for each v8i64 half. This is twice the
width of the 2 v8i32 PMULLDs we had before, but PMULUDQ is 1
uop and PMULLD is 2. We also save some truncates. It's probably
worth using PMULUDQ even when PMULLQ is available since the latter
is 3 uops, but that will require a different change.

Differential Revision: https://reviews.llvm.org/D79231

4 years agoMake Polly tests dependencies explicit
serge-sans-paille [Mon, 4 May 2020 06:06:39 +0000 (08:06 +0200)]
Make Polly tests dependencies explicit

Due to libPolly now using the component infrastructure, it no longer carries all
dependencies as it used to do.

Differential Revision: https://reviews.llvm.org/D79295

4 years ago[llvm-objcopy] Avoid invalid Sec.Offset after D79229
Fangrui Song [Mon, 4 May 2020 04:54:28 +0000 (21:54 -0700)]
[llvm-objcopy] Avoid invalid Sec.Offset after D79229

To avoid undefined behavior caught by -fsanitize=undefined on binary-paddr.test

  void SectionWriter::visit(const Section &Sec) {
    if (Sec.Type != SHT_NOBITS)
      // Sec.Contents is empty while Sec.Offset may be out of bound
      llvm::copy(Sec.Contents, Out.getBufferStart() + Sec.Offset);
  }

4 years ago[Attributor][NFC] Replace the nested AAMap with a key pair
Johannes Doerfert [Wed, 22 Apr 2020 02:34:39 +0000 (21:34 -0500)]
[Attributor][NFC] Replace the nested AAMap with a key pair

No functional change is intended.
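
A rough sketch of the data-structure change named in the title, with hypothetical types (`KindId`, `Position`, `Attribute`) rather than the Attributor's own: one map keyed by a pair replaces a map of maps, so each lookup touches a single container.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

using KindId = int;    // hypothetical: identifies the attribute kind
using Position = int;  // hypothetical: identifies the IR position

struct Attribute {
  int state;
};

// Flat map keyed by (kind, position) instead of
// std::map<Position, std::map<KindId, Attribute*>>.
class FlatAttributeMap {
 public:
  void insert(KindId kind, Position pos, Attribute* attr) {
    map_[std::make_pair(kind, pos)] = attr;
  }

  // Returns nullptr when no attribute is registered for the key pair.
  Attribute* lookup(KindId kind, Position pos) const {
    auto it = map_.find(std::make_pair(kind, pos));
    return it == map_.end() ? nullptr : it->second;
  }

  std::size_t size() const { return map_.size(); }

 private:
  std::map<std::pair<KindId, Position>, Attribute*> map_;
};
```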

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 512375 (362871/s)
temporary memory allocations: 98746 (69933/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.78MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 509833 (338534/s)
temporary memory allocations: 98902 (65671/s)
peak heap memory consumption: 18.71MB
peak RSS (including heaptrack overhead): 103.00MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -2542 (-27042/s)
temporary memory allocations: 156 (1659/s)
peak heap memory consumption: -3.83MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

4 years ago[Attributor] Remember only necessary dependences
Johannes Doerfert [Thu, 30 Apr 2020 17:12:22 +0000 (12:12 -0500)]
[Attributor] Remember only necessary dependences

Before we eagerly put dependences into the QueryMap as soon as we
encountered them (via `Attributor::getAAFor<>` or
`Attributor::recordDependence`). Now we will wait to see if the
dependence is useful, that is if the target is not already in a fixpoint
state at the end of the update. If so, there is no need to record the
dependence at all.

Due to the abstraction via `Attributor::updateAA` we will now also treat
the very first update (during attribute creation) as we do subsequent
updates.

Finally this resolves the problematic usage of QueriedNonFixAA.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 554675 (389245/s)
temporary memory allocations: 101574 (71280/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 512465 (345559/s)
temporary memory allocations: 98832 (66643/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.58MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -42210 (-727758/s)
temporary memory allocations: -2742 (-47275/s)
peak heap memory consumption: -5.92MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

4 years ago[Attributor] Initialize "value attributes" w/ must-be-executed-context info
Johannes Doerfert [Wed, 22 Apr 2020 22:49:58 +0000 (17:49 -0500)]
[Attributor] Initialize "value attributes" w/ must-be-executed-context info

Attributes that only depend on the value (=bit pattern) can be
initialized from uses in the must-be-executed-context (MBEC). We did use
`AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before
to do this for some positions of these attributes but not for all. This
was fairly complicated and also problematic as we did run it in every
`updateImpl` call even though we only use known information. The new
implementation removes `AAComposeTwoGenericDeduction`* and
`AAFromMustBeExecutedContext` in favor of a simple interface
`AddInformation::fromMBEContext(...)` which we call from the
`initialize` methods of the "value attribute" `Impl` classes, e.g.
`AANonNullImpl::initialize`.

There can be two types of test changes:
  1) Artifacts where we miss some information that was known before a
     global fixpoint was reached and therefore available in an update
     but not at the beginning.
  2) Deduction for values we did not derive via the MBEC before or which
     were not found as the `AAFromMustBeExecutedContext::updateImpl` was
     never invoked.

* An improved version of AAComposeTwoGenericDeduction can be found in
  D78718. Once we find a new use case that implementation will be able
  to handle "generic" AAs better.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 468428 (328952/s)
temporary memory allocations: 77480 (54410/s)
peak heap memory consumption: 32.71MB
peak RSS (including heaptrack overhead): 122.46MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 554720 (351310/s)
temporary memory allocations: 101650 (64376/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.75MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 86292 (556722/s)
temporary memory allocations: 24170 (155935/s)
peak heap memory consumption: -4.25MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78719

4 years ago[Attributor][NFC] Use reference instead of pointer
Johannes Doerfert [Mon, 4 May 2020 00:42:21 +0000 (19:42 -0500)]
[Attributor][NFC] Use reference instead of pointer

4 years ago[Attributor][NFC] Proactively ask for `nocapture` on call site arguments
Johannes Doerfert [Mon, 4 May 2020 00:08:30 +0000 (19:08 -0500)]
[Attributor][NFC] Proactively ask for `nocapture` on call site arguments

This minimizes test noise later on and is in line with other attributes
we derive proactively.

4 years ago[clangd] Fix yet-another gratuitous llvm::Error crash
Sam McCall [Sun, 3 May 2020 20:13:49 +0000 (22:13 +0200)]
[clangd] Fix yet-another gratuitous llvm::Error crash

4 years ago[OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices
Shilei Tian [Sun, 3 May 2020 19:58:46 +0000 (15:58 -0400)]
[OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices

Summary: DeviceRTLTy::getNumOfDevices is mistakenly declared with a bool return type, so omp_get_num_devices returns the wrong number of devices.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D79255
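
The effect of such a return-type mistake can be reproduced in isolation. The functions below are hypothetical and only mimic the pattern; they are not the libomptarget code.

```cpp
#include <cassert>

// Buggy pattern: the count is implicitly converted to bool on return, so any
// non-zero device count collapses to true.
bool numDevicesAsBool(int actualCount) { return actualCount; }

// Fixed pattern: the count is returned unchanged.
int numDevicesAsInt(int actualCount) { return actualCount; }
```

A caller that reads the bool result as a number sees 1 for any non-zero count.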

4 years ago[clangd] Reland LSP latency test
Kadir Cetinkaya [Sun, 3 May 2020 19:06:51 +0000 (21:06 +0200)]
[clangd] Reland LSP latency test

4 years ago[Attributor] Bitcast constant to the returned value type if it has different type
Sergey Dmitriev [Sun, 3 May 2020 18:28:53 +0000 (11:28 -0700)]
[Attributor] Bitcast constant to the returned value type if it has different type

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79277

4 years agoRevert "[InstSimplify] Remove known bits constant folding"
Nikita Popov [Sun, 3 May 2020 18:45:10 +0000 (20:45 +0200)]
Revert "[InstSimplify] Remove known bits constant folding"

This reverts commit 08556afc54e7ddfa7cc2fdd69c615ad417722517.

This breaks some AMDGPU tests.

4 years ago[InstSimplify] Remove known bits constant folding
Nikita Popov [Fri, 20 Mar 2020 10:57:20 +0000 (11:57 +0100)]
[InstSimplify] Remove known bits constant folding

If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
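
For readers unfamiliar with the technique being removed, here is a minimal 8-bit sketch of folding through known bits. `KnownBits8` and its helpers are hypothetical and far simpler than LLVM's KnownBits; they only show how an expression can become a constant once every bit is known.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit known-bits record: a bit set in `zero` is known to be 0,
// a bit set in `one` is known to be 1, and a bit set in neither is unknown.
struct KnownBits8 {
  std::uint8_t zero;
  std::uint8_t one;

  bool isConstant() const {
    return static_cast<std::uint8_t>(zero | one) == 0xFF;
  }
  std::uint8_t constant() const { return one; }
};

KnownBits8 unknownValue() { return {0, 0}; }

KnownBits8 fromConstant(std::uint8_t c) {
  return {static_cast<std::uint8_t>(~c), c};
}

// AND: a result bit is 0 if either input bit is 0, and 1 only if both are 1.
KnownBits8 knownAnd(KnownBits8 a, KnownBits8 b) {
  return {static_cast<std::uint8_t>(a.zero | b.zero),
          static_cast<std::uint8_t>(a.one & b.one)};
}

// OR: a result bit is 1 if either input bit is 1, and 0 only if both are 0.
KnownBits8 knownOr(KnownBits8 a, KnownBits8 b) {
  return {static_cast<std::uint8_t>(a.zero & b.zero),
          static_cast<std::uint8_t>(a.one | b.one)};
}
```

For an unknown x, `(x & 0x0F) & 0xF0` has every bit known to be zero and so folds to the constant 0, while `x & 0x0F` alone does not.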

4 years ago[libc++][test] Use a non-narrowing conversion in assign_pair.pass.cpp
Casey Carter [Sun, 3 May 2020 17:59:10 +0000 (10:59 -0700)]
[libc++][test] Use a non-narrowing conversion in assign_pair.pass.cpp

...to avoid warnings, e.g., from MSVC.

4 years ago[ICP] Handling must tail calls in indirect call promotion
Hongtao Yu [Fri, 1 May 2020 19:44:46 +0000 (12:44 -0700)]
[ICP] Handling must tail calls in indirect call promotion

Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization, which could result in IR like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2202:                                   ; preds = %605, %2201, %2199
      ret void, !dbg !229485

This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      ret void, !dbg !229485

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      ret void, !dbg !229485

Differential Revision: https://reviews.llvm.org/D79258

4 years ago[llvm][NFC] Inliner: factor cost and reporting out of inlining process
Mircea Trofin [Fri, 1 May 2020 20:27:43 +0000 (13:27 -0700)]
[llvm][NFC] Inliner: factor cost and reporting out of inlining process

Summary:
This factors cost and reporting out of the inlining workflow, thus
making it easier to reuse when driving inlining from the upcoming
InliningAdvisor.

Depends on: D79215

Reviewers: davidxl, echristo

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79275

4 years ago[VPlan] Remove unused & undefined print method (NFC).
Florian Hahn [Sun, 26 Apr 2020 16:52:49 +0000 (17:52 +0100)]
[VPlan] Remove unused & undefined print method (NFC).

4 years ago[Attributor][NFC] Encode IRPositions in the bits of a single pointer
Johannes Doerfert [Fri, 17 Apr 2020 23:03:04 +0000 (18:03 -0500)]
[Attributor][NFC] Encode IRPositions in the bits of a single pointer

This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.

No functional change is intended.
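
The general technique, packing a small kind tag into the unused low bits of an aligned pointer, can be sketched as below. This is not the actual IRPosition encoding: `Anchor`, `PositionKind`, and `TaggedPosition` are hypothetical, and the sketch assumes the anchor is at least 4-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical position kinds; only two tag bits are available here.
enum class PositionKind : std::uintptr_t {
  Value = 0,
  Returned = 1,
  CallSite = 2,
  Argument = 3
};

// Hypothetical anchor type; alignas(4) guarantees two free low bits.
struct alignas(4) Anchor {
  int payload;
};

// Stores the anchor pointer and the kind in one pointer-sized word, so no
// vtable pointer or separate integer field is needed.
class TaggedPosition {
 public:
  TaggedPosition(Anchor* anchor, PositionKind kind)
      : bits_(reinterpret_cast<std::uintptr_t>(anchor) |
              static_cast<std::uintptr_t>(kind)) {
    assert((reinterpret_cast<std::uintptr_t>(anchor) & kTagMask) == 0 &&
           "anchor must be 4-byte aligned");
  }

  Anchor* anchor() const {
    return reinterpret_cast<Anchor*>(bits_ & ~kTagMask);
  }
  PositionKind kind() const {
    return static_cast<PositionKind>(bits_ & kTagMask);
  }

 private:
  static constexpr std::uintptr_t kTagMask = 0x3;
  std::uintptr_t bits_;
};

static_assert(sizeof(TaggedPosition) == sizeof(void*),
              "the tagged position should occupy a single pointer");
```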

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722

4 years ago[Attributor][NFC] Let AbstractAttribute be an IRPosition
Johannes Doerfert [Thu, 23 Apr 2020 03:03:44 +0000 (22:03 -0500)]
[Attributor][NFC] Let AbstractAttribute be an IRPosition

Since every AbstractAttribute so far, and for the foreseeable future,
corresponds to a single IRPosition we can simplify the class structure.
We already did this for IRAttribute but there is no reason to stop
there.

4 years agoRevert "Optimize path::remove_dots"
Nico Weber [Sun, 3 May 2020 16:46:46 +0000 (12:46 -0400)]
Revert "Optimize path::remove_dots"

This reverts commit 53913a65b408ade2956061b4c0aaed6bba907403.
Breaks VFSFromYAMLTest.DirectoryIterationSameDirMultipleEntries
in SupportTests on non-Windows.

4 years ago[X86] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476)
Simon Pilgrim [Sun, 3 May 2020 16:38:57 +0000 (17:38 +0100)]
[X86] Add tests showing failure to fold mul(abs(x),abs(x)) -> mul(x,x) (PR39476)

4 years ago[llvm][NFC] Inliner.cpp shouldInline post-commit feedback
Mircea Trofin [Sun, 3 May 2020 16:28:28 +0000 (09:28 -0700)]
[llvm][NFC] Inliner.cpp shouldInline post-commit feedback

Discussion is in https://reviews.llvm.org/D79215

4 years ago[clangd] Change include to be relative to current directory
Kadir Cetinkaya [Sun, 3 May 2020 16:09:40 +0000 (18:09 +0200)]
[clangd] Change include to be relative to current directory

4 years agoOptimize path::remove_dots
Reid Kleckner [Sun, 3 May 2020 13:26:12 +0000 (06:26 -0700)]
Optimize path::remove_dots

LLD calls this on every source file string in every object file when
writing PDBs, so it is somewhat hot.

Avoid rewriting paths that do not contain path traversal components
(./..). Use find_first_not_of(separators) directly instead of using the
path iterators. The path component iterators appear to be slow, and
directly searching for slashes makes it easier to find double separators
that need to be canonicalized.

I discovered that the VFS relies on remove_dots to not canonicalize
early slashes (/foo or C:/foo) on Windows, so I had to leave that
behavior behind with unit tests for it. This is undesirable, but I claim
that my change is NFC.
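
The fast path described above can be sketched roughly as follows. This is not the code from the commit: the helper name `needsDotRemoval` is hypothetical, and the sketch assumes `/` is the only separator. It splits directly on separators and reports whether the path would need any rewriting at all.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not an LLVM API): returns true if `path` contains a
// "." or ".." component or a doubled separator, i.e. if it needs rewriting.
// Assumes '/' is the only separator.
bool needsDotRemoval(std::string_view path) {
  std::size_t pos = 0;
  if (!path.empty() && path.front() == '/')
    pos = 1; // a single leading separator marks an absolute path
  while (pos < path.size()) {
    std::size_t end = path.find('/', pos);
    if (end == std::string_view::npos)
      end = path.size();
    std::string_view component = path.substr(pos, end - pos);
    // An empty component means two separators in a row.
    if (component.empty() || component == "." || component == "..")
      return true;
    pos = end + 1;
  }
  return false;
}
```

A caller would return the input unchanged when this check is false and only fall back to the slower rewriting code otherwise.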

4 years ago[COFF] Partially inline Symbol::getName, NFC
Reid Kleckner [Sun, 3 May 2020 02:53:49 +0000 (19:53 -0700)]
[COFF] Partially inline Symbol::getName, NFC

4 years ago[InstCombine] use select-of-constants with set/clear bit mask patterns
Sanjay Patel [Sun, 3 May 2020 13:43:49 +0000 (09:43 -0400)]
[InstCombine] use select-of-constants with set/clear bit mask patterns

Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C)
Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0)

The select-of-constants form results in better codegen.
There's an existing test diff that shows a transform that
results in an extra IR instruction, but that's an existing
problem.

This is motivated by code seen in LLVM itself - see PR37581:
https://bugs.llvm.org/show_bug.cgi?id=37581

define i8 @src(i8 %x, i8 %C, i1 %b)  {
  %notC = xor i8 %C, -1
  %and = and i8 %x, %notC
  %or = or i8 %x, %C
  %cond = select i1 %b, i8 %or, i8 %and
  ret i8 %cond
}

define i8 @tgt(i8 %x, i8 %C, i1 %b)  {
  %notC = xor i8 %C, -1
  %and = and i8 %x, %notC
  %mul = select i1 %b, i8 %C, i8 0
  %or = or i8 %mul, %and
  ret i8 %or
}

http://volta.cs.utah.edu:8080/z/Vt2WVm

Differential Revision: https://reviews.llvm.org/D78880
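
As an informal cross-check of the second rewrite rule above, the two forms can be compared exhaustively over all 8-bit values in plain C++. This is only a sketch to illustrate the identity; the function names are made up for this example and it is not part of the patch or its tests.

```cpp
#include <cstdint>
#include <initializer_list>

// Source form: Cond ? (X | C) : (X & ~C)
std::uint8_t selectOfOps(std::uint8_t x, std::uint8_t c, bool cond) {
  return cond ? static_cast<std::uint8_t>(x | c)
              : static_cast<std::uint8_t>(x & ~c);
}

// Target form: (X & ~C) | (Cond ? C : 0)
std::uint8_t selectOfConstants(std::uint8_t x, std::uint8_t c, bool cond) {
  std::uint8_t mask = cond ? c : std::uint8_t{0};
  return static_cast<std::uint8_t>((x & ~c) | mask);
}

// Compares both forms for every 8-bit X and C and both values of Cond.
bool formsAgree() {
  for (int x = 0; x < 256; ++x)
    for (int c = 0; c < 256; ++c)
      for (bool cond : {false, true})
        if (selectOfOps(static_cast<std::uint8_t>(x),
                        static_cast<std::uint8_t>(c), cond) !=
            selectOfConstants(static_cast<std::uint8_t>(x),
                              static_cast<std::uint8_t>(c), cond))
          return false;
  return true;
}
```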

4 years ago[clangd] Drop duplicate header
Kadir Cetinkaya [Sun, 3 May 2020 13:20:20 +0000 (15:20 +0200)]
[clangd] Drop duplicate header

4 years ago[Support] Don't initialize buffer allocated by zlib::uncompress
Benjamin Kramer [Sun, 3 May 2020 12:52:19 +0000 (14:52 +0200)]
[Support] Don't initialize buffer allocated by zlib::uncompress

This is a somewhat annoying API, but not without precedent in this
low-level API.

4 years ago[X86] Use splitVector helper in truncateVectorWithPACK/splitVectorStore/combineHorizo...
Simon Pilgrim [Sun, 3 May 2020 12:23:46 +0000 (13:23 +0100)]
[X86] Use splitVector helper in truncateVectorWithPACK/splitVectorStore/combineHorizontalMinMaxResult/combineReductionToHorizontal. NFC.

All these locations were performing the same type splitting/extractSubVector calls as the splitVector helper.

4 years ago[gn build] Port e64f99c51a8
LLVM GN Syncbot [Sun, 3 May 2020 12:08:26 +0000 (12:08 +0000)]
[gn build] Port e64f99c51a8

4 years ago[gn build] (manually) port ad97ccf6b26a more, for include added in e64f99c51a8
Nico Weber [Sun, 3 May 2020 12:07:52 +0000 (08:07 -0400)]
[gn build] (manually) port ad97ccf6b26a more, for include added in e64f99c51a8

4 years ago[X86] Don't limit splitVector helper to simple types.
Simon Pilgrim [Sun, 3 May 2020 11:27:14 +0000 (12:27 +0100)]
[X86] Don't limit splitVector helper to simple types.

It can handle EVT just as well (and so can the extractSubVector calls).

4 years ago[Debuginfo][NFC] Avoid double calling of DWARFDie::find(DW_AT_name).
Alexey Lapshin [Thu, 30 Apr 2020 11:05:17 +0000 (14:05 +0300)]
[Debuginfo][NFC] Avoid double calling of DWARFDie::find(DW_AT_name).

Summary:
The current implementation of DWARFDie::getName(DINameKind Kind) could
lead to a double call to DWARFDie::find(DW_AT_name) in the following
scenario:

getName(LinkageName);
getName(ShortName);

getName(LinkageName) calls find(DW_AT_name) if the linkage name is not
found. Then, it is called again in getName(ShortName). This patch
allows requesting LinkageName and ShortName separately
to avoid the extra call to find(DW_AT_name).

It helps D74169 to parse clang debuginfo faster (~1%).

Reviewers: clayborg, dblaikie

Differential Revision: https://reviews.llvm.org/D79173

4 years ago[InstCombine] Duplicate some InstSimplify tests (NFC)
Nikita Popov [Sun, 3 May 2020 10:42:00 +0000 (12:42 +0200)]
[InstCombine] Duplicate some InstSimplify tests (NFC)

Duplicate some tests in preparation for D79294.

4 years ago[X86][SSE] splitAndLowerShuffle - use splitVector helper. NFC.
Simon Pilgrim [Sun, 3 May 2020 09:48:28 +0000 (10:48 +0100)]
[X86][SSE] splitAndLowerShuffle - use splitVector helper. NFC.

The splitVector helper uses extractSubVector which splits build vectors like we do here, so avoid reimplementing it.

splitVector could easily be extended to peek through bitcasts as well but I'd prefer to keep this commit NFC.

4 years ago[X86] detectAVGPattern - use matchUnaryPredicate helper. NFC.
Simon Pilgrim [Sun, 3 May 2020 08:54:54 +0000 (09:54 +0100)]
[X86] detectAVGPattern - use matchUnaryPredicate helper. NFC.

Use the ISD::matchUnaryPredicate helper to check for in-range constants.

4 years ago[ValueTracking] Convert test to unit test (NFC)
Nikita Popov [Sun, 3 May 2020 10:21:50 +0000 (12:21 +0200)]
[ValueTracking] Convert test to unit test (NFC)

Test this directly, rather than going through InstSimplify.

4 years ago[clangd] Fix name hiding in TestTracer and disable racy test for now
Kadir Cetinkaya [Sun, 3 May 2020 09:44:52 +0000 (11:44 +0200)]
[clangd] Fix name hiding in TestTracer and disable racy test for now

4 years ago[clangd] Metric tracking through Tracer
Kadir Cetinkaya [Thu, 16 Apr 2020 21:12:09 +0000 (23:12 +0200)]
[clangd] Metric tracking through Tracer

Summary: Introduces an endpoint to Tracer for tracking metrics on
internal events.

Reviewers: sammccall

Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78429

4 years agoTest Commit: add two head comments in WinEHPrepare.cpp
Ten Tzen [Sun, 3 May 2020 08:15:59 +0000 (01:15 -0700)]
Test Commit: add two head comments in WinEHPrepare.cpp

This is a Test commit.

4 years agoRe-land "[PDB] Avoid calling discoverTypeIndices for a known record kind"
Reid Kleckner [Sun, 3 May 2020 01:33:27 +0000 (18:33 -0700)]
Re-land "[PDB] Avoid calling discoverTypeIndices for a known record kind"

Fixed bad usage of slice API causing assertion failures.

Reverts 810c8e9b495c191f49b162cee3fb8829185a2691
Reinstates bd7ea8641e7667b109534ae06b33be7bc9b59821

4 years ago[PDB] Bypass generic deserialization code for publics sorting
Reid Kleckner [Sun, 3 May 2020 01:06:41 +0000 (18:06 -0700)]
[PDB] Bypass generic deserialization code for publics sorting

The number of public symbols is very large, and each deserialization
does a few heap allocations. The public symbols are serialized by the
linker, so we can assume they have the expected layout and use it
directly.

Saves O(#publics) temporary heap allocations and shrinks some data
structures.

4 years agoRevert "[PDB] Avoid calling discoverTypeIndices for a known record kind"
Nico Weber [Sun, 3 May 2020 01:06:06 +0000 (21:06 -0400)]
Revert "[PDB] Avoid calling discoverTypeIndices for a known record kind"

This reverts commit bd7ea8641e7667b109534ae06b33be7bc9b59821.
Breaks check-lld everywhere.

4 years ago[X86] Fix a few issues in the evex-to-vex-compress.mir test.
Craig Topper [Sat, 2 May 2020 23:33:48 +0000 (16:33 -0700)]
[X86] Fix a few issues in the evex-to-vex-compress.mir test.

Don't use $noreg for instructions that take register inputs.
Only allow $noreg for parts of memory operands.

Don't use index register with $rip base.

Use RETQ instead of the RET pseudo. This pass is after the
ExpandPseudo pass that converts RET to RETQ.

4 years ago[PDB] Remove a couple asserts that are no longer valid now that C13Builders does...
Craig Topper [Sun, 3 May 2020 00:28:58 +0000 (17:28 -0700)]
[PDB] Remove a couple asserts that are no longer valid now that C13Builders does not use unique_ptr.

These asserts used to check that unique_ptr was not null.

This fixes failures from 7af4bb16417deeb1d01e7dbbbb2272f1f46753c6

4 years ago[PDB] Remove unique_ptr wrapper around C13 line table subsections
Reid Kleckner [Sat, 2 May 2020 23:31:43 +0000 (16:31 -0700)]
[PDB] Remove unique_ptr wrapper around C13 line table subsections

This accounts for a large portion of the memory allocations in LLD.
This DebugSubsectionRecordBuilder object can be stored directly in
C13Builders, it mostly wraps other subsections.

Remove the container kind field from the object. It is always the same
for all elements in the vector, and we can pass it in during writing.

4 years ago[PDB] Avoid calling discoverTypeIndices for a known record kind
Reid Kleckner [Sat, 2 May 2020 22:48:31 +0000 (15:48 -0700)]
[PDB] Avoid calling discoverTypeIndices for a known record kind

This particular overload allocates memory, and we do this for every
S_[GL]PROC32_ID record. Instead, hardcode the offset of the typeindex
that we are looking for in the LF_[MEM]FUNC_ID record. We already
assumed that looking up the item index already found a record of this
kind.

4 years ago[docs][FileCheck] Fix invalid example
Thomas Preud'homme [Fri, 1 May 2020 18:25:08 +0000 (19:25 +0100)]
[docs][FileCheck] Fix invalid example

Summary:
FileCheck documentation contains an example of a numeric variable
defined and used on the same line. This is not currently supported by
FileCheck so this commit fixes the example to use CHECK-SAME for the
variable use.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D79253

4 years ago[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad
LemonBoy [Sat, 2 May 2020 21:09:18 +0000 (14:09 -0700)]
[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad

The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces.

The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines.

Differential Revision: https://reviews.llvm.org/D79096

4 years ago[COFF] Use a global option table to avoid reconstructing it
Reid Kleckner [Sat, 2 May 2020 21:53:59 +0000 (14:53 -0700)]
[COFF] Use a global option table to avoid reconstructing it

Otherwise an ArgumentParser is constructed for every directive section,
and that involves copying the entire table of options into a vector.
There is no need for this, just have one option table.

4 years ago[test] Fix lld's ELF/linkerscript/thunk-gen-mips.s
Thomas Preud'homme [Fri, 1 May 2020 22:22:07 +0000 (23:22 +0100)]
[test] Fix lld's ELF/linkerscript/thunk-gen-mips.s

Summary:
Lld test ELF/linkerscript/thunk-gen-mips.s was accidentally disabled due
to the use of wrong FileCheck directives. As a result the test seems to
have bitrotted, as it fails once the directives are fixed. To ease
updates to the test if the __start address changes, the checks
have been changed to use numeric variables to express all the addresses
based on the __start address.

Reviewed By: atanasyan

Differential Revision: https://reviews.llvm.org/D79270

4 years ago[libclang]: visit C++17 if init statements
Milian Wolff [Sat, 2 May 2020 20:18:09 +0000 (22:18 +0200)]
[libclang]: visit C++17 if init statements

This makes the previously inaccessible AST nodes for C++17 "if with
init statements" accessible to consumers of libclang.

Differential Revision: https://reviews.llvm.org/D78214

4 years ago[libclang]: visit BindingDecl in DecompositionDecl
Milian Wolff [Sat, 2 May 2020 20:17:59 +0000 (22:17 +0200)]
[libclang]: visit BindingDecl in DecompositionDecl

This makes the BindingDecl accessible to consumers of libclang
as CXCursor_UnexposedDecl where previously these AST nodes were
not visited at all from the libclang API.

Differential Revision: https://reviews.llvm.org/D78213

4 years ago[mlir] Add a new context flag for disabling/enabling multi-threading
River Riddle [Sat, 2 May 2020 19:28:57 +0000 (12:28 -0700)]
[mlir] Add a new context flag for disabling/enabling multi-threading

This is useful for several reasons:
* In some situations the user can guarantee that thread-safety isn't necessary and doesn't want to pay the cost of synchronization, e.g., when parsing a very large module.

* For things like logging, threading is not desirable, as the output is not guaranteed to be in a stable order.

This flag also subsumes the pass manager flag for multi-threading.

Differential Revision: https://reviews.llvm.org/D79266

4 years agoRevert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call"
Simon Pilgrim [Sat, 2 May 2020 19:08:33 +0000 (20:08 +0100)]
Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call"

Causing buildbot failures

4 years ago[DAGCombine] visitTRUNCATE - remove GetDemandedBits call
Simon Pilgrim [Sat, 2 May 2020 18:51:58 +0000 (19:51 +0100)]
[DAGCombine] visitTRUNCATE - remove GetDemandedBits call

rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).

4 years ago[sema] NFC Unable to build Sema library with MSVC Debug target due to missing /bigobj
mydeveloperday [Sat, 2 May 2020 18:33:18 +0000 (19:33 +0100)]
[sema] NFC Unable to build Sema library with MSVC Debug target due to missing /bigobj

Summary:
Unable to build sema library on MSVC with Debug target

```
C:\clang\llvm-project\clang\lib\Sema\SemaOpenMP.cpp : fatal error C1128: number of sections exceeded object file format limit: compile with /bigobj
```

Reviewed By: aaron.ballman

Subscribers: mgorny, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79292

4 years ago[MBP] tuple->pair. NFC.
Benjamin Kramer [Sat, 2 May 2020 18:18:09 +0000 (20:18 +0200)]
[MBP] tuple->pair. NFC.

std::pair has a trivial copy ctor, std::tuple doesn't.

4 years ago[gn build] Port 8f766e382b77e more and fix 2 llvm-config test failures.
Nico Weber [Sat, 2 May 2020 18:10:18 +0000 (14:10 -0400)]
[gn build] Port 8f766e382b77e more and fix 2 llvm-config test failures.

The failures only happened in fully clean builds.

Also put all current dependencies of LibraryDependencies.inc in the
build graph, so that this type of thing will cause a failure in
incremental builds next time as well.

4 years ago[COFF] Add and use a zero-copy tokenizer for .drectve
Reid Kleckner [Fri, 1 May 2020 14:34:12 +0000 (07:34 -0700)]
[COFF] Add and use a zero-copy tokenizer for .drectve

This generalizes the main Windows command line tokenizer to be able to
produce StringRef substrings as well as freshly copied C strings. The
implementation is still shared with the normal tokenizer, which is
important, because we have unit tests for that.

.drectve sections can be very long. They can potentially list up to
every symbol in the object file by name. It is worth avoiding these
string copies.

This saves a lot of memory when linking chrome.dll with PGO
instrumentation:

             BEFORE      AFTER      % IMP
peak memory: 6657.76MB   4983.54MB  -25%
real:        4m30.875s   2m26.250s  -46%

The time improvement may not be real, as my machine was noisy while running
this, but the peak memory usage improvement should be real.

This change may also help apps that heavily use dllexport annotations,
because those also use linker directives in object files. Apps that do
not use many directives are unlikely to be affected.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D79262
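
To illustrate the zero-copy idea, here is a deliberately simplified sketch. It is not the tokenizer from this commit: the function name `tokenizeNoCopy` is hypothetical, it uses `std::string_view` in place of StringRef, and it splits on spaces only, ignoring the quoting and escaping rules that the real Windows command line tokenizer implements.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical simplified tokenizer: every token is a view into `text`, so
// no characters are copied. `text` must outlive the returned views.
std::vector<std::string_view> tokenizeNoCopy(std::string_view text) {
  std::vector<std::string_view> tokens;
  std::size_t pos = 0;
  while (pos < text.size()) {
    // Skip the run of spaces before the next token.
    pos = text.find_first_not_of(' ', pos);
    if (pos == std::string_view::npos)
      break;
    std::size_t end = text.find(' ', pos);
    if (end == std::string_view::npos)
      end = text.size();
    tokens.push_back(text.substr(pos, end - pos));
    pos = end;
  }
  return tokens;
}
```

The only allocation left is the vector of views itself; the token text stays in the section data it came from.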

4 years ago[SmallVector] Weaken the predicate for the memcpy optimization
Benjamin Kramer [Sat, 2 May 2020 17:22:15 +0000 (19:22 +0200)]
[SmallVector] Weaken the predicate for the memcpy optimization

We don't require the type to be trivially assignable. While the standard
says that only is_trivially_copyable types may be memcpy'd, this seems
overly strict. We never assign the type, so there's no way for the type
to observe that the copy/move construction got elided. This is important
for std::pair<POD, POD>, which is not trivially assignable and probably
never will be because changing that would break ABI.

As a side-effect this no longer allows types with deleted copy/move
constructors in SmallVector. That's an unintended side-effect of
is_trivially_copyable anyways.

Shrinks Release+Asserts clang by 20k.
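
A weakened predicate of this kind could look roughly like the sketch below. It assumes the check is built from trivial copy construction, trivial move construction and trivial destruction; the variable template name `canMemcpyElements` and the `DeletedCopy` type are made up for illustration.

```cpp
#include <type_traits>
#include <utility>

// Sketch of a weakened predicate: construction and destruction must be
// trivial, while assignment is deliberately not inspected.
template <typename T>
constexpr bool canMemcpyElements =
    std::is_trivially_copy_constructible<T>::value &&
    std::is_trivially_move_constructible<T>::value &&
    std::is_trivially_destructible<T>::value;

// Example type with a deleted copy constructor.
struct DeletedCopy {
  DeletedCopy() = default;
  DeletedCopy(const DeletedCopy &) = delete;
};

static_assert(canMemcpyElements<std::pair<int, int>>,
              "a pair of PODs qualifies without being trivially assignable");
static_assert(!canMemcpyElements<DeletedCopy>,
              "a deleted copy constructor is rejected");
```

The second assertion mirrors the side effect mentioned above: a type with a deleted copy constructor no longer passes the check.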

4 years agoDon't stash types that aren't copyable or moveable into a SmallVector
Benjamin Kramer [Sun, 26 Apr 2020 20:44:31 +0000 (22:44 +0200)]
Don't stash types that aren't copyable or moveable into a SmallVector

This seems to be working by accident.

4 years agoUse realloc for NestedNameSpecifierLocBuilder
Benjamin Kramer [Sat, 2 May 2020 15:04:52 +0000 (17:04 +0200)]
Use realloc for NestedNameSpecifierLocBuilder

These allocations are so tiny that the buffer can be grown in-place most
of the time.

4 years ago[clang-format] NFC - clang-format the FormatTests
mydeveloperday [Sat, 2 May 2020 14:42:20 +0000 (15:42 +0100)]
[clang-format] NFC - clang-format the FormatTests

Summary:
Ensure the clang-format unit tests are themselves clang-formatted

Having areas of the llvm code which are clang-format clean gives us more areas to run new clang-format binaries on, ensuring we haven't broken anything.

It seems to me we SHOULD have this clang-formatted at a minimum; otherwise how can we expect others to use clang-format if we "don't eat our own dogfood"? Also, if the tests are dependent on the formatting of the code, then that would also be bad!

Reviewed By: sammccall

Subscribers: cfe-commits

Tags: #clang, #clang-format

Differential Revision: https://reviews.llvm.org/D79204

4 years ago[Allocator] Make Deallocate() pass alignment and make it use (de)allocate_buffer
Benjamin Kramer [Sat, 2 May 2020 13:59:10 +0000 (15:59 +0200)]
[Allocator] Make Deallocate() pass alignment and make it use (de)allocate_buffer

This lets it use sized deallocation and make more efficient alignment
decisions. Also adjust BumpPtrAllocator to always allocate at
alignof(std::max_align_t).
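
As a rough illustration of passing size and alignment through to deallocation, the sketch below uses the C++17 aligned `operator new` and `operator delete` overloads. The helper names are hypothetical and are not the LLVM functions mentioned in the title; the sized overload is only used when the compiler advertises `__cpp_sized_deallocation`.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: allocates `size` bytes aligned to `alignment`
// (which must be a power of two). Requires C++17.
void *allocateBuffer(std::size_t size, std::size_t alignment) {
  return ::operator new(size, std::align_val_t(alignment));
}

// Hypothetical helper: hands the size and alignment back to the allocator.
void deallocateBuffer(void *ptr, std::size_t size, std::size_t alignment) {
#if defined(__cpp_sized_deallocation)
  // Sized, aligned delete: the allocator is told the block size directly.
  ::operator delete(ptr, size, std::align_val_t(alignment));
#else
  (void)size; // sized deallocation unavailable; fall back to aligned delete
  ::operator delete(ptr, std::align_val_t(alignment));
#endif
}
```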

4 years ago[RISCV] Implement convertSelectOfConstantsToMath
Sam Elliott [Sat, 2 May 2020 14:05:12 +0000 (15:05 +0100)]
[RISCV] Implement convertSelectOfConstantsToMath

Summary:
The current lowering of `select` on RISC-V uses a branch instruction to load a
register with one or other value. This is inefficient, especially in the case of
small constants that can be computed easily.

By implementing the TargetLowering::convertSelectOfConstantsToMath hook, some of
the simpler cases are covered that let us avoid introducing a branch in these
cases.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D79260
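
For readers unfamiliar with the idea, one common way to turn a select of two constants into branch-free arithmetic is shown below. This is a generic sketch with a made-up function name; it does not claim to be the sequence the hook causes the RISC-V backend to emit.

```cpp
#include <cstdint>

// Branch-free equivalent of `cond ? a : b`.
// The mask is all ones when cond is true and all zeros otherwise.
std::uint32_t selectByMask(bool cond, std::uint32_t a, std::uint32_t b) {
  std::uint32_t mask = 0u - static_cast<std::uint32_t>(cond);
  return b ^ ((a ^ b) & mask);
}
```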

4 years ago[RISCV][NFC] Tests for (select (const), (const))
Sam Elliott [Sat, 2 May 2020 14:05:02 +0000 (15:05 +0100)]
[RISCV][NFC] Tests for (select (const), (const))

Summary:
This just adds some simple cases for testing select of constants. There will be
a follow-up patch that improves code generation in some of these cases.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D79259

4 years ago[RISCV] Add patterns for checking isnan
Sam Elliott [Sat, 2 May 2020 14:00:38 +0000 (15:00 +0100)]
[RISCV] Add patterns for checking isnan

Summary:
This patch addresses some weird assembly sequences we were seeing when
comparing floats. In particular, comparing a float to itself tells you whether
it is NaN or not, which we were doing correctly, but with an extra unneeded
`and` instruction.

This patch specialises the existing patterns to remove the `and` instructions
when both their operands are the same.

Reviewed By: luismarques, asb

Differential Revision: https://reviews.llvm.org/D78908
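
The source-level idiom behind this is the self-comparison NaN check, sketched here in C++. The function names are made up for illustration, and the sketch assumes IEEE 754 semantics, i.e. no fast-math style compiler options.

```cpp
#include <cmath>

// A value compares unequal to itself only when it is NaN.
bool isNaNBySelfCompare(float x) { return x != x; }

// Cross-checks the idiom against the standard library for one value.
bool agreesWithStdIsNaN(float x) {
  return isNaNBySelfCompare(x) == std::isnan(x);
}
```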

4 years ago[RISCV][NFC] Add tests for checking isnan patterns
Sam Elliott [Sat, 2 May 2020 13:56:35 +0000 (14:56 +0100)]
[RISCV][NFC] Add tests for checking isnan patterns

Summary:
I worked on adding some SelectionDag patterns to address code generated by these
examples, which came out of some differential testing against GCC. The pattern
additions will be in a follow-up patch.

Reviewers: luismarques, asb

Reviewed By: luismarques, asb

Differential Revision: https://reviews.llvm.org/D78907