platform/upstream/llvm.git
20 months ago[IPO] Remove some legacy passes
Arthur Eubanks [Mon, 6 Feb 2023 21:25:59 +0000 (13:25 -0800)]
[IPO] Remove some legacy passes

These are part of the optimization pipeline, of which the legacy pass manager version is deprecated.

20 months ago[HIP] Update test hip-header.hip
Yaxun (Sam) Liu [Mon, 6 Feb 2023 16:37:00 +0000 (11:37 -0500)]
[HIP] Update test hip-header.hip

remove -no-opaque-pointers

Reviewed by: Matt Arsenault

Differential Revision: https://reviews.llvm.org/D143412

20 months ago[MergeFunctions] Remove legacy pass
Arthur Eubanks [Mon, 6 Feb 2023 21:07:07 +0000 (13:07 -0800)]
[MergeFunctions] Remove legacy pass

It's part of the optimization pipeline, of which the legacy pass manager version is deprecated.

20 months ago[libc][Obvious] Add __FMA__ flag detection to cpu_features.h
Tue Ly [Mon, 6 Feb 2023 21:05:23 +0000 (16:05 -0500)]
[libc][Obvious] Add __FMA__ flag detection to cpu_features.h

20 months ago[Support] Move ItaniumManglingCanonicalizer and SymbolRemappingReader from Support...
Simon Pilgrim [Mon, 6 Feb 2023 20:55:24 +0000 (20:55 +0000)]
[Support] Move ItaniumManglingCanonicalizer and SymbolRemappingReader from Support to ProfileData

As mentioned on https://discourse.llvm.org/t/issues-in-llvm-tblgen-high-parallelized-build/68037, ItaniumManglingCanonicalizer is often slow to build, resulting in a bottleneck for distributed builds while waiting for LLVMSupport to complete.

SymbolRemappingReader is the only current user of ItaniumManglingCanonicalizer, and this is only used by ProfileData and llvm-cxxmap - so I propose we move both files into the ProfileData library.

Differential Revision: https://reviews.llvm.org/D143318

20 months ago[Driver] Fix -fsanitize-address-stack-use-after-scope after D142606
Fangrui Song [Mon, 6 Feb 2023 20:54:34 +0000 (12:54 -0800)]
[Driver] Fix -fsanitize-address-stack-use-after-scope after D142606

Driver::getToolChain called by Driver::BuildCompilation gets the
`Triple` argument from a temporary. With delayed detection due to
LazyDetector, we would reference a dangling `Triple`.

20 months agoImprove transforms for (icmp uPred X * Z, Y * Z) -> (icmp uPred X, Y)
Noah Goldstein [Mon, 6 Feb 2023 18:06:22 +0000 (12:06 -0600)]
Improve transforms for (icmp uPred X * Z, Y * Z) -> (icmp uPred X, Y)

Several cases were missing.

1. `(icmp eq/ne X*Z, Y*Z) [if Z % 2 != 0] -> (icmp eq/ne X, Y)`
    EQ: https://alive2.llvm.org/ce/z/6_HPZ5
    NE: https://alive2.llvm.org/ce/z/c34qSU

    There was previously an implementation of this that worked if `Y`
    was non-constant, but it was missing if `Y*Z` evaluated to a
    constant and/or `nsw`/`nuw` were both false. It also only
    worked if `Z` was a constant, but we can check the 1s bit of
    `KnownBits` to cover more cases.

2. `(icmp eq/ne X*Z, Y*Z) [if Z != 0 and nsw(X*Z) and nsw(Y*Z)] -> (icmp eq/ne X, Y)`
    EQ: https://alive2.llvm.org/ce/z/6SdAG6
    NE: https://alive2.llvm.org/ce/z/fjsq_b

    This was previously implemented only to work if `Z` was constant,
    but we can use `isKnownNonZero` to cover more cases.

3. `(icmp uPred X*Z, Y*Z) [if Z != 0 and nuw(X*Z) and nuw(Y*Z)] -> (icmp uPred X, Y)`
    EQ:  https://alive2.llvm.org/ce/z/FqWQLX
    NE:  https://alive2.llvm.org/ce/z/2gHrd2
    ULT: https://alive2.llvm.org/ce/z/MUAWgZ
    ULE: https://alive2.llvm.org/ce/z/szQQ2L
    UGT: https://alive2.llvm.org/ce/z/McVUdu
    UGE: https://alive2.llvm.org/ce/z/95uyC8

    This was previously implemented only for the `eq/ne` cases, and
    only if `Z` was constant, but again we can use `isKnownNonZero` to
    cover more cases.
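
The three folds above can be sanity-checked by brute force on a narrow integer type. The following is a minimal sketch, not the InstCombine implementation: the 4-bit width and the helper names (`mul_nsw`, `mul_nuw`, `check_folds`) are assumptions made for illustration, and the Alive2 links remain the authoritative proofs.

```python
from itertools import product

BITS = 4                      # hypothetical narrow width so the search is exhaustive
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret a BITS-wide bit pattern as a two's-complement value."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def mul_nuw(a, b):
    """True if the unsigned product does not wrap at BITS bits."""
    return a * b <= MASK

def mul_nsw(a, b):
    """True if the signed product does not wrap at BITS bits."""
    p = to_signed(a) * to_signed(b)
    return -(1 << (BITS - 1)) <= p < (1 << (BITS - 1))

def check_folds():
    """Check all (x, y, z) triples; return how many hit the odd-Z case."""
    odd_cases = 0
    for x, y, z in product(range(1 << BITS), repeat=3):
        xz, yz = (x * z) & MASK, (y * z) & MASK
        if z & 1:                                        # case 1: Z is odd
            assert (xz == yz) == (x == y)
            odd_cases += 1
        if z != 0 and mul_nsw(x, z) and mul_nsw(y, z):   # case 2: nsw
            assert (xz == yz) == (x == y)
        if z != 0 and mul_nuw(x, z) and mul_nuw(y, z):   # case 3: nuw
            assert (xz == yz) == (x == y)
            assert (xz < yz) == (x < y)
            assert (xz <= yz) == (x <= y)
    return odd_cases
```

Calling `check_folds()` raises an `AssertionError` if any triple violates a fold, so a normal return means all three cases hold at this width.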

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D142786

20 months agoAdd transform for `(mul X, OddC) eq/ne N * C` --> `X eq/ne N`
Noah Goldstein [Mon, 6 Feb 2023 18:06:11 +0000 (12:06 -0600)]
Add transform for `(mul X, OddC) eq/ne N * C` --> `X eq/ne N`

We previously only did this if the `mul` was `nuw`, but it works for
any odd value.
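
The reason no `nuw` flag is needed is that multiplication by an odd constant is invertible modulo a power of two. A small sketch under assumed names (`undo_odd_mul`, `fold_holds`) and an assumed 8-bit width:

```python
BITS = 8                 # illustrative width, not taken from the patch
MOD = 1 << BITS

def undo_odd_mul(product, odd_c):
    """Recover X from (X * odd_c) mod 2**BITS using the modular inverse."""
    inv = pow(odd_c, -1, MOD)   # requires Python 3.8+; fails for even constants
    return (product * inv) % MOD

def fold_holds(c):
    """True if (X*c == N*c) is equivalent to (X == N) for every X, N."""
    return all(
        ((x * c) % MOD == (n * c) % MOD) == (x == n)
        for x in range(MOD)
        for n in range(MOD)
    )
```

With an even constant the equivalence breaks (for example `0 * 2` and `128 * 2` are both 0 modulo 256), which is why the fold is restricted to odd multipliers unless a no-wrap flag is present.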

Alive2 Links:
EQ: https://alive2.llvm.org/ce/z/6_HPZ5
NE: https://alive2.llvm.org/ce/z/c34qSU

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D143026

20 months agoAdd tests for folding (icmp UnsignedPred X * Z, Y * Z) -> (icmp UnsignedPred X, Y...
Noah Goldstein [Mon, 6 Feb 2023 18:05:58 +0000 (12:05 -0600)]
Add tests for folding (icmp UnsignedPred X * Z, Y * Z) -> (icmp UnsignedPred X, Y); NFC

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D142785

20 months agoRecommit "Improve and enable folding of conditional branches with tail calls." (2nd...
Noah Goldstein [Mon, 6 Feb 2023 18:05:44 +0000 (12:05 -0600)]
Recommit "Improve and enable folding of conditional branches with tail calls." (2nd Try)

Improve and enable folding of conditional branches with tail calls.

1. Make it so that conditional tail calls can be emitted even when
   there are multiple predecessors.

2. Don't guard the transformation behind -Os. The rationale for
   guarding it was that static prediction can be affected by whether the
   branch is forward or backward. This is no longer true for almost any
   X86 CPU (anything newer than `SnB`), so it is no longer a meaningful
   concern.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D140931

20 months agoOnly match BMI (BLSR, BLSI, BLSMSK) if the add/sub op is single use
Noah Goldstein [Mon, 6 Feb 2023 18:05:10 +0000 (12:05 -0600)]
Only match BMI (BLSR, BLSI, BLSMSK) if the add/sub op is single use

If the add/sub is not single use, it will need to be materialized
later, in which case using the BMI instruction is a de-optimization in
terms of code-size and throughput.

i.e:
```
// Good
leal -1(%rdi), %eax
andl %eax, %eax
xorl %eax, %esi
...
```
```
// Unnecessary BMI (lower throughput, larger code size)
leal -1(%rdi), %eax
blsr %edi, %eax
xorl %eax, %esi
...
```

Note, this may cause more `mov` instructions to be emitted sometimes
because BMI instructions only have 1 src and write-only to dst.  A
better approach may be to only avoid BMI for (and/xor X, (add/sub
0/-1, X)) if this is the last use of X but NOT the last use of
(add/sub 0/-1, X).
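
For reference, the `(and/xor X, (add/sub 0/-1, X))` shapes mentioned above correspond to the three BMI operations. The following sketch models them on 32-bit values; the function names mirror the instruction mnemonics, and the Python modelling itself is only an illustration.

```python
MASK32 = 0xFFFFFFFF

def blsr(x):
    """Reset the lowest set bit: and X, (add X, -1)."""
    return x & (x - 1) & MASK32

def blsi(x):
    """Isolate the lowest set bit: and X, (sub 0, X)."""
    return x & -x & MASK32

def blsmsk(x):
    """Mask up to and including the lowest set bit: xor X, (add X, -1)."""
    return (x ^ (x - 1)) & MASK32
```

In each case the `add`/`sub` result is an intermediate value; the patch only forms the BMI instruction when that intermediate has no other user.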

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D141180

20 months agoSearch through associative operators for BMI patterns (BLSI, BLSR, BLSMSK)
Noah Goldstein [Mon, 6 Feb 2023 18:04:34 +0000 (12:04 -0600)]
Search through associative operators for BMI patterns (BLSI, BLSR, BLSMSK)

(a & (-b)) & b is often lowered as:
    %sub  = sub i32     0, %b
    %and0 = and i32  %sub, %a
    %and1 = and i32 %and0, %b

Which won't get detected by the BLSI pattern as b & -b are never in
the same SDNode.

This patch will do a small search through associative operators and try
and place BMI patterns in the same node so they will hit the pattern.
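
The reassociation is legal because `and` is associative and commutative. A minimal sketch comparing the lowered form with the regrouped one (the function names are hypothetical, and 32-bit wrapping is modelled with a mask):

```python
MASK32 = 0xFFFFFFFF

def lowered(a, b):
    """Mirror the IR above: sub, then two chained ands."""
    sub = (0 - b) & MASK32
    and0 = sub & a
    return and0 & b

def reassociated(a, b):
    """Group b & -b first so it matches the BLSI pattern, then and with a."""
    blsi_b = b & ((0 - b) & MASK32)
    return a & blsi_b
```

Both functions compute the same value for any 32-bit inputs; only the grouping of the operands differs.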

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D141179

20 months agoMatch (xor TSize - 1, ctlz) to `bsr` instead of `lzcnt` + `xor`
Noah Goldstein [Mon, 6 Feb 2023 18:04:09 +0000 (12:04 -0600)]
Match (xor TSize - 1, ctlz) to `bsr` instead of `lzcnt` + `xor`

This was previously de-optimized if -march supported lzcnt, as there is
no reason to add the extra instruction.
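
The identity behind this match: for a nonzero value, XOR-ing the leading-zero count with `TSize - 1` yields the index of the highest set bit, which is what `bsr` computes directly. A small sketch, assuming a 32-bit type and using hypothetical helper names:

```python
TSIZE = 32   # assumed type size in bits

def ctlz(x):
    """Count leading zeros of a nonzero TSIZE-bit value."""
    if not 0 < x < (1 << TSIZE):
        raise ValueError("x must be a nonzero TSIZE-bit value")
    return TSIZE - x.bit_length()

def bsr(x):
    """Index of the highest set bit, as the bsr instruction reports it."""
    return x.bit_length() - 1

def xor_of_ctlz(x):
    """The matched pattern: xor (TSize - 1), ctlz(x)."""
    return (TSIZE - 1) ^ ctlz(x)
```

Because `ctlz(x)` lies in `[0, TSIZE - 1]` and `TSIZE - 1` is all ones in those bit positions, the XOR equals `TSIZE - 1 - ctlz(x)`.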

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D141464

20 months ago[flang] Fix creation of the bound array for pointer remapping
Valentin Clement [Mon, 6 Feb 2023 20:06:44 +0000 (21:06 +0100)]
[flang] Fix creation of the bound array for pointer remapping

The runtime function expects a 2 x newRank array and the code
was passing a newRank x 2 array. This patch updates the
creation of the array to fit the runtime expectation.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D143405

20 months ago[Clang] Fix __ptr32 arguments passed to builtins
Ariel Burton [Mon, 6 Feb 2023 19:51:41 +0000 (19:51 +0000)]
[Clang] Fix __ptr32 arguments passed to builtins

Currently when clang deals with a call to a builtin function that
is supplied with an argument that has an explicit address space
it rewrites the signature of the callee to make the types of
the formal parameters match those of the actual arguments.
This functionality was added to support OpenCL, and was
introduced with commit b919c7d.

However, this does not work properly for "size" related address
spaces such as those used for __ptr32. This affects platforms
like Microsoft and z/OS.

This change preserves the OpenCL functionality, but will use
the formal parameter types when an address space is size-related.

Reviewed By: akhuang

Differential Revision: https://reviews.llvm.org/D142048

20 months ago[Clang] Add llvm-mt and llvm-rc to Clang bootstrap dependency
Haowei Wu [Mon, 30 Jan 2023 23:43:09 +0000 (15:43 -0800)]
[Clang] Add llvm-mt and llvm-rc to Clang bootstrap dependency

This patch adds llvm-mt and llvm-rc to the Clang bootstrap
dependency when building the Clang under Windows.

Differential Revision: https://reviews.llvm.org/D143025

20 months agoRevert "[OpenMP][libomp] Remove false positive for memory sanitizer"
Ron Lieberman [Mon, 6 Feb 2023 19:16:37 +0000 (13:16 -0600)]
Revert "[OpenMP][libomp] Remove false positive for memory sanitizer"

breaks amdgpu buildbot

This reverts commit 402981ee25fe135d63226a7de17dbb14c437c71b.

20 months ago[Fuchsia] Simplified the stage2 build setup
Haowei Wu [Fri, 3 Feb 2023 18:46:04 +0000 (10:46 -0800)]
[Fuchsia] Simplified the stage2 build setup

This patch simplified the BOOTSTRAP_ flags, allowing them to be
passed through from regular flags.

Differential Revision: https://reviews.llvm.org/D143288

20 months ago[LinkerWrapper] Output a temp file with the wrapper bitcode
Joseph Huber [Mon, 6 Feb 2023 18:33:25 +0000 (12:33 -0600)]
[LinkerWrapper] Output a temp file with the wrapper bitcode

Summary:
The wrapper bitcode currently only gets a temp file for the compiled
object. This makes it more difficult to see what was actually generated.

20 months agoRevert "[Lint] Use new PM instead of legacy PM in lintFunction and lintModule"
Bjorn Pettersson [Mon, 6 Feb 2023 18:29:06 +0000 (19:29 +0100)]
Revert "[Lint] Use new PM instead of legacy PM in lintFunction and lintModule"

This reverts commit 525ed98be483188db6dc3bb69cecd0123148ceca.

Some buildbots are failing when linking bugpoint.
Reverting to investigate that further.

20 months ago[Lint] Use new PM instead of legacy PM in lintFunction and lintModule
Bjorn Pettersson [Mon, 6 Feb 2023 09:55:26 +0000 (10:55 +0100)]
[Lint] Use new PM instead of legacy PM in lintFunction and lintModule

There are some helpers in the Lint analysis pass that will setup
a pass manager and then run the Lint pass on a given Function/Module.

Those have been using the LegacyPassManager, but as a small step
towards removing the deprecated legacy pass manager this patch is
changing those helpers into using the new pass manager instead.

No idea if anyone is really using those helpers. Maybe an
alternative would have been to just remove them. There are at least no
unit tests or similar that verify that they work, so I validated this
patch by using a hacked opt binary that called those functions
before running the normal pipeline.

Differential Revision: https://reviews.llvm.org/D143388

20 months ago[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction
Bjorn Pettersson [Wed, 21 Dec 2022 21:10:52 +0000 (22:10 +0100)]
[TailDuplicator] Fix old bugs in TailDuplicator::duplicateInstruction

This patch updates TailDuplicator::duplicateInstruction to fix
some old bugs that have been found with an out-of-tree target. There
are three different things being addressed:

1) In one situation two subregister indices are combined using the
   composeSubRegIndices helper. But the order in which those indices
   are combined has been incorrect. For this problem I managed to
   create some kind of reproducer using AArch64 (see the test case
   touched in this patch).

2) Another fault was found in the else branch for the above situation.
   Here we do not compose the two subregisters, instead we insert a
   COPY to replace the PHI, and then the subreg index in the using
   MO remains. Thus, the virtual register created for the COPY should
   always match with the size of the original register. Therefore the
   optimization that "constrains" (or rather relaxes) the register
   class by looking at the instruction desc must be limited to the
   situation when there is no subregister access. Otherwise we create
   a vreg with the wrong class.

3) The last problem addressed in this patch is that when a new register
   class is picked by looking at the instruction desc, then it
   isn't guaranteed that the isAllocatable property is set for that
   class. So one needs to use the getAllocatableClass helper to find
   a subclass that is allocatable before using createVirtualRegister,
   or alternatively (as in this patch) just use the OrigRC instead
   of relaxing the register class for the COPY destination.

Haven't been able to find any in-tree reproducers for problems 2 and 3.
The tricky part is to find a target that has register hierarchies that
match with the problem to trigger those code paths (and with subreg
accesses involved).

Differential Revision: https://reviews.llvm.org/D140496

20 months ago[TailDuplicator] Pre-commit test case for a subreg composition bug
Bjorn Pettersson [Wed, 21 Dec 2022 20:23:57 +0000 (21:23 +0100)]
[TailDuplicator] Pre-commit test case for a subreg composition bug

Differential Revision: https://reviews.llvm.org/D140495

20 months ago[Coverage] Map regions from system headers
Gulfem Savrun Yeniceri [Fri, 27 Jan 2023 18:02:26 +0000 (18:02 +0000)]
[Coverage] Map regions from system headers

Originally, the following commit removed mapping coverage regions for system headers:
https://github.com/llvm/llvm-project/commit/93205af066341a53733046894bd75c72c99566db

It might be viable and useful to collect coverage from system headers in some systems.
This patch adds --system-headers-coverage option (disabled by default) to enable
collecting coverage from system headers.

Differential Revision: https://reviews.llvm.org/D143304

20 months agoRecommit "[ConstraintElim] Enable pass by default."
Florian Hahn [Mon, 6 Feb 2023 18:09:42 +0000 (18:09 +0000)]
Recommit "[ConstraintElim] Enable pass by default."

This reverts commit 695ce48c63ec582a46bfbda9b066f4d3bcde143f.

The compile-time regression causing the revert has been fixed. Recommit
the original patch.

Original commit message:

    The pass should help to close a functional gap when it comes to
    reasoning about related conditions in a relatively general way.

    It addresses multiple existing issues (linked below) and the need for a
    more powerful reasoning system was also discussed recently in
    https://discourse.llvm.org/t/rfc-alternative-approach-of-dealing-with-implications-from-comparisons-through-pos-analysis/65601/7

    On AArch64, the new pass performs ~2000 simplifications on
    MultiSource,SPEC2006,SPEC2017 with -O3.

    Compile-time impact:

    NewPM-O3: +0.20%
    NewPM-ReleaseThinLTO: +0.32%
    NewPM-ReleaseLTO-g: +0.28%

    https://llvm-compile-time-tracker.com/compare.php?from=f01a3a893c147c1594b9a3fbd817456b209dabbf&to=577688758ef64fb044215ec3e497ea901bb2db28&stat=instructions:u

    Fixes #49344.
    Fixes #47888.
    Fixes #48253.
    Fixes #49229.
    Fixes #58074.

    Reviewed By: asbirlea

    Differential Revision: https://reviews.llvm.org/D135915

20 months ago[DebugInfo] Add missing 'break' in switch (NFC)
Benjamin Maxwell [Mon, 6 Feb 2023 17:38:35 +0000 (17:38 +0000)]
[DebugInfo] Add missing 'break' in switch (NFC)

20 months agoDon't re-export top-level modules
Vassil Vassilev [Mon, 6 Feb 2023 17:33:54 +0000 (17:33 +0000)]
Don't re-export top-level modules

In https://reviews.llvm.org/D119036 we fixed some of the infrastructure by
removing the textual keyword.

The underlying issue of PR50592 was that clang can re-export only submodules but
under some conditions we needed to re-export the standalone module std_config
via std. This patch provides a better fix to the symptom D119036 fixed.

Differential revision: https://reviews.llvm.org/D142805

20 months ago[DAG] Remove non-canonical AVG case.
David Green [Mon, 6 Feb 2023 17:24:25 +0000 (17:24 +0000)]
[DAG] Remove non-canonical AVG case.

This removes a condition in the detection of AVG nodes, where we needn't be
checking the LHS of an add node as any const will be canonicalized to the RHS.

20 months ago[DAG][AArch64][ARM] Recognize avg (hadd) from wrapping flags
David Green [Mon, 6 Feb 2023 17:24:01 +0000 (17:24 +0000)]
[DAG][AArch64][ARM] Recognize avg (hadd) from wrapping flags

This slightly extends the creation of hadd nodes to allow them to be generated
with the original type size if wrapping flags allow.
https://alive2.llvm.org/ce/z/bPjakD
https://alive2.llvm.org/ce/z/fa_gzb
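
A rough illustration of why the wrapping flag matters for the unsigned case: when the narrow add is known not to wrap, shifting its result gives the same value as the widened halving add, so no extension is needed. The 8-bit width and function names below are assumptions for this sketch, not code from the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1

def hadd_wide(a, b):
    """Halving add computed with enough bits that the sum cannot wrap."""
    return (a + b) >> 1

def shift_of_narrow_add(a, b):
    """(a + b) >> 1 evaluated at the original width, where the add may wrap."""
    return ((a + b) & MASK) >> 1

def add_is_nuw(a, b):
    """True when the BITS-wide unsigned add does not wrap."""
    return a + b <= MASK
```

For example, with `a = 200` and `b = 100` the narrow add wraps and the two results differ (22 versus 150), which is the case the `nuw` requirement excludes.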

Differential Revision: https://reviews.llvm.org/D143371

20 months ago[DebugInfo] Handle fixed-width DW_FORM_addrx variants in DWARFFormValue::getAsSection...
Benjamin Maxwell [Wed, 1 Feb 2023 13:35:31 +0000 (13:35 +0000)]
[DebugInfo] Handle fixed-width DW_FORM_addrx variants in DWARFFormValue::getAsSectionedAddress()

Previously this would incorrectly return the raw offset into the .debug_addr section for the
DW_FORM_addrx1/2/3/4 forms rather than the actual address.
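
To make the distinction concrete: an addrx form stores an index, and the address is found by looking that index up in the `.debug_addr` table starting at the unit's address base. The sketch below is a simplified model, not LLVM's `DWARFFormValue` code; `resolve_addrx` is a hypothetical helper, and it assumes little-endian data and 4- or 8-byte addresses.

```python
import struct

def resolve_addrx(debug_addr, addr_base, index, addr_size=8):
    """Return the address stored at the given index in the address table.

    debug_addr: raw bytes of the .debug_addr section
    addr_base:  offset of the first table entry (just past the header)
    """
    fmt = {4: "<I", 8: "<Q"}[addr_size]        # little-endian is an assumption
    offset = addr_base + index * addr_size
    (address,) = struct.unpack_from(fmt, debug_addr, offset)
    return address

def make_debug_addr(addresses, addr_size=8):
    """Build a toy section: 32-bit DWARF v5 header followed by the table."""
    fmt = {4: "<I", 8: "<Q"}[addr_size]
    body = b"".join(struct.pack(fmt, a) for a in addresses)
    # unit_length counts everything after the length field itself.
    header = struct.pack("<IHBB", 4 + len(body), 5, addr_size, 0)
    return header + body
```

Returning `index` (or the computed `offset`) instead of the unpacked value is the kind of mistake the commit describes.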

Note that this was handled correctly in the dump() function so this issue only occurs for users
of this API and not in tools such as llvm-dwarfdump. The dump() method has now been updated
to use this method to increase coverage.

This also now adds a few unit tests for indexed addresses to DWARFDebugInfoTest.

Differential Revision: https://reviews.llvm.org/D143073

20 months ago[libc] Add `LIBC_GPU_TEST_ARCHITECTURE` option to set architecture
Joseph Huber [Mon, 6 Feb 2023 15:14:20 +0000 (09:14 -0600)]
[libc] Add `LIBC_GPU_TEST_ARCHITECTURE` option to set architecture

Currently, the plan is to support testing on a single GPU architecture.
We query the supported architectures from the user's system. However,
there are times when the user would want to override this. This patch
adds the `LIBC_GPU_TEST_ARCHITECTURE` option, which allows users to
specify which GPU architecture to build for.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D143400

20 months ago[ConstraintElim] Update existing constraint system in place (NFC).
Florian Hahn [Mon, 6 Feb 2023 16:43:42 +0000 (16:43 +0000)]
[ConstraintElim] Update existing constraint system in place (NFC).

This patch breaks up the solving step into 2 phases:

1. Collect all rows where the variable to eliminate is != 0 and remove
   it from the original system.
2. Process all collected rows to build a new set of constraints and add
   them to the original system.

This is much more efficient for excessive cases, as this avoids a large
number of moves to the new system. This reduces the time spent in
ConstraintElimination for the test case shared in D135915 from ~3s to
0.6s.
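
The two phases can be sketched as a simplified Fourier-Motzkin elimination step. This is not the ConstraintSystem code from LLVM: the row representation (`(coeffs, bound)` meaning `sum(coeffs[i] * x[i]) <= bound`) and the function name are assumptions chosen for illustration.

```python
def eliminate_in_place(system, k):
    """Eliminate variable k from a list of (coeffs, bound) rows, in place."""
    # Phase 1: collect the rows that mention variable k and remove them
    # from the original system; rows that do not mention k stay put.
    collected = [row for row in system if row[0][k] != 0]
    system[:] = [row for row in system if row[0][k] == 0]

    # Phase 2: combine each positive-coefficient row with each
    # negative-coefficient row so that variable k cancels, then append
    # the resulting rows to the original system.
    positive = [row for row in collected if row[0][k] > 0]
    negative = [row for row in collected if row[0][k] < 0]
    for p_coeffs, p_bound in positive:
        for n_coeffs, n_bound in negative:
            mp, mn = -n_coeffs[k], p_coeffs[k]   # both multipliers are > 0
            coeffs = [mp * a + mn * b for a, b in zip(p_coeffs, n_coeffs)]
            system.append((coeffs, mp * p_bound + mn * n_bound))
    return system
```

For instance, eliminating `y` from `x - y <= 0`, `y <= 5`, `x <= 7` leaves `x <= 7` untouched and appends the derived row `x <= 5`; only rows involving `y` are ever moved.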

20 months ago[LLDB][NFC] Fix a incorrect use of shared_ptr in RenderScriptRuntime.cpp
Shivam Gupta [Mon, 6 Feb 2023 15:43:53 +0000 (21:13 +0530)]
[LLDB][NFC] Fix a incorrect use of shared_ptr in RenderScriptRuntime.cpp

Incorrect use of shared_ptr.

Found by PVS-Studio https://pvs-studio.com/en/blog/posts/cpp/1003/, N8 & N9.

Differential Revision: https://reviews.llvm.org/D142309

20 months ago[AMDGPU][NFC] Clean up the VOP profile definition for v_swap_b32.
Ivan Kosarev [Mon, 6 Feb 2023 13:07:24 +0000 (13:07 +0000)]
[AMDGPU][NFC] Clean up the VOP profile definition for v_swap_b32.

v_swap_b32 is a VOP1-only instruction, meaning it neither encodes src1
nor has 64-bit encodings.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D143289

20 months ago[Coroutines] Pass size parameter for deallocation function when qualified
Chuanqi Xu [Mon, 6 Feb 2023 16:20:50 +0000 (00:20 +0800)]
[Coroutines] Pass size parameter for deallocation function when qualified

Close https://github.com/llvm/llvm-project/issues/60545.

Previously, we would only pass the size parameter to the deallocation
function if the types were exactly the same. But it is good enough for
them to be the same after removing qualifiers.

20 months ago[bazel][libc] Fix dependencies for float functions
Dmitry Chernenkov [Mon, 6 Feb 2023 15:53:56 +0000 (15:53 +0000)]
[bazel][libc] Fix dependencies for float functions

20 months ago[ConstraintElim] Move some array accesses to variables (NFC).
Florian Hahn [Sun, 5 Feb 2023 22:05:53 +0000 (22:05 +0000)]
[ConstraintElim] Move some array accesses to  variables (NFC).

Move some accesses that are used multiple times to variables. This will
also make updating them easier in the future.

20 months ago[AMDGPU] Fix some LABEL check lines
Jay Foad [Mon, 6 Feb 2023 15:43:38 +0000 (15:43 +0000)]
[AMDGPU] Fix some LABEL check lines

20 months ago[AMDGPU] Fix DOS line endings in some tests
Jay Foad [Mon, 6 Feb 2023 15:42:39 +0000 (15:42 +0000)]
[AMDGPU] Fix DOS line endings in some tests

20 months agoReapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e
serge-sans-paille [Thu, 26 Jan 2023 07:41:14 +0000 (08:41 +0100)]
Reapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e

Lazyly initialize uncommon toolchain detector

Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Reapplied since 77910ac374656319ff114ef251fda358d4aa166a landed and
fixes the test ordering issue.

Differential Revision: https://reviews.llvm.org/D142606

20 months ago[NFC] add new function is64Bit for SymbolicFile class
zhijian [Mon, 6 Feb 2023 15:43:29 +0000 (10:43 -0500)]
[NFC] add new function is64Bit for SymbolicFile class

Summary:

The class 'SymbolicFile' does not have an is64Bit() API, so whenever we need to check whether a SymbolicFile object is 64-bit we have to write a function to do it, which may cause code duplication.

Reviewers: James Henderson, Fangrui Song
Differential Revision: https://reviews.llvm.org/D143097

20 months ago[X86] combineConcatVectorOps - add X86ISD::VPERMV handling
Simon Pilgrim [Mon, 6 Feb 2023 15:20:02 +0000 (15:20 +0000)]
[X86] combineConcatVectorOps - add X86ISD::VPERMV handling

20 months ago[X86] combineConcatVectorOps - merge 256-bit logic ops on AVX2+
Simon Pilgrim [Mon, 6 Feb 2023 14:28:51 +0000 (14:28 +0000)]
[X86] combineConcatVectorOps - merge 256-bit logic ops on AVX2+

AVX1 doesn't benefit as nearly all integer ops will stay as 128-bit ops.

This only exposes a couple of minor changes but will be a lot more useful in an upcoming shuffle combining patch.

20 months ago[extract_symbols.py] Better handling of templates
John Brawn [Mon, 30 Jan 2023 14:34:14 +0000 (14:34 +0000)]
[extract_symbols.py] Better handling of templates

Since commit 846b676 SmallVectorBase<uint32_t> has been explicitly
instantiated, which means that clang.exe must export it for a plugin
to be able to link against it, but the constructor is not exported as
currently no template constructors or destructors are exported.

We can't just export all constructors and destructors, as that puts us
over the symbol limit on Windows, so instead rewrite how we decide
which templates need to be exported to be more precise. Currently we
assume that templates instantiated many times have no explicit
instantiations, but this isn't necessarily true and results also in
exporting implicit template instantiations that we don't need
to. Instead check for references to template members, as this
indicates that the template must be explicitly instantiated (as if it
weren't the template would just be implicitly instantiated on use).

Doing this reduces the number of symbols exported from clang from
66011 to 53993 (in the build configuration that I've been testing). It
also lets us get rid of the special-case handling of Type::getAs, as
its explicit instantiations are now being detected as such.

Differential Revision: https://reviews.llvm.org/D142989

20 months ago[X86] Change precision control to FP80 during u64->fp32 conversion on Windows.
Craig Topper [Mon, 6 Feb 2023 15:29:31 +0000 (07:29 -0800)]
[X86] Change precision control to FP80 during u64->fp32 conversion on Windows.

This is an alternative to D141074 to fix the problem by adjusting
the precision control dynamically.

Reviewed By: icedrocket

Differential Revision: https://reviews.llvm.org/D142178

20 months ago[OpenMP][libomp] Remove false positive for memory sanitizer
Jonathan Peyton [Mon, 6 Feb 2023 15:26:44 +0000 (09:26 -0600)]
[OpenMP][libomp] Remove false positive for memory sanitizer

The memory sanitizer intercepts the memcpy() call but not the direct
assignment of last byte to 0. This leads the sanitizer to believe the
last byte of a string based on the kmp_str_buf_t type is uninitialized.
Hence, the eventual strlen() inside __kmp_env_dump() leads to a
use-of-uninitialized-value warning.

Using strncat() instead gives the sanitizer the information it needs.

Differential Revision: https://reviews.llvm.org/D143401

Fixes #60501

20 months ago[libc++][CI] Uses LLVM 17 in Docker.
Mark de Wever [Tue, 31 Jan 2023 19:50:08 +0000 (20:50 +0100)]
[libc++][CI] Uses LLVM 17 in Docker.

Updates the LLVM versions used in the Dockerfile. It also removes
obsolete symlinks. This doesn't update the Buildkite jobs; they need to
use the new Docker image before they can be updated.

Reviewed By: ldionne, #libc, philnik

Differential Revision: https://reviews.llvm.org/D143007

20 months ago[RISCV] Remove DecoderMethod from C_NOP_HINT. NFC
Craig Topper [Mon, 6 Feb 2023 15:14:37 +0000 (07:14 -0800)]
[RISCV] Remove DecoderMethod from C_NOP_HINT. NFC

This doesn't appear to be needed.

Differential Revision: https://reviews.llvm.org/D143367

20 months ago[RISCV] Make 'c.addi x0, imm' an alias for 'c.nop imm'.
Craig Topper [Mon, 6 Feb 2023 15:11:10 +0000 (07:11 -0800)]
[RISCV] Make 'c.addi x0, imm' an alias for 'c.nop imm'.

Instead of making it an AsmParserOnly instruction, make it an alias.
This makes printing consistent with disassembly.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D143362

20 months ago[mlir][bufferization] Fix bufferization of repetitive regions
Matthias Springer [Mon, 6 Feb 2023 15:17:45 +0000 (16:17 +0100)]
[mlir][bufferization] Fix bufferization of repetitive regions

The previous strategy was too complex and faulty. Op dominance cannot be used to rule out RaW conflicts due to op ordering if the reading op and the conflicting writing op are in a sub repetitive region of the closest enclosing repetitive region of the definition of the read value.

Differential Revision: https://reviews.llvm.org/D143087

20 months ago[HIP] Support ASAN with malloc/free
Yaxun (Sam) Liu [Mon, 9 Jan 2023 21:50:15 +0000 (16:50 -0500)]
[HIP] Support ASAN with malloc/free

Device side malloc/free needs special
implementation for ASAN.

Reviewed by: Artem Belevich, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D143111

20 months ago[mlir][bufferization] Reads from tensors with undefined data are not a conflict
Matthias Springer [Mon, 6 Feb 2023 15:10:23 +0000 (16:10 +0100)]
[mlir][bufferization] Reads from tensors with undefined data are not a conflict

Reading from tensor.empty or bufferization.alloc_tensor (without copy) cannot cause a conflict because these ops do not specify the contents of their result tensors.

Differential Revision: https://reviews.llvm.org/D143183

20 months ago[clang] Reorder output of rocm-detect.hip test
serge-sans-paille [Mon, 6 Feb 2023 15:02:21 +0000 (16:02 +0100)]
[clang] Reorder output of rocm-detect.hip test

Since 6fa2abf90886f18472c87bc9bffbcdf4f73c465e the rocm driver is lazily
loaded, which impacts the output of the rocm-detect.hip test.

20 months agoRevert "Lazyly initialize uncommon toolchain detector"
Jonas Hahnfeld [Mon, 6 Feb 2023 14:39:33 +0000 (15:39 +0100)]
Revert "Lazyly initialize uncommon toolchain detector"

clang/test/Driver/rocm-detect.hip is failing for a number of
configurations, for example:

clang-x86_64-debian-fast
https://lab.llvm.org/buildbot/#/builders/109/builds/57270

clang-debian-cpp20
https://lab.llvm.org/buildbot/#/builders/249/builds/310

clang-with-lto-ubuntu
https://lab.llvm.org/buildbot/#/builders/124/builds/6693

This reverts commit 6fa2abf90886f18472c87bc9bffbcdf4f73c465e.

20 months ago[flang][hlfir] deref pointers before lowering assignment to hlfir.assign
Jean Perier [Mon, 6 Feb 2023 14:14:08 +0000 (15:14 +0100)]
[flang][hlfir] deref pointers before lowering assignment to hlfir.assign

There is little point in not dereferencing pointer LHS and RHS
before emitting an hlfir.assign when lowering an assignment.
This pushes complexity and descriptor read side effects that are better
expressed in a load before the assignment.

Differential Revision: https://reviews.llvm.org/D143372

20 months agoFix broken link to CxxCodeBrowser in External Clang Examples
Pratik Sharma [Mon, 6 Feb 2023 14:11:26 +0000 (09:11 -0500)]
Fix broken link to CxxCodeBrowser in External Clang Examples

Replaced the dead link with the correct link in ExternalClangExamples.rst

Differential Revision: https://reviews.llvm.org/D143343
Fixes https://github.com/llvm/llvm-project/issues/60142

20 months ago[AArch64] Don't create ST2 for 64bit store that requires an EXT
David Green [Mon, 6 Feb 2023 14:05:26 +0000 (14:05 +0000)]
[AArch64] Don't create ST2 for 64bit store that requires an EXT

A 64bit st2 which does not start at element 0 will involve adding extra ext
elements, making the st2 unprofitable. This prevents that case, which can lead
to a few fewer instructions.

Differential Revision: https://reviews.llvm.org/D142966

20 months ago[clangd] Remove the direct use of StdSymbolMapping.inc usage.
Haojian Wu [Fri, 3 Feb 2023 16:08:45 +0000 (17:08 +0100)]
[clangd] Remove the direct use of StdSymbolMapping.inc usage.

Replace them with the library APIs.

Differential Revision: https://reviews.llvm.org/D143274

20 months agoUpdate status of WG21 DR1042
Aaron Ballman [Mon, 6 Feb 2023 13:34:31 +0000 (08:34 -0500)]
Update status of WG21 DR1042

We've supported attributes on alias declarations at least as far back
as Clang 3.5 from my testing. This also updates the RUN lines to test
the newer language modes as well.

20 months ago[mlir] more side effect verification in transform dialect
Alex Zinenko [Mon, 23 Jan 2023 14:46:46 +0000 (14:46 +0000)]
[mlir] more side effect verification in transform dialect

Add a verifier checking that if a transform operation consumes a handle
(which is associated with a payload operation being erased or
recreated), it also indicates modification of the payload IR. This
hasn't been consistent in the past because of the "no-aliasing"
assumption where we couldn't have had more than one handle to an
operation, requiring some handle-manipulation operations, such as
`transform.merge_handles` to consume their operands. That assumption has
been lifted and it is no longer necessary for these operations to
consume handles and thus make life harder for the clients.

Additionally, remove TransformEffects.td that uses the ODS mechanism for
indicating side effects that works only for operands and results. It
was being used incorrectly to also indicate effects on the payload IR,
not associated with any IR value, and lacked the consume/produce
semantics available via helpers in C++.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D142361

20 months ago[mlir][NFC] Use fully qualified C++ namespaces in .td files.
Vladislav Vinogradov [Mon, 6 Feb 2023 11:31:20 +0000 (14:31 +0300)]
[mlir][NFC] Use fully qualified C++ namespaces in .td files.

Add missing llvm:: and mlir:: namespaces qualifiers to some auto-generated code.

Reviewed By: ftynse, springerm

Differential Revision: https://reviews.llvm.org/D143381

20 months agoRemove no longer needed includes of LegacyPassManager.h
Bjorn Pettersson [Mon, 6 Feb 2023 10:29:17 +0000 (11:29 +0100)]
Remove no longer needed includes of LegacyPassManager.h

Most of the removed includes should probably have been removed already
when we removed TargetMachine::adjustPassManager.

20 months ago[CodeGen] Remove some not needed includes in BackendUtil.cpp
Bjorn Pettersson [Mon, 6 Feb 2023 10:00:37 +0000 (11:00 +0100)]
[CodeGen] Remove some not needed includes in BackendUtil.cpp

Getting rid of some include dependencies that seem to be outdated.

20 months agoAMDGPU/MC: Fix indentation and remove unused macro after D142636
Petar Avramovic [Mon, 6 Feb 2023 12:18:41 +0000 (13:18 +0100)]
AMDGPU/MC: Fix indentation and remove unused macro after D142636

20 months ago[clang-format] PackConstructorInitializers support PCIS_OnlyNextLine
Backl1ght [Mon, 6 Feb 2023 10:47:11 +0000 (18:47 +0800)]
[clang-format] PackConstructorInitializers support PCIS_OnlyNextLine

fixes https://github.com/llvm/llvm-project/issues/60241

Differential Revision: https://reviews.llvm.org/D143091

20 months ago[clang][NFC] Fix a documentation typo
Timm Bäder [Mon, 6 Feb 2023 11:36:25 +0000 (12:36 +0100)]
[clang][NFC] Fix a documentation typo

20 months ago[LoongArch] Add baseline tests for optimizations that merge offsets into instructions
gonglingqin [Thu, 2 Feb 2023 02:10:35 +0000 (10:10 +0800)]
[LoongArch] Add baseline tests for optimizations that merge offsets into instructions

20 months ago[LV] Also check interleaving only in select-min-index.ll
Florian Hahn [Mon, 6 Feb 2023 11:30:14 +0000 (11:30 +0000)]
[LV] Also check interleaving only in select-min-index.ll

The new combination exposed a crash in earlier versions of
D132063.

20 months ago[mlir][llvm] Add missing license header (NFC)
Christian Ulmann [Mon, 6 Feb 2023 11:14:13 +0000 (12:14 +0100)]
[mlir][llvm] Add missing license header (NFC)

This commit adds a missing license header that was forgotten in
https://reviews.llvm.org/D143064.

20 months ago[mlir][MemRef] Add required address space cast when lowering alloc to LLVM
Markus Böck [Sun, 5 Feb 2023 13:58:06 +0000 (14:58 +0100)]
[mlir][MemRef] Add required address space cast when lowering alloc to LLVM

alloc uses either `malloc` or a pluggable allocation function for allocating the required memory. Both of these functions always return an `llvm.ptr<i8>`, aka a pointer in the default address space. When allocating for a memref in a different memory space however, no address space cast is created, leading to invalid LLVM IR being generated.

This is currently not caught by the verifier since the pointer to the memory is always bitcast which currently lacks a verifier disallowing address space casts. Translating to actual LLVM IR would cause the verifier to go off, since bitcast cannot translate from one address space to another: https://godbolt.org/z/3a1z97rc9

This patch fixes that issue by generating an address space cast if the address space of the allocation function does not match the address space of the resulting memref.
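
The decision described here can be sketched as a toy model. This is not the MLIR lowering code: `lower_alloc`, `DEFAULT_ADDRESS_SPACE`, and the strings standing in for emitted operations are hypothetical and only illustrate when a cast is needed.

```python
DEFAULT_ADDRESS_SPACE = 0  # assumed address space of the allocator's result


def lower_alloc(memref_address_space,
                allocator_address_space=DEFAULT_ADDRESS_SPACE):
    """Return the toy sequence of operations emitted for one alloc."""
    ops = ["call @malloc"]
    # Only emit a cast when the two address spaces actually differ.
    if allocator_address_space != memref_address_space:
        ops.append(
            f"addrspacecast {allocator_address_space} -> {memref_address_space}"
        )
    return ops
```

A memref in the default address space gets no extra operation, while one in, say, address space 3 gets a single cast after the allocation call.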

Not sure whether this is actually a real life problem. I found this issue while converting the pass to using opaque pointers which gets rid of all the bitcasts and hence caused type errors without the address space cast.

Differential Revision: https://reviews.llvm.org/D143341

20 months agoLazily initialize uncommon toolchain detector
serge-sans-paille [Thu, 26 Jan 2023 07:41:14 +0000 (08:41 +0100)]
Lazily initialize uncommon toolchain detector

Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.
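
As a rough illustration of the idea (not the clang driver code), the sketch below defers an expensive detection step until its result is first requested. `LazyDetector` and `detect_cuda` are hypothetical names, and the probe is a stand-in that only records that it ran.

```python
class LazyDetector:
    """Runs an expensive detection function at most once, on first use."""

    def __init__(self, detect):
        self._detect = detect
        self._result = None
        self._initialized = False

    def get(self):
        if not self._initialized:
            self._result = self._detect()
            self._initialized = True
        return self._result


calls = []


def detect_cuda():
    # Hypothetical stand-in for an expensive filesystem probe.
    calls.append("cuda")
    return {"found": False}


# Constructing the wrapper does not run the probe.
cuda = LazyDetector(detect_cuda)
```

The probe runs only when `cuda.get()` is first called, and later calls reuse the cached result.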

Differential Revision: https://reviews.llvm.org/D142606

20 months ago[ARM][AArch64] Regenerate hadd tests. NFC
David Green [Mon, 6 Feb 2023 10:54:18 +0000 (10:54 +0000)]
[ARM][AArch64] Regenerate hadd tests. NFC

This just runs the existing tests through opt -O1, which helps canonicalize
the code and adds additional flags which can be useful for matching.

20 months ago[flang][NFC] Move IntrinsicCall to Optimizer/Builder/ 6/6
Tom Eccles [Wed, 1 Feb 2023 15:41:29 +0000 (15:41 +0000)]
[flang][NFC] Move IntrinsicCall to Optimizer/Builder/ 6/6

This will allow IntrinsicCall to be used in passes to implement hlfir
transformational intrinsic operations.

Differential Revision: https://reviews.llvm.org/D143084

20 months ago[flang][NFC] Move intrinsic name mangling to IntrinsicCall 5/6
Tom Eccles [Wed, 1 Feb 2023 15:29:01 +0000 (15:29 +0000)]
[flang][NFC] Move intrinsic name mangling to IntrinsicCall 5/6

This removes another dependency of IntrinsicCall upon flang/lib/Lower:
making it possible to move IntrinsicCall into flang/lib/Optimizer.

Differential Revision: https://reviews.llvm.org/D143083

20 months ago[flang][NFC] remove duplicate fir::toInt definition 4/6
Tom Eccles [Wed, 1 Feb 2023 15:22:14 +0000 (15:22 +0000)]
[flang][NFC] remove duplicate fir::toInt definition 4/6

Differential Revision: https://reviews.llvm.org/D143082

20 months ago[flang][NFC] Move runtime helpers used by intrinsics to lib/Optimizer 3/6
Tom Eccles [Wed, 1 Feb 2023 15:14:11 +0000 (15:14 +0000)]
[flang][NFC] Move runtime helpers used by intrinsics to lib/Optimizer 3/6

This will allow IntrinsicCall to be moved into lib/Optimizer later.

Differential Revision: https://reviews.llvm.org/D143081

20 months ago[flang][NFC] remove spurious dependency from IntrinsicCall 2/6
Tom Eccles [Wed, 1 Feb 2023 13:55:36 +0000 (13:55 +0000)]
[flang][NFC] remove spurious dependency from IntrinsicCall 2/6

Differential Revision: https://reviews.llvm.org/D143080

20 months ago[flang][NFC] remove stmtCtx genIntrinsicCall 1/6
Tom Eccles [Wed, 1 Feb 2023 11:54:36 +0000 (11:54 +0000)]
[flang][NFC] remove stmtCtx genIntrinsicCall 1/6

This removes IntrinsicCall's dependency upon StatementContext, which
will make it easier to move IntrinsicCall into flang/lib/Optimizer, for
use in passes.

Differential Revision: https://reviews.llvm.org/D143079

20 months ago[TLI] SimplifyMultipleUseDemandedBits - remove insert_subvector(undef, x, 0) fold
Simon Pilgrim [Mon, 6 Feb 2023 09:55:03 +0000 (09:55 +0000)]
[TLI] SimplifyMultipleUseDemandedBits - remove insert_subvector(undef, x, 0) fold

SimplifyMultipleUseDemandedBits shouldn't be creating general nodes on the fly, it should mainly just peek through them (although we do currently allow creation of new bitcasts and constant folding).

This is mostly a win - by avoiding new nodes we avoid a lot of hasOneUse limitations inside x86 shuffle combining - the main regressions I've noticed are where we've ended up with multiple insert_subvector(undef, x, 0) nodes, widening x to different vector widths - that should hopefully be improved when we remove the last of the vector widening from combineX86ShufflesRecursively for Issue #45319

20 months ago[libc] Fix pthread argument for scudo integration tests when using GCC
David Spickett [Fri, 3 Feb 2023 11:06:34 +0000 (11:06 +0000)]
[libc] Fix pthread argument for scudo integration tests when using GCC

The tests currently add "-pthreads", which appears to be a clang-only
alias for "-pthread" (all the drivers check for both).

Use "-pthread" instead to be compatible with gcc.

Otherwise you get:
FAILED: bin/libc-gwp-asan-uaf-should-crash
: && /usr/bin/g++-11 <...> -pthreads <...> projects/libc/test/integration/scudo/liblibc_for_scudo_integration_test.a && :
g++-11: error: unrecognized command-line option ‘-pthreads’; did you mean ‘-pthread’?

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D143258

20 months ago[mlir][tensor][bufferize] tensor.empty does not define the result tensor contents
Matthias Springer [Mon, 6 Feb 2023 09:19:22 +0000 (10:19 +0100)]
[mlir][tensor][bufferize] tensor.empty does not define the result tensor contents

This is encoded in the `BufferizableOpInterface` via `resultBufferizesToMemoryWrite = false`.

Differential Revision: https://reviews.llvm.org/D143181

20 months ago[Instcombine] precommit tests for icmp with intrinsic look through trunc; NFC
chenglin.bi [Mon, 6 Feb 2023 09:23:02 +0000 (17:23 +0800)]
[Instcombine] precommit tests for icmp with intrinsic look through trunc; NFC

20 months ago[mlir][llvm] Drop opaque ptr test in LLVM IR import.
Tobias Gysi [Mon, 6 Feb 2023 09:13:02 +0000 (10:13 +0100)]
[mlir][llvm] Drop opaque ptr test in LLVM IR import.

After switching all LLVM IR import tests to opaque pointers
the specialized opaque pointer test file is redundant.

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D143370

20 months ago[Modules] Recreate file manager for ftime-trace when compiling a module
Chuanqi Xu [Mon, 6 Feb 2023 09:11:22 +0000 (17:11 +0800)]
[Modules] Recreate file manager for ftime-trace when compiling a module

Close https://github.com/llvm/llvm-project/issues/60544.

The root cause for the issue is that when we compile a module unit, the
file manager (and preprocessor and source manager) are owned by the AST
instead of the compilation instance. So the file manager may be invalid
when we want to create a time-report file for -ftime-trace when we are
compiling a module unit.

This patch tries to recreate the file manager for -ftime-trace if we
find the file manager is not valid.

20 months ago[InstCombine] precommit tests for icmp with bool range; NFC
chenglin.bi [Mon, 6 Feb 2023 09:16:25 +0000 (17:16 +0800)]
[InstCombine] precommit tests for icmp with bool range; NFC

20 months ago[NFC][OpenMP][libomptarget] Fix format in PluginInterface header
Kevin Sala [Mon, 6 Feb 2023 09:12:55 +0000 (10:12 +0100)]
[NFC][OpenMP][libomptarget] Fix format in PluginInterface header

20 months ago[RISCV][NFC] Update debug message for XTHeadVdot
Philipp Tomsich [Sat, 4 Feb 2023 16:59:32 +0000 (17:59 +0100)]
[RISCV][NFC] Update debug message for XTHeadVdot

As we prepare the tree to add more vendor-defined extensions that are
originating with T-Head, the debug message announcing the XTHeadVdot
decoder namespace should refer to XTHeadVdot instead of all T-Head
custom extensions.

20 months ago[mlir][llvm] Fix bug in constant import from LLVM IR.
Tobias Gysi [Mon, 6 Feb 2023 09:01:27 +0000 (10:01 +0100)]
[mlir][llvm] Fix bug in constant import from LLVM IR.

The revision addresses a bug during constant expression traversal
when importing LLVM IR. A constant expression may have cyclic
dependencies, for example, when a constant is initialized with its
address. This revision extends the constant expression traversal
to detect cyclic dependencies and adds a test to verify this
case is handled properly.
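
One common way to detect such cycles is a depth-first traversal that tracks which nodes are on the current path. The sketch below shows that general technique in Python; it is not the importer's actual implementation, and `find_cyclic_constants` and the `deps` mapping are hypothetical.

```python
def find_cyclic_constants(deps, root):
    """Depth-first traversal of constant dependencies starting at root.

    deps maps a constant name to the constants its initializer refers to.
    Returns the constants that were reached again while still on the
    current traversal path, i.e. the ones that close a cycle.
    """
    on_path = set()   # constants currently being traversed
    done = set()      # constants whose dependencies were fully traversed
    cyclic = set()

    def visit(node):
        if node in on_path:
            cyclic.add(node)  # reached again before finishing: a cycle
            return
        if node in done:
            return
        on_path.add(node)
        for dep in deps.get(node, ()):
            visit(dep)
        on_path.remove(node)
        done.add(node)

    visit(root)
    return cyclic
```

A constant initialized with its own address corresponds to `deps = {"@g": ["@g"]}`, for which the function reports `@g`.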

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D143152

20 months ago[OpenMP][libomptarget] Notify the plugins regarding new mapping/unmappings
Kevin Sala [Wed, 25 Jan 2023 00:04:07 +0000 (01:04 +0100)]
[OpenMP][libomptarget] Notify the plugins regarding new mapping/unmappings

The NextGen plugins use the information regarding new mapping/unmappings to
lock/unlock the corresponding host buffer and speed up the host-device memory
transfers involving those buffers. The locking/unlocking is disabled by default
and can be enabled by the LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS envar. The
envar accepts boolean values (on/off) and a special option:
  - off:       Do not lock mapped host buffers (default).
  - on:        Lock mapped host buffers automatically, but do not report lock
               failures if the plugin fails to lock them.
  - mandatory: Lock mapped host buffers automatically and treat locking failures
               in the plugins as fatal errors. This option may be useful for
               debugging purposes.
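
The three settings can be summarized with a small model. This is only a sketch: the real plugins are written in C++ and may accept other boolean spellings or treat invalid values differently; `lock_policy` and `lock_failure_is_fatal` are hypothetical helpers.

```python
import os

ENVAR = "LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS"
VALID_POLICIES = ("off", "on", "mandatory")


def lock_policy(environ=os.environ):
    """Read the envar from a mapping, defaulting to "off" when unset."""
    value = environ.get(ENVAR, "off")
    if value not in VALID_POLICIES:
        # Assumption: this sketch rejects anything but the three listed values.
        raise ValueError(f"unsupported value for {ENVAR}: {value!r}")
    return value


def lock_failure_is_fatal(policy):
    """Only "mandatory" turns a locking failure into a fatal error."""
    return policy == "mandatory"
```

With the variable unset the policy is `off`; with `mandatory`, a failed lock would be treated as fatal, whereas with `on` it would not be reported.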

Differential Revision: https://reviews.llvm.org/D142514

20 months ago[NFC] Inline variable
Guillaume Chatelet [Mon, 6 Feb 2023 09:03:55 +0000 (09:03 +0000)]
[NFC] Inline variable

20 months ago[clangd] Semantic highlighting for constrained-parameter
Nathan Ridge [Mon, 30 Jan 2023 08:09:00 +0000 (03:09 -0500)]
[clangd] Semantic highlighting for constrained-parameter

Differential Revision: https://reviews.llvm.org/D142871

20 months ago[Release] Increase test-release.sh verbosity
Rainer Orth [Mon, 6 Feb 2023 08:30:36 +0000 (09:30 +0100)]
[Release] Increase test-release.sh verbosity

`test-release.sh` is too silent in some cases:

- Only the build proper is run verbosely, but `check-all` is not.
- `lit` is run without `-v`, so in case of failures one cannot see what's
actually wrong.

This patch fixes both issues, running all `${MAKE}` invocations with
`$Verbose` (except for `${MAKE} install` where it would only add noise),
and running `lit` with `-v`.

Tested on `x86_64-pc-linux-gnu` and `arm64-apple-darwin21.6`.

Differential Revision: https://reviews.llvm.org/D143249

20 months ago[flang][hlfir] Lower asInquired intrinsic arguments
Jean Perier [Mon, 6 Feb 2023 07:53:57 +0000 (08:53 +0100)]
[flang][hlfir] Lower asInquired intrinsic arguments

Differential Revision: https://reviews.llvm.org/D143272

20 months ago[flang][hlfir] Turn fir.char<1> results into hlfir.expr<fir.char<1>>
Jean Perier [Mon, 6 Feb 2023 07:51:56 +0000 (08:51 +0100)]
[flang][hlfir] Turn fir.char<1> results into hlfir.expr<fir.char<1>>

This gets rid of a special case with CHAR() intrinsic and BIND(C) results.
I tested that this has no impact on the LLVM assembly when LLVM opt -O1 or
more is run.
See comment in the patch for more details.

Differential Revision: https://reviews.llvm.org/D143270

20 months ago[RISCV] Use uint32_t instead of uint64_t for instruction fields in RISCVDisassembler...
Craig Topper [Mon, 6 Feb 2023 07:12:22 +0000 (23:12 -0800)]
[RISCV] Use uint32_t instead of uint64_t for instruction fields in RISCVDisassembler.cpp. NFC

The tablegen generated code is templated based on the type of Insn
passed to decodeInstruction which is currently uint32_t. All of the
fields extracted will have this type.

20 months ago[RISCV] Simplify some code in RISCVDisassembler. NFC
Craig Topper [Mon, 6 Feb 2023 06:42:57 +0000 (22:42 -0800)]
[RISCV] Simplify some code in RISCVDisassembler. NFC

Create X0 register directly instead of passing 0 to DecodeGPRRegisterClass.

20 months agoAMDGPU: Mark control flow intrinsics non-duplicable
Ruiling Song [Thu, 2 Feb 2023 05:59:59 +0000 (13:59 +0800)]
AMDGPU: Mark control flow intrinsics non-duplicable

This is used to help get simplified CFG for divergent regions as well as
get better code generation in some cases.

For example, with below IR:
```
define amdgpu_kernel void @test() {
bb:
  br label %bb1

bb1:
  %tmp = phi i32 [ 0, %bb ], [ %tmp5, %bb4 ]
  %tid = call i32 @llvm.amdgcn.workitem.id.x()
  %cnd = icmp eq i32 %tid, 0
  br i1 %cnd, label %bb4, label %bb2

bb2:
  %tmp3 = add nsw i32 %tmp, 1
  br label %bb4

bb4:
  %tmp5 = phi i32 [ %tmp3, %bb2 ], [ %tmp, %bb1 ]
  store volatile i32 %tmp5, ptr addrspace(1) undef
  br label %bb1
}
```

We got below assembly before the change:
```
  v_mov_b32_e32 v1, 0
  v_cmp_eq_u32_e32 vcc, 0, v0
  s_branch .LBB0_2
.LBB0_1:                                ; %bb4
                                        ;   in Loop: Header=BB0_2 Depth=1
  s_mov_b32 s2, -1
  s_mov_b32 s3, 0xf000
  buffer_store_dword v1, off, s[0:3], 0
  s_waitcnt vmcnt(0)
.LBB0_2:                                ; %bb
                                        ; =>This Inner Loop Header: Depth=1
  s_and_saveexec_b64 s[0:1], vcc
  s_xor_b64 s[0:1], exec, s[0:1]
                                        ; kill: def $sgpr0_sgpr1 killed $sgpr0_sgpr1 killed $exec
  s_cbranch_execnz .LBB0_1
; %bb.3:                                ; %bb2
                                        ;   in Loop: Header=BB0_2 Depth=1
  s_or_b64 exec, exec, s[0:1]
  s_waitcnt expcnt(0)
  v_add_i32_e64 v1, s[0:1], 1, v1
  s_branch .LBB0_1
```

After the change:
```
  s_mov_b32 s0, 0
  v_cmp_eq_u32_e32 vcc, 0, v0
  s_mov_b32 s2, -1
  s_mov_b32 s3, 0xf000
  v_mov_b32_e32 v0, s0
  s_branch .LBB0_2
.LBB0_1:                                ; %bb4
                                        ;   in Loop: Header=BB0_2 Depth=1
  buffer_store_dword v0, off, s[0:3], 0
  s_waitcnt vmcnt(0)
.LBB0_2:                                ; %bb1
                                        ; =>This Inner Loop Header: Depth=1
  s_and_saveexec_b64 s[0:1], vcc
  s_cbranch_execnz .LBB0_1
; %bb.3:                                ; %bb2
                                        ;   in Loop: Header=BB0_2 Depth=1
  s_or_b64 exec, exec, s[0:1]
  s_waitcnt expcnt(0)
  v_add_i32_e64 v0, s[0:1], 1, v0
  s_branch .LBB0_1
```

We are using one less VGPR, one less s_xor_, and better LICM with one
additional branch after the change. Please note that the experiment
was done with the workaround D139780 reverted, as that workaround stops
tail duplication completely for this case.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D118250

20 months ago[mlir] Use mlir::TypedValue to avoid compiler bug in MSVC.
Adrian Kuegel [Mon, 6 Feb 2023 07:02:28 +0000 (08:02 +0100)]
[mlir] Use mlir::TypedValue to avoid compiler bug in MSVC.

20 months agoRevert "[lldb] Fix warning about unhandled enum value `WasmExternRef` (NFC)."
Kazu Hirata [Mon, 6 Feb 2023 06:45:46 +0000 (22:45 -0800)]
Revert "[lldb] Fix warning about unhandled enum value `WasmExternRef` (NFC)."

This reverts commit b27e4f72213e78cacf0ce5bfd127261ec0b9309b.

bccf5999d38f14552f449618c1d72d18613f4285 necessitates this revert.

20 months agoRevert "[clang][WebAssembly] Initial support for reference type externref in clang"
Vitaly Buka [Mon, 6 Feb 2023 05:26:19 +0000 (21:26 -0800)]
Revert "[clang][WebAssembly] Initial support for reference type externref in clang"

Very likely breaks stage 3 of msan build bot.
Good: 764c88a50ac76a2df2d051a0eb5badc6867aabb6 https://lab.llvm.org/buildbot/#/builders/74/builds/17058
Looks unrelated: 48b5a06dfcab12cf093a1a3df42cb5b684e2be4c
Bad: 48b5a06dfcab12cf093a1a3df42cb5b684e2be4c https://lab.llvm.org/buildbot/#/builders/74/builds/17059

This reverts commit eb66833d19573df97034a81279eda31b8d19815b.