platform/upstream/llvm.git
21 months ago[mlir][sparse] Replace the folding of nop convert with a codegen rule.
bixia1 [Wed, 19 Oct 2022 00:22:13 +0000 (17:22 -0700)]
[mlir][sparse] Replace the folding of nop convert with a codegen rule.

This is to allow the use of a nop convert to express that the sparse tensor
allocated through bufferization::AllocTensorOp will be expanded to sparse
tensor storage by sparse tensor codegen.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D136214

21 months ago[OMPIRBuilder] Support depend clause for task
Prabhdeep Singh Soni [Fri, 7 Oct 2022 20:55:13 +0000 (16:55 -0400)]
[OMPIRBuilder] Support depend clause for task

This patch adds support for the `depend` clause for the `task`
construct.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135695

21 months ago[DX] Fix missing preserved analysis
Chris Bieneman [Wed, 19 Oct 2022 17:09:43 +0000 (12:09 -0500)]
[DX] Fix missing preserved analysis

The ShaderFlagsAnalysisWrapper needs to be marked to preserve all
analyses.

Fixes #58474 (https://github.com/llvm/llvm-project/issues/58474)

21 months ago[AArch64] Fix minor issue introduced in D135950.
Sander de Smalen [Wed, 19 Oct 2022 16:53:18 +0000 (16:53 +0000)]
[AArch64] Fix minor issue introduced in D135950.

The Key for the SubtargetMap had the StreamingSVEModeDisabled in the
wrong place. This change is non-functional, since the string (key) is
still unique.

21 months ago[DirectX] Disabling currently failing test
Chris Bieneman [Wed, 19 Oct 2022 16:50:08 +0000 (11:50 -0500)]
[DirectX] Disabling currently failing test

The pretty-printer isn't working because the resource analysis isn't
properly preserved.

21 months ago[AArch64] SME2 Single-multi vector ternary int/FP 2 and 4 registers
Caroline Concatto [Mon, 3 Oct 2022 13:11:01 +0000 (14:11 +0100)]
[AArch64] SME2 Single-multi vector ternary int/FP 2 and 4 registers

This patch adds the assembly/disassembly for the following instructions:

For INT:
    ADD(array results, multiple and single vector): Add replicated single
        vector to multi-vector with ZA array vector results.
    SUB(array results, multiple and single vector): Subtract replicated single
        vector from multi-vector with ZA array vector results.
For FP:
    FMLA (multiple and single vector): Multi-vector floating-point fused
          multiply-add by vector.
    FMLS (multiple and single vector): Multi-vector floating-point
          multiply-subtract long by vector.
The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

The Matrix Operand has 2 new sizes: 32 (.s) and 64 (.d) bits
(MatrixOp32 and MatrixOp64)

Depends on: D135448

Depends on:  D135952

Differential Revision: https://reviews.llvm.org/D135455

21 months ago[AArch64][SME] Disable (SLP|Loop)Vectorizer when function may be executed in streamin...
Sander de Smalen [Wed, 19 Oct 2022 14:14:00 +0000 (14:14 +0000)]
[AArch64][SME] Disable (SLP|Loop)Vectorizer when function may be executed in streaming mode.

When the SME attributes tell that a function is or may be executed in Streaming
SVE mode, we currently need to be conservative and disable _any_ vectorization
(fixed or scalable) because the code-generator does not yet support generating
streaming-compatible code.

Scalable auto-vec will be gradually enabled in the future when we have
confidence that the loop-vectorizer won't use any SVE or NEON instructions
that are illegal in Streaming SVE mode.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D135950

21 months ago[MLIR][Tensor] Remove assert in PadOp builder
Lorenzo Chelini [Wed, 19 Oct 2022 15:31:22 +0000 (17:31 +0200)]
[MLIR][Tensor] Remove assert in PadOp builder

The assert is misplaced as the result type is allowed to be null. A few
lines below the result type is inferred if it is passed a nullptr.
Besides, this behavior is described in the documentation of the builder.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D136262

21 months agoMove HLSL builtins into hlsl namespace
Chris Bieneman [Wed, 19 Oct 2022 15:18:19 +0000 (10:18 -0500)]
Move HLSL builtins into hlsl namespace

Should have done this from the start. Since all the injected AST types
are in the hlsl namespace we should also put the header-defined types
and functions in there too.

This updates the basic_types test to run once with the namespaced types
and once without, and adds using declarations or namespaces calls in
other tests.

Reviewed By: python3kgae

Differential Revision: https://reviews.llvm.org/D135973

21 months ago[X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics
Phoebe Wang [Wed, 19 Oct 2022 08:26:54 +0000 (16:26 +0800)]
[X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

This is an alternative to D120395 and D120411.

Previously we used `__bfloat16` as a typedef of `unsigned short`. The
name may give users the impression that it is a brand new type to represent
BF16, so they may use it in arithmetic operations, and we don't have
a good way to block that.

To solve the problem, we introduced `__bf16` to X86 psABI and landed the
support in Clang by D130964. Now we can solve the problem by switching
intrinsics to the new type.
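
The hazard described above can be illustrated with a minimal sketch. This is not code from the patch: `LegacyBF16`, `toBF16Bits` and `addLegacy` are hypothetical names, and the alias only mimics the old `unsigned short` typedef. It assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the old typedef: just an integer type.
using LegacyBF16 = unsigned short;

// Truncating float -> BF16 bit pattern (keeps the top 16 bits of the float).
inline LegacyBF16 toBF16Bits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<LegacyBF16>(Bits >> 16);
}

// Compiles without any diagnostic, but performs *integer* addition on the
// bit patterns instead of a floating-point addition.
inline LegacyBF16 addLegacy(LegacyBF16 A, LegacyBF16 B) {
  return static_cast<LegacyBF16>(A + B);
}
```

Adding the bit patterns of 1.0 and 1.0 this way does not produce the bit pattern of 2.0, which is the kind of silent misuse a distinct type can rule out.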

Reviewed By: LuoYuanke, RKSimon

Differential Revision: https://reviews.llvm.org/D132329

21 months ago[AMDGPU] New helper function SIInsertWaitcnts::getVmemWaitEventType
Jay Foad [Wed, 19 Oct 2022 12:39:20 +0000 (13:39 +0100)]
[AMDGPU] New helper function SIInsertWaitcnts::getVmemWaitEventType

This just commons up and simplifies some logic that was repeated in
SIInsertWaitcnts::updateEventWaitcntAfter. NFCI.

Differential Revision: https://reviews.llvm.org/D136253

21 months ago[SLP][NFC]Add a test for possible reordering gap in SLP, NFC.
Alexey Bataev [Wed, 19 Oct 2022 15:21:09 +0000 (08:21 -0700)]
[SLP][NFC]Add a test for possible reordering gap in SLP, NFC.

21 months agoAvoid exporting 80-bit fp functions for architectures other than Intel.
Malhar Jajoo [Wed, 19 Oct 2022 14:55:15 +0000 (15:55 +0100)]
Avoid exporting 80-bit fp functions for architectures other than Intel.

This patch is a partial fix for https://github.com/llvm/llvm-project/issues/56349, due to functions affected by D117473.

Implementation details:
The patch essentially creates a new macro if the architecture is either
intel32 or intel64, since the generate-def.pl cannot process boolean algebra
on macros.

Reviewed By: jlpeyton

Differential Revision: https://reviews.llvm.org/D135795

21 months ago[analyzer] Make directly bounded LazyCompoundVal as lazily copied
Tomasz Kamiński [Wed, 19 Oct 2022 09:38:21 +0000 (11:38 +0200)]
[analyzer] Make directly bounded LazyCompoundVal as lazily copied

Previously, `LazyCompoundVal` bindings to subregions referred by
`LazyCompoundVals` were not marked as //lazily copied//.

This change returns `LazyCompoundVals` from `getInterestingValues()`,
so their regions can be marked as //lazily copied// in `RemoveDeadBindingsWorker::VisitBinding()`.

Depends on D134947

Authored by: Tomasz Kamiński <tomasz.kamiński@sonarsource.com>

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D135136

21 months ago[analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal
Tomasz Kamiński [Wed, 19 Oct 2022 09:38:21 +0000 (11:38 +0200)]
[analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal

To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1
```lang=c++
void clang_analyzer_printState();

struct C {
   int x;
   int y;
   int more_padding;
};

struct D {
   C c;
   int z;
};

C foo(D d, int new_x, int new_y) {
   d.c.x = new_x;       // B1
   assert(d.c.x < 13);  // C1

   C c = d.c;           // L

   assert(d.c.y < 10);  // C2
   assert(d.z < 5);     // C3

   d.c.y = new_y;       // B2

   assert(d.c.y < 10);  // C4

   return c;  // R
}
```
In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), constraints on the values (`C1`, `C2`, …), and a `LazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`.

Now, the question is which of these should remain live as long as the return value of the `foo` call is live. In a perfect world we should preserve:

  # only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`.  In other words, `new_x` should be live but `new_y` shouldn’t.

  # constraints on the values of `d.c` that are reachable through `c`. These can be created either before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`, but not `C3` (`d.z` value is not reachable through `c`) or `C4` (the original value of `d.c.y` was overridden at `B2` after the creation of `c`).

The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`).

We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. locations for which a program can access the stored value, but not its address. These locations are constructed as a set of regions referred to by `LazyCompoundVal`. A //readable// location (region) is a location that is //live// or //lazily copied//. Symbols that refer to values in regions are alive if the region is //readable//.

For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsely live (`d.z` in our example) and keeps the corresponding constraints alive.

The rename of `Regions` to `LiveRegions` inside `RegionStore` is an NFC change, made to clarify the difference between the regions stored in these two sets.

Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D134947

21 months ago[Libomptarget][NFC] clang-format the libomptarget OpenMP tests
Joseph Huber [Wed, 19 Oct 2022 13:26:35 +0000 (08:26 -0500)]
[Libomptarget][NFC] clang-format the libomptarget OpenMP tests

Summary:
Recent changes to clang-format improved the handling of OpenMP pragmas.
Clean up the existing libomptarget tests.

21 months ago[AMDGPU] V_LDEXP_F16 encoding fix and doc update.
Joe Nash [Tue, 18 Oct 2022 18:59:19 +0000 (14:59 -0400)]
[AMDGPU] V_LDEXP_F16 encoding fix and doc update.

The amdgcn.ldexp.* intrinsics take an i32 value as src1.
The V_LDEXP_F16 instruction considers src1 an f16 operand, and therefore
src1 is implicitly truncated to 16 bits when lowering to that instruction from the
intrinsic. This is unlikely to result in an error in practice
because values that large are not useful.
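
The implicit truncation can be modelled with a small sketch. `truncateToI16` is a hypothetical helper that only mirrors the "keep the low 16 bits" behaviour described above; it is not the lowering code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the implicit truncation: keep only the low 16 bits
// of the i32 exponent operand and reinterpret them as a signed 16-bit value.
inline std::int16_t truncateToI16(std::int32_t Exp) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(Exp));
}
```

Exponents that fit in 16 bits pass through unchanged; only values outside that range are affected, which matches the note that such values are not useful in practice.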

The operand class of src1 in the True16 version of the instruction has
been corrected to encode correctly on GFX11.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D136195

21 months ago[Verifier] Allow undef/poison token argument to llvm.experimental.gc.result
dbakunevich [Wed, 19 Oct 2022 13:40:54 +0000 (20:40 +0700)]
[Verifier] Allow undef/poison token argument to llvm.experimental.gc.result

As part of optimizing unreachable code, we remove tokens,
replacing them with undef/poison in intrinsics.
But the verifier then hits an assertion when it sees a
poison token in unreachable code, which is incorrect.

bug: 57871, https://github.com/llvm/llvm-project/issues/57871
Differential Revision: https://reviews.llvm.org/D134427

21 months ago[flang] Fix missing generated includes in out of tree build
David Spickett [Wed, 19 Oct 2022 12:22:24 +0000 (12:22 +0000)]
[flang] Fix missing generated includes in out of tree build

875fd9df76ded4a88a3a44b690f290ea98f91705 added a new dialect
with some generated files.

When flang is built out of tree (build llvm/clang/mlir first, then
build flang pointing at the first build) those files were not created
at all.

I don't 100% understand why not but judging by the comment at the top
of the file, add_mlir_interface probably expects to run in an MLIR
directory, as add_mlir_dialect does.

So in the same way, I've just inlined enough of that function to
fix the out of tree build.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D136250

21 months ago[gn build] Port 8cadac41e9f6
LLVM GN Syncbot [Wed, 19 Oct 2022 12:45:01 +0000 (12:45 +0000)]
[gn build] Port 8cadac41e9f6

21 months ago[clang][dataflow] Add equivalence relation `Value` type.
Yitzhak Mandelbaum [Fri, 14 Oct 2022 12:10:52 +0000 (12:10 +0000)]
[clang][dataflow] Add equivalence relation `Value` type.

Defines an equivalence relation on the `Value` type to standardize several
places in the code where we replicate the ~same equivalence comparison.

Differential Revision: https://reviews.llvm.org/D135964

21 months ago[VPlan] Add VPValue::isDefinedOutsideVectorRegions helper (NFC).
Florian Hahn [Wed, 19 Oct 2022 12:20:30 +0000 (13:20 +0100)]
[VPlan] Add VPValue::isDefinedOutsideVectorRegions helper (NFC).

@Ayal suggested a better named helper than using `!getDef()` to check if
a value is invariant across all parts.

The property we are using here is that the VPValue is defined outside
any vector loop region. There's a TODO left to handle recipes defined in
pre-header blocks.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133666

21 months ago[clangd] consider ~^foo() to target the destructor, not the type
Sam McCall [Tue, 18 Oct 2022 23:42:31 +0000 (01:42 +0200)]
[clangd] consider ~^foo() to target the destructor, not the type

This behavior was once deliberate, but I've yet to find someone who likes it.
The reference behavior is unchanged: the `foo` within ~foo is still considered
a reference to the type. This means rename etc still works.

fixes https://github.com/clangd/clangd/issues/179

Differential Revision: https://reviews.llvm.org/D136212

21 months agoRevert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x, c1...
Simon Pilgrim [Wed, 19 Oct 2022 11:07:29 +0000 (12:07 +0100)]
Revert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1"

@foad was right - this isn't actually going to help with D136042 as much as hoped, we need a better AMDGPU-specific solution as other targets are likely to make use of it

21 months ago[Attr][Doc] Fix pragma unroll documentation.
Alexey Bader [Tue, 18 Oct 2022 11:42:21 +0000 (04:42 -0700)]
[Attr][Doc] Fix pragma unroll documentation.

There is a contradiction in the #pragma unroll behavior documentation.
It says that specifying `#pragma unroll` without a parameter directs the
loop unroller to attempt to partially unroll the loop if the trip count
is not known at compile time. At the same time later it states that
`#pragma unroll` has identical semantics to `#pragma clang loop unroll(full)`,
which doesn't attempt to unroll partially if the trip count is not known
at compile time.

pragma clang loop unroll(enable):
If unroll(enable) is specified the unroller will attempt to fully unroll
the loop if the trip count is known at compile time. If the fully
unrolled code size is greater than an internal limit the loop will be
partially unrolled up to this limit. If the trip count is not known at
compile time the loop will be partially unrolled with a heuristically
chosen unroll factor.

pragma clang loop unroll(full):
If unroll(full) is specified the unroller will attempt to fully unroll
the loop if the trip count is known at compile time identically to
unroll(enable). However, with unroll(full) the loop will not be unrolled
if the loop count is not known at compile time.
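
As a minimal sketch of the two cases described above (the function names `sumFixed` and `sumDynamic` are illustrative only), the pragmas could be used like this. The pragmas are optimization hints, so the computed results are the same with or without unrolling; compilers other than Clang may warn about the unknown pragma.

```cpp
#include <cassert>

// Trip count (8) is known at compile time, so unroll(full) may fully
// unroll this loop.
inline int sumFixed(const int *Data) {
  int Sum = 0;
#pragma clang loop unroll(full)
  for (int I = 0; I < 8; ++I)
    Sum += Data[I];
  return Sum;
}

// Trip count N is only known at run time. unroll(enable) lets the unroller
// pick a partial unroll factor; unroll(full) would not unroll here.
inline int sumDynamic(const int *Data, int N) {
  int Sum = 0;
#pragma clang loop unroll(enable)
  for (int I = 0; I < N; ++I)
    Sum += Data[I];
  return Sum;
}
```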

Differential Revision: https://reviews.llvm.org/D136160

21 months ago[mlir] Add TransposeOp to Linalg structured ops.
Oleg Shyshkov [Wed, 19 Oct 2022 09:42:25 +0000 (11:42 +0200)]
[mlir] Add TransposeOp to Linalg structured ops.

RFC: https://discourse.llvm.org/t/rfc-primitive-ops-add-mapop-reductionop-transposeop-broadcastop-to-linalg/64184

Differential Revision: https://reviews.llvm.org/D135854

21 months ago[SCEV] Replace assert with returning CouldNotComp in computeMaxBECountForLT.
Florian Hahn [Wed, 19 Oct 2022 10:24:10 +0000 (11:24 +0100)]
[SCEV] Replace assert with returning CouldNotComp in computeMaxBECountForLT.

This patch removes the bail out for signed predicates and non-positive
strides in howManyLessThans and updates computeMaxBECountForLT to return
SCEVCouldNotCompute for signed predicates with negative strides.

AFAICT bail-out was only added because computeMaxBECountForLT may not
handle negative signed strides correctly. Instead of not calling
computeMaxBECountForLT at all because we bail out earlier, we can
instead return SCEVCouldNotCompute in computeMaxBECountForLT.

The max backedge taken count will be computed as the max value of the
symbolic backedge taken count.

This improves precision in cases where we can compute symbolic backedge
taken counts and also fixes a crash.

Fixes #57818.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135667

21 months ago[AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
bipmis [Wed, 19 Oct 2022 10:22:58 +0000 (11:22 +0100)]
[AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.

This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.

Differential Revision: https://reviews.llvm.org/D135137

21 months agoKeep configuration file search directories in ExpansionContext. NFC
Serge Pavlov [Wed, 19 Oct 2022 10:19:04 +0000 (17:19 +0700)]
Keep configuration file search directories in ExpansionContext. NFC

Class ExpansionContext encapsulates options for search and expansion of
response files, including configuration files. With this change the
directories which are searched for configuration files are also stored
in ExpansionContext.

Differential Revision: https://reviews.llvm.org/D135439

21 months ago[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2...
Simon Pilgrim [Wed, 19 Oct 2022 10:18:39 +0000 (11:18 +0100)]
[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1

Helps with some of the AMDGPU regressions identified in D136042 where we were losing signed BFE patterns after sinking shifts behind logic ops.
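
The arithmetic identity behind the fold can be checked with a small scalar model. This is a sketch on 32-bit values, not the DAGCombiner code; `sra`, `sextInReg`, `before` and `after` are hypothetical helpers, and it assumes `c1 <= c2 < 32` and an arithmetic right shift for signed values (guaranteed since C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right of a 32-bit value.
inline std::int32_t sra(std::uint32_t V, unsigned C) {
  return static_cast<std::int32_t>(V) >> C;
}

// Sign-extend the low `Bits` bits of V to 32 bits (1 <= Bits <= 32).
inline std::int32_t sextInReg(std::uint32_t V, unsigned Bits) {
  unsigned Sh = 32 - Bits;
  return static_cast<std::int32_t>(V << Sh) >> Sh;
}

// Left-hand side: (sra (or (shl x, c1), (shl y, c2)), c1)
inline std::int32_t before(std::uint32_t X, std::uint32_t Y, unsigned C1,
                           unsigned C2) {
  return sra((X << C1) | (Y << C2), C1);
}

// Right-hand side: (sext_inreg (or x, (shl y, c2 - c1))) from 32 - c1 bits.
inline std::int32_t after(std::uint32_t X, std::uint32_t Y, unsigned C1,
                          unsigned C2) {
  return sextInReg(X | (Y << (C2 - C1)), 32 - C1);
}
```

Both sides agree because `(x << c1) | (y << c2)` equals `(x | (y << (c2 - c1))) << c1` when `c2 >= c1`, and shifting left then arithmetically right by `c1` is a sign extension from `32 - c1` bits.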

Differential Revision: https://reviews.llvm.org/D136081

21 months ago[AMDGPU] Assume getDefIgnoringCopies will succeed. NFC.
Jay Foad [Wed, 19 Oct 2022 09:32:08 +0000 (10:32 +0100)]
[AMDGPU] Assume getDefIgnoringCopies will succeed. NFC.

getDefIgnoringCopies and getSrcRegIgnoringCopies should not fail on
valid MIR, so don't bother to check for failure.

Differential Revision: https://reviews.llvm.org/D136238

21 months ago[mlir][llvm] Ordered traversal in LLVM IR import.
Tobias Gysi [Wed, 19 Oct 2022 09:48:45 +0000 (12:48 +0300)]
[mlir][llvm] Ordered traversal in LLVM IR import.

The revision performs a topological sort of the blocks to
ensure the operations are processed in dominance order.
After the change, we do not need to introduce dummy
instructions if an operand has not yet been processed.
Additionally, the revision also moves and simplifies the
control-flow related tests to a separate test file.
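
One common way to obtain such an order is a reverse post-order traversal of the control-flow graph, in which every block is visited after the blocks that dominate it. The following is a sketch under that assumption, on a plain adjacency-list graph; the `CFG` alias and function names are hypothetical and the importer's actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Succs[B] lists the successors of block B; block 0 is the entry block.
using CFG = std::vector<std::vector<std::size_t>>;

// Depth-first search that records blocks in post-order.
inline void postOrder(const CFG &Succs, std::size_t Block,
                      std::vector<bool> &Visited,
                      std::vector<std::size_t> &Order) {
  Visited[Block] = true;
  for (std::size_t S : Succs[Block])
    if (!Visited[S])
      postOrder(Succs, S, Visited, Order);
  Order.push_back(Block);
}

// Reverse post-order: dominators appear before the blocks they dominate.
inline std::vector<std::size_t> reversePostOrder(const CFG &Succs) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> Order;
  if (!Succs.empty())
    postOrder(Succs, 0, Visited, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

With blocks processed in this order, a value defined in a dominating block has already been converted by the time a use of it is reached. Values flowing along back edges, such as PHI operands, are not covered by this sketch.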

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D136230

21 months ago[AArch64] Replace sme-i64 by sme-i16i64 and sme-f64 by sme-f64f64
Caroline Concatto [Wed, 19 Oct 2022 09:43:37 +0000 (10:43 +0100)]
[AArch64] Replace sme-i64 by sme-i16i64  and sme-f64 by sme-f64f64

The names in developer.arm for these SME features are:
  HaveSMEI16I64 and HaveSMEF64F64
so the new flag names are consistent with the documentation page

Reviewed By: sdesmalen, c-rhodes

Differential Revision: https://reviews.llvm.org/D135974

21 months ago[AMDGPU] Add test case for a VOPD s_delay_alu insertion bug
Jay Foad [Wed, 19 Oct 2022 09:52:12 +0000 (10:52 +0100)]
[AMDGPU] Add test case for a VOPD s_delay_alu insertion bug

21 months ago[AMDGPU][Backend] Fix use-after-free in AMDGPUReleaseVGPRs::isLastVGPRUseVMEMStore
Juan Manuel MARTINEZ CAAMAÑO [Wed, 19 Oct 2022 07:40:22 +0000 (02:40 -0500)]
[AMDGPU][Backend] Fix use-after-free in AMDGPUReleaseVGPRs::isLastVGPRUseVMEMStore

Reviewed By: jpages, arsenm

Differential Revision: https://reviews.llvm.org/D134641

21 months ago[libc++] Remove std::function in C++03
Nikolas Klauser [Wed, 19 Oct 2022 09:07:34 +0000 (11:07 +0200)]
[libc++] Remove std::function in C++03

We've said that we'll remove `std::function` from C++03 in LLVM 16, so we might as well do it now before we forget.

Reviewed By: ldionne, #libc, Mordante

Spies: jloser, Mordante, libcxx-commits

Differential Revision: https://reviews.llvm.org/D135868

21 months ago[flang] Add fir.declare operation
Jean Perier [Wed, 19 Oct 2022 09:06:27 +0000 (11:06 +0200)]
[flang] Add fir.declare operation

Add fir.declare operation whose purpose was described in https://reviews.llvm.org/D134285.
It uses the FortranVariableInterfaceOp for most of its logic (including the verifier).
The rationale is that all these aspects/logic will be shared by hlfir.designate and
hlfir.associate.

Its codegen and lowering will be added in later patches.

Differential Revision: https://reviews.llvm.org/D136181

21 months ago[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)
Nikita Popov [Wed, 19 Oct 2022 09:03:54 +0000 (11:03 +0200)]
[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)

Follow up on D135962, renaming the method name to match the new
type name.

21 months ago[AA] Rename uses of FunctionModRefBehavior (NFC)
Nikita Popov [Wed, 19 Oct 2022 08:42:09 +0000 (10:42 +0200)]
[AA] Rename uses of FunctionModRefBehavior (NFC)

Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variables names FMRB/MRB to ME, to match the
new type name.

21 months ago[AA] Rename FunctionModRefBehavior to MemoryEffects (NFC)
Nikita Popov [Fri, 14 Oct 2022 14:57:07 +0000 (16:57 +0200)]
[AA] Rename FunctionModRefBehavior to MemoryEffects (NFC)

As part of https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579,
the FunctionModRefBehavior class sees a good bit of additional use,
and I've found the name to be something of a mouthful. This patch
renames it to MemoryEffects, which has a couple of advantages over
the old name:
 * It is more concise.
 * It decouples it from modelling only functions.
 * It matches the terminology of the aforementioned RFC.
 * The meaning should be more obvious to people not familiar with
   our particular AA lingo.

This patch just updates the class definition. Other uses of the
name will be updated separately.

Differential Revision: https://reviews.llvm.org/D135962

21 months ago[RISCV] Enable the LocalStackSlotAllocation pass support
luxufan [Wed, 19 Oct 2022 06:34:05 +0000 (14:34 +0800)]
[RISCV] Enable the LocalStackSlotAllocation pass support

For RISC-V, load/store instructions (excluding vector loads/stores) only
have a 12-bit immediate operand. If the offset is out of range, a
temporary register must be used to materialize the offset. If several such
offsets differ from each other by a small (IsInt<12>) relative offset, the
LocalStackSlotAllocation pass can materialize one value in a frame base
register and replace each original offset with this register's value plus
the relative offset.
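
The range check at the heart of this can be sketched as follows. This is a simplified model, not the pass itself: `isInt12` and `fitsWithSharedBase` are hypothetical helpers, and the base is simply taken to be the first offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True if V fits in a signed 12-bit immediate, i.e. [-2048, 2047].
inline bool isInt12(std::int64_t V) { return V >= -2048 && V <= 2047; }

// Hypothetical helper: true if every offset can be addressed as
// "base register + 12-bit immediate", using the first offset as the base.
inline bool fitsWithSharedBase(const std::vector<std::int64_t> &Offsets) {
  if (Offsets.empty())
    return true;
  const std::int64_t Base = Offsets.front();
  for (std::int64_t Offset : Offsets)
    if (!isInt12(Offset - Base))
      return false;
  return true;
}
```

For example, the offsets 4096, 4100 and 4200 are each out of range on their own, but relative to a base of 4096 they become 0, 4 and 104, so one materialized base register serves all three accesses.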

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98101

21 months ago[flang][NFC] Fix printed name from proc_nopass_p2
Valentin Clement [Wed, 19 Oct 2022 07:51:33 +0000 (09:51 +0200)]
[flang][NFC] Fix printed name from proc_nopass_p2

21 months ago[lldb][trace] Fix some minor bugs in the call tree
Walter Erquinigo [Wed, 19 Oct 2022 07:18:01 +0000 (00:18 -0700)]
[lldb][trace] Fix some minor bugs in the call tree

- We weren't truncating the output files
- We weren't considering the case in which we couldn't disassemble an
instruction.

21 months ago[flang] Add fir.dispatch code generation
Valentin Clement [Wed, 19 Oct 2022 07:41:23 +0000 (09:41 +0200)]
[flang] Add fir.dispatch code generation

fir.dispatch code generation uses the binding table stored in the
type descriptor. There is no runtime call involved. The binding table
is always built from the parent type so the index of a specific binding
is the same in the parent derived-type or in the extended type.

Follow-up patches will deal with cases not present here such as allocatable
polymorphic entities or pointers.

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D136189

21 months ago[include-cleaner] Fix link errors when -DBUILD_SHARED_LIBS=ON
Kai Luo [Wed, 19 Oct 2022 07:25:44 +0000 (07:25 +0000)]
[include-cleaner] Fix link errors when -DBUILD_SHARED_LIBS=ON

Fixed ppc buildbot https://lab.llvm.org/buildbot/#/builders/121/builds/24273 which is using `-DBUILD_SHARED_LIBS=ON`.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D136229

21 months ago[flang] Introduce FortranVariableOpInterface for ops creating variable
Jean Perier [Wed, 19 Oct 2022 06:55:02 +0000 (08:55 +0200)]
[flang] Introduce FortranVariableOpInterface for ops creating variable

HLFIR will rely on certain operations to create SSA memory values
that correspond to a Fortran variable. They will hold bounds and type
parameters information as well as metadata (like Fortran attributes).

This patch adds an interface for such operations so that Fortran
variables can be stored, manipulated, and queried regardless of what
created them. This is so far intended for fir.declare, hlfir.designate
and hlfir.associate operations.
It is added to FIR and not HLFIR because fir.declare needs it and it
does not itself need any HLFIR concepts.

Unit tests for the interface methods will be added alongside
fir.declare in the next patch.

Differential Revision: https://reviews.llvm.org/D136151

21 months ago[mlir][spirv] Consider target when converting one-element vector
Lei Zhang [Wed, 19 Oct 2022 05:49:08 +0000 (05:49 +0000)]
[mlir][spirv] Consider target when converting one-element vector

Vectors with just one element will be converted into scalars.
However, we cannot just return the element type and assume it
is supported in the target environment; we need to convert the
element type again factoring in those considerations.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D136226

21 months ago[X86] Add WRMSRNS instructions.
Freddy Ye [Wed, 19 Oct 2022 03:21:46 +0000 (11:21 +0800)]
[X86] Add WRMSRNS instructions.

For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135935

21 months ago[RISCV] Add an early out to lowerVECTOR_SHUFFLEAsVSlidedown. NFC
Craig Topper [Wed, 19 Oct 2022 04:11:42 +0000 (21:11 -0700)]
[RISCV] Add an early out to lowerVECTOR_SHUFFLEAsVSlidedown. NFC

If Mask[0] is 0, then we're never going to match a slidedown. If
we get through the for loop, then it's an identity mask which should
have already been optimized out. Otherwise it's some non-contiguous
mask that will fail out of the loop. Might as well not bother entering
the loop.

21 months ago[BOLT][NFC] Refactor EFMM initialization
Maksim Panchenko [Thu, 22 Sep 2022 20:08:05 +0000 (13:08 -0700)]
[BOLT][NFC] Refactor EFMM initialization

Move EFMM initialization code to emitAndLink(), where EFMM is used.

Reviewed By: yavtuk

Differential Revision: https://reviews.llvm.org/D136205

21 months ago[X86] Add MSRLIST instructions.
Freddy Ye [Wed, 19 Oct 2022 01:49:35 +0000 (09:49 +0800)]
[X86] Add MSRLIST instructions.

For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: skan, RKSimon

Differential Revision: https://reviews.llvm.org/D135934

21 months ago[MC][COFF] Add COFF section flag "Info"
chenglin.bi [Wed, 19 Oct 2022 02:32:32 +0000 (10:32 +0800)]
[MC][COFF] Add COFF section flag "Info"

Currently, we do not parse the section flag `Info` in asm files. When we emit a section with the Info flag to asm and then compile the asm to obj, we lose the Info flag for the section.
The motivation of this change is ARM64EC's hybmp$x section. If we lose the Info flag MSVC link will report a warning:
`warning LNK4078: multiple '.hybmp' sections found with different attributes`

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D136125

21 months agoReland "[LoongArch] Fix codegen of atomicrmw nand"
Weining Lu [Tue, 18 Oct 2022 23:58:35 +0000 (07:58 +0800)]
Reland "[LoongArch] Fix codegen of atomicrmw nand"

Fix invalid RISCV-like MI being emitted for performing the `not`
operation: the LoongArch `xori` zero-extends the immediate, hence is
not equivalent to RISCV `xori`. The LoongArch `not` is a `nor` with
zero.
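
The difference can be modelled on 64-bit values with a short sketch. The function names are hypothetical and only mirror how each instruction treats its operands as described above.

```cpp
#include <cassert>
#include <cstdint>

// RISC-V `xori`: the 12-bit immediate is sign-extended, so an immediate
// of -1 flips every bit of the source register.
inline std::uint64_t riscvXori(std::uint64_t Rs, std::int16_t Imm12) {
  return Rs ^ static_cast<std::uint64_t>(static_cast<std::int64_t>(Imm12));
}

// LoongArch `xori`: the 12-bit immediate is zero-extended, so at most the
// low 12 bits of the source register can be flipped.
inline std::uint64_t loongArchXori(std::uint64_t Rj, std::uint16_t Ui12) {
  return Rj ^ (Ui12 & 0xFFFu);
}

// LoongArch `nor`: a bitwise NOT is `nor` with a zero operand.
inline std::uint64_t loongArchNor(std::uint64_t Rj, std::uint64_t Rk) {
  return ~(Rj | Rk);
}
```

Here `loongArchXori(x, 0xFFF)` leaves the upper 52 bits untouched, whereas `loongArchNor(x, 0)` produces the full complement.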

Patch by lrzlin (Lin Runze).

Differential Revision: https://reviews.llvm.org/D136021

21 months ago[PowerPC] handle more than two predecessors loop header in ctrloop pass
Chen Zheng [Thu, 13 Oct 2022 01:49:02 +0000 (01:49 +0000)]
[PowerPC] handle more than two predecessors loop header in ctrloop pass

After ISEL, the "valid" loop header which has two predecessors
(one is preheader and the other one is latch) may be transformed
to have more than two predecessors by some optimizations, like tail
duplicator, if the old header's successor(will be changed to new
header) is a sub loop.

The predecessors of the new loop header are preheader, loop latch
and the loop latch(es) of the sub loop(old header's successor).

Before the patch, ctrloop pass assumes two predecessors for candidate
loop header. This patch fixes this case.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D135846

21 months ago[Clang] constraints partial ordering should work with deduction guide
Yuanfang Chen [Wed, 19 Oct 2022 00:19:58 +0000 (17:19 -0700)]
[Clang] constraints partial ordering should work with deduction guide

D128750 incorrectly skips constraints partial ordering for deduction guide.
This patch reverts that part.

Fixes https://github.com/llvm/llvm-project/issues/58456.

21 months ago[lld][WebAssembly] Don't allow `--global-base` to be specified in -share/-pie or...
Sam Clegg [Mon, 17 Oct 2022 23:26:54 +0000 (16:26 -0700)]
[lld][WebAssembly] Don't allow `--global-base` to be specified in -share/-pie or --relocatable modes

Add some checks around this combination of flags

Also, honor `--global-base` when specified in `--stack-first` mode
rather than ignoring it.  But error out if the specified base precedes
the end of the stack.

Differential Revision: https://reviews.llvm.org/D136117

21 months agoRevert "[LoongArch] Fix codegen of atomicrmw nand"
Weining Lu [Tue, 18 Oct 2022 12:59:59 +0000 (20:59 +0800)]
Revert "[LoongArch] Fix codegen of atomicrmw nand"

This reverts commit 9572406bbcb497f8c23c28daa762b55ee3219f41.

The author name is wrong.

21 months ago[clang][deps] Remove unintentional `move`
Jan Svoboda [Tue, 18 Oct 2022 03:04:33 +0000 (20:04 -0700)]
[clang][deps] Remove unintentional `move`

This is a fix related to D135414. The original intention was to keep `BaseFS` as a member of the worker and conditionally overlay it with local in-memory FS. The `move` of ref-counted `BaseFS` was not intended, and it's a bug.

Disabling parallelism in the "by-module-name" test reliably reproduces this, and the test itself doesn't *need* parallelism. (I think `-j 4` was cargo culted from another test.) Reusing that test to check for correct behavior...

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D136124

21 months ago[JITLink] Add convenience methods for creating block readers / writers.
Lang Hames [Tue, 18 Oct 2022 17:53:25 +0000 (10:53 -0700)]
[JITLink] Add convenience methods for creating block readers / writers.

This saves clients some boilerplate compared to setting up the readers and
writers manually.

To obtain a BinaryStreamWriter / BinaryStreamReader for a given block, B,
clients can now write:

auto Reader = G.getBlockContentReader(B);

and

auto Writer = G.getBlockContentWriter(B);

The latter will trigger a copy to mutable memory allocated on the graph's
allocator if the block is currently marked as backed by read-only memory.

This commit also introduces a new createMutableContentBlock overload that
creates a block with a given size and zero-filled content (by default --
passing false for ZeroInitialize bypasses initialization entirely).
This overload is intended to be used with getBlockContentWriter above when
creating new content for the graph.

21 months ago[sanitizer] Let internal symbolizer use toupper and tolower
Florian Mayer [Tue, 18 Oct 2022 23:19:11 +0000 (16:19 -0700)]
[sanitizer] Let internal symbolizer use toupper and tolower

21 months ago[HLSL] Add SV_DispatchThreadID
Xiang Li [Tue, 18 Oct 2022 20:09:01 +0000 (13:09 -0700)]
[HLSL] Add SV_DispatchThreadID

Support SV_DispatchThreadID attribute.
Translate it into dx.thread.id in clang codeGen.

Reviewed By: beanz, aaron.ballman

Differential Revision: https://reviews.llvm.org/D133983

21 months ago[mlir][sparse] Removing the DimLvlType and DimLevelFormat types
wren romano [Tue, 18 Oct 2022 02:11:20 +0000 (19:11 -0700)]
[mlir][sparse] Removing the DimLvlType and DimLevelFormat types

This removes another massive source of redundancy, and instead has the Merger.{h,cpp} reuse the SparseTensorEnums library.

Depends On D136005

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D136123

21 months ago[mlir][MemRef] Move the forwarding patterns for `extract_strided_metadata`
Quentin Colombet [Wed, 12 Oct 2022 00:53:52 +0000 (00:53 +0000)]
[mlir][MemRef] Move the forwarding patterns for `extract_strided_metadata`

The `SimplifyExtractStridedMetadata` pass features a pattern that forwards
statically known information (offset, sizes, strides) to their respective
users.

This patch moves this pattern from this pass to the
`extract_strided_metadata` folding patterns.

Differential Revision: https://reviews.llvm.org/D135797

21 months ago[include-cleaner] Add line numbers to HTML output
Sam McCall [Tue, 18 Oct 2022 17:12:47 +0000 (19:12 +0200)]
[include-cleaner] Add line numbers to HTML output

21 months ago[mlir][sparse] Moving Enums.h into Dialect/SparseTensor/IR
wren romano [Tue, 18 Oct 2022 01:33:40 +0000 (18:33 -0700)]
[mlir][sparse] Moving Enums.h into Dialect/SparseTensor/IR

Move the SparseTensorEnums library out of the ExecutionEngine directory and into Dialect/SparseTensor/IR.

Depends On D136002

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D136005

21 months ago[libc] Add implementation of sigaltstack for linux.
Siva Chandra Reddy [Thu, 13 Oct 2022 22:18:52 +0000 (22:18 +0000)]
[libc] Add implementation of sigaltstack for linux.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D135949

21 months ago[CMake] Fix LIBUNWIND_ENABLE_CET build after D110005
jinge90 [Tue, 18 Oct 2022 22:00:09 +0000 (15:00 -0700)]
[CMake] Fix LIBUNWIND_ENABLE_CET build after D110005

D110005 renamed LIBUNWIND_SUPPORTS_* to CXX_SUPPORTS_*.

Reviewed By: MaskRay, #libunwind, mstorsjo

Differential Revision: https://reviews.llvm.org/D136131

21 months ago[clang-format] Do not parse certain characters in pragma directives
Joseph Huber [Fri, 14 Oct 2022 20:49:26 +0000 (15:49 -0500)]
[clang-format] Do not parse certain characters in pragma directives

Currently, we parse lines inside of a compiler `#pragma` the same way we
parse any other line. This is fine for some cases, like separating
expressions and adding proper spacing, but in others it causes some poor
results from miscategorizing some tokens.

For example, the OpenMP offloading uses certain clauses that contain
special characters like `map(tofrom : A[0:N])`. This will be formatted
poorly as it will be split between lines on the first colon.
Additionally, the subscript notation will lead to poor spacing. This can
be seen in the OpenMP tests, as the automatic clang formatting will
inevitably ruin the formatting.

For example, the following contrived example will be formatted poorly.
```
#pragma omp target teams distribute collapse(2) map(to: A[0 : M * K])  \
    map(to: B[0:K * N]) map(tofrom:C[0:M*N]) firstprivate(Alpha) \
    firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y) \
    firstprivate(E) firstprivate(Z) firstprivate(F)
```
This results in this when formatted, which is far from ideal.
```
#pragma omp target teams distribute collapse(2) map(to                         \
                                                    : A [0:M * K])             \
    map(to                                                                     \
        : B [0:K * N]) map(tofrom                                              \
                           : C [0:M * N]) firstprivate(Alpha)                  \
        firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y)     \
            firstprivate(E) firstprivate(Z) firstprivate(F)
```

This patch seeks to improve this by adding extra logic where the parsing goes
awry. This is primarily caused by the colon being parsed as an inline-asm
directive and the brackets as Objective-C expressions. Also, the line gets
indented every single time the line is dropped.

This doesn't implement true parsing handling for OpenMP statements.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D136100

21 months ago[OpenMP] Make kernels have protected visibility
Joseph Huber [Tue, 18 Oct 2022 20:03:28 +0000 (15:03 -0500)]
[OpenMP] Make kernels have protected visibility

This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D136198

21 months ago[lld-macho] Folded symbols should have size zero in linker map
Jez Ng [Tue, 18 Oct 2022 21:21:43 +0000 (17:21 -0400)]
[lld-macho] Folded symbols should have size zero in linker map

This matches ld64's behavior.

I also extended the icf-stabs.s test to demonstrate that even though
folded symbols have size zero, we cannot use the size-zero property in
lieu of `wasIdenticalCodeFolded`, because size zero symbols should still
get STABS entries.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D136001

21 months ago[lld-macho] Don't fold subsections with symbols at nonzero offsets
Jez Ng [Tue, 18 Oct 2022 21:21:39 +0000 (17:21 -0400)]
[lld-macho] Don't fold subsections with symbols at nonzero offsets

Symbols occur at non-zero offsets in a subsection if they are
`.alt_entry` symbols, or if `.subsections_via_symbols` is omitted.

It doesn't seem like ld64 supports folding those subsections either.
Moreover, supporting this makes `foldIdentical` a lot more
complicated to implement. The existing implementation has some
questionable behavior around STABS omission -- if a section with a
non-zero offset symbol was folded into one without, we would omit the
STABS entry for the non-zero offset symbol.

I will be following up with a diff that makes `foldIdentical` zero out
the symbol sizes for folded symbols. Again, this is much easier to
implement if we don't have to worry about non-zero offsets.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D136000

21 months ago[mlir][sparse] Factoring out SparseTensorEnums library
wren romano [Tue, 18 Oct 2022 01:13:05 +0000 (18:13 -0700)]
[mlir][sparse] Factoring out SparseTensorEnums library

This differential splits the SparseTensorEnums library out from the SparseTensorRuntime library. The actual moving of files will be handled in the next differential.

Depends On D135996

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D136002

21 months ago[mlir][sparse] refine insertion code
Aart Bik [Tue, 18 Oct 2022 17:35:00 +0000 (10:35 -0700)]
[mlir][sparse] refine insertion code

builds SSA cycle for compress insertion loop
adds casting on index mismatch during push_back

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D136186

21 months ago[opt] Don't initialize legacy instrumentation passes
Arthur Eubanks [Sun, 2 Oct 2022 21:07:51 +0000 (14:07 -0700)]
[opt] Don't initialize legacy instrumentation passes

So that we require `opt -passes=` syntax for instrumentation passes.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135042

21 months ago[libc][Obvious] Skip some termios tests when there is no /dev/tty.
Siva Chandra Reddy [Tue, 18 Oct 2022 20:58:08 +0000 (20:58 +0000)]
[libc][Obvious] Skip some termios tests when there is no /dev/tty.

21 months ago[lldb][trace] Add a basic function call dump [3] - Add a JSON dumper
Walter Erquinigo [Sun, 16 Oct 2022 01:52:22 +0000 (18:52 -0700)]
[lldb][trace] Add a basic function call dump [3] - Add a JSON dumper

The JSON dumper is very minimalistic. It pretty much only shows the
delimiting instruction IDs of every segment, so that further queries to
the SBCursor can be used to make sense of the data. Its main purpose is
to be serialized somewhat cheaply.

I also renamed untracedSegment to untracedPrefixSegment, in case in the
future we add an untracedSuffixSegment. In any case, this new name is
more explicit, which I like.

Differential Revision: https://reviews.llvm.org/D136034

21 months ago[lldb][trace] Add a basic function call dump [2] - Implement the reconstruction algorithm
Walter Erquinigo [Mon, 10 Oct 2022 19:57:13 +0000 (12:57 -0700)]
[lldb][trace] Add a basic function call dump [2] - Implement the reconstruction algorithm

This diff implements the reconstruction algorithm for the call tree and
adds tests.

See TraceDumper.h for documentation and explanations.

One important detail is that the tree objects are in TraceDumper, even
though Trace.h is a better home. I'm leaving that as future work.

Another detail is that this code is as slow as dumping the entire
symbolicated trace, which is not that bad tbh. The reason is that we use
symbols throughout the algorithm and we are not being careful about
memory and speed. This is also another area for future improvement.

Lastly, I made sure that incomplete traces work, i.e. you start tracing
very deep in the stack or failures randomly appear in the trace.

Differential Revision: https://reviews.llvm.org/D135917

21 months ago[lldb][trace] Add a basic function call dump [1] - Add the command scaffolding
Walter Erquinigo [Sat, 8 Oct 2022 21:06:44 +0000 (14:06 -0700)]
[lldb][trace] Add a basic function call dump [1] - Add the command scaffolding

The command is `thread trace dump function-calls` and, at a minimum, will
require printing to a file in JSON and non-JSON formats.

I added a test.

Differential Revision: https://reviews.llvm.org/D135521

21 months ago[libc] Add termios.h and the implementation of functions declared in it.
Siva Chandra Reddy [Mon, 17 Oct 2022 16:27:45 +0000 (16:27 +0000)]
[libc] Add termios.h and the implementation of functions declared in it.

Reviewed By: lntue, michaelrj

Differential Revision: https://reviews.llvm.org/D136143

21 months ago[BOLT] Fix instruction encoding validation
Maksim Panchenko [Mon, 17 Oct 2022 23:15:59 +0000 (16:15 -0700)]
[BOLT] Fix instruction encoding validation

Always use the non-symbolizing disassembler for instruction encoding
validation, as symbols will be treated as undefined/zeros by the encoder,
causing byte sequence mismatches.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D136118

21 months agoFix incorrect check for running out of source locations.
Paul Pluzhnikov [Tue, 18 Oct 2022 20:47:55 +0000 (20:47 +0000)]
Fix incorrect check for running out of source locations.

When CurrentLoadedOffset is less than TotalSize, current code will
trigger unsigned overflow and will not return an "allocation failed"
indicator.
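
The failure mode described above can be illustrated with a minimal sketch. It is not the actual Clang code: the `OffsetSpace` struct, both function names, and the `NextLocalOffset` lower bound are assumptions made for illustration; only `CurrentLoadedOffset` and `TotalSize` come from the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the allocator state. NextLocalOffset is an
// assumed lower bound that the loaded region must not cross.
struct OffsetSpace {
  std::uint32_t NextLocalOffset;     // space below this is already in use
  std::uint32_t CurrentLoadedOffset; // loaded entries are carved off downward
};

// Unguarded shape of the check: if CurrentLoadedOffset < TotalSize, the
// unsigned subtraction wraps to a huge value, so the failure test is skipped.
inline bool allocateUnchecked(OffsetSpace &S, std::uint32_t TotalSize) {
  std::uint32_t Remaining = S.CurrentLoadedOffset - TotalSize; // may wrap
  if (Remaining < S.NextLocalOffset)
    return false; // "allocation failed"
  S.CurrentLoadedOffset = Remaining;
  return true;
}

// Guarded shape: rule out the wrap before subtracting.
inline bool allocateChecked(OffsetSpace &S, std::uint32_t TotalSize) {
  if (S.CurrentLoadedOffset < TotalSize)
    return false; // the subtraction below would wrap
  std::uint32_t Remaining = S.CurrentLoadedOffset - TotalSize;
  assert(Remaining <= S.CurrentLoadedOffset && "no wrap after the guard");
  if (Remaining < S.NextLocalOffset)
    return false; // would collide with the local region
  S.CurrentLoadedOffset = Remaining;
  return true;
}
```

With `NextLocalOffset = 100` and `CurrentLoadedOffset = 1000`, a request for 2000 units is wrongly accepted by the unguarded version and rejected by the guarded one.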

Google ref: b/248613299

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D135192

21 months ago[mlir][sparse] Use the runtime DimLevelType instead of a separate tablegen enum
wren romano [Fri, 14 Oct 2022 23:40:28 +0000 (16:40 -0700)]
[mlir][sparse] Use the runtime DimLevelType instead of a separate tablegen enum

This differential replaces all uses of SparseTensorEncodingAttr::DimLevelType with DimLevelType. The next differential will break out a separate library for the DimLevelType enum, so that the Dialect code doesn't need to depend on the rest of the runtime.

Depends On D135995

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135996

21 months ago[clang] Move variable declaration closer to use
Nico Weber [Tue, 18 Oct 2022 20:39:55 +0000 (16:39 -0400)]
[clang] Move variable declaration closer to use

...and add some whitespace to delimit the three logical steps done in this
function.

No behavior change.

21 months ago[SystemZ][z/OS][libcxx]: fix the mask in stage2_float_loop function
Nancy Wang [Tue, 18 Oct 2022 19:53:03 +0000 (15:53 -0400)]
[SystemZ][z/OS][libcxx]: fix the mask in stage2_float_loop function

This patch fixes an issue in the __stage2_float_loop function: floating-point value comparison does not work in EBCDIC mode because the mask is hard-coded and assumes the character is ASCII. The fix is to use the toupper function when doing the comparison.
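
As a rough illustration of why a hard-coded mask is encoding-dependent, here is a minimal sketch. The mask value `0x5F` and all function names are assumptions for illustration, not taken from the patch; the EBCDIC code points used (`0x85` for 'e', `0xC5` for 'E') are the standard ones.

```cpp
#include <cctype>

// Hypothetical ASCII-only case fold: clearing bit 0x20 maps 'a'..'z' onto
// 'A'..'Z', but only under the ASCII layout.
inline bool matchesMasked(char C, char UpperExpected) {
  return (C & 0x5F) == UpperExpected;
}

// Encoding-independent variant: let the C library do the case conversion.
inline bool matchesToupper(char C, char UpperExpected) {
  return std::toupper(static_cast<unsigned char>(C)) ==
         static_cast<unsigned char>(UpperExpected);
}

// The same mask applied to raw EBCDIC code points: 'e' is 0x85 and 'E' is
// 0xC5, and 0x85 & 0x5F is 0x05, so the masked comparison cannot match.
inline bool maskFoldsEbcdicE() { return (0x85 & 0x5F) == 0xC5; }
```

On an ASCII host both comparison helpers agree for 'e' against 'E', while `maskFoldsEbcdicE()` shows the mask failing on the EBCDIC values.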

Differential Revision: https://reviews.llvm.org/D118930

21 months ago[mlir][MemRef] Fix the simplification of extract_strided_metadata(subview)
Quentin Colombet [Mon, 17 Oct 2022 19:40:19 +0000 (19:40 +0000)]
[mlir][MemRef] Fix the simplification of extract_strided_metadata(subview)

Prior to this patch we were wrongly applying the sub-strides to the
computation of the final offset of the subview.

Put differently, we were computing the offset as:
```
offset = baseOffset + sum(subOffset#i * baseStrides#i * subStrides#i)
```
Whereas we should be doing:
```
offset = baseOffset + sum(subOffset#i * baseStrides#i)
```
I.e., drop the subStrides#i term from the sum.
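
The formula under "Whereas we should be doing" can be sketched as a small standalone function. This is only an illustration with plain integers, not the MLIR implementation; the function name `subviewOffset` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// offset = baseOffset + sum(subOffset#i * baseStrides#i)
// Only the strides of the base buffer scale the subview offsets; no other
// per-dimension factor enters the sum.
inline std::int64_t subviewOffset(std::int64_t BaseOffset,
                                  const std::vector<std::int64_t> &SubOffsets,
                                  const std::vector<std::int64_t> &BaseStrides) {
  assert(SubOffsets.size() == BaseStrides.size() && "rank mismatch");
  std::int64_t Offset = BaseOffset;
  for (std::size_t I = 0; I < SubOffsets.size(); ++I)
    Offset += SubOffsets[I] * BaseStrides[I];
  return Offset;
}
```

For a row-major 4x8 base buffer (strides 8 and 1, base offset 0), a subview starting at offsets (1, 2) begins at element 1 * 8 + 2 * 1 = 10. Multiplying any additional per-dimension term into the sum would move that starting point.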

Differential Revision: https://reviews.llvm.org/D136107

21 months ago[gn build] Port 594fa1474f0c
LLVM GN Syncbot [Tue, 18 Oct 2022 19:15:31 +0000 (19:15 +0000)]
[gn build] Port 594fa1474f0c

21 months ago[mlir][sparse] rename the values of the runtime DimLevelType
wren romano [Fri, 14 Oct 2022 23:36:14 +0000 (16:36 -0700)]
[mlir][sparse] rename the values of the runtime DimLevelType

This change is to make way for reusing the DimLevelType enum in lieu of the SparseTensorEncodingAttr::DimLevelType enum, but broken out to make it quick and easy to review.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135995

21 months ago[C++20] Implement P2113R0: Changes to the Partial Ordering of Constrained Functions
Yuanfang Chen [Tue, 18 Oct 2022 18:51:02 +0000 (11:51 -0700)]
[C++20] Implement P2113R0: Changes to the Partial Ordering of Constrained Functions

This implementation matches GCC behavior in that [[ https://eel.is/c++draft/temp.func.order#6.2.1 | temp.func.order p6.2.1 ]] is not implemented [1]. I reached out to the GCC author to confirm that some changes elsewhere to overload resolution are probably needed, but no solution has been developed sufficiently [3].

Most of the wording is implemented straightforwardly. However,
for [[ https://eel.is/c++draft/temp.func.order#6.2.2 | temp.func.order p6.2.2 ]] "... or if the function parameters that positionally correspond between the two templates are not of the same type", the "same type" is not very clear ([2] is a bug related to this). Here is a quick example
```
template <C T, C U>        int f(T, U);
template <typename T, C U> int f(U, T);

int x = f(0, 0);
```
Are the `U` and `T` from different `f`s the "same type"? The answer is NO, even though both `U` and `T` are deduced to be `int` in this case. The reason is that `U` and `T` are dependent types; according to [[ https://eel.is/c++draft/temp.over.link#3 |  temp.over.link p3 ]], they cannot be the "same type".

To check if two function parameters are the "same type":
* For //function template//: compare the function parameter canonical types and return type between two function templates.
* For //class template/partial specialization//: by [[ https://eel.is/c++draft/temp.spec.partial.order#1.2 | temp.spec.partial.order p1.2 ]], comparing the injected template arguments between two templates using hashing (TemplateArgument::Profile) is enough.

[1] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=57b4daf8dc4ed7b669cc70638866ddb00f5b7746
[2] https://github.com/llvm/llvm-project/issues/49308
[3] https://lists.isocpp.org/core/2020/06/index.php#msg9392

Fixes https://github.com/llvm/llvm-project/issues/54039
Fixes https://github.com/llvm/llvm-project/issues/49308 (PR49964)

Reviewed By: royjacobson, #clang-language-wg, mizvekov

Differential Revision: https://reviews.llvm.org/D128750

21 months ago[libc++][chrono] Fixes build.
Mark de Wever [Tue, 18 Oct 2022 18:57:54 +0000 (20:57 +0200)]
[libc++][chrono] Fixes build.

Changes in D134742 were not properly propagated to D136037 before
landing.

21 months ago[SLP]Generalize cost model.
Alexey Bataev [Thu, 18 Nov 2021 23:59:30 +0000 (15:59 -0800)]
[SLP]Generalize cost model.

Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
cost model for extractelement instructions.

cpu2017
   511.povray_r             0.57
   520.omnetpp_r           -0.98
   521.wrf_r               -0.01
   525.x264_r               3.59 <+
   526.blender_r           -0.12
   531.deepsjeng_r         -0.07
   538.imagick_r           -1.42
Geometric mean:  0.21

Differential Revision: https://reviews.llvm.org/D115757

21 months ago[Clang] update cxx_dr_status.html by running make_cxx_dr_status
Yuanfang Chen [Tue, 18 Oct 2022 18:24:38 +0000 (11:24 -0700)]
[Clang] update cxx_dr_status.html by running make_cxx_dr_status

For https://github.com/llvm/llvm-project/issues/58382

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D136133

21 months ago[AArch64][Windows] Add MC support for save_any_reg.
Eli Friedman [Tue, 18 Oct 2022 18:44:01 +0000 (11:44 -0700)]
[AArch64][Windows] Add MC support for save_any_reg.

Representing this as 12 separate operations is a bit ugly, but
trying to represent the different modes using a bitfield seemed worse.

Differential Revision: https://reviews.llvm.org/D135417

21 months ago[libc++][chrono] Implements formatter weekday.
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter weekday.

Partially implements:
- P1361 Integration of chrono with text formatting
- P2372 Fixing locale handling in chrono formatters

Depends on D134742

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D136037

21 months ago[libc++][chrono] Implements formatter duration.
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter duration.

Partially implements:
- P1361 Integration of chrono with text formatting
- P2372 Fixing locale handling in chrono formatters
- LWG3270 Parsing and formatting %j with durations

Completes:
- P1650R0 std::chrono::days with 'd' suffix
- LWG3262 Formatting of negative durations is not specified
- LWG3314 Is stream insertion behavior locale dependent when Period::type is micro?

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D134742

21 months ago[libc++][ranges] implement `std::ranges::drop_while_view`
Hui Xie [Fri, 7 Oct 2022 17:07:54 +0000 (18:07 +0100)]
[libc++][ranges] implement `std::ranges::drop_while_view`

Differential Revision: https://reviews.llvm.org/D135460

21 months agoRevert "[SLP]Generalize cost model."
Alexey Bataev [Tue, 18 Oct 2022 18:23:43 +0000 (11:23 -0700)]
Revert "[SLP]Generalize cost model."

This reverts commit f12fb91188b836e1bddb36bacbbdb8e4ab70b9b6 and
f5c747bfbe36b8f53e6fe2d85ffcaecba6d7153c to fix detected non-initialized
var use.

21 months agoRevert "Recommit "[LoopFlatten] Enable it by default""
Sjoerd Meijer [Tue, 18 Oct 2022 17:54:04 +0000 (23:24 +0530)]
Revert "Recommit "[LoopFlatten] Enable it by default""

This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f.

A miscompilation was reported:
https://github.com/llvm/llvm-project/issues/58441

Reverting this while I look at that.

21 months agoRevert "[lldb-tests] Remove dubious standard library flag"
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 17:59:29 +0000 (13:59 -0400)]
Revert "[lldb-tests] Remove dubious standard library flag"

This reverts commit f477412685fe6bac49d3d080ba91896c28e62116.

21 months ago[flang] Add getTypeDescriptorBindingTableName function
Valentin Clement [Tue, 18 Oct 2022 17:52:20 +0000 (19:52 +0200)]
[flang] Add getTypeDescriptorBindingTableName function

A type descriptor and its binding table are defined as fir.global in FIR.
Their names are derived from the derived-type name. This patch adds a new
function `getTypeDescriptorBindingTableName` in the NameUniquer and
refactors the `GetTypeDescriptorName` function to reuse the same code.
This will be used in the fir.dispatch code generation.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D136167

21 months ago[lldb-tests] Remove dubious standard library flag
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 13:22:20 +0000 (09:22 -0400)]
[lldb-tests] Remove dubious standard library flag

The test currently sets `USE_LIBSTDCPP = 0`, which is curious given the
behavior of `and` and `or` in Makefiles (the contents of the variables
are not important). In particular, this causes the tests to not use the
standard libraries appropriately.

To capture the actual intent of the test, we're changing this to
`USE_LIBCXX=1`.

Differential Revision: https://reviews.llvm.org/D136171