platform/upstream/llvm.git
19 months ago[RISCV][TTI] Account for constant materialization cost when costing arithmetic operations
Philip Reames [Wed, 30 Nov 2022 15:12:53 +0000 (07:12 -0800)]
[RISCV][TTI] Account for constant materialization cost when costing arithmetic operations

At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fall back to constant pool loads.

We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant.

We need better modeling of which constants are expensive and which are not, but for the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation, which stores cannot.

Differential Revision: https://reviews.llvm.org/D138941

19 months ago[Sanitizers] Fix test that never ran anywhere
Paul Robinson [Wed, 30 Nov 2022 15:19:39 +0000 (07:19 -0800)]
[Sanitizers] Fix test that never ran anywhere

Incorrect REQUIRES clause. Also fixed the incorrect 'opt' line
and removed a redundant -mtriple option.

19 months ago[include-cleaner] don't clang-format tests. NFC
Sam McCall [Wed, 30 Nov 2022 15:12:23 +0000 (16:12 +0100)]
[include-cleaner] don't clang-format tests. NFC

19 months agoRevert "[clang][Interp] Use placement new to construct opcode args into vector"
Timm Bäder [Wed, 30 Nov 2022 15:07:52 +0000 (16:07 +0100)]
Revert "[clang][Interp] Use placement new to construct opcode args into vector"

This reverts commit aaf73ae266db44fce107a0b73fcb33527bfb52eb.

This breaks sanitized builds because the constructor is called with an
unaligned address.

19 months ago[clang][Interp][NFC] Avoid unnecessary work in compileFunc()
Timm Bäder [Sun, 30 Oct 2022 09:13:18 +0000 (10:13 +0100)]
[clang][Interp][NFC] Avoid unnecessary work in compileFunc()

We don't need to create the parameter descriptors etc. if we've already
done that in the past.

19 months agoRevert "Implement CWG2631"
Corentin Jabot [Wed, 30 Nov 2022 15:02:14 +0000 (16:02 +0100)]
Revert "Implement CWG2631"

This reverts commit 26fa17ed2914bd80c066d36b325fd3104e45554c.
This reverts commit 4403c4f9e77e673a2771edfc7ab0ebb234e97485.

There is still an ODR issue causing linker errors, investigating.

19 months ago[flang] Allow non polymorphic pointer assignment with polymorphic rhs
Valentin Clement [Wed, 30 Nov 2022 14:53:01 +0000 (15:53 +0100)]
[flang] Allow non polymorphic pointer assignment with polymorphic rhs

Remove the TODO and allow pointer assignment with a non
polymorphic entity on the lhs. The assignment follows the same scheme
as derived-type pointer assignment to parent component.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D138998

19 months ago[gn build] Port 77b220524541
LLVM GN Syncbot [Wed, 30 Nov 2022 14:41:02 +0000 (14:41 +0000)]
[gn build] Port 77b220524541

19 months ago[lldb][DataFormatter] Add std::ranges::ref_view formatter
Michael Buch [Tue, 25 Oct 2022 09:57:44 +0000 (10:57 +0100)]
[lldb][DataFormatter] Add std::ranges::ref_view formatter

This patch adds a formatter for `std::ranges::ref_view<T>`.
It simply holds a `T*`, so all this formatter does is dereference
this pointer and format it as `T` would be.

**Testing**

* Added API tests

Differential Revision: https://reviews.llvm.org/D138558

19 months agoRemove 'modindex' from the Clang docs
Aaron Ballman [Wed, 30 Nov 2022 14:36:50 +0000 (09:36 -0500)]
Remove 'modindex' from the Clang docs

This was added in the initial commit to use Sphinx ~12 years ago, but
is a dead link in our docs. Removing it and the python bits that appear
to be unused.

19 months agoInstCombine: Add baseline tests for folding or of is.fpclass
Matt Arsenault [Thu, 17 Nov 2022 05:59:50 +0000 (21:59 -0800)]
InstCombine: Add baseline tests for folding or of is.fpclass

19 months agoInstCombine: Add baseline tests for negated is_fpclass
Matt Arsenault [Thu, 17 Nov 2022 05:24:11 +0000 (21:24 -0800)]
InstCombine: Add baseline tests for negated is_fpclass

19 months agoConstantFolding: Guard use of getFunction
David Stuttard [Wed, 23 Nov 2022 15:53:36 +0000 (15:53 +0000)]
ConstantFolding: Guard use of getFunction

Add additional guards for a use of getFunction on an Instruction.
In some cases constantFoldCanonicalize can be called with a cloned instruction
that doesn't have a parent (or associated function), causing a seg fault.

Differential Revision: https://reviews.llvm.org/D138642

19 months ago[clang] Do not merge traps in functions annotated optnone
Henrik G. Olsson [Wed, 30 Nov 2022 13:59:49 +0000 (14:59 +0100)]
[clang] Do not merge traps in functions annotated optnone

This aligns the behaviour with that of disabling optimisations for the
translation unit entirely. Not merging the traps allows us to keep
separate debug information for each, improving the debugging experience
when finding the cause for a ubsan trap.

Differential Revision: https://reviews.llvm.org/D137714

19 months agoInstCombine: Add baseline checks for is_fpclass
Matt Arsenault [Thu, 10 Nov 2022 23:27:18 +0000 (15:27 -0800)]
InstCombine: Add baseline checks for is_fpclass

19 months ago[gn build] Port d3c851d3fc8b
LLVM GN Syncbot [Wed, 30 Nov 2022 13:51:23 +0000 (13:51 +0000)]
[gn build] Port d3c851d3fc8b

19 months agoUse-after-return sanitizer binary metadata
Dmitry Vyukov [Mon, 17 Oct 2022 13:13:56 +0000 (15:13 +0200)]
Use-after-return sanitizer binary metadata

Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078

19 months ago[flang][NFC] add genType(FunctionRef<T>) entry points in lowering
Jean Perier [Wed, 30 Nov 2022 13:44:10 +0000 (14:44 +0100)]
[flang][NFC] add genType(FunctionRef<T>) entry points in lowering

This will help lowering to HLFIR to not use the AsGenericExpr/AsExpr
patterns that copy sub-expressions into evaluate::SomeExpr so that
they can be passed to helpers. Sub-expressions like FunctionRef can
be heavy (hundreds of arguments, constant array expression arguments...).

Differential Revision: https://reviews.llvm.org/D138997

19 months ago[AArch64] Assembly support for VMSA
Tomas Matheson [Thu, 24 Nov 2022 15:25:14 +0000 (15:25 +0000)]
[AArch64] Assembly support for VMSA

Virtual Memory System Architecture (VMSA)

This is part of the 2022 A-Profile Architecture extensions and adds support for
the following:

 - Translation Hardening Extension (FEAT_THE)
 - 128-bit Page Table Descriptors (FEAT_D128)
 - 56-bit Virtual Address (FEAT_LVA3)
 - Support for 128-bit System Registers (FEAT_SYSREG128)
 - System Instructions that can take 128-bit inputs (FEAT_SYSINSTR128)
 - 128-bit Atomic Instructions (FEAT_LSE128)
 - Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
 - Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
 - Memory Attribute Index Enhancement (FEAT_AIE)

New instructions added:
 - FEAT_SYSREG128 adds MRRS and MSRR.
 - FEAT_SYSINSTR128 adds the SYSP instruction and TLBIP aliases.
 - FEAT_LSE128 adds LDCLRP, LDSETP, and SWPP instructions.
 - FEAT_THE adds the set of RCW* instructions.

Specs for individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/

Contributors:
  Keith Walker
  Lucas Prates
  Sam Elliott
  Son Tuan Vu
  Tomas Matheson

Differential Revision: https://reviews.llvm.org/D138920

19 months ago[clang] Speedup LineOffsetMapping::get
serge-sans-paille [Mon, 21 Nov 2022 15:01:32 +0000 (16:01 +0100)]
[clang] Speedup LineOffsetMapping::get

LineOffsetMapping::get is a critical function that consistently appears
in the top 5 most computation-intensive functions when running the
preprocessor.

This change brings a consistent speedup of ~0.5% on preprocessing time,
see

https://llvm-compile-time-tracker.com/compare.php?from=0745b0c0354a0c8e1fefb68a3876d15db6c2e27a&to=460f3f04dac025e6952d78fce104a88151508a29&stat=instructions:u

for detailed statistics.

Differential Revision: https://reviews.llvm.org/D138474

19 months ago[mlir][Vector] Add a Broadcast::createBroadcastOp helper
Nicolas Vasilache [Wed, 30 Nov 2022 12:20:18 +0000 (04:20 -0800)]
[mlir][Vector] Add a Broadcast::createBroadcastOp helper

This helper handles non trivial cases of broadcast + optional transpose creation
that should not leak to the outside world.

Differential Revision: https://reviews.llvm.org/D139003

19 months ago[AMDGPU] Remove todo about vector types
Sebastian Neubauer [Wed, 30 Nov 2022 12:18:32 +0000 (13:18 +0100)]
[AMDGPU] Remove todo about vector types

D138205 added all the new vector types, so the todo is fixed now.

Differential Revision: https://reviews.llvm.org/D139002

19 months ago[AArch64] Don't treat SVE scalable extends as free widening instructions
David Green [Wed, 30 Nov 2022 13:09:48 +0000 (13:09 +0000)]
[AArch64] Don't treat SVE scalable extends as free widening instructions

The logic in isWideningInstruction handles instructions like uaddw and
smull, where 'add(x, zext(y))' or 'mul(sext(x), sext(y))' can be
converted to single instructions, making the extends free. This doesn't
apply the same to SVE instructions though.
https://godbolt.org/z/695d3nhGd

(There are instructions like SMULLT/B, but they require top/bottom lane
interleaving. That is similar to MVE instructions, which required a
special pass to perform the lane interleaving).

This patch just bails out of the call to isWideningInstruction if the
vector is scalable, getting a more accurate cost.

Differential Revision: https://reviews.llvm.org/D138591

19 months ago [RISCV] Add cost model for fixed broadcast shuffle
ShihPo Hung [Wed, 30 Nov 2022 12:58:52 +0000 (04:58 -0800)]
 [RISCV] Add cost model for fixed broadcast shuffle

This patch adds basic broadcast shuffle costs in order to enable SLP vectorization.
And adds `getLMULCost` to consider reciprocal throughput for different LMUL.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D137276

19 months agoFix Clang sphinx build
Aaron Ballman [Wed, 30 Nov 2022 12:52:59 +0000 (07:52 -0500)]
Fix Clang sphinx build

This addresses the issue found by:
https://lab.llvm.org/buildbot/#/builders/92/builds/36449

19 months ago[flang] Add hlfir.associate and hlfir.end_associate definitions
Jean Perier [Wed, 30 Nov 2022 12:41:44 +0000 (13:41 +0100)]
[flang] Add hlfir.associate and hlfir.end_associate definitions

These operations allow creating an HLFIR variable from a HLFIR value and
destroying it at the end of the variable lifetime.
This will both be used to implement procedure reference arguments association
when the actual is an expression, and to implement the Fortran associate
construct when the associated entity is an expression.

See https://github.com/llvm/llvm-project/blob/main/flang/docs/HighLevelFIR.md
for more details.

Differential Revision: https://reviews.llvm.org/D138996

19 months ago[X86] Add missing PFM port mappings for Core2/Nehalem
Simon Pilgrim [Tue, 29 Nov 2022 18:17:44 +0000 (18:17 +0000)]
[X86] Add missing PFM port mappings for Core2/Nehalem

This was an old patch from when I was trying to improve pre-AVX scheduler support as part of D103695, we were missing port mappings entirely for these targets - although tbh they don't map well to the SandyBridge model that they currently use.

19 months ago[clang-repl] Add basic documentation about clang-repl
Sara Bellei [Wed, 30 Nov 2022 10:25:35 +0000 (10:25 +0000)]
[clang-repl] Add basic documentation about clang-repl

Differential revision: https://reviews.llvm.org/D138698

19 months ago[CodeGen][X86] Crash fixes for "patchable-function" pass
Sylvain Audi [Fri, 4 Nov 2022 20:34:23 +0000 (16:34 -0400)]
[CodeGen][X86] Crash fixes for "patchable-function" pass

This patch fixes crashes related to how PatchableFunction selects the instruction to make patchable:
- Ensure PatchableFunction skips all instructions that don't generate actual machine instructions.
- Handle the case where the first MachineBasicBlock is empty
- Removed support for 16 bit x86 architectures.

Note: another issue related to PatchableFunction remains, in the lowering part.
See https://github.com/llvm/llvm-project/issues/59039

Differential Revision: https://reviews.llvm.org/D137642

19 months ago[AMDGPU] Use aperture registers instead of S_GETREG
Pierre van Houtryve [Mon, 28 Nov 2022 13:58:23 +0000 (13:58 +0000)]
[AMDGPU] Use aperture registers instead of S_GETREG

Fixes a longstanding TODO in the codebase where we were using S_GETREG + shift to do something that could simply be done with an inline constant (register).

Patch based on D31874 by @kzhuravl
Depends on D137767

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137542

19 months ago[AMDGPU][MC][GFX11] Disable non-VGPR src operands for VOP3_DPP variants of fmac instr...
Dmitry Preobrazhensky [Wed, 30 Nov 2022 11:45:52 +0000 (14:45 +0300)]
[AMDGPU][MC][GFX11] Disable non-VGPR src operands for VOP3_DPP variants of fmac instructions

Differential Revision: https://reviews.llvm.org/D138710

19 months agoX86: relax EFLAGS liveness check when generating stack probes.
Tim Northover [Tue, 22 Nov 2022 11:27:05 +0000 (11:27 +0000)]
X86: relax EFLAGS liveness check when generating stack probes.

The probes are all inserted at the iterator passed into the functions, so
that's where any EFLAGS clobbering will happen and where we need it to be dead.

Fixes: https://github.com/llvm/llvm-project/issues/59121

19 months ago[AMDGPU] Fix location of line break in VOPC instruction table
Jay Foad [Wed, 30 Nov 2022 11:39:48 +0000 (11:39 +0000)]
[AMDGPU] Fix location of line break in VOPC instruction table

19 months agoAMDGPU: Fixup tests
Nicolai Hähnle [Wed, 30 Nov 2022 11:29:37 +0000 (12:29 +0100)]
AMDGPU: Fixup tests

19 months ago[FLANG] Fix MSVC + clang-cl build
Muhammad Omair Javaid [Wed, 30 Nov 2022 11:28:13 +0000 (16:28 +0500)]
[FLANG] Fix MSVC + clang-cl build

Building Flang on Windows with an MSVC environment and the clang-cl compiler
requires the clang_rt.builtin.${target} library. This patch allows us to
locate and include this link library. This is mostly needed for the flang
runtime and associated unittests, as they require the uint128 division
builtin function __udivti3.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D138023

19 months ago[MSAN] add interceptor for stpncpy
Gabor Buella [Sun, 20 Nov 2022 21:15:24 +0000 (22:15 +0100)]
[MSAN] add interceptor for stpncpy

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D138386

19 months agoRevert "[OpenMP] [OMPD] Enable OMPD Tests"
Vignesh Balasubramanian [Wed, 30 Nov 2022 11:07:01 +0000 (16:37 +0530)]
Revert "[OpenMP] [OMPD] Enable OMPD Tests"

This reverts commit 451c017a32695ee62c0ae6de6401d89cd9bd9555.

19 months ago[flang] fix unused variables
Tom Eccles [Wed, 30 Nov 2022 10:33:22 +0000 (10:33 +0000)]
[flang] fix unused variables

19 months agoAMDGPU: Remove ImagePSV and move images to addrspace 7
Nicolai Hähnle [Tue, 29 Nov 2022 21:36:15 +0000 (22:36 +0100)]
AMDGPU: Remove ImagePSV and move images to addrspace 7

Following up on the removal of BufferPSV in commit 43b86bf992 ("AMDGPU:
Remove BufferPseudoSourceValue")

It is unclear what exactly the right address space for images should be.
They seem morally closest to buffers, so that's what I went with. In
practical terms, address space 7 is better than address space 0 because
it can't alias with LDS.

Differential Revision: https://reviews.llvm.org/D138949

19 months ago[Clang] Remove conflict markers from ReleaseNotes
Corentin Jabot [Wed, 30 Nov 2022 10:27:16 +0000 (11:27 +0100)]
[Clang] Remove conflict markers from ReleaseNotes

19 months agoImplement CWG2631
Corentin Jabot [Sun, 23 Oct 2022 15:32:58 +0000 (17:32 +0200)]
Implement CWG2631

Implement https://cplusplus.github.io/CWG/issues/2631.html.

Immediate calls in default arguments and default member initializers
are not evaluated.

Instead, we evaluate them when constructing a
`CXXDefaultArgExpr`/`BuildCXXDefaultInitExpr`.

The immediate calls are executed by doing a
transform on the initializing expression.

Note that lambdas are not considered subexpressions so
we do not need to transform them.

As a result of this patch, unused default member
initializers are not considered odr-used, and
errors about members binding to local variables
in an outer scope only surface at the point
where a constructor is defined.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D136554

19 months ago[OpenMP] [OMPD] Enable OMPD Tests
Vignesh Balasubramanian [Wed, 30 Nov 2022 08:48:29 +0000 (14:18 +0530)]
[OpenMP] [OMPD] Enable OMPD Tests

It was disabled due to different failures on different LLVM bots.

Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D138411

19 months ago[LoongArch] Add codegen support for atomicrmw min/max operation on LA64
gonglingqin [Wed, 30 Nov 2022 09:04:25 +0000 (17:04 +0800)]
[LoongArch] Add codegen support for atomicrmw min/max operation on LA64

This patch is required by OpenMP. After applying this patch, the OpenMP
regression tests pass. To reduce the review difficulty caused by overly large
patches, atomicrmw min/max operations on LA32 will be added later.

Differential Revision: https://reviews.llvm.org/D138177

19 months ago[clang][Interp] Explicitly handle RVO Pointer
Timm Bäder [Fri, 4 Nov 2022 06:43:34 +0000 (07:43 +0100)]
[clang][Interp] Explicitly handle RVO Pointer

The calling convention is:

[RVO pointer]
[instance pointer]
[... args ...]

We handle the instance pointer ourselves, BUT for the RVO pointer, we
just assumed in visitReturnStmt() that it is on top of the stack. Which
isn't true if there are other args present (and a this pointer, maybe).

Fix this by recording the RVO pointer explicitly when creating an
InterpFrame, just like we do with the instance/This pointer.

There is already a "RVOAndParams()" test in test/AST/Interp/records.cpp
that was supposed to test this. However, it didn't trigger any
problematic behavior because the parameter and the return value have the
same type.

Differential Revision: https://reviews.llvm.org/D137392

19 months agotsan: fix epoll_pwait2 interceptor
Dmitry Vyukov [Tue, 29 Nov 2022 15:51:48 +0000 (16:51 +0100)]
tsan: fix epoll_pwait2 interceptor

epoll_pwait2 is new and may not be present in libc and/or kernel.
Since we effectively add it to libc (as will be probed by the program
using dlsym or a weak function pointer) we need to handle the case
when it's not present in the actual libc.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D138929

19 months ago[clang][Interp] Use placement new to construct opcode args into vector
Timm Bäder [Wed, 23 Nov 2022 09:48:45 +0000 (10:48 +0100)]
[clang][Interp] Use placement new to construct opcode args into vector

This way we're invoking the copy constructor, which might be necessary
if the argument is not trivially constructible.

Differential Revision: https://reviews.llvm.org/D138554

19 months ago[clang][Interp] Handle undefined functions better
Timm Bäder [Fri, 28 Oct 2022 08:31:47 +0000 (10:31 +0200)]
[clang][Interp] Handle undefined functions better

Differential Revision: https://reviews.llvm.org/D136936

19 months ago[clang][Interp] Array initialization via CXXConstructExpr
Timm Bäder [Sun, 30 Oct 2022 06:08:45 +0000 (07:08 +0100)]
[clang][Interp] Array initialization via CXXConstructExpr

Differential Revision: https://reviews.llvm.org/D136920

19 months ago[flang] Handle polymorphic value when creating temporary
Valentin Clement [Wed, 30 Nov 2022 08:58:27 +0000 (09:58 +0100)]
[flang] Handle polymorphic value when creating temporary

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D138921

19 months agoRevert "Use-after-return sanitizer binary metadata"
Dmitry Vyukov [Wed, 30 Nov 2022 08:35:30 +0000 (09:35 +0100)]
Revert "Use-after-return sanitizer binary metadata"

This reverts commit e6aea4a5db09c845276ece92737a6aac97794100.

Broke tests:
https://lab.llvm.org/buildbot/#/builders/16/builds/38992

19 months ago[mlir][bufferize] Improve error message when returning allocs
Matthias Springer [Wed, 30 Nov 2022 08:11:50 +0000 (09:11 +0100)]
[mlir][bufferize] Improve error message when returning allocs

The previous error message was confusing. Also improve code documentation and make some minor code cleanups.

Differential Revision: https://reviews.llvm.org/D138902

19 months ago[clang][Interp] Fix discarding non-primitive function call return values
Timm Bäder [Fri, 21 Oct 2022 15:32:39 +0000 (17:32 +0200)]
[clang][Interp] Fix discarding non-primitive function call return values

Differential Revision: https://reviews.llvm.org/D136457

19 months ago[RISCV] Remove lmuls argument in Sched class
wangpc [Wed, 30 Nov 2022 08:01:12 +0000 (16:01 +0800)]
[RISCV] Remove lmuls argument in Sched class

The original intention was to add a list of SchedWrites (which is a
default argument of ReadAdvance) to LMULReadAdvance, but it may not
be practical to have two default arguments in one class. So
we add variants that are intended for widening and narrowing
instructions with postfix "W" and remove the lmuls argument.

Reviewed By: michaelmaitland

Differential Revision: https://reviews.llvm.org/D138640

19 months agoUse-after-return sanitizer binary metadata
Dmitry Vyukov [Mon, 17 Oct 2022 13:13:56 +0000 (15:13 +0200)]
Use-after-return sanitizer binary metadata

Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078

19 months ago[AMDGPU] Remove AMDGPUISelDAGToDAG::isKnownNeverNaN
Thomas Symalla [Tue, 29 Nov 2022 19:15:40 +0000 (20:15 +0100)]
[AMDGPU] Remove AMDGPUISelDAGToDAG::isKnownNeverNaN

This patch removes the mentioned function, as it only does two
checks which are already implemented as part of
SelectionDAG::isKnownNeverNaN - which is called there.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138938

19 months ago[llvm-profdata] Use flattening sample profile in profile supplementation
Rong Xu [Wed, 30 Nov 2022 06:23:47 +0000 (22:23 -0800)]
[llvm-profdata] Use flattening sample profile in profile supplementation

We need to flatten the SampleFDO profile in profile supplementation
because the InstrFDO profile does not have inlined callsite counters.
Without flattening the profile, FDO optimizations are not stable:
we will not supplement the second generation profile when the modified
functions are all inlined.

This patch fixes this issue: we will flatten the profile for functions
that appear in the FDO profile.

Note that we only need to find the hot/warm functions in SampleFDO
profile, so we will not perform a full flatten. We will use
a DFS traversal to compute the accumulated entry count and max bodycount.
This is much cheaper than full flattening.

Differential Revision: https://reviews.llvm.org/D138893

19 months ago[CMake] Support injecting extra dependencies for perf-training
Petr Hosek [Wed, 30 Nov 2022 03:02:58 +0000 (03:02 +0000)]
[CMake] Support injecting extra dependencies for perf-training

It may be necessary to build additional targets before running
perf-training, the typical use case would be builtins and runtimes.

This change allows users to specify those dependencies as:

  set(CLANG_PERF_TRAINING_DEPS builtins runtimes CACHE STRING "")

Differential Revision: https://reviews.llvm.org/D138974

19 months ago[X86] combine-and.ll - add test coverage for scalar broadcast
Evgenii Kudriashov [Wed, 30 Nov 2022 05:22:33 +0000 (13:22 +0800)]
[X86] combine-and.ll - add test coverage for scalar broadcast

Reviewed By: RKSimon, pengfei

Differential Revision: https://reviews.llvm.org/D138734

19 months agoFix obvious typo
Gabriel F. T. Gomes [Wed, 30 Nov 2022 05:21:33 +0000 (13:21 +0800)]
Fix obvious typo

The -fcf-protection option takes an optional argument, which allows the
requesting of control-flow protection on returns, or on indirect jumps
and calls, or both. One of the comments in the code applies to returns,
yet it mentions branches (for indirect calls and jumps). This patch
fixes this obvious typo.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D117836

19 months ago[perf-training] Support additional test suffixes
Petr Hosek [Wed, 30 Nov 2022 03:10:07 +0000 (03:10 +0000)]
[perf-training] Support additional test suffixes

.cc and .test are commonly used and supported by other lit test suites.

Differential Revision: https://reviews.llvm.org/D138975

19 months ago[IR][NFC] Adds Instruction::insertAt() for inserting at a specific point in the instr...
Vasileios Porpodas [Mon, 28 Nov 2022 22:38:38 +0000 (14:38 -0800)]
[IR][NFC] Adds Instruction::insertAt() for inserting at a specific point in the instr list.

Currently the only way to do this is to work with the instruction list directly.
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138875

19 months ago[mlir][linalg] Changing the positions of introduced parallel loop in SplitReduction...
Murali Vijayaraghavan [Wed, 30 Nov 2022 03:50:58 +0000 (03:50 +0000)]
[mlir][linalg] Changing the positions of introduced parallel loop in SplitReduction to be consistent with IREE's downstream passes

IREE's passes depend on the behavior of SplitReduction's introduced
parallel loop being the same as the introduced dimension in the
intermediate tensor (the order of loops was changed in
https://reviews.llvm.org/D137478).

Differential Revision: https://reviews.llvm.org/D138972

19 months ago[X86] include cmpccxaddintrin.h from immintrin.h to x86gprintrin.h
Freddy Ye [Wed, 30 Nov 2022 02:40:01 +0000 (10:40 +0800)]
[X86] include cmpccxaddintrin.h from immintrin.h to x86gprintrin.h

Reviewed By: LuoYuanke, pengfei

Differential Revision: https://reviews.llvm.org/D138900

19 months ago[scudo] Do not consider releasing unallocated pages
Petr Hosek [Wed, 30 Nov 2022 02:34:46 +0000 (02:34 +0000)]
[scudo] Do not consider releasing unallocated pages

We already know that there are no free blocks above Region->AllocatedUser.
This results in a smaller RegionPageMap and faster releaseFreeMemoryToOS.

Patch By: fabio-d
Differential Revision: https://reviews.llvm.org/D138794

19 months ago[clang-doc] Fix warnings about lock_guard
Petr Hosek [Wed, 30 Nov 2022 02:30:40 +0000 (02:30 +0000)]
[clang-doc] Fix warnings about lock_guard

Fixes a warning about a potentially unsupported template argument
deduction by explicitly specifying the template type in std::lock_guard.

Patch By: brettw
Differential Revision: https://reviews.llvm.org/D138961
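
A minimal sketch of the kind of change described in the commit above, not the actual clang-doc code. It contrasts a `std::lock_guard` whose template argument is deduced with one that names it explicitly. `TicketMutex`, `NextTicket`, and `takeTicket` are hypothetical names introduced only for illustration, and the commit message does not say which warning flag was involved.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared state; these names do not come from clang-doc.
static std::mutex TicketMutex;
static int NextTicket = 0;

// Hands out increasing ticket numbers while holding the lock.
int takeTicket() {
  // Deduced form, which some compilers warn about:
  //   std::lock_guard Guard(TicketMutex);
  // Explicit form, with the template argument spelled out:
  std::lock_guard<std::mutex> Guard(TicketMutex);
  assert(NextTicket >= 0 && "ticket counter should never be negative");
  return NextTicket++;
}
```

Both spellings create the same guard type; the explicit one simply avoids relying on class template argument deduction.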

19 months ago[InstSimplify] Fold (X || Y) ? X : Y --> X
chenglin.bi [Wed, 30 Nov 2022 02:10:45 +0000 (10:10 +0800)]
[InstSimplify] Fold (X || Y) ? X : Y --> X

(X || Y) ? X : Y --> X
https://alive2.llvm.org/ce/z/oRQJee

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D138815
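
The fold in the commit above can be sanity-checked on plain booleans. The sketch below assumes ordinary C++ `bool` semantics and ignores IR-level concerns such as poison, which the linked Alive2 proof covers; `selectOnOr` and `verifyFold` are hypothetical helper names, not LLVM functions.

```cpp
#include <cassert>

// The pattern before the fold, modelled on plain bools.
bool selectOnOr(bool X, bool Y) { return (X || Y) ? X : Y; }

// Exhaustively checks that the pattern yields X for all four inputs.
void verifyFold() {
  const bool Values[] = {false, true};
  for (bool X : Values)
    for (bool Y : Values)
      assert(selectOnOr(X, Y) == X && "(X || Y) ? X : Y should equal X");
}
```

The reasoning is short: when X is true the condition holds and X is selected; when X is false the result is either X (if Y is true) or Y, which is only selected when both are false and therefore equals X.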

19 months ago[RISCV] Preserve chain output when selecting splat as x0 strided load.
Craig Topper [Tue, 29 Nov 2022 23:18:35 +0000 (15:18 -0800)]
[RISCV] Preserve chain output when selecting splat as x0 strided load.

We need the vlse node to have a chain output and it should replace
the chain output of the original load.

19 months ago[NFC] Removed call to getInstList() from range loops on BBs.
Vasileios Porpodas [Wed, 23 Nov 2022 21:30:03 +0000 (13:30 -0800)]
[NFC] Removed call to getInstList() from range loops on BBs.

Differential Revision: https://reviews.llvm.org/D138605

19 months ago[ADT] Add `zip_equal` for iteratees of equal lengths
Jakub Kuderski [Wed, 30 Nov 2022 00:56:23 +0000 (19:56 -0500)]
[ADT] Add `zip_equal` for iteratees of equal lengths

Add a new version of `zip` that assumes that all iteratees have equal
lengths. The difference compared to `zip_first` is that `zip_equal`
checks this assumption in builds with assertions enabled.

This will allow us to clearly express the intent when working with
equally-sized ranges without having to write this assertion manually.

This is similar to Python's `zip(..., strict=True)` [1] or
`more_itertools.zip_equal` [2].

I saw this first suggested by @benvanik.

[1] https://peps.python.org/pep-0618/
[2] https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.zip_equal

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D138865
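
To illustrate the idea behind the commit above without depending on LLVM headers, here is a rough stand-in. `zipEqualSketch` is a hypothetical helper, not the ADT implementation: it eagerly builds a vector of pairs, whereas the real `zip_equal` is described as a lazy range adaptor. What it shares with the description is the length check that fires only in builds with assertions enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the idea behind zip_equal: pair up elements and
// check, in assertion-enabled builds, that both inputs have the same length.
template <typename A, typename B>
std::vector<std::pair<A, B>> zipEqualSketch(const std::vector<A> &Lhs,
                                            const std::vector<B> &Rhs) {
  assert(Lhs.size() == Rhs.size() &&
         "zipEqualSketch requires inputs of equal length");
  std::vector<std::pair<A, B>> Result;
  Result.reserve(Lhs.size());
  for (std::size_t I = 0; I < Lhs.size(); ++I)
    Result.emplace_back(Lhs[I], Rhs[I]);
  return Result;
}
```

Calling it with two three-element vectors yields three pairs in order; calling it with mismatched lengths trips the assertion instead of silently truncating.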

19 months ago[ADT] Clarify `zip` behavior with iteratees of different lengths
Jakub Kuderski [Wed, 30 Nov 2022 00:50:15 +0000 (19:50 -0500)]
[ADT] Clarify `zip` behavior with iteratees of different lengths

Update the documentation comment and add a new test case.

Add an assertion in `zip_first` checking the iteratee length precondition.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D138858

19 months ago[RISCV][CodeGen] Account for LMUL for Vector Fixed-Point Arithmetic Instructions
Michael Maitland [Thu, 3 Nov 2022 16:37:00 +0000 (09:37 -0700)]
[RISCV][CodeGen] Account for LMUL for Vector Fixed-Point Arithmetic Instructions

It is likely that subtargets act differently for vector fixed-point arithmetic instructions based on the LMUL.
This patch creates separate SchedRead, SchedWrite, WriteRes, ReadAdvance for each relevant LMUL.

Differential Revision: https://reviews.llvm.org/D137342

19 months ago[FuzzMutate] SinkInstructionStrategy
Peter Rong [Tue, 29 Nov 2022 21:41:02 +0000 (13:41 -0800)]
[FuzzMutate] SinkInstructionStrategy

Randomly select an instruction and try to use it in the future by replacing it with another instruction's operand.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138948

19 months agoAMDGPU: Convert some cast tests to opaque pointers
Matt Arsenault [Tue, 29 Nov 2022 23:41:50 +0000 (18:41 -0500)]
AMDGPU: Convert some cast tests to opaque pointers

19 months ago[libc++][math.h] move #undefs to the top and guard explicitly against MSVCRT instead
Nikolas Klauser [Fri, 4 Nov 2022 19:19:20 +0000 (20:19 +0100)]
[libc++][math.h] move #undefs to the top and guard explicitly against MSVCRT instead

Reviewed By: ldionne, #libc

Spies: #libc_vendors, EricWF, libcxx-commits

Differential Revision: https://reviews.llvm.org/D137502

19 months ago[libc++][math.h] Use builtins for all the functions
Nikolas Klauser [Thu, 20 Oct 2022 23:41:22 +0000 (01:41 +0200)]
[libc++][math.h] Use builtins for all the functions

This allows compiling libc++, even when the C library doesn't support floating point math.

Reviewed By: ldionne, #libc

Spies: daltenty, xingxue, libcxx-commits, michaelplatings

Differential Revision: https://reviews.llvm.org/D136393

19 months ago[libc++] Install llvm-dev in the docker image
Nikolas Klauser [Fri, 25 Nov 2022 17:25:05 +0000 (18:25 +0100)]
[libc++] Install llvm-dev in the docker image

This is required to compile custom clang-tidy checks

Reviewed By: Mordante, #libc

Spies: libcxx-commits, arichardson

Differential Revision: https://reviews.llvm.org/D138728

19 months ago[mlir][tensor] Enhance the verifier of pack and unpack op.
Hanhan Wang [Tue, 29 Nov 2022 19:38:15 +0000 (11:38 -0800)]
[mlir][tensor] Enhance the verifier of pack and unpack op.

The outer_dims_perm must be a permutation or empty.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D138936

19 months agoAMDGPU: Convert some bit operation tests to opaque pointers
Matt Arsenault [Tue, 29 Nov 2022 23:26:06 +0000 (18:26 -0500)]
AMDGPU: Convert some bit operation tests to opaque pointers

19 months agoAMDGPU: Fix creating illegal f16 fp_class
Matt Arsenault [Tue, 29 Nov 2022 01:18:10 +0000 (20:18 -0500)]
AMDGPU: Fix creating illegal f16 fp_class

We were missing legality checks. The device library build was broken
for targets without f16 support. Technically the first pattern isn't
tested by this patch; it only triggers with the isBeforeLegalize check
in performAndCombine removed. I'm not sure how to trick this into
appearing post-legalization.

19 months agoAMDGPU: Bulk update some call tests to use opaque pointers
Matt Arsenault [Tue, 29 Nov 2022 23:10:38 +0000 (18:10 -0500)]
AMDGPU: Bulk update some call tests to use opaque pointers

19 months agoAMDGPU: Bulk update some generic intrinsic tests to opaque pointers
Matt Arsenault [Fri, 25 Nov 2022 03:53:12 +0000 (22:53 -0500)]
AMDGPU: Bulk update some generic intrinsic tests to opaque pointers

Done purely with the script.

19 months agoAMDGPU: Convert amdgpu-alias-analysis.ll to opaque pointers
Matt Arsenault [Tue, 29 Nov 2022 22:54:34 +0000 (17:54 -0500)]
AMDGPU: Convert amdgpu-alias-analysis.ll to opaque pointers

This one was slightly tricky. The AA debug printing usually, but not
always, uses the old pointer syntax. Also, we need to stop folding out
0 index GEPs in a few of these cases.

19 months agoAMDGPU: Convert some fp op tests to opaque issues
Matt Arsenault [Tue, 29 Nov 2022 22:49:58 +0000 (17:49 -0500)]
AMDGPU: Convert some fp op tests to opaque issues

fmax_legacy.ll had one test that produced "ptraddrspace(1)", since
somehow "i1addrspace(1)*" used to parse.

19 months ago[Hexagon] Fix unused variable warning in Release builds. NFC
Benjamin Kramer [Tue, 29 Nov 2022 23:00:43 +0000 (00:00 +0100)]
[Hexagon] Fix unused variable warning in Release builds. NFC

19 months ago[clang-doc] Move file layout to the generators.
Brett Wilson [Tue, 29 Nov 2022 20:35:58 +0000 (12:35 -0800)]
[clang-doc] Move file layout to the generators.

Previously file naming and directory layout were handled on a per Info
object basis by ClangDocMain and the generators blindly wrote to the
files given. This meant all generators had to use the same file layout and
caused problems where multiple objects mapped to the same file. The
object collision problem happens most easily with template
specializations because the template parameters are not part of the
"name".

This patch moves the responsibility for output file organization to the
generators. Currently HTML and MD use the same structure as before. But
they now collect all objects that map to a given file and combine them,
avoiding the corruption problems.

This patch also converts the YAML generator to naming files based on USR in one
directory. This is easier for downstream tools to manage and avoids the
naming problems with template specializations. Since this change
requires backward-incompatible output changes to referenced files anyway
(since each one is now an array), this is a good time to introduce this
change.

Differential Revision: https://reviews.llvm.org/D138073
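
The idea of collecting all objects that map to a given file before writing can be sketched as follows. This is a toy model in Python, not clang-doc's implementation: the record fields, the USR values, and the file extensions are made up for illustration.

```python
from collections import defaultdict


def group_by_output_file(infos, path_for):
    """Group records by the output path that path_for assigns to them."""
    grouped = defaultdict(list)
    for info in infos:
        grouped[path_for(info)].append(info)
    return dict(grouped)


# Hypothetical records: two template specializations share the name "Foo",
# so a name-based layout maps both of them to the same file.
infos = [
    {"usr": "A1", "name": "Foo"},
    {"usr": "B2", "name": "Foo"},
    {"usr": "C3", "name": "Bar"},
]

# Name-based layout: colliding objects are combined into one list per file.
by_name = group_by_output_file(infos, lambda i: i["name"] + ".html")

# USR-based layout: every object gets its own file, so nothing collides.
by_usr = group_by_output_file(infos, lambda i: i["usr"] + ".yaml")
```

With the name-based key, the two `Foo` records end up in one list for a single file instead of overwriting each other; with the USR-based key, each record lands in its own file.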

19 months ago[libc++][ranges][NFC] Revamp the Ranges status page
varconst [Tue, 29 Nov 2022 22:07:44 +0000 (14:07 -0800)]
[libc++][ranges][NFC] Revamp the Ranges status page

Focus on the not-yet-implemented features: remove most details about the
already-implemented C++20 stuff, list out the major C++23 additions.

Differential Revision: https://reviews.llvm.org/D136657

19 months ago[clang][darwin] Use consistent version define stringifying logic for different Darwin...
Alex Lorenz [Tue, 29 Nov 2022 22:13:39 +0000 (14:13 -0800)]
[clang][darwin] Use consistent version define stringifying logic for different Darwin OSes

19 months ago[Hexagon] Further improve code generation for shuffles
Krzysztof Parzyszek [Thu, 24 Nov 2022 15:05:10 +0000 (07:05 -0800)]
[Hexagon] Further improve code generation for shuffles

* Concatenate partial shuffles into longer ones whenever possible:
  In selection DAG, shuffle's operands and return type must all agree. This
  is not the case in LLVM IR, and non-conforming IR-level shuffles will be
  rewritten to match DAG's requirements. This can also make a shuffle that
  can be matched to a single HVX instruction become shuffles that require
  more complex handling. Example: anything that takes two single vectors
  and returns a pair (e.g. V6_vshuffvdd).
  This is avoided by concatenating such shuffles into ones that take a vector
  pair, and an undef pair, and produce a vector pair.

* Recognize perfect shuffles when masks contain `undef` values.

* Use funnel shifts for contracting shuffles.

* Recognize rotations as a separate step.

These changes go into a single commit, because each one on its own
introduced some regressions.
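
As a rough illustration of matching a shuffle mask that contains `undef` lanes, the sketch below checks whether a single-vector mask is a rotation. It is a toy model in Python, not the Hexagon backend's algorithm: the function name, the use of `-1` for `undef`, and the restriction to a single input vector are all assumptions.

```python
UNDEF = -1  # assumed encoding of an undef mask element


def rotation_amount(mask):
    """Return k if mask[i] == (i + k) % n for every defined lane, else None.

    Undef lanes match any rotation, so only defined lanes constrain k.
    The smallest matching k is returned.
    """
    n = len(mask)
    for k in range(n):
        if all(m == UNDEF or m == (i + k) % n for i, m in enumerate(mask)):
            return k
    return None
```

For example, `[2, 3, 0, 1]` and `[2, UNDEF, 0, 1]` are both recognized as a rotation by 2, while `[0, 2, 1, 3]` is not a rotation at all.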

19 months ago[AIX][LTO] Properly respect LDR_CNTRL and set MAXDATA32 to 0xA0000000@DSA.
Wael Yehia [Tue, 29 Nov 2022 20:43:04 +0000 (20:43 +0000)]
[AIX][LTO] Properly respect LDR_CNTRL and set MAXDATA32 to 0xA0000000@DSA.

Reviewed By: rzurob

Differential Revision: https://reviews.llvm.org/D138944

19 months agoupdate_test_checks: fix typos
Nicolai Hähnle [Tue, 29 Nov 2022 22:08:47 +0000 (23:08 +0100)]
update_test_checks: fix typos

Found by our downstream CI.

19 months ago[libc++] Add a missing include to `swap_allocator.h`.
Konstantin Varlamov [Tue, 29 Nov 2022 21:54:20 +0000 (13:54 -0800)]
[libc++] Add a missing include to `swap_allocator.h`.

Also add tests for the file.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D135635

19 months ago[InstCombine] Revert D125845
William Huang [Tue, 29 Nov 2022 21:49:44 +0000 (21:49 +0000)]
[InstCombine] Revert D125845

Reverting D125845 `[InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the back` because multiple users reported performance regressions.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D138950

19 months ago[clang] [test] Fix recently pushed mingw tests in some environments
Martin Storsjö [Tue, 29 Nov 2022 21:56:50 +0000 (23:56 +0200)]
[clang] [test] Fix recently pushed mingw tests in some environments

Account for backslashes in paths in mingw.cpp.

Testing clang with the <triple>-clang form seems to require the
x86 target to be enabled, when the triple is an x86 triple. Just
skip that aspect of the test, since the "clang --target=<triple>"
form should give enough test coverage here.

19 months agoAMDGPU: Remove unused variables. NFC
Benjamin Kramer [Tue, 29 Nov 2022 21:57:00 +0000 (22:57 +0100)]
AMDGPU: Remove unused variables. NFC

19 months ago[PowerPC] Fix vperm codegen
Maryam Moghadas [Fri, 25 Nov 2022 21:58:00 +0000 (15:58 -0600)]
[PowerPC] Fix vperm codegen

Commit rG934d5fa2b8672695c335deed0e19d0e777c98403 changed the vperm codegen
for cases where vperm is not replaced by xxperm; this patch reverts that change.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D138736

19 months agoRevert "enable code-object-version=5"
Ron Lieberman [Tue, 29 Nov 2022 21:21:09 +0000 (15:21 -0600)]
Revert "enable code-object-version=5"

very sorry wrong repo.

This reverts commit d882ba7aeac4b496dccd1b10cb58bd691786b691.

19 months agoRevert "Add mean_anyway to hpc config"
Ron Lieberman [Tue, 29 Nov 2022 21:19:44 +0000 (15:19 -0600)]
Revert "Add mean_anyway to hpc config"

my bad, wrong repo, so sorry.

This reverts commit 0b9350f3da7daf1d740bbbfab79d01613fcd29f4.

19 months ago[clang][driver][darwin] Enforce consistent major version limit for any Darwin OS
Alex Lorenz [Tue, 29 Nov 2022 21:12:22 +0000 (13:12 -0800)]
[clang][driver][darwin] Enforce consistent major version limit for any Darwin OS

Limit can also be bumped up to 999 to allow OS versions over 100

19 months ago[openmp] [test] Use stdint.h instead of manual code defining kmp_int*. NFC.
Martin Storsjö [Mon, 28 Nov 2022 15:08:05 +0000 (17:08 +0200)]
[openmp] [test] Use stdint.h instead of manual code defining kmp_int*. NFC.

Differential Revision: https://reviews.llvm.org/D138818

19 months agoReapply [openmp] [test] XFAIL many-microtask-args.c on ARM
Martin Storsjö [Fri, 25 Nov 2022 14:26:50 +0000 (16:26 +0200)]
Reapply [openmp] [test] XFAIL many-microtask-args.c on ARM

On ARM, a C fallback version of __kmp_invoke_microtask is used,
which only handles up to a fixed number of arguments - while
many-microtask-args.c tests that the function can handle an
arbitrarily large number of arguments (the testcase produces 17
arguments).

On the CMake level, we can't add ${LIBOMP_ARCH} directly to
OPENMP_TEST_COMPILER_FEATURES in OpenMPTesting.cmake, since
that file is parsed before LIBOMP_ARCH is set. Instead
convert the feature list into a proper CMake list, and append
${LIBOMP_ARCH} into it before serializing it to a Python array.

Reapply: Make sure OPENMP_TEST_COMPILER_FEATURES is defined
properly in all other test subdirectories other than
runtime/test too.

Differential Revision: https://reviews.llvm.org/D138738
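
The list handling described in the commit above (treat the features as a proper list, append the architecture, then serialize to a Python array) can be sketched as follows. The sketch is written in Python rather than CMake and is not the code from the patch: the function name and the sample feature strings are hypothetical. It relies only on the fact that CMake lists are semicolon-separated strings.

```python
def cmake_list_to_python_array(cmake_list, extra=None):
    """Serialize a semicolon-separated CMake list as a Python list literal.

    `extra` models appending one more entry (for example an architecture
    name) before serialization. Empty entries are dropped.
    """
    items = [item for item in cmake_list.split(";") if item]
    if extra:
        items.append(extra)
    return "[" + ", ".join(repr(item) for item in items) + "]"


# Hypothetical feature names, used only to show the shape of the output.
serialized = cmake_list_to_python_array("featureA;featureB", extra="arm")
```

Because the result is a valid Python list literal, a test configuration file can consume it directly as a list of feature names.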