platform/upstream/llvm.git
13 months agoAdd release notes for MLIR
Mehdi Amini [Tue, 25 Jul 2023 18:14:16 +0000 (11:14 -0700)]
Add release notes for MLIR

Differential Revision: https://reviews.llvm.org/D156253

13 months ago[mlir] Fix assembly format parser generator after 9ea6b30ac20f8223fb6aeae853e5c736918...
Oleg Shyshkov [Tue, 25 Jul 2023 15:39:03 +0000 (17:39 +0200)]
[mlir] Fix assembly format parser generator after 9ea6b30ac20f8223fb6aeae853e5c73691850a8d.

13 months ago[mlir][spirv] Do not introduce vector<1xT> in UnifyAliasedResource
Jakub Kuderski [Tue, 25 Jul 2023 15:30:17 +0000 (11:30 -0400)]
[mlir][spirv] Do not introduce vector<1xT> in UnifyAliasedResource

1-element vectors are not valid in SPIR-V and fail `Bitcast` op verification.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D156207

13 months ago[CodeGen] Disable FP LD1RX instructions generation for Neoverse-V1
Igor Kirillov [Fri, 4 Aug 2023 12:05:27 +0000 (12:05 +0000)]
[CodeGen] Disable FP LD1RX instructions generation for Neoverse-V1

These instructions perform worse on Neoverse-V1 than a pair of
LDR(LDP)/MOV instructions.
This patch adds the `no-sve-fp-ld1r` sub-target feature, which is enabled
only on Neoverse-V1.

Fixes https://github.com/llvm/llvm-project/issues/64498

Differential Revision: https://reviews.llvm.org/D157279

(cherry picked from commit 60e2a849b0a537f96ca12fb032c4a0e32e07b4ae)

13 months agoReland "Try to implement lambdas with inalloca parameters by forwarding without use...
Amy Huang [Fri, 23 Jun 2023 18:41:44 +0000 (11:41 -0700)]
Reland "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."

This reverts commit 8ed7aa59f489715d39d32e72a787b8e75cfda151.

Differential Revision: https://reviews.llvm.org/D154007

(cherry picked from commit 27dab4d305acb6e0935e014c061c5317016ae2b3)

13 months ago[llvm-exegesis] Don't try to use SYS_rseq if it's not defined.
Guillaume Chatelet [Mon, 7 Aug 2023 07:31:53 +0000 (07:31 +0000)]
[llvm-exegesis] Don't try to use SYS_rseq if it's not defined.

When compiling against recent glibc (>= 2.35) but old kernel headers (< 4.18), `SYS_rseq` is not defined and thus llvm-exegesis fails to build. So also check that `SYS_rseq` is defined before trying to use it.
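
A minimal sketch of the kind of guard described above, assuming a Linux host where `<sys/syscall.h>` provides the `SYS_*` macros. The helper name `rseqSyscallNumberOr` is hypothetical and is not taken from llvm-exegesis:

```cpp
#include <cassert>
#if defined(__linux__)
#include <sys/syscall.h>  // defines SYS_rseq only if the kernel headers know rseq
#endif

// Hypothetical helper (not from llvm-exegesis): returns the rseq syscall
// number when the headers define it, otherwise the caller-supplied fallback.
constexpr long rseqSyscallNumberOr(long fallback) {
#if defined(SYS_rseq)
  (void)fallback;
  return SYS_rseq;
#else
  return fallback;  // old kernel headers: SYS_rseq is simply not available
#endif
}
```

Code that actually issues the syscall would sit behind the same `#if defined(SYS_rseq)` check, so that the build succeeds with old kernel headers.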

Fixes https://github.com/llvm/llvm-project/issues/64456

Reviewed By: MaskRay, gchatelet

Differential Revision: https://reviews.llvm.org/D157189

(cherry picked from commit f70e83af7a708a22fdde8c644ac5810223090cd4)

13 months ago[CodeGen] Fix incorrect pattern FMLA_* pseudo instructions
Igor Kirillov [Thu, 3 Aug 2023 15:57:12 +0000 (15:57 +0000)]
[CodeGen] Fix incorrect pattern FMLA_* pseudo instructions

* Remove the incorrect patterns from AArch64fmla_p/AArch64fmls_p
* Add correct patterns to AArch64fmla_m1/AArch64fmls_m1
* Refactor fma_patfrags for the sake of PatFrags

Fixes https://github.com/llvm/llvm-project/issues/64419

Differential Revision: https://reviews.llvm.org/D157095

(cherry picked from commit 84d444f90900d1b9d6c08be61f8d62090df28042)

13 months ago[CodeGen] Precommit tests for D157095
Igor Kirillov [Tue, 8 Aug 2023 11:30:32 +0000 (11:30 +0000)]
[CodeGen] Precommit tests for D157095

(cherry picked from commit 7542477d5d6e10848ac9ba5dd5421afc7e4947d2)

13 months ago[CodeGen] Pre-commit tests showing incorrect pattern FMLA_* pseudo instructions
Igor Kirillov [Fri, 4 Aug 2023 13:45:03 +0000 (13:45 +0000)]
[CodeGen] Pre-commit tests showing incorrect pattern FMLA_* pseudo instructions

Differential Revision: https://reviews.llvm.org/D157094

(cherry picked from commit b560d5c7e380c1c412b892a3e22f8ee15a522381)

13 months ago[lldb][AArch64] Save/restore TLS registers around expressions
David Spickett [Fri, 28 Jul 2023 08:03:40 +0000 (08:03 +0000)]
[lldb][AArch64] Save/restore TLS registers around expressions

This was cherry-picked from 6239227172cdc92f3bb72131333f50f83a6439cf and has been
modified to remove references to the tpidr2 register that is not supported on
the 17 branch.

Previously lldb was storing them but not restoring them. This meant that this function:
```
void expr(uint64_t value) {
  __asm__ volatile("msr tpidr_el0, %0" ::"r"(value));
}
```
when run from lldb:
```
(lldb) expression expr()
```
would leave tpidr as `value` instead of the original value of the register.

A check for this scenario has been added to TestAArch64LinuxTLSRegister.py.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D156512

13 months ago[lldb][AArch64] Add reading of TLS tpidr register from core files
David Spickett [Mon, 24 Jul 2023 12:38:27 +0000 (12:38 +0000)]
[lldb][AArch64] Add reading of TLS tpidr register from core files

7e229217f4215b519b886e7881bae4da3742a7d2 did live processes, this does
core files. Pretty simple, there is an NT_ARM_TLS note that contains
at least tpidr, and on systems with the Scalable Matrix Extension (SME), tpidr2
as well.

tpidr2 will be handled in future patches for SME support.

This NT_ARM_TLS note has always been present but it seems convenient to
handle it as "optional" inside of LLDB. We'll probably want the flexibility
when supporting tpidr2.

Normally the C library would set tpidr but all our test sources build
without it. So I've updated the neon test program to write to tpidr
and regenerated the corefile.

I've removed the LLDB_PTRACE_NT_ARM_TLS that was unused, we get
what we need from llvm's defs instead.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D156118

13 months agoRevert "[Clang] Fix -Wconstant-logical-operand when LHS is a constant"
Shivam Gupta [Tue, 8 Aug 2023 01:50:36 +0000 (07:20 +0530)]
Revert "[Clang] Fix -Wconstant-logical-operand when LHS is a constant"

This reverts commit dfdfd306cfaf54fbc43e2d5eb36489dac3eb9976.

An issue was reported about an incorrect warning, so this change has to be reconsidered.

Differential Revision: https://reviews.llvm.org/D157352

13 months ago[libc++][libunwind] Fixes to allow GCC 13 to compile libunwind/libc++abi/libc++
Nikolas Klauser [Thu, 3 Aug 2023 23:37:22 +0000 (16:37 -0700)]
[libc++][libunwind] Fixes to allow GCC 13 to compile libunwind/libc++abi/libc++

These are changes to allow GCC 13 to successfully compile the runtimes stack.

Reviewed By: ldionne, #libc, #libunwind, MaskRay

Spies: MaskRay, zibi, SeanP, power-llvm-team, mstorsjo, arichardson, libcxx-commits

Differential Revision: https://reviews.llvm.org/D151387

(cherry picked from commit 3537338d1ab9b6da4b58499877953deb81c59e5e)

13 months ago[TailCallElim] Remove the readonly attribute of byval.
DianQK [Tue, 8 Aug 2023 20:50:30 +0000 (04:50 +0800)]
[TailCallElim] Remove the readonly attribute of byval.

When eliminating a tail call, we modify the values of the arguments.
Therefore, if the byval parameter has a readonly attribute, we have to remove it. It is safe because,
from the perspective of a caller, the byval parameter is always treated as "readonly," even if the readonly attribute is removed.

Fixes #64289.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D156793

(cherry picked from commit c3f227ead65c606409ff8cc3333a6c751f156a9c)

13 months ago[TailCallElim] Regenerate test checks with --function-signature (NFC)
DianQK [Tue, 8 Aug 2023 20:50:04 +0000 (04:50 +0800)]
[TailCallElim] Regenerate test checks with --function-signature (NFC)

For checking the readonly attribute.

Pre-commit test for D156793.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D156789

(cherry picked from commit b77e5563f6bc4b5a81d427bf0f42ebea8ca376f0)

13 months ago[LoongArch] Support -march=native and -mtune=
Weining Lu [Wed, 9 Aug 2023 01:58:34 +0000 (09:58 +0800)]
[LoongArch] Support -march=native and -mtune=

As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture, defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.

D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.

A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.

`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.

Two preprocessor macros are defined based on user-provided `-march=`
and `-mtune=` options and the defaults.
- __loongarch_arch
- __loongarch_tune
Note that, to work with `-fno-integrated-cc1`, we leverage the cc1 options
`-target-cpu` and `-tune-cpu` to pass the driver options `-march=` and
`-mtune=` respectively, because cc1 needs this information to define
macros in `LoongArchTargetInfo::getTargetDefines`.

[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc

Reviewed By: xen0n, wangleiat, steven_wu, MaskRay

Differential Revision: https://reviews.llvm.org/D155824

(cherry picked from commit f62c9252fc0f1fa0a0f02033659db052c2202a4c)

13 months agoRevert "Reland "[LoongArch] Support -march=native and -mtune=""
Steven Wu [Mon, 31 Jul 2023 22:49:08 +0000 (15:49 -0700)]
Revert "Reland "[LoongArch] Support -march=native and -mtune=""

This reverts commit c56514f21b2cf08eaa7ac3a57ba4ce403a9c8956. This
commit adds global state that is shared between the clang driver and clang
cc1, which is not correct when clang is used with the `-fno-integrated-cc1`
option (no integrated cc1). The -march and -mtune options need to be
properly passed through the cc1 command line and stored in TargetInfo.

(cherry picked from commit 42c9354a928d4d9459504527085fccc91b46aed3)

13 months ago[libc++] Deflake the Clang Modules CI job
Louis Dionne [Mon, 7 Aug 2023 18:38:40 +0000 (14:38 -0400)]
[libc++] Deflake the Clang Modules CI job

This re-introduces the workaround that had been introduced in d7ca140c0122
and then removed in 0c0628c92c0d, since it seems like it is needed after all.

Differential Revision: https://reviews.llvm.org/D157319

(cherry picked from commit d2a61db072e90ca15a8e5bc053aab878af5cb92a)

13 months agoworkflows/release-tasks: Add missing sudo
Tom Stellard [Tue, 1 Aug 2023 20:40:49 +0000 (13:40 -0700)]
workflows/release-tasks: Add missing sudo

(cherry picked from commit ffecb43c4812707be07a9810f21b7b407480f868)

13 months ago[clang][hexagon] Handle library path arguments earlier
Brian Cain [Tue, 1 Aug 2023 02:42:45 +0000 (19:42 -0700)]
[clang][hexagon] Handle library path arguments earlier

The removal of the early return in 96832a6bf7e0e7f1e8d634d38c44a1b32d512923
was an error: it would include the 'standalone' library that's not used
by linux.

Instead we reproduce the library path handling in the linux/musl block.

Differential Revision: https://reviews.llvm.org/D156771

(cherry picked from commit 5bc4b34a3aa9c6ea10663a252ac46d20862b38d5)

13 months ago[PPC32] Parse bl __tls_get_addr(x@tlsgd)@plt+32768
Fangrui Song [Tue, 8 Aug 2023 02:45:28 +0000 (19:45 -0700)]
[PPC32] Parse bl __tls_get_addr(x@tlsgd)@plt+32768

PPC32 -fpic/-fPIC generates `bl __tls_get_addr(x@tlsgd)@PLT` or
`bl __tls_get_addr(x@tlsgd)@PLT+32768`.
`powerpc-linux-gnu-gcc -fPIC` generates `bl __tls_get_addr+32668(x@tlsgd)@plt`.

These expressions can be parsed by GNU assembler but not by the integrated
assembler. Add the support.

Differential Revision: https://reviews.llvm.org/D153206

(cherry picked from commit 6e07e90890d61b1be19d3f5fbf00ea7430068325)

13 months ago[CodeGen] Improve speed of ComplexDeinterleaving pass
Igor Kirillov [Wed, 2 Aug 2023 16:26:52 +0000 (16:26 +0000)]
[CodeGen] Improve speed of ComplexDeinterleaving pass

Cache all results of running `identifyNode`, even those that do not identify
potential complex operations. This patch prevents ComplexDeinterleaving pass
from repeatedly trying to identify Nodes for the same pair of instructions.
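
The caching idea can be sketched as plain memoization keyed on a pair of operands. This is only an illustration: `PairCache`, the integer ids, and the injected `identify` callback are hypothetical stand-ins and do not reproduce the pass's data structures. The key point is that failed identifications (empty results) are stored as well:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <utility>

// Hypothetical memoizing wrapper around an expensive "identify" callback.
class PairCache {
public:
  using Key = std::pair<int, int>;   // stand-in for a pair of instructions
  using Result = std::optional<int>; // empty means "nothing identified"

  explicit PairCache(std::function<Result(int, int)> identify)
      : identify_(std::move(identify)) {}

  Result lookup(int real, int imag) {
    const Key key{real, imag};
    auto it = cache_.find(key);
    if (it != cache_.end())
      return it->second;             // cached, including cached failures
    Result result = identify_(real, imag);
    cache_.emplace(key, result);     // store negative results too
    return result;
  }

private:
  std::function<Result(int, int)> identify_;
  std::map<Key, Result> cache_;
};
```

With this shape, a second `lookup` for the same pair never calls the callback again, even if the first call identified nothing.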

Fixes https://github.com/llvm/llvm-project/issues/64379

Differential Revision: https://reviews.llvm.org/D156916

(cherry picked from commit 46b2ad0224d3c9a9cc299211213e2cf677f5a78c)

13 months agoBump version to 17.0.0-rc2
Tobias Hieta [Tue, 8 Aug 2023 11:16:44 +0000 (13:16 +0200)]
Bump version to 17.0.0-rc2

13 months ago[libc++][modules] Fixes exporting named declarations.
Mark de Wever [Fri, 28 Jul 2023 15:44:06 +0000 (17:44 +0200)]
[libc++][modules] Fixes exporting named declarations.

@ChuanqiXu noticed std::atomic was not properly exported in the std module.
Investigation showed other named declarations were not exported either. This
fixes the issue.

Depends on D156550

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D156592

(cherry picked from commit e57f6f709ed2ebef7852bebb47baaf202962b4ee)

13 months ago[clang/cxx-interop] Teach clang to ignore availability errors that come from CF_OPTIONS
zoecarver [Mon, 7 Aug 2023 00:21:51 +0000 (20:21 -0400)]
[clang/cxx-interop] Teach clang to ignore availability errors that come from CF_OPTIONS

This cherry-picks https://github.com/apple/llvm-project/pull/6431
since without it, macOS 14 SDK headers don't compile when targeting
catalyst.

Fixes #64438.

(cherry picked from commit bb58748e52ebd48a46de20525ef2c594db044a11)

13 months ago[Clang][LoongArch] Fix ABI handling of empty structs in C++ to match GCC behaviour
Weining Lu [Mon, 7 Aug 2023 09:49:18 +0000 (17:49 +0800)]
[Clang][LoongArch] Fix ABI handling of empty structs in C++ to match GCC behaviour

GCC doesn't ignore non-zero-length arrays of empty structures in C++
while clang does. This patch matches GCC's behaviour,
although this rule is not documented in the psABI.

Similar to D142327 for RISCV.

Reviewed By: xry111, xen0n

Differential Revision: https://reviews.llvm.org/D156116

13 months agoRevert "[clang][X86] Add __cpuidex function to cpuid.h"
Aiden Grossman [Fri, 4 Aug 2023 16:20:50 +0000 (09:20 -0700)]
Revert "[clang][X86] Add __cpuidex function to cpuid.h"

This reverts commit 2df77ac20a1ed996706b164b0c4ed5ad140f635f.

This has been causing some issues with some windows builds as
_MSC_EXTENSIONS isn't defined when only -fms-extensions is set, but the
builtin that conflicts with __cpuidex is. This was also causing problems
as it exposed some latent issues with how auxiliary triples are handled
in clang.

Differential Revision: https://reviews.llvm.org/D157115

(cherry picked from commit f3baf63d9a1ba91974f4df6abb8f2abd9a0df5b5)

13 months agoAMDGPU: Remove note about spill handling changes
Matt Arsenault [Mon, 7 Aug 2023 21:35:16 +0000 (17:35 -0400)]
AMDGPU: Remove note about spill handling changes

This was reverted from the release by
8ff26437cfd37a3611d3b6066e5aa2cf933887e0

13 months agoMIPS: clear_cache, use _flush_cache instead of cacheflush
YunQiang Su [Mon, 7 Aug 2023 19:25:34 +0000 (15:25 -0400)]
MIPS: clear_cache, use _flush_cache instead of cacheflush

cacheflush is only defined with __USE_MISC, which depends on _DEFAULT_SOURCE,
_GNU_SOURCE, _BSD_SOURCE, or _SVID_SOURCE.

If CC is called with -std=c11, these macros won't be defined. Use
_flush_cache instead, which is always defined.

Reviewed By: brad, jrtc27

Differential Revision: https://reviews.llvm.org/D156072

(cherry picked from commit 0f99bc2d685c572c3b38fd0e1ca56be12d7e2f6a)

13 months ago[libc++][print] Mark some more `<print>` tests as requiring a file system.
Konstantin Varlamov [Fri, 4 Aug 2023 20:53:43 +0000 (13:53 -0700)]
[libc++][print] Mark some more `<print>` tests as requiring a file system.

A follow-up to https://reviews.llvm.org/D156585.

(cherry picked from commit e6b2e1b88248c1f5b34fde4e51aaeb99f849b580)

13 months ago[clang][RISCV] Fix bug in ABI handling of empty structs with hard FP calling conventi...
Alex Bradbury [Mon, 7 Aug 2023 09:42:45 +0000 (10:42 +0100)]
[clang][RISCV] Fix bug in ABI handling of empty structs with hard FP calling conventions in C++

As reported in <https://github.com/llvm/llvm-project/issues/58929>,
Clang's handling of empty structs in the case of small structs that may
be eligible to be passed using the hard FP calling convention doesn't
match g++. In general, C++ record fields are never empty unless
[[no_unique_address]] is used, but the RISC-V FP ABI overrides this.

After this patch, fields of structs that contain empty records will be
ignored, even in C++, when considering eligibility for the FP calling
convention ('flattening'). It isn't explicitly noted in the RISC-V
psABI, but arrays of empty records will disqualify a struct for
consideration of using the FP calling convention in g++. This patch
matches that behaviour. The psABI issue
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/358> seeks
to clarify this.

This patch was previously committed but reverted after a bug was found.
This recommit adds additional logic to prevent that bug (adding an extra
check for when a candidate from detectFPCCEligibleStructHelper may not
be valid).

Differential Revision: https://reviews.llvm.org/D142327

13 months agoRevert "Reapply: [MemCpyOpt] implement single BB stack-move optimization which unify...
Vitaly Buka [Wed, 2 Aug 2023 17:14:49 +0000 (10:14 -0700)]
Revert "Reapply: [MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas"""

Breaks Asan and LTO.

This reverts commit ea72b5137eb72391ad192dbb01084c21b9fe8b71.

(cherry picked from commit 00653889883f2d818536efcb21c6c8b739f0888b)

13 months agoRevert "[CodeGen]Allow targets to use target specific COPY instructions for live...
Vitaly Buka [Wed, 26 Jul 2023 22:41:17 +0000 (15:41 -0700)]
Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381

(cherry picked from commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd)

13 months ago[sanitizer_common] Remove hacks for __builtin_return_address abuse on SPARC
Rainer Orth [Thu, 3 Aug 2023 14:06:59 +0000 (16:06 +0200)]
[sanitizer_common] Remove hacks for __builtin_return_address abuse on SPARC

As detailed in Issue #57624, the introduction of
`__builtin_extract_return_address` to `GET_CALLER_PC` in
4248f32b9ebe87c7af8ee53911efd47c2652f488
<https://reviews.llvm.org/rG4248f32b9ebe87c7af8ee53911efd47c2652f488> broke
`TestCases/Misc/missing_return.cpp` on Solaris/SPARC.  Unlike most other
targets, the builtin isn't a no-op on SPARC and thus has always been
necessary. Its lack had previously been worked around by calls to
`GetNextInstructionPc` in `sanitizer_stacktrace_sparc.cpp`
(`BufferedStackTrace::UnwindFast`) and `sanitizer_unwind_linux_libcdep.cpp`
(`BufferedStackTrace::UnwindSlow`).  However, those calls are superfluous
now and actually harmful.

This patch removes those hacks, fixing the failure.

Tested on `sparcv9-sun-solaris2.11` and on `sparc-sun-solaris2.11` in the
GCC tree.  On the latter, several more testcase failures had been caused by
this issue since ASan actually works with `gcc` on SPARC, unlike `clang`.

Differential Revision: https://reviews.llvm.org/D156504

(cherry picked from commit 679c076ae446af81eba81ce9b94203a273d4b88a)

13 months agocmake: add missing dependencies on ClangDriverOptions tablegen
Jon Roelofs [Fri, 4 Aug 2023 17:42:45 +0000 (10:42 -0700)]
cmake: add missing dependencies on ClangDriverOptions tablegen

This is a follow-up to 2fb1c1082c01

(cherry picked from commit 3d756c32cdf005d0f4c05f561fec4a37b64b7ddd)

13 months ago[SymbolSize] Improve the performance of SymbolSize computation
Steven Wu [Sun, 30 Jul 2023 19:14:34 +0000 (12:14 -0700)]
[SymbolSize] Improve the performance of SymbolSize computation

The current algorithm to compute the symbol size is quadratic if there
are lots of symbols sharing the same addresses. This happens in a debug
build when lots of debug symbols get emitted in the symtab.

This patch improves the performance of tools like `llvm-symbolizer` that rely
on the symbol size computation. Symbolizing a release+assert clang with
DebugInfo improves significantly, from 3:40min to less than 1s.
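
One way to avoid quadratic behaviour when many symbols share an address is a single backward pass over the sorted addresses. The sketch below illustrates that idea only and is not the code from this patch. The function name, the `sectionEnd` parameter, and the choice to give all symbols at one address the same size are assumptions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: `addrs` must be sorted ascending. Each symbol's size
// is the distance to the next strictly greater address (or to sectionEnd).
std::vector<std::uint64_t>
sizesFromSortedAddresses(const std::vector<std::uint64_t>& addrs,
                         std::uint64_t sectionEnd) {
  std::vector<std::uint64_t> sizes(addrs.size());
  std::uint64_t nextDistinct = sectionEnd;
  for (std::size_t i = addrs.size(); i > 0; --i) {
    const std::size_t idx = i - 1;
    // Update the "next different address" only when the neighbour differs,
    // so a run of equal addresses is handled in O(run length), not O(n^2).
    if (idx + 1 < addrs.size() && addrs[idx + 1] != addrs[idx])
      nextDistinct = addrs[idx + 1];
    sizes[idx] = nextDistinct - addrs[idx];
  }
  return sizes;
}
```

A forward scan that searches for the next different address separately for every symbol revisits the same run of duplicates again and again. The backward pass visits each element once.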

Reviewed By: pete, mehdi_amini, arsenm, MaskRay

Differential Revision: https://reviews.llvm.org/D156603

(cherry picked from commit f5974e80653db977913bceffca7e900e818ef872)

13 months agoRetain all jump table range checks when using BTI.
Simon Tatham [Mon, 31 Jul 2023 08:09:09 +0000 (09:09 +0100)]
Retain all jump table range checks when using BTI.

This modifies the switch-statement generation in SelectionDAGBuilder,
specifically the part that generates case clusters of type CC_JumpTable.

A table-based branch of any kind is at risk of being a JOP gadget, if
it doesn't range-check the offset into the table. For some types of
table branch, such as Arm TBB/TBH, the impact of this is limited
because the value loaded from the table is a relative offset of
limited size; for others, such as a MOV PC,Rn computed branch into a
table of further branch instructions, the gadget is fully general.

When compiling for branch-target enforcement via Arm's BTI system,
many of these table branch idioms use branch instructions of types
that do not require a BTI instruction at the branch destination. This
avoids the need to put a BTI at the start of each case handler,
reducing the number of available gadgets //with// BTIs (i.e. ones
which could be used by a JOP attack in spite of the BTI system). But
without a range check, the use of a non-BTI-requiring branch also
opens up a larger range of followup gadgets for an attacker's use.

A defence against this is to avoid optimising away the range check on
the table offset, even if the compiler believes that no out-of-range
value should be able to reach the table branch. (Rationale: that may
be true for values generated legitimately by the program, but not
those generated maliciously by attackers who have already corrupted
the control flow.)

The effect of keeping the range check and branching to an unreachable
block is that no actual code is generated at that block, so it will
typically point at the end of the function. That may still cause some
kind of unpredictable code execution (such as executing data as code,
or falling through to the next function in the code section), but even
if so, there will only be //one// possible invalid branch target,
rather than giving an attacker the choice of many possibilities.

This defence is enabled only when branch target enforcement is in use.
Without branch target enforcement, the range check is easily bypassed
anyway, by branching in to a location just after it. But with
enforcement, the attacker will have to enter the jump table dispatcher
at the initial BTI and then go through the range check. (Or, if they
don't, it's because they //already// have a general BTI-bypassing
gadget.)

Reviewed By: MaskRay, chill

Differential Revision: https://reviews.llvm.org/D155485

(cherry picked from commit 60b98363c7ed0a549be4d51ee07c32dc2bf47d2f)

13 months ago[libc++][print] Make `<print>` tests require file system support.
Konstantin Varlamov [Fri, 4 Aug 2023 07:23:41 +0000 (00:23 -0700)]
[libc++][print] Make `<print>` tests require file system support.

`print` functions require `FILE` and `stdout` to be available and cause
compilation errors on platforms that don't support the file system.

Differential Revision: https://reviews.llvm.org/D156585

(cherry picked from commit 1cf970db4e5499f6b38d9c6644935a78d758802c)

13 months ago[libc++][mdspan] Fix layout_left::stride(r)
Christian Trott [Fri, 4 Aug 2023 03:35:23 +0000 (21:35 -0600)]
[libc++][mdspan] Fix layout_left::stride(r)

It was using the stride calculation of layout_right.
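
For reference, the two stride rules differ as sketched below. This is a standalone illustration using `std::array` rather than libc++'s mdspan types, and the function names are hypothetical: the `layout_left` stride for dimension r is the product of the extents before r, while the `layout_right` stride is the product of the extents after r.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stride of dimension r for a column-major (layout_left style) mapping:
// product of extents[0] .. extents[r-1].
template <std::size_t Rank>
constexpr std::size_t
layoutLeftStride(const std::array<std::size_t, Rank>& extents, std::size_t r) {
  std::size_t stride = 1;
  for (std::size_t i = 0; i < r; ++i)
    stride *= extents[i];
  return stride;
}

// Stride of dimension r for a row-major (layout_right style) mapping:
// product of extents[r+1] .. extents[Rank-1].
template <std::size_t Rank>
constexpr std::size_t
layoutRightStride(const std::array<std::size_t, Rank>& extents, std::size_t r) {
  std::size_t stride = 1;
  for (std::size_t i = r + 1; i < Rank; ++i)
    stride *= extents[i];
  return stride;
}
```

For extents {2, 3, 4} the left strides are 1, 2, 6 and the right strides are 12, 4, 1, so using one formula for the other layout gives wrong results for every non-trivial shape.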

Reviewed By: philnik

Differential Revision: https://reviews.llvm.org/D157065

(cherry picked from commit 0f4d7d81c9d08512a3871596fa2a14b737233c80)

13 months ago[RISCV] Use max pushed register to get pushed register number.
Yeting Kuo [Thu, 3 Aug 2023 06:35:09 +0000 (14:35 +0800)]
[RISCV] Use max pushed register to get pushed register number.

Previously we used the number of registers that need saving and are pushable as the
number of pushed registers. We also use the pushed register number to calculate
the stack size. That is not correct, because Zcmp pushes registers from $ra up to the
highest register that needs saving, and there is no guarantee that the registers to be
saved form a contiguous sequence starting at $ra.

For example, PushPopRegs should be 6 (ra, s0-s4) instead of 1 here:
```
; llc -mtriple=riscv32 -mattr=+zcmp
define void @foo() {
entry:
; Old:    .cfi_def_cfa_offset 16
; New:    .cfi_def_cfa_offset 32
  tail call void asm sideeffect "li s4, 0", "~{s4}"()
  ret void
}
```
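
The difference between the two counts can be sketched as follows. This is a simplified model, not the RISC-V backend code: the bitset layout (index 0 = ra, 1 = s0, ..., 12 = s11), the function names, and the 16-byte rounding are assumptions made for illustration, and the sketch ignores which register lists are actually encodable.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumPushable = 13;  // ra, s0 .. s11 (assumed ordering)

// Old behaviour: count only the registers that need saving.
std::size_t countSavedRegs(const std::bitset<kNumPushable>& needed) {
  return needed.count();
}

// New behaviour: the push covers everything from ra up to the highest
// register that needs saving, so the count is (highest index + 1).
std::size_t countPushedRegs(const std::bitset<kNumPushable>& needed) {
  for (std::size_t i = kNumPushable; i > 0; --i)
    if (needed.test(i - 1))
      return i;
  return 0;
}

// Stack bytes for the push, rounded up to a 16-byte multiple (assumption).
std::size_t pushStackBytes(std::size_t regs, std::size_t xlenBytes) {
  return (regs * xlenBytes + 15) / 16 * 16;
}
```

With only s4 marked, the old count is 1 (16 bytes on RV32) and the new count is 6 (32 bytes on RV32), which matches the `.cfi_def_cfa_offset` values in the example above.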

Reviewed By: Jim, kito-cheng

Differential Revision: https://reviews.llvm.org/D156407

(cherry picked from commit f68c6879ad0e08e6509b89f60ed436d3be409f9c)

13 months agoCommit to a primary definition for a class when we load its first
Richard Smith [Tue, 25 Jul 2023 00:34:08 +0000 (17:34 -0700)]
Commit to a primary definition for a class when we load its first
member.

Previously, we wouldn't do this if the first member loaded is within a
definition that's added to a class via an update record, which happens
when template instantiation adds a class definition to a declaration
that was imported from an AST file.

This would lead to classes having member functions whose getParent
returned a class declaration that wasn't the primary definition, which
in turn caused the vtable builder to build broken vtables.

I don't yet have a reduced testcase for the wrong-code bug here, because
the setup required to get us into the broken state is very subtle, but
have confirmed that this fixes it.

(cherry picked from commit 61c7a9140becb19c5b1bc644e54452c6f782f5d5)

13 months agoRemove stale info and fix superscript numbering
Aaron Ballman [Tue, 1 Aug 2023 11:17:27 +0000 (07:17 -0400)]
Remove stale info and fix superscript numbering

This amends 1e06b82bded69fe627d6cd62ecff236fca15f39b

(cherry picked from commit 80e80fa79bf66a74caf959bc420823e2b544dee9)

13 months ago[docs] Bump minimum GCC version to 7.4
Fangrui Song [Mon, 31 Jul 2023 20:10:08 +0000 (13:10 -0700)]
[docs] Bump minimum GCC version to 7.4

GCC 7.3 cannot build 16.x releases.
```
In file included from /tmp/llvm-16/llvm/lib/Transforms/IPO/AttributorAttributes.cpp:14:0:
/tmp/llvm-16/llvm/include/llvm/Transforms/IPO/Attributor.h:1137:32: error: duplicate initialization of ‘llvm::AnalysisGetter::HasLegacyWrapper<Analysis, std::void_t<typename Analysis::LegacyWrapper> >’
 constexpr bool AnalysisGetter::HasLegacyWrapper<
                                ^~~~~~~~~~~~~~~~~
       Analysis, std::void_t<typename Analysis::LegacyWrapper>> = true;
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/llvm-16/llvm/include/llvm/Transforms/IPO/Attributor.h:1137:32: error: got 1 template parameters for ‘constexpr const bool llvm::AnalysisGetter::HasLegacyWrapper< <template-parameter-1-1>, <template-parameter-1-2> >’
/tmp/llvm-16/llvm/include/llvm/Transforms/IPO/Attributor.h:1137:32: error:   but 2 required
```

The 17.x and main branches have more failures, e.g.

```
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp: `error: cannot decompose class type ‘std::pair<llvm::Value*, const llvm::SCEV*>’: ...`
```

We probably should just give up 7.1 and say that GCC<=7.3 is unsupported.
There is evidence that GCC 7.4 works.
I have verified that GCC 7.5 is able to build `check-{llvm,clang,clang-tools,lldb,lld,polly,mlir,bolt}`,
but not flang due to at least `flang/Common/enum-class.h` and a `<charconv>` in a unittest.

Link: https://discourse.llvm.org/t/require-gcc-7-5-as-gcc-7-3-cannot-build-llvm/72310
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D156286

(cherry picked from commit 1e06b82bded69fe627d6cd62ecff236fca15f39b)

13 months ago[TableGen] Improve error report of unspecified arguments
wangpc [Thu, 3 Aug 2023 09:20:10 +0000 (17:20 +0800)]
[TableGen] Improve error report of unspecified arguments

The incorrect error message is fixed, and a note pointing at the argument is printed.

Tests are added in `llvm/test/TableGen/template-args.td`.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D156966

(cherry picked from commit eb6987027e0504adcdc319f080a9ea48aab2a72a)

13 months ago[X86] Workaround possible CPUID bug in Sandy Bridge.
Craig Topper [Thu, 3 Aug 2023 15:12:00 +0000 (08:12 -0700)]
[X86] Workaround possible CPUID bug in Sandy Bridge.

Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
supported via EAX.

Intel documentation says invalid subleaves return 0. We had been
relying on that behavior instead of checking the max subleaf number.

It appears that some Sandy Bridge CPUs return at least the subleaf 0
EDX value for subleaf 1. Best guess is that this is a bug in a
microcode patch since all of the bits we're seeing set in EDX were
introduced after Sandy Bridge was originally released.

This is causing avxvnniint16 to be incorrectly enabled with -march=native
on these CPUs.
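
The check can be sketched with the CPUID query injected as a callback, so the logic can be exercised without real hardware. `CpuidRegs`, `CpuidFn`, and `readLeaf7Subleaf1` are hypothetical names for this illustration and are not the ones used in LLVM's host detection code:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

struct CpuidRegs {
  std::uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
};

// Hypothetical callback type: returns the registers for (leaf, subleaf).
using CpuidFn =
    std::function<CpuidRegs(std::uint32_t leaf, std::uint32_t subleaf)>;

// Only trust leaf 7 subleaf 1 if subleaf 0 reports (via EAX, the maximum
// supported subleaf) that subleaf 1 exists. Otherwise report all zeros.
CpuidRegs readLeaf7Subleaf1(const CpuidFn& cpuid) {
  const CpuidRegs subleaf0 = cpuid(7, 0);
  if (subleaf0.eax < 1)
    return CpuidRegs{};
  return cpuid(7, 1);
}
```

A CPU that echoes subleaf 0's EDX for subleaf 1 while reporting a maximum subleaf of 0 then yields all-zero feature bits instead of spurious ones.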

Reviewed By: pengfei, anna

Differential Revision: https://reviews.llvm.org/D156963

(cherry picked from commit 2a5e3f4c6c2cdd2aab55fbfdb703ca8163351ea9)

13 months agoMultilib & mfloat-abi release notes
Michael Platings [Tue, 1 Aug 2023 13:21:01 +0000 (14:21 +0100)]
Multilib & mfloat-abi release notes

13 months ago[PowerPC][MC] Recognize tlbilx and its mnemonics
Qiu Chaofan [Wed, 2 Aug 2023 03:10:46 +0000 (11:10 +0800)]
[PowerPC][MC] Recognize tlbilx and its mnemonics

This fixes issue 64080. tlbilx exists in ISA 2.07 Book III-E. Since
contents of Book III-E were eliminated after ISA 3.0, tlbilx does not
exist in ISA 3.0 and ISA 3.1.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D156204

(cherry picked from commit 53648ac1d0c953ae6d008864dd2eddb437a92468)

13 months ago[libc++] Fix `std::out_of_range` thrown from `basic_stringbuf::str() &&`
Piotr Fusik [Tue, 1 Aug 2023 18:17:46 +0000 (20:17 +0200)]
[libc++] Fix `std::out_of_range` thrown from `basic_stringbuf::str() &&`

Reviewed By: #libc, Mordante, philnik

Differential Revision: https://reviews.llvm.org/D156783

(cherry picked from commit f418cb1a9367d85c7c9b1aa93dc3fa60c8ef9849)

13 months ago[RISCV] Fix the CFI offset for callee-saved registers stored by Zcmp push.
Jim Lin [Wed, 2 Aug 2023 02:49:18 +0000 (10:49 +0800)]
[RISCV] Fix the CFI offset for callee-saved registers stored by Zcmp push.

Issue mentioned: https://github.com/riscv/riscv-code-size-reduction/issues/182

The order of callee-saved registers stored by Zcmp push in memory is reversed.

Pseudo code for cm.push in https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.4-1/Zc.1.0.4-1.pdf

```
if (XLEN==32) bytes=4; else bytes=8;

addr=sp-bytes;
for(i in 27,26,25,24,23,22,21,20,19,18,9,8,1)  {
  //if register i is in xreg_list
  if (xreg_list[i]) {
    switch(bytes) {
      4:  asm("sw x[i], 0(addr)");
      8:  asm("sd x[i], 0(addr)");
    }
    addr-=bytes;
  }
}
```

The placement order for push is s11, s10, ..., ra.

The CFI offsets should be calculated in reversed order for correct stack unwinding.
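
Following the pseudo code above, the offset of each saved register relative to the stack pointer value before the push can be sketched like this. The function name and the position convention (0 = ra, 1 = s0, ...) are assumptions for illustration, not the backend's implementation:

```cpp
#include <cassert>
#include <cstdint>

// position:  index of the register in the pushed list (0 = ra, 1 = s0, ...)
// count:     total number of registers in the pushed list
// xlenBytes: 4 on RV32, 8 on RV64
// The highest-numbered register is stored first, at sp - xlenBytes, and ra
// ends up at the lowest address, so offsets grow more negative towards ra.
constexpr std::int64_t zcmpPushCfiOffset(std::int64_t position,
                                         std::int64_t count,
                                         std::int64_t xlenBytes) {
  return -(count - position) * xlenBytes;
}
```

For a push of {ra, s0, s1} on RV32 this gives s1 at -4, s0 at -8 and ra at -12.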

Reviewed By: fakepaper56, kito-cheng

Differential Revision: https://reviews.llvm.org/D156437

13 months agoRevert "[AArch64] Merge LDRSWpre-LD[U]RSW pair into LDPSWpre"
Alexander Kornienko [Wed, 26 Jul 2023 13:34:10 +0000 (15:34 +0200)]
Revert "[AArch64] Merge LDRSWpre-LD[U]RSW pair into LDPSWpre"

This reverts commit b0093e13fcfdd4eea5bbd7ae57d3d1b82f4135c3 due to a miscompile
under MSan. See https://reviews.llvm.org/D152407#4533478 for more details.

Reviewed By: asmok-g

Differential Revision: https://reviews.llvm.org/D156328

(cherry picked from commit 0def4e6b0f638b97a73bd4674365961d8fabda28)

13 months agoclang driver throws error for -mabi=elfv2 or elfv2
Kishan Parmar [Sat, 29 Jul 2023 10:39:54 +0000 (16:09 +0530)]
clang driver throws error for -mabi=elfv2 or elfv2

After clang release/16.x there is a regression where -mabi=elfv1
or -mabi=elfv2 is treated as unused and triggers a warning, while clang trunk
reports an error for -mabi=elfv1 or -mabi=elfv2. The intent of this patch is to
accept elfv1 or elfv2 for -mabi.

Reviewed By : nemanjai
Differential Revision: https://reviews.llvm.org/D156351

(cherry picked from commit 065da3574b4fe9d4ee6283de2c82b8ce1c08af08)

13 months ago[clang] allow const structs/unions/arrays to be constant expressions for C
Nick Desaulniers [Wed, 2 Aug 2023 22:23:47 +0000 (15:23 -0700)]
[clang] allow const structs/unions/arrays to be constant expressions for C

For code like:
struct foo { ... };
struct bar { struct foo foo; };
const struct foo my_foo = { ... };
struct bar my_bar = { .foo = my_foo };

Eli Friedman points out the relevant part of the C standard seems to
have some flexibility in what is considered a constant expression:

6.6 paragraph 10:
An implementation may accept other forms of constant expressions.

GCC 8 added support for these, so clang not supporting them has been a
constant thorn in the side of source code portability within the Linux
kernel.

Fixes: https://github.com/llvm/llvm-project/issues/44502

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76096

(cherry picked from commit 610ec954e1f81c0e8fcadedcd25afe643f5a094e)

13 months ago[RISCV] Use correct LMUL!=1 types for __attribute__((riscv_rvv_vector_bits(N)))
wangpc [Tue, 1 Aug 2023 17:20:11 +0000 (01:20 +0800)]
[RISCV] Use correct LMUL!=1 types for __attribute__((riscv_rvv_vector_bits(N)))

We used to convert them to M1 types in arguments and return
value, which causes failures in CodeGen since it is not legal
to insert subvectors with LMUL>1 to M1 vectors.

Fixes 64266

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D156779

(cherry picked from commit edb5056300bbb327a4b07b4f64ccc8678345721a)

13 months ago[AArch64] Add some basic handling for bf16 constants.
David Green [Mon, 31 Jul 2023 20:31:56 +0000 (21:31 +0100)]
[AArch64] Add some basic handling for bf16 constants.

This adds some basic handling for bf16 constants, attempting to treat them a
lot like fp16 constants where it can. Zero immediates get lowered to FMOVH0,
others either get lowered to FMOVWHr(MOVi32imm) or use FMOVHi if they can.
Without fp16 they get expanded. This may not always be optimal, but fixes a gap
in our lowering. See llvm/test/CodeGen/AArch64/f16-imm.ll for the equivalent
fp16 test.

Differential Revision: https://reviews.llvm.org/D156649

(cherry picked from commit 778fa4edaf207bd2fef3635ceb8782e325ded76a)

13 months ago[libcxx] Add release notes for Windows wide stdio stream handling
Martin Storsjö [Sun, 30 Jul 2023 19:37:38 +0000 (22:37 +0300)]
[libcxx] Add release notes for Windows wide stdio stream handling

This adds notes for the change from https://reviews.llvm.org/D146398 /
fcbbd9649ac165aaf7fc7d60b8fef3b23755179a.

Differential Revision: https://reviews.llvm.org/D156627

(cherry picked from commit 9abc6d9105ca625ee2a03c0ec96a77d9575ca34f)

13 months ago[docs] Add release notes for the LLVM 17 RVV intrinsics support
eopXD [Mon, 31 Jul 2023 08:16:06 +0000 (01:16 -0700)]
[docs] Add release notes for the LLVM 17 RVV intrinsics support

13 months ago[XCOFF] Do not put MergeableCStrings in their own section
Wael Yehia [Wed, 26 Jul 2023 20:48:13 +0000 (20:48 +0000)]
[XCOFF] Do not put MergeableCStrings in their own section

The current implementation generates a csect with a
".rodata.str.x.y" prefix for a MergeableCString variable definition.
However, a reference to such a variable does not get the prefix in its
name because there's not enough information in the containing IR.
In particular, without seeing the initializer and absent some other
indicators, we cannot tell that the referenced variable is a
null-terminated string.

When the AIX codegen in llvm was being developed, the prefixing was copied
from ELF without having the linker take advantage of the info.
Currently, the AIX linker does not have the capability to merge
MergeableCString variables. If such a feature is ever implemented,
the contract between the linker and compiler will have to be reconsidered.

Here's the before and after of this change:
```
@a = global i64 320255973571806, align 8
@strA = unnamed_addr constant [7 x i8] c"hello\0A\00", align 1  ;; Mergeable1ByteCString
@strB = unnamed_addr constant [8 x i8] c"Blahah\0A\00", align 1 ;; Mergeable1ByteCString
@strC = unnamed_addr constant [2 x i16] [i16 1, i16 0], align 2 ;; Mergeable2ByteCString
@strD = unnamed_addr constant [2 x i16] [i16 1, i16 1], align 2 ;; !isMergeableCString
@strE = external unnamed_addr constant [2 x i16], align 2

-fdata-sections:
  .text  extern        .rodata.str1.1strA        .text  extern        strA
    0    SD       RO                               0    SD       RO
  .text  extern        .rodata.str1.1strB        .text  extern        strB
    0    SD       RO                               0    SD       RO
  .text  extern        .rodata.str2.2strC  ===>  .text  extern        strC
    0    SD       RO                               0    SD       RO
  .text  extern        strD                      .text  extern        strD
    0    SD       RO                               0    SD       RO
  .data  extern        a                         .data  extern        a
    0    SD       RW                               0    SD       RW
  undef  extern        strE                      undef  extern        strE
    0    ER       UA                               0    ER       UA

-fno-data-sections:
  .text  unamex        .rodata.str1.1            .text  unamex        .rodata
    0    SD       RO                               0    SD       RO
  .text  extern        strA                      .text  extern        strA
    0    LD       RO                               0    LD       RO
  .text  extern        strB                      .text  extern        strB
    0    LD       RO                               0    LD       RO
  .text  unamex        .rodata.str2.2      ===>  .text  extern        strC
    0    SD       RO                               0    LD       RO
  .text  extern        strC                      .text  extern        strD
    0    LD       RO                               0    LD       RO
  .text  unamex        .rodata                   .data  unamex        .data
    0    SD       RO                               0    SD       RW
  .text  extern        strD                      .data  extern        a
    0    LD       RO                               0    LD       RW
  .data  unamex        .data                     undef  extern        strE
    0    SD       RW                               0    ER       UA
  .data  extern        a
    0    LD       RW
  undef  extern        strE
    0    ER       UA
```

Reviewed by: David Tenty, Fangrui Song

Differential Revision: https://reviews.llvm.org/D156202

(cherry picked from commit 9d4e8c09f493280acc7637d904bdc84abc11fdc3)

13 months ago[NFC] Fix version number in release tree
Tobias Hieta [Mon, 31 Jul 2023 09:22:55 +0000 (11:22 +0200)]
[NFC] Fix version number in release tree

13 months ago[docs] Add release notes for a Windows specific change in LLD
Martin Storsjö [Sat, 29 Jul 2023 21:40:03 +0000 (00:40 +0300)]
[docs] Add release notes for a Windows specific change in LLD

13 months ago[libc++][Modules] Fix a few module related warnings
Ian Anderson [Fri, 28 Jul 2023 06:36:50 +0000 (23:36 -0700)]
[libc++][Modules] Fix a few module related warnings

I'm getting a few -Wundefined-inline warnings, and a -Wnon-modular-include-in-module too. Fix all of those.

Reviewed By: Mordante, #libc

Differential Revision: https://reviews.llvm.org/D156508

(cherry picked from commit 165841b681c146ae1e013a0aa4d69ef7c7c20fe2)

13 months ago[ThinLTO] Use module hash instead of module ID for cache key
Nikita Popov [Fri, 28 Jul 2023 12:21:00 +0000 (14:21 +0200)]
[ThinLTO] Use module hash instead of module ID for cache key

This is a followup to D151165. Instead of using the module ID, use
the module hash for sorting the import list. The module hash is what
will actually be included in the hash.

This has the advantage of being independent of the module order,
which is something that Rust relies on.
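
A minimal sketch of why sorting by hash makes the key independent of module order and module ID. The `ImportedModule` struct, the `cacheKey` function and the FNV-1a style mixing are assumptions made for illustration; the real cache key computation in LLVM hashes considerably more state.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an import list entry.
struct ImportedModule {
  std::string id;               // e.g. a path; may differ between builds
  std::array<uint32_t, 5> hash; // content hash of the module
};

// Toy cache key: sort the imports by module hash, then mix only the hashes
// into the key. The module ID never influences the result.
uint64_t cacheKey(std::vector<ImportedModule> imports) {
  std::sort(imports.begin(), imports.end(),
            [](const ImportedModule &a, const ImportedModule &b) {
              return a.hash < b.hash;
            });
  uint64_t key = 0xcbf29ce484222325ULL; // FNV-1a offset basis
  for (const ImportedModule &m : imports) {
    for (uint32_t word : m.hash) {
      key ^= word;
      key *= 0x100000001b3ULL; // FNV-1a prime
    }
  }
  return key;
}
```

With this scheme, passing the same modules in a different order, or under different IDs, produces the same key as long as their contents (and therefore hashes) are unchanged.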

A caveat here is that the test doesn't quite work for linkonce_odr
functions, because the function may be imported from two different
modules, and the first one on the llvm-lto2 command line gets picked
(rather than, say, the prevailing copy). This doesn't really matter
for Rust's purposes (because it does not use linkonce_odr linkage),
but may still be worth addressing. For now I'm using a variant of
the test using internal instead of linkonce_odr functions.

Differential Revision: https://reviews.llvm.org/D156525

(cherry picked from commit 279c2971951c2ea58a2bd1e6687ce61451f9d329)

13 months ago[Clang][RISCV] Remove RVV intrinsics `vread_csr`,`vwrite_csr`
eopXD [Wed, 26 Jul 2023 12:16:23 +0000 (05:16 -0700)]
[Clang][RISCV] Remove RVV intrinsics `vread_csr`,`vwrite_csr`

As proposed in riscv-non-isa/rvv-intrinsic-doc#249, removing the interface.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D156321

13 months ago[OpenMP] Do not always emit unused extern variables
Joseph Huber [Wed, 26 Jul 2023 21:02:08 +0000 (16:02 -0500)]
[OpenMP] Do not always emit unused extern variables

Currently, the presence of the OpenMP target declare metadata requires
that we always codegen a global declaration. This is undesirable in the
case that we could defer or omit this declaration as is common with
unused extern variables. This is important as it allows us, in the
runtime, to rely on static linking semantics to omit unused symbols so
they are not included when the user links it in.

This patch changes the check for always emitting these variables.
Because of this we also need to extend this logic to the generation of
the offloading entries. This has the result of deferring the offload entry
generation to the canonical definition. So we are effectively assuming
whoever owns the storage for this variable will perform that operation.
This makes an exception for `link` attributes as those require their own
special handling.

Let me know if this is sound in the implementation, I do not have the
largest view of the standards here.

Fixes: https://github.com/llvm/llvm-project/issues/64133

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D156368

(cherry picked from commit 141c4e7a9403fed46d84c7f0429295bd28c89368)

13 months ago[libunwind] Fix build with -Wunused-function
Shoaib Meenai [Thu, 27 Jul 2023 23:55:26 +0000 (16:55 -0700)]
[libunwind] Fix build with -Wunused-function

https://reviews.llvm.org/D144252 removed -Wno-unused-function from the
libunwind build, but we have an unused function when you're building for
armv7 without assertions. Mark that function as possibly unused to avoid
the warning, and mark the parameter as a const pointer while I'm here to
make it clear that nothing is modified by a debugging function.
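
A small sketch of the pattern, assuming a C++17 compiler. The helper `isAligned4` and the function `loadWord` are hypothetical and are not the libunwind code; the point is only that a helper referenced solely from `assert()` becomes unused once assertions are compiled out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical debug helper: it is only referenced from an assert(), so a
// build with NDEBUG would otherwise warn under -Wunused-function.
// Taking a const pointer makes it clear that nothing is modified.
[[maybe_unused]] static bool isAligned4(const uint32_t *p) {
  return (reinterpret_cast<uintptr_t>(p) & 3u) == 0;
}

uint32_t loadWord(const uint32_t *p) {
  assert(isAligned4(p) && "unaligned data");
  return *p;
}
```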

Reviewed By: #libunwind, philnik

Differential Revision: https://reviews.llvm.org/D156496

(cherry picked from commit 3da76c2116179fdb3fff8feb4551209e4218746e)

13 months agoAdd release note for exact dynamic_cast optimization.
Richard Smith [Thu, 27 Jul 2023 19:50:00 +0000 (12:50 -0700)]
Add release note for exact dynamic_cast optimization.

13 months agoAdd release note for assumes now recognizing class-like FP tests
Matt Arsenault [Tue, 25 Jul 2023 12:42:51 +0000 (08:42 -0400)]
Add release note for assumes now recognizing class-like FP tests

13 months agoAMDGPU: Add some release notes
Matt Arsenault [Tue, 25 Jul 2023 12:01:47 +0000 (08:01 -0400)]
AMDGPU: Add some release notes

13 months ago[hexagon] restore library path arguments
Brian Cain [Wed, 26 Jul 2023 13:24:30 +0000 (06:24 -0700)]
[hexagon] restore library path arguments

Before applying this fix, clang would not include the specified library
path arguments:

    $ ./bin/clang --target=hexagon-unknown-linux-musl  -o tprog tprog.o -L/tmp -###
    ...
    clang: warning: argument unused during compilation: '-L/tmp' [-Wunused-command-line-argument]
     "/local/mnt/workspace/install/clang-latest/bin/ld.lld" "-z" "relro" "-o" "tprog" "-dynamic-linker=/lib/ld-musl-hexagon.so.1" "/usr/lib/crt1.o" "-L/usr/lib" "tprog.o" "-lclang_rt.builtins-hexagon" "-lc"

Differential Revision: https://reviews.llvm.org/D156330

(cherry picked from commit 96832a6bf7e0e7f1e8d634d38c44a1b32d512923)

13 months ago[libc++][Modules] Recreate the top level `std` clang module
Ian Anderson [Mon, 24 Jul 2023 22:35:00 +0000 (15:35 -0700)]
[libc++][Modules] Recreate the top level `std` clang module

lldb needs the `std` clang module to make all of libc++ available in the debugger. Make a new header to include the rest of the public headers and use it to build a `std` module that just re-exports the rest of libc++.

Reviewed By: Mordante, JDevlieghere, #libc

Differential Revision: https://reviews.llvm.org/D156177

(cherry picked from commit a800485a2deda0807cb9dc212b7d42ac916055fd)

13 months ago[CMake] Use `LLVM_ENABLE_ASSERTIONS` to enable the hardened mode in libc++.
Konstantin Varlamov [Thu, 27 Jul 2023 06:09:15 +0000 (23:09 -0700)]
[CMake] Use `LLVM_ENABLE_ASSERTIONS` to enable the hardened mode in libc++.

Use the new libc++ hardened mode instead of the deprecated safe mode.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D156377

(cherry picked from commit 194e2ba1250c97926ed83b1ade1fbcbb49112a05)

13 months ago[Clang][RISCV] Bump rvv intrinsics version to v0.12
eopXD [Thu, 27 Jul 2023 05:38:53 +0000 (22:38 -0700)]
[Clang][RISCV] Bump rvv intrinsics version to v0.12

LLVM now supports v0.12 of the RVV intrinsics. Users can use the macro
`__riscv_v_intrinsic` to determine which version of the intrinsics is
supported by the compiler.

Please refer to tag descriptions under

https://github.com/riscv-non-isa/rvv-intrinsic-doc/tags
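
A hedged sketch of how user code might test the intrinsics version. It assumes the macro is spelled `__riscv_v_intrinsic` and that its value encodes the version as major * 1000000 + minor * 1000, as described in the rvv-intrinsic-doc repository; the helper names are made up for this example.

```cpp
#include <cassert>

// Assumed encoding of an intrinsics version (major * 1000000 + minor * 1000).
long rvvIntrinsicVersion(int major, int minor) {
  return major * 1000000L + minor * 1000L;
}

// Returns true if the compiler advertises at least the given version.
// On targets without the macro this is always false.
bool hasRvvIntrinsicsAtLeast(int major, int minor) {
#if defined(__riscv_v_intrinsic)
  return __riscv_v_intrinsic >= rvvIntrinsicVersion(major, minor);
#else
  (void)major;
  (void)minor;
  return false;
#endif
}
```

Under that assumed encoding, v0.12 corresponds to the value 12000.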

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D156394

(cherry picked from commit 20e87e2f794173deebd1cf8c86684452bb0c989b)

13 months ago[Driver] Link shared asan runtime lib with -z now on Solaris/x86
Rainer Orth [Thu, 27 Jul 2023 09:32:48 +0000 (11:32 +0200)]
[Driver] Link shared asan runtime lib with -z now on Solaris/x86

As detailed in Issue #64126, several asan tests `FAIL` due to a cycle in
`AsanInitInternal`.  This can be avoided by disabling lazy binding with `ld
-z now`.

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D156325

(cherry picked from commit 6b5149aa442efc10afa00e8864e58a24a9cf5c9f)

13 months ago[AMDGPU] Fix PromoteAlloca Subvector Stores for Single Elements
pvanhout [Wed, 26 Jul 2023 10:26:13 +0000 (12:26 +0200)]
[AMDGPU] Fix PromoteAlloca Subvector Stores for Single Elements

The previous condition was incorrect in some cases, like storing <2 x i32>
into a double. If IndexVal was >0, we ended up never storing anything.

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D156308

(cherry picked from commit a8aabba5872aeaa57fbc71fdfde025d70d11deb0)

13 months ago[AMDGPU] Precommit tests for D156308
pvanhout [Wed, 26 Jul 2023 10:28:18 +0000 (12:28 +0200)]
[AMDGPU] Precommit tests for D156308

Also includes another testcase that's unrelated, it's just a sanity check.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D156309

(cherry picked from commit 6a767fbc36a37a8731a313b47208069b708dccf5)

13 months agoAMDGPU: Always custom lower extract_subvector
Matt Arsenault [Thu, 27 Jul 2023 12:10:57 +0000 (08:10 -0400)]
AMDGPU: Always custom lower extract_subvector

The patterns were ripped out in
a4a3ac10cb1a40ccebed4e81cd7e94f1eb71602d so this always needs to be
custom lowered. I absolutely hate how difficult it is to write tests
for these, I have no doubt there are more of these hidden.

Fixes #64142

(cherry picked from commit 95e5a461f52f9046bc7a06d70812b2bec509a432)

13 months agoFor #64088: mark vtable as used if we might emit a reference to it.
Richard Smith [Tue, 25 Jul 2023 21:41:10 +0000 (14:41 -0700)]
For #64088: mark vtable as used if we might emit a reference to it.

(cherry picked from commit b6847edfc235829b37dd6d734ef5bbfa0a58b6fc)

13 months ago[lld][ELF][test] Fix excessive output file size in loongarch-add-sub.s
WANG Xuerui [Wed, 26 Jul 2023 14:16:49 +0000 (22:16 +0800)]
[lld][ELF][test] Fix excessive output file size in loongarch-add-sub.s

Initially the .rodata section came before .text, hence sharing its
segment with the program header sitting at a small offset, pushing the
output file size to ~72GiB (the file was sparse though, so not much is
really written). This breaks on 32-bit platforms and is irrelevant to
the feature being tested, so re-order the two sections so .text gets
processed first, and both sections get their own segment.

This addresses the issue found by the clang-armv8-lld-2stage builder:
    https://lab.llvm.org/buildbot/#/builders/178/builds/5340

Reviewed By: SixWeining, xry111

Differential Revision: https://reviews.llvm.org/D156293

(cherry picked from commit ffe2b6f75de55b665520669059c3d95240482d54)

13 months ago[clangd] Revert the symbol collector behavior to old pre-include-cleaner-library...
Viktoriia Bakalova [Thu, 27 Jul 2023 08:43:54 +0000 (08:43 +0000)]
[clangd] Revert the symbol collector behavior to old pre-include-cleaner-library behavior due to a regression.

Differential Revision: https://reviews.llvm.org/D156403

(cherry picked from commit 3c6a7b0045afe9a230346e476bf07f88c145fdb5)

13 months ago[AArch64] Correct the regtype of indexed fmlal
David Green [Thu, 27 Jul 2023 07:27:03 +0000 (08:27 +0100)]
[AArch64] Correct the regtype of indexed fmlal

The indexed fmlal should use a low numbered register for the index operand,
which this fixes by making it V128_lo.

Fixes 64104

Differential Revision: https://reviews.llvm.org/D156296

(cherry picked from commit 509cb334699a2360f2d87f184bc0f56f742c6fc3)

13 months ago[AArch64] Add test showing incorrect register usage of FMLAL. NFC
David Green [Thu, 27 Jul 2023 06:39:10 +0000 (07:39 +0100)]
[AArch64] Add test showing incorrect register usage of FMLAL. NFC

See D156296

(cherry picked from commit e012c5cfac8542eb8164bab9891ea9b355e73517)

13 months ago[Support] Remove llvm::is_trivially_{copy/move}_constructible
Fangrui Song [Wed, 26 Jul 2023 00:21:16 +0000 (17:21 -0700)]
[Support] Remove llvm::is_trivially_{copy/move}_constructible

This restores D132311, which was reverted in
29c841ce93e087fa4e0c5f3abae94edd460bc24a (Sep 2022) due to certain files
not buildable with GCC 7.3.0. The previous attempt was reverted by
6cd9608fb37ca2418fb44b57ec955bb5efe10689 (Dec 2020).

This time, GCC 7.3.0 has already had build errors for a long time in many
files due to structured bindings, e.g.

```
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9098:13: error: cannot decompose class type ‘std::pair<llvm::Value*, const llvm::SCEV*>’: both it and its base class ‘std::pair<llvm::Value*, const llvm::SCEV*>’ have non-static data members
   for (auto [_, Stride] : Legal->getLAI()->getSymbolicStrides()) {
             ^~~~~~~~~~~
```

... and also some `error: duplicate initialization of` instances due to llvm/Transforms/IPO/Attributor.h.

---

GCC 7.5.0 has a bug that, without this change, certain `SmallVector` with a `std::pair` element type like `SmallVector<std::pair<Instruction * const, Info>, 0> X;` lead to spurious

```
/tmp/opt/gcc-7.5.0/include/c++/7.5.0/type_traits:878:48: error: constructor required before non-static data member for ‘...’ has been parsed
```

Switching to std::is_trivially_{copy/move}_constructible fixes the error.
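
For illustration only, a sketch of a trait check built on the standard type traits. The helper `canRelocateWithMemcpy` is hypothetical and loosely modelled on how a small-vector implementation might select a memcpy-based fast path; it is not the SmallVector source.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper: true when elements of type T can be copied, moved and
// destroyed trivially, so a container could relocate them with memcpy.
template <typename T>
constexpr bool canRelocateWithMemcpy() {
  return std::is_trivially_copy_constructible<T>::value &&
         std::is_trivially_move_constructible<T>::value &&
         std::is_trivially_destructible<T>::value;
}

static_assert(canRelocateWithMemcpy<int>(), "int is trivial");
static_assert(!canRelocateWithMemcpy<std::string>(), "std::string is not");
```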

(cherry picked from commit 6a684dbc4433a33e5f94fb15c9e378a2408021e0)

13 months agoHIP: Fix broken version check for deprecated macro
Matt Arsenault [Tue, 25 Jul 2023 12:20:16 +0000 (08:20 -0400)]
HIP: Fix broken version check for deprecated macro

Remove test hack that was accidentally pushed.

(cherry picked from commit 73105a54725ec11165dd8c90ca3b7a0b1b9cd6e3)

13 months agoRevert "[FuncSpec] Add Phi nodes to the InstCostVisitor."
Alexandros Lamprineas [Wed, 26 Jul 2023 18:09:35 +0000 (19:09 +0100)]
Revert "[FuncSpec] Add Phi nodes to the InstCostVisitor."

This reverts commit 03f1d09fe484f6c924434bc9c888e022b3514455
because of a crash reported on https://reviews.llvm.org/D154852

13 months ago[libc++][mdspan] Fix uglification, categorize asserts and move tests
Christian Trott [Tue, 25 Jul 2023 18:25:17 +0000 (12:25 -0600)]
[libc++][mdspan] Fix uglification, categorize asserts and move tests

Fixes uglification in mdspan deduction guides, which CI
did not test for until recently. The CI modification
and mdspan testing overlapped, so mdspan landed with green
CI, and the CI modification landed too.

Make most assertions in mdspan and its helper classes
trigger during a hardened build in order to catch
out of bounds access errors.

Also moves all mdspan assertions tests from libcxx/test/std
to libcxx/test/libcxx.

Differential Revision: https://reviews.llvm.org/156181

13 months ago[libc++][mdspan] Implement std::mdspan class
Christian Trott [Tue, 25 Jul 2023 04:35:15 +0000 (22:35 -0600)]
[libc++][mdspan] Implement std::mdspan class

This implements P0009 std::mdspan (https://wg21.link/p0009),
a multidimensional span with customization points for
layouts and data access.

Co-authored-by: Damien L-G <dalg24@gmail.com>
Differential Revision: https://reviews.llvm.org/154367

13 months ago[lldb] Treat ARM64X images as ARM64.
Jacek Caban [Tue, 25 Jul 2023 22:09:34 +0000 (00:09 +0200)]
[lldb] Treat ARM64X images as ARM64.

With D149091, ARM64X binaries are no longer reported as ARM64. This broke
lldb tests as Windows 11 system DLLs are mostly ARM64X binaries and lldb
doesn't know how to handle them. Ideally lldb would understand a bit more
about ARM64X and handle them as AMD64 in x64 processes, but this is
enough to preserve previous behavior and fix tests.

Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D156268

(cherry picked from commit 48feef277a24b1b9c0ff33267a91e70d9584012e)

13 months ago[XCOFF] Enable available_externally linkage for functions.
esmeyi [Wed, 26 Jul 2023 02:47:11 +0000 (22:47 -0400)]
[XCOFF] Enable available_externally linkage for functions.

Summary: D80642 added support for emitting AvailableExternally linkage on AIX. However, the assertion "Trying to get csect representation of this symbol but none was set." fires when a function is declared as available_externally, because no csect is generated for the function. This patch fixes that.

Reviewed By: hubert.reinterpretcast, shchenz

Differential Revision: https://reviews.llvm.org/D156213

Signed-off-by: Esme Yi <esme.yi@ibm.com>
(cherry picked from commit e83b8a5e711a663c44e80965da5c747e08dea497)

13 months ago[OpenMP] [OMPT] [7/8] Invoke tool-supplied callbacks before and after target launch...
Michael Halkenhaeuser [Tue, 25 Jul 2023 12:14:59 +0000 (08:14 -0400)]
[OpenMP] [OMPT] [7/8] Invoke tool-supplied callbacks before and after target launch and data transfer operations

Implemented RAII objects, initialized at target entry points, that
invoke tool-supplied callbacks. Updated status of target callbacks as
implemented.
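
The RAII idea can be sketched roughly as follows. `Endpoint`, `ToolCallback`, `CallbackScope` and `targetEntryPoint` are hypothetical names chosen for this example; they are not the OMPT callback signatures or the libomptarget classes.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical callback interface for a tool.
enum class Endpoint { Begin, End };
using ToolCallback = std::function<void(Endpoint)>;

// Invokes the tool callback on construction and again on destruction, so the
// "end" notification fires on every exit path of the enclosing scope.
class CallbackScope {
public:
  explicit CallbackScope(ToolCallback cb) : cb_(std::move(cb)) {
    if (cb_)
      cb_(Endpoint::Begin);
  }
  ~CallbackScope() {
    if (cb_)
      cb_(Endpoint::End);
  }
  CallbackScope(const CallbackScope &) = delete;
  CallbackScope &operator=(const CallbackScope &) = delete;

private:
  ToolCallback cb_;
};

// Stand-in for a target entry point; the "launch" just returns its argument.
int targetEntryPoint(const ToolCallback &tool, int numTeams) {
  assert(numTeams > 0);
  CallbackScope scope(tool); // Begin callback fires here
  return numTeams;           // End callback fires when `scope` is destroyed
}
```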

Depends on D127365

Patch from John Mellor-Crummey <johnmc@rice.edu>
With contributions from:
Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>
Jan-Patrick Lehr <janpatrick.lehr@amd.com>

Reviewed By: jdoerfert, dhruvachak, jplehr

Differential Revision: https://reviews.llvm.org/D127367

(cherry picked from commit 1dec417ac4a533e40f637cd1a7f0628803d9e634)

13 months agoReland "[LoongArch] Support -march=native and -mtune="
Weining Lu [Wed, 26 Jul 2023 01:56:49 +0000 (09:56 +0800)]
Reland "[LoongArch] Support -march=native and -mtune="

As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture and defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.

D136146 added support for `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.

A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.

`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.

And these two preprocessor macros are defined:
- __loongarch_arch
- __loongarch_tune
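
A minimal sketch of how these macros might be consumed. It assumes both macros expand to string literals naming the selected processor model, as the toolchain conventions in [1] describe; the function name is made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports the -march/-mtune selection baked into the
// translation unit, or a fallback string on other targets.
std::string loongArchTargetInfo() {
#if defined(__loongarch_arch) && defined(__loongarch_tune)
  // Assumption: both macros are string literals, e.g. "la464".
  return std::string("arch=") + __loongarch_arch + " tune=" + __loongarch_tune;
#else
  return "not LoongArch (or macros unavailable)";
#endif
}
```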

[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc

Reviewed By: xen0n, wangleiat

Differential Revision: https://reviews.llvm.org/D155824

13 months ago[Clang] use unsigned integer constants in unit-test | fixes build error on ppc64le...
Kai Stierand [Tue, 25 Jul 2023 11:47:46 +0000 (13:47 +0200)]
[Clang] use unsigned integer constants in unit-test | fixes build error on ppc64le-lld-multistage-test

Fixes:
    /home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:1526:11: warning: comparison of integer expressions of different signedness: ‘const unsigned int’ and ‘const int’ [-Wsign-compare]
    /home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:1526:11: warning: comparison of integer expressions of different signedness: ‘const long unsigned int’ and ‘const int’ [-Wsign-compare]
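
A reduced sketch of the warning and the fix, without depending on googletest. `expectEq` is a hypothetical stand-in for a gtest-style comparison helper, not the real gtest code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a templated equality helper: the comparison happens
// inside the template, with the argument types exactly as deduced.
template <typename A, typename B>
bool expectEq(const A &lhs, const B &rhs) {
  return lhs == rhs;
}

bool hasThreeElements(const std::vector<int> &v) {
  // expectEq(v.size(), 3) would compare an unsigned size with 'const int'
  // inside the template and trigger -Wsign-compare. An unsigned literal
  // keeps both operands unsigned.
  return expectEq(v.size(), 3u);
}
```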

Reviewed By: cor3ntin

Differential Revision: https://reviews.llvm.org/D156224

13 months agoRevert "[OpenMP] Add the `ompx_attribute` clause for target directives"
Aaron Ballman [Tue, 25 Jul 2023 11:55:28 +0000 (07:55 -0400)]
Revert "[OpenMP] Add the `ompx_attribute` clause for target directives"

This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.

The changes broke several bots:
https://lab.llvm.org/buildbot/#/builders/176/builds/3408
https://lab.llvm.org/buildbot/#/builders/198/builds/4028
https://lab.llvm.org/buildbot/#/builders/197/builds/8491

13 months agoHIP: Directly call nearbyint builtins
Matt Arsenault [Tue, 22 Nov 2022 04:37:15 +0000 (23:37 -0500)]
HIP: Directly call nearbyint builtins

13 months agoAMDGPU: Remove trailing whitespace from documentation
Matt Arsenault [Tue, 25 Jul 2023 11:50:08 +0000 (07:50 -0400)]
AMDGPU: Remove trailing whitespace from documentation

13 months agoAMDGPU: Correctly expand f64 sqrt intrinsic
Matt Arsenault [Sun, 20 Nov 2022 16:40:25 +0000 (08:40 -0800)]
AMDGPU: Correctly expand f64 sqrt intrinsic

rocm-device-libs and llpc were avoiding using f64 sqrt
intrinsics in favor of their own expansions. Port the
expansion into the backend. Both of these users should be
updated to call the intrinsic instead.

The library and llpc expansions are slightly different.
llpc uses an ldexp to do the scale; the library uses a multiply.

Use ldexp to do the scale instead of the multiply.
I believe v_ldexp_f64 and v_mul_f64 are always the same number of
cycles, but it's cheaper to materialize the 32-bit integer constant
than the 64-bit double constant.
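
A scalar sketch of scaling with ldexp, written as host C++ rather than backend code. The threshold 2^-767 and the exponents 256 and -128 are illustrative assumptions, and `std::sqrt` stands in for the instruction sequence the backend would actually emit.

```cpp
#include <cassert>
#include <cmath>

// Scales very small inputs up before taking the square root and scales the
// result back down. Since sqrt(x * 2^256) == sqrt(x) * 2^128, the result is
// corrected by 2^-128. The scale is expressed as a small integer exponent
// rather than as a 64-bit floating-point constant.
double scaledSqrt(double x) {
  const bool needsScale = x < std::ldexp(1.0, -767);
  const double scaled = std::ldexp(x, needsScale ? 256 : 0);
  const double root = std::sqrt(scaled); // stand-in for the real expansion
  return std::ldexp(root, needsScale ? -128 : 0);
}
```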

The libraries have another fast version of sqrt which will
be handled separately.

I am tempted to do this in an IR expansion instead. In the IR
we could take advantage of computeKnownFPClass to avoid
the 0-or-inf argument check.

13 months agoAMDGPU: Add more sqrt f64 lowering tests
Matt Arsenault [Tue, 20 Jun 2023 10:19:08 +0000 (06:19 -0400)]
AMDGPU: Add more sqrt f64 lowering tests

Almost all permutations of the flags are potentially relevant.

13 months agoHIP: Directly call rint builtins
Matt Arsenault [Sun, 20 Nov 2022 16:44:50 +0000 (08:44 -0800)]
HIP: Directly call rint builtins

13 months ago[Sema] Fix handling of functions that hide classes
John Brawn [Wed, 28 Jun 2023 09:31:38 +0000 (10:31 +0100)]
[Sema] Fix handling of functions that hide classes

When a function is declared in the same scope as a class with the same
name then the function hides that class. Currently this is done by a
single check after the main loop in LookupResult::resolveKind, but
this can give the wrong result when we have a using declaration in
multiple namespace scopes in two different ways:

 * When the using declaration is hidden in one namespace but not the
   other we can end up considering only the hidden one when deciding
   if the result is ambiguous, causing an incorrect "not ambiguous"
   result.

 * When two classes with the same name in different namespace scopes
   are both hidden by using declarations this can result in
   incorrectly deciding the result is ambiguous. There's currently a
   comment saying this is expected, but I don't think that's correct.

Solve this by checking each Decl to see if it's hidden by some other
Decl in the same scope. This means we have to delay removing anything
from Decls until after the main loop, in case a Decl is hidden by
another that is removed due to being non-unique.
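
For readers unfamiliar with the underlying rule, here is a minimal sketch of a function hiding a class of the same name in one scope. It only illustrates the basic hiding behaviour, not the multi-namespace using-declaration cases this patch fixes; the namespace and names are made up.

```cpp
#include <cassert>

namespace fs {
struct stat {
  int size;
};
// Declared in the same scope as the class, so ordinary lookup of `stat`
// finds this function and the class name is hidden.
inline int stat(const char *) { return 0; }
} // namespace fs

int demo() {
  struct fs::stat s{42};         // elaborated type specifier names the class
  return fs::stat("x") + s.size; // the plain name refers to the function
}
```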

Differential Revision: https://reviews.llvm.org/D154503

13 months agoAttributor: Fix typo
Matt Arsenault [Mon, 24 Jul 2023 13:34:52 +0000 (09:34 -0400)]
Attributor: Fix typo

13 months ago[FuncSpec][NFC] Leave a comment for future improvements.
Alexandros Lamprineas [Tue, 25 Jul 2023 10:09:52 +0000 (11:09 +0100)]
[FuncSpec][NFC] Leave a comment for future improvements.

Adds a TODO for checking inlining opportunities while traversing
the users of the specialization arguments. This was brought up in
the review of D154852.

13 months ago[RISCV] Remove zvk uimm constraints
4vtomat [Wed, 19 Jul 2023 02:10:18 +0000 (19:10 -0700)]
[RISCV] Remove zvk uimm constraints

Since the spec doesn't describe these behaviors as invalid,
llvm-mc should just leave them to be handled by the hardware.

Differential Revision: https://reviews.llvm.org/D155669