Chuanqi Xu [Thu, 8 Aug 2024 05:29:59 +0000 (13:29 +0800)]
[C++20] [Modules] Don't diagnose duplicated implicit decl in multiple named modules (#102423)
Close https://github.com/llvm/llvm-project/issues/102360
Close https://github.com/llvm/llvm-project/issues/102349
http://eel.is/c++draft/basic.def.odr#15.3 makes it clear that
duplicated definitions are not allowed to be attached to named modules.
But we need to filter out the implicit declarations, as the user can do
nothing about them and the diagnostic message is annoying.
(cherry picked from commit
e72d956b99e920b0fe2a7946eb3a51b9e889c73c)
PaulXiCao [Mon, 5 Aug 2024 20:08:47 +0000 (20:08 +0000)]
[libc++][math] Fix undue overflowing of `std::hypot(x,y,z)` (#100820)
This relates to PR #93350, which was merged to main but reverted
because of failing sanitizer builds on PowerPC.
The fix includes replacing the hard-coded threshold constants (e.g.
`__overflow_threshold`) for different floating-point sizes by a general
computation using `std::ldexp`. Thus, it should now work for all architectures.
This has the drawback of not being `constexpr` anymore as `std::ldexp`
is not implemented as `constexpr` (even though the standard mandates it
for C++23).
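The general idea of avoiding intermediate overflow by rescaling with a power of two can be sketched as follows. This is a simplified illustration, not the libc++ implementation: the helper name `scaled_hypot3` is hypothetical, it rescales unconditionally via `std::frexp`/`std::ldexp` instead of comparing against computed thresholds, and NaN inputs are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration; not the libc++ code.
// Computes sqrt(x*x + y*y + z*z) without overflowing in the squares.
template <class T>
T scaled_hypot3(T x, T y, T z) {
  x = std::fabs(x);
  y = std::fabs(y);
  z = std::fabs(z);
  const T largest = std::max({x, y, z});
  // Zero and infinity need no scaling. NaN inputs are not handled here.
  if (largest == T(0) || std::isinf(largest))
    return largest;
  int exp = 0;
  std::frexp(largest, &exp); // largest == f * 2^exp with f in [0.5, 1)
  // Scaling by a power of two is exact, barring underflow of tiny terms.
  x = std::ldexp(x, -exp);
  y = std::ldexp(y, -exp);
  z = std::ldexp(z, -exp);
  // All scaled values are below 1 now, so the squares cannot overflow.
  return std::ldexp(std::sqrt(x * x + y * y + z * z), exp);
}
```

With `double` arguments, `scaled_hypot3(1e300, 1e300, 1e300)` stays finite, whereas squaring the inputs directly would overflow.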
Closes #92782
(cherry picked from commit
72825fde03aab3ce9eba2635b872144d1fb6b6b2)
Mitch Phillips [Wed, 24 Jul 2024 10:58:24 +0000 (12:58 +0200)]
Revert "[libc++][math] Fix undue overflowing of `std::hypot(x,y,z)` (#93350)"
This reverts commit
9628777479a970db5d0c2d0b456dac6633864760.
More details in https://github.com/llvm/llvm-project/pull/93350, but
this broke the PowerPC sanitizer bots.
(cherry picked from commit
1031335f2ee1879737576fde3a3425ce0046e773)
Ahmed Bougacha [Fri, 9 Aug 2024 19:32:01 +0000 (12:32 -0700)]
[clang] Implement -fptrauth-auth-traps. (#102417)
This provides -fptrauth-auth-traps, which at the frontend level only
controls the addition of the "ptrauth-auth-traps" function attribute.
The attribute in turn controls various aspects of backend codegen, by
providing the guarantee that every "auth" operation generated will trap
on failure.
This can either be delegated to the hardware (if AArch64 FPAC is known
to be available), in which case this attribute doesn't change codegen.
Otherwise, if FPAC isn't available, this asks the backend to emit
additional instructions to check and trap on auth failure.
(cherry picked from commit
d179acd0484bac30c5ebbbed4d29a4734d92ac93)
Pavel Labath [Thu, 8 Aug 2024 08:53:15 +0000 (10:53 +0200)]
[lldb] Fix crash when adding members to an "incomplete" type (#102116)
This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.
This patch fixes this by detecting the situation and finishing the
definition as well.
(cherry picked from commit
57cd1000c9c93fd0e64352cfbc9fbbe5b8a8fcef)
Owen Pan [Sat, 10 Aug 2024 20:31:35 +0000 (13:31 -0700)]
[clang-format] Fix a serious bug in `git clang-format -f` (#102629)
With the --force (or -f) option, git-clang-format wipes out input files
excluded by a .clang-format-ignore file if they have unstaged changes.
This patch adds a hidden clang-format option --list-ignored that lists
such excluded files for git-clang-format to filter out.
Fixes #102459.
(cherry picked from commit
986bc3d0719af653fecb77e8cfc59f39bec148fd)
Rainer Orth [Sat, 10 Aug 2024 20:54:07 +0000 (22:54 +0200)]
[llvm-exegesis][unittests] Also disable SubprocessMemoryTest on SPARC (#102755)
Three `llvm-exegesis` tests
```
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/DefinitionFillsCompletely
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/MultipleDefinitions
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/OneDefinition
```
`FAIL` on Linux/sparc64 like
```
llvm/unittests/tools/llvm-exegesis/X86/SubprocessMemoryTest.cpp:68: Failure
Expected equality of these values:
SharedMemoryMapping[I]
Which is: '\0'
ExpectedValue[I]
Which is: '\xAA' (170)
```
It seems like this test only works on little-endian hosts: three
sub-tests are already disabled on powerpc and s390x (both big-endian),
and the fourth is additionally guarded against big-endian hosts (making
the other guards unnecessary).
However, since it has not been analyzed whether this is really an
endianness issue, this patch disables the whole test on powerpc and
s390x as before, adding sparc to the mix.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
a417083e27b155dc92b7f7271c0093aee0d7231c)
Xing Xue [Thu, 8 Aug 2024 13:15:51 +0000 (09:15 -0400)]
[NFC][libc++][test][AIX] UnXFAIL LIT test transform.pass.cpp (#102338)
Remove `XFAIL: LIBCXX-AIX-FIXME` from lit test `transform.pass.cpp` now
that AIX system call `wcsxfrm`/`wcsxfrm_l` is fixed in AIX 7.2.5.8 and
7.3.2.2 and buildbot machines have been upgraded.
Backported from commit
cb5912a71061c6558bd4293596dcacc1ce0ca2f6
Daniel Kiss [Fri, 9 Aug 2024 15:51:38 +0000 (17:51 +0200)]
[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978)
Default attributes are assigned to all functions according to the command
line parameters. Some functions might have their own attributes, and we
need to set or remove attributes accordingly.
Tests are updated to cover these scenarios too.
(cherry picked from commit
9e9fa00dcb9522db3f78d921eda6a18b9ee568bb)
Tom Stellard [Wed, 7 Aug 2024 21:19:22 +0000 (14:19 -0700)]
workflows: Fix permissions for release-sources job (#100750)
For reusable workflows, the called workflow cannot upgrade its
permissions, and since the default permission is none, we need to
explicitly declare 'contents: read' when calling the release-sources
workflow.
Fixes the error:
The workflow is requesting 'contents: read', but is only allowed
'contents: none'.
(cherry picked from commit
82c2259aeb87f5cb418decfb6a1961287055e5d2)
Sharadh Rajaraman [Tue, 6 Aug 2024 15:05:55 +0000 (16:05 +0100)]
[clang][driver][clang-cl] Support `--precompile` and `-fmodule-*` options in Clang-CL (#98761)
This PR is the first step in improving the situation for `clang-cl`
detailed in [this LLVM Discourse
thread](https://discourse.llvm.org/t/clang-cl-exe-support-for-c-modules/72257/28).
There has been some work done in #89772. I believe this is somewhat
orthogonal.
This is a work-in-progress; the functionality has only been tested with
the [basic 'Hello World'
example](https://clang.llvm.org/docs/StandardCPlusPlusModules.html#quick-start),
and proper test cases need to be written. I'd like some thoughts on
this, thanks!
Partially resolves #64118.
(cherry picked from commit
bd576fe34285c4dcd04837bf07a89a9c00e3cd5e)
Tom Stellard [Mon, 5 Aug 2024 21:40:46 +0000 (14:40 -0700)]
workflows/release-binaries-all: Pass secrets on to release-binaries workflow (#101866)
A called workflow does not have access to secrets by default, so we need
to explicitly pass any secret that we want to use.
(cherry picked from commit
1fb1a5d8e2c5a0cbaeb39ead68352e5e55752a6d)
Sirraide [Mon, 5 Aug 2024 12:02:15 +0000 (14:02 +0200)]
[Clang] Define __cpp_pack_indexing (#101956)
Following the discussion on #101448, this defines
`__cpp_pack_indexing`. Since pack indexing is currently
supported in all language modes, the feature test macro
is also defined in all language modes.
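A feature-test macro like this is typically consumed as in the sketch below. The alias name `first_t` and the `std::tuple` fallback are illustrative assumptions, not part of the commit; in pre-C++26 language modes a compiler may emit an extension warning for the pack-indexing branch.

```cpp
#include <tuple>
#include <type_traits>

#if defined(__cpp_pack_indexing)
// Pack indexing is available: pick the first type of the pack directly.
template <class... Ts>
using first_t = Ts...[0];
#else
// Fallback for compilers that do not define the macro.
template <class... Ts>
using first_t = std::tuple_element_t<0, std::tuple<Ts...>>;
#endif

static_assert(std::is_same_v<first_t<int, char, double>, int>);
```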
(cherry picked from commit
c65afad9c58474a784633314e945c874ed06584a)
Alex Langford [Fri, 9 Aug 2024 19:50:42 +0000 (12:50 -0700)]
[lldb] Move definition of SBSaveCoreOptions dtor out of header (#102539)
This class is technically not usable in its current state. When you use
it in a simple C++ project, your compiler will complain about an
incomplete definition of SaveCoreOptions. Normally this isn't a problem;
other classes in the SBAPI do this. The difference is that
SBSaveCoreOptions has a default destructor in the header, so the
compiler will attempt to generate the code for the destructor with an
incomplete definition of the impl type.
All methods for every class, including constructors and destructors,
must have a separate implementation not in a header.
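The underlying C++ rule can be sketched in a single file. The class names `SketchOptions` and `SketchOptionsImpl` are hypothetical, and the use of `std::unique_ptr` for the opaque pointer is an assumption made for illustration. The point is that the destructor is only declared where the impl type is incomplete and is defined where the type is complete.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- What would live in the public header ----
class SketchOptionsImpl; // deliberately incomplete here

class SketchOptions {
public:
  SketchOptions();
  // Declared only. Writing "= default" here would make every user of the
  // header instantiate the unique_ptr deleter for an incomplete type.
  ~SketchOptions();
  void SetPluginName(std::string name);
  const std::string &GetPluginName() const;

private:
  std::unique_ptr<SketchOptionsImpl> m_impl;
};

// ---- What would live in the implementation file ----
class SketchOptionsImpl {
public:
  std::string plugin_name;
};

SketchOptions::SketchOptions() : m_impl(std::make_unique<SketchOptionsImpl>()) {}
SketchOptions::~SketchOptions() = default; // impl type is complete here

void SketchOptions::SetPluginName(std::string name) {
  assert(m_impl && "impl is created in the constructor");
  m_impl->plugin_name = std::move(name);
}

const std::string &SketchOptions::GetPluginName() const {
  return m_impl->plugin_name;
}
```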
(cherry picked from commit
101cf540e698529d3dd899d00111bcb654a3c12b)
Ahmed Bougacha [Fri, 9 Aug 2024 18:49:50 +0000 (11:49 -0700)]
[clang] Wire -fptrauth-returns to "ptrauth-returns" fn attribute. (#102416)
We already ended up with -fptrauth-returns, the feature macro, the lang
opt, and the actual backend lowering.
The only part left is threading it all through PointerAuthOptions, to
drive the addition of the "ptrauth-returns" attribute to generated
functions.
While there, do minor cleanup on ptrauth-function-attributes.c.
This also adds ptrauth_key_return_address to ptrauth.h.
(cherry picked from commit
2eb6e30fe83ccce3cf01e596e73fa6385facd44b)
David Green [Fri, 9 Aug 2024 13:25:07 +0000 (14:25 +0100)]
[AArch64] Add invalid 1 x vscale costs for reductions and reduction-operations. (#102105)
The code-generator is currently not able to handle scalable vectors of
<vscale x 1 x eltty>. The usual "fix" for this until it is supported is
to mark the costs of loads/stores with an invalid cost, preventing the
vectorizer from vectorizing at those factors. But on rare occasions
loops do not contain load/stores, only reductions.
So whilst this is still unsupported, return an invalid cost to avoid
selecting vscale x 1 VFs. The cost of a reduction is not currently used
by the vectorizer so this adds the cost to the add/mul/and/or/xor or
min/max that should feed the reduction. It includes reduction costs
too, for completeness. This change will be removed when code-generation
for these types is sufficiently reliable.
Fixes #99760
(cherry picked from commit
0b745a10843fc85e579bbf459f78b3f43e7ab309)
Fangrui Song [Wed, 7 Aug 2024 19:23:28 +0000 (12:23 -0700)]
Revert "demangle function names in trace files (#87626)"
This reverts commit
0fa20c55b58deb94090985a5c5ffda4d5ceb3cd1.
Storing raw symbol names is generally preferred in profile files.
Demangling might lose information. Language frontends might use
demangling schemes not supported by LLVMDemangle
(https://github.com/llvm/llvm-project/issues/45901#issuecomment-2008686663).
In addition, calling `demangle` for each function has a significant
performance overhead (#102222).
I believe that even if we decide to provide a producer-side demangling,
it would not be on by default.
Pull Request: https://github.com/llvm/llvm-project/pull/102274
(cherry picked from commit
72b73e23b6c36537db730ebea00f92798108a6e5)
Fangrui Song [Thu, 8 Aug 2024 19:02:44 +0000 (12:02 -0700)]
[ELF] scanRelocations: support .crel.eh_frame
Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls
sortRels, which doesn't support the CREL iterator. We should set
supportsCrel to false to ensure that the initial_location fields in
.eh_frame FDEs are relocated.
(cherry picked from commit
a821fee312d15941174827a70cb534c2f2fe1177)
Fangrui Song [Thu, 8 Aug 2024 07:57:43 +0000 (00:57 -0700)]
[ELF] .llvm.call-graph-profile: support CREL
https://reviews.llvm.org/D105217 added RELA support. This patch adds
CREL support.
(cherry picked from commit
0766a59be3256e83a454a089f01215d6c7f94a48)
David Tenty [Thu, 8 Aug 2024 15:16:18 +0000 (11:16 -0400)]
[NFC][llvm][support] rename INFINITY in regcomp (#101758)
Since C23 this macro is defined by float.h, which clang implements in
its float.h since #96659 landed.
However, regcomp.c in LLVMSupport happened to define its own macro with
that name, leading to problems when bootstrapping. This change renames
the offending macro.
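The kind of clash described here, and the rename that avoids it, can be sketched as follows. The names `kSketchDupMax`, `SKETCH_REGEX_INFINITY`, and `repeat_upper_bound` are hypothetical; the source does not state the new macro name, and the commented-out definition is only an example of a conflicting one.

```cpp
#include <cmath> // brings in the standard INFINITY macro

// Redefining INFINITY at this point would clash with the standard macro,
// for example:
//   #define INFINITY (SOME_LIMIT + 1)
// so the sketch uses a distinct, hypothetical name instead.
constexpr int kSketchDupMax = 255;                // hypothetical bound
#define SKETCH_REGEX_INFINITY (kSketchDupMax + 1) // hypothetical name

// Returns the repetition bound for a regex quantifier.
constexpr int repeat_upper_bound(bool unbounded, int requested) {
  return unbounded ? SKETCH_REGEX_INFINITY : requested;
}
```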
(cherry picked from commit
899f648866affd011baae627752ba15baabc2ef9)
Antonio Frighetto [Thu, 25 Jul 2024 07:18:20 +0000 (09:18 +0200)]
[TBAA] Do not rewrite TBAA if exists, always null out `!tbaa.struct`
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess`
unless it already exists, as struct-path aware `MDNodes` emitted
via `new-struct-path-tbaa` may be leveraged. As `!tbaa.struct`
carries memcpy padding semantics among struct fields and `!tbaa`
is already meant to aid alias semantics, it should be possible
to zero out `!tbaa.struct` once the memcpy has been simplified.
`SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has
already replaced `!tbaa.struct` in SROA.
Fixes: https://github.com/llvm/llvm-project/issues/95661.
Muhammad Omair Javaid [Thu, 25 Jul 2024 07:21:16 +0000 (12:21 +0500)]
Revert "[LLVM] Silence compiler-rt warning in runtimes build (#99525)"
This patch broke LLVM Flang build on Windows. PR #100202
This reverts commit
f6f88f4b99638821af803d1911ab6a7dac04880b.
(cherry picked from commit
73d862e478738675f5d919c6a196429acd7b5f50)
Mirko Brkušanin [Thu, 25 Jul 2024 16:19:26 +0000 (18:19 +0200)]
[AMDGPU] Fix folding clamp into pseudo scalar instructions (#100568)
Clamp is canonically a v_max* instruction with a VGPR dst. Folding clamp
into a pseudo scalar instruction can cause issues due to a change in
regbank. We fix this with a copy.
(cherry picked from commit
817cd726454f01e990cd84e5e1d339b120b5ebaa)
Mariya Podchishchaeva [Thu, 8 Aug 2024 06:51:56 +0000 (08:51 +0200)]
[clang] Fix crash when #embed used in a compound literal (#102304)
Fixes https://github.com/llvm/llvm-project/issues/102248
(cherry picked from commit
3606d69d0b57dc1d23a4362e376e7ad27f650c27)
Owen Pan [Thu, 8 Aug 2024 04:05:42 +0000 (21:05 -0700)]
[clang-format] Fix a bug in annotating CastRParen (#102261)
Fixes #102102.
(cherry picked from commit
8c7a038f9029c675f2a52ff5e85f7b6005ec7b3e)
Chen Zheng [Tue, 6 Aug 2024 03:07:45 +0000 (11:07 +0800)]
[AIX]export function descriptor symbols related to template functions. (#101920)
This fixes regressions caused by
https://github.com/llvm/llvm-project/pull/97526
After that patch, all undefined references to DS symbols are removed.
This leaves DS symbols (for template functions) with no references in
some cases, so extract_symbols.py does not export these DS symbols in
those cases.
On AIX, exporting the function descriptor depends on references to the
function descriptor itself and the function entry symbol.
Without this fix, on AIX, we get:
```
rtld: 0712-001 Symbol _ZN4llvm15SmallVectorBaseIjE13mallocForGrowEPvmmRm was referenced
from module llvm-project/build/unittests/Passes/Plugins/TestPlugin.so(), but a runtime definition
of the symbol was not found.
```
(cherry picked from commit
396343f17b1182ff8ed698beac3f9b93b1d9dabd)
Kazu Hirata [Wed, 7 Aug 2024 20:10:31 +0000 (13:10 -0700)]
[Driver] Fix a warning
This patch fixes:
clang/lib/Driver/ToolChains/Darwin.cpp:2937:3: error: default label
in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
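For context, this diagnostic fires when a `switch` already has a `case` for every enumerator and still carries a `default:` label. One common way to resolve it is sketched below; the enum, function name, and returned values are hypothetical, and the source does not show the change actually made in Darwin.cpp.

```cpp
enum class SketchPlatform { MacOS, IPhoneOS, TvOS }; // hypothetical enum

// Returns an arbitrary illustrative number per platform.
constexpr int sketch_min_major_version(SketchPlatform p) {
  switch (p) {
  case SketchPlatform::MacOS:
    return 10;
  case SketchPlatform::IPhoneOS:
    return 7;
  case SketchPlatform::TvOS:
    return 9;
    // No `default:` label, so the compiler can warn when a new enumerator
    // is added without a matching case.
  }
  return 0; // Only reachable for out-of-range enum values.
}
```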
(cherry picked from commit
0f1361baf650641a59aaa1710d7a0b7b02f2e56d)
Ian Anderson [Wed, 7 Aug 2024 17:14:58 +0000 (10:14 -0700)]
[clang][modules] Enable built-in modules for the upcoming Apple releases (#102239)
The upcoming Apple SDK releases will support the clang built-in headers
being in the clang built-in modules: stop passing
-fbuiltin-headers-in-system-modules for those SDK versions.
(cherry picked from commit
961639962251de7428c3fe93fa17cfa6ab3c561a)
Lucas Duarte Prates [Wed, 7 Aug 2024 14:15:25 +0000 (15:15 +0100)]
[AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139)
This change updates the AArch64DeadRegisterDefinition pass to ensure it
does not replace the destination register of a SWP instruction with the
zero register when its value is unused. This is necessary to ensure that
the ordering of such instructions in relation to DMB.LD barriers adheres
to the definitions of the AArch64 Memory Model.
The memory model states the following (ARMARM version DDI 0487K.a
§B2.3.7):
```
Barrier-ordered-before
An effect E1 is Barrier-ordered-before an effect E2 if one of the following applies:
[...]
* All of the following apply:
- E1 is a Memory Read effect.
- E1 is generated by an instruction whose destination register is not WZR or XZR.
- E1 appears in program order before E3.
- E3 is either a DMB LD effect or a DSB LD effect.
- E3 appears in program order before E2.
```
Prior to this change, by replacing the destination register of such SWP
instruction with WZR/XZR, the ordering relation described above was
incorrectly removed from the generated code.
The new behaviour is ensured in this patch by adding the relevant
`SWP[L](B|H|W|X)` instructions to the list in the `atomicReadDroppedOnZero`
predicate, which already covered the `LD<Op>` instructions that are
subject to the same effect.
Fixes #68428.
(cherry picked from commit
beb37e2e22b549b361be7269a52a3715649e956a)
sinan [Wed, 7 Aug 2024 10:02:42 +0000 (18:02 +0800)]
[BOLT] Skip PLT search for zero-value weak reference symbols (#69136)
Take a common weak reference pattern for example
```
__attribute__((weak)) void undef_weak_fun();
if (&undef_weak_fun)
  undef_weak_fun();
```
In this case, an undefined weak symbol `undef_weak_fun` has an address
of zero, and BOLT incorrectly changes the relocation for the
corresponding symbol to symbol@PLT, leading to incorrect runtime
behavior.
(cherry picked from commit
6c8933e1a095028d648a5a26aecee0f569304dd0)
Oliver Stannard [Wed, 7 Aug 2024 09:20:26 +0000 (10:20 +0100)]
[lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985)
Previously, we selected the Thumb2 PLT sequences if any input object is
marked as not supporting the ARM ISA, which then causes assertion
failures when calls from ARM code in other objects are seen. I think the
intention here was to only use Thumb PLTs when the target does not have
the ARM ISA available, signalled by no objects being marked as having it
available. To do that we need to track which ISAs we have seen as we
parse the build attributes, and defer the decision about PLTs until all
input objects have been parsed.
This bug was triggered by real code in picolibc, which has some
versions of string.h functions built with Thumb2-only build attributes,
so that they are compatible with v7-A, v7-R and v7-M.
Fixes #99008.
(cherry picked from commit
a1c6467bd90905d52cf8f6162b60907f8e98a704)
sinan [Wed, 7 Aug 2024 07:57:25 +0000 (15:57 +0800)]
[BOLT] Support map other function entry address (#101466)
Allow BOLT to map the old address to a new binary address if the old
address is the entry of the function.
(cherry picked from commit
734c0488b6e69300adaf568f880f40b113ae02ca)
Dimitry Andric [Tue, 23 Jul 2024 17:02:36 +0000 (19:02 +0200)]
[CalcSpillWeights] Avoid x87 excess precision influencing weight result
Fixes #99396
The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by
x87 excess precision, which can result in slightly different register
choices when the compiler is hosted on x86_64 or i386. This leads to
different object file output when cross-compiling to i386, or native.
Similar to 7af3432e22b0, we need to add a `volatile` qualifier to the
local `Weight` variable to force it onto the stack, and avoid the excess
precision. Define `stack_float_t` in `MathExtras.h` for this purpose,
and use it.
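The effect of the `volatile` qualifier can be sketched as below. The alias name `sketch_stack_float_t` and the function are hypothetical and only mirror the idea; the real `stack_float_t` definition in `MathExtras.h` is not shown in the source and may be conditional on the host.

```cpp
#include <cassert>

// Hypothetical alias for illustration, modelled on the `stack_float_t`
// idea named in the commit message.
using sketch_stack_float_t = volatile float;

float sketch_accumulate_weight(float base, float scale, float bonus) {
  // Every assignment to a volatile object is a real store, so the value is
  // rounded to `float` instead of living on in a wider x87 register.
  sketch_stack_float_t weight = base * scale;
  weight = weight + bonus;
  return weight;
}
```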
(cherry picked from commit
c80c09f3e380a0a2b00b36bebf72f43271a564c1)
Florian Hahn [Fri, 26 Jul 2024 12:10:16 +0000 (13:10 +0100)]
[LAA] Refine stride checks for SCEVs during dependence analysis. (#99577)
Update getDependenceDistanceStrideAndSize to reason about different
combinations of strides directly and explicitly.
Update getPtrStride to return 0 for invariant pointers.
Then proceed by checking the strides.
If either source or sink is not strided by a constant (i.e. not a
non-wrapping AddRec) or invariant, the accesses may overlap
with earlier or later iterations and we cannot generate runtime
checks to disambiguate them.
Otherwise they are either loop invariant or strided. In that case, we
can generate a runtime check to disambiguate them.
If both are strided by constants, we proceed as previously.
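One reading of this case split is sketched below. It is not the LoopAccessAnalysis code: the enum, its enumerator names, and `classify_strides` are hypothetical, and the sketch assumes a stride is modelled as `std::optional<long>` where an empty value means "neither constant-strided nor invariant" and `0` means "loop-invariant".

```cpp
#include <optional>

// Hypothetical outcome labels; not LLVM's enumerators.
enum class StrideDecision {
  NoRuntimeCheck,         // may overlap; cannot be disambiguated at runtime
  RuntimeCheck,           // invariant/strided mix; a runtime check can help
  ConstantStrideAnalysis  // both constant strides; analyse as before
};

constexpr StrideDecision classify_strides(std::optional<long> src,
                                          std::optional<long> sink) {
  // Either access is neither constant-strided nor invariant.
  if (!src || !sink)
    return StrideDecision::NoRuntimeCheck;
  // Both accesses advance by a non-zero constant each iteration.
  if (*src != 0 && *sink != 0)
    return StrideDecision::ConstantStrideAnalysis;
  // At least one access is loop-invariant (stride 0).
  return StrideDecision::RuntimeCheck;
}
```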
This is an alternative to
https://github.com/llvm/llvm-project/pull/99239 and also replaces
additional checks if the underlying object is loop-invariant.
Fixes https://github.com/llvm/llvm-project/issues/87189.
PR: https://github.com/llvm/llvm-project/pull/99577
Sam James [Tue, 6 Aug 2024 08:58:36 +0000 (09:58 +0100)]
[LLDB] Add `<cstdint>` to AddressableBits (#102110)
(cherry picked from commit
bb59f04e7e75dcbe39f1bf952304a157f0035314)
Rainer Orth [Tue, 6 Aug 2024 07:08:41 +0000 (09:08 +0200)]
[BinaryFormat] Disable MachOTest.UnalignedLC on SPARC (#100086)
As discussed in Issue #86793, the `MachOTest.UnalignedLC` test dies with
`SIGBUS` on SPARC, a strict-alignment target. It simply cannot work
there. Besides, the test invokes undefined behaviour on big-endian
targets, so this patch disables it on all of those.
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
(cherry picked from commit
3a226dbe27ac7c7d935bc0968e84e31798a01207)
Tom Stellard [Mon, 5 Aug 2024 21:38:56 +0000 (14:38 -0700)]
workflows/release-binaries: Give attestation artifacts a unique name (#102041)
We need a different attestation for each supported architecture, so
their artifacts all need to have a different name.
The upload step is run on a Linux runner, so no matter which
architecture's binary is being uploaded the runner.os and runner.arch
variables would always be 'Linux' and 'X64' and so we can't use them for
naming the artifact.
(cherry picked from commit
3c8dadda3aa20b89fb5ad29ae31380d9594c3430)
Tom Stellard [Mon, 5 Aug 2024 20:30:04 +0000 (13:30 -0700)]
workflows/release-tasks: Add missing permissions for release binaries (#102023)
Now that the release binaries create artifact attestations, we need to
ensure that we call the workflow with the correct permissions.
(cherry picked from commit
dc349a3f47882cdac7112c763d2964b59e77356a)
Fangrui Song [Mon, 5 Aug 2024 18:52:52 +0000 (11:52 -0700)]
[Driver] Temporarily probe aarch64-linux-gnu GCC installation
As the comment explains, `*Triples[]` lists are discouraged and not
comprehensive anyway (e.g.
aarch64-unknown-linux-gnu/aarch64-unknown-linux-musl/aarch64-amazon-linux
do not work).
Boost incorrectly specifies --target=arm64-pc-linux ("arm64" should not
be used for Linux) and expects to probe "aarch64-linux-gnu". Add this
temporary workaround for the 19.x releases.
cor3ntin [Mon, 5 Aug 2024 12:22:07 +0000 (14:22 +0200)]
[Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879)
If a fold expanded constraint would expand packs of different size, it
is not a valid pack expansion and it is not satisfied. This should not
produce an error.
Fixes #99430
(cherry picked from commit
da380b26e4748ade5a8dba85b7df5e1c4eded8bc)
Kiran Chandramohan [Mon, 5 Aug 2024 11:43:37 +0000 (12:43 +0100)]
[Driver] Restrict Ofast deprecation help message to Clang (#101682)
The discussion about this in Flang
(https://discourse.llvm.org/t/rfc-deprecate-ofast-in-flang/80243) has
not concluded, hence restricting the deprecation only to Clang.
(cherry picked from commit
e60ee1f2d70bdb0ac87b09ae685d669d8543b7bd)
Paul Walker [Mon, 5 Aug 2024 10:25:44 +0000 (11:25 +0100)]
[LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679)
The command line option enable-scalable-autovec-in-streaming-mode is
used to enable scalable vectors but the same check is missing from
enableScalableVectorization, which is blocking auto-vectorisation.
(cherry picked from commit
7775a4882d7105fde7f7a81f3c72567d39afce45)
Kerry McLaughlin [Fri, 2 Aug 2024 17:00:59 +0000 (18:00 +0100)]
[AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522)
The implementation of __arm_sc_memset in compiler-rt contains
a Neon dup instruction which is not valid in streaming mode. This
patch rewrites the function, using an SVE mov instruction if available.
(cherry picked from commit
d6649f2d4871c4535ae0519920e36100748890c4)
Sander de Smalen [Fri, 2 Aug 2024 14:56:52 +0000 (15:56 +0100)]
[AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677)
The NEON dot product is not valid in streaming mode.
A follow-up patch will improve codegen for these operations.
(cherry picked from commit
12937b1bfb23cca4731fa274f3358f7286cc6784)
Sander de Smalen [Fri, 2 Aug 2024 09:29:08 +0000 (10:29 +0100)]
[AArch64] Avoid inlining if ZT0 needs preserving. (#101343)
Inlining may result in different behaviour when the callee clobbers ZT0,
because normally the call-site will have code to preserve ZT0. When
inlining the function this code to preserve ZT0 will no longer be
emitted, and so the resulting behaviour of the program is changed.
(cherry picked from commit
fb470db7b3a8ce6853e8bf17d235617a2fa79434)
Mark de Wever [Sat, 3 Aug 2024 09:19:00 +0000 (11:19 +0200)]
[libc++][bit] Improves rotate functions. (#98032)
Investigating #96612 shows our implementation was different from the
Standard and could cause UB. Testing the codegen showed quite a bit of
assembly generated for these functions. The functions have been written
differently which allows Clang to optimize the code to use simple CPU
rotate instructions.
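A UB-free rotate in the shape the C++ standard describes for `std::rotl` can be sketched as follows. This is an illustration for 32-bit values only, with the hypothetical name `sketch_rotl32`; it is not the libc++ source.

```cpp
#include <cstdint>
#include <limits>

// Rotates x left by s bits; negative s rotates right.
constexpr std::uint32_t sketch_rotl32(std::uint32_t x, int s) {
  constexpr int N = std::numeric_limits<std::uint32_t>::digits; // 32
  const int r = s % N; // lies in (-N, N), so no shift below reaches N
  if (r == 0)
    return x;
  if (r > 0)
    return (x << r) | (x >> (N - r));
  return (x >> -r) | (x << (N + r)); // rotate right by -r
}
```

Reducing the count modulo the width first means no shift by the full bit width or by a negative amount ever occurs, which is what makes the naive form undefined.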
Fixes: https://github.com/llvm/llvm-project/issues/96612
Sam James [Fri, 2 Aug 2024 22:07:21 +0000 (23:07 +0100)]
[ADT] Add `<cstdint>` to SmallVector (#101761)
SmallVector uses `uint32_t`, `uint64_t` without including `<cstdint>`
which fails to build w/ GCC 15 after a change in libstdc++ [0]
[0] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3a817a4a5a6d94da9127af3be9f84a74e3076ee2
(cherry picked from commit
7e44305041d96b064c197216b931ae3917a34ac1)
Rainer Orth [Tue, 30 Jul 2024 07:03:00 +0000 (09:03 +0200)]
[sanitizer_common] Fix internal_*stat on Linux/sparc64 (#101012)
```
SanitizerCommon-Unit :: ./Sanitizer-sparcv9-Test/SanitizerCommon/FileOps
```
`FAIL`s on 64-bit Linux/sparc64:
```
projects/compiler-rt/lib/sanitizer_common/tests/./Sanitizer-sparcv9-Test --gtest_filter=SanitizerCommon.FileOps
--
compiler-rt/lib/sanitizer_common/tests/sanitizer_libc_test.cpp:144: Failure
Expected equality of these values:
len1 + len2
Which is: 10
fsize
Which is: 1721875535
```
The issue is similar to the mips64 case: the Linux/sparc64 `*stat`
syscalls take a `struct kernel_stat64 *` arg. Also the syscalls actually
used differ.
This patch handles this, adopting the mips64 code to avoid too much
duplication.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
fcd6bd5587cc376cd8f43b60d1c7d61fdfe0f535)
Rainer Orth [Tue, 30 Jul 2024 07:00:20 +0000 (09:00 +0200)]
[sanitizer_common] Adjust signal_send.cpp for Linux/sparc64 (#100538)
```
SanitizerCommon-ubsan-sparc-Linux :: Linux/signal_send.cpp
```
currently `FAIL`s on Linux/sparc64 (32 and 64-bit). Instead of the
expected values for `SIGUSR1` (`10`) and `SIGUSR2` (`12`), that target
uses `30` and `31`.
On Linux/x86_64, the signals get their values from
`x86_64-linux-gnu/bits/signum-generic.h`, to be overridden in
`x86_64-linux-gnu/bits/signum.h`. On Linux/sparc64 OTOH, the definitions
are from `sparc64-linux-gnu/bits/signum-arch.h` and remain that way.
There's no `signum.h` at all.
The patch allows for both values.
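Accepting either numbering can be sketched like this. The function names are hypothetical and the actual test most likely expresses this differently; the sketch only shows the "allow both pairs" logic on a POSIX host.

```cpp
#include <csignal>

// Accepts the generic Linux numbering (10/12) or the 30/31 numbering
// reported for Linux/sparc64.
constexpr bool is_expected_usr_signal_pair(int usr1, int usr2) {
  return (usr1 == 10 && usr2 == 12) || (usr1 == 30 && usr2 == 31);
}

// Checks the values the host headers actually provide.
inline bool host_usr_signals_are_expected() {
  return is_expected_usr_signal_pair(SIGUSR1, SIGUSR2);
}
```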
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
7cecbdfe4eac3fd7268532426fb6b13e51b8720d)
Rainer Orth [Tue, 30 Jul 2024 06:57:25 +0000 (08:57 +0200)]
[sanitizer_common] Don't use syscall(SYS_clone) on Linux/sparc64 (#100534)
```
SanitizerCommon-Unit :: ./Sanitizer-sparc-Test/SanitizerCommon/StartSubprocessTest
```
and every single test using the `llvm-symbolizer` `FAIL` on
Linux/sparc64 in a very weird way: when using `StartSubprocess`, there's
a call to `internal_fork`, but we never reach `internal_execve`.
`internal_fork` is implemented using `syscall(SYS_clone)`. The calling
convention of that syscall already varies considerably between targets,
but as documented in `clone(2)`, SPARC again is widely different.
Instead of trying to match `glibc` here, this patch just calls `__fork`.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
1c53b907bd6348138a59da270836fc9b4c161a07)
Rainer Orth [Tue, 30 Jul 2024 06:55:45 +0000 (08:55 +0200)]
[sanitizer_common][test] Fix SanitizerIoctl/KVM_GET_* tests on Linux/… (#100532)
…sparc64
Two ioctl tests `FAIL` on Linux/sparc64 (both 32 and 64-bit):
```
SanitizerCommon-Unit :: ./Sanitizer-sparc-Test/SanitizerIoctl/KVM_GET_LAPIC
SanitizerCommon-Unit :: ./Sanitizer-sparc-Test/SanitizerIoctl/KVM_GET_MP_STATE
```
like
```
compiler-rt/lib/sanitizer_common/tests/./Sanitizer-sparc-Test --gtest_filter=SanitizerIoctl.KVM_GET_LAPIC
--
compiler-rt/lib/sanitizer_common/tests/sanitizer_ioctl_test.cpp:91: Failure
Value of: res
Actual: false
Expected: true
compiler-rt/lib/sanitizer_common/tests/sanitizer_ioctl_test.cpp:92: Failure
Expected equality of these values:
ioctl_desc::WRITE
Which is: 2
desc.type
Which is: 1
```
The problem is that Linux/sparc64, like Linux/mips, uses a different
layout for the `ioctl` `request` arg than most other Linux targets as
can be seen in `sanitizer_platform_limits_posix.h` (`IOC_*`). Therefore,
this patch makes the tests use the correct one.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
9eefe065bb2752b0db9ed553d2406e9a15ce349e)
Tobias Hieta [Mon, 5 Aug 2024 08:40:33 +0000 (10:40 +0200)]
Bump version to 19.1.0-rc2
Fangrui Song [Sun, 4 Aug 2024 20:48:22 +0000 (13:48 -0700)]
ReleaseNotes: lld/ELF: mention CREL
Amara Emerson [Wed, 31 Jul 2024 23:51:45 +0000 (16:51 -0700)]
Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392)
Fixes build errors on some SDKs.
rdar://132607572
(cherry picked from commit
3a4c7cc56c07b2db9010c2228fc7cb2a43dd9b2d)
Martin Storsjö [Sun, 4 Aug 2024 20:20:45 +0000 (23:20 +0300)]
[ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828)
For functions with private linkage, pick
IMAGE_SYM_CLASS_STATIC rather than IMAGE_SYM_CLASS_EXTERNAL;
GlobalValue::isInternalLinkage() only checks for
InternalLinkage, while GlobalValue::isLocalLinkage() checks for both
InternalLinkage and PrivateLinkage.
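The difference between the two predicates can be sketched with stand-in types. The enum and function names below are hypothetical and only mimic the `GlobalValue` predicates named above; the storage-class constants use the values from the PE/COFF specification. Which predicate the ARM backend used before the change is inferred from the description, not shown in the source.

```cpp
// Hypothetical stand-ins for LLVM's linkage kinds.
enum class SketchLinkage { External, Internal, Private };

constexpr bool sketch_is_internal(SketchLinkage l) {
  return l == SketchLinkage::Internal;
}
constexpr bool sketch_is_local(SketchLinkage l) {
  return l == SketchLinkage::Internal || l == SketchLinkage::Private;
}

// COFF symbol storage classes (values per the PE/COFF specification).
constexpr int kImageSymClassExternal = 2;
constexpr int kImageSymClassStatic = 3;

// Selection based on the narrower predicate: private functions end up EXTERNAL.
constexpr int sketch_class_via_internal(SketchLinkage l) {
  return sketch_is_internal(l) ? kImageSymClassStatic : kImageSymClassExternal;
}
// Selection based on the broader predicate: private functions become STATIC.
constexpr int sketch_class_via_local(SketchLinkage l) {
  return sketch_is_local(l) ? kImageSymClassStatic : kImageSymClassExternal;
}
```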
This matches what the AArch64 target does, since commit
3406934e4db4bf95c230db072608ed062c13ad5b.
This activates a preexisting fix for the AArch64 target from
1e7f592a890aad860605cf5220530b3744e107ba, for the ARM target as well.
When a relocation points at a symbol, one usually can convey an offset
to the symbol by encoding it as an immediate in the instruction.
However, for the ARM and AArch64 branch instructions, the immediate
stored in the instruction is ignored by MS link.exe (and lld-link
matches this aspect). (It would be simple to extend lld-link to support
it - but such object files would be incompatible with MS link.exe.)
This was worked around by
1e7f592a890aad860605cf5220530b3744e107ba by
emitting symbols into the object file symbol table, for temporary
symbols that otherwise would have been omitted, if they have the class
IMAGE_SYM_CLASS_STATIC, in order to avoid needing an offset in the
relocated instruction.
This change gives the symbols generated from functions with the IR level
"private" linkage the right class, to activate that workaround.
This fixes https://github.com/llvm/llvm-project/issues/100101, fixing
code generation for coroutines for Windows on ARM. After the change in
f78688134026686288a8d310b493d9327753a022, coroutines generate a function
with private linkage, and calls to this function were previously broken
for this target.
(cherry picked from commit
8dd065d5bc81b0c8ab57f365bb169a5d92928f25)
Matt Arsenault [Sun, 4 Aug 2024 12:36:00 +0000 (16:36 +0400)]
InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877)
(cherry picked from commit
3c483b887e5a32a0ddc0a52a467b31f74aad25bb)
DianQK [Sun, 4 Aug 2024 08:45:10 +0000 (16:45 +0800)]
[Metadata] Try to merge the first and last ranges. (#101860)
Fixes #101859.
If we have at least 2 ranges, we have to try to merge the last and first
ones to handle the wrap range.
(cherry picked from commit
4377656f2419a8eb18c01e86929b689dcf22b5d6)
Tom Stellard [Sat, 3 Aug 2024 04:52:03 +0000 (21:52 -0700)]
workflows: Re-implement the get-llvm-version action as a composite action (#101569)
The old version in the llvm/actions repo stopped working after the
version variables were moved out of llvm/CMakeLists.txt. Composite
actions are simpler and don't require JavaScript, which is why I
reimplemented it as a composite action.
This will fix the failing abi checks on the release branch.
(cherry picked from commit
14837aff058f9a2d32b8277debe619d8eb1995a1)
Tom Stellard [Sat, 3 Aug 2024 16:11:51 +0000 (09:11 -0700)]
workflows/release-binaries: Fix problem with python installation on macos-14 (#101774)
python3 wasn't able to see modules installed by pip, so we need to use
the setup-python action to ensure that the default pip and python3 both
use the same prefix.
See https://github.com/actions/runner-images/issues/10385
(cherry picked from commit
59476c99983d3813b412c9b0c0464365644c23a8)
Tom Stellard [Wed, 31 Jul 2024 01:54:20 +0000 (18:54 -0700)]
workflows/release-binaries: Fetch composite actions outside of default workspace (#100845)
Otherwise, the checkout step will override them.
(cherry picked from commit
41003ff3fe344dee5c963d462a4bc6d528811d86)
Tom Stellard [Fri, 26 Jul 2024 21:51:47 +0000 (14:51 -0700)]
workflow/release-binaries: Fix typo
Introduced in
b0860b20878d2c84fc3ce56ea608c5186872faa2.
(cherry picked from commit
d41f565318e2a414acfd7eec1cfb2fbf515391ba)
Tom Stellard [Fri, 26 Jul 2024 21:46:32 +0000 (14:46 -0700)]
workflows/release-binaries: Always pull composite actions from main branch (#100805)
If we pull from the release tag, then if there is a bug in one of the
actions on the release tag, then we can never do a build for that tag.
Pulling from main will allow us to fix bugs in the actions we use to
build the releases.
(cherry picked from commit
b0860b20878d2c84fc3ce56ea608c5186872faa2)
Tom Stellard [Fri, 26 Jul 2024 19:36:40 +0000 (12:36 -0700)]
workflows: Remove left over debugging step from release-binaries job
(cherry picked from commit
18dee70168bcd7259daade4c86462ba859e7bed5)
Tom Stellard [Fri, 26 Jul 2024 18:26:34 +0000 (11:26 -0700)]
Build release binaries for multiple targets (#98431)
This adds release binary builds for the 4 platforms currently supported
by the free GitHub Action runners:
* Linux x86_64
* Windows x86_64
* Mac x86_64
* Mac AArch64
The test stages for these are known to fail, but creating and
uploading the release binaries should pass.
(cherry picked from commit
247251aee0d4314385a3fea86e31484d3d792ffb)
Nathan James [Thu, 25 Jul 2024 15:25:37 +0000 (16:25 +0100)]
[clang-tidy] Fix crash in modernize-use-ranges (#100427)
The crash seems to be caused by the check function not handling inline
namespaces correctly in some instances. The Replacer is now obtained from
the MatchResult differently, which should alleviate any potential issues.
Fixes #100406
(cherry picked from commit
0762db6533eda3453158c7b9b0631542c47093a8)
Rainer Orth [Sat, 3 Aug 2024 20:19:44 +0000 (22:19 +0200)]
[sanitizer_common] Fix UnwindFast on SPARC (#101634)
```
UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
#0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
#1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
#2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.
This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.
This patch fixes this by moving the adjustment before all uses.
Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).
(cherry picked from commit
3368a3245ce5049b090d7c1081c2d52a6b6fda68)
Jannick Kremer [Sat, 3 Aug 2024 13:56:54 +0000 (14:56 +0100)]
[libclang] Fix symbol version of `getBinaryOpcode` functions (#101820)
#98489 resurrected an [old patch](https://reviews.llvm.org/D10833) that
was adding new libclang functions. That PR got merged with old `LLVM_13`
symbol versions for new functions. This patch fixes this oversight.
(cherry picked from commit
2bae7aeab42062e61d6f9d6458660d4a5646f7af)
Sam James [Sat, 3 Aug 2024 05:36:43 +0000 (06:36 +0100)]
[AMDGPU] Include `<cstdint>` in AMDGPUMCTargetDesc (#101766)
(cherry picked from commit
8f39502b85d34998752193e85f36c408d3c99248)
Qiongsi Wu [Fri, 2 Aug 2024 19:01:15 +0000 (15:01 -0400)]
Revert "[AIX] Turn on `#pragma mc_func` check by default (#101336)"
This reverts commit
b9335176db718bf64c72d48107eb9dff28ed979e.
(cherry picked from commit
dd7a4c3e5ee3300588b7c12631f3305553d8ea6c)
Fangrui Song [Fri, 2 Aug 2024 17:10:15 +0000 (10:10 -0700)]
[asan,test] Disable _FORTIFY_SOURCE test incompatible with glibc 2.40
In terms of bug catching capability, `_FORTIFY_SOURCE` does not perform
as well as some dynamic instrumentation tools. When a sanitizer is used,
generally `_FORTIFY_SOURCE` should be disabled since sanitizer runtime
does not implement most `*_chk` functions. Using `_FORTIFY_SOURCE`
will regress error checking (asan/hwasan/tsan) or cause false positives
(msan).
`*printf_chk` are the most pronounced `_chk` interceptors for
uninstrumented DSOes (https://reviews.llvm.org/D40951).
glibc 2.40 introduced `pass_object_size` style fortified source for some
functions ([1]). `fprintf` will be mangled as
`_ZL7fprintfP8_IO_FILEU17pass_object_size1PKcz`, which has no associated
interceptor, leading to printf-fortify-5.c failure.
Just disable the test. Fix #100877
[1]: https://sourceware.org/pipermail/libc-alpha/2024-February/154531.html
Pull Request: https://github.com/llvm/llvm-project/pull/101566
(cherry picked from commit
bbdccf4c94ff18a0761b03a0e2c8b05805385132)
Pavel Skripkin [Fri, 2 Aug 2024 15:04:57 +0000 (18:04 +0300)]
[analyzer] Fix crash on using `bitcast(<type>, <array>)` as array subscript (#101647)
Current CSA logic does not expect `LazyCompoundValKind` as array index.
This may happen if an array is used as a subscript to another array after
a bitcast to an integer type.
Catch such cases and return `UnknownVal`, since CSA cannot model
array -> int casts.
Closes #94496
(cherry picked from commit
d96569ecc2807a13dab6495d8cc4e82775b00af1)
Stefan Pintilie [Mon, 29 Jul 2024 15:17:04 +0000 (11:17 -0400)]
[PowerPC] Add phony subregisters to cover the high half of the VSX registers. (#94628)
On PowerPC there are 128 bit VSX registers. These registers are half
overlapped with 64 bit floating point registers (FPR). The 64 bit half
of the VSX register that does not overlap with the FPR does not overlap
with any other register class. The FPR are the only subregisters of the
VSX registers but they do not fully cover the 128 bit super register.
This leads to incorrect lane masks being created.
This patch adds phony registers for the other half of the VSX registers
in order to fully cover them and to make sure that the lane masks are
not the same for the VSX and the floating point register.
(cherry picked from commit
53c37f300dd1b450671f2aee4cc649c380adb5ad)
Mital Ashok [Thu, 1 Aug 2024 14:05:46 +0000 (15:05 +0100)]
[Clang] Fix definition of layout-compatible to ignore empty classes (#92103)
Also changes the behaviour of `__builtin_is_layout_compatible`
Neither the historic nor the current definition of layout-compatible
classes mentions anything about base classes (other than implicitly
through being standard-layout), and both are defined in terms of members,
not direct members.
Sander de Smalen [Mon, 29 Jul 2024 10:23:25 +0000 (11:23 +0100)]
Reland: "[Clang] Demote always_inline error to warning for mismatching SME attrs" (#100991) (#100996)
Test `aarch64-sme-inline-streaming-attrs.c` caused some buildbot
failures, because the test was missing a `REQUIRES:
aarch64-registered-target`. This was because we've demoted the error to a warning, which
then resulted in a different error message, because Clang can't actually
CodeGen the IR.
(cherry picked from commit
389679d5f9055bffe8bbd25ae41f084a8d08e0f8)
Rainer Orth [Tue, 30 Jul 2024 07:02:05 +0000 (09:02 +0200)]
[sanitizer_common][test] Fix InternalMmapWithOffset on 32-bit Linux/sparc64 (#101011)
```
SanitizerCommon-Unit :: ./Sanitizer-sparc-Test/SanitizerCommon/InternalMmapWithOffset
```
`FAIL`s on 32-bit Linux/sparc64:
```
projects/compiler-rt/lib/sanitizer_common/tests/./Sanitizer-sparc-Test --gtest_filter=SanitizerCommon.InternalMmapWithOffset
--
compiler-rt/lib/sanitizer_common/tests/sanitizer_libc_test.cpp:335: Failure
Expected equality of these values:
'A'
Which is: 'A' (65, 0x41)
p[0]
Which is: '\0'
```
It turns out the `pgoffset` arg to `mmap2` is passed incorrectly in this
case, unlike the 64-bit test. The caller, `MapWritableFileToMemory`,
passes an `u64` arg, while `mmap2` expects an `off_t`. This patch casts
the arg accordingly.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
1c25f2cd470c2882e422b66d0482f5a120960394)
Rainer Orth [Tue, 30 Jul 2024 06:59:12 +0000 (08:59 +0200)]
[sanitizer_common] Fix signal_line.cpp on SPARC (#100535)
```
SanitizerCommon-ubsan-sparc-Linux :: Linux/signal_line.cpp
```
currently `FAIL`s on Linux/sparc64 (32 and 64-bit) for `n == 2`. Instead
of the expected `SIGSEGV`, the test dies with `SIGBUS`. `strace` reveals
that this is due to an unaligned access:
```
--- SIGBUS {si_signo=SIGBUS, si_code=BUS_ADRALN, si_addr=0x1} ---
```
which is to be expected on a strict-alignment target like SPARC. Fixed
by changing the invalid pointer to be better aligned.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
(cherry picked from commit
94394ca980f8ecbd845155d2170cfd865e4d62dc)
NAKAMURA Takumi [Fri, 26 Jul 2024 01:03:17 +0000 (10:03 +0900)]
Mel Chen [Thu, 25 Jul 2024 07:14:39 +0000 (15:14 +0800)]
[VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276)
This patch refactors the handling of reduction to eliminate layering
violations.
* Introduced `getReductionIntrinsicID` in LoopUtils.h for mapping
recurrence kinds to llvm.vector.reduce.* intrinsic IDs.
* Updated `VectorBuilder::createSimpleTargetReduction` to accept
llvm.vector.reduce.* intrinsic directly.
* New function `VPIntrinsic::getForIntrinsic` for mapping intrinsic ID
to the same functional VP intrinsic ID.
(cherry picked from commit
6d12b3f67df429bffff6e1953d9f55867d7e2469)
NAKAMURA Takumi [Sun, 28 Jul 2024 07:48:23 +0000 (16:48 +0900)]
[Bazel] Use PACKAGE_VERSION for version string.
This enables "-rc" suffix in release branches.
(cherry picked from commit
25efb746d907ce0ffdd9195d191ff0f6944ea3ca)
Sjoerd Meijer [Fri, 2 Aug 2024 12:25:35 +0000 (13:25 +0100)]
Ofast deprecation clarifications (#101005)
Following up on the RFC discussion, this is clarifying that the main
purpose and effect of the -Ofast deprecation is to discourage its usage
and that everything else is more or less open for discussion, e.g. there
is no timeline yet for removal.
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
(cherry picked from commit
48d4d4b641702bf6db03a1bac73b7e13dea28349)
Donát Nagy [Fri, 2 Aug 2024 10:43:06 +0000 (12:43 +0200)]
[analyzer] Restore recognition of mutex methods (#101511)
Before commit
705788c the checker alpha.unix.BlockInCriticalSection
"recognized" the methods `std::mutex::lock` and `std::mutex::unlock`
with an extremely trivial check that accepted any function (or method)
named lock/unlock.
To avoid matching unrelated user-defined functions, this was refined to a
check that also requires the presence of "std" and "mutex" as distinct
parts of the qualified name.
However, as #99628 reported, there are standard library implementations
where some methods of `std::mutex` are inherited from an implementation
detail base class and the new code wasn't able to recognize these
methods, which led to emitting false positive reports.
As a workaround, this commit partially restores the old behavior by
omitting the check for the class name.
In the future, it would be good to replace this hack with a solution
which ensures that `CallDescription` understands inherited methods.
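The difference between the refined check and the relaxed workaround can be sketched as follows. This is an illustration only: the helpers operate on a plain list of name components and are hypothetical, not the analyzer's `CallDescription` API, and `__mutex_base` is just an example name for an implementation-detail base class.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Components of a qualified name, e.g. {"std", "mutex", "lock"}.
using NameParts = std::vector<std::string>;

// True if `part` occurs among the qualifiers (everything before the last part).
static bool hasQualifier(const NameParts &parts, const std::string &part) {
  assert(!parts.empty() && "a qualified name has at least one part");
  return std::find(parts.begin(), parts.end() - 1, part) != parts.end() - 1;
}

static bool isLockOrUnlock(const NameParts &parts) {
  return !parts.empty() && (parts.back() == "lock" || parts.back() == "unlock");
}

// Refined check: requires both "std" and "mutex" in the qualified name.
bool matchesStrict(const NameParts &parts) {
  return isLockOrUnlock(parts) && hasQualifier(parts, "std") &&
         hasQualifier(parts, "mutex");
}

// Relaxed workaround: the class name is no longer checked.
bool matchesRelaxed(const NameParts &parts) {
  return isLockOrUnlock(parts) && hasQualifier(parts, "std");
}
```

A method inherited from a base class, such as `{"std", "__mutex_base", "lock"}`, is rejected by `matchesStrict` but accepted by `matchesRelaxed`, while an unrelated `{"my_ns", "lock"}` is rejected by both.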
(cherry picked from commit
99ae2edc2592e602b0eb5a287f4d003aa3902440)
Luke Lau [Tue, 30 Jul 2024 16:28:52 +0000 (00:28 +0800)]
[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (#101152)
As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we
currently fold in vmerge.vvms and vmv.v.vs into their ops even if the
EEW is different which leads to an incorrect transform.
This checks the op's EEW via its simple value type for now since there
doesn't seem to be any existing information about the EEW size of
instructions. We'll probably need to encode this at some point if we
want to be able to access it at the MachineInstr level in #100367
Qiongsi Wu [Thu, 1 Aug 2024 13:51:07 +0000 (09:51 -0400)]
[AIX] Turn on `#pragma mc_func` check by default (#101336)
https://github.com/llvm/llvm-project/pull/99888 added a check (and
corresponding options) to flag uses of `#pragma mc_func` on AIX.
This PR turns on the check by default.
(cherry picked from commit
b9335176db718bf64c72d48107eb9dff28ed979e)
Nikolas Klauser [Fri, 2 Aug 2024 08:53:33 +0000 (10:53 +0200)]
[Clang] Add a release note deprecating __is_nullptr
Fangrui Song [Thu, 1 Aug 2024 17:22:03 +0000 (10:22 -0700)]
[ELF] Support relocatable files using CREL with explicit addends
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends is not supported.
---
Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.
(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, which compiles to a lot of code.)
A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer adds overhead, the conversion is only performed when
absolutely necessary.
---
Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).
* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.
SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
not yet supported.
Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600
Pull Request: https://github.com/llvm/llvm-project/pull/98115
(cherry picked from commit
0af07c078798b7c427e2981377781b5cc555a568)
Xing Xue [Thu, 1 Aug 2024 11:25:01 +0000 (07:25 -0400)]
[NFC][libc++][libc++abi][libunwind][test] Fix/unify AIX triples used in LIT tests (#101196)
This patch fixes/unifies AIX target triples used in libc++, libc++abi,
and libunwind LIT tests.
(cherry picked from commit
2d3655037ccfa276cb0949c2ce0cff56985f6637)
Damien L-G [Thu, 1 Aug 2024 14:39:27 +0000 (10:39 -0400)]
[libc++] Increase atomic_ref's required alignment for small types (#99654)
This patch increases the alignment requirement for std::atomic_ref
such that we can guarantee lockfree operations more often. Specifically,
we require types that are 1, 2, 4, 8, or 16 bytes in size to be aligned
to at least their size to be used with std::atomic_ref.
This is the case for most types, however a notable exception is
`long long` on x86, which is 8 bytes in length but has an alignment
of 4.
As a result of this patch, one has to be more careful about the
alignment of objects used with std::atomic_ref. Failure to provide
a properly-aligned object to std::atomic_ref is a precondition
violation and is technically UB. On the flipside, this allows us
to provide an atomic_ref that is actually lockfree more often,
which is an important QOI property.
More information in the discussion at https://github.com/llvm/llvm-project/pull/99570#issuecomment-2237668661.
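A minimal sketch of the precondition described above, assuming the size-based rule applies exactly to objects of 1, 2, 4, 8, or 16 bytes. `isAlignedForAtomicRef` and `Counter` are hypothetical names for illustration; real code can query `std::atomic_ref<T>::required_alignment` in C++20 instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the rule above: objects of size 1, 2, 4, 8,
// or 16 bytes must be aligned to at least their size.
template <class T>
bool isAlignedForAtomicRef(const void *p) {
  assert(p != nullptr && "expected a valid object address");
  constexpr std::size_t size = sizeof(T);
  constexpr bool sizeRuleApplies =
      size == 1 || size == 2 || size == 4 || size == 8 || size == 16;
  constexpr std::size_t required =
      (sizeRuleApplies && size > alignof(T)) ? size : alignof(T);
  return reinterpret_cast<std::uintptr_t>(p) % required == 0;
}

// Over-aligning explicitly keeps the precondition satisfied even on targets
// where alignof(long long) is smaller than sizeof(long long).
struct Counter {
  alignas(sizeof(long long)) long long value = 0;
};
```

Storing the value in a wrapper like `Counter` is one way to make sure the object passed to `std::atomic_ref` meets the stricter requirement regardless of the platform's default alignment.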
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
(cherry picked from commit
59ca618e3b7aec8c32e24d781bae436dc99b2727)
Dimitry Andric [Thu, 1 Aug 2024 07:28:29 +0000 (09:28 +0200)]
[lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm,mips64,powerpc} declarations (#101403)
Similar to #97796, fix the type of the `native_thread` parameter for the
arm, mips64 and powerpc variants of `NativeRegisterContextFreeBSD_*`.
Otherwise, this leads to compile errors similar to:
```
lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.cpp:85:39: error: out-of-line definition of 'NativeRegisterContextFreeBSD_powerpc' does not match any declaration in 'lldb_private::process_freebsd::NativeRegisterContextFreeBSD_powerpc'
85 | NativeRegisterContextFreeBSD_powerpc::NativeRegisterContextFreeBSD_powerpc(
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
(cherry picked from commit
7088a5ed880f29129ec844c66068e8cb61ca98bf)
Tom Stellard [Thu, 1 Aug 2024 18:23:03 +0000 (11:23 -0700)]
workflows: Fix libclc-tests (#101524)
The old out-of-tree build configuration stopped working and in tree
builds are supported now, so we should use the in tree configuration.
The only downside is we can't run the tests any more, but at least we
will be able to test the build again.
(cherry picked from commit
0512ba0a435a9d693cb61f182fc9e3eb7f6dbd6a)
Dimitry Andric [Mon, 29 Jul 2024 18:34:01 +0000 (20:34 +0200)]
[InstrProf] Remove duplicate definition of IntPtrT
In
16e74fd48988a (for #82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:
compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
52 | typedef void *IntPtrT;
| ^
compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
34 | typedef void *IntPtrT;
| ^
Fix the warnings by removing the duplicate typedef.
(cherry picked from commit
2c376fe96c83443c15e6485d043ebe321904546b)
Alexandre Ganea [Tue, 30 Jul 2024 23:06:03 +0000 (19:06 -0400)]
[Support] Silence warnings when retrieving exported functions (#97905)
Since functions exported from DLLs are type-erased, before this patch I
was seeing the new Clang 19 warning `-Wcast-function-type-mismatch`.
This happens when building LLVM on Windows.
Following discussion in
https://github.com/llvm/llvm-project/commit/593f708118aef792f434185547f74fedeaf51dd4#commitcomment-143905744
(cherry picked from commit
39e192b379362e9e645427631c35450d55ed517d)
Louis Dionne [Wed, 31 Jul 2024 14:40:14 +0000 (10:40 -0400)]
[libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)"
This reverts commit
55357160d0e151c32f86e1d6683b4bddbb706aa1.
This is only being reverted from the LLVM 19 branch as a
convenience to avoid breaking some IDEs which were not ready
for that change.
Fixes #99464
Piyou Chen [Wed, 31 Jul 2024 07:54:03 +0000 (00:54 -0700)]
Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85790)"
This reverts commit
a41a4ac78294c728fb70a51623c602ea7f3e308a.
Fangrui Song [Tue, 30 Jul 2024 21:52:29 +0000 (14:52 -0700)]
Revert "[MC] Compute fragment offsets eagerly"
This reverts commit
1a47f3f3db66589c11f8ddacfeaecc03fb80c510.
Fix #100283
This commit is actually a trigger of other preexisting problems:
* Size change of fill fragments does not influence the fixed-point iteration.
* The `invalid number of bytes` error is reported too early, since
`.zero A-B` might have temporary negative values in the first few
iterations.
However, the problems appeared at least "benign" (did not affect the
Linux kernel builds) before this commit.
(cherry picked from commit
4eb5450f630849ee0518487de38d857fbe5b1aee)
Nikita Popov [Tue, 30 Jul 2024 07:25:03 +0000 (09:25 +0200)]
[Sanitizers] Avoid overload ambiguity for interceptors (#100986)
Since glibc 2.40 some functions like openat make use of overloads when
built with `-D_FORTIFY_SOURCE=2`, see:
https://github.com/bminor/glibc/blob/master/io/bits/fcntl2.h
This means that doing something like `(uintptr_t) openat` or `(void *)
openat` is now ambiguous, breaking the compiler-rt build on new glibc
versions.
Fix this by explicitly casting the symbol to the expected function type
before casting it to an intptr. The expected type is obtained as
`decltype(REAL(func))` so we don't have to repeat the signature from
INTERCEPTOR in the INTERCEPT_FUNCTION macro.
Fixes https://github.com/llvm/llvm-project/issues/100754.
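The ambiguity and the cast that resolves it can be sketched as below. The overloads `my_openat` and the pointer `real_my_openat` are hypothetical stand-ins for a fortified libc function and `REAL(func)`; this is not the actual interceptor macro.

```cpp
#include <cassert>
#include <cstdint>

// Two overloads standing in for a fortified libc function (hypothetical).
int my_openat(int dirfd, const char *path, int flags) {
  return dirfd + flags + (path != nullptr ? 1 : 0);
}
int my_openat(int dirfd, const char *path, int flags, int mode) {
  return dirfd + flags + mode + (path != nullptr ? 1 : 0);
}

// Stand-in for REAL(func): a pointer whose type names the expected signature.
int (*real_my_openat)(int, const char *, int, int) = nullptr;

std::uintptr_t addressOfExpectedOverload() {
  // `(std::uintptr_t)my_openat` would be ambiguous here. Selecting the
  // overload through the expected function type first resolves it.
  auto *selected = static_cast<decltype(real_my_openat)>(my_openat);
  assert(selected != nullptr);
  return reinterpret_cast<std::uintptr_t>(selected);
}
```

Using `decltype` on the existing pointer avoids spelling out the signature a second time, which is the same motivation the commit message gives for `decltype(REAL(func))`.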
(cherry picked from commit
155b7a12820ec45095988b6aa6e057afaf2bc892)
Alexandros Lamprineas [Tue, 23 Jul 2024 18:24:41 +0000 (19:24 +0100)]
[clang][FMV][AArch64] Improve streaming mode compatibility.
* Allow arm-streaming if all the function versions adhere to it.
* Allow arm-streaming-compatible if all the function versions adhere to it.
* Allow arm-locally-streaming regardless of the other function versions.
When the caller needs to toggle the streaming mode all the function versions
of the callee must adhere to the same mode, otherwise the call will yield a
runtime error.
Imagine the versions of the callee live in separate TUs. The version that
is visible to the caller will determine the calling convention used when
generating code for the callsite. Therefore we cannot support mixing
streaming with non-streaming function versions. Imagine TU1 has a streaming
caller and calls foo._sme which is streaming-compatible. The codegen for
the callsite will not switch off the streaming mode. Then in TU2 we have
a version which is non-streaming and could potentially be called in
streaming mode. Similarly if the caller is non-streaming and the called
version is streaming-compatible the codegen for the callsite will not
switch on the streaming mode, but other versions may be streaming.
Hubert Tong [Tue, 30 Jul 2024 21:56:55 +0000 (17:56 -0400)]
ReleaseNotes.rst: Fix typo "my" for "may"
Replace typo for "may" with "can".
Jacek Caban [Tue, 30 Jul 2024 12:22:50 +0000 (14:22 +0200)]
[CodeGen][ARM64EC] Use alias symbol for exporting hybrid_patchable functions. (#100872)
Exporting $hp_target symbol doesn't make sense, use the unmangled alias instead.
This is not compatible with MSVC, but it makes using dllexport together with
hybrid_patchable attribute possible.
(cherry picked from commit
41c0f89f5532ec110b927c3a67ceac83448c5d98)
Xing Xue [Tue, 30 Jul 2024 10:28:59 +0000 (06:28 -0400)]
[libunwind][AIX] Fix the wrong traceback from signal handler (#101069)
Patch [llvm#92291](https://github.com/llvm/llvm-project/pull/92291)
causes a wrong traceback from a signal handler on AIX because the AIX
unwinder uses the traceback table at the end of each function instead of
FDE/CIE for unwinding. This patch adds a condition to exclude traceback
table based unwinding from the code added by the patch.
(cherry picked from commit
d90fa612604b49dfc81c3f42c106fab7401322ec)
Stefan Pintilie [Wed, 24 Jul 2024 01:59:27 +0000 (21:59 -0400)]
[RegisterCoalescer] Fix SUBREG_TO_REG handling in the RegisterCoalescer. (#96839)
The issue with the handling of the SUBREG_TO_REG is that we don't join
the subranges correctly when we join live ranges across the
SUBREG_TO_REG. For example when joining across this:
```
32B %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
```
we want to join these live ranges:
```
%0 [16r,32r:0) 0@16r weight:0.000000e+00
%2 [32r,112r:0) 0@32r weight:0.000000e+00
```
Before the fix the range for the resulting merged `%2` is:
```
%2 [16r,112r:0) 0@16r weight:0.000000e+00
```
After the fix it is now this:
```
%2 [16r,112r:0) 0@16r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00
```
Two tests are added to this fix. The X86 test fails without the patch.
The PowerPC test passes with and without the patch but is added as a way
to track possible future failures when register classes are changed in a
future patch.
(cherry picked from commit
26fa399012da00fbf806f50ad72a3b5f0ee63eab)