platform/upstream/llvm.git
11 months ago[Inline] Add test for #67054 (NFC)
Nikita Popov [Mon, 4 Mar 2024 10:32:07 +0000 (11:32 +0100)]
[Inline] Add test for #67054 (NFC)

(cherry picked from commit cad6ad2759a782c48193f83886488dacc9f330e3)

11 months ago[ValueTracking] Treat phi as underlying obj when not decomposing further (#84339)
Florian Hahn [Tue, 12 Mar 2024 08:55:03 +0000 (08:55 +0000)]
[ValueTracking] Treat phi as underlying obj when not decomposing further (#84339)

At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.
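
The change described above can be pictured with a toy model. This is only a sketch, not LLVM's actual getUnderlyingObjects API: `Value`, `Kind` and `collectUnderlyingObjects` are hypothetical names, and the `DecomposePhis` flag stands in for the case where the analysis does not look through a phi.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy IR: an Object is an allocation, an Offset derives a pointer from one
// base operand, and a Phi merges several incoming pointers.
enum class Kind { Object, Offset, Phi };

struct Value {
  Kind kind;
  std::vector<const Value *> operands;
};

// Collects the underlying objects of V into Out. When DecomposePhis is false,
// the phi itself is recorded as an underlying object instead of being
// silently skipped (skipping would leave Out empty for phi-based pointers).
inline void collectUnderlyingObjects(const Value *V, bool DecomposePhis,
                                     std::set<const Value *> &Visited,
                                     std::vector<const Value *> &Out) {
  assert(V && "expected a non-null value");
  if (!Visited.insert(V).second)
    return; // already handled; also guards against phi cycles
  switch (V->kind) {
  case Kind::Object:
    Out.push_back(V);
    return;
  case Kind::Offset:
    collectUnderlyingObjects(V->operands.front(), DecomposePhis, Visited, Out);
    return;
  case Kind::Phi:
    if (!DecomposePhis) {
      Out.push_back(V); // treat the phi as the underlying object
      return;
    }
    for (const Value *In : V->operands)
      collectUnderlyingObjects(In, DecomposePhis, Visited, Out);
    return;
  }
}
```

With a pointer derived from a phi of two objects, the non-decomposing walk in this model reports the phi, so a client such as a dependence analysis still has an object to reason about.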

Fixes https://github.com/llvm/llvm-project/issues/82665.

PR: https://github.com/llvm/llvm-project/pull/84339
(cherry picked from commit b274b23665dec30f3ae4fb83ccca8b77e6d3ada3)

11 months ago[LAA] Add test case for #82665.
Florian Hahn [Thu, 7 Mar 2024 13:53:02 +0000 (13:53 +0000)]
[LAA] Add test case for #82665.

Test case for https://github.com/llvm/llvm-project/issues/82665.

(cherry picked from commit 4cfd4a7896b5fd50274ec8573c259d7ad41741de)

11 months ago[Release] Install compiler-rt builtins during Phase 1 on AIX (#81485)
azhan92 [Fri, 16 Feb 2024 02:27:45 +0000 (21:27 -0500)]
[Release] Install compiler-rt builtins during Phase 1 on AIX (#81485)

The current test-release.sh script does not install the necessary
compiler-rt builtins during Phase 1 on AIX, resulting in a
non-functional Phase 1 clang. Furthermore, the installation is also
necessary for Phase 2 on AIX.

Co-authored-by: Alison Zhang <alisonzhang@ibm.com>
(cherry picked from commit 3af5c98200e0b1268f755c3f289be4f73aac4214)

11 months ago[ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835)
Florian Hahn [Tue, 12 Mar 2024 09:47:42 +0000 (09:47 +0000)]
[ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835)

The TranspBlocks set was used to cache aliasing decisions for all
processed loads in the parent loop. This is incorrect, because each load
can access a different location, which means one load not being modified
in a block doesn't translate to another load not being modified in the
same block.

All loads access the same underlying object, so we could perhaps use a
location without size for all loads and retain the cache, but that would
mean we lose precision.

For now, just drop the cache.

Fixes https://github.com/llvm/llvm-project/issues/84807

PR: https://github.com/llvm/llvm-project/pull/84835
(cherry picked from commit bba4a1daff6ee09941f1369a4e56b4af95efdc5c)

11 months ago[ArgPromotion] Add test case for #84807.
Florian Hahn [Mon, 11 Mar 2024 21:06:03 +0000 (21:06 +0000)]
[ArgPromotion] Add test case for #84807.

Test case for https://github.com/llvm/llvm-project/issues/84807,
showing a mis-compile in ArgPromotion.

(cherry picked from commit 31ffdb56b4df9b772d763dccabbfde542545d695)

11 months ago[LLD] [COFF] Set the right alignment for DelayDirectoryChunk (#84697)
Martin Storsjö [Mon, 11 Mar 2024 22:03:26 +0000 (00:03 +0200)]
[LLD] [COFF] Set the right alignment for DelayDirectoryChunk (#84697)

This makes a difference when linking executables with delay loaded
libraries for arm32; the delay loader implementation can load data from
the registry with instructions that assume alignment.

This issue does not show up when linking in MinGW mode, because a
PseudoRelocTableChunk gets injected, which also sets alignment, even if
the chunk itself is empty.

(cherry picked from commit c93c76b562784926b22a69d3f82a5032dcb4a274)

11 months ago[X86] combineAndShuffleNot - ensure the type is legal before create X86ISD::ANDNP...
Simon Pilgrim [Sun, 10 Mar 2024 16:23:51 +0000 (16:23 +0000)]
[X86] combineAndShuffleNot - ensure the type is legal before create X86ISD::ANDNP target nodes

Fixes #84660

(cherry picked from commit 862c7e0218f27b55a5b75ae59a4f73cd4610448d)

11 months ago[DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157)
Florian Hahn [Wed, 6 Mar 2024 20:08:00 +0000 (20:08 +0000)]
[DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157)

Not removing the malloc from earliest escape info leaves stale entries
in the cache.

Fixes https://github.com/llvm/llvm-project/issues/84051.

PR: https://github.com/llvm/llvm-project/pull/84157
(cherry picked from commit eb8f379567e8d014194faefe02ce92813e237afc)

11 months ago[InstCombine] Fix miscompilation in PR83947 (#83993)
Yingwei Zheng [Tue, 5 Mar 2024 14:34:04 +0000 (22:34 +0800)]
[InstCombine] Fix miscompilation in PR83947 (#83993)

https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407

Comment from @topperc:
> This transform assumes the mask is a non-zero splat. We only know it's
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes #83947.

11 months ago[Clang][LoongArch] Fix wrong return value type of __iocsrrd_h (#84100)
wanglei [Wed, 6 Mar 2024 02:03:28 +0000 (10:03 +0800)]
[Clang][LoongArch] Fix wrong return value type of __iocsrrd_h (#84100)

Related:
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645016.html
(cherry picked from commit 2f479b811274fede36535e34ecb545ac22e399c3)

11 months ago[Clang][LoongArch] Precommit test for fix wrong return value type of __iocsrrd_h...
wanglei [Tue, 5 Mar 2024 11:44:28 +0000 (19:44 +0800)]
[Clang][LoongArch] Precommit test for fix wrong return value type of __iocsrrd_h. NFC

(cherry picked from commit aeda1a6e800e0dd6c91c0332b4db95094ad5b301)

11 months ago[LoongArch] Make sure that the LoongArchISD::BSTRINS node uses the correct `MSB`...
wanglei [Mon, 11 Mar 2024 00:59:17 +0000 (08:59 +0800)]
[LoongArch] Make sure that the LoongArchISD::BSTRINS node uses the correct `MSB` value (#84454)

The `MSB` must not be greater than `GRLen`. Without this patch, newly
added test cases will crash with LoongArch32, resulting in a 'cannot
select' error.

(cherry picked from commit edd4c6c6dca4c556de22b2ab73d5bfc02d28e59b)

11 months ago[analyzer] Fix crash on dereference invalid return value of getAdjustedParameterIndex...
Exile [Wed, 6 Mar 2024 16:01:30 +0000 (00:01 +0800)]
[analyzer] Fix crash on dereference invalid return value of getAdjustedParameterIndex() (#83585)

Fixes #78810
Thanks to Snape3058 for the comment.

---------

Co-authored-by: miaozhiyuan <miaozhiyuan@feysh.com>
(cherry picked from commit d4687fe7d1639ea5d16190c89a54de1f2c6e2a9a)

11 months ago[libc++] Enable availability based on the compiler instead of __has_extension (#84065)
Louis Dionne [Thu, 7 Mar 2024 20:12:21 +0000 (15:12 -0500)]
[libc++] Enable availability based on the compiler instead of __has_extension (#84065)

__has_extension(...) doesn't work as intended when -pedantic-errors is
used with Clang. With that flag, __has_extension(...) is equivalent to
__has_feature(...), which means that checks like

    __has_extension(pragma_clang_attribute_external_declaration)

will return 0. In turn, this has the effect of disabling availability
markup in libc++, which is undesirable.

rdar://124078119
(cherry picked from commit 292a28df6c55679fad0589dea35278a8c66b2ae1)

11 months ago[InstCombine] Handle scalable splat in `getFlippedStrictnessPredicateAndConstant`
Yingwei Zheng [Tue, 5 Mar 2024 09:21:16 +0000 (17:21 +0800)]
[InstCombine] Handle scalable splat in `getFlippedStrictnessPredicateAndConstant`

(cherry picked from commit d51fcd4ed86ac6075c8a25b053c2b66051feaf62)

11 months ago[lld][LoongArch] Support the R_LARCH_{ADD,SUB}_ULEB128 relocation types (#81133)
Jinyang He [Tue, 5 Mar 2024 07:50:14 +0000 (15:50 +0800)]
[lld][LoongArch] Support the R_LARCH_{ADD,SUB}_ULEB128 relocation types (#81133)

For a label difference like `.uleb128 A-B`, MC generates a pair of
R_LARCH_{ADD,SUB}_ULEB128 if A-B cannot be folded as a constant. GNU
assembler generates a pair of relocations in more cases (when A or B is
in a code section with linker relaxation). It is similar to RISCV.

R_LARCH_{ADD,SUB}_ULEB128 relocations are created by Clang and GCC in
`.gcc_except_table` and other debug sections with linker relaxation
enabled. On LoongArch, first read the buf and count the available space.
Then add or sub the value. Finally truncate the expected value and fill
it into the available space.
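
The read, adjust and write-back steps can be sketched as follows. This is a minimal illustration under stated assumptions, not lld's code: the helper names `readUleb128`, `writeUleb128Padded`, `applyAddUleb128` and `applySubUleb128` are hypothetical, and the write-back keeps the original byte count by setting the continuation bit on every byte except the last.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes the ULEB128 value at Buf and reports how many bytes it occupies.
inline uint64_t readUleb128(const uint8_t *Buf, std::size_t *Len) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  std::size_t N = 0;
  uint8_t Byte;
  do {
    Byte = Buf[N++];
    if (Shift < 64)
      Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  *Len = N;
  return Value;
}

// Writes Value into exactly Len bytes, truncating it to 7 * Len bits. Every
// byte but the last keeps its continuation bit, so the length is unchanged.
inline void writeUleb128Padded(uint8_t *Buf, uint64_t Value, std::size_t Len) {
  assert(Len > 0 && "a ULEB128 value occupies at least one byte");
  for (std::size_t I = 0; I < Len; ++I) {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (I + 1 != Len)
      Byte |= 0x80;
    Buf[I] = Byte;
  }
}

// Hypothetical handlers for the ADD and SUB halves of the relocation pair.
inline void applyAddUleb128(uint8_t *Buf, uint64_t Addend) {
  std::size_t Len;
  uint64_t Old = readUleb128(Buf, &Len);
  writeUleb128Padded(Buf, Old + Addend, Len);
}

inline void applySubUleb128(uint8_t *Buf, uint64_t Subtrahend) {
  std::size_t Len;
  uint64_t Old = readUleb128(Buf, &Len);
  writeUleb128Padded(Buf, Old - Subtrahend, Len);
}
```

Applying the ADD with the address of A and then the SUB with the address of B leaves A-B in the slot, in the same number of bytes the assembler reserved.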

(cherry picked from commit eaa9ef678c63bf392ec2d5b736605db7ea7e7338)

11 months ago[Clang] [Sema] Handle placeholders in '.*' expressions (#83103)
Sirraide [Tue, 27 Feb 2024 19:19:44 +0000 (20:19 +0100)]
[Clang] [Sema] Handle placeholders in '.*' expressions (#83103)

When analysing whether we should handle a binary expression as an
overloaded operator call or a builtin operator, we were calling
`checkPlaceholderForOverload()`, which takes care of any placeholders
that are not overload sets—which would usually make sense since those
need to be handled as part of overload resolution.

Unfortunately, we were also doing that for `.*`, which is not
overloadable, and then proceeding to create a builtin operator anyway,
which would crash if the RHS happened to be an unresolved overload set
(due to hitting an assertion in one of the callees of
`CreateBuiltinBinOp()` that, in the `.*` case, makes sure its arguments
aren’t placeholders).

This PR instead makes it so we check for *all* placeholders early if the
operator is `.*`.

It’s worth noting that,
1. In the `.*` case, we now additionally also check for *any*
placeholders (not just non-overload-sets) in the LHS; this shouldn’t
make a difference, however—at least I couldn’t think of a way to trigger
the assertion with an overload set as the LHS of `.*`; it is worth
noting that the assertion in question would also complain if the LHS
happened to be of placeholder type, though.
2. There is another case in which we also don’t perform overload
resolution—namely `=` if the LHS is not of class or enumeration type
after handling non-overload-set placeholders—as in the `.*` case, but
similarly to 1., I first couldn’t think of a way of getting this case to
crash, and secondly, `CreateBuiltinBinOp()` doesn’t seem to care about
placeholders in the LHS or RHS in the `=` case (from what I can tell,
it, or rather one of its callees, only checks that the LHS is not a
pseudo-object type, but those will have already been handled by the call
to `checkPlaceholderForOverload()` by the time we get to this function),
so I don’t think this case suffers from the same problem.

This fixes #53815.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>

11 months ago[RISCV] Fix crash when unrolling loop containing vector instructions (#83384)
Shih-Po Hung [Sat, 2 Mar 2024 04:33:55 +0000 (12:33 +0800)]
[RISCV] Fix crash when unrolling loop containing vector instructions (#83384)

When MVT is not a vector type, TCK_CodeSize should return an invalid
cost. This patch adds a check in the beginning to make sure all cost
kinds return invalid costs consistently.

Before this patch, TCK_CodeSize returns a valid cost on scalar MVT but
other cost kinds don't.

This fixes the issue #83294 where a loop contains vector instructions
and MVT is scalar after type legalization when the vector extension is
not enabled.

(cherry picked from commit fb67dce1cb87e279593c27bd4122fe63bad75f04)

11 months ago[ELF] Internalize enum
Fangrui Song [Fri, 1 Mar 2024 19:17:22 +0000 (11:17 -0800)]
[ELF] Internalize enum

g++ -flto has a diagnostic `-Wodr` about mismatched redeclarations,
which even applies to `enum`.

Fix #83529

Reviewers: thesamesam

Reviewed By: thesamesam

Pull Request: https://github.com/llvm/llvm-project/pull/83604

(cherry picked from commit 4a3f7e798a31072a80a0731b8fb1da21b9c626ed)

11 months agoUnbreak *tf builtins for hexfloat (#82208)
Alexander Richardson [Wed, 21 Feb 2024 20:59:56 +0000 (12:59 -0800)]
Unbreak *tf builtins for hexfloat (#82208)

This re-lands cc0065a7d082f0bd322a538cf62cfaef1c8f89f8 in a way that
keeps existing targets working.

---------

Original commit message:
#68132 ended up removing
__multc3 & __divtc3 from compiler-rt library builds that have
QUAD_PRECISION but not TF_MODE due to missing int128 support.
I added support for QUAD_PRECISION to use the native hex float long double representation.

---------

Co-authored-by: Sean Perry <perry@ca.ibm.com>
(cherry picked from commit 99c457dc2ef395872d7448c85609f6cb73a7f89b)

11 months ago[AArch64] Skip over shadow space for ARM64EC entry thunk variadic calls (#80994)
Billy Laws [Tue, 27 Feb 2024 18:32:15 +0000 (18:32 +0000)]
[AArch64] Skip over shadow space for ARM64EC entry thunk variadic calls (#80994)

When in an entry thunk, the x64 SP is passed in x4, but this cannot be
passed through directly since x64 varargs calls have a 32-byte shadow
store at SP followed by the in-stack parameters. ARM64EC varargs calls,
on the other hand, expect x4 to point to the first in-stack parameter.

11 months ago[AArch64] Fix generated types for ARM64EC variadic entry thunk targets (#80595)
Billy Laws [Mon, 5 Feb 2024 17:26:16 +0000 (17:26 +0000)]
[AArch64] Fix generated types for ARM64EC variadic entry thunk targets (#80595)

ISel handles filling in x4/x5 when calling variadic functions as they
don't correspond to the 5th/6th X64 arguments but rather to the end of
the shadow space on the stack and the size in bytes of all stack
parameters (ignored and written as 0 for calls from entry thunks).

Will PR a follow up with ISel handling after this is merged.

11 months ago[AArch64] Fix variadic tail-calls on ARM64EC (#79774)
Billy Laws [Wed, 31 Jan 2024 02:32:15 +0000 (02:32 +0000)]
[AArch64] Fix variadic tail-calls on ARM64EC (#79774)

ARM64EC varargs calls expect that x4 = sp at entry. Special handling is
needed to ensure this with tail calls, since they occur after the
epilogue and the x4 write happens before.

I tried going through AArch64MachineFrameLowering for this, hoping to
avoid creating the dummy object, but this was the best I could do since
the stack info that it uses isn't populated at this stage, and
CreateFixedObject also explicitly forbids 0-sized objects.

11 months ago[LoongArch] Override LoongArchTargetLowering::getExtendForAtomicCmpSwapArg (#83656)
Lu Weining [Mon, 4 Mar 2024 00:38:52 +0000 (08:38 +0800)]
[LoongArch] Override LoongArchTargetLowering::getExtendForAtomicCmpSwapArg (#83656)

This patch aims to solve Firefox issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=1882301

Similar to 616289ed2922. Currently LoongArch uses an ll.[wd]/sc.[wd]
loop for ATOMIC_CMP_XCHG. Because the comparison in the loop is
full-width (i.e. the `bne` instruction), we must sign extend the input
comparison argument.
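
A small sketch of why the extension matters, assuming (as the message above implies) that the 32-bit load in the ll/sc loop leaves a sign-extended value in a 64-bit register and that the comparison looks at all 64 bits. The helper names are hypothetical and only model register contents.

```cpp
#include <cstdint>

// Models a 32-bit value held sign-extended in a 64-bit register.
inline int64_t signExtend32(uint32_t V) {
  return static_cast<int64_t>(static_cast<int32_t>(V));
}

// Models the same value held zero-extended in a 64-bit register.
inline int64_t zeroExtend32(uint32_t V) {
  return static_cast<int64_t>(static_cast<uint64_t>(V));
}

// Models a full-width register comparison such as the one done by `bne`.
inline bool fullWidthEqual(int64_t Lhs, int64_t Rhs) { return Lhs == Rhs; }
```

For a value with the top bit set, such as 0x80000000, `fullWidthEqual(signExtend32(v), zeroExtend32(v))` is false in this model even though the low 32 bits agree, so a compare-and-swap whose expected argument was not sign-extended would fail spuriously.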

Note that LoongArch ISA manual V1.1 has introduced compare-and-swap
instructions. We would change the implementation (return `ANY_EXTEND`)
when we support them.

(cherry picked from commit 5f058aa211995d2f0df2a0e063532832569cb7a8)

11 months ago[TableGen] Fix wrong codegen of BothFusionPredicateWithMCInstPredicate (#83990)
Wang Pengcheng [Tue, 5 Mar 2024 11:54:02 +0000 (19:54 +0800)]
[TableGen] Fix wrong codegen of BothFusionPredicateWithMCInstPredicate (#83990)

We should generate the `MCInstPredicate` twice, once with `FirstMI`
and once with `SecondMI`.

(cherry picked from commit de1f33873beff93063577195e1214a9509e229e0)

11 months ago[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)
Vadim Paretsky [Sat, 9 Mar 2024 18:47:31 +0000 (10:47 -0800)]
[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)

MSVC does not define __BYTE_ORDER__, which makes the check for BigEndian
erroneously evaluate to true and breaks the struct definitions in
MSVC-compiled builds. The fix adds a check for whether __BYTE_ORDER__
is defined by the compiler.
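
The pitfall can be reproduced on any compiler with stand-in macros. In this sketch `HYPO_BYTE_ORDER` and `HYPO_ORDER_BIG_ENDIAN` are hypothetical names that are deliberately never defined, mimicking a compiler that provides neither macro; it illustrates the preprocessor rule involved and is not the code from the OpenMP headers.

```cpp
// Undefined identifiers in #if expressions evaluate to 0, so this naive
// comparison becomes 0 == 0 and wrongly selects the big-endian branch.
#if HYPO_BYTE_ORDER == HYPO_ORDER_BIG_ENDIAN
constexpr bool kNaiveSaysBigEndian = true;
#else
constexpr bool kNaiveSaysBigEndian = false;
#endif

// Guarding with defined() first avoids the false positive.
#if defined(HYPO_BYTE_ORDER) && HYPO_BYTE_ORDER == HYPO_ORDER_BIG_ENDIAN
constexpr bool kGuardedSaysBigEndian = true;
#else
constexpr bool kGuardedSaysBigEndian = false;
#endif

static_assert(kNaiveSaysBigEndian, "unguarded check is true by accident");
static_assert(!kGuardedSaysBigEndian, "guarded check stays false");
```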

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
(cherry picked from commit 110141b37813dc48af33de5e1407231e56acdfc5)

11 months agoReleaseNotes for LLVM binary utilities (#83751)
Fangrui Song [Mon, 11 Mar 2024 20:40:01 +0000 (13:40 -0700)]
ReleaseNotes for LLVM binary utilities (#83751)

11 months ago[InstCombine] Fix infinite loop in select equivalence fold (#84036)
Nikita Popov [Wed, 6 Mar 2024 08:33:51 +0000 (09:33 +0100)]
[InstCombine] Fix infinite loop in select equivalence fold (#84036)

When replacing with a non-constant, it's possible that the result of the
simplification is actually more complicated than the original, and may
result in an infinite combine loop.

Mitigate the issue by requiring that either the replacement or
simplification result is constant, which should ensure that it's
simpler. While this check is crude, it does not appear to cause
optimization regressions in real-world code in practice.

Fixes https://github.com/llvm/llvm-project/issues/83127.

(cherry picked from commit 9f45c5e1a65a1abf4920b617d36ed05e73c04bea)

11 months ago [InstCombine] Fix shift calculation in InstCombineCasts (#84027)
Quentin Dian [Tue, 5 Mar 2024 22:16:28 +0000 (06:16 +0800)]
[InstCombine] Fix shift calculation in InstCombineCasts (#84027)

Fixes #84025.

(cherry picked from commit e96c0c1d5e0a9916098b1a31acb006ea6c1108fb)

11 months ago[test] Make two sanitize-coverage tests pass with glibc 2.39+
Fangrui Song [Wed, 6 Mar 2024 21:17:43 +0000 (13:17 -0800)]
[test] Make two sanitize-coverage tests pass with glibc 2.39+

glibc 2.39 added `nonnull` attribute to most libio functions accepting a
`FILE*` parameter, including fprintf[1]. The -fsanitize=undefined mode
checks the argument to fprintf and has extra counters, not expected by
two tests. Specify -fno-sanitize=nonnull-attribute to make the two tests
pass.

Fix #82883

[1]: https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554

Pull Request: https://github.com/llvm/llvm-project/pull/84231

(cherry picked from commit c3acbf6bb06f9039f9850e18e0ae2f2adef63905)

11 months ago[DSE] Delay deleting non-memory-defs until end of DSE. (#83411)
Florian Hahn [Sat, 2 Mar 2024 12:34:36 +0000 (12:34 +0000)]
[DSE] Delay deleting non-memory-defs until end of DSE. (#83411)

DSE uses BatchAA, which caches queries using pairs of MemoryLocations.
At the moment, DSE may remove instructions that are used as pointers in
cached MemoryLocations. If a new instruction is used by a new
MemoryLocation and gets allocated at the same address as a previously
cached and then removed instruction, we may access an incorrect entry in
the cache.

To avoid this, delay removing all instructions except MemoryDefs until
the end of DSE. This should avoid removing any values used in BatchAA's
cache.
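
The idea of delaying deletion can be sketched with a toy pass that owns its instructions and an address-keyed cache. This is an illustration only: `ToyPass`, `Inst` and the member names are hypothetical and do not correspond to DSE or BatchAA code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

struct Inst {
  int id;
};

class ToyPass {
  std::vector<std::unique_ptr<Inst>> Live;
  std::vector<std::unique_ptr<Inst>> PendingDelete;
  std::unordered_map<const Inst *, int> Cache; // keyed by address

public:
  Inst *create(int Id) {
    Live.push_back(std::make_unique<Inst>(Inst{Id}));
    return Live.back().get();
  }

  void cacheResult(const Inst *I, int Result) { Cache[I] = Result; }
  bool hasCachedResult(const Inst *I) const { return Cache.count(I) != 0; }
  std::size_t pendingCount() const { return PendingDelete.size(); }

  // Unlinks I but keeps its memory alive, so no later allocation can reuse
  // the address while the cache may still hold an entry for it.
  void remove(Inst *I) {
    auto It = std::find_if(
        Live.begin(), Live.end(),
        [I](const std::unique_ptr<Inst> &P) { return P.get() == I; });
    assert(It != Live.end() && "instruction is not live");
    PendingDelete.push_back(std::move(*It));
    Live.erase(It);
  }

  // End of the pass: drop the cache first, then free the removed objects.
  void finish() {
    Cache.clear();
    PendingDelete.clear();
  }
};
```

Because a removed instruction is still allocated until `finish()`, an instruction created afterwards cannot share its address and so cannot hit the stale cache entry.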

Test case by @vporpo from
https://github.com/llvm/llvm-project/pull/83181.
(Test not precommitted because the results are non-deterministic - memset
only sometimes gets removed)

PR: https://github.com/llvm/llvm-project/pull/83411
(cherry picked from commit 10f5e983a9e3162a569cbebeb32168716e391340)

11 months ago[clang][fat-lto-objects] Make module flags match non-FatLTO pipelines (#83159)
Paul Kirth [Thu, 29 Feb 2024 03:11:55 +0000 (19:11 -0800)]
[clang][fat-lto-objects] Make module flags match non-FatLTO pipelines (#83159)

In addition to being rather hard to follow, there isn't a good reason
why FatLTO shouldn't just share the same code for setting module flags
for (Thin)LTO. This patch simplifies the logic and makes sure we set
these flags in a consistent way, independent of FatLTO.

Additionally, we now test that output in the .llvm.lto section actually
matches the output from Full and Thin LTO compilation.

(cherry picked from commit 7d8b50aaab8e0f935e3cb1f3f397e98b9e3ee241)

11 months agoAllow .alt_entry symbols to pass the .cfi nesting check (#82268)
Jon Roelofs [Wed, 28 Feb 2024 21:03:35 +0000 (13:03 -0800)]
Allow .alt_entry symbols to pass the .cfi nesting check (#82268)

A symbol with an `N_ALT_ENTRY` attribute may be defined in the middle of
a subsection, so it is reasonable to opt them out of the
`.cfi_{start,end}proc` nesting check.

Fixes: https://github.com/llvm/llvm-project/issues/82261
(cherry picked from commit 5b91647e3f82c9747c42c3239b7d7f3ade4542a7)

11 months agoMIPS: fix emitDirectiveCpsetup on N32 (#80534)
YunQiang Su [Mon, 26 Feb 2024 21:08:58 +0000 (05:08 +0800)]
MIPS: fix emitDirectiveCpsetup on N32 (#80534)

In gas, .cpsetup may expand to one of two code sequences (one is related to `__gnu_local_gp`), depending on -mno-shared and -msym32.
Since Clang doesn't support -mno-shared or -msym32, .cpsetup expands to one code sequence.
The N32 condition wrongly selects the `__gnu_local_gp` code sequence.

```
00000000 <t1>:
   0:   ffbc0008        sd      gp,8(sp)
   4:   3c1c0000        lui     gp,0x0
                        4: R_MIPS_HI16  __gnu_local_gp
   8:   279c0000        addiu   gp,gp,0
                        8: R_MIPS_LO16  __gnu_local_gp
```

Fixes: #52785
(cherry picked from commit 860b6edfa9b344fbf8c500c17158c8212ea87d1c)

11 months ago[libc++][modules] Fixes naming inconsistency. (#83036)
Mark de Wever [Tue, 27 Feb 2024 17:10:53 +0000 (18:10 +0100)]
[libc++][modules] Fixes naming inconsistency. (#83036)

The modules used is-standard-library and is-std-library. The latter is
the name used in the SG15 proposal.

Fixes: https://github.com/llvm/llvm-project/issues/82879
(cherry picked from commit b50bcc7ffb6ad6caa4c141a22915ab59f725b7ae)

11 months agoBump version to 18.1.2 (#84655)
Tom Stellard [Mon, 11 Mar 2024 14:31:28 +0000 (07:31 -0700)]
Bump version to 18.1.2 (#84655)

11 months agoBump version to 18.1.1
Tom Stellard [Fri, 8 Mar 2024 05:27:31 +0000 (21:27 -0800)]
Bump version to 18.1.1

11 months agoRemove RC suffix
Tobias Hieta [Tue, 19 Sep 2023 07:44:33 +0000 (09:44 +0200)]
Remove RC suffix

11 months agoMIPS: Fix asm constraints "f" and "r" for softfloat (#79116)
YunQiang Su [Tue, 27 Feb 2024 06:08:36 +0000 (14:08 +0800)]
MIPS: Fix asm constraints "f" and "r" for softfloat (#79116)

This includes 2 fixes:
        1. Disallow 'f' for softfloat.
        2. Allow 'r' for softfloat.

Currently, 'f' is accepted by clang, and LLVM then hits an internal error.

'r' is rejected by LLVM with: couldn't allocate input reg for constraint
'r'.

Fixes: #64241, #63632
---------

Co-authored-by: Fangrui Song <i@maskray.me>
(cherry picked from commit c88beb4112d5bbf07d76a615ab7f13ba2ba023e6)

11 months ago[Mips] Fix unable to handle inline assembly ends with compat-branch o… (#77291)
yingopq [Sat, 24 Feb 2024 07:13:43 +0000 (15:13 +0800)]
[Mips] Fix unable to handle inline assembly ends with compat-branch o… (#77291)

…n MIPS

Modify:
Add a global variable 'CurForbiddenSlotAttr' to record whether the
current instruction has a forbidden slot and whether '.set reorder' is
active. This is the condition for deciding whether to add a nop. We
wrap the current instruction and the next instruction in a pair of
'.set noreorder' and '.set reorder' directives.
Then 'CurForbiddenSlotAttr' gives us the previous instruction's
forbidden slot attribute and whether '.set reorder' was active.
If the previous instruction has a forbidden slot, '.set reorder' is
active, and the current instruction is a CTI, emit a NOP after it.
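
The decision rule above amounts to a three-way conjunction. A minimal sketch, assuming a small record for the saved state; `ForbiddenSlotAttr` and `needsNopAfterPrevious` are hypothetical names and not the identifiers used in the patch:

```cpp
// State remembered about the previously emitted instruction.
struct ForbiddenSlotAttr {
  bool hasForbiddenSlot = false; // previous instruction has a forbidden slot
  bool reorderActive = false;    // '.set reorder' was in effect for it
};

// Returns true when a NOP has to be placed after the previous instruction,
// i.e. before the current one, so that a CTI never lands in the slot.
inline bool needsNopAfterPrevious(const ForbiddenSlotAttr &Prev,
                                  bool CurIsCTI) {
  return Prev.hasForbiddenSlot && Prev.reorderActive && CurIsCTI;
}
```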

Fix https://github.com/llvm/llvm-project/issues/61045.

Because https://reviews.llvm.org/D158589 stayed in the 'Needs Review'
state without being concluded, we submit the change again as a pull request.

(cherry picked from commit 96abee5eef31274415681018553e1d4a16dc16c9)

11 months ago[NFC][AArch64] fix whitespace in AArch64SchedNeoverseV1 (#81744)
Philipp Tomsich [Thu, 15 Feb 2024 00:54:08 +0000 (16:54 -0800)]
[NFC][AArch64] fix whitespace in AArch64SchedNeoverseV1 (#81744)

One of the whitespace fixes didn't get added to the commit introducing
the Ampere1B model.
Clean it up.

(cherry picked from commit 3369e341288b3d9bb59827f9a2911ebf3d36408d)

11 months ago[AArch64] Initial Ampere1B scheduling model (#81341)
Philipp Tomsich [Wed, 14 Feb 2024 14:23:14 +0000 (06:23 -0800)]
[AArch64] Initial Ampere1B scheduling model (#81341)

The Ampere1B core is enabled with a new scheduling/pipeline model, as it
provides significant updates over the Ampere1 core; it reduces latencies
on many instructions, has some micro-ops reassigned between the XY and X
units, and provides modelling for the instructions added since Ampere1
and Ampere1A.

As this is the first model implementing the CSSC instructions, we update
the UnsupportedFeatures on all other models (that have CompleteModel
set).

Testcases are added under llvm-mca: these showed the FullFP16 feature
missing, so we are adding it in as part of this commit.

This *adds tests and additional fixes* compared to the reverted #81338.

(cherry picked from commit dd1897c6cb028bda7d4d541d1bb33965eccf0a68)

11 months ago[AArch64] Add the Ampere1B core (#81297)
Philipp Tomsich [Fri, 9 Feb 2024 23:22:09 +0000 (15:22 -0800)]
[AArch64] Add the Ampere1B core (#81297)

The Ampere1B is Ampere's third-generation core implementing a
superscalar, out-of-order microarchitecture with nested virtualization,
speculative side-channel mitigation and architectural support for
defense against ROP/JOP style software attacks.

Ampere1B is an ARMv8.7+ implementation, adding support for the FEAT
WFxT, FEAT CSSC, FEAT PAN3 and FEAT AFP extensions. It also includes all
features of the second-generation Ampere1A, such as the Memory Tagging
Extension and SM3/SM4 cryptography instructions.

(cherry picked from commit fbba818a78f591d89f25768ba31783714d526532)

11 months ago[AArch64] Make +pauth enabled in Armv8.3-a by default (#78027)
Anatoly Trosinenko [Thu, 1 Feb 2024 16:23:55 +0000 (19:23 +0300)]
[AArch64] Make +pauth enabled in Armv8.3-a by default (#78027)

Add AEK_PAUTH to ARMV8_3A in TargetParser and let it propagate to
ARMV8R, as it aligns with GCC defaults.

After adding AEK_PAUTH, several tests from TargetParserTest.cpp crashed
when trying to format an error message, thus update a format string in
AssertSameExtensionFlags to account for bitmask being pre-formatted as
std::string.

The CHECK-PAUTH* lines in aarch64-target-features.c are updated to
account for the fact that FEAT_PAUTH support and pac-ret can be enabled
independently and all four combinations are possible.

(cherry picked from commit a52eea66795018550e95c4b060165a7250899298)

11 months ago[Clang] Fixes to immediate-escalating functions (#82281)
cor3ntin [Wed, 21 Feb 2024 19:53:44 +0000 (20:53 +0100)]
[Clang] Fixes to immediate-escalating functions (#82281)

* Consider that immediate-escalating functions can appear at global
scope, fixing a crash

* Lambda conversion to function pointer was sometimes not performed in
an immediate function context when it should be.

Fixes #82258

(cherry picked from commit baf6bd303bd58a521809d456dd9b179636982fc5)

11 months ago[llvm-shlib] Change libLLVM-$MAJOR.so symlink to point to versioned SO (#82660)
Tom Stellard [Fri, 23 Feb 2024 23:58:32 +0000 (15:58 -0800)]
[llvm-shlib] Change libLLVM-$MAJOR.so symlink to point to versioned SO (#82660)

This symlink was added in 91a384621e5b762d9c173ffd247cfeadd5f436a2 to
maintain backwards compatibility, but it needs to point to
libLLVM.so.$MAJOR.$MINOR rather than libLLVM.so. This works better for
distros that ship libLLVM.so and libLLVM.so.$MAJOR.$MINOR in separate
packages and also prevents mistakes like
libLLVM-19.so -> libLLVM.so -> libLLVM.so.18.1

Fixes #82647

(cherry picked from commit 10c48a772742b7afe665a815b7eba2047f17dc4b)

11 months ago[llvm][AArch64] Do not inline a function with different signing scheme. (#80642)...
Dani [Mon, 26 Feb 2024 23:13:43 +0000 (00:13 +0100)]
[llvm][AArch64] Do not inline a function with different signing scheme. (#80642) (#82743)

If the signing scheme is different, the functions may assume
different behaviours, and it is dangerous to inline them without
analysing them. This should be a rare case.

11 months ago[clang][CodeGen] Keep processing the rest of AST after encountering unsupported MC...
Wentao Zhang [Thu, 22 Feb 2024 22:04:25 +0000 (16:04 -0600)]
[clang][CodeGen] Keep processing the rest of AST after encountering unsupported MC/DC expressions (#82464)

Currently, upon seeing unsupported decisions (more than 6 conditions, or
split nesting), the post-visitor hook dataTraverseStmtPost() returns
false. As a result, even supported decisions in the rest of the tree are
skipped, as in the code below:

{ // CompoundStmt
  a && b;           // 1: BinaryOperator (supported)
  a && foo(b && c); // 2: BinaryOperator (not yet supported due to split
                    //                    nesting)
  a && b;           // 3: BinaryOperator (supported)
}

Decision 3 will not be processed at all. And only one "Decision" region
will be emitted. Compiler explorer example:
https://godbolt.org/z/Px61sesoo

We hope to process such cases and emit two "Decision" regions (1 and 3)
in the above example.
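
The difference between the two behaviours can be modelled with a flat list of decisions instead of a real AST. This is a simplified sketch: `Decision`, `emitStoppingAtUnsupported` and `emitSkippingUnsupported` are hypothetical names, and the early exit stands in for the visitor hook returning false.

```cpp
#include <vector>

struct Decision {
  int id;
  bool supported;
};

// Old behaviour: the first unsupported decision aborts the whole walk.
inline std::vector<int>
emitStoppingAtUnsupported(const std::vector<Decision> &Decisions) {
  std::vector<int> Emitted;
  for (const Decision &D : Decisions) {
    if (!D.supported)
      break; // models the hook returning false
    Emitted.push_back(D.id);
  }
  return Emitted;
}

// Desired behaviour: skip the unsupported decision and keep going.
inline std::vector<int>
emitSkippingUnsupported(const std::vector<Decision> &Decisions) {
  std::vector<int> Emitted;
  for (const Decision &D : Decisions) {
    if (!D.supported)
      continue;
    Emitted.push_back(D.id);
  }
  return Emitted;
}
```

For the three statements in the example above (supported, unsupported, supported), the first function emits only decision 1, while the second emits decisions 1 and 3.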

(cherry picked from commit d4bfca3b2e673789f7c278d46a199ae8910ddd37)

11 months agoFix build on musl by including stdint.h (#81434)
Daniel Martinez [Thu, 22 Feb 2024 21:14:27 +0000 (21:14 +0000)]
Fix build on musl by including stdint.h (#81434)

openmp fails to build on musl since it lacks the defines for int32_t

Co-authored-by: Daniel Martinez <danielmartinez@cock.li>
(cherry picked from commit 45fe67dd61a6ac7df84d3a586e41c36a4767757f)

11 months ago[FlattenCFG] Fix the miscompilation where phi nodes exist in the merge point (#81987)
Yingwei Zheng [Sun, 25 Feb 2024 14:01:13 +0000 (22:01 +0800)]
[FlattenCFG] Fix the miscompilation where phi nodes exist in the merge point (#81987)

When there are phi nodes in the merge point of the if-region, we cannot
do the merge.
Alive2: https://alive2.llvm.org/ce/z/DbgEan
Fixes #70900.

(cherry picked from commit f920b746ea818f1d21f317116cbb105e3e85979a)

11 months ago[GVN] Drop nsw/nuw flags when replacing the result of a with.overflow intrinsic with...
Yingwei Zheng [Mon, 26 Feb 2024 07:55:56 +0000 (15:55 +0800)]
[GVN] Drop nsw/nuw flags when replacing the result of a with.overflow intrinsic with a overflowing binary operator (#82935)

Alive2: https://alive2.llvm.org/ce/z/gyL7mn
Fixes https://github.com/llvm/llvm-project/issues/82884.

(cherry picked from commit 892b4beeac50920e630f10905b2916295e2eb6d8)

11 months ago[SystemZ] Use VT (not ArgVT) for SlotVT in LowerCall(). (#82475)
Jonas Paulsson [Wed, 21 Feb 2024 15:26:16 +0000 (16:26 +0100)]
[SystemZ] Use VT (not ArgVT) for SlotVT in LowerCall(). (#82475)

When an integer argument is promoted and *not* split (like i72 -> i128 on
a new machine with vector support), the SlotVT should be i128, which is
stored in VT - not ArgVT.

Fixes #81417

(cherry picked from commit 9c0e45d7f0e2202e16dbd9a7b9f462e2bcb741ae)

11 months ago[SystemZ] Require D12 for i128 accesses in isLegalAddressingMode() (#79221)
Jonas Paulsson [Wed, 24 Jan 2024 19:16:05 +0000 (20:16 +0100)]
[SystemZ] Require D12 for i128 accesses in isLegalAddressingMode() (#79221)

Machines with vector support handle i128 in vector registers and
therefore only have the small displacement available for memory
accesses. Update isLegalAddressingMode() to reflect this.

(cherry picked from commit 84dcf3d35b6ea8d8b6c34bc9cf21135863c47b8c)

11 months agofix links on clang 18.1.0rc release page (#82739)
h-vetinari [Mon, 26 Feb 2024 14:06:56 +0000 (15:06 +0100)]
fix links on clang 18.1.0rc release page (#82739)

Looking at the [release
notes](https://prereleases.llvm.org/18.1.0/rc3/tools/clang/docs/ReleaseNotes.html)
for clang 18.1.0rc, there are some broken links, and many issue numbers
mis-formatted with an extra colon. Aside from being used inconsistently
(with/without colon), I think it should be uncontroversial that `See
(#62707).` is better than `See (#62707:).`

CC @tstellar @AaronBallman

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>

12 months agoBackport 0bf4f82 to release/18.x (#82571)
Wentao Zhang [Sat, 24 Feb 2024 00:07:23 +0000 (18:07 -0600)]
Backport 0bf4f82 to release/18.x (#82571)

Manually cherry-pick 0bf4f82f661817c79bd538c82c99515837cf1cf8 (#80952)
and resolve conflicts

Closes #82570

12 months ago[libc++] Add details about string annotations (#82730)
Tacet [Fri, 23 Feb 2024 21:06:47 +0000 (22:06 +0100)]
[libc++] Add details about string annotations (#82730)

This commit adds information that only long strings are annotated, and
with all allocators by default.

To read why short string annotations are not turned on yet, read
comments in a related PR:
https://github.com/llvm/llvm-project/pull/79536

Upstreamed in: 7661ade5d1ac4fc8e1e2339b2476cb8e45c24641
Upstream PR: #80912

---------

Co-authored-by: Mark de Wever <zar-rpg@xs4all.nl>

12 months ago[llvm-readobj,ELF] Support --decompress/-z (#82594)
Fangrui Song [Thu, 22 Feb 2024 17:24:21 +0000 (09:24 -0800)]
[llvm-readobj,ELF] Support --decompress/-z (#82594)

When a section has the SHF_COMPRESSED flag, -p/-x dump the compressed
content by default. In GNU readelf, if --decompress/-z is specified,
-p/-x will dump the decompressed content. This patch implements the
option.

Close #82507

(cherry picked from commit 26d71d9ed56c4c23e6284dac7a9bdf603a5801f3)

12 months ago[docs][llvm-objcopy] Add missing formats (#81981)
Ulrich Weigand [Fri, 16 Feb 2024 11:11:04 +0000 (12:11 +0100)]
[docs][llvm-objcopy] Add missing formats (#81981)

Bring list of supported formats in docs back in sync with the code.

(cherry picked from commit bf471c915d14035a24ec027fb2bb0373cefdabe1)

12 months ago[libc++] Only include <setjmp.h> from the C library if it exists (#81887)
Louis Dionne [Fri, 16 Feb 2024 21:45:00 +0000 (16:45 -0500)]
[libc++] Only include <setjmp.h> from the C library if it exists (#81887)

In 2cea1babefbb, we removed the <setjmp.h> header provided by libc++. However, we did not conditionally include the underlying <setjmp.h>
header only if the C library provides one, which we otherwise do consistently (see e.g. 647ddc08f43c).

rdar://122978778
(cherry picked from commit d8278b682386f51dfba204849c624672a3df40c7)

12 months ago[Loads] Fix crash in isSafeToLoadUnconditionally with scalable accessed type (#82650)
Luke Lau [Thu, 22 Feb 2024 17:49:19 +0000 (01:49 +0800)]
[Loads] Fix crash in isSafeToLoadUnconditionally with scalable accessed type (#82650)

This fixes #82606 by updating isSafeToLoadUnconditionally to handle
fixed sized loads from a scalable accessed type.

(cherry picked from commit b0edc1c45284586fdb12edd666f95d99f5f62b43)

12 months agoReleaseNotes: mention -mtls-dialect=desc (#82731)
Fangrui Song [Fri, 23 Feb 2024 20:43:55 +0000 (12:43 -0800)]
ReleaseNotes: mention -mtls-dialect=desc (#82731)

12 months agoExtend GCC workaround to GCC < 8.4 for llvm::iterator_range ctor (#82643)
Thomas Preud'homme [Thu, 22 Feb 2024 21:01:05 +0000 (21:01 +0000)]
Extend GCC workaround to GCC < 8.4 for llvm::iterator_range ctor (#82643)

The GCC SFINAE error with decltype was fixed in commit
ac5e28911abdfb8d9bf6bea980223e199bbcf28d, which made it into GCC 8.4.
Therefore adjust the GCC version test accordingly.

(cherry picked from commit 7f71fa909a10be182b82b9dfaf0fade6eb84796c)

12 months agoFix llvm-x86_64-debian-dylib buildbot
Tom Stellard [Wed, 21 Feb 2024 00:14:59 +0000 (00:14 +0000)]
Fix llvm-x86_64-debian-dylib buildbot

This was broken by 91a384621e5b762d9c173ffd247cfeadd5f436a2.

(cherry picked from commit ff4d6c64ee4269e4a9b67a4dae7e0b82ae1c3419)

12 months ago[cmake] Add minor version to library SONAME (#79376)
Tom Stellard [Tue, 20 Feb 2024 00:46:16 +0000 (16:46 -0800)]
[cmake] Add minor version to library SONAME (#79376)

We need to do this now that we are bumping the minor release number when
we create the release branch.

This also results in a slight change to the library names for LLVM. The
main library now has a more conventional library name:
'libLLVM.so.$major.$minor'. The old library name, libLLVM-$major.so, is
now a symlink that points to the new library. However, the symlink is
not present in the build directory. It is only present in the install
directory.

The library name was changed because it helped to keep the CMake changes
simpler.

Fixes #76273

(cherry picked from commit 91a384621e5b762d9c173ffd247cfeadd5f436a2)

12 months ago[workflows] Fix permissions check for creating new releases (#81163)
Tom Stellard [Wed, 21 Feb 2024 01:52:38 +0000 (17:52 -0800)]
[workflows] Fix permissions check for creating new releases (#81163)

The default GitHub token does not have read permissions on the org, so
we need to use a custom token in order to read the members of the
llvm-release-managers team.

(cherry picked from commit 2836d8edbfbcd461b25101ed58f93c862d65903a)

12 months ago[Release] Don't build during test-release.sh Phase 3 install (#82001)
Rainer Orth [Tue, 20 Feb 2024 06:26:48 +0000 (07:26 +0100)]
[Release] Don't build during test-release.sh Phase 3 install (#82001)

As described in [test-release.sh ninja install does builds in Phase
3](https://github.com/llvm/llvm-project/issues/80999), considerable
parts of Phase 3 of a `test-release.sh` build are run by `ninja
install`, ignoring both `$Verbose` and the parallelism set via `-j NUM`.

This patch fixes this by not specifying any explicit build target for
Phase 3, thus running the full build as usual.

Tested on `sparc64-unknown-linux-gnu`.

(cherry picked from commit f6ac598c104ed3c9f4bcbbe830f86500c8d1013e)

12 months ago[IndVarSimplify] Fix poison-safety when reusing instructions (#80458)
Nikita Popov [Mon, 5 Feb 2024 09:11:39 +0000 (10:11 +0100)]
[IndVarSimplify] Fix poison-safety when reusing instructions (#80458)

IndVars may replace an instruction with one of its operands, if they
have the same SCEV expression. However, such a replacement may be more
poisonous.

First, check whether the operand being poison implies that the
instruction is also poison, in which case the replacement is always
safe. If this fails, check whether SCEV can determine that reusing the
instruction is safe, using the same check as SCEVExpander.

Fixes https://github.com/llvm/llvm-project/issues/79861.

(cherry picked from commit 7d2b6f0b355bc98bbe3aa5bae83316a708da33ee)

12 months ago[SCEV] Move canReuseInstruction() helper into SCEV (NFC)
Nikita Popov [Fri, 2 Feb 2024 15:02:46 +0000 (16:02 +0100)]
[SCEV] Move canReuseInstruction() helper into SCEV (NFC)

To allow reusing it in IndVars.

(cherry picked from commit 43dd1e84df1ecdad872e1004af47b489e08fc228)

12 months ago[SCEVExpander] Do not reuse disjoint or (#80281)
Nikita Popov [Fri, 2 Feb 2024 09:52:05 +0000 (10:52 +0100)]
[SCEVExpander] Do not reuse disjoint or (#80281)

SCEV treats "or disjoint" the same as "add nsw nuw". However, when
expanding, we cannot generally replace an add SCEV node with an "or
disjoint" instruction. Just dropping the poison flag is insufficient in
this case; we would have to actually convert the or into an add.
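
A small sketch of why dropping the flag alone is not enough, using plain
C++ integers rather than LLVM IR. The helper names are hypothetical. The
point is only that `or` and `add` compute the same value exactly when the
operands share no set bits, so a plain `or` reused where the operands may
overlap gives a different result than the `add` that SCEV models.

```cpp
#include <cassert>
#include <cstdint>

// True when the operands share no set bits, which is what the "disjoint"
// flag promises.
bool haveNoCommonBits(uint32_t a, uint32_t b) { return (a & b) == 0; }

// a + b == (a | b) + (a & b), so or and add agree exactly when a & b == 0.
bool orEqualsAdd(uint32_t a, uint32_t b) { return (a | b) == a + b; }

// Usage example.
void checkOrVersusAdd() {
  // 4 = 0b100 and 3 = 0b011 are disjoint: or and add both give 7.
  assert(haveNoCommonBits(4, 3) && orEqualsAdd(4, 3));
  // 6 = 0b110 and 3 = 0b011 overlap: or gives 7, add gives 9.
  assert(!haveNoCommonBits(6, 3) && !orEqualsAdd(6, 3));
}
```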

This is a partial fix for #79861.

(cherry picked from commit 5b8e1a6ebf11b6e93bcc96a0d009febe4bb3d7bc)

12 months ago[IndVars] Add tests for #79861 (NFC)
Nikita Popov [Thu, 1 Feb 2024 11:57:59 +0000 (12:57 +0100)]
[IndVars] Add tests for #79861 (NFC)

(cherry picked from commit c105848fd29d3b46eeb794bb6b10dad04f903b09)

12 months ago[Serialization] Record whether the ODR is skipped (#82302)
Chuanqi Xu [Tue, 20 Feb 2024 05:31:28 +0000 (13:31 +0800)]
[Serialization] Record whether the ODR is skipped (#82302)

Close https://github.com/llvm/llvm-project/issues/80570.

In
https://github.com/llvm/llvm-project/commit/a0b6747804e46665ecfd00295b60432bfe1775b6,
we skipped ODR checks for decls in GMF. It is then natural to
skip storing the ODR values in the BMI.

Generally this should be fine as long as the writer and the reader stay
consistent.

However, the use of the preamble in clangd shows the tricky part.

Consider:

```
// test.cpp
module;

// any one of these is enough to crash clangd
// #include <iostream>
// #include <string_view>
// #include <cmath>
// #include <system_error>
// #include <new>
// #include <bit>
// probably many more

// only ok with libc++, not the system provided libstdc++ 13.2.1

// these are ok

export module test;
```

clangd will store the headers as a preamble to speed up parsing, and the
preamble reuses the serialization techniques. (Generally we'd call the
preamble a PCH. However, that is not strictly true. I've tested that a PCH
isn't problematic.) The tricky part is that the preamble
is not a module. It literally serializes and deserializes things. So
before clangd parses the above test module, it will serialize the
headers into the preamble. Note that there is no concept like GMF at that
point, so the ODR bits are stored. However, when clangd actually parses the
file, the decls from the preamble are treated as being in the GMF, so
the ODR bits are skipped. Then a mismatch happens.

To solve the problem, this patch adds another bit for decls to record
whether or not the ODR bits are skipped.

(cherry picked from commit 49775b1dc0cdb3a9d18811f67f268e3b3a381669)

12 months ago[llvm-objcopy] Add SystemZ support (#81841)
Ulrich Weigand [Fri, 16 Feb 2024 10:58:05 +0000 (11:58 +0100)]
[llvm-objcopy] Add SystemZ support (#81841)

This is also necessary for enabling ClangBuiltLinux:
https://github.com/ClangBuiltLinux/linux/issues/1530

(cherry picked from commit 3c02cb7492fc78fb678264cebf57ff88e478e14f)

12 months ago[compiler-rt][profile] Fix InstrProfilingFile possible resource leak. (#81363)
David CARLIER [Sat, 10 Feb 2024 19:14:28 +0000 (19:14 +0000)]
[compiler-rt][profile] Fix InstrProfilingFile possible resource leak. (#81363)

close #79708

(cherry picked from commit 0a255fcf4a90f9e864ae9321b28e4956f7c865fb)

12 months ago[PowerPC] Mask constant operands in ValueBit tracking (#67653)
Qiu Chaofan [Tue, 6 Feb 2024 10:37:31 +0000 (18:37 +0800)]
[PowerPC] Mask constant operands in ValueBit tracking (#67653)

In IR or C code, a shift amount larger than the value size is undefined
behavior. But in practice, backend lowering for shift_parts produces
add/sub of shift amounts, so constant shift amounts might be
negative or larger than the value size, and the result depends on the ISA
definition.

The PowerPC ISA says that the lowest 7 bits (6 bits for 32-bit instructions)
are taken; if the highest among them is 1, the result is
zero, otherwise the low 6 bits (or 5 on 32-bit) are used as the shift
amount.

This commit emulates the behavior and avoids array overflow in bit
permutation's value bits calculator.
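
The masking rule described above can be modelled roughly as follows. This
is a sketch for a logical left shift only, with hypothetical helper names;
it is not the code from the PowerPC backend.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit case: keep the lowest 7 bits of the amount. If the highest of
// those bits is set the result is zero, otherwise the low 6 bits are the
// shift amount.
uint64_t shiftLeft64(uint64_t value, int64_t amount) {
  uint64_t low7 = static_cast<uint64_t>(amount) & 0x7F;
  if (low7 & 0x40)
    return 0;
  return value << (low7 & 0x3F);
}

// 32-bit case: same rule with 6 bits kept and 5 bits used.
uint32_t shiftLeft32(uint32_t value, int32_t amount) {
  uint32_t low6 = static_cast<uint32_t>(amount) & 0x3F;
  if (low6 & 0x20)
    return 0;
  return value << (low6 & 0x1F);
}

// Usage example.
void checkShiftExamples() {
  assert(shiftLeft64(1, 3) == 8);   // ordinary in-range shift
  assert(shiftLeft64(1, 64) == 0);  // bit 6 of the amount is set
  assert(shiftLeft64(1, -1) == 0);  // negative amount: low 7 bits are 0x7F
  assert(shiftLeft32(1, 32) == 0);  // bit 5 of the amount is set
}
```

Because the amount is masked before it is used, a constant amount that is
negative or too large never becomes an out-of-range index in this model.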

(cherry picked from commit 292d9e869fcfc2ece694848db4022b0b939847e3)

12 months ago[llvm-objdump] Add support for the PT_OPENBSD_SYSCALLS segment type. (#82121)
Frederic Cambus [Tue, 20 Feb 2024 08:11:54 +0000 (09:11 +0100)]
[llvm-objdump] Add support for the PT_OPENBSD_SYSCALLS segment type. (#82121)

Reference: https://github.com/openbsd/src/blob/master/sys/sys/exec_elf.h
(cherry picked from commit 1b894864862d8049e4a2567a472efdc2eda1e035)

12 months ago[llvm-readobj] Add support for the PT_OPENBSD_SYSCALLS segment type. (#82122)
Frederic Cambus [Tue, 20 Feb 2024 08:12:58 +0000 (09:12 +0100)]
[llvm-readobj] Add support for the PT_OPENBSD_SYSCALLS segment type. (#82122)

Reference: https://github.com/openbsd/src/blob/master/sys/sys/exec_elf.h
(cherry picked from commit a8d7511811c7d7c689c3e8f858e8e00a56aba152)

12 months ago[OpenMP][AIX]Add assembly file containing microtasking routines and unnamed common...
Xing Xue [Tue, 20 Feb 2024 17:08:37 +0000 (12:08 -0500)]
[OpenMP][AIX]Add assembly file containing microtasking routines and unnamed common block definitions (#81770)

This patch adds assembly file `z_AIX_asm.S` that contains the 32- and
64-bit XCOFF version of microtasking routines and unnamed common block
definitions. This code has been run through the libomp LIT tests and a
user package successfully.

(cherry picked from commit 94100bc2fb1a39dbeb43d18a95176097c53f1324)

12 months ago[InstCombine] Fold gep of exact unsigned division (#82334)
Nikita Popov [Tue, 20 Feb 2024 11:48:13 +0000 (12:48 +0100)]
[InstCombine] Fold gep of exact unsigned division (#82334)

Extend the transform added in
https://github.com/llvm/llvm-project/pull/76458 to also handle unsigned
division. X exact/ Y * Y == X holds independently of whether the
division is signed or unsigned.
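
The identity can be spot-checked with ordinary integers. This is only an
illustration with hypothetical helper names; the asserts stand in for the
`exact` flag by requiring that the division leaves no remainder.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned variant: (x / y) * y, valid only when y divides x exactly.
uint32_t udivExactRoundTrip(uint32_t x, uint32_t y) {
  assert(y != 0 && x % y == 0 && "division must be exact");
  return (x / y) * y;
}

// Signed variant: same identity, with the overflowing INT32_MIN / -1 case
// excluded up front.
int32_t sdivExactRoundTrip(int32_t x, int32_t y) {
  assert(y != 0 && !(x == INT32_MIN && y == -1) && x % y == 0 &&
         "division must be exact and must not overflow");
  return (x / y) * y;
}

// Usage example: the same bit pattern round-trips under both readings.
void checkExactDivExamples() {
  assert(udivExactRoundTrip(0xFFFFFFF0u, 16) == 0xFFFFFFF0u);
  assert(sdivExactRoundTrip(-16, 16) == -16);
}
```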

Proofs: https://alive2.llvm.org/ce/z/wFd5Ec
(cherry picked from commit 26d4afc3de86ca5416c8e38000362c526b6808cd)

12 months ago[InstCombine] Add unsigned variants of gep exact div tests (NFC)
Nikita Popov [Tue, 20 Feb 2024 10:08:01 +0000 (11:08 +0100)]
[InstCombine] Add unsigned variants of gep exact div tests (NFC)

(cherry picked from commit ec2c770b9f9a0e9eca4a893383d2b27dd4c0bfe7)

12 months ago[RISCV] Check type is legal before combining mgather to vlse intrinsic (#81107)
Luke Lau [Thu, 8 Feb 2024 22:51:11 +0000 (06:51 +0800)]
[RISCV] Check type is legal before combining mgather to vlse intrinsic (#81107)

Otherwise we will crash since target intrinsics don't have their types
legalized. Let the mgather get legalized first, then do the combine on
the legal type.
Fixes #81088

Co-authored-by: Craig Topper <craig.topper@sifive.com>
(cherry picked from commit 06c89bd59ca2279f76a41e851b7b2df634a6191e)

12 months ago[ValueTracking] Fix computeKnownFPClass for fpext (#81972)
Yingwei Zheng [Sat, 17 Feb 2024 15:30:45 +0000 (23:30 +0800)]
[ValueTracking] Fix computeKnownFPClass for fpext (#81972)

This patch adds the missing `subnormal -> normal` part for `fpext` in
`computeKnownFPClass`.
Fixes the miscompilation reported by
https://github.com/llvm/llvm-project/pull/80941#issuecomment-1947302100.
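
The `subnormal -> normal` case is easy to observe outside the compiler. The
sketch below uses a C++ float-to-double conversion as a stand-in for
`fpext`; it assumes IEEE 754 `float` and `double` and that subnormals are
not flushed to zero. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True if the input is subnormal as a float but normal once extended to
// double.
bool subnormalBecomesNormal(float f) {
  double d = static_cast<double>(f); // stand-in for fpext float to double
  return std::fpclassify(f) == FP_SUBNORMAL &&
         std::fpclassify(d) == FP_NORMAL;
}

// Usage example.
void checkFpextExamples() {
  // The smallest positive float subnormal fits in double's normal range.
  assert(subnormalBecomesNormal(std::numeric_limits<float>::denorm_min()));
  // A normal input stays normal, so the helper reports false.
  assert(!subnormalBecomesNormal(1.0f));
}
```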

(cherry picked from commit a5865c3c3dbbd17ae12ecc1c297fe1fc2605df52)

12 months ago[Support/ELF] Add OpenBSD PT_OPENBSD_SYSCALLS constant.
Frederic Cambus [Sat, 17 Feb 2024 14:38:05 +0000 (15:38 +0100)]
[Support/ELF] Add OpenBSD PT_OPENBSD_SYSCALLS constant.

Reference: https://github.com/openbsd/src/blob/master/sys/sys/exec_elf.h
(cherry picked from commit 97eff26d0ca4d187a5efb8534af484dbb68bce30)

12 months ago[OpenMP][AIX] Set worker stack size to 2 x KMP_DEFAULT_STKSIZE if system stack size...
Xing Xue [Fri, 16 Feb 2024 20:12:41 +0000 (15:12 -0500)]
[OpenMP][AIX] Set worker stack size to 2 x KMP_DEFAULT_STKSIZE if system stack size is too big (#81996)

This patch sets the stack size of worker threads to `2 x
KMP_DEFAULT_STKSIZE` (2 x 4MB) for AIX if the system stack size is too
big. Also defines maximum stack size for 32-bit AIX.

(cherry picked from commit 2de269a641e4ffbb7a44e559c4c0a91bb66df823)

12 months ago[AIX] Add a dummy variable in the __llvm_orderfile section (#81968)
Wael Yehia [Fri, 16 Feb 2024 17:55:20 +0000 (12:55 -0500)]
[AIX] Add a dummy variable in the __llvm_orderfile section (#81968)

to satisfy the __start___llvm_orderfile reference when linking with
-bexpfull and -fprofile-generate on AIX.

(cherry picked from commit 15cccc55919d27eb2e89379a65f6c7809f679fda)

12 months agoUse container on Linux to run llvm-project-tests workflow (#81349) (#81807)
Tom Stellard [Sun, 18 Feb 2024 00:19:39 +0000 (16:19 -0800)]
Use container on Linux to run llvm-project-tests workflow (#81349) (#81807)

(cherry picked from commit fe20a759fcd20e1755ea1b34c5e6447a787925dc)

12 months ago[AArch64][GlobalISel] Fail legalization for unknown libcalls. (#81873)
David Green [Sat, 17 Feb 2024 08:57:14 +0000 (08:57 +0000)]
[AArch64][GlobalISel] Fail legalization for unknown libcalls. (#81873)

If the libcall is unavailable, like powi on Windows, we should fall back
to SDAG. Currently we try to generate a call to "".

(cherry picked from commit 47c65cf62d06add9f55a77c9d45390fa3b986fc5)

12 months ago[lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)
Ulrich Weigand [Wed, 14 Feb 2024 17:26:38 +0000 (18:26 +0100)]
[lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)

With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

                        9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic condition in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
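
The difference between the two checks can be summarised in a small
stand-alone sketch. `LinkMode` and both predicates are hypothetical and only
mirror the reasoning above; they are not lld's data structures.

```cpp
#include <cassert>

// Hypothetical stand-in for the two configuration flags discussed above.
struct LinkMode {
  bool isPic;  // true for both -shared and -pie links
  bool shared; // true only when building a shared library
};

// Old behaviour: any position-independent output keeps the dynamic TPOFF
// relocation.
bool keepsTpoffRelocOld(bool preemptible, const LinkMode &mode) {
  return preemptible || mode.isPic;
}

// Proposed behaviour: only preemptible symbols or shared libraries keep it.
bool keepsTpoffRelocNew(bool preemptible, const LinkMode &mode) {
  return preemptible || mode.shared;
}

// Usage example.
void checkTpoffExamples() {
  const LinkMode pie{true, false};
  const LinkMode sharedLib{true, true};
  // A non-preemptible symbol in a -pie link no longer needs the relocation.
  assert(keepsTpoffRelocOld(false, pie) && !keepsTpoffRelocNew(false, pie));
  // Shared libraries are unaffected.
  assert(keepsTpoffRelocOld(false, sharedLib) &&
         keepsTpoffRelocNew(false, sharedLib));
}
```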

(cherry picked from commit 6f907733e65d24edad65f763fb14402464bd578b)

12 months ago[lld] Fix test failures when running as root user (#81339)
Tom Stellard [Sat, 10 Feb 2024 04:57:05 +0000 (20:57 -0800)]
[lld] Fix test failures when running as root user (#81339)

This makes it easier to run the tests in a containerized environment.

(cherry picked from commit e165bea1d4ec2de96ee0548cece79d71a75ce8f8)

12 months ago[libc++][modules] Re-add build dir CMakeLists.txt. (#81370)
Mark de Wever [Tue, 13 Feb 2024 19:04:34 +0000 (20:04 +0100)]
[libc++][modules] Re-add build dir CMakeLists.txt. (#81370)

This CMakeLists.txt is used to build modules without build system
support. It was removed in d06ae33ec32122bb526fb35025c1f0cf979f1090, but
the documentation on how to use modules relies on it.

Made some minor changes to make it work with the std.compat module using
the std module.

Note the CMakeLists.txt in the build dir should be removed once build
system support is generally available.

(cherry picked from commit fc0e9c8315564288f9079a633892abadace534cf)

12 months ago[SLP]Fix PR79229: Do not erase extractelement, if it used in
Alexey Bataev [Thu, 25 Jan 2024 14:06:15 +0000 (06:06 -0800)]
[SLP]Fix PR79229: Do not erase extractelement, if it used in
multiregister node.

If the node can span several registers and the same
extractelement instruction is used in several parts, it may be required
to keep that extractelement instruction to avoid a compiler crash.

(cherry picked from commit 6fe21bc1dac883efa0dfa807f327048ae9969b81)

12 months ago[SLP]Fix PR79229: Check that extractelement is used only in a single node
Alexey Bataev [Wed, 24 Jan 2024 18:57:18 +0000 (10:57 -0800)]
[SLP]Fix PR79229: Check that extractelement is used only in a single node
before erasing.

Before trying to erase the extractelement instruction, it is not enough to
check for a single use; we also need to check that it is not used in several
nodes because of the preliminary node reordering.

(cherry picked from commit 48bbd7658710ef1699bf2a6532ff5830230aacc5)

12 months agoBackport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (#81633)
Nikita Popov [Fri, 16 Feb 2024 13:50:14 +0000 (14:50 +0100)]
Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (#81633)

(cherry picked from commit 25b9ed6e4964344e3710359bec4c831e5a8448b9)

12 months ago[LLD] [docs] Add more release notes for COFF and MinGW (#81977)
Martin Storsjö [Fri, 16 Feb 2024 13:48:29 +0000 (15:48 +0200)]
[LLD] [docs] Add more release notes for COFF and MinGW (#81977)

Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF, as
the implementation is in the COFF linker side, and may be relevant for
non-MinGW cases as well).

12 months ago[lld][ELF] Support relax R_LARCH_ALIGN (#78692)
Jinyang He [Tue, 6 Feb 2024 01:09:13 +0000 (09:09 +0800)]
[lld][ELF] Support relax R_LARCH_ALIGN (#78692)

Following commit 6611d58f5bbc ("Relax R_RISCV_ALIGN"), we can relax
R_LARCH_ALIGN in the same way. Reuse `SymbolAnchor`, `RISCVRelaxAux` and
`initSymbolAnchors` to simplify the code. As `riscvFinalizeRelax` is an
arch-specific function, make it an override of `TargetInfo::finalizeRelax`,
so that LoongArch can override it, too.

The flow of relaxing R_LARCH_ALIGN is almost the same as for RISCV. The
difference is that LoongArch only has a 4-byte NOP and all executable
instructions are 4-byte aligned, so LoongArch does not need to rewrite the
NOP sequence. The alignment maxBytesEmit parameter is supported in psABI
v2.30.

(cherry picked from commit 06a728f3feab876f9195738b5774e82dadc0f3a7)

12 months ago[18.x][Docs] Add release note about Clang-defined target OS macros (#80044)
Zixu Wang [Fri, 16 Feb 2024 13:36:18 +0000 (05:36 -0800)]
[18.x][Docs] Add release note about Clang-defined target OS macros (#80044)

The change is included in the 18.x release. Move the release note to the
release branch and reformat.

(cherry picked from commit b40d5b1b08564d23d5e0769892ebbc32447b2987)

12 months ago[lld] Add target support for SystemZ (s390x) (#75643)
Ulrich Weigand [Tue, 13 Feb 2024 10:29:21 +0000 (11:29 +0100)]
[lld] Add target support for SystemZ (s390x) (#75643)

This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is quite possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
(cherry picked from commit fe3406e349884e4ef61480dd0607f1e237102c74)

12 months ago[OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped...
Xing Xue [Tue, 13 Feb 2024 20:11:24 +0000 (15:11 -0500)]
[OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188)

The direct lock data structure has bit `0` (the least significant bit)
of the first 32-bit word set to `1` to indicate it is a direct lock. On
the other hand, the first word (in 32-bit mode) or first two words (in
64-bit mode) of an indirect lock are the address of the entry allocated
from the indirect lock table. The runtime checks bit `0` of the first
32-bit word to tell if this is a direct or an indirect lock. This works
fine for 32-bit and 64-bit little-endian because the little-endian memory
layout of a 64-bit address is (`low word`, `high word`). However, this causes
problems for big-endian where the memory layout of a 64-bit address is
(`high word`, `low word`). If an address of the indirect lock table
entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it
is treated as a direct lock. This patch defines `struct
kmp_base_tas_lock` with the ordering of the two 32-bit members flipped
for big-endian PPC64 so that when checking/setting tags in member
`poll`, the second word (the low word) is used. This patch also changes
places where `poll` is not already explicitly specified for
checking/setting tags.
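
The misclassification can be reproduced with a small model that does not
depend on the host's endianness. The enum and helper names are hypothetical;
the sketch only shows which 32-bit half of a 64-bit address ends up first in
memory and how the bit `0` check then behaves.

```cpp
#include <cassert>
#include <cstdint>

enum class ByteOrder { Little, Big };

// First 32-bit word in memory of a 64-bit value: the low word on
// little-endian, the high word on big-endian.
uint32_t firstWordInMemory(uint64_t value, ByteOrder order) {
  return order == ByteOrder::Little ? static_cast<uint32_t>(value)
                                    : static_cast<uint32_t>(value >> 32);
}

// The tag check described above: bit 0 of the first word set means
// "direct lock".
bool looksLikeDirectLock(uint64_t value, ByteOrder order) {
  return (firstWordInMemory(value, order) & 1u) != 0;
}

// Usage example with the indirect lock table address from the text.
void checkLockTagExamples() {
  const uint64_t entry = 0x110035300ULL; // (0x1, 0x10035300)
  // Little-endian: first word is 0x10035300, bit 0 clear, so indirect.
  assert(!looksLikeDirectLock(entry, ByteOrder::Little));
  // Big-endian: first word is 0x1, bit 0 set, so wrongly seen as direct.
  assert(looksLikeDirectLock(entry, ByteOrder::Big));
}
```

Swapping the two 32-bit members on big-endian, as the patch does, makes the
check read the low word again.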

(cherry picked from commit ac97562c99c3ae97f063048ccaf08ebdae60ac30)

12 months ago[OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895)
Xing Xue [Wed, 7 Feb 2024 20:24:52 +0000 (15:24 -0500)]
[OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895)

This patch flips bit-fields in `struct flags` for big-endian in test
cases to be consistent with the definition of the structure in libomp
`kmp.h`.

(cherry picked from commit 7a9b0e4acb3b5ee15f8eb138aad937cfa4763fb8)

12 months ago[LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475)
Martin Storsjö [Tue, 13 Feb 2024 07:32:40 +0000 (09:32 +0200)]
[LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475)

These were implemented in the COFF linker in
3923e61b96cf90123762f0e0381504efaba2d77a and
d12b99a4313816cf99e97cb5f579e2d51ba72b0b.

This matches the corresponding options in the ELF linker.

(cherry picked from commit d033366bd2189e33343ca93d276b40341dc39770)