platform/upstream/llvm.git
17 months ago[X86] Align stack to 16-bytes on 32-bit with X86_INTR call convention
Antonio Abbatangelo [Thu, 1 Jun 2023 08:18:12 +0000 (16:18 +0800)]
[X86] Align stack to 16-bytes on 32-bit with X86_INTR call convention

Adds a dynamic stack alignment to functions under the interrupt call
convention on x86-32. This fixes the issue where the stack can be
misaligned on entry, since x86-32 makes no guarantees about the stack
pointer position when the interrupt service routine is called.

The alignment is done by overriding X86RegisterInfo::shouldRealignStack,
and by setting the correct alignment in X86FrameLowering::calculateMaxStackAlign.
This forces the interrupt handler to be dynamically aligned, generating
the appropriate `and` instruction in the prologue and `lea` in the
epilogue. The `no-realign-stack` attribute can be used as an opt-out.
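
As a rough illustration of what the prologue `and` does (a sketch only, not the code LLVM emits; the helper name `alignStackDown` is made up for this example), masking off the low bits rounds the stack pointer down to a 16-byte boundary:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a 32-bit stack pointer down to `align`,
// which must be a power of two. For align = 16 this mirrors the effect
// of `and esp, -16`. The stack grows downward, so rounding down never
// overlaps data the interrupted code already pushed.
inline std::uint32_t alignStackDown(std::uint32_t sp, std::uint32_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 &&
         "align must be a power of two");
  return sp & ~(align - 1);
}
```

The amount dropped by the mask depends on the incoming stack pointer, so the original value cannot be recomputed from the aligned one and has to be restored from a saved copy; that is presumably the job of the `lea` in the epilogue.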

Fixes #26851

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D151400

17 months ago[clang][Interp] Optionally cast comparison result to non-bool
Timm Bäder [Thu, 1 Jun 2023 07:48:04 +0000 (09:48 +0200)]
[clang][Interp] Optionally cast comparison result to non-bool

Our comparison opcodes always produce a Boolean value and push it on the
stack. However, the result of such a comparison in C is int, so the
later code expects an integer value on the stack.

Work around this problem by casting the boolean value to int in those
cases. This is not ideal for C, however. The comparison is usually
wrapped in an IntegerToBool cast anyway.

Differential Revision: https://reviews.llvm.org/D149645

17 months ago[InstCombine] Fix worklist management in transformToIndexedCompare()
Nikita Popov [Thu, 1 Jun 2023 08:31:13 +0000 (10:31 +0200)]
[InstCombine] Fix worklist management in transformToIndexedCompare()

Use replaceInstUsesWith() rather than plain RAUW to make sure the
old instructions are added back to the worklist for DCE.

17 months ago[DWP] add overflow check for llvm-dwp tools if offset overflow
zhuna [Thu, 1 Jun 2023 07:53:50 +0000 (15:53 +0800)]
[DWP] add overflow check for llvm-dwp tools if offset overflow

Currently, if an offset overflows, we silently ignore it and generate a
bad dwp file. That file can crash gdb or cause undefined behavior, and
the root cause is hard to track down. So we need to emit a diagnostic
when an overflow happens.
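
A minimal sketch of such a check, assuming the offsets in question are 32-bit and are accumulated by adding section contribution sizes (the helper name `addOffsetChecked` is hypothetical and is not taken from llvm-dwp):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: add `size` to a running 32-bit `offset`.
// Returns false (leaving `offset` untouched) if the sum would not fit,
// so the caller can report an error instead of silently wrapping.
inline bool addOffsetChecked(std::uint32_t &offset, std::uint64_t size) {
  const std::uint64_t max = std::numeric_limits<std::uint32_t>::max();
  if (size > max - offset)
    return false;
  offset = static_cast<std::uint32_t>(offset + size);
  return true;
}
```

Comparing against the remaining headroom (`max - offset`) avoids performing the overflowing addition in the first place.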

Reviewed By: ayermolo, dblaikie, steven.zhang

Differential Revision: https://reviews.llvm.org/D144565

17 months ago[AArch64] Adjust costs of i1 and/or/xor reductions
David Green [Thu, 1 Jun 2023 08:28:48 +0000 (09:28 +0100)]
[AArch64] Adjust costs of i1 and/or/xor reductions

This expands the reduction cost of i1 and/or/xor, so that larger type sizes get
handled by the existing code. For i1 reductions, `and` will use maxv, `or` will use
minv and `xor` will use addv, plus the cost of legalizing the type for larger
vectors using and/or/xor. The i1 vectors will be legalized to higher-width
integers (say v16i8), whose cost this overrides. As with all i1 vectors,
there is a chance that the types the i1 vector is created with and how it is
used will not match, introducing extra extends that are not necessarily
cost-modelled.
https://godbolt.org/z/6Gc9K6b7T

Differential Revision: https://reviews.llvm.org/D151184

17 months ago[mlir][transform] Add support for expressing scalable tile sizes
Andrzej Warzynski [Tue, 16 May 2023 15:26:46 +0000 (16:26 +0100)]
[mlir][transform] Add support for expressing scalable tile sizes

This patch enables specifying scalable tile sizes when using the
Transform dialect to drive tiling, e.g.:
```
%1, %loop = transform.structured.tile %0 [[4]]
```
This is implemented by extending the TileOp with a dedicated attribute
for "scalability" and by updating various parsing hooks. At the moment,
only the trailing tile size can be scalable. The following is not yet
supported:
```
%1, %loop = transform.structured.tile %0 [[4], [4]]
```

This change is a part of larger effort to enable scalable vectorisation
in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D150944

17 months ago[InstCombine] Fix worklist management in foldPHIArgIntToPtrToPHI()
Nikita Popov [Thu, 1 Jun 2023 08:18:30 +0000 (10:18 +0200)]
[InstCombine] Fix worklist management in foldPHIArgIntToPtrToPHI()

Make sure the old operand is added back to the worklist for DCE.

17 months agoRevert "[BOLT][CMake] Redo the build and install targets"
Petr Hosek [Thu, 1 Jun 2023 08:03:16 +0000 (08:03 +0000)]
Revert "[BOLT][CMake] Redo the build and install targets"

This reverts commit f99a7d3e38095cfdaf7e729289a8894dd31c7efa since it
broke the bolt-aarch64-ubuntu-clang-shared bot.

17 months ago[clang][analyzer] Merge apiModeling.StdCLibraryFunctions and StdCLibraryFunctionArgs...
Balázs Kéri [Thu, 1 Jun 2023 07:20:36 +0000 (09:20 +0200)]
[clang][analyzer] Merge apiModeling.StdCLibraryFunctions and StdCLibraryFunctionArgs checkers into one.

The main reason for this change is that these checkers were implemented in the same class
but had different dependency ordering. (NonNullParamChecker should run before StdCLibraryFunctionArgs
to give a more specific warning about null arguments, but apiModeling.StdCLibraryFunctions was a modeling
checker that should run before other non-modeling checkers. The modeling checker changes the state in a way
that makes it impossible for NonNullParamChecker to detect a null argument.)
To simplify this, the modeling part is removed as a separate checker and can only be used if the
StdCLibraryFunctions checker, which also produces the warnings, is turned on. Modeling the functions
without bug detection (for invalid arguments) is no longer possible. From this change on, the modeling
of standard functions does not happen by default.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D151225

17 months ago[ValueTracking] Directly use KnownBits shift functions
Nikita Popov [Tue, 16 May 2023 08:55:44 +0000 (10:55 +0200)]
[ValueTracking] Directly use KnownBits shift functions

Make ValueTracking directly call the KnownBits shift helpers, which
provides more precise results.

Unfortunately, ValueTracking has a special case where sometimes we
determine non-zero shift amounts using isKnownNonZero(). I have my
doubts about the usefulness of that special-case (it is only tested
in a single unit test), but I've reproduced the special-case via an
extra parameter to the KnownBits methods.

Differential Revision: https://reviews.llvm.org/D151816

17 months ago[mlir][tensor] TrackingListener: Find replacement ops through cast-like ExtractSliceOps
Matthias Springer [Thu, 1 Jun 2023 07:00:08 +0000 (09:00 +0200)]
[mlir][tensor] TrackingListener: Find replacement ops through cast-like ExtractSliceOps

Certain ExtractSliceOps that extract all elements from the destination are treated like casts when looking for replacement ops. Such ExtractSliceOps are typically rank expansions.

Differential Revision: https://reviews.llvm.org/D151804

17 months ago[mlir][tensor] Add pattern to drop redundant insert_slice rank expansion
Matthias Springer [Thu, 1 Jun 2023 06:47:00 +0000 (08:47 +0200)]
[mlir][tensor] Add pattern to drop redundant insert_slice rank expansion

Drop insert_slice rank expansions if they are directly followed by an inverse rank reduction.

Differential Revision: https://reviews.llvm.org/D151800

17 months ago[mlir][arith] Disallow zero ranked tensors for select's condition
Manas [Thu, 1 Jun 2023 06:41:42 +0000 (12:11 +0530)]
[mlir][arith] Disallow zero ranked tensors for select's condition

A zero-ranked tensor (say tensor<i1>), when used as arith.select's condition,
crashes the optimizer during bufferization. This patch constrains the
condition to be either a scalar or of the same shape as the result.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D151270

17 months agoRevert "[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2"
Petr Hosek [Thu, 1 Jun 2023 06:04:16 +0000 (06:04 +0000)]
Revert "[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2"

This reverts commit 80614e162222e857d8767174284701aec69381c4.

17 months ago[BOLT][CMake] Redo the build and install targets
Petr Hosek [Fri, 26 May 2023 22:11:24 +0000 (22:11 +0000)]
[BOLT][CMake] Redo the build and install targets

The existing BOLT install targets are broken on Windows because they
don't properly handle the output extension. We cannot use the existing
LLVM macros since those make assumptions that don't hold for BOLT. This
change instead implements custom macros following the approach used by
Clang and LLD.

Differential Revision: https://reviews.llvm.org/D151595

17 months ago[X86][BF16] Fix 2 crashes with vector broadcast
Phoebe Wang [Thu, 1 Jun 2023 05:38:08 +0000 (13:38 +0800)]
[X86][BF16] Fix 2 crashes with vector broadcast

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D151808

17 months ago[libc] Reduce math tests runtime
Guillaume Chatelet [Wed, 31 May 2023 12:40:10 +0000 (12:40 +0000)]
[libc] Reduce math tests runtime

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D151798

17 months ago[RISCV][NFC] Add isF argument to SchedSEWSet
wangpc [Thu, 1 Jun 2023 04:44:21 +0000 (12:44 +0800)]
[RISCV][NFC] Add isF argument to SchedSEWSet

So that we can remove `SchedSEWSetF` and simplify some code.

Reviewed By: michaelmaitland

Differential Revision: https://reviews.llvm.org/D151790

17 months ago[RISCV] check pointer before dereference
Piyou Chen [Thu, 1 Jun 2023 03:24:03 +0000 (20:24 -0700)]
[RISCV] check pointer before dereference

Encountered an ASAN crash and found that a pointer was dereferenced without being checked.

Reviewed By: kito-cheng, eklepilkina

Differential Revision: https://reviews.llvm.org/D151716

17 months ago[SCEV] Compute AddRec range computations using different type BECount
Joshua Cao [Tue, 30 May 2023 08:53:06 +0000 (01:53 -0700)]
[SCEV] Compute AddRec range computations using different type BECount

Before this patch, we can only use the MaxBECount for an AddRec's range
computation if the MaxBECount has <= bit width of the AddRec. This patch
reasons that if a MaxBECount has > bit width, and is <= the max value of
AddRec's bit width, we can still use the MaxBECount.
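
The condition can be sketched with plain integers (this is an illustration only; SCEV works on `APInt` values, and the helper name `fitsInAddRecWidth` is invented here). A count held in a wider type is still usable when its value does not exceed the largest unsigned value of the narrower width:

```cpp
#include <cstdint>

// Hypothetical helper: `maxBECount` is stored in 64 bits, the AddRec is
// only `addRecBits` wide. The count is usable if it is representable
// in `addRecBits` bits, i.e. it is <= 2^addRecBits - 1.
inline bool fitsInAddRecWidth(std::uint64_t maxBECount, unsigned addRecBits) {
  if (addRecBits >= 64)
    return true; // every 64-bit value fits
  const std::uint64_t maxValue = (std::uint64_t{1} << addRecBits) - 1;
  return maxBECount <= maxValue;
}
```

For example, a 64-bit count of 255 can be used with an 8-bit AddRec, while 256 cannot.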

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151698

17 months ago[SCEV][NFC] Refactor range computation for AddRec to pass around APInt
Joshua Cao [Wed, 31 May 2023 03:40:10 +0000 (20:40 -0700)]
[SCEV][NFC] Refactor range computation for AddRec to pass around APInt

17 months ago[SCEV] Fix verification of SCEV multiples.
Joshua Cao [Thu, 1 Jun 2023 02:23:55 +0000 (19:23 -0700)]
[SCEV] Fix verification of SCEV multiples.

17 months ago[RISCV] Update some tests that used "interrupt"="user". NFC
Craig Topper [Thu, 1 Jun 2023 03:31:24 +0000 (20:31 -0700)]
[RISCV] Update some tests that used "interrupt"="user". NFC

Support for this was removed previously. Change them to "supervisor" since
they were testing generic "interrupt" things.

17 months ago[Analysis][LoongArch] Add sign extension for i32 parameters and returns
zhanglimin [Thu, 1 Jun 2023 03:13:47 +0000 (11:13 +0800)]
[Analysis][LoongArch] Add sign extension for i32 parameters and returns

In LoongArch ABI spec, we can see that in the LP64D ABI, unsigned 32-bit
types, such as unsigned int, are stored in general-purpose registers as
proper sign extensions of their 32-bit values.
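
To make the rule concrete, here is a small sketch of how a 32-bit unsigned value would look in a 64-bit register under this convention (the function name `registerImage32` is hypothetical; it only models the bit pattern, not any LLVM API):

```cpp
#include <cstdint>

// Hypothetical model: the 64-bit register contents for a 32-bit value
// that is sign-extended regardless of whether its C type is signed.
inline std::uint64_t registerImage32(std::uint32_t value) {
  const std::int32_t asSigned = static_cast<std::int32_t>(value);
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(asSigned));
}
```

So an `unsigned int` holding `0x80000000` occupies the register as `0xFFFFFFFF80000000`, not `0x0000000080000000`.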

Reference:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_abi_lp64d

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D151794

17 months ago[mlir][bytecode] Error if requested bytecode version is unsupported
Kevin Gleason [Thu, 1 Jun 2023 01:10:42 +0000 (18:10 -0700)]
[mlir][bytecode] Error if requested bytecode version is unsupported

Currently desired bytecode version is clamped to the maximum. This allows requesting bytecode versions that do not exist. We have added callsite validation for this in StableHLO to ensure we don't pass an invalid version number, probably better if this is managed upstream. If a user wants to use the current version, then omitting `setDesiredBytecodeVersion` is the best way to do that (as opposed to providing a large number).

Adding this check will also properly error on older version numbers as we increment the minimum supported version. Silently clamping to the minimum version would likely lead to unintentional forward incompatibilities.

Separately, due to bytecode version being `int64_t` and using methods to read/write uints, we can generate payloads with invalid version numbers:

```
mlir-opt file.mlir --emit-bytecode --emit-bytecode-version=-1 | mlir-opt
<stdin>:0:0: error: bytecode version 18446744073709551615 is newer than the current version 5
```

This is fixed with version bounds checking as well.
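
The number in the error message above is simply `-1` reinterpreted as an unsigned 64-bit value. A small sketch of that conversion and of a bounds check (both function names are made up for illustration, and the minimum version is passed in as a parameter because the message does not state it):

```cpp
#include <cstdint>

// What a signed version number becomes when written with an unsigned
// writer: the conversion is modulo 2^64.
inline std::uint64_t versionAsUnsigned(std::int64_t version) {
  return static_cast<std::uint64_t>(version);
}

// Hypothetical bounds check: reject anything outside [minVersion, maxVersion]
// instead of clamping it.
inline bool isSupportedVersion(std::int64_t requested, std::int64_t minVersion,
                               std::int64_t maxVersion) {
  return requested >= minVersion && requested <= maxVersion;
}
```

With a current version of 5 as in the message, a request for `-1` or `6` would be rejected up front rather than producing an unreadable payload.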

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D151838

17 months agoSetting to control addressable bits in high memory
Jason Molenda [Thu, 1 Jun 2023 01:34:40 +0000 (18:34 -0700)]
Setting to control addressable bits in high memory

On AArch64, it is possible to have a program that accesses both low
(0x000...) and high (0xfff...) memory, and with pointer authentication,
you can have different numbers of bits used for pointer authentication
depending on whether the address is in high or low memory.

This adds a new target.process.highmem-virtual-addressable-bits
setting which the AArch64 Mac ABI plugin will use, when set, to
always set those unaddressable high bits for high memory addresses,
and will use the existing target.process.virtual-addressable-bits
setting for low memory addresses.

This patch does not change the existing behavior when only
target.process.virtual-addressable-bits is set.  In that case, the
value will apply to all addresses.
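
A rough sketch of the described behavior, not LLDB's actual code: `fixAddress`, `lowBits` and `highBits` are hypothetical names, `0` stands for "setting not set", and using bit 55 to tell high from low memory is an assumption made for this example.

```cpp
#include <cstdint>

// Hypothetical model of stripping non-address bits from a pointer.
// lowBits  ~ target.process.virtual-addressable-bits
// highBits ~ target.process.highmem-virtual-addressable-bits (0 = not set)
inline std::uint64_t fixAddress(std::uint64_t addr, unsigned lowBits,
                                unsigned highBits) {
  // Assumption: bit 55 distinguishes high (0xfff...) from low (0x000...).
  const bool isHigh = ((addr >> 55) & 1) != 0;
  // Fall back to the single setting when the high-memory one is not set.
  const unsigned bits = (isHigh && highBits != 0) ? highBits : lowBits;
  if (bits == 0 || bits >= 64)
    return addr; // nothing to strip
  const std::uint64_t nonAddressable = ~std::uint64_t{0} << bits;
  // High memory: set the unaddressable bits. Low memory: clear them.
  return isHigh ? (addr | nonAddressable) : (addr & ~nonAddressable);
}
```

When only `lowBits` is given, both branches use the same width, which matches the statement that existing behavior is unchanged in that case.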

Not yet done is recognizing metadata in a live process connection
(gdb-remote qHostInfo) or a Mach-O corefile LC_NOTE to set the
correct number of addressing bits for both memory ranges.  That
will be a future change.

Differential Revision: https://reviews.llvm.org/D151292
rdar://109746900

17 months agoFix clang driver tests for cspgo in lld
Ellis Hoag [Thu, 1 Jun 2023 01:16:08 +0000 (18:16 -0700)]
Fix clang driver tests for cspgo in lld

The tests introduced by https://reviews.llvm.org/D151589 were failing
because I guess some test platforms don't have `lld`. Similar tests add
`-B%S/Inputs/lld` to the clang commands so let's try this here to fix the
tests.

```
clang: error: invalid linker name in argument '-fuse-ld=lld'
```

17 months ago[gn build] Port dc124cda7c78
LLVM GN Syncbot [Thu, 1 Jun 2023 01:15:34 +0000 (01:15 +0000)]
[gn build] Port dc124cda7c78

17 months ago[libc++] Optimize for_each for segmented iterators
Nikolas Klauser [Thu, 1 Jun 2023 01:14:32 +0000 (18:14 -0700)]
[libc++] Optimize for_each for segmented iterators

```
---------------------------------------------------
Benchmark                       old             new
---------------------------------------------------
bm_for_each/1               3.00 ns         2.98 ns
bm_for_each/2               4.53 ns         4.57 ns
bm_for_each/3               5.82 ns         5.82 ns
bm_for_each/4               6.94 ns         6.91 ns
bm_for_each/5               7.55 ns         7.75 ns
bm_for_each/6               7.06 ns         7.45 ns
bm_for_each/7               6.69 ns         7.14 ns
bm_for_each/8               6.86 ns         4.06 ns
bm_for_each/16              11.5 ns         5.73 ns
bm_for_each/64              43.7 ns         4.06 ns
bm_for_each/512              356 ns         7.98 ns
bm_for_each/4096            2787 ns         53.6 ns
bm_for_each/32768          20836 ns          438 ns
bm_for_each/262144        195362 ns         4945 ns
bm_for_each/1048576       685482 ns        19822 ns
```
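
For context, the kind of call that should benefit looks like the sketch below. It assumes `std::deque` iterators are among the segmented iterators in question, and the function name `sumWithForEach` is just for this example; user code does not change, only the library's internal iteration strategy.

```cpp
#include <algorithm>
#include <deque>

// Ordinary std::for_each over a deque. With the optimization, the library
// can walk each contiguous block of the deque with a plain inner loop
// instead of checking for a block boundary on every increment.
inline long long sumWithForEach(const std::deque<int> &values) {
  long long total = 0;
  std::for_each(values.begin(), values.end(),
                [&total](int v) { total += v; });
  return total;
}
```

This would be consistent with the table above, where the gap only opens up once the range spans more than a handful of elements.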

Reviewed By: ldionne, Mordante, #libc

Spies: arichardson, libcxx-commits

Differential Revision: https://reviews.llvm.org/D151274

17 months ago[libc++] Introduce __for_each_segment and use it in copy/move
Nikolas Klauser [Thu, 1 Jun 2023 01:14:24 +0000 (18:14 -0700)]
[libc++] Introduce __for_each_segment and use it in copy/move

This simplifies the code inside copy/move and makes it easier to apply the optimization to other algorithms.

Reviewed By: ldionne, Mordante, #libc

Spies: arichardson, libcxx-commits

Differential Revision: https://reviews.llvm.org/D151265

17 months ago[lld] add context-sensitive PGO options for MachO
Ellis Hoag [Wed, 31 May 2023 21:17:35 +0000 (14:17 -0700)]
[lld] add context-sensitive PGO options for MachO

Enable support for CSPGO for lld MachO targets.

Since lld MachO does not support `-plugin-opt=`, we need to create the `--cs-profile-generate` and `--cs-profile-path=` options and propagate them in `Darwin.cpp`. These flags are not supported by ld64.

Also outline code into `getLastCSProfileGenerateArg()` to share between `CommonArgs.cpp` and `Darwin.cpp`.

CSPGO is already implemented for ELF (https://reviews.llvm.org/D56675) and COFF (https://reviews.llvm.org/D98763).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D151589

17 months agolldb: Fix cross-cu-reference test to explicitly request that feature
David Blaikie [Thu, 1 Jun 2023 00:35:39 +0000 (00:35 +0000)]
lldb: Fix cross-cu-reference test to explicitly request that feature

17 months ago[DebugInfo][Split DWARF][LTO]: Ensure only a single CU is emitted
David Blaikie [Wed, 31 May 2023 23:27:52 +0000 (23:27 +0000)]
[DebugInfo][Split DWARF][LTO]: Ensure only a single CU is emitted

Split DWARF doesn't handle LTO of any form (roughly there's an
assumption that each dwo file will have one CU - it's not explicitly
documented, nor explicitly handled, so the ecosystem isn't really well
understood/tested/etc).

This had previously been handled by implementing (& disabling by
default) the `-split-dwarf-cross-cu-references` flag, which would
disable use of ref_addr across two dwo CUs.

This worked for a while, at least in LTO (it didn't address Split
DWARF+Full LTO, but that's an unlikely combination, as the benefits of
Split DWARF are more limited in a full LTO build) - because the only
source of cross-CU references was inlined functions, so by making those
non-cross-CU (by moving the referenced inlined function DWARF
description into the referencing CU) the result was one CU per dwo.

But recently the Function Specialization pass was added to the ThinLTO
pipeline, which caused imported functions that may not be inlined to be
emitted by a backend compile. This meant foreign CU entities (not just
abstract origins/cross-CU referenced entities)/standalone foreign CUs
could be emitted by a backend compile.

The end result was, due to a bug* in binutils dwp (I think basically
it saw two CUs in a single dwo and reprocessed the offsets in the shared
debug_str_offsets.dwo section) this situation led to corrupted strings.

So to make this more robust, I've generalized the definition of the
`-split-dwarf-cross-cu-references` flag (perhaps it should be renamed at
this point, but it's /really/ niche, doubt anyone's using it - more or
less there for experimentation when we get around to figuring out
spec'ing LTO+Split DWARF) to mean "single CU in a dwo file" and added
more general handling for this.

There's certainly some weird corner cases that could come up in terms of
"how do we choose which CU to put everything in" - for now it's "first
come, first served" which is probably going to be OK for ThinLTO - the
base module will have the first functions and first CU, imported
fragments will come after that. For LTO the choice will be fairly
arbitrary - but, again, essentially whichever module comes first.

* Arguably a bug in binutils dwp, but since the feature isn't well
  specified, I'd rather avoid dabbling in this uncertain area and ensure
  LLVM doesn't produce especially novel DWARF (dwos with multiple CUs)
  regardless of whether binutils dwp would/should be fixed. I'm not
  confident debuggers could read such a dwo file well, etc.

17 months ago[clang] NFCI: Use `DirectoryEntryRef` in framework lookup
Jan Svoboda [Wed, 31 May 2023 17:44:23 +0000 (10:44 -0700)]
[clang] NFCI: Use `DirectoryEntryRef` in framework lookup

This removes one use of the deprecated `DirectoryEntry::getName()`.

17 months ago[clang] NFCI: Use the `*Ref()` variant on search paths
Jan Svoboda [Wed, 31 May 2023 06:38:51 +0000 (23:38 -0700)]
[clang] NFCI: Use the `*Ref()` variant on search paths

This removes some uses of the deprecated `DirectoryEntry::getName()`.

17 months ago[clang] NFCI: Use `FileEntryRef` in `PPLexerChange`
Jan Svoboda [Wed, 31 May 2023 06:38:13 +0000 (23:38 -0700)]
[clang] NFCI: Use `FileEntryRef` in `PPLexerChange`

This removes some uses of the deprecated `FileEntry::getName()`.

17 months ago[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2
Daniel Thornburgh [Wed, 31 May 2023 22:56:10 +0000 (15:56 -0700)]
[Fuchsia] Pass through LLVM_ENABLE_HTTPLIB to stage 2

17 months ago[docs] Use ExecutorAddr::toPtr() in ORC documentation.
Mike Rostecki [Wed, 31 May 2023 22:04:47 +0000 (15:04 -0700)]
[docs] Use ExecutorAddr::toPtr() in ORC documentation.

The partial move from JITTargetAddress to ExecutorAddr in 8b1771bd9f30 did not
update the ORC or Kaleidoscope documents. This patch fixes the inconsistency.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D150458

17 months ago[OpenMP] Remove 'keep_alive' functionality from the device RTL
Joseph Huber [Wed, 24 May 2023 12:59:37 +0000 (07:59 -0500)]
[OpenMP] Remove 'keep_alive' functionality from the device RTL

The OpenMP DeviceRTL uses a hacky workaround to keep certain runtime
calls alive. This used a function that prevented them from being
optimized out. We needed this hack because the 'OpenMPOpt' pass likes to
introduce new runtime calls into the TU. This then interacted badly with
the method of linking the bitcode file per-TU like we do with Nvidia.
The OpenMPOpt pass would then generate a runtime call to a function that
was never linked in.

This should not be a problem anymore because we unconditionally link in
the `libomptarget.devicertl.a` runtime library. This should thus only
extract symbols that are undefined. So, if we do end up with an
unresolved reference it will be resolved by the static library.

The downside to this is that if we are doing non-LTO NVPTX compilation
that introduces one of these calls it will be linked outside the module
and therefore provide the overhead of an external function call.
However, removing this flag should make optimizing things easier. We
will need to see if that performance is a problem.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D151324

17 months ago[clang] NFCI: Use `DirectoryEntryRef` in `PrecompiledPreamble`
Jan Svoboda [Wed, 31 May 2023 06:35:23 +0000 (23:35 -0700)]
[clang] NFCI: Use `DirectoryEntryRef` in `PrecompiledPreamble`

This removes some uses of the deprecated `DirectoryEntry::getName()`.

17 months ago[clang] NFCI: Use `DirectoryEntryRef` for `ModuleMap::BuiltinIncludeDir`
Jan Svoboda [Wed, 31 May 2023 06:34:40 +0000 (23:34 -0700)]
[clang] NFCI: Use `DirectoryEntryRef` for `ModuleMap::BuiltinIncludeDir`

This removes some uses of the deprecated `DirectoryEntry::getName()`.

17 months ago[clang] NFCI: Use `DirectoryEntryRef` in `ASTWriter`
Jan Svoboda [Wed, 31 May 2023 05:12:07 +0000 (22:12 -0700)]
[clang] NFCI: Use `DirectoryEntryRef` in `ASTWriter`

This removes the call to deprecated `DirectoryEntry::getName()`.

17 months agoHostInfoMacOS: Add a utility function for finding an SDK-specific tool
Adrian Prantl [Fri, 26 May 2023 21:48:37 +0000 (14:48 -0700)]
HostInfoMacOS: Add a utility function for finding an SDK-specific tool

This is an API needed by swift-lldb.

https://reviews.llvm.org/D151591

17 months agoFactor out xcrun into a function (NFC)
Adrian Prantl [Fri, 26 May 2023 20:01:34 +0000 (13:01 -0700)]
Factor out xcrun into a function (NFC)

https://reviews.llvm.org/D151588

17 months ago[flang] CUDA Fortran - part 4/5: definability and characteristics
Peter Klausler [Sat, 6 May 2023 22:03:39 +0000 (15:03 -0700)]
[flang] CUDA Fortran - part 4/5: definability and characteristics

Extend the definability and procedure characteristics checking
infrastructure in semantics to check for context-dependent CUDA object
definability violations and problems with CUDA attribute incompatibility
in procedure interfaces.

Depends on https://reviews.llvm.org/D150159,
https://reviews.llvm.org/D150161, & https://reviews.llvm.org/D150162.

Differential Revision: https://reviews.llvm.org/D150163

17 months ago[lld] Add --lto-debug-pass-manager option
Ellis Hoag [Wed, 31 May 2023 21:02:23 +0000 (14:02 -0700)]
[lld] Add --lto-debug-pass-manager option

Add support for printing the passes run for LTO.

Both ELF and COFF have `--lto-debug-pass-manager` (`-ltodebugpassmanager`) to print the compiler passes run during LTO. This is useful to check that a certain compiler pass is run in a test, e.g., https://reviews.llvm.org/D151589

Reviewed By: #lld-macho, MaskRay, int3

Differential Revision: https://reviews.llvm.org/D151746

17 months ago[clang] Fix crash when passing a braced-init list to a parenthesized aggregate init...
Alan Zhao [Tue, 30 May 2023 23:27:14 +0000 (16:27 -0700)]
[clang] Fix crash when passing a braced-init list to a parenthesized aggregate init expression

The previous code incorrectly assumed that we would never call
warnBracedScalarInit(...) with a EK_ParenAggInitMember. This patch fixes
the bug by warning when a scalar member is initialized via a braced-init
list when performing a parenthesized aggregate initialization. This
behavior is consistent with parenthesized list aggregate initialization.

Fixes #63008

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D151763

17 months agoRevert "[2a/3][ASan][libcxx] std::deque annotations"
Vitaly Buka [Wed, 31 May 2023 20:31:49 +0000 (13:31 -0700)]
Revert "[2a/3][ASan][libcxx] std::deque annotations"

This reverts commit 605b9c76e093f6ed713b3fea47cb9726b346edeb.

17 months ago[mlir][spirv] Add printf op from SPIRV OpenCL extension set spec
Dimple Prajapati [Wed, 31 May 2023 20:41:25 +0000 (16:41 -0400)]
[mlir][spirv] Add printf op from SPIRV OpenCL extension set spec

This change adds an op to support the printf instruction from the OpenCL extension set.
This op helps write out debug details from a SPIRV kernel in a given format.

Patch By: drprajap
Reviewed By: antiagainst, kuhar

Differential Revision: https://reviews.llvm.org/D151731

17 months ago[flang] Add DerivedTypeSpec::VectorTypeAsFortran for PPC vector type
Kelvin Li [Mon, 29 May 2023 20:27:38 +0000 (16:27 -0400)]
[flang] Add DerivedTypeSpec::VectorTypeAsFortran for PPC vector type

VectorTypeAsFortran is added for writing PPC vector types to modules.

Coauthor: @tislam

Differential Revision: https://reviews.llvm.org/D151757

17 months ago[scudo] Release pages of larger block more frequently
Chia-hung Duan [Thu, 25 May 2023 23:08:38 +0000 (23:08 +0000)]
[scudo] Release pages of larger block more frequently

Releasing pages for large blocks (size greater than a page) is faster than
for small blocks. Besides, larger blocks are not expected to be used
as often as smaller blocks, which means we may hold several pages used
by large blocks and rarely get a chance to release them if there is no
explicit M_PURGE call. Therefore, relax the release-interval condition
for large blocks.

This also fixes the assumption that FORCE_ALL should always try page
release.

Differential Revision: https://reviews.llvm.org/D151290

17 months ago[test] Add zero size global test to code-model-elf.ll
Arthur Eubanks [Wed, 31 May 2023 20:03:27 +0000 (13:03 -0700)]
[test] Add zero size global test to code-model-elf.ll

17 months ago[Tooling] Remove unused function setRestoreWorkingDir
Kazu Hirata [Wed, 31 May 2023 19:43:37 +0000 (12:43 -0700)]
[Tooling] Remove unused function setRestoreWorkingDir

The last use was removed by:

  commit 146ec74a8382dc820809d0a2bf4b918d0b5e6603
  Author: Jan Svoboda <jan_svoboda@apple.com>
  Date:   Fri Sep 10 10:24:16 2021 +0200

Once I remove the function, RestoreCWD is always true, so this patch
removes the variable and propagates the constant.

Differential Revision: https://reviews.llvm.org/D151786

17 months ago[X86] Use "l" prefix for data sections under medium/large code model
Arthur Eubanks [Wed, 10 May 2023 20:13:43 +0000 (13:13 -0700)]
[X86] Use "l" prefix for data sections under medium/large code model

And also set the SHF_X86_64_LARGE section flag.

gcc only uses the "l" prefix and SHF_X86_64_LARGE in the medium code model for data larger than -mlarge-data-threshold. But it seems more consistent to use it in the large code model as well in case separate parts of the binary aren't compiled with the large code model and also have a .data/.bss/.rodata section.

Reviewed By: MaskRay, tkoeppe

Differential Revision: https://reviews.llvm.org/D148836

17 months ago[libc++][docs] Add note about RFCs for significant changes
Louis Dionne [Wed, 17 May 2023 18:05:12 +0000 (11:05 -0700)]
[libc++][docs] Add note about RFCs for significant changes

Differential Revision: https://reviews.llvm.org/D150813

17 months ago[libc++] Add a few missing _LIBCPP_HIDE_FROM_ABI annotations
Louis Dionne [Wed, 31 May 2023 19:23:37 +0000 (12:23 -0700)]
[libc++] Add a few missing _LIBCPP_HIDE_FROM_ABI annotations

17 months ago[clang] Allow fp in atomic fetch max/min builtins
Yaxun (Sam) Liu [Fri, 19 May 2023 17:51:29 +0000 (13:51 -0400)]
[clang] Allow fp in atomic fetch max/min builtins

LLVM IR already allows floating point type in atomicrmw.
Update clang atomic fetch max/min builtins to accept
floating point type like we did for fetch add/sub.
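
For comparison, without such a builtin the same effect is commonly written as a compare-exchange loop. The sketch below uses only standard `std::atomic<float>`; the function name `atomicFetchMax` is hypothetical and this is not how clang implements the builtin. NaN handling is deliberately ignored.

```cpp
#include <atomic>

// Atomically store max(target, value) and return the previous value,
// emulated with a CAS loop.
inline float atomicFetchMax(std::atomic<float> &target, float value) {
  float old = target.load(std::memory_order_relaxed);
  // Retry while the stored value is smaller and the exchange fails;
  // on failure `old` is refreshed with the current contents.
  while (old < value &&
         !target.compare_exchange_weak(old, value, std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return old;
}
```

Allowing floating point in the builtins lets the compiler emit a single `atomicrmw` where the target supports it, instead of a loop like this.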

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D150985

Fixes: SWDEV-401056

17 months ago[Darwin] Fix more ASAN symbolizer tests
Francis Visoiu Mistrih [Wed, 31 May 2023 19:17:21 +0000 (12:17 -0700)]
[Darwin] Fix more ASAN symbolizer tests

RenderFrame now strips `wrap_`.

17 months ago[clang] NFCI: Use `FileEntryRef` in `VerifyDiagnosticConsumer`
Jan Svoboda [Wed, 31 May 2023 06:11:48 +0000 (23:11 -0700)]
[clang] NFCI: Use `FileEntryRef` in `VerifyDiagnosticConsumer`

This is a prep patch that enables removal of some calls to the deprecated `{File,Directory}Entry::getName()`.

17 months ago[clang] NFCI: Use `FileEntryRef` in `PPDirectives`
Jan Svoboda [Wed, 31 May 2023 06:09:40 +0000 (23:09 -0700)]
[clang] NFCI: Use `FileEntryRef` in `PPDirectives`

This is a prep patch that enables removal of some calls to the deprecated `{File,Directory}Entry::getName()`.

17 months ago[clang] Use the appropriate definition when checking FunctionDecl::isInlineBuiltinDec...
serge-sans-paille [Wed, 31 May 2023 05:57:35 +0000 (07:57 +0200)]
[clang] Use the appropriate definition when checking FunctionDecl::isInlineBuiltinDeclaration

This is a follow-up to https://reviews.llvm.org/D148723 and fixes the
bug reported by @mstorsjo.

Differential Revision: https://reviews.llvm.org/D151783

17 months ago[libc][docs] Update implementation status table for Date and Time Functions.
Tue Ly [Wed, 31 May 2023 15:11:08 +0000 (11:11 -0400)]
[libc][docs] Update implementation status table for Date and Time Functions.

Update implementation status table for Date and Time Functions to include different targets.

Reviewed By: jeffbailey

Differential Revision: https://reviews.llvm.org/D151809

17 months ago[RISCV] Change LdPat and StPat from multiclass to class. NFC
Craig Topper [Wed, 31 May 2023 18:46:52 +0000 (11:46 -0700)]
[RISCV] Change LdPat and StPat from multiclass to class. NFC

These used to contain multiple patterns, but that was simplified
when we moved to using ComplexPattern for load/store address matching.

17 months ago[DAG] Combine insert(shuffle(load), load, 0) into a single load
David Green [Wed, 31 May 2023 18:48:57 +0000 (19:48 +0100)]
[DAG] Combine insert(shuffle(load), load, 0) into a single load

Given an insert of a scalar load into a vector shuffle with mask
u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index),
it can be more profitable to convert to a single load and avoid the
shuffles. This adds a DAG combine for it, providing the new load is
still fast.
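
The two mask shapes named above can be recognised with a small check. The following is only a sketch under stated assumptions: `matchShiftByOneMask` and `checkShiftMaskExamples` are hypothetical names, `-1` stands for an undef lane, and the lane that the insert overwrites is ignored. It is not the code added to the DAG combiner, and it does not model the "new load is still fast" condition.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not the DAGCombiner implementation.
// A mask element of -1 stands for an undef lane.
// Returns the insert index (0 or N-1) for which shuffle + insert reads like
// one contiguous load, or std::nullopt if the mask has neither shape.
std::optional<std::size_t> matchShiftByOneMask(const std::vector<int> &Mask) {
  const std::size_t N = Mask.size();
  if (N < 2)
    return std::nullopt;

  // Shape "u,0,1,...,N-2": lane I reads element I-1; lane 0 is overwritten.
  bool ShiftUp = true;
  for (std::size_t I = 1; I < N; ++I)
    ShiftUp = ShiftUp && Mask[I] == static_cast<int>(I) - 1;
  if (ShiftUp)
    return std::size_t{0};

  // Shape "1,2,...,N-1,u": lane I reads element I+1; lane N-1 is overwritten.
  bool ShiftDown = true;
  for (std::size_t I = 0; I + 1 < N; ++I)
    ShiftDown = ShiftDown && Mask[I] == static_cast<int>(I) + 1;
  if (ShiftDown)
    return N - 1;

  return std::nullopt;
}

// Usage sketch: the two masks from the commit message, plus a non-match.
void checkShiftMaskExamples() {
  const std::vector<int> InsertAtFront = {-1, 0, 1, 2, 3, 4, 5, 6};
  const std::vector<int> InsertAtBack = {1, 2, 3, 4, 5, 6, 7, -1};
  const std::vector<int> Identity = {0, 1, 2, 3, 4, 5, 6, 7};
  assert(matchShiftByOneMask(InsertAtFront) == std::size_t{0});
  assert(matchShiftByOneMask(InsertAtBack) == std::size_t{7});
  assert(!matchShiftByOneMask(Identity).has_value());
}
```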

Differential Revision: https://reviews.llvm.org/D151029

17 months ago[Darwin] Fix ASAN symbolizer tests
Francis Visoiu Mistrih [Wed, 31 May 2023 18:32:13 +0000 (11:32 -0700)]
[Darwin] Fix ASAN symbolizer tests

RenderFrame now strips `wrap_`.

17 months ago[CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass
Igor Kirillov [Mon, 17 Apr 2023 18:24:45 +0000 (18:24 +0000)]
[CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass

Code generated with -Ofast and -O3 -ffp-contract=fast (add
-ffinite-math-only to enable vectorization) can differ significantly.
Code compiled with -O3 can be deinterleaved using patterns as the
instruction order is preserved. However, with the -Ofast flag, there
can be multiple changes in the computation sequence, and even the real
and imaginary parts may not be calculated in parallel.
For more details, refer to
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-fast.ll and
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-contract.ll tests.
This patch implements a more general approach and enables handling most
-Ofast cases.

Differential Revision: https://reviews.llvm.org/D148558

17 months ago[flang][hlfir] Lower structure constructor via AssignOp.
Slava Zakharin [Wed, 31 May 2023 16:06:51 +0000 (09:06 -0700)]
[flang][hlfir] Lower structure constructor via AssignOp.

I tried this patch, first. Some tests failed because of the extra
finalizations for the temporary LHSs: when LHS component is a derived
type with final subprograms, the finalizations might be detected
by counting/printing in the final subprograms and treated as errors
in the tests, because they are not expected.
So I also tried to reuse the StructureConstructor code lowering to FIR
followed by AsExprOp to produce the HLFIR "value". Unfortunately,
this did not resolve the finalization issues, because AsExprOp may
end up being bufferized into AssignOp as well.
So the extra finalizations are an inherent problem for AssignOp,
and it has to be resolved separately. Thus, I decided to proceed
with a "cleaner" direct lowering to HLFIR (the initial patch).

I am thinking about adding an extra flag for AssignOp that would
indicate that the LHS is a compiler generated temporary, so we could
use something like AssignTemporary() in HLFIR-to-FIR converter.

Reviewed By: tblah

Differential Revision: https://reviews.llvm.org/D151752

17 months agoRevert "[compiler-rt][CMake] Properly set COMPILER_RT_HAS_LLD"
Arthur Eubanks [Wed, 31 May 2023 18:26:55 +0000 (11:26 -0700)]
Revert "[compiler-rt][CMake] Properly set COMPILER_RT_HAS_LLD"

This reverts commit 395a614d2cb69a431bd11e266021d91503c1d709.

Causes some bots to break, e.g. https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8779560688633165361/overview

17 months agoRevert "[mlir][Vector] Extend xfer drop unit dim patterns"
Diego Caballero [Wed, 31 May 2023 18:07:09 +0000 (18:07 +0000)]
Revert "[mlir][Vector] Extend xfer drop unit dim patterns"

This reverts commit a53cd03deac5e6272e9dae88a90cd51410d312d5.

This commit is exposing some implementation gaps in other patterns.
Reverting for now.

17 months ago[hwasan] RunMallocHooks with orig_size
Jin Xin Ng [Fri, 26 May 2023 18:57:21 +0000 (18:57 +0000)]
[hwasan] RunMallocHooks with orig_size

This matches the behaviour of asan. sanitizer_common/TestCases/malloc_hook.cpp
should've caught this, but hwasan was on XFAIL.

Differential Revision: https://reviews.llvm.org/D151580

17 months ago[clang][analyzer][NFC] Use the operator new directly with the `BumpPtrAllocator`
Dmitri Gribenko [Wed, 31 May 2023 18:02:44 +0000 (20:02 +0200)]
[clang][analyzer][NFC] Use the operator new directly with the `BumpPtrAllocator`

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D151818

17 months ago[RISCV] Use class and inheritance instead of multiclass for some vector isel patterns...
Craig Topper [Wed, 31 May 2023 17:55:12 +0000 (10:55 -0700)]
[RISCV] Use class and inheritance instead of multiclass for some vector isel patterns. NFC

17 months ago[libcxxabi] copy back std::string_view patches from LLVM
Nick Desaulniers [Wed, 31 May 2023 16:50:35 +0000 (09:50 -0700)]
[libcxxabi] copy back std::string_view patches from LLVM

I made a series of changes to LLVM's demangle in:
- D148348
- D148353
- D148363
- D148375
and so did Fangrui in 3ece37b3fa2c and Ashay in D149061.

I didn't notice the banner about there being two copies of this in tree
and was modifying the downstream versions. Copy these changes back to
the upstream version. Oops!

Reviewed By: MaskRay, #libc_abi, ldionne, phosek

Differential Revision: https://reviews.llvm.org/D148566

17 months ago[mlir][sparse] fix crashes when generating conv_2d_nchw_fchw with Compressed Dense...
Peiming Liu [Wed, 31 May 2023 03:44:56 +0000 (03:44 +0000)]
[mlir][sparse] fix crashes when generating conv_2d_nchw_fchw with Compressed Dense Compressed Dense sparse encoding.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151773

17 months ago[lldb] Take StringRef name in GetIndexOfChildMemberWithName (NFC)
Dave Lee [Sun, 28 May 2023 19:08:19 +0000 (12:08 -0700)]
[lldb] Take StringRef name in GetIndexOfChildMemberWithName (NFC)

Change the type of the `name` parameter from `char *` to `StringRef`.

Follow up to D151615.

Differential Revision: https://reviews.llvm.org/D151810

17 months ago[Fuchsia] Add llvm-debuginfod to toolchain
Daniel Thornburgh [Tue, 30 May 2023 21:20:46 +0000 (14:20 -0700)]
[Fuchsia] Add llvm-debuginfod to toolchain

17 months ago[RISCV] Lower extern_weak symbols using the GOT for the medany model
Jessica Clarke [Wed, 31 May 2023 17:30:36 +0000 (18:30 +0100)]
[RISCV] Lower extern_weak symbols using the GOT for the medany model

Such symbols may be undefined at link time and thus resolve to 0, which
may be further than 2GiB away from PC, causing the immediate to be out
of range for PC-relative addressing. Using the GOT avoids this, and is
the approach taken by AArch64.
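
The range problem can be illustrated with a small check. This is a simplified sketch, not the backend's logic: `fitsPcRelative32` and `checkWeakSymbolReach` are hypothetical names, and the reach is modelled as a plain signed 32-bit offset, while the exact window of the real instruction pair differs slightly.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model (assumption): the PC-relative reach is a signed 32-bit
// offset from PC. The exact window of the real sequence is ignored here.
bool fitsPcRelative32(std::uint64_t PC, std::uint64_t Target) {
  // Unsigned subtraction wraps; reinterpret the difference as signed.
  const std::int64_t Offset = static_cast<std::int64_t>(Target - PC);
  return Offset >= INT32_MIN && Offset <= INT32_MAX;
}

// Usage sketch: an undefined extern_weak symbol resolves to address 0.
void checkWeakSymbolReach() {
  const std::uint64_t WeakUndefined = 0;
  // Code placed low in the address space can still reach address 0.
  assert(fitsPcRelative32(0x10000, WeakUndefined));
  // Code placed at 4GiB cannot: the offset is -4GiB, outside the window.
  assert(!fitsPcRelative32(0x100000000ULL, WeakUndefined));
}
```

Loading the address from a GOT entry sidesteps the check entirely, because the GOT slot itself sits near the code and holds the full-width address.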

Reviewed By: asb, MaskRay, arichardson

Differential Revision: https://reviews.llvm.org/D107280

17 months ago[RISCV] Add new lga pseudoinstruction
Jessica Clarke [Wed, 31 May 2023 17:30:27 +0000 (18:30 +0100)]
[RISCV] Add new lga pseudoinstruction

This mirrors lla and is always GOT-relative, allowing an explicit
request to use the GOT without having to expand the instruction. This
then means la is just defined in terms of lla and lga in the assembler,
based on whether PIC is enabled, and at the codegen level we replace la
entirely with lga since we only ever use la there when we want to load
from the GOT (and assert that to be the case).

See https://github.com/riscv-non-isa/riscv-asm-manual/issues/50

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D107278

17 months ago[RISCV] Add test showing the current extern_weak lowering
Jessica Clarke [Wed, 31 May 2023 17:30:19 +0000 (18:30 +0100)]
[RISCV] Add test showing the current extern_weak lowering

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D107279

17 months ago[mlir] Avoid folding `index.remu` and `index.rems` for 0 rhs
rikhuijzer [Wed, 31 May 2023 17:45:05 +0000 (10:45 -0700)]
[mlir] Avoid folding `index.remu` and `index.rems` for 0 rhs

As discussed in https://github.com/llvm/llvm-project/issues/59714#issuecomment-1369518768, the folder for the remainder operations should be resilient when the rhs is 0.
The file `IndexOps.cpp` was already checking for multiple divisions by zero, so I tried to stick to the code style from those checks.
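
The guard described above can be sketched in isolation as follows. `foldRemU`, `foldRemS`, and `checkRemFolds` are hypothetical names used for illustration; this is not the code in `IndexOps.cpp`, and the extra `-1` case is an addition made here only to keep the C++ example free of undefined behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical constant folders; std::nullopt means "do not fold".
std::optional<std::uint64_t> foldRemU(std::uint64_t LHS, std::uint64_t RHS) {
  if (RHS == 0)
    return std::nullopt; // Remainder by zero: leave the op in place.
  return LHS % RHS;
}

std::optional<std::int64_t> foldRemS(std::int64_t LHS, std::int64_t RHS) {
  if (RHS == 0)
    return std::nullopt; // Remainder by zero: leave the op in place.
  if (RHS == -1)
    return std::int64_t{0}; // Avoids INT64_MIN % -1, which overflows in C++.
  return LHS % RHS;
}

// Usage sketch.
void checkRemFolds() {
  assert(foldRemU(2, 1) == std::uint64_t{0});
  assert(!foldRemU(2, 0).has_value());
  assert(foldRemS(-7, 2) == std::int64_t{-1});
  assert(!foldRemS(5, 0).has_value());
}
```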

Fixes #59714.

As a side note, is it correct that remainder operations are never optimized away? I would expect that the following code

```
func.func @remu_test() -> index {
  %c3 = index.constant 2
  %c0 = index.constant 1
  %0 = index.remu %c3, %c0
  return %0 : index
}
```
would be optimized to
```
func.func @remu_test() -> index {
  return index.constant 0 : index
}
```
when called with `mlir-opt --convert-scf-to-openmp temp.mlir`, but maybe I'm misunderstanding something.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D151476

17 months ago[flang] CUDA Fortran - part 3/5: declarations checking
Peter Klausler [Sat, 6 May 2023 22:03:39 +0000 (15:03 -0700)]
[flang] CUDA Fortran - part 3/5: declarations checking

Implements checks for CUDA Fortran attributes on objects, types, and
subprograms.  Includes a couple downgrades of existing errors into
warnings that were exposed during testing.

Depends on https://reviews.llvm.org/D150159 &
https://reviews.llvm.org/D150161.

Differential Revision: https://reviews.llvm.org/D150162

17 months ago[ARM][AArch64] Add tests for shuffles load patterns. NFC
David Green [Wed, 31 May 2023 17:42:01 +0000 (18:42 +0100)]
[ARM][AArch64] Add tests for shuffles load patterns. NFC

See D151029

17 months ago[LoadStoreVectorizer] Fix index width != pointer width case
Krzysztof Drewniak [Tue, 30 May 2023 21:01:21 +0000 (21:01 +0000)]
[LoadStoreVectorizer] Fix index width != pointer width case

Fixes https://github.com/llvm/llvm-project/issues/62856

Reviewed By: jlebar

Differential Revision: https://reviews.llvm.org/D151754

17 months agoRevert "[ThinLTO] Disable partial sample profile scaling by default"
Teresa Johnson [Wed, 31 May 2023 16:09:27 +0000 (09:09 -0700)]
Revert "[ThinLTO] Disable partial sample profile scaling by default"

This reverts commit aae8524bcc26cf04729f2bbc02ecb54233a587e4, which was
found to cause a few unexpected benchmark performance differences that
need investigation.

17 months ago[flang] CUDA Fortran - part 2/5: symbols & scopes
Peter Klausler [Sat, 6 May 2023 22:03:39 +0000 (15:03 -0700)]
[flang] CUDA Fortran - part 2/5: symbols & scopes

Add representations of CUDA Fortran data and subprogram attributes
to the symbol table and scopes of semantics.  Set them in name
resolution, and emit them to module files.

Depends on https://reviews.llvm.org/D150159.

Differential Revision: https://reviews.llvm.org/D150161

17 months ago[RISCV][InsertVSETVLI] Relax tail policy more often for vmv.s.x
Luke Lau [Wed, 31 May 2023 17:04:25 +0000 (17:04 +0000)]
[RISCV][InsertVSETVLI] Relax tail policy more often for vmv.s.x

If a vmv.s.x pseudo has an undef passthru operand, then we're free to use
whatever tail policy we want for VL > 1. We previously relaxed the tail
policy for this but only when we could also expand the SEW.
This patch changes it to relax the tail policy even if the SEW can't be
expanded and removes a few more toggles, as well as fully moving the
vmv.s.x logic into getDemanded.

17 months ago[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block
Luke Lau [Wed, 31 May 2023 14:29:28 +0000 (14:29 +0000)]
[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block

vmv.s.x/vfmv.s.f instructions that only write to the first destination
element can use any SEW greater than or equal to its original SEW,
provided that it's writing to an implicit_def operand where we can
clobber the other lanes.

We were already handling this in needVSETVLI, which meant that when
scanning the instructions from top to bottom we could detect this and
avoid the toggle:

vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0

->
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vmv.s.x v0, a0

The issue that this patch aims to solve arises when the vmv.s.x is
the first vector instruction in the block and doesn't have any prior
predecessor info:

entry_bb:
li a0, 11
; No previous state here: forced to set VL/VTYPE
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0
vsetivli zero, 4, e16, mf2, ta, ma
vmerge.vvm v8, v9, v8, v0

doLocalPostpass can work backwards from bottom to top and work out if
an earlier vsetvli can be mutated to avoid a toggle. It uses
DemandedFields and getDemanded for this, which previously didn't take
into account the possibility of going to a larger SEW.

A previous patch consolidated the vmv.s.x logic from needVSETVLI logic
into getDemanded, and this patch removes the gate around it so that
doLocalPostpass can now delete vsetvlis like in the scenario below:

entry_bb:
li a0, 11
; Previous vsetivli mutated: second one deleted
vsetivli zero, 4, e16, mf2, ta, ma
vmv.s.x v0, a0
vmerge.vvm v8, v9, v8, v0

Differential Revision: https://reviews.llvm.org/D151561

17 months ago[RISCV][InsertVSETVLI] Move vmv.s.x SEW check into getDemandedBits. NFC
Luke Lau [Fri, 26 May 2023 12:58:04 +0000 (12:58 +0000)]
[RISCV][InsertVSETVLI] Move vmv.s.x SEW check into getDemandedBits. NFC

This patch restructures the logic that checks if vmv.s.x's SEW can be
expanded into getDemandedBits, so that it can be shared by both the
top-to-bottom and bottom-to-top passes.

It adds a third option for SEW in DemandedFields, weaker than demanded
but stronger than not demanded, which states that the new SEW must be
greater than or equal to the current SEW.

Note that we now need to take care of the order of operands in
areCompatibleVTYPEs as the relation is no longer commutative.
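
A minimal model of the three-way SEW demand shows why operand order now matters. The enum and function names below (`SEWDemand`, `isSEWCompatible`, `checkSEWOrder`) are hypothetical and are not taken from the pass; the sketch covers only SEW and ignores every other VTYPE field.

```cpp
#include <cassert>

// Hypothetical model of the three SEW demand levels described above.
enum class SEWDemand { None, GreaterThanOrEqual, Equal };

// Can an instruction that wanted CurSEW run under a vsetvli that sets NewSEW?
bool isSEWCompatible(unsigned CurSEW, unsigned NewSEW, SEWDemand Demand) {
  switch (Demand) {
  case SEWDemand::None:
    return true; // SEW is irrelevant to this instruction.
  case SEWDemand::GreaterThanOrEqual:
    return NewSEW >= CurSEW; // Widening is fine, narrowing is not.
  case SEWDemand::Equal:
    return NewSEW == CurSEW;
  }
  return false;
}

// Usage sketch: swapping the two SEW arguments changes the answer.
void checkSEWOrder() {
  assert(isSEWCompatible(8, 16, SEWDemand::GreaterThanOrEqual));
  assert(!isSEWCompatible(16, 8, SEWDemand::GreaterThanOrEqual));
  assert(!isSEWCompatible(8, 16, SEWDemand::Equal));
}
```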

A later patch will remove the gating on the bottom-to-top pass
(doLocalPostpass) and another one will relax the demands on the tail
policy further.

17 months ago[NFC][CLANG] Fix nullptr dereference issue in SetValueDataBasedOnQualType()
Manna, Soumi [Wed, 31 May 2023 17:11:14 +0000 (10:11 -0700)]
[NFC][CLANG] Fix nullptr dereference issue in SetValueDataBasedOnQualType()

This patch uses castAs instead of getAs which will assert if the type doesn't match in SetValueDataBasedOnQualType(clang::Value &, unsigned long long).

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D151770

17 months agoRevert "[RISCV] Add Zvfhmin extension for clang."
Craig Topper [Wed, 31 May 2023 17:02:12 +0000 (10:02 -0700)]
Revert "[RISCV] Add Zvfhmin extension for clang."

This reverts commit 35a0079238ce9fc36cdc8c6a2895eb5538bf7b4a.

The backend support is not present yet. The intrinsics will crash
the compiler if compiled to assembly or binary.

17 months agoRevert "[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block"
Luke Lau [Wed, 31 May 2023 17:14:55 +0000 (18:14 +0100)]
Revert "[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block"

This reverts commit 0ba41dd3806e658e67acb63353fd5540f2bf333c.

17 months ago[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block
Luke Lau [Wed, 31 May 2023 17:04:25 +0000 (17:04 +0000)]
[RISCV][InsertVSETVLI] Avoid vmv.s.x SEW toggle if at start of block

vmv.s.x and friends that only write to the first destination element can
use any SEW greater than or equal to its original SEW, provided that
it's writing to an implicit_def operand where we can clobber the other
lanes.

We were already handling this in needVSETVLI, which meant that when
scanning the instructions from top to bottom we could detect this and
avoid the toggle:

```
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0

->
vsetivli zero, 4, e64, mf2, ta, ma
li a0, 11
vmv.s.x v0, a0

```
The issue that this patch aims to solve arises when the vmv.s.x is
the first vector instruction in the block and doesn't have any prior
predecessor info:

```
entry_bb:
li a0, 11
; No previous state here: forced to set VL/VTYPE
vsetivli zero, 1, e8, mf8, ta, ma
vmv.s.x v0, a0
vsetivli zero, 4, e16, mf2, ta, ma
vmerge.vvm v8, v9, v8, v0
```

doLocalPostpass can work backwards from bottom to top and work out if
an earlier vsetvli can be mutated to avoid a toggle. It uses
DemandedFields and getDemanded for this, which previously didn't take
into account the possibility of going to a larger SEW.

This patch adds a third option for SEW in DemandedFields, weaker than
demanded but stronger than not demanded, which states that the new SEW
must be greater than or equal to the current SEW.

We can then use this option to move that vmv.s.x specific logic from
needVSETVLI into getDemanded, making it available for both phase 2 and
3, i.e. we can now mutate the earlier vsetivli going from bottom to top:

```
entry_bb:
li a0, 11
; Previous vsetivli mutated: second one deleted
vsetivli zero, 4, e16, mf2, ta, ma
vmv.s.x v0, a0
vmerge.vvm v8, v9, v8, v0
```

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D151561

17 months ago[NFC][CLANG] Fix nullptr dereference issue in HandleRISCVRVVVectorBitsTypeAttr()
Manna, Soumi [Wed, 31 May 2023 16:54:51 +0000 (09:54 -0700)]
[NFC][CLANG] Fix nullptr dereference issue in HandleRISCVRVVVectorBitsTypeAttr()

This patch uses castAs instead of getAs which will assert if the type doesn't match in HandleRISCVRVVVectorBitsTypeAttr(clang::QualType &, clang::ParsedAttr &, clang::Sema &)

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D151769

17 months ago[Demangle] fix deref of std::string_view::end()
Nick Desaulniers [Wed, 31 May 2023 16:44:13 +0000 (09:44 -0700)]
[Demangle] fix deref of std::string_view::end()

In D148546, I replaced much of the use of llvm::StringView w/
std::string_view.  There's one important semantic difference between the
two:

In most STL containers, end() returns an iterator that refers to one
past the end of the container. But llvm::StringView::end() refers to the
last element.

Expressions such as `&*my_std_string_view.end()` produce the failed
assertion:

  include/c++/v1/__iterator/bounded_iter.h:93: assertion
  __in_bounds(__current_) failed: __bounded_iter::operator*: Attempt to
  dereference an out-of-range iterator

This was caught when copying the recent downstream changes back upstream
in D148566, and is reproducible via:

  $ libcxx/utils/ci/run-buildbot generic-debug-mode

when compiled with clang and clang++. The correct way to get the same
value as before without dereferencing invalid iterators is to prefer
`&*my_std_string_view.rbegin() + 1`.
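
As a self-contained illustration of that expression, the sketch below wraps it in a helper. `endPointer` and `checkEndPointer` are hypothetical names, not part of the demangler. The helper assumes a non-empty view, since `rbegin()` is only dereferenceable when there is a last element.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: one-past-the-end pointer of a non-empty view,
// obtained without dereferencing end().
const char *endPointer(std::string_view SV) {
  assert(!SV.empty() && "rbegin() is only dereferenceable on a non-empty view");
  // *rbegin() is the last character; one past its address is the end.
  return &*SV.rbegin() + 1;
}

// Usage sketch: the result matches data() + size().
void checkEndPointer() {
  const std::string_view SV = "mangled";
  assert(endPointer(SV) == SV.data() + SV.size());
  assert(endPointer(SV) - SV.data() == 7);
}
```

Where an empty view is possible, `SV.data() + SV.size()` yields the same pointer for non-empty views and avoids dereferencing any iterator.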

Fix this downstream so that I might copy it back upstream in D148566.

The other instance of `&*my_std_string_view.end()` that I introduced in
D148546 has been fixed already in D149061.

Reviewed By: ashay-github

Differential Revision: https://reviews.llvm.org/D151760

17 months ago[flang] CUDA Fortran - part 1/5: parsing
Peter Klausler [Sat, 6 May 2023 22:03:39 +0000 (15:03 -0700)]
[flang] CUDA Fortran - part 1/5: parsing

Begin upstreaming of CUDA Fortran support in LLVM Flang.

This first patch implements parsing for CUDA Fortran syntax,
including:
 - a new LanguageFeature enum value for CUDA Fortran
 - driver change to enable that feature for *.cuf and *.CUF source files
 - parse tree representation of CUDA Fortran syntax
 - dumping and unparsing of the parse tree
 - the actual parsers for CUDA Fortran syntax
 - prescanning support for !@CUF and !$CUF
 - basic sanity testing via unparsing and parse tree dumps

... along with any minimized changes elsewhere to make these
work, mostly no-op cases in common::visitors instances in
semantics and lowering to allow them to compile in the face
of new types in variant<> instances in the parse tree.

Because CUDA Fortran allows the kernel launch chevron syntax
("call foo<<<blocks, threads>>>()") only on CALL statements and
not on function references, the parse tree nodes for CallStmt,
FunctionReference, and their shared Call were rearranged a bit;
this caused a fair amount of one-line changes in many files.

More patches will follow that implement CUDA Fortran in the symbol
table and name resolution, and then semantic checking.

Differential Revision: https://reviews.llvm.org/D150159

17 months agoFix -u option in dsymutil, to not emit an extra DW_LNE_set_address if the original...
Shubham Sandeep Rastogi [Fri, 26 May 2023 19:05:09 +0000 (12:05 -0700)]
Fix -u option in dsymutil, to not emit an extra DW_LNE_set_address if the original line table was empty

With dsymutil's -u option, only the accelerator tables should be
updated, but with https://reviews.llvm.org/D150554 the -u option will
still re-generate the line table. If the line table was empty, that is,
a dummy line table with no entries in it, dsymutil will always generate
a line table with a DW_LNE_end_sequence. A funky side effect of this is
that when the line table is re-generated, it will always emit a
DW_LNE_set_address first, which changes the line table's total size.
This patch addresses this by making sure that a line table containing
only a DW_LNE_end_sequence is treated the same as a dummy entry.

Differential Revision: https://reviews.llvm.org/D151579

17 months ago[gn build] Port 8e728adcfedd
LLVM GN Syncbot [Wed, 31 May 2023 16:33:01 +0000 (16:33 +0000)]
[gn build] Port 8e728adcfedd

17 months ago[MLIR][Linalg] (NFC) Improve RUN command in `generalize-pad-tensor.mlir`
Lorenzo Chelini [Wed, 31 May 2023 15:34:13 +0000 (17:34 +0200)]
[MLIR][Linalg] (NFC) Improve RUN command in `generalize-pad-tensor.mlir`

There is no need to specify any `check-prefix` here.

17 months agoworkflows/release-tasks: Upload lit releases to pypi
Tom Stellard [Wed, 31 May 2023 16:13:04 +0000 (09:13 -0700)]
workflows/release-tasks: Upload lit releases to pypi

Reviewed By: thieta, kwk

Differential Revision: https://reviews.llvm.org/D146491

17 months ago[flang] Fix interpretations of x87 80-bit Inf/NaN
Peter Klausler [Fri, 26 May 2023 15:50:35 +0000 (08:50 -0700)]
[flang] Fix interpretations of x87 80-bit Inf/NaN

Current implementations of x87 80-bit extended precision floating
point interpret 7FFF8000000000000000 as +Inf, not a NaN.  The explicit
MSB in the significand must be set for an infinity.
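
The rule can be sketched as a small classifier over the two fields of the 80-bit format. The names `X87Class`, `classifyX87`, and `checkX87Classes` are hypothetical; this is not flang's code, and values whose exponent is not all ones are lumped together as `Other` for brevity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification; only the all-ones exponent is broken down.
enum class X87Class { Other, Infinity, NaN, Invalid };

// SignAndExponent: top 16 bits (sign + 15-bit exponent).
// Significand: low 64 bits, with the explicit integer bit at bit 63.
X87Class classifyX87(std::uint16_t SignAndExponent, std::uint64_t Significand) {
  const unsigned Exponent = SignAndExponent & 0x7FFFu;
  if (Exponent != 0x7FFFu)
    return X87Class::Other; // Zero, denormal, normal, etc. (not modelled)

  const bool IntegerBit = (Significand >> 63) != 0;
  const std::uint64_t Fraction = Significand & 0x7FFFFFFFFFFFFFFFULL;
  if (!IntegerBit)
    return X87Class::Invalid; // Explicit MSB clear: not a valid Inf or NaN.
  return Fraction == 0 ? X87Class::Infinity : X87Class::NaN;
}

// Usage sketch: 7FFF8000000000000000 splits into 7FFF and 8000000000000000.
void checkX87Classes() {
  assert(classifyX87(0x7FFF, 0x8000000000000000ULL) == X87Class::Infinity);
  assert(classifyX87(0x7FFF, 0xC000000000000000ULL) == X87Class::NaN);
  assert(classifyX87(0x7FFF, 0x0000000000000000ULL) == X87Class::Invalid);
}
```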

Differential Revision: https://reviews.llvm.org/D151739