platform/upstream/llvm.git
18 months ago[InstCombine] add tests for signbit compares; NFC
Sanjay Patel [Mon, 23 Jan 2023 20:48:15 +0000 (15:48 -0500)]
[InstCombine] add tests for signbit compares; NFC

18 months ago[NFC] Remove redundant range check
Scott Linder [Mon, 23 Jan 2023 23:00:15 +0000 (23:00 +0000)]
[NFC] Remove redundant range check

Remove gratuitous check introduced in
25c0ea2a5370813f46686918a84e0de27e107d08 which was generating a warning
when compiling under GCC.

18 months ago[CMake] Replace list(FIND) by if(IN_LIST) where index isn't used
Aaron Puchert [Mon, 23 Jan 2023 22:58:43 +0000 (23:58 +0100)]
[CMake] Replace list(FIND) by if(IN_LIST) where index isn't used

If we don't use the index otherwise, if(IN_LIST) is more readable and
doesn't clutter the local scope with index variables.

This was pointed out by @beanz in D96670.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D142405

18 months ago[profile] Disable test which needs update after D141512
Vitaly Buka [Mon, 23 Jan 2023 22:50:52 +0000 (14:50 -0800)]
[profile] Disable test which needs update after D141512

18 months ago[NFC] Consolidate llvm::CodeGenOpt::Level handling
Scott Linder [Mon, 16 Jan 2023 23:55:22 +0000 (23:55 +0000)]
[NFC] Consolidate llvm::CodeGenOpt::Level handling

Add free functions llvm::CodeGenOpt::{getLevel,getID,parseLevel} to
provide common implementations for functionality that has been
duplicated in many places across the codebase.
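
As a rough illustration of the consolidation described above, the three helpers can be modeled as below. This is a Python stand-in, not the C++ that landed: the enumerator names and the choice to return None for out-of-range input are assumptions made for the sketch.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Illustrative stand-in for the four optimization levels (-O0 to -O3)."""
    NONE = 0
    LESS = 1
    DEFAULT = 2
    AGGRESSIVE = 3


def get_level(level_id: int) -> Optional[Level]:
    """Map a numeric ID to a Level, or None if the ID is out of range."""
    try:
        return Level(level_id)
    except ValueError:
        return None


def get_id(level: Level) -> int:
    """Map a Level back to its numeric ID."""
    return int(level)


def parse_level(ch: str) -> Optional[Level]:
    """Parse a single character such as '2' into a Level, or None if invalid."""
    if len(ch) != 1 or not ("0" <= ch <= "9"):
        return None
    return get_level(ord(ch) - ord("0"))
```

With one shared implementation, every caller that previously hand-rolled a switch over '0'..'3' can call parse_level and handle the None case in one place.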

Differential Revision: https://reviews.llvm.org/D141968

18 months ago[clang-format] Fix bugs in parsing C++20 module import statements
Owen Pan [Sun, 22 Jan 2023 03:46:41 +0000 (19:46 -0800)]
[clang-format] Fix bugs in parsing C++20 module import statements

Also fixes #60145.

Differential Revision: https://reviews.llvm.org/D142296

19 months ago[clang] Fix unused variable warning in isBuiltinSupported
serge-sans-paille [Mon, 23 Jan 2023 22:11:05 +0000 (23:11 +0100)]
[clang] Fix unused variable warning in isBuiltinSupported

Warnings introduced by cf1756146d386667a80501fb8161505d12950804

19 months ago[Fuchsia] Build windows runtimes using cross compilation on Linux
Haowei Wu [Sat, 14 Jan 2023 00:51:14 +0000 (16:51 -0800)]
[Fuchsia] Build windows runtimes using cross compilation on Linux

This patch provides initial support of building Clang runtimes for
Windows when using Fuchsia Clang toolchains under Linux.

Differential Revision: https://reviews.llvm.org/D141738

19 months agoreadability-const-return-type: don't diagnose a template function returning T, even...
Andy Getzendanner [Mon, 23 Jan 2023 13:36:55 +0000 (13:36 +0000)]
readability-const-return-type: don't diagnose a template function returning T, even if sometimes instantiated with e.g. T = const int.

It's not really a readability problem since there's no `const` to read at the declaration site, and returning std::remove_const_t<T> instead usually only hurts readability.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D140434

19 months agoFix test expected result [NFC]
Nuno Lopes [Mon, 23 Jan 2023 22:04:55 +0000 (22:04 +0000)]
Fix test expected result [NFC]

Take 2
Apologies for the noise. Metadata numbering doesn't seem to be stable.

19 months ago[docs] Add/update docs regarding LLVM_NATIVE_TOOL_DIR vs LLVM_TABLEGEN
Martin Storsjö [Mon, 23 Jan 2023 12:09:56 +0000 (14:09 +0200)]
[docs] Add/update docs regarding LLVM_NATIVE_TOOL_DIR vs LLVM_TABLEGEN

Differential Revision: https://reviews.llvm.org/D142349

19 months agoFix test expected result [NFC]
Nuno Lopes [Mon, 23 Jan 2023 21:36:56 +0000 (21:36 +0000)]
Fix test expected result [NFC]

For some reason, it fails only on some buildbots, not all of them, and not on my computer.

19 months agoRevert "[build] Fix stand-alone builds of clang."
Francesco Petrogalli [Mon, 23 Jan 2023 21:31:34 +0000 (22:31 +0100)]
Revert "[build] Fix stand-alone builds of clang."

It breaks some builds [1] with the following error:

```
ccache /usr/bin/c++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.build/tools/clang/lib/Basic -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/clang/lib/Basic -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/clang/include -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.build/tools/clang/include -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.build/include -I/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG  -fno-exceptions -fno-rtti -UNDEBUG -std=c++17 -MD -MT tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/RISCV.cpp.o -MF tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/RISCV.cpp.o.d -o tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/RISCV.cpp.o -c /home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/clang/lib/Basic/Targets/RISCV.cpp
In file included from /home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/clang/lib/Basic/Targets/RISCV.cpp:19:
/home/omp-vega20-0/bbot/openmp-offload-amdgpu-runtime/llvm.src/llvm/include/llvm/TargetParser/RISCVTargetParser.h:29:10: fatal error: llvm/TargetParser/RISCVTargetParserDef.inc: No such file or directory
   29 | #include "llvm/TargetParser/RISCVTargetParserDef.inc"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
2.225 [3029/31/825] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Version.cpp.o
2.524 [3029/30/826] Building RISCVTargetParserDef.inc...
```

[1] https://lab.llvm.org/buildbot/#/builders/193/builds/25362

This reverts commit 52bcdac3b8425e20023151bb726b56fd6f62ec17.

19 months ago[clang-tidy] Improve rename_check.py
Chris Cotter [Mon, 23 Jan 2023 21:24:01 +0000 (21:24 +0000)]
[clang-tidy] Improve rename_check.py

rename_check.py now finds and renames the test file. rename_check.py
now also uses 'git mv', so the developer no longer has to manually
add the file after running the script.
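
A minimal sketch of the 'git mv' idea, assuming a helper of this shape. The function names and the dry_run parameter are hypothetical and are not taken from rename_check.py itself; only the `git mv <old> <new>` command line is standard Git.

```python
import subprocess
from typing import List


def git_mv_command(old_path: str, new_path: str) -> List[str]:
    """Build the argument list for moving a tracked file with Git."""
    return ["git", "mv", old_path, new_path]


def rename_tracked_file(old_path: str, new_path: str,
                        dry_run: bool = False) -> List[str]:
    """Rename a file through Git so the move is staged automatically.

    With dry_run=True the command is only returned, not executed.
    """
    cmd = git_mv_command(old_path, new_path)
    if not dry_run:
        # check_call raises CalledProcessError if Git reports a failure.
        subprocess.check_call(cmd)
    return cmd
```

Compared with a plain filesystem rename, going through Git stages both the deletion of the old path and the addition of the new one in a single step.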

Reviewed By: carlosgalvezp

Differential Revision: https://reviews.llvm.org/D141463

19 months ago[clang-tidy][NFC] Use C++17 nested namespaces in clang-tidy headers
Carlos Galvez [Sun, 22 Jan 2023 16:30:10 +0000 (16:30 +0000)]
[clang-tidy][NFC] Use C++17 nested namespaces in clang-tidy headers

We forgot to apply the change to headers in the previous patch,
due to missing "-header-filter" in the run-clang-tidy invocation.

Differential Revision: https://reviews.llvm.org/D142307

19 months ago[Clang] [Python] Fix tests when default config file contains -include
Sam James [Mon, 9 Jan 2023 04:32:10 +0000 (04:32 +0000)]
[Clang] [Python] Fix tests when default config file contains -include

In Gentoo, we make use of Clang's recently-enhanced config file support
and add a default include to `clang` invocations using '-include ...'.

This breaks clang-python tests like so:
```
======================================================================
ERROR: test_includes (tests.cindex.test_translation_unit.TestTranslationUnit)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/dev-python/clang-python-15.0.6/work/clang/bindings/python/tests/cindex/test_translation_unit.py", line 145, in test_includes
    eq(i[0], i[1])
  File "/var/tmp/portage/dev-python/clang-python-15.0.6/work/clang/bindings/python/tests/cindex/test_translation_unit.py", line 132, in eq
    self.assert_normpaths_equal(expected[0], actual.source.name)
AttributeError: 'NoneType' object has no attribute 'name'

======================================================================
FAIL: test_inclusion_directive (tests.cindex.test_translation_unit.TestTranslationUnit)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/dev-python/clang-python-15.0.6/work/clang/bindings/python/tests/cindex/test_translation_unit.py", line 157, in test_inclusion_directive
    self.assert_normpaths_equal(i[0], i[1])
  File "/var/tmp/portage/dev-python/clang-python-15.0.6/work/clang/bindings/python/tests/cindex/test_translation_unit.py", line 126, in assert_normpaths_equal
    self.assertEqual(os.path.normpath(path1),
AssertionError: '/var/tmp/portage/dev-python/clang-python-1[58 chars]r1.h' != '/usr/include/gentoo/fortify.h'
- /var/tmp/portage/dev-python/clang-python-15.0.6/work/clang/bindings/python/tests/cindex/INPUTS/header1.h
+ /usr/include/gentoo/fortify.h
```

Disable using the default Clang configuration files on the system, like
we did for other tests.

Bug: https://bugs.gentoo.org/890204
Differential Revision: https://reviews.llvm.org/D141248

19 months ago[build] Fix stand-alone builds of clang.
Francesco Petrogalli [Mon, 23 Jan 2023 12:27:10 +0000 (13:27 +0100)]
[build] Fix stand-alone builds of clang.

The header file `llvm/include/llvm/TargetParser/RISCVTargetParser.h`
relies on the auto-generated *.inc file associated to the tablegen
target `RISCVTargetParserTableGen`.

Both clangBasic and clangDriver include `RISCVTargetParser.h`,
therefore we need to make sure that the *.inc file is available to
avoid compilation errors like the following:

    FAILED: tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/RISCV.cpp.o
    /usr/bin/c++  [bunch of non interesting stuff]  -c <path-to>/llvm-project/clang/lib/Basic/Targets/RISCV.cpp
    In file included from <path-to>/llvm-project/clang/lib/Basic/Targets/RISCV.cpp:19:
    <path-to>/llvm-project/llvm/include/llvm/TargetParser/RISCVTargetParser.h:29:10: fatal error: llvm/TargetParser/RISCVTargetParserDef.inc: No such file or directory
      29 | #include "llvm/TargetParser/RISCVTargetParserDef.inc"
         |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The stand-alone build of `clang` has been tested with the following
script (see [*] for further information):

```
build_llvm=`pwd`/build-llvm
build_clang=`pwd`/build-clang
installprefix=`pwd`/install
llvm=`pwd`/llvm-project
mkdir -p $build_llvm
mkdir -p $installprefix

cmake -G Ninja -S $llvm/llvm -B $build_llvm \
      -DLLVM_INSTALL_UTILS=ON \
      -DCMAKE_INSTALL_PREFIX=$installprefix \
      -DCMAKE_BUILD_TYPE=Release

ninja -C $build_llvm install

cmake -G Ninja -S $llvm/clang -B $build_clang \
      -DLLVM_EXTERNAL_LIT=$build_llvm/utils/lit \
      -DLLVM_ROOT=$installprefix
```

[*] https://llvm.org/docs/GettingStarted.html#stand-alone-builds

Differential Revision: https://reviews.llvm.org/D141581

19 months ago[libc++] implement P1020R1 P1973R1 make_unique[shared]_for_overwrite
Hui [Tue, 3 Jan 2023 18:51:34 +0000 (18:51 +0000)]
[libc++] implement P1020R1 P1973R1 make_unique[shared]_for_overwrite

Differential Revision: https://reviews.llvm.org/D140913

19 months ago[clang-tidy][NFC] Fix Release Notes build error
Carlos Galvez [Mon, 23 Jan 2023 21:05:08 +0000 (21:05 +0000)]
[clang-tidy][NFC] Fix Release Notes build error

19 months ago[clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options
Carlos Galvez [Wed, 4 Jan 2023 16:42:16 +0000 (16:42 +0000)]
[clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options

We have a number of checks designed to analyze problems
in header files only, for example:

bugprone-suspicious-include
google-build-namespaces
llvm-header-guard
misc-definitions-in-header
...

All these checks duplicate the same logic and options
to determine whether a location is placed in the main
source file or in the header. More checks are coming
up with similar requirements.

Thus, to remove duplication, let's move this option
to the top-level configuration of clang-tidy (since
it's something all checks should share).

Since the checks fetch the option via getLocalOrGlobal,
the behavior is unchanged.
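
The local-then-global lookup that keeps behavior unchanged can be sketched as follows. This is a simplified Python model, not clang-tidy's C++ implementation: the flat dictionary, the "CheckName.Option" key layout, and the sample extension strings are assumptions for illustration.

```python
from typing import Dict, Optional


def get_local_or_global(options: Dict[str, str], check_name: str,
                        option: str,
                        default: Optional[str] = None) -> Optional[str]:
    """Prefer the check-specific value, then the global one, then the default."""
    local_key = f"{check_name}.{option}"
    if local_key in options:
        return options[local_key]
    return options.get(option, default)


# Illustrative configuration: one global value plus one legacy local override.
config = {
    "HeaderFileExtensions": ";h;hh;hpp",
    "llvm-header-guard.HeaderFileExtensions": "h",
}
```

In this model, llvm-header-guard still sees its local "h" value during the transition period, while a check without a local entry picks up the top-level setting.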

Add a deprecation notice for all checks that use the
local option, prompting to update to the global option.

The functionality for parsing the option will need to
remain in the checks during the transition period.
Once the local options are fully removed, the goal
is to store the parsed options in the ClangTidyContext,
that checks can easily have access to.

Differential Revision: https://reviews.llvm.org/D141000

19 months ago[bazel] Enable layering_check for llvm/unittests
Fangrui Song [Mon, 23 Jan 2023 20:56:23 +0000 (12:56 -0800)]
[bazel] Enable layering_check for llvm/unittests

19 months ago[bazel] Fix --features=layering_check issues for llvm/unittests
Fangrui Song [Mon, 23 Jan 2023 20:55:00 +0000 (12:55 -0800)]
[bazel] Fix --features=layering_check issues for llvm/unittests

19 months agoAMDGPU: Clean up LDS-related occupancy calculations
Nicolai Hähnle [Thu, 1 Dec 2022 10:00:30 +0000 (11:00 +0100)]
AMDGPU: Clean up LDS-related occupancy calculations

Occupancy is expressed as waves per SIMD. This means that we need to
take into account the number of SIMDs per "CU" or, to be more precise,
the number of SIMDs over which a workgroup may be distributed.

getOccupancyWithLocalMemSize was wrong because it didn't take SIMDs
into account at all.

At the same time, we need to take into account that WGP mode offers
access to a larger total amount of LDS, since this can affect how
non-power-of-two LDS allocations are rounded. To make this work
consistently, we distinguish between (available) local memory size and
addressable local memory size (which is always limited by 64kB on
gfx10+, even with WGP mode).

This change results in a massive amount of test churn. A lot of it is
caused by the fact that the default work group size is 1024, which means
that (due to rounding effects) the default occupancy on older hardware
is 8 instead of 10, which affects scheduling via register pressure
estimates. I've adjusted most tests by just running the UTC tools, but
in some cases I manually changed the work group size to 32 or 64 to make
sure that work group size chunkiness has no effect.
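
The rounding effect mentioned above can be reproduced with a small model. This is a sketch under stated assumptions, not the code in AMDGPUSubtarget: it assumes a wave size of 64, 4 SIMDs per CU, at most 10 waves per SIMD, and 64 kB of LDS, and the function and parameter names are made up for the example.

```python
def occupancy_waves_per_simd(workgroup_size, lds_bytes_per_group,
                             lds_bytes_available=64 * 1024, wave_size=64,
                             simds_per_cu=4, max_waves_per_simd=10):
    """Estimate occupancy in waves per SIMD for one compute unit."""
    # Waves needed by one workgroup (ceiling division).
    waves_per_group = -(-workgroup_size // wave_size)
    # Whole workgroups that fit in the CU's wave slots.
    groups_by_waves = (max_waves_per_simd * simds_per_cu) // waves_per_group
    # Whole workgroups that fit in LDS; no LDS use means no LDS limit.
    if lds_bytes_per_group > 0:
        groups_by_lds = lds_bytes_available // lds_bytes_per_group
    else:
        groups_by_lds = groups_by_waves
    groups = min(groups_by_waves, groups_by_lds)
    # Spread the resident waves over the SIMDs, rounding up, then clamp.
    waves_per_simd = -(-(groups * waves_per_group) // simds_per_cu)
    return min(waves_per_simd, max_waves_per_simd)
```

Under these assumptions a 1024-item workgroup needs 16 waves, only two such groups fit into the 40 wave slots, and the result is 32 / 4 = 8 waves per SIMD. A 64-item workgroup fills all 40 slots and reaches 10, which is why shrinking the work group size in a test removes the chunkiness.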

Differential Revision: https://reviews.llvm.org/D139468

19 months agoAMDGPU: Add a scheduler test to demonstrate an upcoming change
Nicolai Hähnle [Tue, 6 Dec 2022 22:10:15 +0000 (23:10 +0100)]
AMDGPU: Add a scheduler test to demonstrate an upcoming change

19 months agoAMDGPU: Re-run UTC scripts on some test cases
Nicolai Hähnle [Tue, 6 Dec 2022 21:41:08 +0000 (22:41 +0100)]
AMDGPU: Re-run UTC scripts on some test cases

Reduce the diff of subsequent changes.

19 months agoAMDGPU: Add AMDGPUSubtarget::getEUsPerCU()
Nicolai Hähnle [Thu, 1 Dec 2022 09:57:22 +0000 (10:57 +0100)]
AMDGPU: Add AMDGPUSubtarget::getEUsPerCU()

We will use this for more accurate occupancy computations. Note that
IsaInfo takes WGP mode vs. CU mode into account on gfx10+.

Differential Revision: https://reviews.llvm.org/D139467

19 months ago[RISCV] Add a test case for a missed PRE opportunity when inserting vsetvlis
Philip Reames [Mon, 23 Jan 2023 20:29:36 +0000 (12:29 -0800)]
[RISCV] Add a test case for a missed PRE opportunity when inserting vsetvlis

19 months ago[flang] Avoid unnecessary temporaries in ArrayValueCopy.
Slava Zakharin [Sat, 21 Jan 2023 03:58:36 +0000 (19:58 -0800)]
[flang] Avoid unnecessary temporaries in ArrayValueCopy.

Assume no conflict between pointer arrays and arrays without the target
attribute, if the fact of an array not having the target attribute
can be reliably computed.

This change speeds up SPEC CPU2017/527.cam from 2.5k seconds to 880 seconds
on Icelake, and makes further performance investigation easier.

Differential Revision: https://reviews.llvm.org/D142273

19 months ago[docs] Add release notes for news in 16.x done by me, or otherwise relating to MinGW...
Martin Storsjö [Mon, 23 Jan 2023 11:25:55 +0000 (13:25 +0200)]
[docs] Add release notes for news in 16.x done by me, or otherwise relating to MinGW targets

Differential Revision: https://reviews.llvm.org/D142346

19 months ago[mlir][sparse] clean vectorization bail-out for VL=0
Aart Bik [Sat, 21 Jan 2023 21:05:32 +0000 (13:05 -0800)]
[mlir][sparse] clean vectorization bail-out for VL=0

Fixes https://github.com/llvm/llvm-project/issues/59970

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D142290

19 months ago[bazel] Fix layering_check issues of {llvm,clang}:all
Fangrui Song [Mon, 23 Jan 2023 20:08:13 +0000 (12:08 -0800)]
[bazel] Fix layering_check issues of {llvm,clang}:all

19 months ago[flang] Keep polymorphic aspect when lowering intrinsic arguments
Valentin Clement [Mon, 23 Jan 2023 19:23:02 +0000 (20:23 +0100)]
[flang] Keep polymorphic aspect when lowering intrinsic arguments

Make sure the source passed to an intrinsic is still polymorphic when
it is an element of a polymorphic array. This was not handled properly
before.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D142380

19 months ago[mlir][spirv] Move uint asm name test to the proper place
Lei Zhang [Mon, 23 Jan 2023 19:08:06 +0000 (11:08 -0800)]
[mlir][spirv] Move uint asm name test to the proper place

19 months ago[InstCombine] Handle select inst when eliminating constant memcpy
Anshil Gandhi [Mon, 23 Jan 2023 16:19:55 +0000 (09:19 -0700)]
[InstCombine] Handle select inst when eliminating constant memcpy

Allow iterating through SelectInst use of the alloca when
checking if it is only ever overwritten from constant memory.
Recursively determine if the SelectInst is replaceable and insert
it into the Worklist if so. Finally, define a new SelectInst to
replace the old one, with both of its values replaced according
to the WorkMap.

Differential Revision: https://reviews.llvm.org/D136524

19 months ago[AMDGPU] Use more consistent way to avoid overflow in the scheduler
Stanislav Mekhanoshin [Mon, 23 Jan 2023 18:59:46 +0000 (10:59 -0800)]
[AMDGPU] Use more consistent way to avoid overflow in the scheduler

Use more consistent way to avoid overflow when calculating SGPR
and VGPR pressure limits.

Differential Revision: https://reviews.llvm.org/D142262

19 months ago[llvm] Fix warnings
Kazu Hirata [Mon, 23 Jan 2023 18:57:56 +0000 (10:57 -0800)]
[llvm] Fix warnings

This patch fixes:

  llvm/lib/IR/DataLayout.cpp:942:13: warning: unused variable ‘VecTy’
  [-Wunused-variable]

  llvm/lib/Transforms/IPO/OpenMPOpt.cpp:2899:27: warning: unused
  variable ‘MI’ [-Wunused-variable]

19 months ago[TableGen] Avoid repeated lookups of Uses and Defs records. NFC.
Jay Foad [Mon, 23 Jan 2023 18:21:04 +0000 (18:21 +0000)]
[TableGen] Avoid repeated lookups of Uses and Defs records. NFC.

19 months agoSilence an MSVC "not all control paths return" warning; NFC
Aaron Ballman [Mon, 23 Jan 2023 18:43:41 +0000 (13:43 -0500)]
Silence an MSVC "not all control paths return" warning; NFC

19 months agoRun cmdline address expressions through ABI's FixAddress
Jason Molenda [Mon, 23 Jan 2023 18:42:31 +0000 (10:42 -0800)]
Run cmdline address expressions through ABI's FixAddress

On systems like ARM, where the non-addressable bits of a pointer
value may be used for metadata (ARMv8.3 pointer authentication, or
Type Byte Ignore), those bits need to be cleared before the address
points to a valid memory location.  Add a call to the target's ABI
to clear those from address expression arguments to the lldb
commands (e.g. `disassemble -a`).
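
As an illustration of what clearing the non-addressable bits means, here is a minimal sketch. It assumes 48 addressable bits and only handles addresses in the low half of the address space; the function name and default are hypothetical, and a real ABI plugin has more cases to consider.

```python
def fix_address(addr: int, addressable_bits: int = 48) -> int:
    """Clear metadata stored above the addressable bits of a pointer value."""
    mask = (1 << addressable_bits) - 1
    return addr & mask


# A pointer whose top 16 bits carry a signature or tag (made-up value).
signed_ptr = 0x0034000100004000
plain_ptr = fix_address(signed_ptr)
```

Here plain_ptr is 0x000100004000, the value a command-line address expression would need before it can be used to read memory.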

Differential Revision: https://reviews.llvm.org/D141629

19 months ago[AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling
Stanislav Mekhanoshin [Mon, 16 Jan 2023 23:17:06 +0000 (15:17 -0800)]
[AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling

Unlike older ASICs, GFX10+ targets have a lot of VGPRs. Therefore, it is possible
to achieve high occupancy even with all or almost all addressable VGPRs
used. Our scheduler was never tuned for this scenario. The VGPR Critical
Limit threshold always comes out very high, even if maximum occupancy is
targeted. For example, on gfx1100 it is set to 192 registers even with
a requested occupancy of 16. As a result, the scheduler starts prioritizing
register pressure reduction very late and we easily end up spilling.

This patch makes VGPR critical limit similar to what we would have on
pre-gfx10 targets with much more limited VGPR budget while still trying
to maintain occupancy as it does now.

Pre-gfx10 ASICs shall not be affected as the limit shall be the same
as before, and on gfx10+ it shall only affect regions where we have
to spill.

Fixes: SWDEV-377300

Differential Revision: https://reviews.llvm.org/D141876

19 months ago[AArch64] Remove AES, SHA2, SHA3 and SM4 features from armv8.6-a+
David Green [Mon, 23 Jan 2023 18:39:17 +0000 (18:39 +0000)]
[AArch64] Remove AES, SHA2, SHA3 and SM4 features from armv8.6-a+

The Armv8.6-a and later architecture definitions included AES, SHA2,
SHA3 and SM4, but this did not have an effect when specifying
-march=armv8.6-a. They did not set preprocessor features
(https://godbolt.org/z/1YKad6M8e) or enable the relevant instructions
(like eor3 from sha3: https://godbolt.org/z/vY9v4MqvG). Similarly
architectures armv8 to armv8.5 defined +crypto, but this did not affect
the -march's, only the -mcpu with those architectures. I believe this
was working as intended.

After D141411 we now add the default features for architectures except
for +crypto, which has had the effect of enabling aes/sha2/sha3/sm4 when
-march=armv8.6-a is used. This patch removes those crypto features
again, going back to how things were before. It also removes the
AEK_CRYPTO feature from lower architecture levels, moving it to the cpus
that use it. This shouldn't make any changes, but a few extra tests have
been added for preprocessor features that have improved since llvm 15.

The -mcpu=ampere1 cpu is the only armv8.6+ cpu at present. For that, the
AES, SHA2 and SHA3 features have been re-added to the CPU definition to
keep it in-line with the gcc definition from
https://github.com/gcc-mirror/gcc/commit/db2f5d661239737157cf131de7d4df1c17d8d88d.

Differential Revision: https://reviews.llvm.org/D141606

19 months ago[AArch64] Function multi-versioning release notes added. NFC.
Pavel Iliin [Fri, 20 Jan 2023 23:31:53 +0000 (23:31 +0000)]
[AArch64] Function multi-versioning release notes added. NFC.

Differential Revision: https://reviews.llvm.org/D142265

19 months ago[tsan] Always initialize tsan when building shared lib
Han Zhu [Fri, 13 Jan 2023 20:18:32 +0000 (12:18 -0800)]
[tsan] Always initialize tsan when building shared lib

Differential Revision: https://reviews.llvm.org/D142039

19 months agoRevert "[AArch64] Function multi-versioning release notes added. NFC."
Pavel Iliin [Mon, 23 Jan 2023 18:22:20 +0000 (18:22 +0000)]
Revert "[AArch64] Function multi-versioning release notes added. NFC."

This reverts commit 5474d7d932710c260f03ce6c6387ec9d82bd10e2.
Wrong differential revision link was used.

19 months ago[Clang] Fix a Wbitfield-enum-conversion warning in DirectoryLookup.h
Shivam Gupta [Mon, 23 Jan 2023 18:18:37 +0000 (23:48 +0530)]
[Clang] Fix a Wbitfield-enum-conversion warning in DirectoryLookup.h

When compiling clang/Lex/DirectoryLookup.h with option -Wbitfield-enum-conversion, we get the following warning:

DirectoryLookup.h:77:17: warning:
      bit-field 'DirCharacteristic' is not wide enough to store all enumerators of
      'CharacteristicKind' [-Wbitfield-enum-conversion]
      : u(Map), DirCharacteristic(DT), LookupType(LT_HeaderMap),

DirCharacteristic is a bitfield with 2 bits (4 values)
  /// DirCharacteristic - The type of directory this is: this is an instance of
  /// SrcMgr::CharacteristicKind.
  unsigned DirCharacteristic : 2;

Whereas SrcMgr::CharacteristicKind is an enum with 5 values:
enum CharacteristicKind {
  C_User,
  C_System,
  C_ExternCSystem,
  C_User_ModuleMap,
  C_System_ModuleMap
};

The solution is to widen the DirCharacteristic bitfield from 2 to 3 bits.
Patch by Dimitri van Heesch
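
The width arithmetic behind this fix can be checked with a short sketch. It is written in Python purely for illustration, and the helper name is made up; the enumerator names are the ones listed in the message above.

```python
def bits_needed(num_enumerators: int) -> int:
    """Smallest unsigned bit-field width holding values 0..num_enumerators-1."""
    if num_enumerators < 1:
        raise ValueError("an enum needs at least one enumerator")
    # The largest value is num_enumerators - 1; a width of at least 1 is required.
    return max(1, (num_enumerators - 1).bit_length())


CHARACTERISTIC_KINDS = [
    "C_User",
    "C_System",
    "C_ExternCSystem",
    "C_User_ModuleMap",
    "C_System_ModuleMap",
]

required_width = bits_needed(len(CHARACTERISTIC_KINDS))
```

Two bits cover only the values 0 to 3, so the fifth enumerator (value 4) does not fit; required_width evaluates to 3, matching the new bit-field size.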

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D142304

19 months ago[RISCV] Move Processors and Features from RISCV.td to their own files.
Craig Topper [Mon, 23 Jan 2023 18:03:06 +0000 (10:03 -0800)]
[RISCV] Move Processors and Features from RISCV.td to their own files.

This reduces RISCV.td to mainly being a top level include file.

Reviewed By: asb, luismarques

Differential Revision: https://reviews.llvm.org/D142239

19 months agoMark BuiltinHeaders.def as textual
Adrian Prantl [Mon, 23 Jan 2023 18:14:15 +0000 (10:14 -0800)]
Mark BuiltinHeaders.def as textual

19 months ago[AArch64] Function multi-versioning release notes added. NFC.
Pavel Iliin [Fri, 20 Jan 2023 23:31:53 +0000 (23:31 +0000)]
[AArch64] Function multi-versioning release notes added. NFC.

Differential Revision: https://reviews.llvm.org/D141606

19 months agoRevert "[lldb] Remove timer from SBModule copy ctor"
Dave Lee [Mon, 23 Jan 2023 17:51:43 +0000 (09:51 -0800)]
Revert "[lldb] Remove timer from SBModule copy ctor"

This reverts commit 84c6129c943135e2c32b9254f08d0a2e7b21116a.

19 months ago[AArch64][SME2] Add Multi-vector saturating extract narrow intrinsics
Caroline Concatto [Mon, 23 Jan 2023 17:15:34 +0000 (17:15 +0000)]
[AArch64][SME2] Add Multi-vector saturating extract narrow intrinsics

Add the following intrinsics:
  SQCVT
  SQCVTU
  UQCVT

NOTE: These intrinsics are still in development and are subject to future changes.

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D142035

19 months ago[SCCPSolver] Move helper functions inside SCCPSolver (NFC).
Florian Hahn [Mon, 23 Jan 2023 17:41:12 +0000 (17:41 +0000)]
[SCCPSolver] Move helper functions inside SCCPSolver (NFC).

This patch moves a couple of helper functions from the global llvm::
namespace into the SCCPSolver class. This reduces the need for separate
SCCPSolver arguments and also limits the scope of those functions that
have quite generic names.

(The remaining isConstant and isOverdefined should ideally be removed)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142370

19 months ago[SCCP] Auto-generate check lines for ip-ranges-select.ll.
Florian Hahn [Mon, 23 Jan 2023 17:40:30 +0000 (17:40 +0000)]
[SCCP] Auto-generate check lines for ip-ranges-select.ll.

19 months ago[libc++][doc] Fixes the usage of improper markup.
Mark de Wever [Mon, 23 Jan 2023 17:26:18 +0000 (18:26 +0100)]
[libc++][doc] Fixes the usage of improper markup.

19 months ago[AArch64][SME2] Add multi-vector convert to/from floating-point intrinsic
Caroline Concatto [Mon, 23 Jan 2023 16:25:07 +0000 (16:25 +0000)]
[AArch64][SME2] Add multi-vector convert to/from floating-point intrinsic

Add the following intrinsics:

  FCVT
  BFCVT
  FCVTZS
  FCVTZU
  SCVTF
  UCVTF

This patch also adds SelectCVTIntrinsic to handle the cases when the
intrinsic returns multiple (two or four) outputs

NOTE: These intrinsics are still in development and are subject to future changes.

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D142032

19 months agoAdd support for clang-cl's option `-fexcess-precision`.
Zahira Ammarguellat [Mon, 23 Jan 2023 15:40:10 +0000 (10:40 -0500)]
Add support for clang-cl's option `-fexcess-precision`.

This option is useful for clang and clang-cl.

Differential Revision: https://reviews.llvm.org/D142367

19 months ago[mlir] support unsigned int in mlir::spirv::ConstantOp::getAsmResultNames
Xiang Li [Sun, 22 Jan 2023 03:24:41 +0000 (22:24 -0500)]
[mlir] support unsigned int in mlir::spirv::ConstantOp::getAsmResultNames

Fixes #60184  https://github.com/llvm/llvm-project/issues/60184

Differential Revision: https://reviews.llvm.org/D142295

19 months ago[libc][NFC] Reduce CMake configuration time
Guillaume Chatelet [Mon, 23 Jan 2023 16:52:46 +0000 (16:52 +0000)]
[libc][NFC] Reduce CMake configuration time

This patch reduces CMake configuration time drastically by removing a non-linear behavior.
Time to execute CMake configure step goes from 45s to 15s.

Differential Revision: https://reviews.llvm.org/D142374

19 months ago[AArch64][Clang] Adjust default features for v8.9-A/v9.4-A in clang driver
Lucas Prates [Wed, 21 Dec 2022 16:45:38 +0000 (16:45 +0000)]
[AArch64][Clang] Adjust default features for v8.9-A/v9.4-A in clang driver

Update the clang driver to include the following features as default for
the v8.9-A/v9.4-A architecture versions:

* FEAT_SPECRES2
* FEAT_CSSC
* FEAT_RASv2

Patch by Sam Elliott.

Reviewed By: lenary, tmatheson

Differential Revision: https://reviews.llvm.org/D141404

19 months ago[AArch64] Add command line support for v9.4-A's Instrumentation Extension
Lucas Prates [Thu, 19 Jan 2023 14:46:06 +0000 (14:46 +0000)]
[AArch64] Add command line support for v9.4-A's Instrumentation Extension

This introduces command line support (`+ite`) for the v9.4-A's
Instrumentation Extension (FEAT_ITE).

Patch by Son Tuan Vu.

Reviewed By: lenary, tmatheson

Differential Revision: https://reviews.llvm.org/D141403

19 months ago[SCCP] Add initial tests for NUW/NSW inference.
Florian Hahn [Mon, 23 Jan 2023 16:14:56 +0000 (16:14 +0000)]
[SCCP] Add initial tests for NUW/NSW inference.

19 months ago[docs] add early Arm arch support improvements to release notes
Ties Stuij [Mon, 23 Jan 2023 16:08:28 +0000 (16:08 +0000)]
[docs] add early Arm arch support improvements to release notes

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D142229

19 months ago[mlir][MemRefToLLVM] Remove the code for lowering collapse/expand_shape
Quentin Colombet [Mon, 23 Jan 2023 10:37:02 +0000 (10:37 +0000)]
[mlir][MemRefToLLVM] Remove the code for lowering collapse/expand_shape

collapse/expand_shape are supposed to be expanded before we hit the
lowering code.
The expansion is done with the pass called expand-strided-metadata.

This patch is NFC in spirit but not in practice because
expand-strided-metadata won't try to accommodate "invalid" strides
for dynamic sizes that are 1 at runtime.

The previous code was broken in that respect too, but differently: it
handled only the case of row-major layouts.
That whole part is being reworked separately.

Differential Revision: https://reviews.llvm.org/D136483

19 months agobazel: adapt for https://github.com/llvm/llvm-project/commit/a4699a43e42615281c96599d...
Krasimir Georgiev [Mon, 23 Jan 2023 15:38:46 +0000 (15:38 +0000)]
bazel: adapt for https://github.com/llvm/llvm-project/commit/a4699a43e42615281c96599d20977cabf10bfb9c

19 months ago[AArch64] Check 128-bit Sysreg Builtins
Archibald Elliott [Tue, 20 Dec 2022 16:08:06 +0000 (16:08 +0000)]
[AArch64] Check 128-bit Sysreg Builtins

This patch contains several related changes:

1. We move to using TARGET_BUILTIN for the 128-bit system register
   builtins to give better error messages when d128 has not been
   enabled, or has been enabled in a per-function manner.

2. We now validate the inputs to the 128-bit system register builtins,
   like we validate the other system register builtins.

3. We update the list of named PSTATE accessors for MSR (immediate), and
   now correctly enforce the expected ranges of the immediates. There is
   a long comment about how we chose to do this to comply with the ACLE
   when most of the PSTATE accessors for MSR (immediate) have aliased
   system registers for MRS/MSR which expect different values. In short,
   the MSR (immediate) names are prioritised, rather than falling-back
   to the register form when the value is out of range.

Differential Revision: https://reviews.llvm.org/D140222

19 months ago[mlir] fix outdated assert in affine symbol verification
Alex Zinenko [Fri, 20 Jan 2023 13:42:09 +0000 (13:42 +0000)]
[mlir] fix outdated assert in affine symbol verification

The verification of affine value classification for symbols was
expecting, incorrectly, that the dimension operand of `memref.dim` was
being produced by a constant-like operation. This is legacy of the
dimension being an attribute originally, and was never updated after it
was switched to be an operation. Treat such cases conservatively and
classify the value as non-symbol.

A more advanced version could attempt to check that the value would be a
valid symbol for all possible values the dimension attribute could take,
but this does not seem immediately useful.

Fixes #59993.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D142204

19 months ago[mlir] fix side effects for transform.AlternativesOp
Alex Zinenko [Fri, 20 Jan 2023 12:17:59 +0000 (12:17 +0000)]
[mlir] fix side effects for transform.AlternativesOp

It should have an "Allocate" effect on entry block arguments of all
regions in addition to consuming the operand.

Also relax the assertion in transform-dialect-check-uses until we can
properly support region-based control flow.

Fixes #60075.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D142200

19 months ago[mlir] add RemoveConstantIfCondition to populateOpenACCToSCFConversionPatterns
Xiang Li [Sat, 21 Jan 2023 16:36:05 +0000 (11:36 -0500)]
[mlir] add RemoveConstantIfCondition to populateOpenACCToSCFConversionPatterns

Fixes #60058 (https://github.com/llvm/llvm-project/issues/60058).
An assert in legalizePatternResult fired when ExpandIfCondition reported success without changing anything, which it did whenever the if condition was constant.

Add RemoveConstantIfCondition, obtained via getCanonicalizationPatterns, to remove the constant if condition.
Also remove the constant-condition check from ExpandIfCondition and turn its ifCond check into an assert,
because only ops with an ifCond need legalization in ConvertOpenACCToSCFPass.

Differential Revision: https://reviews.llvm.org/D142286

19 months ago[AArch64] Support v8.9-A/v9.4-A in .arch_extension directive
Lucas Prates [Wed, 21 Dec 2022 16:22:35 +0000 (16:22 +0000)]
[AArch64] Support v8.9-A/v9.4-A in .arch_extension directive

This adds support for the v8.9-A/v9.4-A architectural extensions to be
used in .arch_extension assembly directives.

Patch by Sam Elliott.

Reviewed By: lenary, tmatheson

Differential Revision: https://reviews.llvm.org/D141402

19 months ago[AArch64] Add missing system register for v8.9-A/v9.4-A Permission Indirection Extension
Lucas Prates [Tue, 20 Dec 2022 17:34:01 +0000 (17:34 +0000)]
[AArch64] Add missing system register for v8.9-A/v9.4-A Permission Indirection Extension

This adds support for the missing `PIRE0_EL12` system register, part of
v8.9-A/v9.4-A's Permission Indirection Extension.

Patch by Son Tuan Vu.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D141400

19 months ago[Libomptarget][NFC] Address a few warnings in libomptarget
Joseph Huber [Mon, 23 Jan 2023 14:55:35 +0000 (08:55 -0600)]
[Libomptarget][NFC] Address a few warnings in libomptarget

Summary:
Fix a few minor warnings that show up in `libomptarget`.

19 months ago[Libomptarget] Include "hsa/hsa.h" instead
Joseph Huber [Mon, 23 Jan 2023 14:42:36 +0000 (08:42 -0600)]
[Libomptarget] Include "hsa/hsa.h" instead

Summary:
Recently AMD moved the "hsa.h" include to "hsa/hsa.h". This causes
several warnings. This patch checks to see if we can include that one
instead. This should hopefully keep things backwards compatible while
silencing the warnings.
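
One common way to do such a check in C++17 is the `__has_include` preprocessor operator. Whether the patch uses this or a build-system check is not stated here, so the following is only a sketch. The macro `HSA_HEADER_FOUND` and the function `hsaHeaderFound` are hypothetical names introduced for illustration.

```cpp
// Prefer the new header location, fall back to the old one, and record
// which one was picked. HSA_HEADER_FOUND is a hypothetical macro.
#if defined(__has_include)
#  if __has_include("hsa/hsa.h")
#    include "hsa/hsa.h"
#    define HSA_HEADER_FOUND "hsa/hsa.h"
#  elif __has_include("hsa.h")
#    include "hsa.h"
#    define HSA_HEADER_FOUND "hsa.h"
#  endif
#endif

// Neither header is available (or __has_include is unsupported).
#ifndef HSA_HEADER_FOUND
#  define HSA_HEADER_FOUND "none"
#endif

// Returns the name of the header that was selected at preprocessing time.
inline const char *hsaHeaderFound() { return HSA_HEADER_FOUND; }
```

Trying the new path first keeps older installations working through the fallback branch.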

19 months ago[Libomptarget][NFC] Silence unknown CUDA version warnings
Joseph Huber [Mon, 23 Jan 2023 14:13:30 +0000 (08:13 -0600)]
[Libomptarget][NFC] Silence unknown CUDA version warnings

Summary:
These warnings are very loud considering they get repeated at least 30
times each build. This patch just silences them.

19 months ago[llvm][tablegen][jupyter] Fixup README
David Spickett [Mon, 23 Jan 2023 14:47:56 +0000 (14:47 +0000)]
[llvm][tablegen][jupyter] Fixup README

Make the first line a title and relative link
to the Markdown of the demo notebook.

19 months ago[Clang][NFC] Fix bitmask for NullabilityPayload in Types.h
Shivam Gupta [Mon, 23 Jan 2023 12:46:21 +0000 (18:16 +0530)]
[Clang][NFC] Fix bitmask for NullabilityPayload in Types.h

Found by PVS-Studio - https://pvs-studio.com/en/blog/posts/cpp/1003/, N37.

The code uses the bit mask NullabilityKindMask, which is 0x3
(00000011 in binary), to clear the bits in the NullabilityPayload variable.
Since NullabilityPayload is a 64-bit variable and NullabilityKindMask is
only an 8-bit variable (0x3), it will only affect the last 8 bits of the
variable. The higher 56 bits will remain unchanged.
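
The exact effect of a width mismatch depends on how the mask is declared. As a sketch of this class of bug, assume (hypothetically, not taken from Types.h) that the mask is a 32-bit `unsigned` and each field is 2 bits wide. The names `NarrowMask`, `WideMask`, `clearFieldNarrow` and `clearFieldWide` are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins; these names do not come from Types.h.
constexpr unsigned NarrowMask = 0x3;      // assumed to be 32 bits wide
constexpr std::uint64_t WideMask = 0x3;   // same value, 64 bits wide
constexpr unsigned FieldSize = 2;         // bits per field (assumption)

// The complement is computed in 32 bits and then zero-extended to 64 bits,
// so the upper 32 bits of the payload are modified as well.
// Only valid while index * FieldSize < 32.
constexpr std::uint64_t clearFieldNarrow(std::uint64_t payload, unsigned index) {
  return payload & ~(NarrowMask << (index * FieldSize));
}

// With a 64-bit mask the complement covers the whole payload, so only the
// selected field is cleared.
constexpr std::uint64_t clearFieldWide(std::uint64_t payload, unsigned index) {
  return payload & ~(WideMask << (index * FieldSize));
}
```

Making the mask the same width as the payload avoids relying on implicit integer conversions at all.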

Differential Revision: https://reviews.llvm.org/D142334

19 months ago[MC] Define and use MCInstrDesc implicit_uses and implicit_defs. NFC.
Jay Foad [Wed, 11 Jan 2023 12:20:02 +0000 (12:20 +0000)]
[MC] Define and use MCInstrDesc implicit_uses and implicit_defs. NFC.

The new methods return a range for easier iteration. Use them everywhere
instead of getImplicitUses, getNumImplicitUses, getImplicitDefs and
getNumImplicitDefs. A future patch will remove the old methods.

In some use cases the new methods are less efficient because they always
have to scan the whole uses/defs array to count its length, but that
will be fixed in a future patch by storing the number of implicit
uses/defs explicitly in MCInstrDesc. At that point there will be no need
to 0-terminate the arrays.
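
A minimal sketch of why a range over a 0-terminated array needs a scan, using hypothetical names (`PhysReg`, `RegRange`, `implicitRegs`) rather than the real MCInstrDesc interface:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical register number type; the value 0 terminates a list.
using PhysReg = std::uint16_t;

// Minimal iterable view over [first, last), usable in a range-based for loop.
struct RegRange {
  const PhysReg *first;
  const PhysReg *last;
  const PhysReg *begin() const { return first; }
  const PhysReg *end() const { return last; }
  std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// Builds a range from a 0-terminated array. Locating the end requires a
// linear scan, which is the cost the commit message refers to; storing the
// count next to the pointer would make this O(1).
inline RegRange implicitRegs(const PhysReg *list) {
  if (!list)
    return {nullptr, nullptr};
  const PhysReg *end = list;
  while (*end != 0)
    ++end;
  return {list, end};
}

// Example consumer: sums the register numbers in a list.
inline unsigned sumRegs(const PhysReg *list) {
  unsigned sum = 0;
  for (PhysReg reg : implicitRegs(list))
    sum += reg;
  return sum;
}
```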

Differential Revision: https://reviews.llvm.org/D142215

19 months ago[flang] Add conditional rebox when passing fir.box to optional fir.class
Valentin Clement [Mon, 23 Jan 2023 14:41:23 +0000 (15:41 +0100)]
[flang] Add conditional rebox when passing fir.box to optional fir.class

When a `!fir.box<>` is passed as an actual argument to an optional
`!fir.class<>` dummy it needs a `fir.rebox` in order to propagate
the dynamic type information.
The `fir.rebox` needs to happen only when the argument is present.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D142340

19 months ago[SCCP] Regenerate check lines for some tests.
Florian Hahn [Mon, 23 Jan 2023 14:29:33 +0000 (14:29 +0000)]
[SCCP] Regenerate check lines for some tests.

19 months ago[DAG] visitAnd - fold (and (ext (and V, c1)), c2) -> (and (ext V), (and c1, (ext...
Simon Pilgrim [Mon, 23 Jan 2023 14:17:56 +0000 (14:17 +0000)]
[DAG] visitAnd - fold (and (ext (and V, c1)), c2) -> (and (ext V), (and c1, (ext c2)))

Also, move the XformToShuffleWithZero and combineCarryDiamond folds later after some of the more basic canonicalizations/combines (such as this) have had a chance to occur

Fixes the v8i1-masks.ll regression from D127115
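
The bit-level identity behind a fold of this shape can be checked exhaustively on small types. The sketch below assumes zero-extension from 8 to 16 bits and reads the folded constant as the widened inner mask ANDed with the outer mask; that reading is an assumption, and the function names are hypothetical.

```cpp
#include <cstdint>

// "Before": mask in the narrow type, zero-extend, then mask again.
constexpr std::uint16_t before(std::uint8_t v, std::uint8_t c1, std::uint16_t c2) {
  const std::uint8_t inner = static_cast<std::uint8_t>(v & c1);
  return static_cast<std::uint16_t>(std::uint16_t{inner} & c2);
}

// "After": zero-extend first, then apply one mask that a compiler could
// constant-fold from the two original constants.
constexpr std::uint16_t after(std::uint8_t v, std::uint8_t c1, std::uint16_t c2) {
  const std::uint16_t folded = static_cast<std::uint16_t>(std::uint16_t{c1} & c2);
  return static_cast<std::uint16_t>(std::uint16_t{v} & folded);
}

// Checks the identity for every 8-bit value and inner mask, for one outer mask.
inline bool foldHolds(std::uint16_t c2) {
  for (unsigned v = 0; v < 256; ++v)
    for (unsigned c1 = 0; c1 < 256; ++c1)
      if (before(static_cast<std::uint8_t>(v), static_cast<std::uint8_t>(c1), c2) !=
          after(static_cast<std::uint8_t>(v), static_cast<std::uint8_t>(c1), c2))
        return false;
  return true;
}
```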

19 months ago[include-cleaner] Ranking of providers based on hints
Kadir Cetinkaya [Tue, 29 Nov 2022 14:49:32 +0000 (15:49 +0100)]
[include-cleaner] Ranking of providers based on hints

Introduce signals to rank providers of a symbol.

Differential Revision: https://reviews.llvm.org/D139921

19 months ago[LLVM][TableGen] Support combined cells in jupyter kernel
David Spickett [Tue, 23 Aug 2022 10:36:46 +0000 (11:36 +0100)]
[LLVM][TableGen] Support combined cells in jupyter kernel

This changes the default mode to cache the code blocks we're
asked to compile until we see the new `%reset` magic to clear that cache.

This means that if you run several cells in sequence, at the end you're
compiling the code from all the cells at once.

This emulates what the ipython kernel does where it uses a persistent
interpreter state by default.

`%reset` will only be acted on when it's in the cell we're asked to run
(the newest code).

For `%args`, we use the most recent value we have cached.

The example notebook has been updated to explain that.

Depends on D132378

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D132646

19 months ago[LLVM][TableGen] Add jupyter kernel for llvm-tblgen
David Spickett [Mon, 22 Aug 2022 14:22:07 +0000 (15:22 +0100)]
[LLVM][TableGen] Add jupyter kernel for llvm-tblgen

This is based on the MLIR opt kernel:
https://github.com/llvm/llvm-project/tree/main/mlir/utils/jupyter

The intent of this is to enable experimentation and the creation
of interactive tutorials for the basics of tablegen.

Notable changes from that:
* Removed the codemirror mode settings since those won't exist
  for tablegen.
* Added "%args" "magic" to control arguments sent to llvm-tblgen.

(magics are directives, see
https://ipython.readthedocs.io/en/stable/interactive/magics.html)

For example the following:
```
%args --print-detailed-records
class Stuff {}

def water_bottle : Stuff {}
```
Produces:
```
DETAILED RECORDS for file -

-------------------- Global Variables (0) --------------------

-------------------- Classes (1) --------------------

Stuff  |<stdin>:1|
  Template args: (none)
  Superclasses: (none)
  Fields: (none)

-------------------- Records (1) --------------------

water_bottle  |<stdin>:3|
  Superclasses: Stuff
  Fields: (none)
```

Reviewed By: jpienaar, awarzynski

Differential Revision: https://reviews.llvm.org/D132378

19 months agoFix MSVC "not all control paths return a value" warning. NFC.
Simon Pilgrim [Mon, 23 Jan 2023 14:08:20 +0000 (14:08 +0000)]
Fix MSVC "not all control paths return a value" warning. NFC.

19 months ago[NFC][Instcombine] More trunc fp-to-int tests.
Samuel Parker [Mon, 23 Jan 2023 14:10:47 +0000 (14:10 +0000)]
[NFC][Instcombine] More trunc fp-to-int tests.

19 months ago[VPlan] Switch default graph traits to be recursive, update VPDomTree.
Florian Hahn [Mon, 23 Jan 2023 14:00:42 +0000 (14:00 +0000)]
[VPlan] Switch default graph traits to be recursive, update VPDomTree.

This updates the GraphTraits specialization for VPBlockBase to recurse
through VPRegionBlocks.

This in turn enables using VPDominatorTree to query dominance between
any block in a plan. This should enable additional use cases, including
improvements to def-use verification and porting IR-based transforms
that rely on the dominator tree.

Specifically, this change means that for regions, the entry and exit
blocks dominate the successors of the region.

Depends on D140512 and D142162.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D140513

19 months ago[clang] Optimize clang::Builtin::Info density
serge-sans-paille [Tue, 17 Jan 2023 09:34:59 +0000 (10:34 +0100)]
[clang] Optimize clang::Builtin::Info density

Reorganize clang::Builtin::Info so that its entries naturally align on
4-byte boundaries.

Instead of storing builtin headers as a straight char pointer, enumerate
them and store the enum. This allows using a small enum instead of a
pointer to reference them.

On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56
down to 48 bytes.
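
The pointer-to-enum idea can be sketched with two hypothetical record layouts. The field names, the `HeaderID` enumerators and the `HeaderNames` table are illustrative and do not reproduce clang::Builtin::Info; actual sizes depend on the target ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Layout that references the header through a pointer
// (8 bytes on a typical 64-bit target).
struct InfoWithPointer {
  const char *Name;
  const char *Type;
  const char *Header;
  std::uint32_t Langs;
};

// Enumerate the known headers once; one byte is enough to identify them.
enum class HeaderID : std::uint8_t { NoHeader, StdioH, StdlibH, StringH };

// Side table mapping each enumerator back to its spelling.
constexpr const char *HeaderNames[] = {nullptr, "stdio.h", "stdlib.h", "string.h"};

// Layout that stores the small enum instead; it packs next to Langs.
struct InfoWithEnum {
  const char *Name;
  const char *Type;
  std::uint32_t Langs;
  HeaderID Header;
};

// Recovers the header spelling when it is actually needed.
constexpr const char *headerName(HeaderID id) {
  return HeaderNames[static_cast<std::size_t>(id)];
}
```

The trade-off is one extra table lookup when the header name is needed, in exchange for a smaller record.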

On a release build on my Linux 64 bit machine, it shrinks the size of
libclang-cpp.so by 193kB.

The impact on performance is negligible in terms of instruction count,
but the wall time seems better, see
https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock

Differential Revision: https://reviews.llvm.org/D142024

19 months ago[NFC][AArch64] Rename SVE2p1 sclamp and uclamp tests
David Sherwood [Mon, 23 Jan 2023 13:07:18 +0000 (13:07 +0000)]
[NFC][AArch64] Rename SVE2p1 sclamp and uclamp tests

Both sclamp and uclamp are part of the SVE2p1 feature so I've
renamed the tests accordingly:

sve2-intrinsics-sclamp.ll -> sve2p1-intrinsics-sclamp.ll
sve2-intrinsics-uclamp.ll -> sve2p1-intrinsics-uclamp.ll

19 months ago[Clang][NFC] Remove documentation and mentions of deleted tools
Joseph Huber [Mon, 23 Jan 2023 13:13:33 +0000 (07:13 -0600)]
[Clang][NFC] Remove documentation and mentions of deleted tools

Summary:
These tools were deleted since LLVM 15. They are no longer present so we
should damnatio memoriae.

19 months ago[Clang] Remove flaky test line from linker wrapper test
Joseph Huber [Mon, 23 Jan 2023 13:06:53 +0000 (07:06 -0600)]
[Clang] Remove flaky test line from linker wrapper test

Summary:
This test is a little flaky and isn't as necessary anymore now that we
only generate one temporary file.

19 months ago[InstCombine] Make worklist check in memcpy from constant fold more precise
Nikita Popov [Mon, 23 Jan 2023 13:14:19 +0000 (14:14 +0100)]
[InstCombine] Make worklist check in memcpy from constant fold more precise

The phi operands need to be either in the worklist or be the
alloca itself, because that one does not require replacement.

19 months ago[AArch64][SME2] MOVA tile-to-vector and vector-to-tile should not accept VG suffix
Sander de Smalen [Thu, 12 Jan 2023 12:30:12 +0000 (12:30 +0000)]
[AArch64][SME2] MOVA tile-to-vector and vector-to-tile should not accept VG suffix

Reviewed By: MattDevereau

Differential Revision: https://reviews.llvm.org/D141601

19 months ago[AArch64][SME2] NFC: Simplify multiclasses for mova/movaz.
Sander de Smalen [Fri, 20 Jan 2023 11:46:49 +0000 (11:46 +0000)]
[AArch64][SME2] NFC: Simplify multiclasses for mova/movaz.

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D142198

19 months ago[AArch64][SME] Allow predicate-as-counter operands for psel
Sander de Smalen [Thu, 12 Jan 2023 13:03:44 +0000 (13:03 +0000)]
[AArch64][SME] Allow predicate-as-counter operands for psel

The specification says:

  For programmer convenience, an assembler must also accept
  predicate-as-counter register names for the destination predicate
  register and the first source predicate register

Reviewed By: CarolineConcatto, MattDevereau

Differential Revision: https://reviews.llvm.org/D141603

19 months ago[BOLT] Fix build error after D142214
Jay Foad [Mon, 23 Jan 2023 12:54:38 +0000 (12:54 +0000)]
[BOLT] Fix build error after D142214

19 months ago[Test] Add test exercising scenarios of widening into loop-invariant condition
Max Kazantsev [Mon, 23 Jan 2023 12:24:41 +0000 (19:24 +0700)]
[Test] Add test exercising scenarios of widening into loop-invariant condition

19 months ago[Test] Add test for PR60234
Max Kazantsev [Mon, 23 Jan 2023 12:19:40 +0000 (19:19 +0700)]
[Test] Add test for PR60234

https://github.com/llvm/llvm-project/issues/60234 explains how widening
of a branch by loop-invariant condition is causing a miscompile.

19 months ago[AArch64][SVE2p1] Add SVE2.1 fclamp intrinsic
David Sherwood [Tue, 17 Jan 2023 15:44:09 +0000 (15:44 +0000)]
[AArch64][SVE2p1] Add SVE2.1 fclamp intrinsic

Adds an intrinsic for the following instruction:

* fclamp

Differential Revision: https://reviews.llvm.org/D141942

19 months ago[X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs
Anton Bikineev [Wed, 4 Jan 2023 23:51:21 +0000 (00:51 +0100)]
[X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs

Currently both calling conventions preserve registers that are used to
store a return value. This causes the returned value to be lost:

  define i32 @bar() {
    %1 = call preserve_mostcc i32 @foo()
    ret i32 %1
  }

  define preserve_mostcc i32 @foo() {
    ret i32 2
    ; preserve_mostcc will restore %rax,
    ; whatever it was before the call.
  }

This contradicts the current documentation (preserve_allcc "behaves
identical to the `C` calling conventions on how arguments and return
values are passed") and also breaks [[clang::preserve_most]].

This change makes CSRs be preserved iff they are not used to store a
return value (e.g. %rax for scalars, {%rax:%rdx} for __int128, %xmm0
for double). For void functions no additional registers are
preserved, i.e. the behaviour is backward compatible with existing
code.

Differential Revision: https://reviews.llvm.org/D141020

19 months ago[LLDB] Fix build error after D142214
Jay Foad [Mon, 23 Jan 2023 12:27:50 +0000 (12:27 +0000)]
[LLDB] Fix build error after D142214

19 months ago[IR] Avoid creation of GEPs into vectors (in one place)
Jannik Silvanus [Thu, 19 Jan 2023 15:04:45 +0000 (16:04 +0100)]
[IR] Avoid creation of GEPs into vectors (in one place)

The method DataLayout::getGEPIndexForOffset(Type *&ElemTy, APInt &Offset)
generates GEP indices for a given byte-based offset.
This makes it possible to generate "natural" GEPs using the given type
structure if the byte offset happens to match a nested element object.

With opaque pointers and a general move towards byte-based GEPs [1],
this function may be questionable in the future.

This patch avoids creation of GEPs into vectors in routines that use
DataLayout::getGEPIndexForOffset by not returning indices in that case.
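
As a toy model of that behaviour (not the real DataLayout API), the sketch below peels one level of a nested type: for arrays it turns a byte offset into an element index plus a remainder, and for vectors it declines to return an index. `ToyType` and `toyIndexForOffset` are hypothetical, and bounds checks and struct types are left out.

```cpp
#include <cstdint>
#include <optional>

// Toy type description: just enough to express "array of T" vs "vector of T".
struct ToyType {
  enum Kind { Scalar, Array, Vector } kind;
  std::uint64_t elemSize; // byte size of one element (0 for Scalar)
  const ToyType *elem;    // element type (nullptr for Scalar)
};

// On success, returns the element index, descends `ty` into the element
// type and reduces `offset` to the remainder within that element.
// Returns nullopt and leaves both arguments untouched for scalars and,
// mirroring the patch, for vectors.
inline std::optional<std::uint64_t> toyIndexForOffset(const ToyType *&ty,
                                                      std::uint64_t &offset) {
  if (ty->kind != ToyType::Array || ty->elemSize == 0)
    return std::nullopt;
  const std::uint64_t index = offset / ty->elemSize;
  offset %= ty->elemSize;
  ty = ty->elem;
  return index;
}
```

A caller would loop until the function returns nullopt and express any remaining offset as a byte-based GEP.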

The reason is that A) GEPs into vectors have been discouraged for a long
time [2], and B) that GEPs into vectors are currently broken if the element
type is overaligned [1]. This is also demonstrated by a lit test where
previously InstCombine replaced valid loads by poison. Note that
the result of InstCombine on that test is *still* invalid, because
padding bytes are assumed.
Moreover, GEPs into vectors may be outright forbidden in the future [1].

[1]: https://discourse.llvm.org/t/67497
[2]: https://llvm.org/docs/GetElementPtr.html

The test case is new. It will be precommitted if this patch is accepted.

Differential Revision: https://reviews.llvm.org/D142146

19 months ago[Transforms] Add lit test for instcombine on load into vector of overaligned elements.
Jannik Silvanus [Thu, 19 Jan 2023 17:56:11 +0000 (18:56 +0100)]
[Transforms] Add lit test for instcombine on load into vector of overaligned elements.

The result is currently broken in two ways:

 - Valid loads are replaced by poison
 - An array-like layout with padding bytes is assumed

This commit serves as precommit for a patch that addresses the first issue.
The second issue will remain a TODO.

Contributors:
    Sebastian Neubauer <sebastian.neubauer@amd.com>