platform/upstream/llvm.git
2 years ago[X86] Replace X86ISD::AVG with generic ISD::AVGCEILU
David Green [Fri, 11 Feb 2022 18:57:18 +0000 (18:57 +0000)]
[X86] Replace X86ISD::AVG with generic ISD::AVGCEILU

Pulled out of D106237, this replaces the X86ISD::AVG DAG node with the
generic ISD::AVGCEILU. It doesn't remove the detectAVGPattern method,
but the extra generic ISel matching does alter the existing test.

Differential Revision: https://reviews.llvm.org/D119073

2 years ago[hwasan] keep debug intrinsics in AllocaInfo.
Florian Mayer [Thu, 10 Feb 2022 23:57:47 +0000 (15:57 -0800)]
[hwasan] keep debug intrinsics in AllocaInfo.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498

2 years ago[flang] Allow mixed association of procedure pointers and targets
Peter Steinfeld [Thu, 10 Feb 2022 03:46:34 +0000 (19:46 -0800)]
[flang] Allow mixed association of procedure pointers and targets

Section 10.2.2.4, paragraph 3 states that a procedure pointer with an explicit
interface must have the same characteristics as its target.  Previously, we
interpreted this as disallowing such pointers to point to procedures with
implicit interfaces.  But several other compilers allow this.

We make an exception for the case where the explicit interface cannot be
called via an implicit interface.

This change makes us allow it as well.

Differential Revision: https://reviews.llvm.org/D119404

2 years ago[lld/coff] Make lld-link work in a non-MSVC shell, add /winsysroot:
Peter Kasting [Fri, 11 Feb 2022 18:45:58 +0000 (13:45 -0500)]
[lld/coff] Make lld-link work in a non-MSVC shell, add /winsysroot:

Makes lld-link work in a non-MSVC shell by autodetecting the MSVC toolchain. Also
adds support for /winsysroot and a few other switches.

All this is done by refactoring to share code with clang-cl's existing support
for the same.

Differential Revision: https://reviews.llvm.org/D118070

2 years ago[gn build] Manually port c7eb84634519e6497
Arthur Eubanks [Fri, 11 Feb 2022 18:51:52 +0000 (10:51 -0800)]
[gn build] Manually port c7eb84634519e6497

Since the bot is broken due to hwasan issues, it's not auto updating the file lists.

2 years ago[nfc] [hwasan] factor out logic to collect info about stack
Florian Mayer [Thu, 3 Feb 2022 19:56:42 +0000 (11:56 -0800)]
[nfc] [hwasan] factor out logic to collect info about stack

This is the first step in unifying some of the logic between hwasan and
mte stack tagging. This only moves code around; changes to converge
different implementations of the same logic follow later.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118947

2 years ago[RGT] Refactor environment-specific checks to use GTEST_SKIP()
Paul Robinson [Fri, 11 Feb 2022 18:43:22 +0000 (10:43 -0800)]
[RGT] Refactor environment-specific checks to use GTEST_SKIP()

This allows using GTEST_SKIP() to identify un-executed tests.

Found by the Rotten Green Tests project.

2 years ago[RGT] Refactor Windows-specific checks into their own test
Paul Robinson [Fri, 11 Feb 2022 18:42:38 +0000 (10:42 -0800)]
[RGT] Refactor Windows-specific checks into their own test

This allows using GTEST_SKIP() to identify un-executed tests.

Found by the Rotten Green Tests project.

2 years ago[RGT] Exercise both paths through a test
Paul Robinson [Fri, 11 Feb 2022 18:41:24 +0000 (10:41 -0800)]
[RGT] Exercise both paths through a test

BitcastToGEP had an opaque/typed pointer decision point; make sure it
exercises both sides.

Found by the Rotten Green Tests project.

2 years ago[OpenMP][FIX] The `llvm.amdgcn.s.barrier` is actually not aligned
Johannes Doerfert [Fri, 11 Feb 2022 18:22:51 +0000 (12:22 -0600)]
[OpenMP][FIX] The `llvm.amdgcn.s.barrier` is actually not aligned

If we assume `llvm.amdgcn.s.barrier` is aligned we may remove it and
cause OpenMP GPU applications on the AMD GPU to be stuck or wrongly
synchronized.

Reported by Carlo Bertolli.

2 years ago[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOf...
Arthur Eubanks [Fri, 11 Feb 2022 18:39:26 +0000 (10:39 -0800)]
[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()

2 years agosanitizer_common: make internal/external headers compatible
Dmitry Vyukov [Fri, 11 Feb 2022 15:11:23 +0000 (16:11 +0100)]
sanitizer_common: make internal/external headers compatible

This is a follow up to 4f3f4d672254
("sanitizer_common: fix __sanitizer_get_module_and_offset_for_pc signature mismatch")
which fixes a similar problem for msan build.

I am getting the following error compiling a unit test for code that
uses sanitizer_common headers and googletest transitively includes
sanitizer interface headers:

In file included from third_party/gwp_sanitizers/singlestep_test.cpp:3:
In file included from sanitizer_common/sanitizer_common.h:19:
sanitizer_interface_internal.h:41:5: error: typedef redefinition with different types
('struct __sanitizer_sandbox_arguments' vs 'struct __sanitizer_sandbox_arguments')
  } __sanitizer_sandbox_arguments;
common_interface_defs.h:39:3: note: previous definition is here
} __sanitizer_sandbox_arguments;

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119546

2 years ago[PSE] Allow duplicate predicates in debug output
Philip Reames [Fri, 11 Feb 2022 18:37:39 +0000 (10:37 -0800)]
[PSE] Allow duplicate predicates in debug output

This lets us avoid redundant implication work in the constructor of SCEVUnionPredicate which simplifies an upcoming change.  If we're actually building a predicate via PSE, that goes through addPredicate which does include the implication check.

2 years ago[X86] combineVSelectToBLENDV - handle vselect(vXi1,A,B) -> blendv(sext(vXi1),A,B)
Simon Pilgrim [Fri, 11 Feb 2022 18:38:07 +0000 (18:38 +0000)]
[X86] combineVSelectToBLENDV - handle vselect(vXi1,A,B) -> blendv(sext(vXi1),A,B)

For pre-AVX512 targets, attempt to sign-extend a vXi1 condition mask to pass to a X86ISD::BLENDV node
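
The idea can be sketched in plain C++ as a scalar model. This is only an illustration under stated assumptions, not the X86 backend code: the helper names `sextI1`, `blendv`, and `vselect` are hypothetical, lanes are modeled as 8-bit, and the blend is assumed to pick the first operand when the mask lane's sign bit is set.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model; none of these names are LLVM APIs.

// Sign-extend a 1-bit condition to a full 8-bit lane: true -> 0xFF, false -> 0x00.
inline int8_t sextI1(bool c) { return c ? int8_t{-1} : int8_t{0}; }

// Model of a variable blend that only inspects the sign bit of each mask lane.
// Assumption: a set sign bit selects the lane from `a`, otherwise from `b`.
template <std::size_t N>
std::array<int8_t, N> blendv(const std::array<int8_t, N> &mask,
                             const std::array<int8_t, N> &a,
                             const std::array<int8_t, N> &b) {
  std::array<int8_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = (mask[i] < 0) ? a[i] : b[i];
  return out;
}

// Model of vselect with a vector-of-i1 condition, rewritten as
// blendv(sext(cond), a, b).
template <std::size_t N>
std::array<int8_t, N> vselect(const std::array<bool, N> &cond,
                              const std::array<int8_t, N> &a,
                              const std::array<int8_t, N> &b) {
  std::array<int8_t, N> mask{};
  for (std::size_t i = 0; i < N; ++i)
    mask[i] = sextI1(cond[i]);
  return blendv(mask, a, b);
}
```

Because the sign-extended mask is all-ones or all-zeros per lane, the sign bit alone carries the original i1 condition.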

Fixes Issue #53760

2 years ago[AMDGPU] Merge AMDGPULDSUtils into AMDGPUMemoryUtils
Stanislav Mekhanoshin [Fri, 11 Feb 2022 00:30:10 +0000 (16:30 -0800)]
[AMDGPU] Merge AMDGPULDSUtils into AMDGPUMemoryUtils

Differential Revision: https://reviews.llvm.org/D119502

2 years ago[ISel] Port AArch64 HADD and RHADD to ISel
David Green [Fri, 11 Feb 2022 18:28:56 +0000 (18:28 +0000)]
[ISel] Port AArch64 HADD and RHADD to ISel

This ports the aarch64 combines for HADD and RHADD over to DAG combine,
so that they can be used in more architectures (notably MVE in a
followup patch). They are renamed to AVGFLOOR and AVGCEIL in the
process, to avoid confusion with instructions such as X86 hadd. The code
was also rewritten slightly to remove the AArch64 idiosyncrasies.

The general pattern for an AVGFLOORS is
  %xe = sext i8 %x to i32
  %ye = sext i8 %y to i32
  %a = add i32 %xe, %ye
  %r = lshr i32 %a, 1
  %t = trunc i32 %r to i8

An AVGFLOORU is equivalent, with zext in place of sext. Because of the
truncate, lshr and ashr are interchangeable here, as the top bits are not
demanded. An AVGCEIL also rounds up, so it includes an extra add of 1.
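
As a rough illustration, the same patterns can be written as scalar C++ on 8-bit values widened to 32 bits. The function names `avgFloorU`, `avgCeilU`, and `avgFloorS` are made up for this sketch and are not LLVM APIs; the final conversion to `int8_t` is assumed to wrap modulo 256, which C++20 guarantees and mainstream compilers do in earlier standards.

```cpp
#include <cstdint>

// Hypothetical scalar models of the DAG patterns; not LLVM code.

// AVGFLOORU: zero-extend, add, shift right by one, truncate.
inline uint8_t avgFloorU(uint8_t x, uint8_t y) {
  uint32_t sum = uint32_t{x} + uint32_t{y};
  return static_cast<uint8_t>(sum >> 1);
}

// AVGCEILU: same as above with an extra add of 1 before the shift.
inline uint8_t avgCeilU(uint8_t x, uint8_t y) {
  uint32_t sum = uint32_t{x} + uint32_t{y} + 1;
  return static_cast<uint8_t>(sum >> 1);
}

// AVGFLOORS: sign-extend, add, logical shift right, truncate.
// The logical shift corrupts only the top bit of the wide value, which the
// truncation to 8 bits discards, so it matches an arithmetic shift here.
inline int8_t avgFloorS(int8_t x, int8_t y) {
  int32_t sum = int32_t{x} + int32_t{y};
  uint32_t shifted = static_cast<uint32_t>(sum) >> 1;
  return static_cast<int8_t>(static_cast<uint8_t>(shifted));
}
```

Widening before the add is what keeps `avgFloorU(255, 255)` from overflowing the 8-bit type.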

Differential Revision: https://reviews.llvm.org/D106237

2 years ago[AlwaysInliner] Respect noinline call site attribute
Dávid Bolvanský [Fri, 11 Feb 2022 18:22:51 +0000 (19:22 +0100)]
[AlwaysInliner] Respect noinline call site attribute

```
always_inline foo() { }

bar () {

noinline foo();
}
```

We should prefer the call site attribute over the attribute on the decl. This is a fix for AlwaysInliner; a similar fix is needed for the normal Inliner (follow-up).
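
The precedence rule can be modeled in a few lines of C++. This is a simplified sketch, not the AlwaysInliner implementation: `CallInfo` and `shouldAlwaysInline` are hypothetical names, and only the two attributes from the example above are modeled.

```cpp
// Hypothetical model of the decision; not LLVM's data structures.
struct CallInfo {
  bool calleeIsAlwaysInline; // attribute on the callee declaration
  bool callSiteIsNoInline;   // attribute on this particular call
};

// The call site attribute wins: a noinline call is left alone even when the
// callee is marked always_inline.
inline bool shouldAlwaysInline(const CallInfo &call) {
  if (call.callSiteIsNoInline)
    return false;
  return call.calleeIsAlwaysInline;
}
```

In the `foo`/`bar` example, the call inside `bar` corresponds to `CallInfo{true, true}`, so it is not inlined.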

Related to https://reviews.llvm.org/D119061

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D119553

2 years ago[CodeView] Match any backend version in the new test
Reid Kleckner [Fri, 11 Feb 2022 18:20:10 +0000 (10:20 -0800)]
[CodeView] Match any backend version in the new test

This makes the test pass for any LLVM_VERSION_MAJOR/MINOR value. Vendors
override these, and they change every six months.

2 years ago[libc++] Remove __functional_base
Nikolas Klauser [Fri, 11 Feb 2022 18:15:18 +0000 (19:15 +0100)]
[libc++] Remove __functional_base

Reviewed By: ldionne, Quuxplusone, #libc

Spies: Mordante, mgorny, libcxx-commits, arichardson, llvm-commits, arphaman

Differential Revision: https://reviews.llvm.org/D119439

2 years agoRevert "StackProtector: ignore debug insts when splitting blocks."
Tim Northover [Fri, 11 Feb 2022 18:06:28 +0000 (18:06 +0000)]
Revert "StackProtector: ignore debug insts when splitting blocks."

This reverts commit 7605ca85f1a8e4e61e7de98856630d67da11aaae.

It caused an assertion failure in Fuchsia.

2 years ago[InferAddressSpaces] Fix assert on invalid cast ordering
Austin Kerbow [Fri, 11 Feb 2022 06:43:45 +0000 (22:43 -0800)]
[InferAddressSpaces] Fix assert on invalid cast ordering

If a cast is needed when replacing uses with newly created values, the
cast must be inserted after the instruction that defines the new value.

Fixes: SWDEV-321215

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D119524

2 years ago[CoroFrame][OpaquePtr] Remove getPointerElementType() call
Arthur Eubanks [Fri, 11 Feb 2022 17:52:44 +0000 (09:52 -0800)]
[CoroFrame][OpaquePtr] Remove getPointerElementType() call

Get it from the byval type instead.

2 years ago[Hexagon] Add patterns for select(i1, Q, Q)
Krzysztof Parzyszek [Fri, 11 Feb 2022 17:19:00 +0000 (09:19 -0800)]
[Hexagon] Add patterns for select(i1, Q, Q)

2 years ago[gn build] Port 31f9519d48c2
LLVM GN Syncbot [Fri, 11 Feb 2022 17:36:54 +0000 (17:36 +0000)]
[gn build] Port 31f9519d48c2

2 years ago[mlir][bufferize] Use rewriter instead of replacing all uses directly
Matthias Springer [Fri, 11 Feb 2022 17:31:00 +0000 (02:31 +0900)]
[mlir][bufferize] Use rewriter instead of replacing all uses directly

This is important for compatibility with DialectConversion.

2 years ago[RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo...
Craig Topper [Fri, 11 Feb 2022 05:33:08 +0000 (21:33 -0800)]
[RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo.Exit.

This is an alternative to D118667 that, instead of fixing the store
to match phase 1, tries to detect the mismatch with the expected
value at the end of the block. This inserts a vsetvli after the vse
to satisfy the requirement of the other basic block.

We still have serious design issues in the pass that are going to
require some rethinking.

Differential Revision: https://reviews.llvm.org/D119518

2 years agoRevert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC...
Craig Topper [Fri, 11 Feb 2022 05:03:12 +0000 (21:03 -0800)]
Revert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC] Add missing words to comment. NFC"

This reverts commit f943c58cae2480755cecdac5be832274f238df93
and commit 7eb781072744b31a60e82b5a5903471032d4845f.

This introduced a new bug that appears to be easier to hit.

Differential Revision: https://reviews.llvm.org/D119517

2 years ago[RISCV] Add test case for a vsetvli insertion bug found after D118667.
Craig Topper [Thu, 10 Feb 2022 21:55:29 +0000 (13:55 -0800)]
[RISCV] Add test case for a vsetvli insertion bug found after D118667.

We're missing a vsetvli before a vse after a redsum in this test.

This appears to be because the vmv.s.x has a VL of 1, but did not
trigger a vsetvli because it is a scalar move op and any non-zero
VL would work. So it looked at the predecessors and decided that
they all had a non-zero VL. Then the redsum was visited; it
also took the VL from the predecessors since the vmv.s.x and the 4
were found compatible.

Finally we visit the vse and it looks at the BBLocalInfo and sees
that it is compatible because it contains a VL of 1 from the vmv.s.x,
the first instruction in the block. BBLocalInfo was not updated
when the vredsum was visited because BBLocalInfo was valid and no
vsetvli was generated.

I think fundamentally the vmv.s.x optimization has the same first
phase and third phase not matching problem that D118667 was trying
to fix for stores.

Differential Revision: https://reviews.llvm.org/D119516

2 years ago[M68k] Adopt the new VarLenCodeEmitterGen for arithmetic instructions
Min-Yih Hsu [Mon, 6 Dec 2021 03:36:10 +0000 (11:36 +0800)]
[M68k] Adopt the new VarLenCodeEmitterGen for arithmetic instructions

This patch refactors all the existing M68k arithmetic instructions
to use the new VarLenCodeEmitterGen infrastructure.

This patch is tested by the existing MC test cases.

Note that one of the codegen tests needed to be updated because the
ordering of two equivalent instructions was switched.

Differential Revision: https://reviews.llvm.org/D115234

2 years ago[TableGen][CodeEmitter] Introducing the VarLenCodeEmitterGen infrastructure
Min-Yih Hsu [Mon, 6 Dec 2021 03:01:17 +0000 (11:01 +0800)]
[TableGen][CodeEmitter] Introducing the VarLenCodeEmitterGen infrastructure

Full write up:
https://gist.github.com/mshockwave/66e98d099256deefc062633909bb7b5b

The existing CodeEmitterGen infrastructure is unable to generate encoder
functions for ISAs with variable-length instructions. This patch
introduces a new infrastructure to support variable-length instruction
encoding, including a new TableGen syntax for writing instruction
encoding directives and a new TableGen backend component,
VarLenCodeEmitterGen, built on top of CodeEmitterGen.

Differential Revision: https://reviews.llvm.org/D115128

2 years ago[TableGen][AMDGPU] Allow empty register classes
Jay Foad [Fri, 11 Feb 2022 14:07:15 +0000 (14:07 +0000)]
[TableGen][AMDGPU] Allow empty register classes

Remove ARTIFICIAL_VGPR which only existed to make VReg_1 not empty.

Differential Revision: https://reviews.llvm.org/D119552

2 years ago[libc++] Remove unused include from ranges_swap_ranges.h
Joe Loser [Thu, 10 Feb 2022 23:11:39 +0000 (18:11 -0500)]
[libc++] Remove unused include from ranges_swap_ranges.h

`ranges_swap_ranges.h` includes `<type_traits>` but does not use anything from
it. So, remove the include.

Differential Revision: https://reviews.llvm.org/D119491

2 years ago[AMDGPU] Make enable-flat-scratch a subtarget feature
Sebastian Neubauer [Fri, 11 Feb 2022 17:18:25 +0000 (18:18 +0100)]
[AMDGPU] Make enable-flat-scratch a subtarget feature

Use a subtarget feature instead of a command line argument to reduce
global state.
We want to enable flat scratch for graphics in some cases and this
doesn't work well with command line options.

Differential Revision: https://reviews.llvm.org/D119425

2 years ago[AMDGPU] replace hostcall module flag with function attribute
Sameer Sahasrabuddhe [Fri, 11 Feb 2022 05:13:41 +0000 (10:43 +0530)]
[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implicitarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

2 years ago[AMDGPU] Add a new intrinsic to control fp_trunc rounding mode
Julien Pages [Fri, 11 Feb 2022 17:00:09 +0000 (12:00 -0500)]
[AMDGPU] Add a new intrinsic to control fp_trunc rounding mode

Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.

Differential Revision: https://reviews.llvm.org/D110579

2 years ago[ConstraintElimination] Add test for #48253.
Florian Hahn [Fri, 11 Feb 2022 17:07:13 +0000 (17:07 +0000)]
[ConstraintElimination] Add test for #48253.

Test from https://github.com/llvm/llvm-project/issues/48253.

2 years ago[CSSPGO] Do not recount callee samples when computing profile summary for nested...
Hongtao Yu [Fri, 11 Feb 2022 05:55:19 +0000 (21:55 -0800)]
[CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.

When generating nested CS profile with all calling contexts of a function duplicated into a base profile under `--generate-merged-base-profiles`, do not recount callee samples when computing profile summary. This fixes the profile summary mismatch between flat cs profile and nested cs profile, for both extbinary and text format.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D119494

2 years ago[Bazel] Document continuous and pre-merge testing
Geoffrey Martin-Noble [Fri, 11 Feb 2022 16:59:11 +0000 (08:59 -0800)]
[Bazel] Document continuous and pre-merge testing

2 years ago[SystemZ/z/OS] Add alias for XPLINK return
Kai Nacke [Fri, 11 Feb 2022 15:24:50 +0000 (10:24 -0500)]
[SystemZ/z/OS] Add alias for XPLINK return

The XPLINK return `b 2(7)` has size 4 bytes, while the Linux return
`br 7` only has size 2 bytes. Thus a new alias is required to have the correct
instruction byte count. It also fixes the conditional return code.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D119437

2 years ago[X86] Move combineToExtendBoolVectorInReg before the select combines. NFC.
Simon Pilgrim [Fri, 11 Feb 2022 16:51:36 +0000 (16:51 +0000)]
[X86] Move combineToExtendBoolVectorInReg before the select combines. NFC.

Avoid the need for a forward declaration.

Cleanup prep for Issue #53760

2 years ago[libc++][format] LWG-3648 format should not print bool with 'c'
Mark de Wever [Wed, 9 Feb 2022 16:36:12 +0000 (17:36 +0100)]
[libc++][format] LWG-3648 format should not print bool with 'c'

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D119350

2 years ago[libc++][format] LWG-3654 basic_format_context::arg(size_t) should be noexcept
Mark de Wever [Wed, 9 Feb 2022 16:36:12 +0000 (17:36 +0100)]
[libc++][format] LWG-3654 basic_format_context::arg(size_t) should be noexcept

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D119349

2 years ago[X86] combineToExtendBoolVectorInReg - use explicit arguments. NFC.
Simon Pilgrim [Fri, 11 Feb 2022 16:39:56 +0000 (16:39 +0000)]
[X86] combineToExtendBoolVectorInReg - use explicit arguments. NFC.

Replace the *_EXTEND node with the raw operands, this will make it easier to use combineToExtendBoolVectorInReg for any boolvec extension combine.

Cleanup prep for Issue #53760

2 years ago[libc++][nfc] Add TEST_HAS_NO_CHAR8_T.
Mark de Wever [Wed, 2 Feb 2022 18:28:05 +0000 (19:28 +0100)]
[libc++][nfc] Add TEST_HAS_NO_CHAR8_T.

This avoids using a libc++ internal macro in our tests. This version
doesn't depend on the internal macro but redefines it.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D119460

2 years ago[gn build] Port 3b470d1ce992
LLVM GN Syncbot [Fri, 11 Feb 2022 16:20:57 +0000 (16:20 +0000)]
[gn build] Port 3b470d1ce992

2 years ago[libc++][ranges] Implement ranges::min_element
Nikolas Klauser [Fri, 11 Feb 2022 12:11:57 +0000 (13:11 +0100)]
[libc++][ranges] Implement ranges::min_element

Implement ranges::min_element

Reviewed By: Quuxplusone, Mordante, #libc

Spies: miscco, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D117025

2 years ago[gn build] Port 1e77b396ffe4
LLVM GN Syncbot [Fri, 11 Feb 2022 16:11:07 +0000 (16:11 +0000)]
[gn build] Port 1e77b396ffe4

2 years ago[InstCombine] Check source element type in gep of phi of gep fold
Nikita Popov [Fri, 11 Feb 2022 16:09:52 +0000 (17:09 +0100)]
[InstCombine] Check source element type in gep of phi of gep fold

2 years ago[libc++] Add ranges::in_fun_result
Nikolas Klauser [Fri, 11 Feb 2022 16:01:58 +0000 (17:01 +0100)]
[libc++] Add ranges::in_fun_result

Add `ranges::in_fun_result`

Reviewed By: Quuxplusone, #libc, var-const

Spies: CaseyCarter, var-const, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D116974

2 years ago[dexter] Don't generate results files by default
OCHyams [Fri, 11 Feb 2022 15:45:07 +0000 (15:45 +0000)]
[dexter] Don't generate results files by default

Dexter saves various files to a new results directory each time it is run
(including when it's run by lit tests) and there isn't a way to opt-out. This
patch reconfigures the behaviour to be opt-in by removing the default
`--results-directory` location. Now results are only saved if
`--results-directory` is specified.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D119545

2 years agoInferAddressSpaces: Fix assert on inferred source for inttoptr/ptrtoint
Matt Arsenault [Wed, 9 Feb 2022 00:22:00 +0000 (19:22 -0500)]
InferAddressSpaces: Fix assert on inferred source for inttoptr/ptrtoint

If a source value from which we could infer an address space went
through a ptrtoint/inttoptr pair, this would fail since bitcast
can't change the address space.

Fixes issue 53665.

2 years ago[PHITransAddr] Check GEP source element type
Nikita Popov [Fri, 11 Feb 2022 15:22:21 +0000 (16:22 +0100)]
[PHITransAddr] Check GEP source element type

It's not the same GEP if the source element type is different.

2 years ago[clang][sema] - remove CodeCompleter nullptr checks
Simon Pilgrim [Fri, 11 Feb 2022 15:09:32 +0000 (15:09 +0000)]
[clang][sema] - remove CodeCompleter nullptr checks

All paths have already dereferenced the CodeCompleter pointer in the ResultBuilder constructor

2 years ago[clang][sema] ActOnExplicitInstantiation - remove Prev nullptr check
Simon Pilgrim [Fri, 11 Feb 2022 15:08:31 +0000 (15:08 +0000)]
[clang][sema] ActOnExplicitInstantiation - remove Prev nullptr check

All paths have already dereferenced the Prev pointer

2 years ago[clang] RewriteModernObjC::SynthBlockInitExpr - remove block nullptr check
Simon Pilgrim [Fri, 11 Feb 2022 15:05:09 +0000 (15:05 +0000)]
[clang] RewriteModernObjC::SynthBlockInitExpr - remove block nullptr check

All paths have already dereferenced the block pointer

2 years ago[lld-macho][nfc] Rename %no_fatal_warnings_lld in tests
Jez Ng [Fri, 11 Feb 2022 15:06:38 +0000 (10:06 -0500)]
[lld-macho][nfc] Rename %no_fatal_warnings_lld in tests

... to use hyphens instead of underscores, making it consistent with
our other substitutions like %no-arg-lld and %lld-watchos.

Reviewed By: keith

Differential Revision: https://reviews.llvm.org/D119513

2 years ago[clang] inheritance fix for nomerge attribute
Dávid Bolvanský [Fri, 11 Feb 2022 14:49:06 +0000 (15:49 +0100)]
[clang] inheritance fix for nomerge attribute

Discussed here: https://reviews.llvm.org/D119061#3310822

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D119451

2 years ago[OpenCL] Adjust diagnostic for subgroup support.
Anton Zabaznov [Fri, 11 Feb 2022 12:54:55 +0000 (15:54 +0300)]
[OpenCL] Adjust diagnostic for subgroup support.

The OpenCL C 3.0 __opencl_c_subgroups feature is slightly different
than other equivalent features and extensions (fp64 and 3d image writes):
an OpenCL C 3.0 device can support the extension but not the feature.
cl_khr_subgroups requires subgroup independent forward progress.

This patch adjusts the check which is used when translating language
builtins to check whether either the extension or the feature is supported.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D118999

2 years ago[pseudo] NFC, fix some typos.
Haojian Wu [Fri, 11 Feb 2022 14:34:40 +0000 (15:34 +0100)]
[pseudo] NFC, fix some typos.

2 years ago[OpenMP] libomp: fix bug in implementation of distribute construct.
AndreyChurbanov [Fri, 11 Feb 2022 14:34:26 +0000 (17:34 +0300)]
[OpenMP] libomp: fix bug in implementation of distribute construct.

Fixed incorrect distribution of iterations between different target regions.

Differential Revision: https://reviews.llvm.org/D118393

2 years ago[NFC][SLP] Set default parameter for Offset equal to zero
Anton Afanasyev [Fri, 11 Feb 2022 12:49:44 +0000 (15:49 +0300)]
[NFC][SLP] Set default parameter for Offset equal to zero

2 years ago[clang-format] Avoid multiple calls to FormatToken::getNextNonComment(). NFC.
Marek Kurdej [Fri, 11 Feb 2022 14:15:18 +0000 (15:15 +0100)]
[clang-format] Avoid multiple calls to FormatToken::getNextNonComment(). NFC.

2 years ago[clang-format] Mark FormatToken::getNextNonComment() nodiscard. NFC.
Marek Kurdej [Fri, 11 Feb 2022 13:19:59 +0000 (14:19 +0100)]
[clang-format] Mark FormatToken::getNextNonComment() nodiscard. NFC.

2 years ago[docs] Fix missing space in the GettingStarted documentation
Louis Dionne [Fri, 11 Feb 2022 14:17:33 +0000 (09:17 -0500)]
[docs] Fix missing space in the GettingStarted documentation

2 years ago[TableGen] Dump RC.Allocatable with -register-info-debug
Jay Foad [Fri, 11 Feb 2022 13:48:28 +0000 (13:48 +0000)]
[TableGen] Dump RC.Allocatable with -register-info-debug

2 years ago[test-release.sh] Add option to disable building clang-tools-extra during release...
Amy Kwan [Fri, 11 Feb 2022 07:06:25 +0000 (01:06 -0600)]
[test-release.sh] Add option to disable building clang-tools-extra during release testing.

This patch adds an option (no-clang-tools) to disable building clang-tools-extra when
performing release testing. Prior to this patch, clang-tools-extra was built by default,
but on some platforms (such as AIX), clang-tools-extra is not supported, and so we do
not normally build it. Furthermore, this change should not change the invocation for
targets that build clang-tools-extra normally.

Differential Revision: https://reviews.llvm.org/D119520

2 years ago[InstCombine] Check source element type in phi of gep fold
Nikita Popov [Fri, 11 Feb 2022 13:25:09 +0000 (14:25 +0100)]
[InstCombine] Check source element type in phi of gep fold

Rather than checking that the type is the same (which is always
the case, given how these are part of the same phi) check that the
source element type is the same. With opaque pointers, this is no
longer implied.

2 years ago[RISCV] Add the policy operand for some masked RVV ternary IR intrinsics.
Zakk Chen [Fri, 11 Feb 2022 12:24:37 +0000 (04:24 -0800)]
[RISCV] Add the policy operand for some masked RVV ternary IR intrinsics.

Masked reduction intrinsics are special cases which don't need to have a policy
operand. The mask only affects which elements are read. It doesn't affect the
destination register.
The reduction intrinsics have a dedicated destination operand. If it
is undef, we use tail agnostic. If it is not undef, we use tail
undisturbed.

Co-Authored-by: Craig Topper <craig.topper@sifive.com>
Differential Revision: https://reviews.llvm.org/D117681

2 years ago[mlir][MemRef] Fix MemRefCopyOpLowering to use correct number of bytes
Adrian Kuegel [Fri, 11 Feb 2022 11:53:47 +0000 (12:53 +0100)]
[mlir][MemRef] Fix MemRefCopyOpLowering to use correct number of bytes

When lowering to memrefCopy call, the size for i1 type was calculated as 0.
Instead of using getTypeSizeInBits() and dividing by 8, we should just use getTypeSize().
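
The arithmetic behind the bug can be shown with a small standalone C++ sketch. It does not call the MLIR data layout API; `sizeViaBitsDividedBy8` and `sizeInBytesRoundedUp` are hypothetical helpers, and the sketch assumes the byte-size query rounds the bit width up to whole bytes.

```cpp
#include <cstdint>

// Hypothetical helpers illustrating the two computations; not MLIR code.

// Truncating division: a 1-bit type yields 0 bytes, so nothing gets copied.
inline uint64_t sizeViaBitsDividedBy8(uint64_t sizeInBits) {
  return sizeInBits / 8;
}

// Assumed behavior of a byte-size query: round up to whole bytes.
inline uint64_t sizeInBytesRoundedUp(uint64_t sizeInBits) {
  return (sizeInBits + 7) / 8;
}
```

For widths that are multiples of 8 the two agree, which is why the problem only shows up for types such as i1.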

Differential Revision: https://reviews.llvm.org/D119540

2 years ago[OpenCL] Add support of language builtins for OpenCL C 3.0
Anton Zabaznov [Mon, 7 Feb 2022 12:45:42 +0000 (15:45 +0300)]
[OpenCL] Add support of language builtins for OpenCL C 3.0

OpenCL C 3.0 introduces optionality to some builtins, in particular
to those which are conditionally supported with pipe, device enqueue
and generic address space features.

The idea is to conditionally support such builtins depending on the language options
being set for a certain feature. This allows users to define functions with names
of those optional builtins in OpenCL (as such names are not reserved).

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D118605

2 years ago[demangler] Adjust unqualified name parsing
Nathan Sidwell [Tue, 25 Jan 2022 20:23:31 +0000 (12:23 -0800)]
[demangler] Adjust unqualified name parsing

The unqualified name grammar includes <ctor-dtor-name>, but we handle
that specially in parseNestedName.  This is a little awkward.  We can
pass in the current scope and have parseUnqualifiedName deal with
cdtors too.  That also allows a couple of other simplifications:

1) parseUnqualifiedName can also build up the NestedName, when the
provided scope is non-null.  Which means ...

2) parseUnscopedName can pass a "std" scope in (and tailcall).

3) ... and also parseNestedName need not construct the nestedname itself.

4) also parseNestedName's detection of a cdtor-name doesn't have to
rule out a decomposition name anymore.

This change also makes adding module demangling more
straightforward, btw.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D119154

2 years ago[GVN] Store source element type for GEP expressions
Nikita Popov [Fri, 11 Feb 2022 12:01:17 +0000 (13:01 +0100)]
[GVN] Store source element type for GEP expressions

To avoid incorrectly merging GEPs with different source types
under opaque pointers.

To avoid increasing the Expression structure size, this reuses the
existing type member. The code does not rely on this to be the
expression result type, it's only used as a disambiguator.

2 years ago[AArch64][SVE] Fix selection failure caused by fp/int convert using non-Neon types
Bradley Smith [Thu, 10 Feb 2022 12:38:02 +0000 (12:38 +0000)]
[AArch64][SVE] Fix selection failure caused by fp/int convert using non-Neon types

Fixes: #53679

Differential Revision: https://reviews.llvm.org/D119428

2 years ago[mlir][MemRef] Fix MemRefCastOpLowering for 32 bit index type.
Adrian Kuegel [Fri, 11 Feb 2022 10:45:47 +0000 (11:45 +0100)]
[mlir][MemRef] Fix MemRefCastOpLowering for 32 bit index type.

The lowering creates llvm.insertvalue with the rank value, so it needs to use
index type instead of 64 bit integer type. Otherwise, we get an error:

'llvm.insertvalue' op Type mismatch: cannot insert 'i64' into '!llvm.struct<(i32, ptr<i8>)>'

Differential Revision: https://reviews.llvm.org/D119534

2 years ago[MLIR][Presburger] normalizeDivisionByGCD: fix bug when constant term is negative
Arjun P [Fri, 11 Feb 2022 11:31:50 +0000 (17:01 +0530)]
[MLIR][Presburger] normalizeDivisionByGCD: fix bug when constant term is negative

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D119531

2 years ago[IR] Check GEP source type when comparing instructions
Nikita Popov [Fri, 11 Feb 2022 11:25:36 +0000 (12:25 +0100)]
[IR] Check GEP source type when comparing instructions

Two GEPs with same indices but different source type are not the
same.

Worth noting that FunctionComparator already handles this correctly.
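
A simplified C++ model shows why the source element type matters once the pointer type no longer carries it. `SimpleGEP`, `byteOffset`, and `sameAddressComputation` are hypothetical names for this sketch; it models a single-index GEP whose source element type is represented only by its size in bytes.

```cpp
#include <cstdint>

// Hypothetical model of a single-index GEP; not LLVM's GetElementPtrInst.
struct SimpleGEP {
  uint64_t sourceElemSizeBytes; // stands in for the source element type
  int64_t index;
};

// The address offset scales the index by the source element size.
inline int64_t byteOffset(const SimpleGEP &gep) {
  return gep.index * static_cast<int64_t>(gep.sourceElemSizeBytes);
}

// Comparing indices alone is not enough; the source element type must match.
inline bool sameAddressComputation(const SimpleGEP &a, const SimpleGEP &b) {
  return a.sourceElemSizeBytes == b.sourceElemSizeBytes && a.index == b.index;
}
```

With index 2, a 4-byte element gives an offset of 8 bytes while an 8-byte element gives 16, so treating the two as identical would merge different addresses.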

2 years ago[clang][dataflow] Include terminator statements in buildStmtToBasicBlockMap
Stanislav Gatev [Thu, 10 Feb 2022 16:34:07 +0000 (16:34 +0000)]
[clang][dataflow] Include terminator statements in buildStmtToBasicBlockMap

This will be necessary later when we add support for evaluating logic
expressions such as && and ||.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D119447

2 years ago[compiler-rt][xray] Disable fdr-reinit test on AArch64
David Spickett [Fri, 11 Feb 2022 11:09:28 +0000 (11:09 +0000)]
[compiler-rt][xray] Disable fdr-reinit test on AArch64

We run bots on a shared machine and under high load
this test sometimes segfaults.

https://lab.llvm.org/buildbot/#/builders/185/builds/1368

==1952234==XRay FDR init successful.
==1952234==XRay FDR: Not flushing to file, 'no_file_flush=true'.
<...>fdr-reinit.cpp.script: line 4: 1952234 Segmentation fault
XRAY_OPTIONS="verbosity=1" <...>/fdr-reinit.cpp.tmp

Looking at the printed output I think it's happening at:
// Finally, we should signal the sibling thread to stop.
keep_going.clear(std::memory_order_release);

Disabling the test while I try to reproduce.

2 years ago[AMDGPU][GlobalISel] Fix insert point in FoldableFneg combine
Mirko Brkusanin [Fri, 11 Feb 2022 10:44:50 +0000 (11:44 +0100)]
[AMDGPU][GlobalISel] Fix insert point in FoldableFneg combine

Newly created fneg was built after some of its uses in some cases.
Now it will be built immediately after the instruction whose dst it negates.

Differential Revision: https://reviews.llvm.org/D119459

2 years ago[clang-format] Assert default style instead of commenting. NFC.
Marek Kurdej [Fri, 11 Feb 2022 11:00:35 +0000 (12:00 +0100)]
[clang-format] Assert default style instead of commenting. NFC.

2 years ago[clang-format] Simplify conditions in spaceRequiredBetween. NFC.
Marek Kurdej [Fri, 11 Feb 2022 10:59:58 +0000 (11:59 +0100)]
[clang-format] Simplify conditions in spaceRequiredBetween. NFC.

2 years agoAdd cmake to source release tarballs
Konrad Kleine [Fri, 11 Feb 2022 10:50:33 +0000 (11:50 +0100)]
Add cmake to source release tarballs

I've split the git archive generation into three steps:

1. generate pure tarball
2. append top-level cmake directory to all tarballs
3. compress the archive

This was inspired by D118252 and can be considered an alternative
approach for all projects to have access to the shared cmake
directory when building in standalone mode.

When generating source tarballs on my local laptop it takes 9 minutes and 45 seconds WITH this patch applied. When this patch is not applied, it takes 9 minutes and 38 seconds. That means this patch introduces a slowdown of 7 seconds, which seems fair.

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D118481

2 years ago[clang] VisitCastExpr - use cast<> instead of dyn_cast<> to avoid dereference of...
Simon Pilgrim [Fri, 11 Feb 2022 10:51:34 +0000 (10:51 +0000)]
[clang] VisitCastExpr - use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is always dereferenced, so assert the cast is correct (which it should be as we just created that ScalableVectorType) instead of returning nullptr.
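
The difference between the two casts can be sketched with toy stand-ins. This is not LLVM's implementation: `toyCast` and `toyDynCast` are hypothetical names, they use `dynamic_cast` where LLVM uses its own RTTI scheme, and the small class hierarchy is invented for the example.

```cpp
#include <cassert>

// Invented hierarchy, for illustration only.
struct Type {
  virtual ~Type() = default;
};
struct ScalableVectorType : Type {
  unsigned minElements = 4;
};
struct IntegerType : Type {};

// Toy stand-in for dyn_cast<>: returns nullptr on a type mismatch, so the
// caller is expected to check the result before dereferencing it.
template <typename To, typename From> To *toyDynCast(From *value) {
  return dynamic_cast<To *>(value);
}

// Toy stand-in for cast<>: asserts that the type matches, so a mismatch is
// caught at the cast (in assert-enabled builds) instead of surfacing later
// as a null dereference.
template <typename To, typename From> To *toyCast(From *value) {
  To *result = dynamic_cast<To *>(value);
  assert(result && "toyCast<> argument of incompatible type!");
  return result;
}
```

When the result is dereferenced unconditionally, the asserting form states the invariant directly and avoids an unchecked null pointer.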

2 years agoLoopReroll::isLoopControlIV - use cast<> instead of dyn_cast<> to avoid dereference...
Simon Pilgrim [Fri, 11 Feb 2022 10:19:25 +0000 (10:19 +0000)]
LoopReroll::isLoopControlIV - use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is always dereferenced by isCompareUsedByBranch, so assert the cast is correct instead of returning nullptr.

2 years ago[M68k] Add missing include
Simon Pilgrim [Fri, 11 Feb 2022 10:17:11 +0000 (10:17 +0000)]
[M68k] Add missing include

Fixup for experimental m68k target after D119359

2 years ago[OpenCL] Add OpenCL 3.0 atomics to -fdeclare-opencl-builtins
Sven van Haastregt [Fri, 11 Feb 2022 10:14:14 +0000 (10:14 +0000)]
[OpenCL] Add OpenCL 3.0 atomics to -fdeclare-opencl-builtins

Add the atomic overloads for the `global` and `local` address spaces,
which are new in OpenCL 3.0.  Ensure the preexisting `generic`
overloads are guarded by the generic address space feature macro.

Ensure a subset of the atomic builtins are guarded by the
`__opencl_c_atomic_order_seq_cst` and `__opencl_c_atomic_scope_device`
feature macros, and enable those macros for SPIR/SPIR-V targets in
`opencl-c-base.h`.

Also guard the `cl_ext_float_atomics` builtins with the atomic order
and scope feature macros.

Differential Revision: https://reviews.llvm.org/D119420

2 years agoStackProtector: ignore debug insts when splitting blocks.
Tim Northover [Thu, 10 Feb 2022 13:28:50 +0000 (13:28 +0000)]
StackProtector: ignore debug insts when splitting blocks.

When deciding where to split a block to insert stack guard checks, we should
move past any debug instructions we see that might (e.g.) be separating a tail
call from its frame wrangling.

2 years ago[SCCP] Check that load/store and global type match
Nikita Popov [Fri, 11 Feb 2022 10:00:31 +0000 (11:00 +0100)]
[SCCP] Check that load/store and global type match

SCCP requires that the load/store type and global type are the
same (it does not support bitcasts of tracked globals). With
typed pointers this was implicitly enforced.

2 years ago[clang-format] Add tests for spacing between ref-qualifier and `noexcept`. NFC.
Marek Kurdej [Fri, 11 Feb 2022 09:49:53 +0000 (10:49 +0100)]
[clang-format] Add tests for spacing between ref-qualifier and `noexcept`. NFC.

Cf. https://github.com/llvm/llvm-project/issues/44542.
Cf. https://github.com/llvm/llvm-project/commit/ae1b7859cbd61d2284d9690bc53482d0b6a46f63.

2 years ago[analyzer] Restrict CallDescription fuzzy builtin matching
Balazs Benics [Fri, 11 Feb 2022 09:45:18 +0000 (10:45 +0100)]
[analyzer] Restrict CallDescription fuzzy builtin matching

`CallDescriptions` for builtin functions relax the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.

Consider the following:

`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.

This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphabetic (`[a-zA-Z]`) characters around the 'match'.

So, `CallDescription{ builtin, "memcpy"}` will not match on:

 - `__builtin_wmemcpy`: there is a `w` alphabetic character before the match.
 - `__builtin_memcpyFOoBar_inline`: there is an `F` character after the match.
 - `__builtin_memcpyX_inline`: there is an `X` character after the match.

But it will still match for:
 - `memcpy`: exact match
 - `__builtin_memcpy`: there is an _ before the match
 - `__builtin_memcpy_inline`: there is an _ after the match
 - `memcpy_inline_builtinFooBar`: there is an _ after the match
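
The rule above can be sketched as a standalone function. This is a minimal illustration, not the analyzer's code: `matchesBuiltinName` is a hypothetical name, it works on `std::string_view` rather than `StringRef`, and it only inspects the first occurrence of the name.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true when `name` occurs in `builtinName` and the
// characters directly before and after the occurrence are not letters.
bool matchesBuiltinName(std::string_view builtinName, std::string_view name) {
  const std::size_t start = builtinName.find(name);
  if (start == std::string_view::npos)
    return false;

  const auto isLetter = [](char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
  };

  // Lookbehind: either the match starts the string or a non-letter precedes it.
  const bool okBefore = start == 0 || !isLetter(builtinName[start - 1]);

  // Lookahead: either the match ends the string or a non-letter follows it.
  const std::size_t end = start + name.size();
  const bool okAfter =
      end >= builtinName.size() || !isLetter(builtinName[end]);

  return okBefore && okAfter;
}
```

With this sketch, `matchesBuiltinName("__builtin_memcpy_inline", "memcpy")` is accepted because `_` is not a letter, while `matchesBuiltinName("__builtin_wmemcpy", "memcpy")` is rejected because of the preceding `w`.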

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D118388

2 years agoCleanup MCParser headers
serge-sans-paille [Wed, 9 Feb 2022 19:00:42 +0000 (20:00 +0100)]
Cleanup MCParser headers

As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h

Preprocessed lines to build llvm on my setup:
after:  1068185081
before: 1068324320

So no compile time benefit to expect, but we still get the looser coupling
between files which is great.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359

2 years ago[mlir][LLVM] Add support for adding a garbage collector to a LLVM function
Markus Böck [Fri, 11 Feb 2022 09:23:35 +0000 (10:23 +0100)]
[mlir][LLVM] Add support for adding a garbage collector to a LLVM function

This patch simply adds an optional garbage collector attribute to LLVMFuncOp which maps 1:1 to the "gc" property of functions in LLVM.

Differential Revision: https://reviews.llvm.org/D119492

2 years ago[InstCombine] Check type compatibility in indexed load fold
Nikita Popov [Fri, 11 Feb 2022 09:15:17 +0000 (10:15 +0100)]
[InstCombine] Check type compatibility in indexed load fold

This fold could use a rewrite to an offset-based implementation,
but for now make sure it doesn't crash with opaque pointers.

2 years ago[LV] Move unrelated tests from first-order-recurrence-chains.ll
Florian Hahn [Fri, 11 Feb 2022 09:15:41 +0000 (09:15 +0000)]
[LV] Move unrelated tests from first-order-recurrence-chains.ll

2 years ago[AArch64] Emit TBAA metadata for SVE load/store intrinsics
Sander de Smalen [Fri, 11 Feb 2022 07:53:20 +0000 (07:53 +0000)]
[AArch64] Emit TBAA metadata for SVE load/store intrinsics

In Clang we can attach TBAA metadata to the load/store intrinsics
based on the operation's element type.

This also contains changes to InstCombine where the AArch64-specific
intrinsics are transformed into generic LLVM load/store operations,
to ensure that all metadata is transferred to the new instruction.

There will be some further work after this patch to also emit TBAA
metadata for SVE's gather/scatter- and struct load/store intrinsics.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D119319

2 years agoAdd a new interface method `getAsmBlockName()` on OpAsmOpInterface to control block...
Mehdi Amini [Mon, 7 Feb 2022 21:10:18 +0000 (21:10 +0000)]
Add a new interface method `getAsmBlockName()` on OpAsmOpInterface to control block names

This allows operations to control the block ids used by the printer in nested regions.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D115849

2 years ago[InstCombine] Require equal source element type in icmp of gep fold
Nikita Popov [Fri, 11 Feb 2022 08:38:28 +0000 (09:38 +0100)]
[InstCombine] Require equal source element type in icmp of gep fold

Without opaque pointers, this is implicitly enforced. This previously
resulted in a miscompile.

2 years ago[Bitcode] Add partial support for opaque pointer auto-upgrade
Nikita Popov [Thu, 27 Jan 2022 14:47:28 +0000 (15:47 +0100)]
[Bitcode] Add partial support for opaque pointer auto-upgrade

Auto-upgrades that rely on the pointer element type do not work in
opaque pointer mode. The idea behind this patch is that we can
instead work with type IDs, for which we can retain the pointer
element type. For typed pointer bitcode, we will have a distinct
type ID for pointers with distinct element type, even if there will
only be a single corresponding opaque pointer type.

The disclaimer here is that this is only the first step of the change,
and there are still more getPointerElementType() calls to remove.
I expect that two more patches will be needed:
1. Track all "contained" type IDs, which will allow us to handle
function params (which are contained in the function type) and GEPs
(which may use vectors of pointers)
2. Track type IDs for values, which is e.g. necessary to handle loads.

Differential Revision: https://reviews.llvm.org/D118694

2 years ago[ArgPromotion] Protect harder against recursive promotion (PR42028)
Nikita Popov [Thu, 10 Feb 2022 09:38:37 +0000 (10:38 +0100)]
[ArgPromotion] Protect harder against recursive promotion (PR42028)

In addition to the self-recursion check, also check whether there
is more than one node in the SCC, which implies that there is a
larger cycle. I believe checking SCC structure (rather than
something like norecurse) is the right thing to do here, because
this is specifically about preventing infinite loops over the SCC.
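
The combined check can be sketched on a toy call graph. This is an illustration under stated assumptions, not the pass's code: `CallGraph` and `sccIsRecursive` are hypothetical names, functions are identified by strings, and the SCC is assumed to have been computed already.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::set<std::string>>;

// Hypothetical helper: true when the SCC contains a cycle, either because it
// has more than one node or because its single node calls itself.
bool sccIsRecursive(const CallGraph &calls,
                    const std::vector<std::string> &scc) {
  if (scc.size() > 1)
    return true; // several nodes in one SCC imply a larger cycle
  if (scc.empty())
    return false;
  const auto it = calls.find(scc.front());
  // Self-recursion: the only function in the SCC calls itself.
  return it != calls.end() && it->second.count(scc.front()) > 0;
}
```

A single-node SCC without a self edge is the only case this sketch treats as non-recursive.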

Fixes https://github.com/llvm/llvm-project/issues/42028.

Differential Revision: https://reviews.llvm.org/D119418

2 years ago[mlir][OpDSL] Add support for basic rank polymorphism.
gysit [Fri, 11 Feb 2022 08:20:37 +0000 (08:20 +0000)]
[mlir][OpDSL] Add support for basic rank polymorphism.

Previously, OpDSL did not support rank polymorphism, which required a separate implementation of linalg.fill. This revision extends OpDSL to support rank polymorphism for a limited class of operations that access only scalars and tensors of rank zero. At operation instantiation time, it scales these scalar computations to multi-dimensional pointwise computations by replacing the empty indexing maps with identity index maps. The revision does not change the DSL itself; instead, it adapts the Python emitter and the YAML generator to generate different indexing maps and iterators depending on the rank of the first output.

Additionally, the revision introduces a `linalg.fill_tensor` operation that in a future revision shall replace the current handwritten `linalg.fill` operation. `linalg.fill_tensor` is thus only temporarily available and will be renamed to `linalg.fill`.

Reviewed By: nicolasvasilache, stellaraccident

Differential Revision: https://reviews.llvm.org/D119003