review.tizen.org Git - platform/upstream/llvm.git/log

Partially Revert "scan-view: Remove Reporter.py and associated AppleScript files"

This reverts some of commit dbb01536f6f49fa428f170e34466072ef439b3e9.

The Reporter module was still being used by the ScanView.py module and deleting
it caused scan-view to fail. This commit adds back Reporter.py but removes the
code the references the AppleScript files which were removed in
dbb01536f6f49fa428f170e34466072ef439b3e9.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D96367

[Polly] Hide IslScheduleOptimizer implementation from header. NFC.

These are implementation details of the IslScheduleOptimizer pass
implementation and not use anywhere else. Hence, we can move them to the
cpp file and into an anonymous namespace.

Only getPartialTilePrefixes is, aside from the pass itself, used
externally (by the ScheduleOptimizerTest) and moved into the polly
namespace.

[mlir] detect integer overflow in debug mode

Rationale:
This computation failed ASAN for the following input
(integer overflow during 4032000000000000000 * 100):

tensor<100x200x300x400x500x600x700x800xf32>

This change adds a simple overflow detection during
debug mode (which we run more regularly than ASAN).
Arguably this is an unrealistic tensor input, but
in the context of sparse tensors, we may start to
see cases like this.

Bug:
https://bugs.llvm.org/show_bug.cgi?id=49136

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96530

[knownbits] Preserve known bits for small shift recurrences

The motivation for this is that I'm looking at an example that uses shifts as induction variables. There's lots of other omissions, but one of the first I noticed is that we can't compute tight known bits. (This indirectly causes SCEV's range analysis to produce very poor results as well.)

Differential Revision: https://reviews.llvm.org/D96440

[lld][WebAssembly] Fix for weak undefined functions in -pie mode

This fixes two somewhat related issues.  Firstly we were never
generating imports for weak functions (even with the `import-functions`
policy for undefined symbols).  Adding a direct call to foo in the
`weak-undefined-pic.s` exposed a crash in the linker which this
change fixes.

Secondly we were failing to call `handleWeakUndefines` for the `-pie`
case which is PIC but doesn't set the undefined symbol policy to
`import-functions`.  With this change `-pie` binaries will by default
call `handleWeakUndefines` which generates the undefined stub handlers
for any weakly undefined symbols.

Fixes: https://github.com/emscripten-core/emscripten/issues/13337

Differential Revision: https://reviews.llvm.org/D95914

[tests] Autogen update test to remove whitespace diffs

[tests] precommit a tests for D96534 (and other range quality items)

[tests] Autogen a few tests for ease of update

[NFC,memprof] Update test after D96319

[Msan, NewPM] Reduce size of msan binaries

EarlyCSEPass called after msan redices code size by about 10%.
Similar optimization exists for legacy pass manager in
addGeneralOptsForMemorySanitizer.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D96406

[RISCV] Add a pattern for a scalable vector mask vnot.

We can use a vnand.mm with the same register for both inputs.
This avoids materializing an alls ones constant with vmset.mm.

[NFC] Extract function which registers sanitizer passes

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D96481

[Sanitizer] Fix sanitizer tests without reducing optimization levels

As discussed, these tests are compiled with optimization to mimic real
sanitizer usage [1].

Let's mark relevant functions with `noinline` so we can continue to
check against the stack traces in the report.

[1] https://reviews.llvm.org/D96198

This reverts commit 04af72c5423eb5ff7c0deba2d08cb46d583bb9d4.

Differential Revision: https://reviews.llvm.org/D96357

[flang][fir][NFC] Move BoxType to TableGen type definition

This patch is a follow up of D96422 and move BoxType to TableGen.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96476

ObjectFileELF: Test whether reloc_header is non-null instead of asserting.

It is possible for the GetSectionHeaderByIndex lookup to fail because
the previous FindSectionContainingFileAddress lookup found a segment
instead of a section. This is possible if the binary does not have
a PLT (which means that lld will in some circumstances set DT_JMPREL
to 0, which is typically an address that is part of the ELF headers
and not in a section) and may also be possible if the section headers
have been stripped. To handle this possibility, replace the assert
with an if.

Differential Revision: https://reviews.llvm.org/D93438

[lldb] Add step target to ThreadPlanStepInRange constructor

`QueueThreadPlanForStepInRange` accepts a `step_into_target`, but the constructor for
`ThreadPlanStepInRange` does not. Instead, a caller would optionally call
`SetStepInTarget()` in a separate statement.

This change adds `step_into_target` as a constructor argument. This simplifies
construction of `ThreadPlanSP`, by avoiding a subsequent downcast and conditional
assignment. This constructor is already used in downstream repos.

Differential Revision: https://reviews.llvm.org/D96539

Remove test code that cause MSAN failure.

Summary:
The negative test (with the feature being added disabled) caused MSAN failure and that's the added feature is supposed to fix. Therefore the negative test code is being removed.

[WebAssembly] Use the new crt1-command.o if present.

If crt1-command.o exists in the sysroot, the libc has new-style command
support, so use it.

Differential Revision: https://reviews.llvm.org/D89274

s[mlir] Tighten computation of inferred SubView result type.

The AffineMap in the MemRef inferred by SubViewOp may have uncompressed symbols which result in type mismatch on otherwise unused symbols. Make the computation of the AffineMap compress those unused symbols which results in better canonical types.
Additionally, improve the error message to report which inferred type was expected.

Differential Revision: https://reviews.llvm.org/D96551

[RISCV] Initial support for insert/extract subvector

This patch handles cast-like insert_subvector & extract_subvector
in which case:
1. index starts from 0.
2. inserting a fixed-width vector into a scalable vector,
or extracting a fixed-width vector from a scalable vector.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D96352

NFC: update clang tests to check ordering and alignment for atomicrmw/cmpxchg.

The ability to specify alignment was recently added, and it's an
important property which we should ensure is set as expected by
Clang. (Especially before making further changes to Clang's code in
this area.) But, because it's on the end of the lines, the existing
tests all ignore it.

Therefore, update all the tests to also verify the expected alignment
for atomicrmw and cmpxchg. While I was in there, I also updated uses
of 'load atomic' and 'store atomic', and added the memory ordering,
where that was missing.

[dfsan] Introduce memory mapping for origin tracking

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D96545

Replace deprecated %T in 2 tests.

In D91442, @MaskRay commented about a failure. This commit does the following to
address his comments:

1. Replace %T with %t as former is deprecated.
2. Add an explicit --sysroot argument in a test.

Some tests were failing when gcc-10-riscv64-linux-gnu is installed on test machine.
This was happening because the test was checking a case when --gcc-toolchain is not
provided. But if --sysroot was also not provided then code could pick a toolchain
installed in /usr. So to make the test more robust, I have provided an explicit --sysroot
argument. Its value has been chosen to match the existing patterns.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93023

[AArch64] Adding Neon Sm3 & Sm4 Intrinsics

This adds SM3 and SM4 Intrinsics support for AArch64, specifically:
        vsm3ss1q_u32
        vsm3tt1aq_u32
        vsm3tt1bq_u32
        vsm3tt2aq_u32
        vsm3tt2bq_u32
        vsm3partw1q_u32
        vsm3partw2q_u32
        vsm4eq_u32
        vsm4ekeyq_u32

Reviewed By: labrinea

Differential Revision: https://reviews.llvm.org/D95655

Fix errors in distributions

[OpenMP] libomp: minor changes to improve library performance

Three minor changes in this patch:
- added UNLIKELY hint to few rarely executed branches;
- replaced couple of run time checks with debug assertions;
- moved check of presence of ittnotify tool from inside the function call.

Differential Revision: https://reviews.llvm.org/D95816

Undo test changs introduced by D96193.

Summary:
The test doesn't work on Windows but there seems no good way to disable the test for Windows only so I'm undoing the test changes.

NFCI. With the move to the new pass manager by default, sanitize-coverage.c is now passing on ARM.

This change removes the XFAIL from the original test and duplicates the test into sanitize-coverage-old-pm.c
which uses the old pass manager and has the corresponding XFAIL.

This should fix the XPASS from this and similar runs:
http://lab.llvm.org:8011/#/builders/60/builds/1875

[lldb] Disable x86-multithread-write.test with reproducers

This test is failing on GreenDragon. Disabling it until I have bandwidth
to investigate why the register values are different during replay.

[OpenMP] Enable omp_get_num_devices() on Windows

This patch enables omp_get_num_devices() and omp_get_initial_device() on
Windows by providing an alternative to dlsym on Windows, and proposes to
add a new libomptarget entry, __tgt_get_num_devices().

Differential Revision: https://reviews.llvm.org/D96182

Fix incorrect indentation in LangRef.rst

[CSSPGO] Process functions in a top-down order on a dynamic call graph.

Functions are currently processed by the sample profiler loader in a top-down order defined by the static call graph. The order is being adjusted to be a top-down order based on the input context-sensitive profile. One benefit is that the processing order of caller and callee in one SCC would follow the context order in the profile to favor more inlining. Another benefit is that the processing order of caller and callee through an indirect call (which is not on the static call graph) can be honored which in turn allows for more inlining.

The profile top-down order for SCC is also extended to support non-CS profiles.

Two switches `-mllvm -use-profile-indirect-call-edges` and `-mllvm -use-profile-top-down-order` are being introduced.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95988

Fix incorrect indentation in LangRef.rst

Encode alignment attribute for `cmpxchg`

This is a follow up patch to D83136 adding the align attribute to `cmpxchg`.
See also D83465 for `atomicrmw`.

Differential Revision: https://reviews.llvm.org/D87443

Encode alignment attribute for `atomicrmw`

This is a follow up patch to D83136 adding the align attribute to `atomicwmw`.

Differential Revision: https://reviews.llvm.org/D83465

[AMDGPU] Fix promote alloca with double use in a same insn

If we have an instruction where more than one pointer operands
are derived from the same promoted alloca, we are fixing it for
one argument and do not fix a second use considering this user
done.

Fix this by deferring processing of memory intrinsics until all
potential operands are replaced.

Fixes: SWDEV-271358

Differential Revision: https://reviews.llvm.org/D96386

Move implementation of isAssumeLikeIntrinsic into IntrinsicInst

This is remove dependency on ValueTracking in the future patch.

Differential Revision: https://reviews.llvm.org/D96079

[flang][fir][NFC] Rename WhereOp to IfOp.

[clangd] Retire the cross-file-rename command-line flag.

This patch only focuses on the flag. Removing actual single-file mode
(and the flag in RenameOption) will come in a follow-up.

Differential Revision: https://reviews.llvm.org/D96495

Revert "[lldb/test] Automatically find debug servers to test"

The commit 7df4eaaa937332c0617aa665080533966e2c98a0 appears to
break the windows bot. Revert while I investigate.

[flang] Don't perform macro replacement unless *.F, *.F90, &c.

Avoid spurious and confusing macro replacements from things like
-DPIC on Fortran source files whose suffixes indicate that preprocessing
is not expected.

Add gfortran-like "-cpp" and "-nocpp" flags to f18 to force predefinition
of macros independent of the source file suffix.

Differential Revision: https://reviews.llvm.org/D96464

[CodeGen] Split out cold exception handling pads.

Support for splitting exception handling pads was added in D73739. This
change updates the code to split out exception handling pads if profile
information indicates that they are cold. For a given function with
multiple landind pads, if one of them is hot they are all retained as
part of the hot code section.

Differential Revision: https://reviews.llvm.org/D96372

llvm-dwarfdump: fix the counting when printing DW_OP_entry_value

The block size is in bytes, and not number of operands.

Differential Revision: https://reviews.llvm.org/D96472

[CodeGen] Basic block sections should take precendence over splitting.

The use of basic block sections should take precedence over the machine
function splitting pass. Since they use the same underlying mechanism
they are kept exclusive. Updated the tests to check that split machine
functions is overridden by all flavours of basic block sections.

Differential Revision: https://reviews.llvm.org/D96392

[flang][fir] Update the kind mapping class.

The kind mapper provides a portable mechanism to map Fortran type KIND values
independent of the front-end to their corresponding MLIR and LLVM types.

Differential Revision: https://reviews.llvm.org/D96362

[dfsan] Add origin chain utils

This is a part of https://reviews.llvm.org/D95835.

The design is based on MSan origin chains.

An 4-byte origin is a hash of an origin chain. An origin chain is a
pair of a stack hash id and a hash to its previous origin chain. 0 means
no previous origin chains exist. We limit the length of a chain to be
16. With origin_history_size = 0, the limit is removed.

The change does not have any test cases yet. The following change
will be adding test cases when the APIs are used.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D96160

AMDGPU: Restrict soft clause bundling at half of the available regs

Fixes a testcase that was overcommitting large register tuples to a
bundle, which the register allocator could not possibly satisfy. This
was producing a bundle which used nearly all of the available SGPRs
with a series of 16-dword loads (not all of which are freely available
to use).

This is a quick hack for some deeper issues with how the clause
bundler tracks register pressure.

Overall the pressure tracking used here doesn't make sense and is too
imprecise for what it needs to avoid the allocator failing. The
pressure estimate does not account for the alignment requirements of
large SGPR tuples, so this was really underestimating the pressure
impact. This also ignores the impact of the extended live range of the
use registers after the bundle is introduced. Additionally, it didn't
account for some wide tuples not being available due to reserved
registers.

This regresses a few cases. These end up introducing more
spilling. This is also a function of the global pressure being used in
the decision to bundle, not the local pressure impact of the bundle
itself.

[flang] Improve "Error reading module file" error message

Instead of using a message attachment with further details,
emit the details as part of a single message.

Differential Revision: https://reviews.llvm.org/D96465

[lld][WebAssembly] Delay the merging of data section when dynamic linking

With dynamic linking we have the current limitation that there can be
only a single active data segment (since we use __memory_base as the
load address and we can't do arithmetic in constant expresions).

This change delays the merging of active segments until a little later
in the linking process which means that the grouping of data by section,
and the magic __start/__end symbols work as expected under dynamic
linking.

Differential Revision: https://reviews.llvm.org/D96453

[clang][Arm] Fix handling of -Wa,-implicit-it=

Similiar to D95872, this flag can be set for the assembler directly.
Move validation code into a reusable helper function.

Link: https://bugs.llvm.org/show_bug.cgi?id=49023
Link: https://github.com/ClangBuiltLinux/linux/issues/1270
Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D96285

[tests] Precommit tests for D96440

[InstCombine] add tests for disguised mul ops; NFC

BPF: Add LLVMAnalysis in CMakefile LINK_COMPONENTS

buildbot reported a build error like below:
  BPFTargetMachine.cpp:(.text._ZN4llvm19TargetTransformInfo5ModelINS_10BPFTTIImplEED2Ev
    [_ZN4llvm19TargetTransformInfo5ModelINS_10BPFTTIImplEED2Ev]+0x14):
    undefined reference to `llvm::TargetTransformInfo::Concept::~Concept()'
  lib/Target/BPF/CMakeFiles/LLVMBPFCodeGen.dir/BPFTargetMachine.cpp.o:
    In function `llvm::TargetTransformInfo::Model<llvm::BPFTTIImpl>::~Model()':

Commit a260ae716030 ("BPF: Implement TTI.IntImmCost() properly")
added TargetTransformInfo to BPF, which requires LLVMAnalysis
dependence. In certain cmake configurations, lacking explicit
LLVMAnalysis dependency may cause compilation error.
Similar to other targets, this patch added LLVMAnalysis
in CMakefile LINK_COMPONENTS explicitly.

Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache"

This reverts commit b7d870eae7fdadcf10d0f177faa7409c2e37d776 and the
subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)"
(commit e6810cab09fcbc87b6e5e4d226de0810e2f2ea38).

It caused indeterminism in the output, such that e.g. the
polly-x86_64-linux buildbot failed accasionally.

[libc++][format] Enable format_error on older compilers.

It seems like modifying the header doesn't cause libc++ to be rebuild.
So the breakage of the previous commit didn't happen on my system.

This should fix the build of https://buildkite.com/mlir/mlir-core

[flang] Fix typo in FlangConfig.cmake.in.

`find_package(Flang)` does not work as there is a missing `@` in the
FlangConfig.cmake.in file. This patch fixes the issue.

Reviewed By: thopre

Differential Revision: https://reviews.llvm.org/D96484

[libc++][format] Improve Add basic_format_parse_context.

Add an additional guard to prevent building on older clang versions.

This should fix the build of https://buildkite.com/mlir/mlir-core

[asan][test] Fix Linux/odr-violation.cpp on gcc

[ARM] Single source vmovnt tests. NFC

[AMDGPU] Better selection of base offset when merging DS reads/writes

When merging a pair of DS reads or writes needs to materialize the base
offset in a vgpr, choose a value that is aligned to as high a power of
two as possible. This maximises the chance that different pairs can use
the same base offset, in which case the base offset registers can be
commoned up by MachineCSE.

Differential Revision: https://reviews.llvm.org/D96421

[TargetLowering][RISCV][AArch64][PowerPC] Enable BuildUDIV/BuildSDIV on illegal types before type legalization if we can find a larger legal type that supports MUL.

If we wait until the type is legalized, we'll lose information
about the orginal type and need to use larger magic constants.
This gets especially bad on RISCV64 where i64 is the only legal
type.

I've limited this to simple scalar types so it only works for
i8/i16/i32 which are most likely to occur. For more odd types
we might want to do a small promotion to a type where MULH is legal
instead.

Unfortunately, this does prevent some urem/srem+seteq matching since
that still require legal types.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D96210

[ELF] Resolve defined symbols before undefined symbols

When parsing an object file, LLD interleaves undefined symbol resolution (which
may recursively fetch other lazy objects) with defined symbol resolution.

This may lead to surprising results, e.g. if an object file defines currently
undefined symbols and references another lazy symbol, we may interleave defined
symbols with the lazy fetch, potentially leading to the defined symbols
resolving to different files.

As an example, if both `a.a(a.o)` and `a.a(b.o)` define `foo` (not in COMDAT
group, or in different COMDAT groups) and `__profd_foo` (in COMDAT group
`__profd_foo`).  LLD may resolve `foo` to `a.a(a.o)` and `__profd_foo` to
`b.a(b.o)`, i.e. different files.

```
parse ArchiveFile a.a
  entry fetches a.a(a.o)
  parse ObjectFile a.o
    define entry
    define foo
    reference b
    b fetches a.a(b.o)
    parse ObjectFile b.o
      define prevailing __profd_foo
    define (ignored) non-prevailing __profd_foo
```

Assuming a set of interconnected symbols are defined all or none in several lazy
objects. Arguably making them resolve to the same file is preferable than making
them resolve to different files (some are lazy objects).

The main argument favoring the new behavior is the stability. The relative order
between a defined symbol and an undefined symbol does not change the symbol
resolution behavior.  Only the relative order between two undefined symbols can
affect fetching behaviors.

---

The real world case is reduced from a Fuchsia PGO usage: `a.a(a.o)` has a
constructor within COMDAT group C5 while `a.a(b.o)` has a constructor within
COMDAT group C2. Because they use different group signatures, they are not
de-duplicated. It is not entirely whether Clang behavior is entirely conforming.

LLD selects the PGO counter section (`__profd_*`) from `a.a(b.o)` and the
constructor section from `a.a(a.o)`. The `__profd_*` is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
`sh_link points to discarded section`. This patch fixes the error.

Differential Revision: https://reviews.llvm.org/D95985

[gn build] port ed98676fa483

Support multi-configuration generators correctly in several config files

Multi-configuration generators (such as Visual Studio and Xcode) allow the specification of a build flavor at build time instead of config time, so the lit configuration files need to support that - and they do for the most part. There are several places that had one of two issues (or both!):

1) Paths had %(build_mode)s set up, but then not configured, resulting in values that would not work correctly e.g. D:/llvm-build/%(build_mode)s/bin/dsymutil.exe
2) Paths did not have %(build_mode)s set up, but instead contained $(Configuration) (which is the value for Visual Studio at configuration time, for Xcode they would have had the equivalent) e.g. "D:/llvm-build/$(Configuration)/lib".

This seems to indicate that we still have a lot of fragility in the configurations, but also that a number of these paths are never used (at least on Windows) since the errors appear to have been there a while.

This patch fixes the configurations and it has been tested with Ninja and Visual Studio to generate the correct paths. We should consider removing some of these settings altogether.

Reviewed By: JDevlieghere, mehdi_amini

Differential Revision: https://reviews.llvm.org/D96427

[sanitizer] Fix suffix-log-path_test.c on arm-linux-gnu

The recent suffix-log-path_test.c checks for a full stacktrace and
since on some arm-linux-gnu configuration the slow unwinder is used
on default (when the compiler emits thumb code as default), it
requires -funwind-tables on tests.

It also seems to fix the issues disable by d025df3c1de.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D96337

[LV] Add tests showing suboptimal vectorization for narrow types.

This patch adds additional test cases showing missing/sub-optimal
vectorization for loops which contain small and wider memory ops on
AArch64.

[flang] Remove `LINK_WITH_FIR` cmake switch

Most components required for this are already there.

Build and Testing clean.
ninja check-flang

Reviewed By: clementval, tskeith

Differential Revision: https://reviews.llvm.org/D96411

[RISCV] Add support loads, stores, and splats of vXi1 fixed vectors.

This refines how we determine which masks types are legal and adds
support for loads, stores, and all ones/zeros splats.

I left a fixme in store handling where I think we need to zero
extra bits if the type isn't a multiple of a byte. If I remember
right from X86 there was some case we could have a store of a
1, 2, or 4 bit mask and have a scalar zextload that then expected the
bits to be 0. Its tricky to zero the bits with RVV. We need to do
something like round VL up, zero a register, lower the VL back down,
then do a tail undisturbed move into the zero register. Another
option might be to generate a mask of 1/2/4 bits set with a VL of 8
and use that to mask off the bits.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96468

[DAG] foldLogicOfSetCCs - Generalize and/or (setcc X, CMax, ne), (setcc X, CMin, ne/eq) fold. NFCI.

Prep work to add support for non-uniform vectors - replace APInt values with using the SDValue ops directly.

[libc++][format] Add basic_format_parse_context.

Implements parts of:
- P0645 Text Formatting

Depends on D92214

Reland with changes:
The format header will only be compiled if the compiler used has support
for concepts. This should fix the issues with the initial version.

Differential Revision: https://reviews.llvm.org/D93166

[gn build] Port 7e3b9aba609f

Revert "[flang][fir][NFC] Move BoxType to TableGen type definition"

This reverts commit d96bb48f7874d636ffd0ee233e2d66fff0614b8f.

BPF: Implement TTI.IntImmCost() properly

This patch implemented TTI.IntImmCost() properly.
Each BPF insn has 32bit immediate space, so for any immediate
which can be represented as 32bit signed int, the cost
is technically free. If an int cannot be presented as
a 32bit signed int, a ld_imm64 instruction is needed
and a TCC_Basic is returned.

This change is motivated when we observed that
several bpf selftests failed with latest llvm trunk, e.g.,
  #10/16 strobemeta.o:FAIL
  #10/17 strobemeta_nounroll1.o:FAIL
  #10/18 strobemeta_nounroll2.o:FAIL
  #10/19 strobemeta_subprogs.o:FAIL
  #96 snprintf_btf:FAIL

The reason of the failure is due to that
SpeculateAroundPHIsPass did aggressive transformation
which alters control flow for which currently verifer
cannot handle well. In llvm12, SpeculateAroundPHIsPass
is not called.

SpeculateAroundPHIsPass relied on TTI.getIntImmCost()
and TTI.getIntImmCostInst() for profitability
analysis. This patch implemented TTI.getIntImmCost()
properly for BPF backend which also prevented
transformation which caused the above test failures.

Differential Revision: https://reviews.llvm.org/D96448

[Timer] On macOS count number of executed instructions

In addition to wall time etc. this should allow us to get less noisy
values for time measurements.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D96049

[flang][fir][NFC] Move BoxType to TableGen type definition

This patch is a follow up of D96422 and move BoxType to TableGen.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96476

[lldb] Fix that running a top level expression without a process fails with a cryptic error

Right now when running `expr --top-level -- void foo() {}`, LLDB just prints a cryptic
`error: Couldn't find $__lldb_expr() in the module` error. The reason for that is
that if we don't have a running process, we try to set our execution policy to always use the
IR interpreter (ExecutionPolicyNever) which works even without a process. However
that code didn't consider the special ExecutionPolicyTopLevel which we use for
top-level expressions. By changing the execution policy to ExecutionPolicyNever,
LLDB thinks we're actually trying to interpret a normal expression inside our
`$__lldb_expr` function and then fails when looking for it.

This just adds an exception for top-level expressions to that code and a bunch of tests.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D91723

[mlir][LLVM] NFC - Refactor a lookupOrCreateFn to reuse common function creation.

Differential revision: https://reviews.llvm.org/D96488

[lldb] Don't emit a warning when using Objective-C getters in expressions

Clang emits a warning when accessing an Objective-C getter but not using the result.
This gets triggered when just trying to print a getter value in the expression parser (where
Clang just sees a normal expression like `obj.getter` while parsing).

This patch just disables the warning in the expression parser (similar to what we do with
the C++ equivalent of just accessing a member variable but not doing anything with it).

Reviewed By: kastiglione

Differential Revision: https://reviews.llvm.org/D94307

[ARM] Add CostKind to getMVEVectorCostFactor.

This adds the CostKind to getMVEVectorCostFactor, so that it can
automatically account for CodeSize costs, where it returns a cost of 1
not the MVEFactor used for Throughput/Latency. This helps simplify the
caller code and allows us to get the codesize cost more correct in more
cases.

Store the calculated constant expression value into the ConstantExpr object

With https://reviews.llvm.org/D63376, we began storing the APValue
directly into the ConstantExpr object so that we could reuse the
calculated value later. However, it missed a case when not in C++11
mode but the expression is known to be constant.

[lldb] Log the actual expression result in UserExpression::Evaluate

This used to be a LLDB_LOGF call that used the printf %s syntax.
0ab109d43d9d8389fe686ee8df4a82634f246b55 changed it to LLDB_LOG but didn't
update this format string to use formatv's syntax so this just printed '%s'.

Improve STRICT_FSETCC codegen in absence of no NaN

As for SETCC, use a less expensive condition code when generating
STRICT_FSETCC if the node is known not to have Nan.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D91972

[lld][WebAssembly] Common superclass for input globals/events/tables

This commit regroups commonalities among InputGlobal, InputEvent, and
InputTable into the new InputElement. The subclasses are defined
inline in the new InputElement.h. NFC.

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D94677

[DebugInfo] Only perform TypeSize -> unsigned cast when necessary

This commit moves a line in SelectionDAGBuilder::handleDebugValue to
avoid implicitly casting a TypeSize object to an unsigned earlier than
necessary. It was possible that we bail out of the loop before the value
is ever used, which means we could create a superfluous TypeSize
warning.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D96423

[mlir] make ModuleTranslation mapping fields private

ModuleTranslation contains multiple fields that keep track of the mappings
between various MLIR and LLVM IR components. The original ModuleTranslation
extension model was based on inheritance, with these fields being protected and
thus accessible in the ModuleTranslation and derived classes. The
inheritance-based model doesn't scale to translation of more than one derived
dialect and will be progressively replaced with a more flexible one based on
dialect interfaces and a translation state that is separate from
ModuleTranslation. This change prepares the replacement by making the mappings
private and providing public methods to access them.

Depends On D96436

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96437

[mlir] Make JitRunnerMain main take a DialectRegistry

Historically, JitRunner has been registering all available dialects with the
context and depending on them without the real need. Make it take a registry
that contains only the dialects that are expected in the input and stop linking
in all dialects.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96436

[Attr] Apply GNU-style attributes to expression statements

Before this commit, expression statements could not be annotated
with statement attributes. Whenever parser found attribute, it
unconditionally assumed that it was followed by a declaration.
This not only doesn't allow expression attributes to have attributes,
but also produces spurious error diagnostics.

In order to maintain all previously compiled code, we still assume
that GNU attributes are followed by declarations unless ALL of those
are statement attributes. And even in this case we are not forcing
the parser to think that it should parse a statement, but rather
let it proceed as if no attributes were found.

Differential Revision: https://reviews.llvm.org/D93630

[lldb/test] Automatically find debug servers to test

Our test configuration logic assumes that the tests can be run either
with debugserver or with lldb-server. This is not entirely correct,
since lldb server has two "personalities" (platform server and debug
server) and debugserver is only a replacement for the latter.

A consequence of this is that it's not possible to test the platform
behavior of lldb-server on macos, as it is not possible to get a hold of
the lldb-server binary.

One solution to that would be to duplicate the server configuration
logic to be able to specify both executables. However, that seems
excessively redundant.

A well-behaved lldb should be able to find the debug server on its own,
and testing lldb with a different (lldb-|debug)server does not seem very
useful (even in the out-of-tree debugserver setup, we copy the server
into the build tree to make it appear "real").

Therefore, this patch deletes the configuration altogether and changes
the low-level server retrieval functions to be able to both lldb-server
and debugserver paths. They do this by consulting the "support
executable" directory of the lldb under test.

Differential Revision: https://reviews.llvm.org/D96202

[ARM] Copy-paste error in ARMv87a architecture definition.

In the tablegen architecture definition, the Name field for the
ARMv87a record read "ARMv86a". All the other records contain their own
names.

Corrected it to "ARMv87a", and added the necessary value in
ARMArchEnum for that to refer to.

Reviewed By: pratlucas

Differential Revision: https://reviews.llvm.org/D96493

[OpenCL] Fix missing const attributes for get_image_ builtins

Various get_image builtin function declarations did not have the const
attribute. Bring the const attributes of `-fdeclare-opencl-builtins`
more in sync with `opencl-c.h`.

Return "[Codegenprepare][X86] Use usub with overflow opt for IV increment"

The patch did not account for one corner case where cmp does not dominate
the loop latch. This patch adds this check, hopefully it's cheap because
the CFG does not change during the transform, so DT queries should be
executed quickly.

If you see compile time slowness from this, please revert.

Differential Revision: https://reviews.llvm.org/D96119

[gn build] Port b4993cf54d7f

[Test] Add test that exposed failure on reverted patch in codegen

Correct swift_bridge duplicate attribute warning logic

The swift_bridge attribute warns when the attribute is applied multiple
times to the same declaration. However, it warns about the arguments
being different to the attribute without ever checking if the arguments
actually are different. If the arguments are different, diagnose,
otherwise silently accept the code. Either way, drop the duplicated
attribute.

[ARM] Change getScalarizationOverhead overload used in gather costs. NFC

This changes which of the getScalarizationOverhead overloads is used in
the gather/scatter cost to use the base variant directly, not relying on
the version using heuristics on the number of args with no args
provided. It should still produce the same costs for scalarized
gathers/scatters.

[test][Dexter] Fix test failure if space in python path

The '%dexter_regression_test' substitution was missing quotes around the
python executable, unlike other substitutions of a similar nature in the
file. This changes fixes the issue.

Differential Revision: https://reviews.llvm.org/D96420

Reviewed by: jmorse, aganea

[flang][driver] Move standard macro predefs to a dedicated method (nfc)

This patch just addresses one of the outstanding TODOs. More
specifically, it moves all the outstanding standard macro predefinitions
from `SetDefaultFortranOpts` to `setDefaultPredefinitions`. This
dedicated method for standard macro predefs was introduced in:
* https://reviews.llvm.org/D96032

[AMDGPU] Move kill lowering to WQM pass and add live mask tracking

Move implementation of kill intrinsics to WQM pass. Add live lane
tracking by updating a stored exec mask when lanes are killed.
Use live lane tracking to enable early termination of shader
at any point in control flow.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D94746

NFC: Migrate CodeMetrics to work on InstructionCost

This patch migrates cost values and arithmetic to work on InstructionCost.
When the interfaces to TargetTransformInfo are changed, any InstructionCost
state will propagate naturally.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D96030

Revert "[Codegenprepare][X86] Use usub with overflow opt for IV increment"

This reverts commit 3d15b7e7dfc3e2cefc47791d1e8d95909e937842.

We've found an internal failure, need to analyze.