review.tizen.org Git - platform/upstream/llvm.git/log

[llvm-cov gcov] Change case to match the prevailing style && replace getString with readString

[Flang][test] Fix Windows buildbot.

Add

REQUIRES: shell

to tests that execute a UNIX shell script to not run on Windows.

[test] Fix nocompress.test

[profile] Fix variable name

This fixes a bug introduced in d85c258fd1e7459cc8085b5f364e356f50b490a4.

[llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style

The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style.

And add `error:` or `warning:` to CHECK lines wherever appropriate.

[profile] Don't publish VMO if there are no counters

If there are no counters, there's no need to publish the VMO.

Differential Revision: https://reviews.llvm.org/D102786

[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC.

This is run every time around in the main linker loop. Once a match
has been found, stop trying to rematch such a symbol.

Not sure if this has any actual measurable performance impact though
(SymbolTable::findMangle() iterates over the whole symbol table for
each call and does fuzzy matching on top of that) but this makes the
code more reassuring to read at least. (This is in practice run for def
files listing undecorated stdcall functions to be exported.)

Differential Revision: https://reviews.llvm.org/D104529

[LLD] [MinGW] Print errors/warnings in lld-link with a "ld.lld" prefix

Pass the original argv[0] to the coff linker, as the coff linker uses
the basename of argv[0] as the log prefix.

This makes error messages to be printed with a "ld.lld:" prefix
instead of "lld-link:". The current "lld-link:" prefix can be confusing
to users, as they're invoking the MinGW linker (and might not even have
a lld-link executable).

Keep the first argument as lld-link when printing the command line, to
make it an actually reproducible standalone command.

Differential Revision: https://reviews.llvm.org/D104526

[llvm-profdata] Delete unneeded empty output filename check

[RISCV] Prevent formation of shXadd(.uw) and add.uw if it prevents the use of addi.

If the outer add has an simm12 immediate operand we should prefer
it instead of materializing it in a register. This would guarantee
and extra instruction and temporary register. Since we don't check
one use on the shl or zext we might generate more instructions if
there is an additional user.

[NFC] AMD Zen 3: fix typo in a comment

Simplify some typedef struct

[gn build] (manually) port b9c05aff205b (MIRTests)

[lld/mac] Make sure __thread_ptrs is in front of __thread_bss

The exact location doesn't matter, but it should be in front
of __thread_bss. We put it right in front of __thread_data
which is where ld64 seems to put it as well.

Fixes PR50769.

(As mentioned on the bug, there is probably a more structural
fix too, see comment 5. If we don't address this, it's likely
we'll run into this again with other synthetic sections. But
for now, let's fix the immediate breakage.)

Differential Revision: https://reviews.llvm.org/D104596

[lld/mac] Give __DATA,__thread_ptrs type S_THREAD_LOCAL_VARIABLE_POINTERS

...instead of S_NON_LAZY_SYMBOL_POINTERS. This matches ld64.

Part of PR50769.

While here, also remove an old TODO that was done in D87178.

Differential Revision: https://reviews.llvm.org/D104594

[MIRPrinter] Add machine metadata support.

- Distinct metadata needs generating in the codegen to attach correct
  AAInfo on the loads/stores after lowering, merging, and other relevant
  transformations.
- This patch adds 'MachhineModuleSlotTracker' to help assign slot
  numbers to these newly generated unnamed metadata nodes.
- To help 'MachhineModuleSlotTracker' track machine metadata, the
  original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage',
  which provides basic interfaces to create/retrive metadata slots. In
  addition, once LLVM IR is processsed, additional hooks are also
  introduced to help collect machine metadata and assign them slot
  numbers.
- Finally, if there is any such machine metadata, 'MIRPrinter' outputs
  an additional 'machineMetadataNodes' field containing all the
  definition of those nodes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D103205

[amdgpu] Improve the from f32 to i64.

- Take the same principle as the conversion from f64 to i64 with extra
necessary pre- and post-processing. It helps to reduce that conversion
sequence by half compared to legacy one.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D104427

[InstCombine][test] add tests for select-of-bit-manip; NFC

Revert "Re-Revert "DirectoryWatcher: add an implementation for Windows""

This reverts commit fb32de9e97af0921242a021e30020ffacf7aa6e2.

Remove the secondary synchronization point as noted by Adrian. This is
technically only to make the builders happier about tests and should not
be needed. This also pushes the condition variable setting to after the
watch is actually established (which was the source of the original race
condition, but would normally succeed as the thread shouldn't get put to
sleep immediately on the trigger of the condition variable).

This also was pretested on the chromium builders:
https://ci.chromium.org/ui/p/chromium/builders/try/win_upload_clang/1612/overview.

Allow building for release with EXPENSIVE_CHECKS

D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS,
but verity() is defined for debug builds only so this had the unintended
effect of breaking release builds with EXPENSIVE_CHECKS.

Fix by enabling verify() for both debug and EXPENSIVE_CHECKS.

Differential Revision: https://reviews.llvm.org/D104514

[ARM][NFC] Tidy up subtarget frame pointer routines

getFramePointerReg only depends on information in ARMSubtarget,
so move it in there so it can be accessed from more places.

Make use of ARMSubtarget::getFramePointerReg to remove duplicated code.

The main use of useR7AsFramePointer is getFramePointerReg, so inline it.

Differential Revision: https://reviews.llvm.org/D104476

Revert "[clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp-contract=on"

This reverts commit a1449a10dbcfcf353f4f761281c4e572b4ce9308.
Seems like my changes to LNT had no effect -- puzzled.
The 21 tests pass on my sandbox with the clang patch but are
failing in exec time in the bot

[gn build] Port 134723edd5bf

[libcxx] Move all algorithms into their own headers

This is a fairly mechanical change, it just moves each algorithm into
its own header. This is intended to be a NFC.

This commit re-applies 7ed7d4ccb899, which was reverted in 692d7166f771
because the Modules build got broken. The modules build has now been
fixed, so we're re-committing this.

Differential Revision: https://reviews.llvm.org/D103583

Attribution note
----------------
I'm only committing this. This commit is a mix of D103583, D103330 and
D104171 authored by:

Co-authored-by: Christopher Di Bella <cjdb@google.com>
Co-authored-by: zoecarver <z.zoelec2@gmail.com>

[clang-cl] Don't expand /permissive- to /ZC:strictStrings yet

Follow up on rGc70b0e808da8

/Zc:strictStrings is an alias to an option part of the -W group. When the driver tries to render the option back to a string for the cc1 invocation, it sadly gets rendered with the original spelling instead of the alias, causing issues reported here: https://reviews.llvm.org/D103773#inline-989447

I am thinking it's the best to revert this part of the patch until I figured out how to correctly add the arg and until /Zc:strictStrings- exists/is needed.

[clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp-contract=on

This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst

Differential Revision: https://reviews.llvm.org/D74436

[mlir] Add EmitC dialect

This upstreams the EmitC dialect and the corresponding Cpp target, both
initially presented with [1], from [2] to MLIR core. For the related
discussion, see [3].

[1] https://reviews.llvm.org/D76571
[2] https://github.com/iml130/mlir-emitc
[3] https://llvm.discourse.group/t/emitc-generating-c-c-from-mlir/3388

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Co-authored-by: Oliver Scherf <oliver.scherf@iml.fraunhofer.de>
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D103969

[LoopUnroll] Push runtime unrolling decision up into tryToUnrollLoop()

Currently, UnrollLoop() is passed an AllowRuntime flag and decides
itself whether runtime unrolling should be used or not. This patch
pushes the decision into the caller and allows us to eliminate the
ULO.TripCount and ULO.TripMultiple parameters.

Differential Revision: https://reviews.llvm.org/D104487

[RISCV] Optimize add-mul in the zba extension with SH*ADD

This patch does the following optimization.

Rx + Ry * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry)
Rx + Ry * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry)
Rx + Ry * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry)
Rx + Ry * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry)
Rx + Ry * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry)
Rx + Ry * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D104588

[RISCV][test] Add new tests for add-mul optimization in the zba extension with SH*ADD

These tests will show the following optimization by future patches.

Rx + Ry * 18  => (SH1ADD (SH3ADD Rx, Rx), Ry)
Rx + Ry * 20  => (SH2ADD (SH2ADD Rx, Rx), Ry)
Rx + Ry * 24  => (SH3ADD (SH1ADD Rx, Rx), Ry)
Rx + Ry * 36  => (SH2ADD (SH3ADD Rx, Rx), Ry)
Rx + Ry * 40  => (SH3ADD (SH2ADD Rx, Rx), Ry)
Rx + Ry * 72  => (SH3ADD (SH3ADD Rx, Rx), Ry)
Rx * (3 << C) => (SLLI (SH1ADD Rx, Rx), C)
Rx * (5 << C) => (SLLI (SH2ADD Rx, Rx), C)
Rx * (9 << C) => (SLLI (SH3ADD Rx, Rx), C)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D104507

[ORC][examples] Add missing library dependence

[mlir][linalg] Lower subtensor(pad_tensor) to pad_tensor(subtensor)

Only high padding is supported at the moment. Low padding will be added in a separate commit.

Differential Revision: https://reviews.llvm.org/D104357

[re-land][lld-macho] Avoid force-loading the same archive twice

This reverts commit c9b241efd68c5a0f1f67e9250960ade454f3bc11, which was
a backout diff to fix the buildbots.

The real culprit of the crash is
https://github.com/llvm/llvm-project/commit/1d31fb8d122b1117cf20a9edc09812db8472e930,
which is being reverted.

Differential Revision: https://reviews.llvm.org/D104353

Revert "[lld-macho] Have path-related functions return std::string, not StringRef"

This reverts commit 1d31fb8d122b1117cf20a9edc09812db8472e930.

Making `rerootPath` return a temporary std::string caused a
use-after-free:

https://ci.chromium.org/ui/p/chromium/builders/try/win_upload_clang/1608/overview

[llvm][Inliner] Add an optional PriorityInlineOrder

This patch adds an optional PriorityInlineOrder, which uses the heap to order inlining.
The callsite which size is smaller would have a higher priority.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D104028

[ORC][C-bindings] Add access to LLJIT IRTransformLayer, ThreadSafeModule utils.

This patch was derived from Valentin Churavy's work in
https://reviews.llvm.org/D104480. It adds support for setting the transform on
an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also
adds access to the ThreadSafeModule::withModuleDo method for thread-safe
access to modules.

A new example has been added to show how to use these APIs to optimize a module
during materialization.

Thanks Valentin!

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D103855

[ORC][examples] Fix file name in comment.

[libfuzzer] Disable failing DFSan-related tests

These have been broken by https://reviews.llvm.org/D104494.
However, `lib/fuzzer/dataflow/` is unused (?) so addressing this is not a priority.

Added TODOs to re-enable them in the future.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104568

[InstCombine] Don't transform code if DoTransform is false

In patch https://reviews.llvm.org/D72396, it doesn't check DoTransform before transforming the code, and generates wrong result for the attached test case.

Differential Revision: https://reviews.llvm.org/D104567

Revert "[lld-macho] Avoid force-loading the same archive twice"

This reverts commit 24706cd73cd150543753a2e169c68a2c68da46a1.
Test seems to fail flakily. See comments on https://reviews.llvm.org/D104353
for a hypothesis for why.

[InstrProfiling][ELF] Make __profd_ private if the function does not use value profiling

On ELF, the D1003372 optimization can apply to more cases. There are two
prerequisites for making `__profd_` private:

* `__profc_` keeps `__profd_` live under compiler/linker GC
* `__profd_` is not referenced by code

The first is satisfied because all counters/data are in a section group (either
`comdat any` or `comdat noduplicates`). The second requires that the function
does not use value profiling.

Regarding the second point: `__profd_` may be referenced by other text sections
due to inlining. There will be a linker error if a prevailing text section
references the non-prevailing local symbol.

With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
clang is 4.2% smaller (1-169620032/177066968).
`stat -c %s **/*.o | awk '{s+=$1}END{print s}' is 2.5% smaller.

Reviewed By: davidxl, rnk

Differential Revision: https://reviews.llvm.org/D103717

[flang] Recode a switch() to dodge a sketchy warning

One of the buildbots uses a compiler (can't tell which) that
doesn't approve of a "default:" in a switch statement whose
cases appear to completely cover all possible values of an
enum class. But this switch is in raw data dumping code that
needs to allow for incorrect values in memory. So rewrite it
as a cascade of if statements; performance doesn't matter here.

[profile][test] Delete profraw directory so that tests are immune to format version upgrade

AMDGPU: Fix infinite loop in DAG combine with fneg + fma

We were not reporting isFNegFree for v2f32, although it is effectively
free after legalization. The generic combine was pulling fneg out of
the fma source operands, and the AMDGPU combine was doing the
opposite.

Re-Revert "DirectoryWatcher: add an implementation for Windows"

This reverts commit 76f1baa7875acd88bdd4b431eed6e2d2decfc0fe.

Also reverts 2 follow-ups:

1. Revert "DirectoryWatcher: also wait for the notifier thread"
This reverts commit 527a1821e6f8e115db3335a3341c7ac491725a0d.

2. Revert "DirectoryWatcher: close a possible window of race on Windows"
This reverts commit a6948da86ad7e78d66b26263c2681ef6385cc234.

Makes tests hang, see comments on https://reviews.llvm.org/D88666

AMDGPU: Fix assert on m0_lo16/m0_hi16

These get added (redundantly) to the bundle expanded for indirect
register accesses. We hit this path only when there is a call in the
function.

[OpenMP] Make bug49334.cpp more reproducible

`bug49334.cpp` cannot detect data race in `libomptarget` efficiently. It
is reported that with `N = 256` and `BS = 16`, the data race can be reproduced
more steadily. The next coming pathces will fix it so this patch is expected to
fail now.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104552

[CSSPGO] Undoing the concept of dangling pseudo probe

As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.

I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104477

[flang] Fix clang build (struct/class mismatch warning)

A recent patch changed a struct into a class, but missed a
forward definition. GCC didn't warn, but clang does. Fix.

Whitespace fixes for
193e41c987127aad86d0380df83e67a85266f1f1
which reportedly fails on the mac builds.

Partial rollback: Disable MLIR verifier parallelism.

Deadlocks have been found in several downstream projects as noted on the original patch: https://reviews.llvm.org/D104207

Disabling pending full root cause analysis.

Differential Revision: https://reviews.llvm.org/D104570

[LoopUnroll] Simplify optimization remarks

Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and
debug code. For debug code, print information about all exits.
For optimization remarks, only include the unroll count and the
type of unroll (complete, partial or runtime), but omit detailed
information about exit folding, now that more than one exit may
be folded.

Differential Revision: https://reviews.llvm.org/D104482

[CSSPGO][llvm-profgen] Fix an issue in findDisjointRanges

We were using 0 as an indicator of invalid offset when computing disjoint ranges. In reality, 0 can be an valid code offset which stands for the first function in .text section. I'm using UINT64_MAX as an invalid code offset instead.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104497

[mlir] Add support to SourceMgrDiagnosticHandler for filtering FileLineColLocs

This revision adds support for passing a functor to SourceMgrDiagnosticHandler for filtering out FileLineColLocs when emitting a diagnostic. More specifically, this can be useful in situations where there may be large CallSiteLocs with locations that aren't necessarily important/useful for users.

For now the filtering support is limited to FileLineColLocs, but conceptually we could allow filtering for all locations types if a need arises in the future.

Differential Revision: https://reviews.llvm.org/D103649

[GCOVProfiling] don't profile Fn's w/ noprofile attribute

Similar to D104475, the Linux kernel would like to avoid compiler
generated code in certain functions. The no_profile function
attribute can be used in C to generate the the noprofile fn attr in IR.
Respect that from GCOVProfiling.

Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D104257

[Clang][Codegen] Add GNU function attribute 'no_profile' and lower it to noprofile

noprofile IR attribute already exists to prevent profiling with PGO;
emit that when a function uses the newly added no_profile function
attribute.

The Linux kernel would like to avoid compiler generated code in
functions annotated with such attribute. We already respect this for
libcalls to fentry() and mcount().

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223
Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/
Reviewed By: MaskRay, void, phosek, aaron.ballman

Differential Revision: https://reviews.llvm.org/D104475

[NFC][compiler-rt][hwasan] Move hwasanThreadList().CreateCurrentThread() into InitThreads

Once D104553 lands, CreateCurrentThread will be able to accept optional
parameters for initializing the hwasan thread object. On fuchsia, we can get
stack info in the platform-specific InitThreads and pass it through
CreateCurrentThread. On linux, this is a no-op.

Differential Revision: https://reviews.llvm.org/D104561

[lld-macho] Have path-related functions return std::string, not StringRef

findLibrary() returned a StringRef while findFramework & other helper
functions returned std::strings. Standardize on std::string.

(I initially tried making the helper functions all return StringRefs,
but I realized we shouldn't return input StringRefs since their
lifetimes would not be obvious from the calling code.)

[lld-macho] Handle non-extern symbols marked as private extern

Previously, we asserted that such a case was invalid, but in fact
`ld -r` can emit such symbols if the input contained a (true) private
extern, or if it contained a symbol started with "L".

Non-extern symbols marked as private extern are essentially equivalent
to regular TU-scoped symbols, so no new functionality is needed.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104502

[flang][OpenMP] Add semantic checks for occurrence of nested Barrier regions

This patch adds the following nesting check for `barrier` constructs:

```
A barrier region may not be closely nested inside a worksharing, loop, task, taskloop, critical, ordered, atomic, or master region.
```

Also adds a test case for the check,

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D99888

[libc++] [P1518R2] Better CTAD behavior for containers with allocators.

P1518 does the following in C++23 but we'll just do it in C++17 as well:
- Stop requiring `Alloc` to be an allocator on some container-adaptor deduction guides
- Stop deducing from `Allocator` on some sequence container constructors
- Stop deducing from `Allocator` on some other container constructors (libc++ already did this)

The affected constructors are the "allocator-extended" versions of
constructors where the non-allocator arguments are already sufficient
to deduce the allocator type. For example,

    std::pmr::vector<int> v1;
    std::vector v2(v1, std::pmr::new_delete_resource());
    std::stack s2(v1, std::pmr::new_delete_resource());

Differential Revision: https://reviews.llvm.org/D97742

[clang-tidy] performance-unnecessary-copy-initialization: Directly examine the initializing var's initializer.

This fixes false positive cases where a reference is initialized outside of a
block statement and then its initializing variable is modified. Another case is
when the looped over container is modified.

Differential Revision: https://reviews.llvm.org/D103021

Reviewed-by: ymandel

[RISCV] Teach vsetvli insertion to remember when predecessors have same AVL and SEW/LMUL ratio if their VTYPEs otherwise mismatch.

Previously we went directly to unknown state on VTYPE mismatch.
If we instead remember the partial match, we can use this to
still use X0, X0 vsetvli in successors if AVL and needed SEW/LMUL
ratio match.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D104069

[CSSPGO][llvm-profgen] Ignore LBR records after interrupt transition

If we have seen an inwards transition from external code to internal code, but not a following outwards transition, the inwards transition is likely due to interrupt which is usually unpaired. Ignore current and subsequent entries since they are likely from an unrelated pre-interrupt context.

LBR records from different interrupt context are unrelated and they should not be mixed together. Currenlty the OS does this for task-scheduling interrupt but not for all interrupts.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D104276

[AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask

Implemented the transformation of xor (llvm.amdgcn.class x, mask), -1 into
llvm.amdgcn.class(x, ~mask). Added LIT tests as well.

Differential Revision: https://reviews.llvm.org/D104049

[CSSPGO] Fix an invalid hash table reference issue in the CS preinliner.

We were using a `StringMap` object to store all profiles to be emitted. The object is basically an unordered hash table, therefore updating it in the process of trasvering it may cause issue since the underlying bucket array could change.

I'm also moving the `csspgo-preinliner` switch around so that no context tri will be constructed (by the constructor of `CSPreInliner`) when the switch is off.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104267

[DFSan] Cleanup code for platforms other than Linux x86_64.

These other platforms are unsupported and untested.
They could be re-added later based on MSan code.

Reviewed By: gbalats, stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104481

Revert "Delay initialization of OptBisect"

This reverts commit ec91df8d8195b8b759a89734dba227da1eaa729f.

It was committed by accident.

XFAIL a testcase on Hexagon (missing-abstract-variable.ll)

This seems to be a common problem among several architectures.

Delay initialization of OptBisect

When LLVM is used in other projects, it may happen that global cons-
tructors will execute before the call to ParseCommandLineOptions.
Since OptBisect is initialized via a constructor, and has no ability
to be updated at a later time, passing "-opt-bisect-limit" to the
parse function may have no effect.

To avoid this problem use a cl::cb (callback) to set the bisection
limit when the option is actually processed.

Differential Revision: https://reviews.llvm.org/D104551

[OpenMP] Update FAQ for enabling cuda offloading

Add an FAQ entry and add a few lines to an existing one. Document
the use of `GCC_INSTALL_PREFIX` for pointing clang to correct
GCC installation for two-stage build.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D104474

[hwasan] Clarify report for allocation-tail-overwritten.

Explain what the given stack trace means before showing it, rather than
only in the paragraph at the end.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D104523

[compiler-rt][hwasan] Move Thread::Init into hwasan_linux.cpp

This allows for other implementations to define their own version of `Thread::Init`.
This will be the case for Fuchsia where much of the thread initialization can be
broken up between different thread hooks (`__sanitizer_before_thread_create_hook`,
`__sanitizer_thread_create_hook`, `__sanitizer_thread_start_hook`). Namely, setting
up the heap ring buffer and stack info and can be setup before thread creation.
The stack ring buffer can also be setup before thread creation, but storing it into
`__hwasan_tls` can only be done on the thread start hook since it's only then we
can access `__hwasan_tls` for that thread correctly.

Differential Revision: https://reviews.llvm.org/D104248

[flang] Runtime implementation for default derived type formatted I/O

This is *not* user-defined derived type I/O, but rather Fortran's
built-in capabilities for using derived type data in I/O lists
and NAMELIST groups.

This feature depends on having the derived type description tables
that are created by Semantics available, passed through compilation
as initialized static objects to which pointers can be targeted
in the descriptors of I/O list items and NAMELIST groups.

NAMELIST processing now handles component references on input
(e.g., "&GROUP x%component = 123 /").

The C++ perspectives of the derived type information records
were transformed into proper classes when it was necessary to add
member functions to them.

The code in Semantics that generates derived type information
was changed to emit derived type components in component order,
not alphabetic order.

Differential Revision: https://reviews.llvm.org/D104485

[lldb-vscode] attempt to fix flakiness

There are many tests failing intermittently for lldb-vscode after
https://reviews.llvm.org/rGaa4685c0fb3aab5acb90be5fd3eb5ba8bf1e3211. I'm
unsure if this actually the culprit, so I'm softly removing that feature
to see if that fixes the issue.

[lld/mac] Support -data_in_code_info, -function_starts flags

These are on by default, but there's also an explicit flag for them.

Differential Revision: https://reviews.llvm.org/D104543

Rename option -icf MODE to --icf=MODE

The `icf` command-line option is not present in ld64, so it should use the LLD option syntax, which begins with double dashes and separates primary option from any suboption with the equal sign.

Differential Revision: https://reviews.llvm.org/D104548

[AArch64] Add TableGen patterns to generate uaddlv

uaddv(uaddlp(x)) ==> uaddlv(x)
addp(uaddlp(x)) ==> uaddlv(x)

Differential Revision: https://reviews.llvm.org/D104236

[NFC][libomptarget] Build elf_common with PIC.

Differential Revision: https://reviews.llvm.org/D104545

[NFC][libomptarget] Fixed -DLLVM_ENABLE_RUNTIMES="openmp" build.

Differential Revision: https://reviews.llvm.org/D104535

Fix build failure on 32 bit Arm

This patch fixes build failure caused by commit
f27e4548fc42876f66dac260ca3b6df0d5fd5fd6 on 32 bit arm.

Differential Revision: https://reviews.llvm.org/D103292

RISCV: simplify a test case for RISCV (NFCI)

The output of the object file is unimportant and entirely discarded.
Simply redirect the output to `/dev/null` or `NUL` as the case may be.
Additionally, the space between the labels is unimportant. There is no
need to add space between the labels. Two labels at the same address
are sufficient to generate the difference expression and should still
test the same behaviour.

[HWASan] Run LAM tests with -hwasan-generate-tags-with-calls.

The default callback instrumentation in x86 LAM mode uses ASLR bits
to randomly choose a tag, and thus has a 1/64 chance of choosing a
stack tag of 0, causing stack tests to fail intermittently. By using
__hwasan_generate_tag to pick tags, we guarantee non-zero tags and
eliminate the test flakiness.

aarch64 doesn't seem to have this problem using thread-local addresses
to pick tags, so perhaps we can remove this workaround once we implement
a similar mechanism for LAM.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D104470

[clang] Implement P2266 Simpler implicit move

This Implements [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html|P2266 Simpler implicit move]].

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone

Differential Revision: https://reviews.llvm.org/D99005

[mlir] Add notes about using external interface application.

Differential Revision: https://reviews.llvm.org/D104489

[DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector knownbits. NFCI.

As noticed on D104472 we can use APInt::insertBits which will avoid a lot of temporary APInt creations

[analyzer] Handle NTTP invocation in CallContext.getCalleeDecl()

This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().

Reviewed By: vsavchenko

Differential Revision: https://reviews.llvm.org/D103025

[libclang] Fix error handler in translateSourceLocation.

Given an invalid SourceLocation, translateSourceLocation will call
clang_getNullLocation, and then do nothing with the result. But
clang_getNullLocation has no side effects: it just constructs and
returns a null CXSourceLocation value.

Surely the intention was to //return// that null CXSourceLocation to
the caller, instead of throwing it away and pressing on anyway.

Reviewed By: miyuki

Differential Revision: https://reviews.llvm.org/D104442

[ORC][C-bindings] Re-order object transform function arguments.

ObjInOut is an in-out parameter not a return value argument, so by convention
it should come after the context value (Ctx).

[ORC] Use uint8_t rather than char for RPC wrapper-function calls.

This partially reverts 838490de7ed, which broke some Solaris bots. Apparently
Solaris defines int8_t as char rather than signed char, which made the
SerializationTypeName<char> specialization a redefinition.

This partial revert isolates use of uint8_t buffers to ORC-RPC handling of
wrapper functions only. The TargetProcessControl::runWrapper method will
continue to use char buffers.

[clang] Exclude function pointers on DefaultedComparisonAnalyzer

This implements a more comprehensive fix than was done at D95409.
Instead of excluding just function pointer subobjects, we also
exclude any user-defined function pointer conversion operators.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D103855

[ORC] Add support for dumping objects to the C API.

Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.

An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.

Revert D104028 "[llvm][Inliner] Add an optional PriorityInlineOrder"

[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration (try 3)

This patch handles one particular case of one-iteration loops for which SCEV
cannot straightforwardly prove BECount = 1. The idea of the optimization is to
symbolically execute conditional branches on the 1st iteration, moving in topoligical
order, and only visiting blocks that may be reached on the first iteration. If we find out
that we never reach header via the latch, then the backedge can be broken.

This implementation uses InstSimplify. SCEV version was rejected due to high
compile time impact.

Differential Revision: https://reviews.llvm.org/D102615
Reviewed By: nikic

[MLIR] Introduce scf.execute_region op

Introduce the execute_region op that is able to hold a region which it
executes exactly once. The op encapsulates a CFG within itself while
isolating it from the surrounding control flow. Proposal discussed here:
https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282

execute_region enables one to inline a function without lowering out all
other higher level control flow constructs (affine.for/if, scf.for/if)
to the flat list of blocks / CFG form. It thus allows the benefit of
transforms on higher level control flow ops available in the presence of
the inlined calls. The inlined calls continue to benefit from
propagation of SSA values across their top boundary. Functions won’t
have to remain outlined until later than desired. Abstractions like
affine execute_regions, lambdas with implicit captures could be lowered
to this without first lowering out structured loops/ifs or outlining.
But two potential early use cases are of: (1) an early inliner (which
can inline functions by introducing execute_region ops), (2) lowering of
an affine.execute_region, which cleanly maps to an scf.execute_region
when going from the affine dialect to the scf dialect.

Differential Revision: https://reviews.llvm.org/D75837

[Attributor] Fix UB behavior on uninitalized bool variables.

Found by ASAN.

[AMDGPU] Update generated checks. NFC.

[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)

InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.

Differential Revision: https://reviews.llvm.org/D104193

[Test] Add XFAIL unit test for PR50765

[flang] Rewrite test for CPU_TIME

Don't rely on volatile writes to keep the CPU busy - it seems MSVC
optimizes them out, so we don't get different values for 'start' and
'end' on Windows. Rewrite the test to loop until we get a different
value for 'end'.

Fix suggested by Michael Kruse in
https://reviews.llvm.org/rG57e85622bbdb2eb18cc03df2ea457019c58f6912#inline-6002

Committing to fix the Windows buildbot, post-commit comments welcome!