review.tizen.org Git - platform/upstream/llvm.git/log

[SystemZ] Bugfix for the 'N' code for inline asm operand.

Don't use a local MachineOperand copy in SystemZAsmPrinter::PrintAsmOperand()
and change the register as it may break the MRI tracking of register
uses. Use an MCOperand instead.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D105757

[InstCombine] reduce signbit test of logic ops to cmp with zero

This is the pattern from the description of:
https://llvm.org/PR50816

There might be a way to generalize this to a smaller or more
generic pattern, but I have not found it yet.

https://alive2.llvm.org/ce/z/ShzJoF

define i1 @src(i8 %x) {
  %add = add i8 %x, -1
  %xor = xor i8 %x, -1
  %and = and i8 %add, %xor
  %r = icmp slt i8 %and, 0
  ret i1 %r
}

define i1 @tgt(i8 %x) {
  %r = icmp eq i8 %x, 0
  ret i1 %r
}

[InstCombine][tests] add tests for signbit + logic; NFC

PR50816

[mlir][linalg][python] Add auto-generated file warning (NFC).

Annotate LinalgNamedStructuredOps.yaml with a comment stating the file is auto-generated and should not be edited manually.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105809

[CostModel][X86] Adjust truncate SSE/AVX legalized costs based on llvm-mca reports.

Update truncation costs based on the worst case costs from the script in D103695.

Move to using legalized types wherever possible, which allows us to prune the cost tables.

[gn build] port 0da172b1766e more

[AArch64] Set the latency of Cortex-A55 stores to 1

This sets the latency of stores to 1 in the Cortex-A55 scheduling model,
to better match the values given in the software optimization guide.

The latency of a store in normal llvm scheduling does not appear to have
a lot of uses. If the store has no outputs then the latency is somewhat
meaningless (and pre/post increment update operands use the WriteAdr
write for those operands instead). The one place it does alter things is
the latency between a store and the end of the scheduling region, which
can in turn have an effect on the critical path length. As a result a
latency of 1 is more correct and offers ever-so-slightly better
scheduling of instructions near the end of the block.

They are marked as RetireOOO to keep the llvm-mca from introducing
stalls where non would exist.

Differential Revision: https://reviews.llvm.org/D105541

[gn build] (semi-manually) port 0da172b1766e

Partially implement P1401R5 (Narrowing contextual conversions to bool)

Support Narrowing conversions to bool in if constexpr condition
under C++23 language mode.

Only if constexpr is implemented as the behavior of static_assert
is already conforming. Still need to work on explicit(bool) to
complete support.

sanitizer_common: fix 32-bit build

https://reviews.llvm.org/D105716 enabled thread safety annotations,
and that broke 32-bit build:
https://green.lab.llvm.org/green/job/lldb-cmake/33604/consoleFull#-77815080549ba4694-19c4-4d7e-bec5-911270d8a58c

1. Enable thread-safety analysis in unit tests
(this catches the breakage even in 64-bit mode).
2. Add NO_THREAD_SAFETY_ANALYSIS to sanitizer_allocator_primary32.h
to unbreak the build.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105808

Fix the Clang documentation builder; NFC.

It was broken three days ago by the changes in D95561.

[lldb/Target] Fix event handling during process launch

This patch fixes process event handling when the events are broadcasted
at launch. To do so, the patch introduces a new listener to fetch events
by hand off the event queue and then resending them ensure the event ordering.

Differental Revision: https://reviews.llvm.org/D105698

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[RS4GC] Use one DVCache for both inlineGetBaseAndOffset() and insertParsePoints()

This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103240

[RS4GC] Add a test to demonstrate duplication of base generation. NFC

This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103238

[PowerPC] Fix rounding mode for vec_round in altivec.h

The function is supposed to be the equivalent of rint() (as in
round to nearest, ties to even) rather than round() (round to
nearest, ties away from zero). In fact, the instruction we emit
without VSX is vrfin which is correct. However, with VSX we emit
xvrspi which is the equivalent of round() and therefore incorrect.
Since there is no equivalent VSX instruction, simply use vrfin
regardless of availability of VSX.

sanitizer_common: make sem_trywait as non-blocking

sem_trywait never blocks.
Use REAL instead of COMMON_INTERCEPTOR_BLOCK_REAL.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105775

sanitizer_common: remove debugging logic from the internal allocator

The internal allocator adds 8-byte header for debugging purposes.
The problem with it is that it's not possible to allocate nicely-sized
objects without a significant overhead. For example, if we allocate
512-byte objects, that will be rounded up to 768 or something.
This logic migrated from tsan where it was added during initial development,
I don't remember that it ever caught anything (we don't do bugs!).
Remove it so that it's possible to allocate nicely-sized objects
without overheads.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105777

[OpenMP] Support OpenMP 5.1 attributes

OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.

In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matter (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).

The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.

[mlir][Linalg] Improve comprehensive bufferization for scf.yield.

Previously, comprehensive bufferization of scf.yield did not have enough information
to detect whether an enclosing scf::for bbargs would bufferize to a buffer equivalent
to that of the matching scf::yield operand.
As a consequence a separate sanity check step would be required to determine whether
bufferization occured properly.
This late check would miss the case of calling a function in an loop.

Instead, we now pass and update aliasInfo during bufferization and it is possible to
imrpove bufferization of scf::yield and drop that post-pass check.

Add an example use case that was failing previously.
This slightly modifies the error conditions, which are also updated as part of this
revision.

Differential Revision: https://reviews.llvm.org/D105803

sanitizer_common: use 0 for empty stack id

We use 0 for empty stack id from stack depot.
Deadlock detector 1 is the only place that uses -1
as a special case. Use 0 because there is a number
of checks of the form "if (stack id) ...".

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105776

[llvm][sve] Lowering for VLS truncating stores

This adds custom lowering for truncating stores when operating on
fixed length vectors in SVE. It also includes a DAG combine to
fold extends followed by truncating stores into non-truncating
stores in order to prevent this pattern appearing once truncating
stores are supported.

Currently truncating stores are not used in certain cases where
the size of the vector is larger than the target vector width.

Differential Revision: https://reviews.llvm.org/D104471

[OpenMP][OMPT] Fix compile-time assertion in ompt-multiplex.h

The compile-time assertion is supposed to prevent double-free caused by
unexpected combination of preprocessor defines passed by an OMPT tool.
The current defines are not used, so this patch replaces the check with
macros actually used in ompt-multiplex.h

Reported by: Semih Burak

Differential Revision: https://reviews.llvm.org/D104633

[PowerPC] Remove unnecessary 64-bit guards from altivec.h

A number of functions in the header have guards for 64-bit only
that were presumably added as some of the functions in the blocks
use vector __int128 which is only available in 64-bit mode.
A more appropriate guard (__SIZEOF_INT128__) has been added for
those functions since, making the 64-bit guards redundant.
This patch removes those guards as they inadvertently guard code
that uses vector long long which does not actually require 64-bit
mode.

sanitizer_common: add thread safety annotations

Enable clang Thread Safety Analysis for sanitizers:
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html

Thread Safety Analysis can detect inconsistent locking,
deadlocks and data races. Without GUARDED_BY annotations
it has limited value. But this does all the heavy lifting
to enable analysis and allows to add GUARDED_BY incrementally.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105716

sanitizer_common: rename Mutex to MutexState

We have 3 different mutexes (RWMutex, BlockingMutex __tsan::Mutex),
each with own set of downsides. I want to unify them under a name Mutex.
But it will conflict with Mutex in the deadlock detector,
which is a way too generic name. Rename it to MutexState.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105773

[LLDB] Testsuite: Add helper to check for AArch64 target

This patch adds a helper function to test target architecture is
AArch64 or not. This also tightens isAArch64* helpers by adding an
extra architecture check.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D105483

[mlir] factor math-to-llvm out of standard-to-llvm

After the Math has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the Math
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D105702

[LV] Ignore candidate VFs with invalid costs.

This follows on from discussion on the mailing-list:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151047.html

to interpret an Invalid cost as 'infinitely expensive', as this
simplifies some of the legalization issues with scalable vectors.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D105473

[X86][SSE] X86ISD::FSETCC nodes (cmpss/cmpsd) return a 0/-1 allbits signbits result

Annoyingly, i686 cmpsd handling still fails to remove the unnecessary neg(and(x,1))

[X86][SSE] Add signbit tests to show cmpss/cmpsd ops not recognised as 'allbits' results.

[analyzer][NFC] Display the correct function name even in crash dumps

The `-analyzer-display-progress` displayed the function name of the
currently analyzed function. It differs in C and C++. In C++, it
prints the argument types as well in a comma-separated list.
While in C, only the function name is displayed, without the brackets.
E.g.:

C++: foo(), foo(int, float)
C: foo

In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D105708

[llvm-exegesis] Fix compilation with old libpfm versions.

Do not try include `perfmon/perf_event.h` when we are not sure that it
exists.

Fixes PR51017.

Differential Revision: https://reviews.llvm.org/D105615

[AArch64] De-capitalize some Emit* functions

AsmParser/AsmPrinter/Streamer are mostly consistent on emit* functions now.

[LLDB] Only build TestWatchTaggedAddress.py on aarch64 PAC targets

This patch fixes buildbot failures caused by TestWatchTaggedAddress.py

Differential Revision: https://reviews.llvm.org/D101361

[mlir] Fix broadcasting check with 1 values

The trait was inconsistent with the other broadcasting logic here. And
also fix printing here to use ? rather than -1 in the error.

Differential Revision: https://reviews.llvm.org/D105748

Support AArch64/Linux watchpoint on tagged addresses

AArch64 architecture support virtual addresses with some of the top bits ignored.
These ignored bits can host memory tags or bit masks that can serve to check for
authentication of address integrity. We need to clear away the top ignored bits
from watchpoint address to reliably hit and set watchpoints on addresses
containing tags or masks in their top bits.

This patch adds support to watch tagged addresses on AArch64/Linux.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D101361

[llvm][Inliner] Templatize PriorityInlineOrder

The patch templatize PriorityInlinerOrder so that it can accept any type priority metric.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D104972

[lld-macho][nfc] Fix YAML input in compact-unwind-sym-relocs.s

* Adjust strsize so llvm-objdump doesn't complain about it extending
past the end of file
* Remove symbol that was referencing a deleted section
* Adjust n_sect of the remaining `_main` symbol to point at the right
section

[OpenMP] Create and use `__kmpc_is_generic_main_thread`

In order to fold calls based on high-level knowledge and control flow
tracking it helps to expose the information as a runtime call. The
logic: `!SPMD && getTID() == getMasterTID()` was used in various places
and is now encapsulated in `__kmpc_is_generic_main_thread`. As part of
this rewrite we replaced eager computation of arguments with on-demand
computation, especially helpful if the calls can be folded and arguments
don't need to be computed consequently.

Differential Revision: https://reviews.llvm.org/D105768

[OpenMP] Simplify variable sharing and increase shared memory size

In order to avoid malloc/free, up to NUM_SHARED_VARIABLES_IN_SHARED_MEM
(=64) variables are communicated in dedicated shared memory instead. The
simplification does avoid the need for an "init" and requires "deinit"
only if we ever communicate more than NUM_SHARED_VARIABLES_IN_SHARED_MEM
variables.

Differential Revision: https://reviews.llvm.org/D105767

[Attributor][NFCI] Add UsedAssumedInformation to more interfaces

As with other Attributor interfaces we often want to know if assumed
information was used to answer a query. This is important if only
known information is allowed or if known information can lead to an
early fixpoint. The users have been adjusted but none of them utilizes
the new information yet.

[IndVars] Don't widen pointers in WidenIV::getWideRecurrence

It's not a reasonable transform, and calling getSignExtendExpr() on a
pointer hits an assertion.

[lld-macho][nfc] clang-format

[lld-macho][nfc] Remove unnecessary llvm:: namespace prefixes

[lld-macho][nfc] Avoid using std::map for PlatformKinds

The mappings we were using had a small number of keys, so a vector is
probably better. This allows us to remove the last usage of std::map in
our codebase.

I also used `removeSimulator` to simplify the code a bit further.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105786

[VPlan] Remove default arg from getVPValue (NFC).

The const version of VPValue::getVPValue still had a default value for
the value index. Remove the default value and use getVPSingleValue
instead, which is the proper function.

[RISCV] Add tests for suboptimal handling of negative constants for i32 uaddo/usubo on RV64. NFC

We end up zero extending constants when we promote to i64. We
should sign extend instead to allow use of addiw or improve
constant materialization.

[RISCV] Add tests for suboptimal handling of negative constants on the LHS of i32 shifts/rotates/subtracts on RV64. NFC

The constants end up getting zero extended to i64, but sign extend
would be better for constant materialization. We're using W
instructions so either behavior is correct since the upper bits
aren't read.

[lld/mac] Unbreak objc.s after 6e05c1cd5f98

[lld/mac] Always reference dyld_stub_binder when linked with libSystem

lld currently only references dyld_stub_binder when it's needed.
ld64 always references it when libSystem is linked.
Match ld64.

The (somewhat lame) motivation is that `nm` on a binary without any
export writes a "no symbols" warning to stderr, and this change makes
it so that every binary in practice has at least a reference to
dyld_stub_binder, which suppresses that.

Every "real" output file will reference dyld_stub_binder, so most
of the time this shouldn't make much of a difference. And if you
really don't want to have this reference for whatever reason, you
can stop passing -lSystem, like you have to for ld64 anyways.

(After linking any dylib, we dump the exported list of symbols to
a txt file with `nm` and only relink downstream deps if that txt
file changes. A nicer fix is to make lld optionally write .tbd files
with the public interface of a linked dylib and use that instead,
but for now the txt files are what we do.)

Differential Revision: https://reviews.llvm.org/D105782

[RISCV] Remove stale FIXME from a test. NFC

sext has been used for sltu/sltiu since e0e62e97.

[DivRemPairs] Add an initial case for hoisting to a common predecessor.

This patch adds support for hoisting the division and maybe the
remainder for control flow graphs like this.

```
PredBB
  |  \
  |  Rem
  |  /
Div
```

If we have DivRem we'll hoist both to PredBB. If not we'll just
hoist Div and expand Rem using the Div.

This improves our codegen for something like this

```
__uint128_t udivmodti4(__uint128_t dividend, __uint128_t divisor, __uint128_t *remainder) {
    if (remainder != 0)
        *remainder = dividend % divisor;
    return dividend / divisor;
}
```

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D87555

[lld/mac] Use normal Undefined machinery for dyld_stub_binder lookup

This is for aesthetic reasons, I'm not aware of anything that needs
this in practice. It does have a few effects:

- `-undefined dynamic_lookup` now has an effect for dyld_stub_binder.
  This matches ld64.

- `-U dyld_stub_binder` now works like you'd expect (it doesn't work in ld64).

- The error message for a missing dyld_stub_binder symbol now looks like
  other undefined reference symbols, it changes from

      symbol dyld_stub_binder not found (normally in libSystem.dylib). Needed to perform lazy binding.

  to

      error: undefined symbol: dyld_stub_binder
      >>> referenced by lazy binding (normally in libSystem.dylib)

Also add test coverage for that error message.

But in practice, this should have no interesting effects since everything links
in dyld_stub_binder via libSystem anyways.

Differential Revision: https://reviews.llvm.org/D105781

[ARM] Add lowering of uadd_sat to uq{add|sub}8 and uq{add|sub}16

This follow the lead of https://reviews.llvm.org/D68974 to add lowering
of unsigned saturated addition/subtraction.

Differential Revision: https://reviews.llvm.org/D105413

Revert "[clang-repl] Implement partial translation units and error recovery."

This reverts commit 6775fc6ffa3ca1c36b20c25fa4e7f48f81213cf2.

It also reverts "[lldb] Fix compilation by adjusting to the new ASTContext signature."

This reverts commit 03a3f86071c10a1f6cbbf7375aa6fe9d94168972.

We see some failures on the lldb infrastructure, these changes might play a role
in it. Let's revert it now and see if the bots will become green.

Ref: https://reviews.llvm.org/D104918

[Analysis] Remove unused declaration isPotentiallyReachableFromMany (NFC)

[IfCvt] Don't use pristine register for counting liveins for predicated instructions.

The test case here hits machine verifier problems. There are volatile
long loads that the results of do not get used, loading into two dead
registers. IfCvt will predicate them and as it does will add implicit
uses of the predicating registers due to thinking they are live in. As
nothing has used the register, the machine verifier disagrees that they
are really live and we end up with a failure.

The registers come from Pristine regs that LivePhysRegs counts as live.
This patch adds a addLiveInsNoPristines method to be used instead in
IfCvt, so that only really live in regs need to be added as implicit
operands.

Differential Revision: https://reviews.llvm.org/D90965

sanitizer_common: unbreak ThreadRegistry tests

https://reviews.llvm.org/D105713
forgot to update tests for the new ctor.

Differential Revision: https://reviews.llvm.org/D105772

[lldb] Fix compilation by adjusting to the new ASTContext signature.

This change was introduced in https://reviews.llvm.org/D104918

sanitizer_common: add simpler ThreadRegistry ctor

Currently ThreadRegistry is overcomplicated because of tsan,
it needs tid quarantine and reuse counters. Other sanitizers
don't need that. It also seems that no other sanitizer now
needs max number of threads. Asan used to need 2^24 limit,
but it does not seem to be needed now. Other sanitizers blindly
copy-pasted that without reasons. Lsan also uses quarantine,
but I don't see why that may be potentially needed.

Add a ThreadRegistry ctor that does not require any sizes
and use it in all sanitizers except for tsan.
In preparation for new tsan runtime, which won't need
any of these parameters as well.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105713

[clang-repl] Implement partial translation units and error recovery.

https://reviews.llvm.org/D96033 contained a discussion regarding efficient
modeling of error recovery. @rjmccall has outlined the key ideas:

Conceptually, we can split the translation unit into a sequence of partial
translation units (PTUs). Every declaration will be associated with a unique PTU
that owns it.

The first key insight here is that the owning PTU isn't always the "active"
(most recent) PTU, and it isn't always the PTU that the declaration
"comes from". A new declaration (that isn't a redeclaration or specialization of
anything) does belong to the active PTU. A template specialization, however,
belongs to the most recent PTU of all the declarations in its signature - mostly
that means that it can be pulled into a more recent PTU by its template
arguments.

The second key insight is that processing a PTU might extend an earlier PTU.
Rolling back the later PTU shouldn't throw that extension away. For example, if
the second PTU defines a template, and the third PTU requires that template to
be instantiated at float, that template specialization is still part of the
second PTU. Similarly, if the fifth PTU uses an inline function belonging to the
fourth, that definition still belongs to the fourth. When we go to emit code in
a new PTU, we map each declaration we have to emit back to its owning PTU and
emit it in a new module for just the extensions to that PTU. We keep track of
all the modules we've emitted for a PTU so that we can unload them all if we
decide to roll it back.

Most declarations/definitions will only refer to entities from the same or
earlier PTUs. However, it is possible (primarily by defining a
previously-declared entity, but also through templates or ADL) for an entity
that belongs to one PTU to refer to something from a later PTU. We will have to
keep track of this and prevent unwinding to later PTU when we recognize it.
Fortunately, this should be very rare; and crucially, we don't have to do the
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.
Otherwise, PTUs after the first just need to record enough metadata to be able
to revert any changes they've made to declarations belonging to earlier PTUs,
e.g. to redeclaration chains or template specialization lists.

It should even eventually be possible for PTUs to provide their own slab
allocators which can be thrown away as part of rolling back the PTU. We can
maintain a notion of the active allocator and allocate things like Stmt/Expr
nodes in it, temporarily changing it to the appropriate PTU whenever we go to do
something like instantiate a function template. More care will be required when
allocating declarations and types, though.

We would want the PTU to be efficiently recoverable from a Decl; I'm not sure
how best to do that. An easy option that would cover most declarations would be
to make multiple TranslationUnitDecls and parent the declarations appropriately,
but I don't think that's good enough for things like member function templates,
since an instantiation of that would still be parented by its original class.
Maybe we can work this into the DC chain somehow, like how lexical DCs are.

We add a different kind of translation unit `TU_Incremental` which is a
complete translation unit that we might nonetheless incrementally extend later.
Because it is complete (and we might want to generate code for it), we do
perform template instantiation, but because it might be extended later, we don't
warn if it declares or uses undefined internal-linkage symbols.

This patch teaches clang-repl how to recover from errors by disconnecting the
most recent PTU and update the primary PTU lookup tables. For instance:

```./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```

Differential revision: https://reviews.llvm.org/D104918

sanitizer_common: sanitize time functions

We have SleepForSeconds, SleepForMillis and internal_sleep.
Some are implemented in terms of libc functions, some -- in terms
of syscalls. Some are implemented in per OS files,
some -- in libc/nolibc files. That's unnecessary complex
and libc functions cause crashes in some contexts because
we intercept them. There is no single reason to have calls to libc
when we have syscalls (and we have them anyway).

Add internal_usleep that is implemented in terms of syscalls per OS.
Make SleepForSeconds/SleepForMillis/internal_sleep a wrapper
around internal_usleep that is implemented in sanitizer_common.cpp once.

Also remove return values for internal_sleep, it's not used anywhere.

Eventually it would be nice to remove SleepForSeconds/SleepForMillis/internal_sleep.
There is no point in having that many different names for the same thing.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105718

sanitizer_common: split LibIgnore into fast/slow paths

LibIgnore is checked in every interceptor.
Currently it has all logic in the single function
in the header, which makes it uninlinable.
Split it into fast path (no libraries ignored)
and slow path (have ignored libraries).
It makes the fast path inlinable (single load).

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105719

[lld-macho][nfc] Expand the compact unwind symbol reloc test

Add a bit more detail to the comments, and check that the final binary
does indeed have a `__unwind_info` section (D105557 previosly regressed
this).

Also rename the test to emphasize that we are testing relocations
compact unwind, not relocations in general.

[InstCombine] Add optimization to prevent poison from being propagated.

In D104569, Freeze was inserted just before br to solve the `branching on undef` miscompilation problem.
But value analysis was being disturbed by added freeze.

```
v = load ptr
cond = freeze(icmp (and v, const), const')
br cond, ...
```
The case in which value analysis disturbed is as above.
By changing freeze to add immediately after load, value analysis will be successful again.

```
v = load ptr
freeze(icmp (and v, const), const')
=>
v = load ptr
v' = freeze v
icmp (and v', const), const'
```
In this patch, I propose the above optimization.
With this patch, the poison will not spread as the freeze is performed early.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D105392

Fix windows directory separator some more for test from b447b9dce0d105e7f0b22db719fe8624108e99dc

Reapply "llvm-symbolizer: Fix "start file" to work with Split DWARF"

Originally committed as 04c203e310bd3fb58e16c936c0200d680100526e
Reverted in 768510632c5ddbf9438693d9c7db1903e39295ad due to the test
failing when encountering windows directory separators.

Fix the path separator platform issue with a FileCheck pattern {{[/\\]}}

Original commit message:

A followup to the feature added in 69da27c7496ea373567ce5121e6fe8613846e7a5
that added the optional "start file name" to match "start line" - but this
didn't work with Split DWARF because of the need for the decl file number
resolution code to refer back to the skeleton unit to find its .debug_line
contribution. So this patch adds the necessary infrastructure to track the
skeleton unit corresponding to a split full unit for the purpose of this
lookup.

[DivRemPairs] Add test cases for D87555. NFC

[RISCV] Restore non-constant srem test I accidentally deleted. NFC

[Analysis] Remove changeCondBranchToUnconditionalTo (NFC)

The last use was removed on Jan 21, 2021 in commit
0895b836d74ed333468ddece2102140494eb33b6.

[RISCV] Add test cases for div/rem with constant left hand side. NFC

Some of these would produce better code if we used W instructions,
but constant LHS currently prevents that.

[Attributor][FIX] Destroy bump allocator objects to avoid leaks

AllocationInfo and DeallocationInfo objects themselves are allocated
with the Attributor bump allocator and do not need to be deallocated.
That said, the sets in AllocationInfo and DeallocationInfo need to be
destroyed to avoid memory leaks.

[OpenMP] Detect SPMD compatible kernels and execute them as such

In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.

The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D102307

[OpenMP][FIX] Add missing `)` to remark

[OpenMP] Remove checkXXXX device runtime functions

We had multiple functions to determine the execution mode (SPMD/Generic)
and runtime status (initialized/uninitialized) but that just increased
complexity without a real benefit. Especially with D102307 in mind it
is helpful to reduce the dependence on the `ident_t` flags.

Differential Revision: https://reviews.llvm.org/D105586

[OpenMP][NFCI] Re-enable two remarks tests after D101977 landed

[OpenMP] Create custom state machines for generic target regions

In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.

The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D101977

[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976

[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution

When we talk to outside analyse, e.g., LVI and ScalarEvolution, we need
to be careful with the query. The particular error occurred because we
folded a PHI node before the LVI query but the context location was now
not dominated by the value anymore. This is not supported by LVI so we
have to filter these situations before we query the outside analyses.

[Attributor] Look through selects in genericValueTraversal

If we can simplify the select condition we can avoid one value in the
traversal.

Differential Revision: https://reviews.llvm.org/D103861

[OpenMP][FIX] Update remark in test file after rewording

[Attributor] Reorganize AAHeapToStack

In order to simplify future extensions, e.g., the merge of
AAHeapToShared in to AAHeapToStack, we reorganize AAHeapToStack and the
state we keep for each malloc-like call. The result is also less
confusing as we only track malloc-like calls, not all calls. Further, we
only perform the updates necessary for a malloc-like to argue it can go
to the stack, e.g., we won't check all uses if we moved on to the
"must-be-freed" argument.

This patch also uses Attributor helps to simplify the allocated size,
alignment, and the potentially freed objects.

Overall, this is mostly a reorganization and only the use of the
optimistic helpers should change (=improve) the capabilities a bit.

Differential Revision: https://reviews.llvm.org/D104993

[Attributor][FIX] Do not replace a value with a non-dominating instruction

We have to be careful when we replace values to not use a non-dominating
instruction. It makes sense that simplification offers those as
"simplified values" but we can't manifest them in the IR without PHI
nodes. In the future we should consider potentially adding those PHI
nodes.

[ARM] Extra widening and narrowing combinations tests. NFC

[Attributor] Use AAValueSimplify to simplify returned values

We should use AAValueSimplify for all value simplification, however
there was some leftover logic that predates AAValueSimplify in
AAReturnedValues. This remove the AAReturnedValues part and provides a
replacement by making AAValueSimplifyReturned strong enough to handle
all previously covered cases. Further, this improve
AAValueSimplifyCallSiteReturned to handle returned arguments.

AAReturnedValues is now much easier and the collected returned
values/instructions are now from the associated function only, making it
much more sane. We also do not have the brittle logic anymore that looks
for unresolved calls. Instead, we use AAValueSimplify to handle
recursion.

Useful code has been split into helper functions, e.g., an Attributor
interface to get a simplified value.

Differential Revision: https://reviews.llvm.org/D103860

[Attributor] Introduce an optimistic getUnderlyingObjects helper

As the `llvm::getUnderlyingObjects` helper, the optimistic version
collects objects that might be the base of a given pointer. In contrast
to the llvm variant, the optimistic one will use assumed information,
e.g., about select conditions or dead blocks, to provide a more precise
result.

Differential Revision: https://reviews.llvm.org/D103859

[Attributor][FIX] Traverse uses even if a value is assumed constant

Not all attributes are able to handle the interprocedural step and
follow the uses into a call site. Let them be able to combine call site
uses instead. This might result in some unused values/arguments being
leftover but it removes problems where we misused "is dead" even though
it was actually "is simplified/replaced".

We explicitly check for dead values due to constant propagation in
`AAIsDeadValueImpl::areAllUsesAssumedDead` instead.

Differential Revision: https://reviews.llvm.org/D103858

Revert Attributor patch series

Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`

Reland "[clang-repl] Allow passing in code as positional arguments."

This reverts commit 3ec88ca60b24 which reverted e386871e1d21 due to a asan build
failure.

This patch removes the new lines in the test case which seem to introduce the
failure.

Differential revision: https://reviews.llvm.org/D104898

Revert "llvm-symbolizer: Fix "start file" to work with Split DWARF"

This reverts commit 04c203e310bd3fb58e16c936c0200d680100526e.
Test fails on Windows.