review.tizen.org Git - platform/upstream/llvm.git/log

s/setGenerator/addGenerator/ in the JIT docs. NFC

NFC: Add a simple test for introspection call formatting

[OpenMP] Add info for device table changes

Summary:
This patch adds a feature to print information whenever the host-device pointer
mapping table is changed by inserting or removing an entry. This introduces a
new bit field for LIBOMPTARGET_INFO at position 0x8.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D100600

[sanitizer] Simplify GetTls with dl_iterate_phdr on Linux and use it on musl/FreeBSD

... so that FreeBSD specific GetTls/glibc specific pthread_self code can be
removed. This also helps FreeBSD arm64/powerpc64 which don't have GetTls
implementation yet.

GetTls is the range of

* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus

On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.

This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.

This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize`. However, huge glibc x86-64 binaries with numerous shared objects
may observe time complexity penalty, so exclude them for now. Use the simplified
method with non-Android Linux for now, but in theory this can be used with *BSD
and potentially other ELF OSes.

This removal of RISC-V `__builtin_thread_pointer` makes the code compilable with
more compiler versions (added in Clang in 2020-03, added in GCC in 2020-07).

This simplification enables D99566 for TLS Variant I architectures.

Note: as of musl 1.2.2 and FreeBSD 12.2, dlpi_tls_data returned by
dl_iterate_phdr is not desired: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254774
This can be worked around by using `__tls_get_addr({modid,0})` instead
of `dlpi_tls_data`. The workaround can be shared with the workaround for glibc<2.25.

This fixes some tests on Alpine Linux x86-64 (musl)

```
test/lsan/Linux/cleanup_in_tsd_destructor.c
test/lsan/Linux/fork.cpp
test/lsan/Linux/fork_threaded.cpp
test/lsan/Linux/use_tls_static.cpp
test/lsan/many_tls_keys_thread.cpp

test/msan/tls_reuse.cpp
```

and `test/lsan/TestCases/many_tls_keys_pthread.cpp` on glibc aarch64.

The number of sanitizer test failures does not change on FreeBSD/amd64 12.2.

Differential Revision: https://reviews.llvm.org/D98926

[lldb] Raise a CrashLogParseException when failing to parse JSON crashlog

Throw an exception with an actually helpful message when we fail to
parse a JSON crashlog.

NFC: Add missing matcher for test method

The intention is to match the definition.

[OpenMP5][DOCS] Update status of masked construct and correct the color
for omp_target_is_present, NFC.

[AST] Fix location call storage with common last-invocation

Differential Revision: https://reviews.llvm.org/D100548

[mlir][vector][avx] add AVX dot product to X86Vector dialect with lowering

In the long run, we want to unify the dot product codegen solutions between
all target architectures, but this intrinsic enables experimenting with AVX
specific implementations in the meantime.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D100593

[InferAttrs] Do not mark first argument of str(n)cat as writeonly.

str(n)cat appends a copy of the second argument to the end of the first
argument. To find the end of the first argument, str(n)cat has to read
from it until it finds the terminating 0. So it should not be marked as
writeonly. I think this means the argument should not be marked as
writeonly.

(This is causing a mis-compile with legacy DSE, before it got removed)

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D100601

[clang][AArch64] Correctly align HFA arguments when passed on the stack

When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794

[mlir][scf] NFC - Add a getIterOpOperands helper to scf::ForOp

[LLDB] Use path relative to binary for finding .dwo files.

DWARF allows .dwo file paths to be relative rather than absolute. When
they are relative, DWARF uses DW_AT_comp_dir to find the .dwo
file. DW_AT_comp_dir can also be relative, making the entire search
patch for the .dwo file relative. In this case, LLDB currently
searches relative to its current working directory, i.e. the directory
from which the debugger was launched. This is not right, as the
compiler, which generated the relative paths, can have no idea where
the debugger will be launched. The correct thing is to search relative
to the location of the executable binary. That is what this patch
does.

Differential Revision: https://reviews.llvm.org/D97786

[AST][Introspection] Add a check to detect if introspection is supported.

This could probably be made into a compile time constant, but that would involve generating a second inc file.

Reviewed By: steveire

Differential Revision: https://reviews.llvm.org/D100530

[AST] Add a print method to Introspection LocationCall

Add a print method that takes a raw_ostream.
Change LocationCallFormatterCpp::format to call that method.

Reviewed By: steveire

Differential Revision: https://reviews.llvm.org/D100423

[scudo][standalone] Fuchsia related fixes

While attempting to roll the latest Scudo in Fuchsia, some issues
arose. While trying to debug them, it appeared that `DCHECK`s were
also never exercised in Fuchsia. This CL fixes the following
problems:
- the size of a block in the TransferBatch class must be a multiple
  of the compact pointer scale. In some cases, it wasn't true, which
  lead to obscure crashes. Now, we round up `sizeof(TransferBatch)`.
  This only materialized in Fuchsia due to the specific parameters
  of the `DefaultConfig`;
- 2 `DCHECK` statements in Fuchsia were incorrect;
- `map()` & co. require a size multiple of a page (as enforced in
  Fuchsia `DCHECK`s), which wasn't the case for `PackedCounters`.
- In the Secondary, a parameter was marked as `UNUSED` while it is
  actually used.

Differential Revision: https://reviews.llvm.org/D100524

[TableGen] Reduce the number of map lookups in TypeSetByHwMode::getOrCreate. NFCI

hasMode was looking up the map once. Then we'd either call get which
would look up again, or we'd insert into the map which requires
walking the map to find the insertion point.

I believe the hasMode was needed because get has a special case
to look for DefaultMode if the mode being asked for doesn't exist.
We don't want that here so we were using hasMode to make sure we
wouldn't hit that case.

Simplify to a regular operator[] access which will default
construct a SetType if the lookup fails.

[AMDGPU] Factor out predicate FmaakFmamkF32Insts

Differential Revision: https://reviews.llvm.org/D100409

[libcxx][NFC] removes IndentRequires from .clang-format

IndentRequires is apparently breaking CI because it uses an older clang-format.

Partially rolls back 2e3a78b8ca10.

[mlir][AsmPrinter] Fix multi-threaded segfault by using explicit null stream per thread

We were using llvm::nulls, but that isn't thread safe so we switch to giving each thread it's own null stream.

Differential Revision: https://reviews.llvm.org/D100578

[clang] [AArch64] Fix handling of HFAs passed to Windows variadic functions

The documentation says that for variadic functions, all composites
are treated similarly, no special handling of HFAs/HVAs, not even
for the fixed arguments of a variadic function.

Differential Revision: https://reviews.llvm.org/D100467

[VPlan] Replace a few unnecessary includes with forward decls.

[AMDGPU] Add new EmitDstSel field to VOPPofile. NFC.

Differential Revision: https://reviews.llvm.org/D100589

[clang-format] Option for empty lines after an access modifier.

The current logic for access modifiers in classes ignores the option 'MaxEmptyLinesToKeep=1'. It is therefore impossible to have a coding style that requests one empty line after an access modifier. The patch allows the user to configure how many empty lines clang-format should add after an access modifier. This will remove lines if there are to many and will add them if there are missing.

Reviewed By: MyDeveloperDay, curdeius

Differential Revision: https://reviews.llvm.org/D98237

[gn build] Port 82787eb2285d

[mlir] Add verification for `linalg.tiled_loop` op.

Differential Revision: https://reviews.llvm.org/D100555

[mlir] Expose `updateBoundsForCyclicDistribution` in Linalg/Utils.h.

Differential Revision: https://reviews.llvm.org/D100580

[AMDGPU] Move LDS lowering related utility functions to a separate utils file.

Move some utility functions which are used within LDS lowering pass to a separate utils
file so that other LDS related passes can make use of them when required.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100526

[mlir] Add helpers to set lbs, ubs, steps for linalg.tiled_loop.

Differential Revision: https://reviews.llvm.org/D100579

[mlir] Add support for adding attribute+type traits/interfaces to tablegen defs

This matches the current support provided to operations, and allows attaching traits, interfaces, and using the DeclareInterfaceMethods utility. This was missed when attribute/type generation was first added.

Differential Revision: https://reviews.llvm.org/D100233

[Hexagon] Avoid infinite loops in type legalization when lowering SETCC

Only widen SETCC if the operands can be widened. Not checking that caused
infinite widen-split loops in legalization.

[flang][OpenMP] Remove `OmpEndLoopDirective` handles from code.

This directive is currently lowered as NOP.

Patch is an attempt upstream code from:
PR: https://github.com/flang-compiler/f18-llvm-project/pull/573

Reviewed By: kiranchandramohan, schweitz, clementval

Differential Revision: https://reviews.llvm.org/D100576

[RISCV] Share RVInstIShift and RVInstIShiftW instruction format classes with the B extension.

This generalizes RVInstIShift/RVInstIShiftW to take the upper
5 or 7 bits of the immediate as an input instead of only bit 30. Then
we can share them.

For RVInstIShift I left a hardcoded 0 at bit 26 where RV128 gets
a 7th bit for the shift amount.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100424

[OpenMP] Added codegen for masked directive

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514

[mlir][sparse] remove restriction on vectorization of index type

Rationale:
Now that vector<?xindex> is allowed, the restriction on vectorization
of index types in the sparse compiler can be removed. Also needs
generalization of scatter/gather index types.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D100522

[clang][patch] Modify diagnostic level from err to warn: anyx86_interrupt_regsave

Reviewed By: Aaron Ballman

Differential Revision: https://reviews.llvm.org/D100511

[libc++] NFC: Use ASSERT_SAME_TYPE consistently in string.h and wchar.h tests

[libc++] Remove test suite workarounds on Apple with old Clangs

In 5fd17ab, we worked around the Apple system headers not providing
const-correct overloads for some <string.h> functions. However, that
required an attribute that was only present in recent Clangs at the
time. We can now assume that all supported Clang versions on Apple
platforms do support that attribute.

Differential Revision: https://reviews.llvm.org/D100477

[LoopUnrollAndJam] Avoid repeated instructions for UAJ analysis

Avoid visiting repeated instructions for processHeaderPhiOperands as it can cause a scenario of endless loop. Test case is attached and can be ran with `opt -basic-aa -tbaa -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=4`.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D97407

[NewPM] Cleanup IR printing instrumentation

Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).

The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.

There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".

To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.

Reviewed By: jamieschmeiser

Differential Revision: https://reviews.llvm.org/D100231

[asan] Add an offset for the kernel address sanitizer on FreeBSD

This is based on a port of the sanitizer runtime to the FreeBSD kernel
that has been commited as https://cgit.freebsd.org/src/commit/?id=38da497a4dfcf1979c8c2b0e9f3fa0564035c147
and the following commits.

Reviewed By: emaste, dim
Differential Revision: https://reviews.llvm.org/D98285

[Driver] Enable kernel address and memory sanitizers on FreeBSD

Test Plan: using kernel ASAN and MSAN implementations in FreeBSD

Reviewed By: emaste, dim, arichardson

Differential Revision: https://reviews.llvm.org/D98286

[PowerPC] Add ROP Protection Instructions for PowerPC

There are four new PowerPC instructions that are introduced in
Power 10. They are hashst, hashchk, hashstp, hashchkp.

These instructions will be used for ROP Protection.
This patch adds the four instructions.

Reviewed By: nemanjai, amyk, #powerpc

Differential Revision: https://reviews.llvm.org/D99375

[flang] Add list input test to GTest suite

[libc] Add index operator[] to StringView

[LSR] Fix for pre-indexed generated constant offset

This patch changed the isLegalUse check to ensure that
LSRInstance::GenerateConstantOffsetsImpl generates an
offset that results in a legal addressing mode and
formula. The check is changed to look similar to the
assert check used for illegal formulas.

Differential Revision: https://reviews.llvm.org/D100383

Change-Id: Iffb9e32d59df96b8f072c00f6c339108159a009a

Revert "[DebugInfo] Replace debug uses in replaceUsesOutsideBlock"

This reverts commit 96a1e6b7cf72d9bd625903ea4b441404200383cf.

Failing build bots e.g. https://lab.llvm.org/buildbot/#/builders/161/builds/163

[libcxx][NFC] removes BreakBeforeConceptDeclarations from .clang-format

BreakBeforeConceptDeclarations is apparently breaking CI.

Partially rolls back 2e3a78b8ca10.

[DebugInfo] Replace debug uses in replaceUsesOutsideBlock

Value::replaceUsesOutsideBlock doesn't replace debug uses which leads to an
unnecessary reduction in variable location coverage. Fix this, add a unittest for
it, and add a regression test demonstrating the change through instcombine's
replacedSelectWithOperand.

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D99169

[Clang][Docs] Claim the atomic compare

I'm working on the implementation of OpenMP 5.1 feature `atomic compare`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D100507

[InstCombine] update RUN lines in assume test; NFC

This was in a draft of from D82703, but it got left out
of the committed version, so we were not actually testing
the new code.

Fix potential infinite loop with malformed attribute syntax

Double square bracket attribute arguments can be arbitrarily complex,
and the attribute argument parsing logic recovers by skipping tokens.
As a fallback recovery mechanism, parse recovery stops before reading a
semicolon. This could lead to an infinite loop in the attribute list
parsing logic.

[NFC] Remove the -instcombine flag from strict-fadd.ll

This also fixes a CHECK line in @fadd_strict_unroll which ensures the
changes made to fixReduction() to support in-order reductions with
unrolling are being tested correctly.

[yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags

The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100250

[TableGen] [docs] Correct a reference in the TableGen Overview document

Differential Revision: https://reviews.llvm.org/D100382

[AMDGPU] Fix large return values with amdgpu_gfx

Returning in memory is not supported, so fall back to sret.
Also, extend i1 and i16 to i32. Otherwise, they would be passed through
memory.

Differential Revision: https://reviews.llvm.org/D100543

[X86] combineCMP - fold cmpEQ/NE(TRUNC(X),0) -> cmpEQ/NE(X,0)

If we are truncating from a i32 source before comparing the result against zero, then see if we can directly compare the source value against zero.

If the upper (truncated) bits are known to be zero then we can compare against that, hopefully increasing the chances of us folding the compare into a EFLAG result of the source's operation.

Fixes PR49028.

Differential Revision: https://reviews.llvm.org/D100491

[AArch64][NEON] Match (or (and -a b) (and (a+1) b)) => bit select

With this patch vbslq_f32(vnegq_s32(a), b, c) lowers to a BIT instruction.

Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D100304

[clangd] Only allow remote index to be enabled from user config.

Differential Revision: https://reviews.llvm.org/D100542

Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel

This fixes the following bugs:
https://bugs.llvm.org/show_bug.cgi?id=27249
https://bugs.llvm.org/show_bug.cgi?id=46414

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100328

[VPlan] Add VPRecipeBase::mayHaveSideEffects.

Add an initial version of a helper to determine whether a recipe may
have side-effects.

Reviewed By: a.elovikov

Differential Revision: https://reviews.llvm.org/D100259

[lldb] Fix incorrect test data in FileSpecTest.IsRelative

Found by clang-tidy's bugprone-suspicious-missing-comma.

add test case for ignoring -flto=auto and -flto=jobserver

as requested in https://reviews.llvm.org/D99501, test that the two new options are ignored.

Reviewed By: tejohnson, fhahn

Differential Revision: https://reviews.llvm.org/D100484

[DAGCombiner] Fold step_vector with add/mul/shl

This patch implements some DAG combines for STEP_VECTOR:
add step_vector(C1), step_vector(C2) -> step_vector(C1+C2)
add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2)
mul step_vector(C1), C2 -> step_vector(C1*C2)
shl step_vector(C1), C2 -> step_vector(C1<<C2)

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100088

[SVE][LoopVectorize] Fix crash in InnerLoopVectorizer::widenPHIInstruction

There were a few places in widenPHIInstruction where calculations of
offsets were failing to take the runtime calculation of VF into
account for scalable vectors. I've fixed those cases in this patch
as well as adding an assert that we should not be scalarising for
scalable vectors.

Tests are added here:

Transforms/LoopVectorize/AArch64/sve-widen-phi.ll

Differential Revision: https://reviews.llvm.org/D99254

[RISCV] Pre-commit vector shuffle test cases

This codegen will be improved by future patches.

[AA] Updates for D95543.

Addressing latter comments in D95543:
- `AliasResult::Result` renamed to `AliasResult::Kind`
- Offset printing added for `PartialAlias` case in `-aa-eval`
- Removed VisitedPhiBBs check from BasicAA'

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D100454

[AArch64] Use type-legalization cost for code size memop cost.

At the moment, getMemoryOpCost returns 1 for all inputs if CostKind is
CodeSize or SizeAndLatency. This fools LoopUnroll into thinking memory
operations on large vectors have a cost of one, even if they will get
expanded to a large number of memory operations in the backend.

This patch updates getMemoryOpCost to return the cost for the type
legalization for both CodeSize and SizeAndLatency. This should more
accurately reflect the number of memory operations required.

I am not sure how latency should properly be included in SizeAndLatency
from the description, but returning the size cost should be clearly more
accurate.

This does not cause any binary changes when building
MultiSource/SPEC2000/SPEC2006 with -O3 -flto for AArch64, likely because
large vector memops are not really formed by code emitted from Clang.
But using the C/C++ matrix extension can easily result in code with very
large vector operations directly from Clang, e.g.
https://clang.godbolt.org/z/6xzxcTGvb

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D100291

NFC put the armv6m entry with the other Cortex-M entries

The armv6m entry in cores_match() got separated from its
friends armv7m and armv7em. Reuniting them to make it
easier to keep them updated in all at the same time.

[flang] Update the regression tests to use the new driver when enabled

This patch updates most of the remaining regression tests (~400) to use
`flang-new` rather then `f18` when `FLANG_BUILD_NEW_DRIVER` is set.
This allows us to share more Flang regression tests between `f18` and
`flang-new`. A handful of tests have not been ported yet - these are
currently either failing or not supported by the new driver.

Summary of changes:
  * RUN lines in tests are updated to use `%flang_fc1` instead of `%f18`
  * option spellings in tests are updated to forms accepted by both `f18` and
    `flang-new`
  * variables in Bash scripts are renamed (e.g. F18 --> FLANG_FC1)
The updated tests will now be run with the new driver, `flang-new`,
whenever it is enabled (i.e when `FLANG_BUILD_NEW_DRIVER` is set).

Although this patch touches many files, vast majority of the changes are
automatic:
```
grep -IEZlr "%f18" flang/test/ | xargs -0 -l sed -i 's/%f18/%flang_fc1/g
```

Differential Revision: https://reviews.llvm.org/D100309

[NFC][LoopVectorize] Remove unnecessary VF.isScalable asserts

There are a few places in LoopVectorize.cpp where we have been too
cautious in adding VF.isScalable() asserts and it can be confusing.
It also makes it more difficult to see the genuine places where
work needs doing to improve scalable vectorization support.

This patch changes getMemInstScalarizationCost to return an
invalid cost instead of firing an assert for scalable vectors. Also,
vectorizeInterleaveGroup had multiple asserts all for the same
thing. I have removed all but one assert near the start of the
function, and added a new assert that we aren't dealing with masks
for scalable vectors.

Differential Revision: https://reviews.llvm.org/D99727

[clang][deps] NFC: Improve documentation

Fix typos and simplify wording

Mark armv6m compat with armv7em; match armv7em being compat with armv6m

armv7em and armv6m in ArchSpec cores_match() will return true.
There was a small bug where the reverse order would not return true.

rdar://76387176

Add convenient composed tsan constants

This change adds convenient composed constants to be used for tsan_read_try_lock annotations, reducing the boilerplate at the instrumentation site.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D99595

[AArch64] Fix windows vararg functions with floats in the fixed args

On Windows, float arguments are normally passed in float registers
in the calling convention for regular functions. For variable
argument functions, floats are passed in integer registers. This
already was done correctly since many years.

However, the surprising bit was that floats among the fixed arguments
also are supposed to be passed in integer registers, contrary to regular
functions. (This also seems to be the behaviour on ARM though, both
on Windows, but also on e.g. hardfloat linux.)

In the calling convention, don't promote shorter floats to f64, but
convert them to integers of the same length. (Floats passed as part of
the actual variable arguments are promoted to double already on the
C/Clang level; the LLVM vararg calling convention doesn't do any
extra promotion of f32 to f64 - this matches how it works on X86 too.)

Technically, this is an ABI break compared to older LLVM versions,
but it fixes compatibility with the official platform ABI. (In practice,
floats among the fixed arguments in variable argument functions is
a pretty rare construct.)

Differential Revision: https://reviews.llvm.org/D100365

[clang] [test] Share patterns in CodeGen/ms_abi_aarch64.c between cases. NFC.

Differential Revision: https://reviews.llvm.org/D100468

Reland "[lit] Handle plain negations directly in the internal shell"

Keep running "not --crash" via the external "not" executable, but
for plain negations, and for cases that use the shell "!" operator,
just skip that argument and invert the return code.

The libcxx tests only use the shell operator "!" for negations,
never the "not" executable, because libcxx tests can be run without
having a fully built llvm tree available providing the "not"
executable.

This allows using the internal shell for libcxx tests.

It should be possible to reland this now that D99938 fixed the
one test failure in clang-tidy that broke when "not" was handled
internally, letting lit/python execute grep.exe directly instead
of via not.exe. (See D99330 and D99406 for more commentery on the
exact issue that broke and other potential ways of fixing it.)

Differential Revision: https://reviews.llvm.org/D98859

Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting"

This reverts commit faf9f11589ce892b31d271917cf840f8ca903221.

Issues with this patch have been reported in
https://reviews.llvm.org/D100264#2689917 and
https://bugs.llvm.org/show_bug.cgi?id=49967.

[NewGVN] Add phi-of-ops operands if no real PHI is created.

If the PHI-of-ops simplifies to an existing value, no real PHI is
created, which means the dependencies between the
PHI-of-ops and its operands is not materialized in IR. At the
moment, we fail to create a real PHI node for the PHI-of-ops,
because the PHI-of-ops root instruction is not re-visited if
one of the PHI-of-ops operands changes. We need to add the
operands as additional users in this case.

Even with this patch, there are still some dependencies
missing. I will continue tackling the outstanding
reporeted crashes in this area.

Fixes PR36501, PR42422, PR42557.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D66924

[RISCV] Add a PatFrag to shorten repeated (XLenVT (VLOp GPR:$vl)) in V extension patterns.

Reduces the amount of changes needed in D100288.

[RISCV][Clang] Add vmv and vfmv series intrinsic functions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper, Jim

Differential Revision: https://reviews.llvm.org/D100266

[scudo] Restore zxtest compatibility

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D100426

Fix the build of `mlir-doc` (again)

This is more fallout from add_mlir_doc() API change

[Test] Propagate nofree attribute from function to calls

Fix Interface doc generation after recent change to add_mlir_doc() API

This is basically fixing the build of `mlir-doc`

[AMDGPU] Disable forceful inline of non-kernel functions which use LDS.

Now since LDS uses within non-kernel functions are being handled in the
pass - LowerModuleLDS, we *NO* need to *forcefully* inline non-kernel
functions just because they use LDS. Do forceful inlining only when the
pass - LowerModuleLDS is not enabled. It is enabled by default.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100481

Change add_mlir_doc CMake macro to take the tablegen command as last argument to allow extra flags

This is useful for expressing specific table-gen options, like selecting
a particular dialect to print.
Use it to fix the documentation for the `pdl_interp` dialect which is now
generating the first dialect it finds in its input which is `pdl`.

Differential Revision: https://reviews.llvm.org/D100517

[libcxx][NFC] adjusts formatting rules

This will reduce the amount of noisy feedback during reviews.

Differential Revision: https://reviews.llvm.org/D99691

fix comment typos to cycle bots

[gn build] Port b7459a10dad1

[lldb] Simplify output for skipped categories in dotest.py

Print a single line listing all the categories that are being skipped,
rather than relying on the check.*Support() functions specifying why a
particular category will be skipped. If we know why a category got
skipped, still print that in verbose mode.

The motivation for this change is that sometimes engineers misidentify
the output of these messages as the cause for a test failure (e.g. not
being able to build libc++ or libstdc++).

Differential revision: https://reviews.llvm.org/D100508

[DWARF] Fix crash for DWARFDie::dump.

When DIE is extracted manually, the DieArray is empty. When dump is invoked on aforementioned DIE it tries to extract child, even if Dump options say otherwise. Resulting in crash.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D99698

Revert "Simplify BitVector code"

This reverts commit 82f0e3d3ea6bf927e3397b2fb423abbc5821a30f.

The change breaks the asan buildbots.

https://lab.llvm.org/buildbot/#/builders/99/builds/2835

[llvm-objdump] try to fix section-filter.test in full builds after 51aa61e74bdb

[llvm-objdump] try to fix hexagon tests more after 51aa61e74bdb

[llvm-objdump] try to fix hexagon and riscv tests after 1035123ac50db

[hwasan] Fix lock contention on thread creation.

Do not hold the free/live thread list lock longer than necessary.
This change speeds up the following benchmark 10x.

constexpr int kTopThreads = 50;
constexpr int kChildThreads = 20;
constexpr int kChildIterations = 8;

void Thread() {
  for (int i = 0; i < kChildIterations; ++i) {
    std::vector<std::thread> threads;
    for (int i = 0; i < kChildThreads; ++i)
      threads.emplace_back([](){});
    for (auto& t : threads)
      t.join();
  }
}

int main() {
  std::vector<std::thread> threads;
  for (int i = 0; i < kTopThreads; ++i)
    threads.emplace_back(Thread);
  for (auto& t : threads)
    t.join();
}

Differential Revision: https://reviews.llvm.org/D100348

[llvm-objdump] Switch command-line parsing from llvm::cl to OptTable

This is similar to D83530, but for llvm-objdump.

The motivation is the desire to add an `llvm-otool` symlink to
llvm-objdump that behaves like macOS's `otool`, using the same
technique the at llvm-objcopy uses to behave like `strip` (etc).

This change for the most part preserves behavior. In some cases,
it increases compatibility with GNU objdump a bit. For example,
the long options now require two dashes, and the long options
taking arguments for the most part now require a `=` in front
of the value. Exceptions are flags where tests passed the
value separately, for these the separate form is kept as
an alias to the = form.

The one-letter short form args are now joined or separate
and long longer accept a =, which also matches GNU objdump.

cl::opt<>s in libraries now have to be explicitly plumbed
through. This patch does that for --x86-asm-syntax=, but
there's hope that we can remove that again.

Differential Revision: https://reviews.llvm.org/D100433

[Sema] Fold VLA types in compound literals to constant arrays.

Similar to variables with an initializer, this is never valid in
standard C, so we can safely constant-fold as an extension. I ran into
this construct in a couple proprietary codebases.

While I'm here, drive-by fix for 090dd647: we should only fold variables
with VLA types, not arbitrary variably modified types.

Differential Revision: https://reviews.llvm.org/D98363

Reapply "[InferAttributes] Materialize all infered attributes for declaration"" and follow on patches.

This reverts commit ab98f2c7129a52e216fd7e088b964cf4af27b0f2 and 98eea392cdbcdb7360e58b46e9329573f092cd96.

It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.