review.tizen.org Git - platform/upstream/llvm.git/log

[fir] Add fir reduction builder

This patch introduces a bunch of builder functions
to create function calls to runtime reduction functions.

This patch is part of the upstreaming effort from fir-dev branch.

This patch failed previously because a macro was missing.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D114460

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>

[NPM] Fix LoopNestPasses in -print-pipeline-passes

Fix printing of LoopNestPasses when using the opt pipeline printer
option -print-pipeline-passes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D114771

[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims

The new affine map generated by linearizeCollapsedDims should not drop
dimensions. We need to make sure we create a map with at least as many
dimensions as the source map. This prevents
FoldProducerReshapeOpByLinearization from generating invalid IR.

This solves a regression in IREE due to https://github.com/llvm/llvm-project/commit/e4e4da86aff5606ef792d987a3ec85639219228c

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D114838

This reverts commit 9a844c2a9b5c09b4c35d573394a99ab860621581.

Revert "[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims"

This reverts commit bc38673e4de50b995f4bc46d1a4b0ad95bef2356.

[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims

The new affine map generated by linearizeCollapsedDims should not drop
dimensions. We need to make sure we create a map with at least as many
dimensions as the source map. This prevents
FoldProducerReshapeOpByLinearization from generating invalid IR.

This solves a regression in IREE due to https://github.com/llvm/llvm-project/commit/e4e4da86aff5606ef792d987a3ec85639219228c

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D114838

[clang-offload-bundler] Reuse original file extension for device archive member

This patch changes clang-offload-bundler to use the original file extension for
the device archive member when unbundling archives instead of printing a warning
and defaulting to ".o".

Differential Revision: https://reviews.llvm.org/D114776

[Legalizer] Avoid expansion to BR_CC if illegal

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D110616

[sanitizer] Add delta compression stack depot

Compress by factor 4x, takes about 10ms per 8 MiB block.

Depends on D114498.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D114503

[memprof] Align each rawprofile section to 8b.

The first 8b of each raw profile section need to be aligned to 8b since
the first item in each section is a u64 count of the number of items in
the section.
Summary of changes:
* Assert alignment when reading counts.
* Update test to check alignment, relax some size checks to allow padding.
* Update raw binary inputs for llvm-profdata tests.

Differential Revision: https://reviews.llvm.org/D114826
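
The alignment requirement described above can be sketched as rounding each section's start offset up to the next multiple of 8. This is only an illustration: `align_up` and `padding_needed` are hypothetical helper names, not part of the memprof runtime or llvm-profdata.

```python
def align_up(offset: int, alignment: int = 8) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def padding_needed(offset: int, alignment: int = 8) -> int:
    """Number of padding bytes to insert so the next section starts aligned."""
    return align_up(offset, alignment) - offset
```

A section that would otherwise start at byte 13 needs 3 bytes of padding so that its leading u64 count starts at byte 16, which is why the size checks in the test had to be relaxed.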

[lldb] Temporarily skip TestTsanBasic on Darwin

See ongoing discussion in https://reviews.llvm.org/D112603.

[X86] Pre-commit tests to show the problem of SQRT when `RefinementSteps` = 0. NFC

[mlir] Update accessors prefixed form (NFC)

[sanitizer] Add compress_stack_depot flag

Depends on D114494.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D114495

[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem.

The V extension supports .vx instructions for integer division and
remainder so we should sink splats for that operand.

[libcxx][doc] Document recent spaceship projects progress

Update a couple authors, differentials, and completed projects for operator<=> implementation

Reviewed By: #libc, Mordante, Quuxplusone, ldionne

Differential Revision: https://reviews.llvm.org/D114682

Add toggling for -fnew-infallible/-fno-new-infallible

Allow toggling of -fnew-infallible so last instance takes precedence

Testing:
ninja check-all

Reviewed By: bruno

Differential Revision: https://reviews.llvm.org/D113523
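
A minimal sketch of "last instance takes precedence" for a pair of toggling flags. The function name and the `default` parameter are assumptions made for illustration; this is not how the Clang driver is actually structured.

```python
def new_infallible_enabled(args, default=False):
    """Return the final state after scanning the command line left to right."""
    enabled = default  # assumed default; the real default is chosen by the driver
    for arg in args:
        if arg == "-fnew-infallible":
            enabled = True
        elif arg == "-fno-new-infallible":
            enabled = False
    return enabled
```

With this rule, `["-fnew-infallible", "-fno-new-infallible"]` resolves to disabled, while the reverse order resolves to enabled.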

[test] Avoid dumping .o in source tree (expand-pseudos.ll)

Piping the input to llc avoids that (i.e. llc .... < %s vs llc ... %s)

[NFC][sanitizer] Add entry point for compression

Add Compression::Test type which just pretends packing,
but does nothing useful. It's only called from test for now.

Depends on D114493.

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D114494

[mlir][sparse] added sparse out element wise mult integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D114822

[llvm-profgen] Truncate the context with zero probe ID

Due to debug info merging, there may be some contexts with a zero probe ID. We should truncate such contexts to avoid misleading the pre-inliner.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D114284

[ObjectYAML/obj2yaml/yaml2obj][MachO] Support indirect symbol table

Tools such as `llvm-objdump` or `llvm-readobj` support indirect symbol
tables. Here, support it for `obj2yaml` and `yaml2obj`.

Reviewed By: jhenderson, drodriguez

Differential Revision: https://reviews.llvm.org/D114410

[FS-AFDO][llvm-profgen] Generate profile with FS-AFDO discriminator

In order to support generating profiles with the FS discriminator, three kinds of changes are made in llvm-profgen:

1) Disassemble the .rodata section to check whether the FS discriminator variable ('"__llvm_fs_discriminator__"') exists and set the corresponding flag in the binary.

2) Change the discriminator decoding in `getBaseDiscriminator` and `getDuplicationFactor`.

3) Set `FunctionSamples::ProfileIsFS` to true to enable FS functionality in ProfileData.

Reviewed By: xur, hoy, wenlei

Differential Revision: https://reviews.llvm.org/D113296

[runtimes][openmp] Change to not treat ARCH-unknown-linux-gnu as errors

When OpenMP is compiled as part of the runtimes build for multiple targets, openmp
is compiled under the build/runtimes/runtimes-arch-unknown-linux-gnu-bins
directory. The old implementation treats this directory name as an error.
This patch adds a guard like "[Uu]known[^-]".

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D114346

[mlir][sparse] fix typos in integration tests

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D114820

[flang] Rearrange prototype & code placement of IsCoarray()

A quick fix last week to the shared library build caused
the predicate IsCoarray(const Symbol &) to be moved from
Semantics to Evaluate. This patch completes that move in
a way that properly combines the existing IsCoarray() tests
for expressions and other objects with the test for a symbol.

Differential Revision: https://reviews.llvm.org/D114806

Revert "[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment"

This reverts commit 29a50c5864ddab283c1ff38694fb5926ce37b39a.

After LLVM lowering, the original patch incorrectly moved alignment
information across an unconstrained GEP operation. This is only correct
for some index offsets in the GEP. It seems that the best approach is,
in fact, to rely on LLVM to propagate information from the llvm.assume()
to users.

Thanks to Thomas Raoux for catching this.

[Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

We've found that when profiling, counts are only generated for the real definition of constructor aliases (C2 in mangled name). However, when compiling, the C1 version is present at the callsite, which leads to a lack of counts due to this aliasing. This causes us to miss out on inlining an otherwise hot constructor.

-mconstructor-aliases is AFAICT an optimization, so having a disabling flag if wanted seems valuable.

Testing:
ninja check-all

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D114130

[NFC][regalloc] Factor accesses to ExtraRegInfo

We'll move ExtraRegInfo to the RegAllocEvictionAdvisor subsequently.
This change prepares for that by factoring all accesses.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D114759

Big-endian version of vpermxor

A big-endian version of vpermxor, named vpermxor_be, is added to LLVM
and Clang. vpermxor_be can be called directly on both the little-endian
and the big-endian platforms.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114540

[TSan][Darwin] Avoid crashes due to interpreting non-zero shadow content as a pointer

We would like to use TLS to store the ThreadState object (or at least a
reference to it), but on Darwin accessing TLS via __thread or manually
by using pthread_key_* is problematic, because there are several places
where interceptors are called when TLS is not accessible (early process
startup, thread cleanup, ...).

Previously, we used a "poor man's TLS" implementation, where we use the
shadow memory of the pointer returned by pthread_self() to store a
pointer to the ThreadState object.

The problem with that was that certain operations can populate shadow
bytes unbeknownst to TSan, and we later interpret these non-zero bytes
as the pointer to our ThreadState object and crash when dereferencing
the pointer.

This patch changes the storage location of our reference to the
ThreadState object to "real" TLS. We make this work by artificially
keeping this reference alive in the pthread_key destructor by resetting
the key value with pthread_setspecific().

This change also fixes the issue where the ThreadState object is
re-allocated after DestroyThreadState() because intercepted functions
can still get called on the terminating thread after the
THREAD_TERMINATE event.

Radar-Id: rdar://problem/72010355

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D110236

[OpenMP][libomp][doc] Add environment variables documentation

Add documentation for the environment variables for libomp

Differential Revision: https://reviews.llvm.org/D114269

[flang] Define & implement a lowering support API IsContiguous() in runtime

Create a new flang/runtime/support.cpp module to hold miscellaneous
runtime APIs to support lowering, and define an API IsContiguous() to
wrap the member function predicate Descriptor::IsContiguous().
And do a little clean-up of other API headers that don't need to expose
Runtime/descriptor.h.

Differential Revision: https://reviews.llvm.org/D114752

[ADT] Remove 0-width Asserts in APInt.getZExtValue

Remove assertion that disallows getting a zero-extended value from a
zero-width APInt. This check is too restrictive and makes it difficult
to use APInt to model zero-width things, e.g., zero-width wires in the
CIRCT project.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@sifive.com>
Reviewed By: lattner, darthscsi, nikic

Differential Revision: https://reviews.llvm.org/D114768

[NFC][sanitizer] Fail test quickly

[InstCombine] Allow fake vector insert folding to bit-logic only if the insert element is integer type

The below commit is causing assertion when insert element type is not integer
type such as half. This is because the transformation is creating zext before
doing bitwise OR, and the zext is supported only for integer types
https://github.com/llvm/llvm-project/commit/80ab06c599a0f5a90951c36a57b2a9b492b19d61

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D114734

[NFC] Refactor symbol table parsing.

Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.

This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.

Differential Revision: https://reviews.llvm.org/D114288
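
The locking scheme described above (create the symbol table exactly once, and hold the table's own lock while parsing) can be sketched as follows. This is a simplified model, not LLDB code: `ObjectFileSketch`, `parse_count`, and the use of a plain lock as a stand-in for `llvm::once_flag` are all assumptions made for illustration.

```python
import threading


class Symtab:
    def __init__(self):
        # Every symbol-table API takes this lock, so readers wait until parsing ends.
        self.mutex = threading.RLock()
        self.symbols = []

    def find(self, name):
        with self.mutex:
            return name in self.symbols


class ObjectFileSketch:
    def __init__(self, names):
        self._names = names
        self._once = threading.Lock()  # stand-in for llvm::once_flag
        self._symtab = None
        self.parse_count = 0

    def parse_symtab(self, symtab):
        # Plays the role of the plug-in's ParseSymtab(); called a single time.
        self.parse_count += 1
        symtab.symbols.extend(self._names)

    def get_symtab(self):
        with self._once:
            if self._symtab is None:
                symtab = Symtab()
                with symtab.mutex:  # lock the table itself, not the module
                    self._symtab = symtab
                    self.parse_symtab(symtab)
        return self._symtab
```

Repeated calls to `get_symtab()` return the same object and never re-run the parser, and no module-level lock is involved.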

[flang] Correct INQUIRE(POSITION= & PAD=)

INQUIRE(POSITION=)'s results need to reflect the POSITION=
specifier used for the OPEN statement until the unit has been
repositioned. Preserve the POSITION= from OPEN and use it
for INQUIRE(POSITION=) until it becomes obsolete.

INQUIRE(PAD=) is implemented here in the case of an unconnected unit
with Fortran 2018 semantics; i.e., "UNDEFINED", rather than Fortran 90's
"YES"/"NO" (see 4.3.6 para 2). Apparent failures with F'90-only tests
will persist with INQUIRE(PAD=); these discrepancies don't seem to warrant
an option or environment variable.

To make the implementation of INQUIRE more closely match the language
in the standard, rename IsOpen() to IsConnected(), and use it explicitly
for the various INQUIRE specifiers.

Differential Revision: https://reviews.llvm.org/D114755

[mlir][sparse] refine simply dynamic sparse tensor outputs

Proper test for sparse tensor outputs is a single condition throughout
the whole tensor index expression (not a general conjunction, since this
may include other conditions that cause cancellation).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D114810

[flang] Re-fold bounds expressions in DATA implied DO loops

To accommodate triangular implied DO loops in DATA statements, in which
the bounds of nested implied DO loops might depend on the values of the
indices of outer implied DO loops in the same DATA statement set, it
is necessary to run them through constant folding each time they are
encountered.

Differential Revision: https://reviews.llvm.org/D114754
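
To illustrate why the bounds must be re-evaluated, here is a small sketch that enumerates the index pairs of a triangular implied DO of the form `((a(i,j), j=1,i), i=1,n)`. The function is hypothetical and only models the iteration order, not flang's constant folding.

```python
def triangular_indices(n):
    """Enumerate (i, j) pairs for ((a(i,j), j=1,i), i=1,n)."""
    pairs = []
    for i in range(1, n + 1):
        # The inner upper bound depends on the outer index, so it has to be
        # evaluated again for every value of i rather than folded once.
        upper = i
        for j in range(1, upper + 1):
            pairs.append((i, j))
    return pairs
```

For `n = 3` the inner loop runs 1, 2, and 3 times, which a single up-front folding of the inner bound could not represent.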

[clang-repl][NFC] Fix calling convention mismatch in test

Test failed on x86 platforms due to a calling convention mismatch
when member function was called like a free function. In this patch,
member function is marked static to address this.

[lldb] Fix broken skipUnlessUndefinedBehaviorSanitizer decorator

727bd89b605b broke the UBSan decorator. The decorator compiles a custom
source code snippet that exposes UB and verifies the presence of a UBSan
symbol in the generated binary. The aforementioned commit broke both by
compiling a snippet without UB and discarding the result.

[flang] Fix usage & catch errors for MAX/MIN with keyword= arguments

MAX(), MIN(), and their specific variants are defined with an unlimited
number of dummy arguments named A1=, A2=, &c. whose names are almost never
used in practice but should be allowed for and properly checked for the
usual errors when they do appear. The intrinsic table's entries otherwise
have fixed numbers of dummy argument definitions, so add some special
case handling in a few spots for MAX/MIN/&c. checking and procedure
characteristics construction.

Differential Revision: https://reviews.llvm.org/D114750

[lldb] Fix TypeError: argument of type 'NoneType' is not iterable

Check if we have an apple_sdk before checking if it contains "internal".

[lldb] Mark TestTsanBasic and TestUbsanBasic as "no debug info" tests

Speed up testing by not rerunning the test for all debug info variants.

[mlir][tensor] InsertSliceOp verification.

This revision reintroduces tensor.insert_slice verification which seems
to have vanished over time: a verifier was initially introduced in cf9503c1b752062d9abfb2c7922a50574d9c5de4
but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted.

As a consequence, a non-negligible portion of tests has run astray using invalid
tensor.insert_slice semantics and needed to be fixed.

Also, extract isRankReducedType from TensorOps for better reuse
Originally, this facility was used by both tensor and memref forms but
it got copied around as dialects were split.

Differential Revision: https://reviews.llvm.org/D114715

[mlir][MemRef] Fix SubViewOp canonicalization when a subset of unit-dims are dropped.

The canonical type of the result of the `memref.subview` needs to make
sure that the previously dropped unit-dimensions are the ones dropped
for the canonicalized type as well. This means the generic
`inferRankReducedResultType` cannot be used. Instead the current
dropped dimensions need to be queried and the same need to be dropped.

Reviewed By: nicolasvasilache, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D114751

AArch64 GIsel: legalize lshr operands, even if it is poison

Previously, this caused GlobalISel to emit invalid IR (a gpr32 to gpr64
copy) and fail during verification.

While this shift is not defined (returns poison), it should not crash
codegen, as it may appear inside dead code (for example, a select
instruction), and it is legal IR input, as long as the value is unused.

Discovered while trying to build Julia with LLVM v13:
https://github.com/JuliaLang/julia/pull/42602.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D114389

[memprof] Disallow memprof profile reader tests on non-x86 archs.

The memprof profile reader tests rely on binary data which is generated
from and meant to be interpreted on little endian architectures. Add a
REQUIRES: x86_64-linux clause to both tests to ensure they don't fail on big
endian targets such as ppc.

[SCEV] Verify integrity of ValuesAtScopes and users (NFC)

Make sure that ValuesAtScopes and ValuesAtScopesUsers are
consistent during SCEV verification.

[clang][docs] Inclusive language: remove use of sanity check in option description

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D114562

[NFC][Clang]Inclusive language: Replace uses of whitelist in clang/test

[memprof] Disable pedantic warnings, suppress variadic macro warning.

The memprof unit tests use an older version of gmock (included in the
repo) which does not build cleanly with -pedantic:
https://github.com/google/googletest/issues/2650
For now just silence the warning by disabling pedantic and add the
appropriate flags for gcc and clang.

[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization

For a 1x1 weight and stride of 1, the input/weight can be reshaped and passed into a fully connected op then reshaped back

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D114757
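
The equivalence behind this canonicalization can be checked with a small pure-Python sketch. It assumes an NHWC input and a weight already squeezed from [OC, 1, 1, IC] to [OC, IC]; the function names are hypothetical and this is not the TOSA implementation.

```python
def conv2d_1x1(inp, weight, bias):
    """Direct 1x1, stride-1 convolution. inp: [N][H][W][IC], weight: [OC][IC]."""
    return [[[[sum(x * w for x, w in zip(px, wrow)) + b
               for wrow, b in zip(weight, bias)]
              for px in row] for row in img] for img in inp]


def conv2d_via_fully_connected(inp, weight, bias):
    n, h, w = len(inp), len(inp[0]), len(inp[0][0])
    # Reshape [N, H, W, IC] -> [N*H*W, IC].
    flat = [px for img in inp for row in img for px in row]
    # Fully connected: out[r][o] = dot(flat[r], weight[o]) + bias[o].
    fc = [[sum(x * wt for x, wt in zip(px, wrow)) + b
           for wrow, b in zip(weight, bias)] for px in flat]
    # Reshape [N*H*W, OC] back to [N, H, W, OC].
    rows = iter(fc)
    return [[[next(rows) for _ in range(w)] for _ in range(h)] for _ in range(n)]
```

Both functions produce the same tensor because a 1x1 kernel with stride 1 applies the same dot product independently at every pixel.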

fix inverted logic for HideUnrelatedOptions

It seems clearer to me that this would check for *any of* instead of
*all of* these option categories, as it looks to me like that was the
intent. But apparently this logic has always been inverted, and
possibly never fully used?

Differential Revision: https://reviews.llvm.org/D114572
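
The "any of" behaviour described above can be sketched like this. The data layout (a dict from option name to its categories) and the function name are assumptions for illustration, not the `llvm::cl` API.

```python
def hide_unrelated(options, keep_categories):
    """Return the names of options that belong to none of the kept categories."""
    keep = set(keep_categories)
    # An option stays visible if ANY of its categories is kept; requiring ALL
    # of them to match would wrongly hide multi-category options.
    return [name for name, cats in options.items() if not keep.intersection(cats)]
```

An option registered under both a kept and an unrelated category therefore remains visible.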

[libc][clang-tidy] fix namespace check for externals

Up until now, all references to `errno` were marked with `NOLINT`, since
it was technically calling an external function. This fixes the lint
rules so that `errno`, as well as `malloc`, `calloc`, `realloc`, and
`free` are all allowed to be called as external functions. All of the
relevant `NOLINT` comments have been removed, and the documentation has
been updated.

Reviewed By: sivachandra, lntue, aaron.ballman

Differential Revision: https://reviews.llvm.org/D113946

[memprof] Fix unit test build after refactoring shared header.

The memprof unittest also needs to include the MemProfData.inc header
directly to have access to MEMPROF_RAW_MAGIC and MEMPROF_RAW_VERSION
globals.

[ELF][PPC64] Remove unneeded PPC64PCRelLongBranchThunk

This reverts the PPC64PCRelLongBranchThunk part from D86706.
PPC64PCRelLongBranchThunk is the same as PPC64R12SetupStub.

Use `__gep_setup_` instead of `__long_branch_pcrel_` for the stub symbol name
as it more closely indicates the operation.
(Note: GNU ld uses `*.long_branch.*` and `*.plt_branch.*`).

Reviewed By: NeHuang, nemanjai

Differential Revision: https://reviews.llvm.org/D114656

[lldb] Fix indentation in builders/darwin.py

[lldb] Search PrivateFrameworks when using an internal SDK

Make sure to add the PrivateFrameworks directory to the frameworks path
when using an internal SDK. This is necessary for the "on-device" test
suite.

rdar://84519268

Differential revision: https://reviews.llvm.org/D114742

[InstSimplify] add logic fold for 'or'

https://alive2.llvm.org/ce/z/4PaPDy

There's a related fold where the inner 'or' is replaced by 'and',
but that needs to be more careful about matching a 'not'.

[InstSimplify] reduce code duplication for 'or' logic folds; NFC

[InstSimplify] make 'or' test names more descriptive; NFC

Also, vary the types in a couple of tests for better coverage.

[ELF] Change -z unknown from error to warning

There is a trend of having more optional options (usually security
hardening related) like -z cet-report=, -z bti-report=, -z force-bti.
If ld.lld 14.0.0 uses a warning, in 15/16/17/... timeframe when people
add new options to software, they can worry less about linker errors on ld.lld 14.0.0.

In some cases `-z foo` does essential work where a silent ignore can be
problematic, but the user has received a warning. From my observation,
`-z foo` options that do essential work are far fewer than the converse. In addition,
the user who cares can use `--fatal-warnings` (Note: GNU ld doesn't upgrade warnings to errors).
It is unclear whether we need something like `clang -Wunknown-warning-option`.

If we ever run into unfortunate transition like `-z start-stop-gc`, the
affected software (e.g. ldc is a compiler which passes linker options to the underlying ld)
can blindly add the `-z` option, without worrying it may cause a linker error to LLD 14.0.0.

Reviewed By: jrtc27, peter.smith

Differential Revision: https://reviews.llvm.org/D114748

[gn build] Port 7cca33b40f77

[memprof] Extend llvm-profdata to display MemProf profile summaries.

This commit adds initial support to llvm-profdata to read and print
summaries of raw memprof profiles.
Summary of changes:
* Refactor shared defs to MemProfData.inc
* Extend show_main to display memprof profile summaries.
* Add a simple raw memprof profile reader.
* Add a couple of tests to tools/llvm-profdata.

Differential Revision: https://reviews.llvm.org/D114286

[flang] Address TODO from previous changes to IsSaved()

An earlier fix to evaluate::IsSaved() needed to preserve its
treatment of named constants in modules and main programs -- i.e.
they would appear to be saved -- until a correction was added
to the lowering code. This TODO can now be resolved.

Differential Revision: https://reviews.llvm.org/D114756

Typo fix

[SLP]Improve isFixedVectorShuffle and its use.

Extended support for undefined source vector/extract indices/non-fixed
vector types, also no need to check for the parent of the extractelement
instructions with the constant indices.

Differential Revision: https://reviews.llvm.org/D114121

[InstSimplify] reduce code duplication for 'or' logic fold; NFC

[InstSimplify] adjust tests for 'or' of logic ops; NFC

Half of the tests had an extra instruction so were not testing the minimal patterns.

[InstSimplify] refactor 'or' logic folds; NFC

Reduce duplication for handling the top-level commuted operands.
There are several other folds that should be moved in here, but
we need to make sure there's good test coverage.

[InstSimplify] add tests for 'or' with logic ops; NFC

The code for these transforms can be refactored,
but the existing tests are incomplete.

[InstSimplify] add tests for 'or' logic folds; NFC

The tests are adapted from the xor patterns used with:
892648b18a8c
b326c058146f

[SLP][NFC]Move static function to make it visible in member function,
NFC.

Revert "Use VersionTuple for parsing versions in Triple. This makes it possible to distinguish between "16" and "16.0" after parsing, which previously was not possible."

This reverts commit 1e8286467036d8ef1a972de723f805a4981b2692.

llvm/test/Transforms/LoopStrengthReduce/X86/2009-11-10-LSRCrash.ll fails
with assertion failure:

llc: /home/nikic/llvm-project/llvm/include/llvm/ADT/Optional.h:196: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() & [with T = unsigned int]: Assertion `hasVal' failed.
...
#8 0x00005633843af5cb llvm::MCStreamer::emitVersionForTarget(llvm::Triple const&, llvm::VersionTuple const&)
#9 0x0000563383b47f14 llvm::AsmPrinter::doInitialization(llvm::Module&)

[RegionPass] Added check for -filter-print-funcs option to the region IR dumps.

Differential Revision: https://reviews.llvm.org/D114310

[SCEV] Track and invalidate ValuesAtScopes users

ValuesAtScopes maps a SCEV and a Loop to another SCEV. While we
invalidate entries if the left-hand SCEV is invalidated, we
currently don't do this for the right-hand SCEV. Fix this by
tracking users in a reverse map and using it for invalidation.

This is conceptually the same change as D114738, but using the
reverse map to avoid performance issues.

Differential Revision: https://reviews.llvm.org/D114788
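
A minimal sketch of the reverse-map idea: alongside the forward cache, keep a map from each cached result back to the entries that produced it, so invalidating an expression clears both the entries it keys and the entries it appears in as a value. The class and method names are hypothetical; this does not mirror ScalarEvolution's actual data structures.

```python
from collections import defaultdict


class ValuesAtScopesSketch:
    def __init__(self):
        self.values = defaultdict(dict)  # expr -> {loop: value_at_scope}
        self.users = defaultdict(set)    # value_at_scope -> {(loop, expr)}

    def record(self, expr, loop, value):
        self.values[expr][loop] = value
        self.users[value].add((loop, expr))

    def forget(self, expr):
        # Entries keyed by expr (the left-hand side).
        for loop, value in self.values.pop(expr, {}).items():
            self.users[value].discard((loop, expr))
        # Entries whose cached result is expr (the right-hand side).
        for loop, user in self.users.pop(expr, set()):
            self.values.get(user, {}).pop(loop, None)
```

The reverse map makes the right-hand invalidation proportional to the number of users rather than to the size of the whole cache.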

[JITLink][ELF] Don't skip sections of size 0

Size 0 sections can have symbols that have size 0. Build those sections
and symbols into the LinkGraph so they can be used properly if needed.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D114749

[JITLink][ELF] Add support for reading extended table

Add support for reading extended table in ELF object file. This allows
JITLink to support ELF object files with many sections.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D114747

[CSSPGO] Sorting nodes in a cycle of profiled call graph.

For nodes that are in a cycle of a profiled call graph, the current order the underlying scc_iter computes purely depends on how those nodes are reached from outside the SCC and inside the SCC, based on the Tarjan algorithm. This does not honor profile edge hotness, and thus does not guarantee that hot callsites are inlined prior to cold callsites. To mitigate that, I'm adding an extra sorter on top of scc_iter to sort SCC functions in the order of callsite hotness, instead of changing the internals of scc_iter.

Sorting on callsite hotness can be optimally based on detecting cycles on a directed call graph, i.e., removing the coldest edge until a cycle is broken. However, detecting cycles isn't cheap. I'm using an MST-based approach which is faster and appears to deliver some performance wins.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D114204
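
The general idea of an MST-based approach can be sketched with Kruskal's algorithm run on descending edge weights: hot edges are kept first, and an edge is dropped when it would close a cycle in the undirected view of the graph. This is an illustration of the technique under that assumption, with hypothetical names; it is not the code from the patch.

```python
def keep_hottest_acyclic_edges(nodes, edges):
    """edges: list of (caller, callee, weight). Returns kept edges, hottest first."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    kept = []
    for src, dst, weight in sorted(edges, key=lambda e: e[2], reverse=True):
        a, b = find(src), find(dst)
        if a != b:  # adding this edge does not close an (undirected) cycle
            parent[a] = b
            kept.append((src, dst, weight))
    return kept
```

For a three-node cycle with weights 100, 50, and 1, the coldest edge is the one discarded, so the remaining edges give an order that favors the hot callsites.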

[PS4][DWARF] Explicitly set default DWARF version to 4

[clang][dataflow] Make header parse

Looks like this is actually dead code?

[LV] Remove unneeded cast to Operator [NFC]

Fix file extension of alignment-assumption-ignorelist.cppp test

During the renaming of blacklist to ignorelist this test got renamed
incorrectly.

Differential revision: https://reviews.llvm.org/D114710

Code quality: Combine V_RSQ

Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel.

Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703

Revert "[fir] Add fir reduction builder"

This reverts commit cf3422d3df5b00d771bba837b9f51f67ab07eb64.

This fails on some buildbots

[fir] Remove unused fct recordTypeCanBeMemCopied

Remove unused fct added with 47f759309eeaf9bd77debe4f6c3e1fe52913b537

[mlir][linalg] Add decompose to CodegenStrategy.

Add the decompose patterns that lower higher dimensional convolutions to lower dimensional ones to CodegenStrategy and use CodegenStrategy to test the decompose patterns. Additionally, remove the assertion that checks the anchor op name is set in the CodegenStrategyTest pass. Removing the assertion allows us to simplify the pipelines used in the interchange and decompose tests.

Depends On D114797

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114798

[mlir][linalg] Adapt the decompose patterns to use a filter (NFC).

The revision updates the convolution decomposition patterns to take a linalg transformation filter. In a later revision, the transformation filter allows the patterns to be used from CodegenStrategy.

Depends On D114690

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114797

Use VersionTuple for parsing versions in Triple. This makes it possible to distinguish between "16" and "16.0" after parsing, which previously was not possible.

See also https://github.com/android/ndk/issues/1455.

Differential Revision: https://reviews.llvm.org/D114163
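
The distinction between "16" and "16.0" comes down to keeping the minor component optional instead of defaulting it to zero. A small sketch, with a hypothetical `Version` type and `parse_version` helper that only handle the major and minor parts:

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: Optional[int] = None  # None means "not written", which differs from 0


def parse_version(text: str) -> Version:
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else None
    return Version(major, minor)
```

A parser that returns plain integers would map both strings to (16, 0) and lose the information about what was actually written.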

[DSE] Use optimized access if available for redundant store elimination.

Using the optimized access enables additional optimizations in cases
where the defining access is a non-aliasing store.

Alternatively we could also walk upwards and skip non-aliasing defs
here, but my experiments so far showed that this will noticeably
increase compile-time for little extra gain compared to just using the
optimized access.

Improvements of dse.NumRedundantStores on MultiSource/CINT2006/CFP2006
on X86 with -O3:

     test-suite...-typeset/consumer-typeset.test     1.00                  76.00              7500.0%
     test-suite.../Benchmarks/Bullet/bullet.test     3.00                  12.00              300.0%
     test-suite...006/453.povray/453.povray.test     3.00                   6.00              100.0%
     test-suite...telecomm-gsm/telecomm-gsm.test     1.00                   2.00              100.0%
     test-suite...ediabench/gsm/toast/toast.test     1.00                   2.00              100.0%
     test-suite...marks/7zip/7zip-benchmark.test     1.00                   2.00              100.0%
     test-suite...ications/JM/lencod/lencod.test     7.00                  10.00              42.9%
     test-suite...6/464.h264ref/464.h264ref.test     6.00                   8.00              33.3%
     test-suite...ications/JM/ldecod/ldecod.test     6.00                   7.00              16.7%
     test-suite...006/447.dealII/447.dealII.test    33.00                  33.00               0.0%
     test-suite...6/471.omnetpp/471.omnetpp.test    NaN                     1.00               nan%
     test-suite...006/450.soplex/450.soplex.test    NaN                     2.00               nan%
     test-suite.../CINT2006/403.gcc/403.gcc.test    NaN                     7.00               nan%
     test-suite...lications/ClamAV/clamscan.test    NaN                     1.00               nan%
     test-suite...CI_Purple/SMG2000/smg2000.test    NaN                     3.00               nan%

Follow-up to D111727.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112315

[mlir][linalg] Support the empty anchor op string when padding.

Add support for an empty anchor op string in vectorization. An empty anchor op string is useful after fusion when there are multiple different operations to vectorize.

Depends On D114689

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114690

[clang][dataflow] Fix broken build in ClangStaticAnalyzer

Adds a missing virtual destructor.

[mlir][linalg] Use top down traversal for padding.

Pad the operation using a top down traversal. The top down traversal unlocks folding opportunities and dim op canonicalizations due to the introduced extract slice operation after the padded operation.

Depends On D114585

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114689

[DAG] Create fptosi.sat from clamped fptosi

This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi has less strict semantics
than fptosi_sat, with fewer values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
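
The fold relies on a clamped conversion agreeing with a saturating conversion wherever the plain conversion is defined. The sketch below models both in Python for a signed result of `bits` width; the function names are hypothetical, and `clamped_fptosi` only accepts finite inputs, standing in for an fptosi performed at a wider integer width.

```python
import math


def fptosi_sat(x: float, bits: int) -> int:
    """Saturating float-to-signed-int: NaN gives 0, out-of-range values clamp."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x <= lo:
        return lo
    if x >= hi:
        return hi
    return math.trunc(x)


def clamped_fptosi(x: float, bits: int) -> int:
    """smin(smax(fptosi(x), MIN), MAX) for finite x that fits the wider type."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    wide = math.trunc(x)  # models fptosi at a wider width
    return min(max(wide, lo), hi)
```

The saturating form additionally defines results for NaN and infinities, which is the sense in which its semantics are stricter than those of the original fptosi.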

[mlir][linalg] Fix windows build issue in hoist padding.

Iterating backwardSlice and removing elements at the same time can fail on windows for specific build configurations (the code was introduced in https://reviews.llvm.org/D114420). This revision introduces a second vector to collect all operations and removes them after finishing the reverse iteration.

Reviewed By: hpmorgan

Differential Revision: https://reviews.llvm.org/D114775
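
The fix pattern, collecting the elements to delete during iteration and removing them afterwards, can be sketched as follows. The function is hypothetical and assumes the container holds unique elements, as a set-like vector would.

```python
def remove_matching(ops, should_remove):
    """Remove matching elements without mutating the list while iterating it."""
    to_remove = []
    for op in reversed(ops):  # first pass: read-only reverse iteration
        if should_remove(op):
            to_remove.append(op)
    for op in to_remove:      # second pass: mutate after iteration is finished
        ops.remove(op)
    return ops
```

Separating the two passes keeps iterators valid regardless of how the underlying container reacts to erasure.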

[OpenMP] Add RTL function to externalization RAII

This patch adds the `__kmpc_get_warp_size` OpenMP RTL function to the
externalization RAII struct. This was getting optimized out and then
being replaced with an undefined value once added back in, causing bugs
for complex reductions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D114802

[mlir][linalg] Run CSE after every CodegenStrategy transformation.

Add CSE after every transformation. Transformations such as tiling introduce redundant computation, for example, one AffineMinOp for every operand dimension pair. Follow up transformations such as Padding and Hoisting benefit from CSE since comparing slice sizes simplifies to comparing SSA values instead of analyzing affine expressions.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114585

[lld-macho] Mark dylib symbols coming from -weak_framework as weak-ref.

PR:52564

Differential Revision: https://reviews.llvm.org/D114397

[fir] Add fir reduction builder

This patch introduces a bunch of builder functions
to create function calls to runtime reduction functions.

This patch is part of the upstreaming effort from fir-dev branch.

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>
Differential Revision: https://reviews.llvm.org/D114460

Reviewed By: awarzynski