review.tizen.org Git - platform/upstream/llvm.git/log

Keith Smiley [Sat, 13 Nov 2021 01:24:26 +0000 (17:24 -0800)]

[llvm-objcopy][MachO] Add error for MH_PRELOAD

Previously this would crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=51877

Differential Revision: https://reviews.llvm.org/D113819

Vy Nguyen [Tue, 9 Nov 2021 00:50:34 +0000 (19:50 -0500)]

[lld-macho] Allow exporting weak_def_can_be_hidden (AKA "autohide") symbols

autohide symbols behave similarly to private_extern symbols.
However, LD64 allows exporting autohide symbols. LLD currently does not.
This patch allows LLD to export them.

Differential Revision: https://reviews.llvm.org/D113167

Matheus Izvekov [Fri, 12 Nov 2021 23:40:18 +0000 (00:40 +0100)]

[clang] retain type sugar in auto / template argument deduction

This implements the following changes:
* AutoType retains sugared deduced-as-type.
* Template argument deduction machinery analyses the sugared type all the way
down. It would previously lose the sugar on first recursion.
* Undeduced AutoType will be properly canonicalized, including the constraint
template arguments.
* Remove the decltype node created from the decltype(auto) deduction.

As a result, we start seeing sugared types in a lot more test cases,
including some which showed very unfriendly `type-parameter-*-*` types.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D110216

Phoebe Wang [Sat, 13 Nov 2021 01:24:05 +0000 (09:24 +0800)]

[X86][ABI] Change the alignment of f80 in the 32-bit calling convention to match the data layout in use

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113739

Vitaly Buka [Sat, 13 Nov 2021 01:56:19 +0000 (17:56 -0800)]

[asan] More leaks in test

It fails to detect a single leak with GLIBC 2.34.

Vy Nguyen [Sat, 13 Nov 2021 01:26:30 +0000 (20:26 -0500)]

[lld-macho] Parallelize scanning the symbol tables in export/unexport-ing.

(Split from D113167)
Benchmarking on one of our large apps which exports a few thousand symbols,
this showed an improvement of ~17%.

x ./LLD_no_parallel.txt
+ ./LLD_with_parallel.txt

    N           Min           Max        Median           Avg        Stddev
x  10         84.01         89.41         88.64        87.693     1.7424061
+  10          71.9         74.29         72.63        72.753    0.77734663
Difference at 95.0% confidence
-14.94 +/- 1.26763
-17.0367% +/- 1.44553%
(Student's t, pooled s = 1.34912)

(wallclock)

Differential Revision: https://reviews.llvm.org/D113820

Vitaly Buka [Sat, 13 Nov 2021 01:41:50 +0000 (17:41 -0800)]

[asan] Fix "no matching function" on GCC

Nico Weber [Sat, 13 Nov 2021 01:09:01 +0000 (20:09 -0500)]

[gn build] (semi-manually) port cb0e14ce6dcd

Vitaly Buka [Sat, 13 Nov 2021 00:52:25 +0000 (16:52 -0800)]

[sanitizer] Fix test linking

Ben Langmuir [Fri, 12 Nov 2021 22:45:02 +0000 (14:45 -0800)]

[ORC][ORC-RT] Register type metadata from __swift5_types MachO section

Similar to how the other swift sections are registered by the ORC
runtime's macho platform, add the __swift5_types section, which contains
type metadata. Add a simple test that demonstrates that the swift
runtime recognized the registered types.

rdar://85358530

Differential Revision: https://reviews.llvm.org/D113811

Craig Topper [Sat, 13 Nov 2021 00:26:13 +0000 (16:26 -0800)]

[RISCV] Fixed duplicate RUN line on float-intrinsics.ll. NFC

We had two identical RV64I RUN lines. One should be RV32I.

Josh Learn [Sat, 13 Nov 2021 00:17:18 +0000 (16:17 -0800)]

[clang][objc][codegen] Skip emitting ObjC category metadata when the
category is empty

Currently, if we create a category in ObjC that is empty, we still emit
runtime metadata for that category. This is a scenario that could
commonly be run into when using __attribute__((objc_direct_members)),
which elides the need for much of the category metadata. This is
slightly wasteful and can be easily skipped by checking the category
metadata contents during CodeGen.

rdar://66177182

Differential Revision: https://reviews.llvm.org/D113455

Vitaly Buka [Thu, 11 Nov 2021 02:17:20 +0000 (18:17 -0800)]

[sanitizer] Switch dlsym hack to internal_allocator

Since glibc 2.34, dlsym does
  1. malloc 1
  2. malloc 2
  3. free pointer from malloc 1
  4. free pointer from malloc 2
This sequence was not handled by the trivial dlsym hack.

This fixes https://bugs.llvm.org/show_bug.cgi?id=52278

Reviewed By: eugenis, morehouse

Differential Revision: https://reviews.llvm.org/D112588

Philip Reames [Fri, 12 Nov 2021 23:00:39 +0000 (15:00 -0800)]

[runtime-unroll] Use incrementing IVs instead of decrementing ones

This is one of those wonderful "in theory X doesn't matter, but in practice it does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.

Why does this matter?  A couple of reasons:
* SCEV doesn't have a native subtract node.  Instead, all subtracts (A - B) are represented as A + -1 * B and drop any flags invalidated by such.  As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones.  (You can see this in the inferred flags in some of the test cases.)
* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language.  We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced.  (You can see this looking at nearby phis in the test cases.)

Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.

* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simply use the original IV with a changed start value.  We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
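
As a rough source-level illustration only (this is not the IR the unroller emits), the two functions below clamp a loop to `extra` iterations, first with a decrementing counter and then with an incrementing one. The names `sumDecrementing`, `sumIncrementing`, and `extra` are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Decrementing clamp: count down from `extra` to zero (a zero-tested loop).
int sumDecrementing(const int *data, std::size_t extra) {
  int sum = 0;
  for (std::size_t remaining = extra; remaining != 0; --remaining)
    sum += *data++;
  return sum;
}

// Incrementing clamp: count up from zero to `extra`, which matches the
// shape of the IVs that idiomatic source loops usually produce.
int sumIncrementing(const int *data, std::size_t extra) {
  int sum = 0;
  for (std::size_t iv = 0; iv != extra; ++iv)
    sum += data[iv];
  return sum;
}
```

Both forms execute the same number of iterations; the difference is only in which direction the counter moves, which is the property the commit message says matters to SCEV and to IV coalescing.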

Lawrence D'Anna [Fri, 12 Nov 2021 23:38:35 +0000 (15:38 -0800)]

[lldb] temporarily disable TestPaths.test_interpreter_info on windows

I'm disabling this test until the fix is reviewed
(here https://reviews.llvm.org/D113650/)

Craig Topper [Fri, 12 Nov 2021 21:20:20 +0000 (13:20 -0800)]

[RISCV] Improve codegen for i32 udiv/urem by constant on RV64.

The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.

Since these constants are often used by a multiply with a LHS
that needs to be zero extended with an AND, we can switch the MUL
to a MULHU by shifting both inputs left by 32. Once we shift the
constant left, the upper 32 bits no longer need to be 0 so constant
materialization is free to use LUI+ADDIW. This reduces the constant
materialization from 4 instructions to 3 in some cases while also
reducing the zero extend of the LHS from 2 shifts to 1.
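
A minimal sketch of the arithmetic behind this, using division by 3 as the example. The helper names (`mulhu`, `udiv3Mul`, `udiv3Mulhu`) are hypothetical, and `unsigned __int128` is a GCC/Clang extension used here only to model the high half of a 64x64-bit product; the actual instruction selection in the patch is not shown.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of an unsigned 64x64-bit product, i.e. what MULHU computes.
uint64_t mulhu(uint64_t a, uint64_t b) {
  return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
}

// Baseline: zero-extend x, multiply by the magic constant, then shift.
// 0xAAAAAAAB == ceil(2^33 / 3); it fits in uimm32 but not in simm32.
uint32_t udiv3Mul(uint32_t x) {
  const uint64_t magic = 0xAAAAAAABull;
  return static_cast<uint32_t>((static_cast<uint64_t>(x) * magic) >> 33);
}

// MULHU form: both operands are shifted left by 32. The shift on x discards
// whatever was in its upper 32 bits, so no separate zero-extension is needed,
// and the shifted constant no longer needs zeroed upper bits.
uint32_t udiv3Mulhu(uint64_t xWithArbitraryUpperBits) {
  const uint64_t magicShifted = 0xAAAAAAABull << 32;
  return static_cast<uint32_t>(
      mulhu(xWithArbitraryUpperBits << 32, magicShifted) >> 33);
}
```

The high half of `(x << 32) * (c << 32)` is exactly the 64-bit product `x * c`, so both functions compute the same quotient.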

Differential Revision: https://reviews.llvm.org/D113805

Duncan P. N. Exon Smith [Fri, 12 Nov 2021 22:22:00 +0000 (14:22 -0800)]

lld: const-qualify iterations through VarStreamArray, NFC

No functionality change here; just unblocking a patch to LLVM.

Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:07:14 +0000 (18:07 -0800)]

IR: Fix const-correctness of SwitchInst::CaseIterator and CaseHandle

Fix some confusion between the two types of `const` a pointer/iterator
can have. Users of a SwitchInst::CaseIterator should not (and do not!)
manually mutate the SwitchInst::CaseHandle that tracks its internal
state. Change operator*() to return `const CaseHandle&`, remove the
non-const-qualified operator*(), and const-qualify
CaseHandle::setValue() and CaseHandle::setSuccessor().

Differential Revision: https://reviews.llvm.org/D113788

Duncan P. N. Exon Smith [Fri, 12 Nov 2021 21:50:29 +0000 (13:50 -0800)]

ADT: Avoid repeating iterator adaptor/facade template params, NFC

Take advantage of class name injection to avoid redundantly specifying
template parameters of iterator adaptor/facade base classes.

No functionality change, although the private typedefs changed in a
couple of cases.

  - Added a private typedef HashTableIterator::BaseT, following the
    pattern from r207084 / 3478d4b164e8d3eba01f5bfa3fc5bfb287a78b97, to
    pre-emptively appease MSVC (maybe it's not necessary anymore but
    looks like we do this pretty consistently). Otherwise, I removed
    private
  - Removed private typedefs filter_iterator_impl::BaseT and
    FilterIteratorTest::InputIterator::BaseT since there was only one
    use of each and the definition was no longer interesting.
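
A small sketch of the language feature involved, with made-up names (`facade_base` and `CountingIterator` are stand-ins, not the LLVM classes). Inside a class derived from a template specialization, the base's injected class name already refers to that specialization, so the template arguments need not be repeated.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a facade base with several template parameters.
template <typename DerivedT, typename ValueT, typename DiffT = long>
struct facade_base {
  using value_type = ValueT;
};

struct CountingIterator : facade_base<CountingIterator, int> {
  // Verbose spelling: the template arguments are repeated.
  using VerboseBaseT = facade_base<CountingIterator, int>;
  // Short spelling: the injected class name inherited from the base
  // names the same specialization.
  using BaseT = facade_base;
};
```

Note that this short spelling relies on the base being non-dependent; in a class template with a dependent base, the name would need extra qualification.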

Alexey Bataev [Fri, 12 Nov 2021 21:45:38 +0000 (13:45 -0800)]

[SLP][NFC] Add a test for vector intrinsic with scalar parameter.

Félix Cloutier [Thu, 11 Nov 2021 02:03:36 +0000 (18:03 -0800)]

format_arg attribute does not support nullable instancetype return type

* The format_arg attribute tells the compiler that the attributed function
  returns a format string that is compatible with a format string that is being
  passed as a specific argument.
* Several NSString methods return copies of their input, so they would ideally
  have the format_arg attribute. A previous differential (D112670) added
  support for instancetype methods having the format_arg attribute when used
  in the context of NSString method declarations.
* D112670 failed to account for the fact that instancetype can be sugared in certain narrow
  (but critical) scenarios, like by using nullability specifiers. This patch
  resolves this problem.

Differential Revision: https://reviews.llvm.org/D113636
Reviewed By: ahatanak

Radar-Id: rdar://85278860

David Tenty [Fri, 12 Nov 2021 21:27:57 +0000 (16:27 -0500)]

[libcxx][AIX] XFAIL tests enabled by locale.fr_FR.UTF-8

We missed the tests in the earlier XFAIL-ing because the locale.fr_FR.UTF-8
feature wasn't available, but since an upgrade these are now showing up
on the CI.

Differential Revision: https://reviews.llvm.org/D113791

Mogball [Fri, 12 Nov 2021 01:17:05 +0000 (01:17 +0000)]

[mlir][ods] Cleanup of Class Codegen helper

Depends on D113331

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D113714

Peter Klausler [Thu, 11 Nov 2021 20:36:15 +0000 (12:36 -0800)]

[flang] Handle ENTRY names in IsPureProcedure() predicate

Fortran defines an ENTRY point name as being pure if its enclosing
subprogram scope defines a pure procedure.

Differential Revision: https://reviews.llvm.org/D113711

Mogball [Fri, 12 Nov 2021 21:17:38 +0000 (21:17 +0000)]

[mlir][ods] DialectAsmPrinter -> AsmPrinter in comments

Vitaly Buka [Fri, 12 Nov 2021 21:00:54 +0000 (13:00 -0800)]

[asan] Fix GCC warning "left shift count >= width"

Fixes PR52385

Jez Ng [Fri, 12 Nov 2021 20:59:07 +0000 (15:59 -0500)]

[lld-macho] Fix symbol relocs handling for LSDAs

Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA
relocs as section relocs, but ld -r can turn those relocs into symbol
ones.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D113721

Jez Ng [Fri, 12 Nov 2021 21:01:25 +0000 (16:01 -0500)]

[lld-macho] Teach ICF to dedup functions with identical unwind info

Dedup'ing unwind info is tricky because each CUE contains a different
function address. If ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.
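
The idea can be sketched as follows. This is not LLD's code: `CompactUnwindEntry` and `sameUnwindInfo` are hypothetical names, and the struct assumes the 64-bit compact unwind entry layout (an 8-byte function address followed by length, encoding, personality, and LSDA fields). In a real object file the pointer fields are filled in by relocations, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed 64-bit compact unwind entry layout (32 bytes, no padding).
struct CompactUnwindEntry {
  uint64_t functionAddress;
  uint32_t functionLength;
  uint32_t encoding;
  uint64_t personality;
  uint64_t lsda;
};

// Compare two entries while skipping the leading function address, so
// entries that differ only in which function they describe compare equal.
bool sameUnwindInfo(const CompactUnwindEntry &a, const CompactUnwindEntry &b) {
  const std::size_t skip = offsetof(CompactUnwindEntry, functionLength);
  const auto *pa = reinterpret_cast<const unsigned char *>(&a) + skip;
  const auto *pb = reinterpret_cast<const unsigned char *>(&b) + skip;
  return std::memcmp(pa, pb, sizeof(CompactUnwindEntry) - skip) == 0;
}
```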

Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:

Without any optimizations:

             base           diff           difference (95% CI)
  sys_time   0.849 ± 0.015  0.896 ± 0.012  [  +4.8% ..   +6.2%]
  user_time  3.357 ± 0.030  3.512 ± 0.023  [  +4.3% ..   +5.0%]
  wall_time  3.944 ± 0.039  4.032 ± 0.031  [  +1.8% ..   +2.6%]
  samples    40             38

With `-dead_strip`:

             base           diff           difference (95% CI)
  sys_time   0.847 ± 0.010  0.896 ± 0.012  [  +5.2% ..   +6.5%]
  user_time  3.377 ± 0.014  3.532 ± 0.015  [  +4.4% ..   +4.8%]
  wall_time  3.962 ± 0.024  4.060 ± 0.030  [  +2.1% ..   +2.8%]
  samples    47             30

With `-dead_strip` and `--icf=all`:

             base           diff           difference (95% CI)
  sys_time   0.935 ± 0.013  0.957 ± 0.018  [  +1.5% ..   +3.2%]
  user_time  3.472 ± 0.022  6.531 ± 0.046  [ +87.6% ..  +88.7%]
  wall_time  4.080 ± 0.040  5.329 ± 0.060  [ +30.0% ..  +31.2%]
  samples    37             30

Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.

Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D109946

Sanjay Patel [Fri, 12 Nov 2021 20:49:46 +0000 (15:49 -0500)]

[AArch64][x86] add tests for swapped cmp+vselect patterns; NFC

These patterns were noted in the recent D113212 and follow-ups.
I did not bother to duplicate every test because it should be
clear if we recognize the swaps from a smaller sample. We have
complete coverage for the original patterns.

wlei [Tue, 9 Nov 2021 07:05:16 +0000 (23:05 -0800)]

[llvm-profgen] Fix bug of setting function entry

Previously we set the `isFuncEntry` flag to true when the function name from DWARF was equal to the name in the symbol table, and we used this flag to skip reporting callsite samples that come from an intra-function branch. However, in HHVM, it appears that the symbol table name is inconsistent with the DWARF function name, likely due to `OptimizeGlobalAliases`.

This change is a workaround on the llvm-profgen side: it marks exactly one range as the function entry and adds warnings for the remaining inconsistencies.

This also fixes a missing `getCanonicalFnName` call for the symbol name, which caused mismatches as well.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D113492

Aaron Puchert [Thu, 11 Nov 2021 20:44:20 +0000 (21:44 +0100)]

Comment Sema: Make most of CommentSema private (NFC)

We only need to expose setDecl, copyArray and the actOn* methods.

Aaron Puchert [Fri, 12 Nov 2021 20:09:40 +0000 (21:09 +0100)]

Comment AST: Recognize function-like objects via return type (NFC)

Instead of pretending that function pointer type aliases or variables
are functions, and thereby losing the information that they are type
aliases or variables, respectively, we use the existence of a return
type in the DeclInfo to signify a "function-like" object.

That seems pretty natural, since it's also the return type (or parameter
list) from the DeclInfo that we compare the documentation with.

Addresses a concern voiced in D111264#3115104.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D113691

Aaron Puchert [Fri, 12 Nov 2021 20:09:16 +0000 (21:09 +0100)]

Comment AST: Find out if function is variadic in DeclInfo::fill

Then we don't have to look into the declaration again. Also it's only
natural to collect this information alongside parameters and return
type, as it's also just a parameter in some sense.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D113690

Peter Hawkins [Fri, 12 Nov 2021 20:02:18 +0000 (12:02 -0800)]

Don't define //mlir:MLIRBindingsPythonCore in terms of the NoCAPI and CAPIDeps targets.

We noticed that the library structure causes link ordering problems in Google's internal build. However, we don't think the problem is specific to Google's build; it can probably be reproduced anywhere with the right library structure.

In general splitting the Python bindings from their dependencies (the C API targets) creates the possibility that the two libraries might end up in the wrong order on the linker command line. We can avoid this problem happening by reverting the structure of the MLIRBindingsPythonCore to represent its dependencies in the usual way, rather than composing an incomplete `MLIRBindingsPythonCoreNoCAPI` target and their CAPI dependencies. It was probably a mistake to rewrite this particular `cc_library()` rule in terms of the two, since nothing guarantees that the two will be correctly ordered by the linker when both are being linked into the same binary, and it was only an incidental "cleanup" done in passing.

Otherwise the previous PR (D113565) is fine, since that was about the case where both are being built into two separate shared libraries. It just shouldn't have made this (unrelated) change.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D113773

Jez Ng [Fri, 12 Nov 2021 20:00:51 +0000 (15:00 -0500)]

[reland][lld-macho] Fix symbol relocs handling for compact unwind's functionAddress

Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D113702

Jacques Pienaar [Fri, 12 Nov 2021 19:46:14 +0000 (11:46 -0800)]

[mlir][shape] Add value_as_shape op

This op was part of the very first discussion here, but it was not upstreamed
before because we didn't use it yet. Fix that for pending updates. This just adds
the op; a follow-up will add the lowering to codegen.

Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:51:31 +0000 (18:51 -0800)]

Sema: const-qualify ParsedAttr::iterator::operator*()

`const`-qualify ParsedAttr::iterator::operator*(), clearing up confusion
about the two meanings of const for pointers/iterators. Helps unblock
removal of (non-const) iterator_facade_base::operator->().

Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:20:52 +0000 (18:20 -0800)]

IR: Avoid duplication of SwitchInst::findCaseValue(), NFC

Change the non-const version of findCaseValue() to forward to the const
version.
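
The forwarding pattern looks roughly like this. `CaseTable` and `find` are hypothetical stand-ins, not `SwitchInst` or `findCaseValue()`, and the lookup logic is deliberately trivial.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical container used only to show the forwarding pattern.
class CaseTable {
  std::vector<int> values;

public:
  explicit CaseTable(std::vector<int> v) : values(std::move(v)) {}

  // The const overload holds the one real copy of the search logic.
  const int *find(int key) const {
    for (const int &v : values)
      if (v == key)
        return &v;
    return nullptr;
  }

  // The non-const overload forwards to the const one and casts constness
  // away from the result, which is safe because *this is non-const here.
  int *find(int key) {
    return const_cast<int *>(static_cast<const CaseTable &>(*this).find(key));
  }
};
```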

Philip Reames [Fri, 12 Nov 2021 19:35:28 +0000 (11:35 -0800)]

[unroll] Keep unrolled iterations with initial iteration

The unrolling code was previously inserting new cloned blocks at the end of the function.  The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.

With unrolling, the general assumption is that a) the loop is reasonably hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting.  As such, placing Count-1 copies out of line is a fairly poor code placement choice.  We'd much rather fall through into the hot (non-exiting) path.  For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.

However, the real motivation for this change isn't performance.  It's readability and human understanding.  Having to jump around long distances in an IR file to trace an unrolled loop structure is error-prone and tedious.

Peter Klausler [Tue, 2 Nov 2021 20:01:38 +0000 (13:01 -0700)]

[flang] Runtime performance improvements to real formatted input

Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.

Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.

Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can have the
ability to acquire more than one input character at a time
and amortize overhead.

These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which is on par with Intel Fortran and much faster than
GNU Fortran.

Differential Revision: https://reviews.llvm.org/D113697

Keith Smiley [Wed, 10 Nov 2021 05:28:56 +0000 (21:28 -0800)]

[lld-macho] Fix trailing slash in oso_prefix

Previously if you passed `-oso_prefix path/to/foo/` with a trailing
slash at the end, using `real_path` would remove that slash, but that
slash is necessary to make sure OSO prefix paths end up as valid
relative paths instead of starting with `/`.
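
To see why the trailing slash matters, here is a tiny sketch of prefix stripping. `stripPrefix` is a hypothetical helper written for this illustration, not the function LLD uses.

```cpp
#include <cassert>
#include <string>

// Remove `prefix` from the front of `path` when it matches; otherwise
// return `path` unchanged.
std::string stripPrefix(const std::string &path, const std::string &prefix) {
  if (path.compare(0, prefix.size(), prefix) == 0)
    return path.substr(prefix.size());
  return path;
}
```

With the slash kept, `stripPrefix("path/to/foo/bar.o", "path/to/foo/")` yields `bar.o`. With the slash dropped, the same call with `"path/to/foo"` yields `/bar.o`, which looks like an absolute path.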

Differential Revision: https://reviews.llvm.org/D113541

Duncan P. N. Exon Smith [Thu, 4 Nov 2021 00:52:34 +0000 (17:52 -0700)]

ADT: Fix const-correctness of iterator adaptors

This fixes const-correctness of iterator adaptors, dropping non-`const`
overloads for `operator*()`.

Iterators, like the pointers that they generalize, have two types of
`const`.

The `const` qualifier on members indicates whether the iterator itself
can be changed. This is analogous to `int *const`.

The `const` qualifier on return values of `operator*()`, `operator[]()`,
and `operator->()` controls whether the pointed-to value can be
changed. This is analogous to `const int *`.

Since `operator*()` does not (in principle) change the iterator, then
there should only be one definition, which is `const`-qualified. E.g.,
iterators wrapping `int*` should look like:
```
int &operator*() const; // always const-qualified, no overloads
```

ba7a6b314fd14bb2c9ff5d3f4fe2b6525514cada changed `iterator_adaptor_base`
away from this to work around bugs in other iterator adaptors. That was
already reverted. This patch adds back its test, which combined
llvm::enumerate() and llvm::make_filter_range(), adds a test for
iterator_adaptor_base itself, and cleans up the `const`-ness of the
other iterator adaptors.

This also updates the documented requirements for
`iterator_facade_base`:
```
/// Old:
///   - const T &operator*() const;
///   - T &operator*();

/// New:
///   - T &operator*() const;
```
In a future commit we might also clean up `iterator_facade`'s overloads
of `operator->()` and `operator[]()`. These already (correctly) return
non-`const` proxies regardless of the iterator's `const` qualifier.
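
The two kinds of `const` can be seen in a minimal iterator. `IntPtrIterator` is a hypothetical class written for this sketch; it is not one of the LLVM adaptors.

```cpp
#include <cassert>

// Hypothetical minimal iterator wrapping an int*.
class IntPtrIterator {
  int *ptr;

public:
  explicit IntPtrIterator(int *p) : ptr(p) {}

  // Dereferencing does not modify the iterator, so a single const-qualified
  // overload is enough. It still returns a mutable reference, because the
  // pointee's constness is a separate question.
  int &operator*() const { return *ptr; }

  // Advancing changes the iterator itself, so it is not const-qualified.
  IntPtrIterator &operator++() {
    ++ptr;
    return *this;
  }
};
```

A `const IntPtrIterator` behaves like `int *const`: it cannot be advanced, but `*it = 5` still compiles and writes through to the underlying `int`.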

Differential Revision: https://reviews.llvm.org/D113158

Philip Reames [Fri, 12 Nov 2021 19:15:57 +0000 (11:15 -0800)]

(re-)Autogen one last unroll-and-jam test

This case was complicated because someone had added new non-autogened tests to an autogened file.  In particular, those new tests used two variables (%J and %j) which differed only in capitalization.  The auto-updater doesn't distinguish case, so this meant auto-gened versions of the new tests failed with non-obvious errors.

There are two key lessons here:
1) Please don't use two values which differ only in case.  This is problematic for automatic tooling, but is also hard to understand for a human.
2) Please DO NOT add new tests to an autogened test without running autogen again.  If autogen doesn't pass on your new test, put them in a separate file.

Peter Klausler [Wed, 10 Nov 2021 22:02:30 +0000 (14:02 -0800)]

[flang] Fix rounding edge case in F output editing

When an Fw.d output edit descriptor has a "d" value exactly
equal to the number of zeroes after the decimal point for a value
(e.g., 0.07 with F5.1), the Fw.d output editing code needs to
do the rounding itself to either 0.0 or 0.1 after performing
a conversion without rounding (to avoid 0.04999 rounding up twice).
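
The double-rounding hazard can be shown with plain integer arithmetic. This is only an illustration of the problem, not the runtime's algorithm; `roundTo` is a hypothetical helper, and values are represented as integer counts of 1e-5.

```cpp
#include <cassert>
#include <cstdint>

// Round `scaled` (a count of small units) to the nearest multiple of
// `factor` small units, rounding halves up, and return the count of
// the larger units.
uint64_t roundTo(uint64_t scaled, uint64_t factor) {
  return (scaled + factor / 2) / factor;
}
```

For 0.04999 (4999 units of 1e-5), rounding once to tenths gives `roundTo(4999, 10000) == 0`, i.e. 0.0. Rounding first to hundredths gives `roundTo(4999, 1000) == 5` (0.05), and rounding that to tenths gives `roundTo(5, 10) == 1`, i.e. 0.1, which is wrong. For 0.07, a single rounding `roundTo(7000, 10000) == 1` gives the expected 0.1.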

Differential Revision: https://reviews.llvm.org/D113698

Alfsonso Gregory [Fri, 12 Nov 2021 18:53:50 +0000 (13:53 -0500)]

[libc++][NFC] Resolve Python 2 FIXME

We don't use Python 2 anymore, so let us do the recommended fix instead
of using the workaround made for Python 2.

Differential Revision: https://reviews.llvm.org/D107715

Peter Klausler [Wed, 10 Nov 2021 19:55:46 +0000 (11:55 -0800)]

[flang] Respect NO_STOP_MESSAGE=1 in runtime

When an environment variable NO_STOP_MESSAGE=1 is set,
assume that STOP statements with a successful code
have QUIET=.TRUE.

Differential Revision: https://reviews.llvm.org/D113701

Lang Hames [Fri, 12 Nov 2021 18:28:38 +0000 (10:28 -0800)]

[ORC-RT][llvm-jitlink] Fix a buggy check in ORC-RT MachO TLV deregistration.

The check was failing because it was matching against the end of the range, not
the start.

This bug wasn't causing the ORC-RT MachO TLV regression test to fail because
we were only logging deallocation errors (including TLV deregistration errors)
and not actually returning a failure code. This commit updates llvm-jitlink to
report the errors properly.

Lang Hames [Fri, 12 Nov 2021 16:46:03 +0000 (08:46 -0800)]

[JITLink] Fix think-o in handwritten CWrapperFunctionResult -> Error converter.

We need to skip the length field when generating error strings.

No test case: This hand-hacked deserializer should be removed in the near future
once JITLink can use generic ORC APIs (including SPS and WrapperFunction).

Philip Reames [Fri, 12 Nov 2021 18:30:27 +0000 (10:30 -0800)]

Autogen a bunch of unrolling tests for ease of update

Peter Klausler [Wed, 10 Nov 2021 23:49:05 +0000 (15:49 -0800)]

[flang] Fix ORDER= argument to RESHAPE

The ORDER= argument to the transformational intrinsic function RESHAPE
was being misinterpreted in an inverted way that could be detected only
with 3-d or higher rank array. Fix in both folding and the runtime, and
extend tests.
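
A sketch of the intended ORDER= semantics on column-major storage may help. This is not flang's implementation; `reshape` is a hypothetical function that fills the result so that dimension `order[0]` varies fastest, with `order` 1-based as in Fortran and PAD= not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of RESHAPE(source, shape, ORDER=order) with column-major storage.
std::vector<int> reshape(const std::vector<int> &source,
                         const std::vector<std::size_t> &shape,
                         const std::vector<std::size_t> &order) {
  const std::size_t rank = shape.size();
  std::vector<std::size_t> stride(rank, 1), subscript(rank, 0);
  std::size_t total = 1;
  for (std::size_t d = 0; d < rank; ++d) {
    stride[d] = total; // column-major stride of dimension d
    total *= shape[d];
  }
  std::vector<int> result(total, 0);
  for (std::size_t k = 0; k < total && k < source.size(); ++k) {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < rank; ++d)
      offset += subscript[d] * stride[d];
    result[offset] = source[k];
    // Advance the subscripts in permuted order: order[0] varies fastest.
    for (std::size_t j = 0; j < rank; ++j) {
      const std::size_t d = order[j] - 1;
      if (++subscript[d] < shape[d])
        break;
      subscript[d] = 0;
    }
  }
  return result;
}
```

With rank 2, every permutation is its own inverse, so applying the inverse permutation by mistake gives the same answer. With rank 3, `order = {2, 3, 1}` and its inverse `{3, 1, 2}` produce different results, which is one plausible reading of why the bug showed up only at rank 3 or higher.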

Differential Revision: https://reviews.llvm.org/D113699

Florian Hahn [Fri, 12 Nov 2021 18:16:03 +0000 (18:16 +0000)]

[SCEV] Update SCEVLoopGuardRewriter to take SCEV -> SCEV map (NFC).

Split off refactoring from D113577 to reduce the diff. NFC as the new
interface will only be used in D113577.

Nawrin Sultana [Mon, 25 Oct 2021 19:01:56 +0000 (14:01 -0500)]

[OpenMP] Set default blocktime to 0 for hybrid cpu

Differential Revision: https://reviews.llvm.org/D113012

Quinn Pham [Thu, 11 Nov 2021 20:14:11 +0000 (14:14 -0600)]

[lldb][NFC] Inclusive language: rename m_master in ASTImporterDelegate

[NFC] As part of using inclusive language within the llvm project, this patch
replaces `m_master` in `ASTImporterDelegate` with `m_main`.

Reviewed By: teemperor, clayborg

Differential Revision: https://reviews.llvm.org/D113720

Simon Pilgrim [Fri, 12 Nov 2021 17:57:20 +0000 (17:57 +0000)]

[AMDGPU] Regenerate udiv.ll tests

Philip Reames [Fri, 12 Nov 2021 17:48:20 +0000 (09:48 -0800)]

Refresh an autogen test to reduce spurious diffs

Fangrui Song [Fri, 12 Nov 2021 17:47:31 +0000 (09:47 -0800)]

[ELF] Make --no-relax disable R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization

This brings back the original version of D81359.
I have found several use cases now.

* Unlike GNU ld, LLD's relocation processing is one pass. If we decide to
  optimize(relax) R_X86_64_{,REX_}GOTPCRELX, we will suppress GOT generation and
  cannot undo the decision later. Optimizing R_X86_64_REX_GOTPCRELX can usually
  make it easy to hit `relocation R_X86_64_REX_GOTPCRELX out of range` because
  the distance to GOT is usually shorter. Without --no-relax, the user has to
  recompile with `-Wa,-mrelax-relocations=no`.
* The option would help during my investigation of the root cause of https://git.kernel.org/linus/09e43968db40c33a73e9ddbfd937f46d5c334924
* There is a need for relaxation for AArch64 & RISC-V. Implementing this for
  x86-64 improves consistency with little target-specific cost (two-line
  X86_64.cpp change).

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D113615

Sam McCall [Fri, 12 Nov 2021 17:42:54 +0000 (18:42 +0100)]

[clangd] Mark completions as plain-text when there's no snippet part

This helps nvim support the "repeat" action

Fixes https://github.com/clangd/clangd/issues/922

Philip Reames [Fri, 12 Nov 2021 17:37:50 +0000 (09:37 -0800)]

[tests] Add coverage for cases where we can prune exits when runtime unrolling

Nikita Popov [Fri, 12 Nov 2021 17:17:27 +0000 (18:17 +0100)]

[ConstantRangeTest] Add helper to enumerate APInts (NFC)

While ForeachNumInConstantRange(ConstantRange::getFull(Bits))
works, it's somewhat roundabout, and I keep looking for this
function.

Quinn Pham [Wed, 10 Nov 2021 14:50:14 +0000 (08:50 -0600)]

[lldb][NFC] Inclusive language: replace master/slave names for ptys

[NFC] This patch replaces master and slave with primary and secondary
respectively when referring to pseudoterminals/file descriptors.

Reviewed By: clayborg, teemperor

Differential Revision: https://reviews.llvm.org/D113687

Dmitry Vyukov [Fri, 12 Nov 2021 16:43:26 +0000 (17:43 +0100)]

Revert "tsan: new runtime (v3)"

Summary:
This reverts commit ac95b8d9548cb3c07e60236d3e9e1fd05f79579b.
There are a number of bot failures:
http://45.33.8.238/mac/38755/step_4.txt
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/38135/consoleFull#-148886289949ba4694-19c4-4d7e-bec5-911270d8a58c

Reviewers: vitalybuka, melver

Subscribers:

Simon Pilgrim [Fri, 12 Nov 2021 16:47:43 +0000 (16:47 +0000)]

[X86] convertShiftLeftToScale - improve vXi8 constant handling

Add support for v32i8/v64i8 converting shift-by-constant to multiply-by-constant. This helps us avoid the generic vXi8 shift lowering, and a lot of VPBLENDVB ops which can be particularly slow.

We also needed to reorder a few shift lowering patterns to prevent regressions, particularly for XOP+AVX2 (Excavator) targets (which can split to fast v16i8 shifts) and AVX512-BWI targets (which prefer to extend to fast v32i16 shifts).
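
The underlying identity, shown for a single 8-bit lane. `shlLane` and `mulLane` are hypothetical scalar helpers for this sketch; the patch itself operates on whole vectors during lowering.

```cpp
#include <cassert>
#include <cstdint>

// Shift one i8 lane left by a constant amount (0..7), truncating to 8 bits.
uint8_t shlLane(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x << amt);
}

// The same result via multiplication by 2^amt, truncated to 8 bits.
uint8_t mulLane(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x * (1u << amt));
}
```

Because each lane can be given its own power-of-two multiplier, a vector shift by per-lane constants can be expressed as one vector multiply by a constant vector.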

Zarko Todorovski [Fri, 12 Nov 2021 15:47:03 +0000 (15:47 +0000)]

[NFC][llvm] Remove uses of blacklist in llvm/test/Instrumentation

Small patch that changes blacklisted_global to blocked_global and a change in comments.

Reviewed By: pgousseau

Differential Revision: https://reviews.llvm.org/D113692

Brian Cain [Wed, 10 Nov 2021 17:26:10 +0000 (09:26 -0800)]

[libcxx] Change the type of __size to correspond

__size was declared as unsigned which was compatible with

commit | commitdiff | tree

Joel E. Denny [Fri, 12 Nov 2021 14:55:32 +0000 (09:55 -0500)]

[OpenMP] Fix main thread barrier for Pascal and amdgpu

Fixes what's left of https://bugs.llvm.org/show_bug.cgi?id=51781.

Reviewed By: jdoerfert, JonChesterfield, tianshilei1992

Differential Revision: https://reviews.llvm.org/D113602

commit | commitdiff | tree

Florian Hahn [Fri, 12 Nov 2021 16:09:19 +0000 (16:09 +0000)]

[LV] Precommit test case from PR52485.

commit | commitdiff | tree

Jay Foad [Thu, 11 Nov 2021 14:45:56 +0000 (14:45 +0000)]

[AMDGPU] Simplify 64-bit division/remainder expansion

The old expansion open-coded a 64-bit addition in a strange way, by
adding the high parts *without* carry-in from the low part, and then
adding the carry back in later on. Fixing this saves a couple of
instructions and makes the code much easier to understand.
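
For readers unfamiliar with the idiom, a minimal scalar sketch of a 64-bit addition built from 32-bit halves, with the carry from the low part fed directly into the high part, could look like this. It is an illustration only, not the AMDGPU expansion; `U64Parts`, `add64`, `join` and `split` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct U64Parts {
  uint32_t lo;
  uint32_t hi;
};

// Adds two 64-bit values half by half. The carry-out of the low add is
// consumed as the carry-in of the high add in the same step.
inline U64Parts add64(U64Parts a, U64Parts b) {
  uint32_t lo = a.lo + b.lo;             // wraps modulo 2^32
  uint32_t carry = lo < a.lo ? 1u : 0u;  // unsigned wrap means a carry-out
  uint32_t hi = a.hi + b.hi + carry;     // high add with carry-in
  return {lo, hi};
}

// Helpers to convert between the split form and a native 64-bit integer.
inline uint64_t join(U64Parts p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}
inline U64Parts split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}
```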

Differential Revision: https://reviews.llvm.org/D113679

commit | commitdiff | tree

Zarko Todorovski [Fri, 12 Nov 2021 14:30:06 +0000 (14:30 +0000)]

[clang] Inclusive language: change instances of blacklist/whitelist to allowlist/ignorelist

Change the error message to use ignorelist, and change some variable and function
names in related code and tests.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D113189

commit | commitdiff | tree

Kazu Hirata [Fri, 12 Nov 2021 15:33:29 +0000 (07:33 -0800)]

[CodeGen] Use SDNode::uses (NFC)

commit | commitdiff | tree

Roman Lebedev [Fri, 12 Nov 2021 14:06:22 +0000 (17:06 +0300)]

[NFC][SROA] Add more tests for non-capturing pointer-escaping calls

commit | commitdiff | tree

Nicolas Vasilache [Fri, 12 Nov 2021 14:58:03 +0000 (14:58 +0000)]

[mlir] NFC - Address post-commit comments

Address comments from https://reviews.llvm.org/D113745
which landed as aa3731806723a2a12914aecda2af6e40e1903702

commit | commitdiff | tree

Justas Janickas [Mon, 20 Sep 2021 13:18:32 +0000 (14:18 +0100)]

[OpenCL] Constructor address space test adjusted for C++ for OpenCL 2021

Reuses C++ for OpenCL constructor address space test so that it
supports optional generic address spaces in version 2021.

Differential Revision: https://reviews.llvm.org/D110184

commit | commitdiff | tree

Alexey Bataev [Fri, 12 Nov 2021 14:28:03 +0000 (06:28 -0800)]

[Feature][NFC]Improve test checks to avoid possible false positive test
failures, NFC.

commit | commitdiff | tree

Kerry McLaughlin [Fri, 12 Nov 2021 11:12:20 +0000 (11:12 +0000)]

[AArch64][SVE] Remove i1 type from isElementTypeLegalForScalableVector

`collectElementTypesForWidening` collects the types of load, store and
reduction Phis in a loop. These types are later checked using
`isElementTypeLegalForScalableVector` to prevent vectorisation of
loops with instruction types that are unsupported.

This patch removes i1 from the list of types supported for scalable
vectors. This fixes an assert ("Cannot yet scalarize uniform stores") in
`setCostBasedWideningDecision` when we have a loop containing a uniform
i1 store and a scalable VF, which we cannot create a scatter for.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D113680

commit | commitdiff | tree

Alexey Bataev [Thu, 29 Jul 2021 18:22:19 +0000 (11:22 -0700)]

[SLP]Improve vectorization of split loads.

Need to fix the cost estimation for split loads: since we look at the
subregs already, there is no need to permute them; we just need to estimate
the subregister insert, if it is smaller than the real register. Also, using
split loads, it might already be profitable to vectorize smaller trees
with gathering of the loads.

Differential Revision: https://reviews.llvm.org/D107188

commit | commitdiff | tree

Simon Pilgrim [Fri, 12 Nov 2021 14:02:43 +0000 (14:02 +0000)]

[X86] combineX86ShufflesConstants - constant fold from target shuffles unless optsize = true

Currently we only constant fold target shuffles if any of the sources has one use, or it would remove a variable shuffle mask - the aim being to avoid constant pool bloat.

This patch proposes we should constant fold by default and only limit this if optsize is enabled - I've added a basic test for this in vector-mul.ll (the pmuludq case is by far the most common), I can add other specific test cases if people need them.

This should permit further constant folding, break some instruction dependencies and help reduce shuffle port pressure.

Differential Revision: https://reviews.llvm.org/D113748

commit | commitdiff | tree

Kadir Cetinkaya [Fri, 12 Nov 2021 13:50:13 +0000 (14:50 +0100)]

[clangd] Fix use-after-free in test

commit | commitdiff | tree

Dmitry Vyukov [Tue, 27 Apr 2021 11:55:41 +0000 (13:55 +0200)]

tsan: new runtime (v3)

This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines

Depends on D112602.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D112603

commit | commitdiff | tree

Raphael Isemann [Fri, 12 Nov 2021 12:54:19 +0000 (13:54 +0100)]

[lldb] Fix that the embedded Python REPL crashes if it receives SIGINT

When LLDB receives a SIGINT while running the embedded Python REPL it currently
just crashes in `ScriptInterpreterPythonImpl::Interrupt` with an error such as
the one below:

```
Fatal Python error: PyThreadState_Get: the function must be called with the GIL
held, but the GIL is released (the current Python thread state is NULL)
```

The faulty code that causes this error is this part of `ScriptInterpreterPythonImpl::Interrupt`:
```
    PyThreadState *state = PyThreadState_GET();
    if (!state)
      state = GetThreadState();
    if (state) {
      long tid = state->thread_id;
      PyThreadState_Swap(state);
      int num_threads = PyThreadState_SetAsyncExc(tid, PyExc_KeyboardInterrupt);
```

The obvious fix I tried is to just acquire the GIL before this code runs.
That fixes the crash, but the `KeyboardInterrupt` we want to raise immediately
is actually just queued and would only be raised once the next line of input has
been parsed (which, e.g., won't interrupt Python code that is currently waiting on
a timer or IO, from what I can see). Also, none of the functions we call here is
marked as safe to call from a signal handler, from what I can see, so we
might still end up crashing here with some bad timing.

Python 3.2 introduced `PyErr_SetInterrupt` to solve this. The function takes
care of all the details and avoids doing anything that isn't safe to do inside a
signal handler. The only thing we need to do is to manually set up our own fake
SIGINT handler that behaves the same way as the standalone Python REPL signal
handler (which raises a KeyboardInterrupt).

From what I understand the old code used to work with Python 2 so I kept the old
code around until we officially drop support for Python 2.

There is a small gap here with Python 3.0->3.1 where we might still be crashing,
but those versions have reached their EOL more than a decade ago so I think we
don't need to bother about them.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D104886

commit | commitdiff | tree

Sanjay Patel [Fri, 12 Nov 2021 13:12:59 +0000 (08:12 -0500)]

[x86] fold vector (X > -1) & Y to shift+andn

and (pcmpgt X, -1), Y --> pandn (vsrai X, BitWidth-1), Y

This avoids the -1 constant vector in favor of an arithmetic shift
instruction if it exists (the ISA is still not complete after all
these years...).

We catch this pattern late in combining by matching PCMPGT, so it
should not interfere with more general folds.
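
To see why the two forms agree, here is a scalar sketch of one 32-bit lane. It only models the semantics of the instructions named above and is not the DAG combine itself; `viaCompare` and `viaShiftAndNot` are hypothetical names, and the arithmetic shift is modelled with unsigned operations so the sketch does not depend on how the compiler shifts negative signed values.

```cpp
#include <cassert>
#include <cstdint>

// Original pattern for one lane: and (pcmpgt X, -1), Y.
// pcmpgt yields all-ones when X > -1 and zero otherwise.
inline uint32_t viaCompare(int32_t x, uint32_t y) {
  uint32_t mask = (x > -1) ? 0xFFFFFFFFu : 0u;
  return mask & y;
}

// Folded pattern for one lane: pandn (vsrai X, BitWidth-1), Y.
// An arithmetic shift right by 31 broadcasts the sign bit to every bit;
// that is modelled here as 0 - signbit. pandn then computes ~A & Y.
inline uint32_t viaShiftAndNot(int32_t x, uint32_t y) {
  uint32_t signSplat = 0u - (static_cast<uint32_t>(x) >> 31);
  return ~signSplat & y;
}
```

For non-negative X both forms pass Y through unchanged; for negative X both produce zero.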

Differential Revision: https://reviews.llvm.org/D113603

commit | commitdiff | tree

Jan Svoboda [Fri, 12 Nov 2021 13:15:24 +0000 (14:15 +0100)]

[clang] NFC: Format a loop in CompilerInstance

This code will be moved to a separate function in a future patch. Reformatting now to prevent a bunch of clang-format complaints on Phabricator.

commit | commitdiff | tree

Nicolas Vasilache [Fri, 12 Nov 2021 13:13:47 +0000 (13:13 +0000)]

[mlir][Vector] Add support for 1D depthwise conv vectorization

At this time the 2 flavors of conv are a little too different to allow significant code sharing, and others will likely come up,
so we go the easy route first by duplicating and adapting.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113758

commit | commitdiff | tree

Dmitry Vyukov [Fri, 12 Nov 2021 13:07:23 +0000 (14:07 +0100)]

tsan: ignore some errors in the clone_setns test

Some bots failed with:
unshare failed: 1
https://lab.llvm.org/buildbot/#/builders/70/builds/14101

Look only for the target EINVAL error.

Differential Revision: https://reviews.llvm.org/D113759

commit | commitdiff | tree

Phoebe Wang [Fri, 12 Nov 2021 13:05:51 +0000 (21:05 +0800)]

Add nounwind for tests. NFC

commit | commitdiff | tree

Kadir Cetinkaya [Wed, 10 Nov 2021 11:10:33 +0000 (12:10 +0100)]

[clangd] Mark macros from preamble for code completion

If the main file is a header, mark the macros defined in its preamble
section as code-completion ready.

Fixes https://github.com/clangd/clangd/issues/921.

Differential Revision: https://reviews.llvm.org/D113555

commit | commitdiff | tree

Adrian Kuegel [Fri, 12 Nov 2021 12:15:51 +0000 (13:15 +0100)]

Revert "[clang] retain type sugar in auto / template argument deduction"

This reverts commit 9b6036deedf28e10d797fc4ca734d57680d18053.
Breaks two libc++ tests.

commit | commitdiff | tree

Adrian Kuegel [Fri, 12 Nov 2021 12:12:02 +0000 (13:12 +0100)]

Revert "[lldb] fix test expectation broken by clang fix at D110216"

This reverts commit 55085952175ed3b029097a0594acc4e34a5df218.
The patch it depends on is reverted.

commit | commitdiff | tree

Florian Hahn [Fri, 12 Nov 2021 12:20:00 +0000 (12:20 +0000)]

[SCEV] Use APIntOps::umin to select best max BC count (NFC).

Suggested in D102267, but I missed this in the committed version.

commit | commitdiff | tree

Florian Hahn [Fri, 12 Nov 2021 12:19:35 +0000 (12:19 +0000)]

[SCEV] Add test case where applying zext info pessimizes BTC.

Add an additional test case for D113578.

commit | commitdiff | tree

Dmitry Vyukov [Thu, 11 Nov 2021 19:37:05 +0000 (20:37 +0100)]

tsan: don't start background thread after clone

Start the background thread only after fork, but not after clone.
For fork we have always done this and it's known to work (or user code has adapted).
But if we do this for the new clone interceptor, some code (sandbox2) fails.
So model what we have done for years and don't start the background thread after clone.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113744

commit | commitdiff | tree

Nicolas Vasilache [Fri, 12 Nov 2021 09:44:14 +0000 (09:44 +0000)]

[mlir][Linalg] Rewrite DownscaleSizeOneWindowed2DConvolution to use rank-reducing insert/extract slices.

This rewriting enables better bufferization and canonicalizations.

Differential Revision: https://reviews.llvm.org/D113745

commit | commitdiff | tree

Dmitry Vyukov [Fri, 12 Nov 2021 09:06:20 +0000 (10:06 +0100)]

tsan: fix XMM register corruption in hacky call

The compiler does not recognize HACKY_CALL as a call
(we intentionally hide it from the compiler so that it can
compile non-leaf functions as leaf functions).
To compensate for that, the hacky call thunk saves and restores
all caller-saved registers. However, it saves only
general-purpose registers and does not save XMM registers.
This is a latent bug that was masked up until a recent "NFC" commit
d736002e90 ("tsan: move memory access functions to a separate file"),
which allowed more inlining and exposed the 10-year-old bug.
Save and restore caller-saved XMM registers (all) as well.

Currently the bug manifests as e.g. the frexp interceptor corrupting the
return value, and the added test fails with:
i=8177 y=0.000000 exp=4

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113742

commit | commitdiff | tree

Tomasz Miąsko [Sat, 16 Oct 2021 12:10:31 +0000 (14:10 +0200)]

[llvm-nm] Demangle Rust symbols

Add support for demangling Rust v0 symbols to llvm-nm by reusing
nonMicrosoftDemangle which supports both Itanium and Rust mangling.

Reviewed By: dblaikie, jhenderson

Differential Revision: https://reviews.llvm.org/D111937

commit | commitdiff | tree

Jan Svoboda [Fri, 12 Nov 2021 11:11:15 +0000 (12:11 +0100)]

[clang] NFC: Use range-based for loop

commit | commitdiff | tree

Jan Svoboda [Fri, 12 Nov 2021 11:05:02 +0000 (12:05 +0100)]

[clang] NFC: Remove benign condition

commit | commitdiff | tree

Salman Javed [Fri, 12 Nov 2021 10:24:19 +0000 (23:24 +1300)]

[clang-tidy] Re-apply 0076957 with fix for failing ASan tests

Re-apply "Fix lint warning in ClangTidyDiagnosticConsumer.cpp (NFC)"
with fixes for the failing ASan tests.

This reverts commit 74add1b6d6d377ab2cdce26699cf798110817e42.

commit | commitdiff | tree

Gabor Marton [Fri, 5 Nov 2021 10:53:29 +0000 (11:53 +0100)]

[analyzer][solver] Remove reference to RangedConstraintManager

We no longer need a reference to RangedConstraintManager, we call top
level `State->assume` functions.

Differential Revision: https://reviews.llvm.org/D113261

commit | commitdiff | tree

Gabor Marton [Mon, 26 Jul 2021 20:55:44 +0000 (22:55 +0200)]

[analyzer][solver] Iterate to a fixpoint during symbol simplification with constants

D103314 introduced symbol simplification when a new constant constraint is
added. Currently, we simplify existing equivalence classes by iterating over
all existing members of them and trying to simplify each member symbol with
simplifySVal.

At the end of such a simplification round we may end up introducing a
new constant constraint. Example:
```
  if (a + b + c != d)
    return;
  if (c + b != 0)
    return;
  // Simplification starts here.
  if (b != 0)
    return;
```
The `c == 0` constraint is the result of the first simplification iteration.
However, we could do another round of simplification to reach the conclusion
that `a == d`. Generally, we could do as many new iterations until we reach a
fixpoint.
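
The idea of iterating until nothing changes can be sketched with a toy model. This is not the analyzer's implementation and uses none of its data structures: each constraint is assumed to be a linear equation of the form "sum of coeff * var == 0" with nonzero coefficients, and `Equation` and `simplifyToFixpoint` are hypothetical names. The example above would be encoded as `a + b + c - d == 0`, `b + c == 0` and `b == 0`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: an equation maps a variable name to its (nonzero) coefficient
// and means "sum of coeff * var == 0".
using Equation = std::map<std::string, int>;

// Repeatedly substitutes variables known to be zero until a full round makes
// no change. Returns the variables proven to be zero; Eqs is simplified in place.
inline std::set<std::string> simplifyToFixpoint(std::vector<Equation> &Eqs) {
  std::set<std::string> Zero;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    // An equation with a single variable pins that variable to zero.
    for (const Equation &E : Eqs)
      if (E.size() == 1 && Zero.insert(E.begin()->first).second)
        Changed = true;
    // Drop known-zero variables from equations that still have other terms.
    for (Equation &E : Eqs)
      for (const std::string &V : Zero)
        if (E.size() > 1 && E.erase(V) > 0)
          Changed = true;
  }
  return Zero;
}
```

On the three equations above, the first round learns `b == 0` and reduces `b + c == 0` to `c == 0`; the second round uses that to reduce the first equation to `a - d == 0`, i.e. `a == d`; the third round changes nothing, so the loop stops.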

We can reach a fixpoint by recursively calling `State->assume` on the
newly simplified symbol. By calling `State->assume` we re-ignite the
whole assume machinery (along with, e.g., adjustment handling).

Why should we do this? By reaching a fixpoint in simplification we are capable
of discovering infeasible states at the moment of the introduction of the
**first** constant constraint.
Let's modify the previous example just a bit, and consider what happens without
the fixpoint iteration.
```
  if (a + b + c != d)
    return;
  if (c + b != 0)
    return;
  // Adding a new constraint.
  if (a == d)
    return;
  // This brings in a contradiction.
  if (b != 0)
    return;
  clang_analyzer_warnIfReached(); // This produces a warning.
              // The path is already infeasible...
  if (c == 0) // ...but we realize that only when we evaluate `c == 0`.
    return;
```
What happens currently, without the fixpoint iteration? As the inline comments
suggest, without the fixpoint iteration we realize that we are on
an infeasible path only after we are already walking on it. With fixpoint
iteration we can detect that before stepping onto it. With fixpoint iteration,
the `clang_analyzer_warnIfReached` does not warn in the above example b/c
during the evaluation of `b == 0` we realize the contradiction. The engine and
the checkers do rely on the fact that either `assume(Cond)` or `assume(!Cond)` should be
feasible. This is in fact assured by the so-called expensive checks
(LLVM_ENABLE_EXPENSIVE_CHECKS). The StdLibraryFunctionsChecker is notably one of
the checkers that has a very similar assertion.

Before this patch, we simply added the simplified symbol to the equivalence
class. In this patch, after we have added the simplified symbol, we remove the
old (more complex) symbol from the members of the equivalence class
(`ClassMembers`). Removing the old symbol is beneficial because during the next
iteration of the simplification we don't have to consider the old symbol again.

Contrary to how we handle `ClassMembers`, we don't remove the old Sym->Class
relation from the `ClassMap`. This is important for two reasons: (1) the
constraints of the old symbol can still be found via the equivalence class
that it used to be a member of; (2) we can spare one removal and thus one
additional tree in the forest of `ClassMap`.

Performance and complexity: Let us assume that in a State we have N non-trivial
equivalence classes and that all constraints and disequality info is related to
non-trivial classes. In the worst case, we can simplify only one symbol of one
class in each iteration. The number of symbols in one class cannot grow b/c we
replace the old symbol with the simplified one. Also, the number of
equivalence classes can only decrease, b/c the algorithm optionally does a
merge operation. We need N iterations in this case to reach the fixpoint. Thus, the
number of steps needed in the worst case is proportional to `N*N`. Empirical
results (attached) show that there is some hardly noticeable run-time and peak
memory discrepancy compared to the baseline. In my opinion, these differences
could be the result of measurement error.
This worst-case scenario also covers the cases where trivial classes in the
constraints and in the disequality map are transformed into a State where
there are only non-trivial classes, b/c the algorithm does merge
operations. A merge operation on two trivial classes results in one non-trivial
class.

Differential Revision: https://reviews.llvm.org/D106823

commit | commitdiff | tree

Neubauer, Sebastian [Thu, 11 Nov 2021 14:58:42 +0000 (15:58 +0100)]

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672

commit | commitdiff | tree

Florian Hahn [Fri, 12 Nov 2021 10:30:03 +0000 (10:30 +0000)]

[SCEV] Add tests where guards limit both %n and (zext %n).

Suggested in D113577.

Domain: System / Toolchain;
