review.tizen.org Git - platform/upstream/llvm.git/log

[InstCombine] fold sub of low-bit masked value from offset of same value

There might be some demanded/known bits way to generalize this,
but I'm not seeing it right now.

This came up as a regression when I was looking at a different
demanded bits improvement.

https://rise4fun.com/Alive/5fl

  Name: general
  Pre: ((-1 << countTrailingZeros(C1)) & C2) == 0
  %a1 = add i8 %x, C1
  %a2 = and i8 %x, C2
  %r = sub i8 %a1, %a2
  =>
  %r = and i8 %a1, ~C2

  Name: test 1
  %a1 = add i8 %x, 192
  %a2 = and i8 %x, 10
  %r = sub i8 %a1, %a2
  =>
  %r = and i8 %a1, -11

  Name: test 2
  %a1 = add i8 %x, -108
  %a2 = and i8 %x, 3
  %r = sub i8 %a1, %a2
  =>
  %r = and i8 %a1, -4

[InstCombine] add tests for sub with masked bits; NFC

[MLIR] Fix standard -> LLVM conversion to fail for unsupported memref element type.

- Move isSupportedMemRefType() to ConvertToLLVMPatterns and check if the
memref element type is supported there.

Differential Revision: https://reviews.llvm.org/D91374

[flang] Document DO CONCURRENT's problems (NFC)

Differential Revision: https://reviews.llvm.org/D86556

[lldb/DataFormatters] Display null C++ pointers as nullptr

Display null pointer as `nullptr`, `nil` and `NULL` for C++,
Objective-C/Objective-C++ and C respectively. The original motivation
for this patch was to display a null std::string pointer as nullptr
instead of "", but the fix seemed generic enough to be done for all
summary providers.

Differential revision: https://reviews.llvm.org/D77153

[AMDGPU] Remove scratch rsrc from spill pseudos

Differential Revision: https://reviews.llvm.org/D91110

[gn build] (manually) port 410626c9b56

[mlir] Make tensor_to_memref op docs match reality

The previous code defined it as allocating a new memref for its result.
However, this is not how it is treated by the dialect conversion framework,
that does the equivalent of inserting and folding it away internally
(even independent of any canonicalization patterns that we have
defined).

The semantics as they were previously written were also very
constraining: Nontrivial analysis is needed to prove that the new
allocation isn't needed for correctness (e.g. to avoid aliasing).
By removing those semantics, we avoid losing that information.

Differential Revision: https://reviews.llvm.org/D91382

[mlir] Bufferize tensor constant ops

We lower them to a std.global_memref (uniqued by constant value) + a
std.get_global_memref to produce the corresponding memref value.
This allows removing Linalg's somewhat hacky lowering of tensor
constants, now that std properly supports this.

Differential Revision: https://reviews.llvm.org/D91306

[mlir] Fix subtensor_insert bufferization.

It was incorrect in the presence of a tensor argument with multiple
uses.

The bufferization of subtensor_insert was writing into a converted
memref operand, but there is no guarantee that the converted memref for
that operand is safe to write into. In this case, the same converted
memref is written to in-place by the subtensor_insert bufferization,
violating the tensor-level semantics.

I left some comments in a TODO about ways forward on this. I will be
working actively on this problem in the coming days.

Differential Revision: https://reviews.llvm.org/D91371

[AArch64][GlobalISel] Select CSINC and CSINV for G_SELECT with constants

Select the following:

- G_SELECT cc, 0, 1 -> CSINC zreg, zreg, cc
- G_SELECT cc 0, -1 -> CSINV zreg, zreg cc
- G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
- G_SELECT cc, -1, f -> CSINV f, zreg, inv_cc
- G_SELECT cc, t, 1 -> CSINC t, zreg, cc
- G_SELECT cc, t, -1 -> CSINC t, zreg, cc

(IR example: https://godbolt.org/z/YfPna9)

These correspond to a bunch of the AArch64csel patterns in AArch64InstrInfo.td.

Unfortunately, it doesn't seem like we can import patterns that use NZCV like
those ones do. E.g.

```
def : Pat<(AArch64csel GPR32:$tval, (i32 1), (i32 imm:$cc), NZCV),
(CSINCWr GPR32:$tval, WZR, (i32 imm:$cc))>;
```

So we have to manually select these for now.

This replaces `selectSelectOpc` with an `emitSelect` function, which performs
these optimizations.

Differential Revision: https://reviews.llvm.org/D90701

[VE] Support vld intrinsics

Add intrinsics for vector load instructions. Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91332

[LoopVectorize] regenerate test checks; NFC

[LLDB] Fix handling of bit-fields in a union

When parsing DWARF and laying out bit-fields we don't properly take into account when they are in a union, they will all have a zero offset.

Differential Revision: https://reviews.llvm.org/D91118

[PhaseOrdering] regenerate test checks; NFC

[InstCombine] add tests for low-mask-of-add; NFC

Some updates/fixes to the creduce script.

This was motivated by changes to llvm's `not --crash` disabling symbolization
but I ended up removing `not` from the script entirely because it
returns differently depending on whether clang "crashes" or exits for some
other reason. The script had to choose between calling `not` and `not --crash`
and sometimes it was wrong.

The script also now disables symbolization when we don't read the stack
trace because symbolizing is kind of slow.

Differential Revision: https://reviews.llvm.org/D91372

[AMDGPU] Enable multi-dword flat scratch load/stores

Differential Revision: https://reviews.llvm.org/D91384

[mlir][Python] Fix 'unreferenced local variable' warning on MSVC.

Differential Revision: https://reviews.llvm.org/D91282

[PatternMatch] Add single index InsertValue matcher.

This patch adds a new matcher for single index InsertValue instructions,
similar to the existing matcher for ExtractValue.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D91352

[OPENMP]Fix PR47790: segfault in frontend while parsing Objective-C with OpenMP.

Need to check if the sema is actually finishing a function decl.

Differential Revision: https://reviews.llvm.org/D91376

[VE] Disable -fsigaddr option for VE

VE needs to support integrated assembler and "nas". This "nas"
doesn't recognize ".sigaddr" pseudo mnemonics, so need to disable
it. This patch disable it on VE by default. Also add a regression
test for that.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91350

[flang] Include source information in an invalid file-unit-number message

An io-unit that is an internal-file-variable is syntactically identical
to a file-unit-number expression that is a variable reference.  An
ambiguous unit is initially parsed as an internal-file-variable.  If
semantic analysis determines that the unit is not of character type,
it is rewritten as an internal-file-variable.  This modification must
retain source coordinate information.

Differential revision: https://reviews.llvm.org/D91375

[fuzzer] Add Windows Visual C++ exception intercept

Adds a new option, `handle_winexcept` to try to intercept uncaught
Visual C++ exceptions on Windows. On Linux, such exceptions are handled
implicitly by `std::terminate()` raising `SIBABRT`. This option brings the
Windows behavior in line with Linux.

Unfortunately this exception code is intentionally undocumented, however
has remained stable for the last decade. More information can be found
here: https://devblogs.microsoft.com/oldnewthing/20100730-00/?p=13273

Reviewed By: morehouse, metzman

Differential Revision: https://reviews.llvm.org/D89755

[flang] Recognize END FILE as ENDFILE in free form source

The ENDFILE statement may be spelled as two words.

Differential revision: https://reviews.llvm.org/D91377

[NFC][NewPM] Reuse PassBuilder callbacks with -O0

This removes lots of duplicated code which was necessary before
https://reviews.llvm.org/D89158.
Now we can use PassBuilder::runRegisteredEPCallbacks().
This is mostly sanitizers.

There is likely more that can be done to simplify, but let's start with this.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D90870

[OPENMP]Fix PR48076: Check map types array before accessing its front.

Need to check if there are map types for the components before trying to
access them when trying to modify type mappings for combined partial
mappings.

Differential Revision: https://reviews.llvm.org/D91370

[AMDGPU] Fix scheduling of exp pos4

Also fix a similar issue in SIInsertWaitcnts, but I don't think that fix
has any effect in practice.

Differential Revision: https://reviews.llvm.org/D91290

[AMDGPU] Define and use names for export targets. NFC.

Differential Revision: https://reviews.llvm.org/D91289

[msan] Break the getShadow loop after matching an argument

Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D91320

[SystemZ][ZOS] libcxx - no posix memalign

The unavailability of posix_memalign on z/OS forces us to define _LIBCPP_HAS_NO_LIBRARY_ALIGNED_ALLOCATION'. The use of posix_memalign is being used in libcxx/src/new.cpp.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D90178

[BasicAA] Remove checks for GEP decomposition limit reached

The GEP aliasing code currently checks for the GEP decomposition
limit being reached (i.e., we did not reach the "final" underlying
object). As far as I can see, these checks are not necessary. It is
perfectly fine to work with a GEP whose base can still be further
decomposed.

Looking back through the commit history, these checks were originally
introduced in 1a444489e9d90915cfdda0720489893896ef1503. However, I
believe that the problem this was intended to address was later
properly fixed with 1726fc698ccb85fe4bb23c200a50f28b57fc53cb, and
the checks are no longer necessary since then (and were not the
right fix in the first place).

Differential Revision: https://reviews.llvm.org/D91010

[Frontend] Treat .cuh files as CUDA source files

to synchronize with tools/clang-format/git-clang-format

tra: Keeping them in sync does have a minor benefit of not raising a question why the two maps are different.

Differential Revision: https://reviews.llvm.org/D91034

fix clang build

[libc++] NFC: Remove symbol from ABI list changelog that was never added

The `posix_memalign@GLIBC_2.2.5` symbol can't have been added by r284206,
because it doesn't show up in the corresponding ABI list. It's also not
defined in libc++, so that wouldn't make sense. It must have made it into
that comment by mistake.

[NFC] Switch printFunctionLikeOp and parseFunctionLikeOp to only support "inline" visibility.

- Remove the default valued arguments from these functions.
- Besides FuncOp, looks like no other in-tree op is using these functions.

Differential Revision: https://reviews.llvm.org/D91369

Revert "[gn build] (semi-manually) port 173b51169b8"

This reverts commit 37a1336de722c6920a24e8cd4278e396402f1b2a.
173b51169b8 was reverted in 777ca48.

[libc++] Instantiate additional <iostream> members in the dylib

This commit adds new explicit instantiations for some classes in <iostream>
in the library. This is done after noticing that many programs that use
streams end up containing weak definitions of these classes, which has a
negative impact on both code size and load times (due to the need to
resolve weak symbols at load time). Note that we are just adding the
additional explicit instantiations for the `char` specializations, since
the `wchar_t` specializations are not used as often, and as a result there
wouldn't be a clear benefit.

This change is not an ABI break, since we are just adding additional
symbols.

Differential Revision: https://reviews.llvm.org/D90677

Revert "[SystemZ][ZOS] Porting the time functions within libc++ to z/OS"

This reverts commit 173b51169b838. That commit was applied incorrectly,
and undid previous changes. That was clearly not intended.

[MSP430] Remove unused MVT::Glue output from MSP430ISD::SELECT_CC nodes.

Follow up from a similar patch on RISCV 637f19c36b323cc3ab597f6ef138db53be395949

Nothing reads this Glue value that I could see. The SDNode def in
the td file does not have the SDNPOutGlue flag so I don't think
this glue would get properly propagated to MachineSDNodes if it
was used.

[flang] Implement runtime support for basic ALLOCATE/DEALLOCATE

Add error reporting infrastructure and support for ALLOCATE
and DEALLOCATE statements of intrinsic types without SOURCE=
or MOLD=.

Differential revision: https://reviews.llvm.org/D91215

[clang-tidy] Merge options inplace instead of copying

Changed `ClangTidyOptions::mergeWith` to operate on the instance instead of returning a copy. The old mergeWith method has been renamed to merge and marked as nodiscard, to aid in disambiguating which one is which.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D91184

Non-implicit attribute creation requires a source range; NFC

There are two factory functions used to create a semantic attribute,
Create() and CreateImplicit(). CreateImplicit() does not need to
specify the source range of the attribute since it's an implicitly-
generated attribute. The same logic does not apply to Create(), so
this removes the default argument from those declarations to avoid
accidentally creating a semantic attribute without source location
information.

[ELF] Don't consider SHF_ALLOC ".debug*" sections debug sections

Fixes PR48071

* The Rust compiler produces SHF_ALLOC `.debug_gdb_scripts` (which normally does not have the flag)
* `.debug_gdb_scripts` sections are removed from `inputSections` due to --strip-debug/--strip-all
* When processing --gc-sections, pieces of a SHF_MERGE section can be marked live separately

`=>` segfault when marking liveness of a `.debug_gdb_scripts` which is not split into pieces (because it is not in `inputSections`)

This patch circumvents the problem by not treating SHF_ALLOC ".debug*" as debug sections (to prevent --strip-debug's stripping)
(which is still useful on its own).

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D91291

[FPEnv][Clang][Driver] Use MarshallingInfoFlag for -fexperimental-strict-floating-point

As of D80952 we are disabling strict floating point on all hosts except
those that are explicitly listed as supported. Use of strict floating point
on other hosts requires use of the -fexperimental-strict-floating-point
flag. This is to avoid bugs like "https://bugs.llvm.org/show_bug.cgi?id=45329"
(which has an incorrect link in the previous review).

In the review for D80952 I was asked to mark the -fexperimental option as
a MarshallingInfoFlag. This patch does exactly that.

Differential Revision: https://reviews.llvm.org/D88987

[InstCombine] add tests for mask of sext-in-reg; NFC

Reland: Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source

Summary:
Expand the print-memoryssa and print<memoryssa> passes with a new hidden
option -cfg-dot-mssa that names a file. When set, a dot-cfg style file
will be generated into the named file with the memoryssa comments retained
and those blocks containing them shown in light pink. The option does
nothing in isolation.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie)

Differential Revision: https://reviews.llvm.org/D90638

[gn build] (semi-manually) port 173b51169b8

Fix unused variable warning in release builds

[ValueTracking] Update computeKnownBitsFromShiftOperator callbacks to take KnownBits shift amount. NFCI.

We were creating this internally, but will need to support general KnownBits amounts as part of D90479.

[ELF] Make SORT_INIT_PRIORITY support .ctors.N

Input sections `.ctors/.ctors.N` may go to either the output section `.init_array` or the output section `.ctors`:

* output `.ctors`: currently we sort them by name. This patch changes to sort by priority from high to low. If N in `.ctors.N` is in the form of %05u, there is no semantic difference. Actually GCC and Clang do use %05u. (In the test `ctors_dtors_priority.s` and Gold's test `gold/testsuite/script_test_14.s`, we can see %03u, but they are not really produced by compilers.)
* output `.init_array`: users can provide an input section description `SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)` to mix `.init_array.*` and `.ctors.*`. This can make .init_array.N and .ctors.(65535-N) interchangeable.

With this change, users can mix `.ctors.N` and `.init_array.N` in `.init_array` (PR44698 and PR48096) with linker scripts. As an example:

```
SECTIONS {
  .init_array : {
    *(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*))
    *(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors)
  }
} INSERT AFTER .fini_array;
SECTIONS {
  .fini_array : {
    *(SORT_BY_INIT_PRIORITY(.fini_array.* .dtors.*))
    *(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors)
  }
} INSERT BEFORE .init_array;
```

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91187

[ELF] Sort by input order within an input section description

According to
https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html#Input-Section-Basics
for `*(.a .b)`, the order should match the input order:

* for `ld 1.o 2.o`, sections from 1.o precede sections from 2.o
* within a file, `.a` and `.b` appear in the section header table order

This patch implements the behavior. The interaction with `SORT*` and --sort-section is:

Matched sections are ordered by radix sort with the keys being `(SORT*, --sort-section, input order)`,
where `SORT*` (if present) is most significant.

> Note, multiple `SORT*` within an input section description has undocumented and
> confusing behaviors in GNU ld:
> https://sourceware.org/pipermail/binutils/2020-November/114083.html
> Therefore multiple `SORT*` is not the focus for this patch but
> this patch still strives to have an explainable behavior.

As an example, we partition `SORT(a.*) b.* c.* SORT(d.*)`, into
`SORT(a.*) | b.* c.* | SORT(d.*)` and perform sorting within groups. Sections
matched by patterns between two `SORT*` are sorted by input order. If
--sort-alignment is given, they are sorted by --sort-alignment, breaking tie by
input order.

This patch also allows a section to be matched by multiple patterns, previously
duplicated sections could occupy more space in the output and had erroneous zero bytes.

The patch is in preparation for support for
`*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)) *(.init_array .ctors)`,
which will allow LLD to mix .ctors*/.init_array* like GNU ld (gold's --ctors-in-init-array)
PR44698 and PR48096

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D91127

[ELF] Support multiple SORT in an input section description

The second `SORT` in `*(SORT(...) SORT(...))` is incorrectly parsed as a file pattern.
Fix the bug by stopping at `SORT*` in `readInputSectionsList`.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D91180

[PowerPC] Prevent the use of MMA with P9 and earlier

We want to allow using MMA on P10 CPU only. This patch prevents the use of MMA
with the -mmma option on P9 CPUs and earlier.

Differential Revision: https://reviews.llvm.org/D91200

[lldb] Replace TestAbortExitCode with a debugserver specific test

When I added TestAbortExitCode I actually planned this to be a generic test for the
exit code functionality on POSIX systems. However due to all the different test setups we
can have I don't think this worked out. Right now the test had to be made so permissive
that it pretty much can't fail.

Just to summarize, we would need to support the following situations:
1. ToT debugserver (on macOS)
2. lldb-server (on other platforms)
3. Any old debugserver version when using the system debugserver (on macOS)

This patch is removing TestAbortExitCode and adds a ToT debugserver specific test
that checks the patch that motivated the whole exit code testing. There is already
an exit-code test for lldb-server from what I can see and 3) is pretty much untestable
as we don't know anything about the system debugserver.

Reviewed By: kastiglione

Differential Revision: https://reviews.llvm.org/D89305

[SystemZ][ZOS] Porting the time functions within libc++ to z/OS

This patch is one part of many steps required to build libc++ and libc++abi libraries on z/OS. This particular deals with time related functions and consists of the following 3 parts.

1) Initialization of :timeval within libc++ library need to be adjusted to work on z/OS.
The following is z/OS definition from time.h which includes additional aggregate member.
typedef signed int suseconds_t;
struct timeval {
time_t tv_sec;
char tv_usec_pad[4];
suseconds_t tv_usec;
};

In contracts the following is definition from time.h on Linux.

typedef long int __suseconds_t;
struct timeval
{
__time_t tv_sec;
__suseconds_t tv_usec;
};

2) In addition, retrieving ::timespec within libc++ library needs to be adjusted to compensate the difference of some of the members of ::stat depending of the target host.
Here are the 2 members in conflict on z/OS extracted from stat.h.
struct stat {
...
time_t st_atime;
time_t st_mtime;
...
};
In contract here is Linux equivalent from stat.h.
struct stat
{
...
struct timespec st_atim;
struct timespec st_mtim;
...
};

3) On Linux both members are of type timespec whereas on z/OS an object of type timespec need to be constructed first before retrieving it within libc++ library.

The libc++ header file __threading_support calls nanosleep, which is not available on z/OS.
The equivalent functionality will be implemented by using both sleep() and usleep().

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D87940

[KnownBits] Add KnownBits::makeConstant helper. NFCI.

Helper for cases where we need to create a KnownBits from a (fully known) constant value.

Revert "Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source"

This reverts commit 45d459e7522ddc512ac70c4c822d58d335099672 due to
build issue in Poly.

[RISCV] Don't include CodeGen layer files in MC layer

-Use MCRegister instead of Register in MC layer.
-Move some enums from RISCVInstrInfo.h to RISCVBaseInfo.h to be with other TSFlags bits.

Differential Revision: https://reviews.llvm.org/D91114

Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source

Summary:
Expand the print-memoryssa and print<memoryssa> passes with a new hidden
option -cfg-dot-mssa that names a file. When set, a dot-cfg style file
will be generated into the named file with the memoryssa comments retained
and those blocks containing them shown in light pink. The option does
nothing in isolation.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie)

Differential Revision: https://reviews.llvm.org/D90638

[RISCV] Add an ANDI to shift amount of FSL/FSR instructions

The fshl and fshr intrinsics are defined to modulo their shift amount by the bitwidth of one of their inputs. The FSR/FSL instructions read one extra bit from the shift amount. If that bit is set the inputs are swapped. In order to preserve the semantics of the llvm intrinsics we need to make sure that the extra bit isn't set. DAG combine or instcombine may have removed any mask that was originally present.

We could be smarter here and try to use computeKnownBits to check if the bit is known zero, but wanted to start with correctness.

Differential Revision: https://reviews.llvm.org/D90905

[ValueTracking] Update computeKnownBitsFromShiftOperator callbacks to use KnownBits shift handling. NFCI.

Introduce -print-before-changed, making -print-changed also print before passes that modify IR

Summary:
Add an option -print-before-changed that modifies the print-changed
behaviour so that it prints the IR before a pass that changed it in
addition to printing the IR after the pass. Note that the option
does nothing in isolation. The filtering options work as expected.
Lit tests are included.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: aeubanks (Arthur Eubanks)

Differential Revision: https://reviews.llvm.org/D88757

[lldb] Add expect_var_path to test variable path results

This adds `expect_var_path` to test variable paths so we no longer have to
use `frame var` and find substrs in the command output. The behaviour
is identical with `expect_expr` (and it also uses the same checking backend),
but it instead calls `GetValueForVariablePath` to evaluate the string as a variable
path.

Also rewrites a few of the tests that previously used `frame variable` to use
`expect_var_path`.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D90450

[builtins] Add support for single-precision-only-FPU ARM targets.

This patch enables building compiler-rt builtins for ARM targets that
only support single-precision floating point instructions (e.g., those
with -mfpu=fpv4-sp-d16).

This fixes PR42838

Differential Revision: https://reviews.llvm.org/D90698

[NFC intended] Refactor SinkAndHoistLICMFlags to allow others to construct without exposing internals

Summary:
Refactor SinkAdHoistLICMFlags from a struct to a class with accessors and constructors to allow other
classes to construct flags with meaningful defaults while not exposing LICM internal details.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea)

Differential Revision: https://reviews.llvm.org/D90482

[CODE_OWNERS.TXT] Update to include yours truly as the TableGen owner

[clang][docs] Remove wrongly spaced \brief in Doxygen comment (NFC)

[lldb][NFC] Move OptionDefinition from lldb-private-types.h to its own Utility header

Also moves the curious isprint8 function (which was used to check whether we have a
valid short option) into the struct and documents it.

[clang-scan-deps] Fix for input file given as relative path in compilation database "command" entry.

Differential Revision: https://reviews.llvm.org/D91204

[ARM] Ensure CountReg definition dominates InsertPt when creating t2DoLoopStartTP

Of course there was something missing, in this case a check that the def
of the count register we are adding to a t2DoLoopStartTP would dominate
the insertion point.

In the future, when we remove some of these COPY's in between, the
t2DoLoopStartTP will always become the last instruction in the block,
preventing this from happening. In the meantime we need to check they
are created in a sensible order.

Differential Revision: https://reviews.llvm.org/D91287

[LLD] Fix include following 45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3

[lld] Use temporary directory to create test outputs

[LLD][COFF] When using LLD-as-a-library, always prevent re-entrance on failures

This is a follow-up for D70378 (Cover usage of LLD as a library).

While debugging an intermittent failure on a bot, I recalled this scenario which
causes the issue:

1.When executing lld/test/ELF/invalid/symtab-sh-info.s L45, we reach
  lld::elf::Obj-File::ObjFile() which goes straight into its base ELFFileBase(),
  then ELFFileBase::init().
2.At that point fatal() is thrown in lld/ELF/InputFiles.cpp L381, leaving a
  half-initialized ObjFile instance.
3.We then end up in lld::exitLld() and since we are running with LLD_IN_TEST, we
  hapily restore the control flow to CrashRecoveryContext::RunSafely() then back
  in lld::safeLldMain().
4.Before this patch, we called errorHandler().reset() just after, and this
  attempted to reset the associated SpecificAlloc<ObjFile<ELF64LE>>. That tried
  to free the half-initialized ObjFile instance, and more precisely its
  ObjFile::dwarf member.

Sometimes that worked, sometimes it failed and was catched by the
CrashRecoveryContext. This scenario was the reason we called
errorHandler().reset() through a CrashRecoveryContext.

But in some rare cases, the above repro somehow corrupted the heap, creating a
stack overflow. When the CrashRecoveryContext's filter (that is,
__except (ExceptionFilter(GetExceptionInformation()))) tried to handle the
exception, it crashed again since the stack was exhausted -- and that took the
whole application down. That is the issue seen on the bot. Locally it happens
about 1 times out of 15.

Now this situation can happen anywhere in LLD. Since catching stack overflows is
not a reliable scenario ATM when using CrashRecoveryContext, we're now
preventing further re-entrance when such failures occur, by signaling
lld::SafeReturn::canRunAgain=false. When running with LLD_IN_TEST=2 (or above),
only one iteration will be executed, instead of two.

Differential Revision: https://reviews.llvm.org/D88348

[lldb] [test] Add a minimal test for x86 dbreg reading

Add a test verifying that after the 'watchpoint' command, new values
of x86 debug registers can be read back correctly. The primary purpose
of this test is to catch broken DRn reading and help debugging it.

Differential Revision: https://reviews.llvm.org/D91264

[lldb] [Process/Utility] Fix DR offsets for FreeBSD

Fix Debug Register offsets to be specified relatively to UserArea
on FreeBSD/amd64 and FreeBSD/i386, and add them to UserArea on i386.
This fixes overlapping GPRs and DRs in gdb-remote protocol, making it
impossible to correctly get and set debug registers from the LLDB
client.

Differential Revision: https://reviews.llvm.org/D91254

Revert "Generalize regex matching std::string variants to compensate for recent"

This reverts commit 856fd98a176240470dcc2b8ad54b5c17ef6a75b3. The type formatters
use inline namespaces to find the formatter that fits the type ABI, so they
can't just ignore the inline namespaces.

The failing tests should be fixed by da121fff1184267a405f81a87f7314df2d474e1c .

[lldb] Introduce a LLDB printing policy for Clang type names that suppressed inline namespaces

Commit 5f12f4ff9078455cad9d4806da01f570553a5bf9 made suppressing inline namespaces
when printing typenames default to true. As we're using the inline namespaces
in LLDB to construct internal type names (which need internal namespaces in them
to, for example, differentiate libc++'s std::__1::string from the std::string
from libstdc++), this broke most of the type formatting logic.

[dllexport] Instantiate default ctor default args for explicit specializations (PR45811)

For dllexported default constructors with default arguments, we export
default constructor closures which pass in the default args. (See D8331
for a good explanation.)

For templates, that means those default args must be instantiated even
if the function isn't called. That is done by the
InstantiateDefaultCtorDefaultArgs() function, but it wasn't done for
explicit specializations, causing asserts (see bug).

Differential revision: https://reviews.llvm.org/D91089

[mlir] Add plus, star and optional less/greater parsing

The tokens are already handled by the lexer. This revision exposes them
through the parser interface.

This revision also adds missing functions for question mark parsing and
completes the list of valid punctuation tokens in the documentation.

Differential Revision: https://reviews.llvm.org/D90907

[dllexport] Avoid assert for explicitly defaulted methods in explicit instantiation definitions (PR47683)

Clang was asserting due to attempting to codegen such methods twice.

Differential revision: https://reviews.llvm.org/D90849

[mlir] Generate Op builders for Python bindings

Add an ODS-backed generator of default builders. This currently does not
support operation with attribute arguments, for which the builder is
just ignored. Attribute support will be introduced separately for
builders and accessors.

Default builders are always generated with the same number of result and
operand groups as the ODS specification, i.e. one group per each operand
or result. Optional elements accept None but cannot be omitted. Variadic
groups accept iterable objects and cannot be replaced with a single
object.

For some operations, it is possible to infer the result type given the
traits, but most traits rely on inline pieces of C++ that we cannot
(yet) forward to Python bindings. Since the Ops where the inference is
possible (having the `SameOperandAndResultTypes` trait or
`TypeMatchesWith` without transform field) are a small minority, they
also require the result type to make the builder syntax more consistent.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D91190

[VE] Change the default type of v64 register class

Change the default type of v64 register class from v512i32 to v256f64.
Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91301

[MLIR] Added documentation and manual to use bufferization features.

Added documentation about the bufferization features.
Furthermore, the usage of pre- and post-processing is described.
This also includes information about optimization functionalities.

Differential Revision: https://reviews.llvm.org/D90675

[clangd] Simplify relations deserialization loop, NFC.

[SVE] Deal with SVE tuple call arguments correctly when running out of registers

When passing SVE types as arguments to function calls we can run
out of hardware SVE registers. This is normally fine, since we
switch to an indirect mode where we pass a pointer to a SVE stack
object in a GPR. However, if we switch over part-way through
processing a SVE tuple then part of it will be in registers and
the other part will be on the stack.

I've fixed this by ensuring that:

1. When we don't have enough registers to allocate the whole block
   we mark any remaining SVE registers temporarily as allocated.
2. We temporarily remove the InConsecutiveRegs flags from the last
   tuple part argument and reinvoke the autogenerated calling
   convention handler. Doing this prevents the code from entering
   an infinite recursion and, in combination with 1), ensures we
   switch over to the Indirect mode.
3. After allocating a GPR register for the pointer to the tuple we
   then deallocate any SVE registers we marked as allocated in 1).
   We also set the InConsecutiveRegs flags back how they were before.
4. I've changed the AArch64ISelLowering LowerCALL and
   LowerFormalArguments functions to detect the start of a tuple,
   which involves allocating a single stack object and doing the
   correct numbers of legal loads and stores.

Differential Revision: https://reviews.llvm.org/D90219

[ARM] Remove unused check labels. NFC

[libc++] [P0340] [C++20] Update status page. NFC.

This was implemented in 410b650e674496e61506fa88f3026759b8759d0f:
"Implement P0340R3: Make 'underlying_type' SFINAE-friendly. Reviewed as https://reviews.llvm.org/D63574

llvm-svn: 364094"

[mlir][Linalg] Improve the logic to perform tile and fuse with better dependence tracking.

This change does two main things
1) An operation might have multiple dependences to the same
   producer. Not tracking them correctly can result in incorrect code
   generation with fusion. To rectify this the dependence tracking
   needs to also have the operand number in the consumer.
2) Improve the logic used to find the fused loops making it easier to
   follow. The only constraint for fusion is that linalg ops (on
   buffers) have update semantics for the result. Fusion should be
   such that only one iteration of the fused loop (which is also a
   tiled loop) must touch only one (disjoint) tile of the output. This
   could be relaxed by allowing for recomputation that is the default
   when oeprands are tensors, or can be made legal with promotion of
   the fused view (in future).

Differential Revision: https://reviews.llvm.org/D90579

[AArch64][GlobalISel] Optimize G_PTR_ADD with a negated offset to be a G_SUB.

[NFC][SCEV] Generalize monotonicity check for full and limited iteration space

A piece of logic of `isLoopInvariantExitCondDuringFirstIterations` is actually
a generalized predicate monotonicity check. This patch moves it into the
corresponding method and generalizes it a bit.

Differential Revision: https://reviews.llvm.org/D90395
Reviewed By: apilipenko

[NFC][coroutines] remove unused argument in SemaCoroutine

Test plan: check-llvm, check-clang

Reviewers: lxfind, junparser

Differential Revision: https://reviews.llvm.org/D91243

Revert "[Coroutine] Allocas used by StoreInst does not always escape"

This reverts commit 8bc7b9278e55c4c8c731e7600a2d146438697964, which landed by accident.

[mlir][sparse] export sparse tensor runtime support through header file

Exposing the C versions of the methods of the sparse runtime support lib
through header files will enable using the same methods in an MLIR program
as well as a C++ program, which will simplify future benchmarking comparisons
(e.g. comparing MLIR generated code with eigen for Matrix Market sparse matrices).

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D91316

[IndVars] IV user should not prevent use widening

Sometimes the an instruction we are trying to widen is used by the IV
(which means the instruction is the IV increment). Currently this may
prevent its widening. We should ignore such user because it will be
dead once the transform is done anyways.

Differential Revision: https://reviews.llvm.org/D90920
Reviewed By: fhahn

[Coroutine] Allocas used by StoreInst does not always escape

In the existing logic, for a given alloca, as long as its pointer value is stored into another location, it's considered as escaped.
This is a bit too conservative. Specifically, in non-optimized build mode, it's often to have patterns of code that first store an alloca somewhere and then load it right away.
These used should be handled without conservatively marking them escaped.

This patch tracks how the memory location where an alloca pointer is stored into is being used. As long as we only try to load from that location and nothing else, we can still
consider the original alloca not escaping and keep it on the stack instead of putting it on the frame.

Differential Revision: https://reviews.llvm.org/D91305

[IndVars] Recognize 'sub nuw' expressed as 'add' for widening

InstCombine canonicalizes 'sub nuw' instructions to 'add' without the
`nuw` flag. The typical case where we see it is decrementing induction
variables. For them, IndVars fails to prove that it's legal to widen them,
and inserts unprofitable `zext`'s.

This patch adds recognition of such pattern using SCEV.

Differential Revision: https://reviews.llvm.org/D89550
Reviewed By: fhahn, skatkov

[Test] Add Check statement

Fix structural comparison of template template arguments to compare the
right union member.

Should fix the armv8 buildbot.

[PowerPC] [Clang] Define macros to identify quad-fp semantics

We have option -mabi=ieeelongdouble to set current long double to
IEEEquad semantics. Like what GCC does, we need to define
__LONG_DOUBLE_IEEE128__ macro in this case, and __LONG_DOUBLE_IBM128__
if using PPCDoubleDouble.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D90208