review.tizen.org Git - platform/upstream/llvm.git/log

[ARM] Add patterns for select(p, BinOp(x, y), z) -> BinOpT(x, y, p, z)

Most MVE instructions can be predicated to fold a select into the
instruction, using the predicate and the select's else operand as a passthrough.
This adds tablegen patterns for most two operand instructions using the
newly added TwoOpPattern from 1030e82598da.

Differential Revision: https://reviews.llvm.org/D83222

[DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope

Summary:
This patch reduces file size in debug builds by dropping variable locations a
debugger user will not see.

After building the debug entity history map we loop through it. For each
variable we look at each entry. If the entry opens a location range which does
not intersect any of the variable's scope's ranges then we mark it for removal.
After visiting the entries for each variable we also mark any clobbering
entries which will no longer be referenced for removal, and then finally erase
the marked entries. This all requires the ability to query the order of
instructions, so before this runs we number them.
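
The core test described above can be sketched as an interval-intersection
check. This is only an illustration, not the code from the patch: the function
names and the half-open (start, end) instruction-number ranges are assumptions
made for the example, and the handling of clobbering entries is left out.

```python
def overlaps(a, b):
    """Return True if half-open instruction ranges [a0, a1) and [b0, b1) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def trim_out_of_scope(location_ranges, scope_ranges):
    """Keep only the location ranges that intersect at least one scope range."""
    return [loc for loc in location_ranges
            if any(overlaps(loc, scope) for scope in scope_ranges)]


# Hypothetical numbering: the variable's scope covers instructions 4..8.
kept = trim_out_of_scope([(0, 3), (5, 8), (10, 12)], [(4, 9)])
print(kept)
```

Numbering the instructions first is what makes the comparisons in overlaps()
possible.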

Tests:
Added llvm/test/DebugInfo/X86/trim-var-locs.mir

Modified llvm/test/DebugInfo/COFF/register-variables.ll
  Branch folding merges the tails of if.then and if.else into if.else. Each
  block's debug-locations point to different scopes so when they're merged we
  can't use either. Because of this the variable 'c' ends up with a location
  range which doesn't cover any instructions in its scope; with the patch
  applied the location range is dropped and its flag changes to IsOptimizedOut.

Modified llvm/test/DebugInfo/X86/live-debug-variables.ll
Modified llvm/test/DebugInfo/ARM/PR26163.ll
  In both tests an out of scope location is now removed. The remaining location
  covers the entire scope of the variable allowing us to emit it as a single
  location.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D82129

[llvm-readelf] - Introduce describe() helper functions.

These functions can be used to generate strings like
"SHT_?? section with index ?" to describe sections in error/warning messages,
which helps to simplify and generalize them.

This also allows isolating the following common code pattern:
`&Sec - &cantFail(Obj->sections()).front();`

Differential revision: https://reviews.llvm.org/D84240

[AMDGPU] Don't combine memory intrs to v3i16

v3i16 and v3f16 currently cannot be legalized and lowered so they should
not be emitted by inst combining.

Moved the check down to still allow extracting 1 or 2 elements via the dmask.

Fixes image intrinsics being combined to return v3x16.

Differential Revision: https://reviews.llvm.org/D84223

[LAA] Return SmallVectorImpl& instead of SmallVector& (NFC).

[llvm-readelf/readobj] - Fix the behavior when a section is included in two groups at the same time.

The current behavior was introduced by me in D37567 and it is a bit strange. It prints the
"Error: ...." message to the errs() manually and stops dumping the group section which has this error.
This behavior is consistent with GNU, but it is very inconsistent with what the regular llvm-readelf
code usually does/prints, so I suggest changing the implementation:

1) Instead of printing "Error: ...." to errs() - just report a warning.
2) Try to continue dumping the section.
3) Merge broken-group.test into group.test.

This is what this patch does.

Differential revision: https://reviews.llvm.org/D84170

[PowerPC] fixupIsDeadOrKill start and end in different block fixing

In fixupIsDeadOrKill, we assume that StartMI and EndMI exist in the same
basic block, and the function asserts this. The assumption is wrong
before RA, as before RA the true definition may exist in another
block through copy-like instructions.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D83365

[OpenMP][NFC] pass on env variables to libomptarget tests

[clangd] Fix conversion from Windows UNC paths to file URI format.

Summary:
The fix improves handling of Windows UNC paths to align with Appendix E. Nonstandard Syntax Variations of RFC 8089.

Before this fix it was difficult to use Windows UNC paths in compile_commands.json database as such paths were converted to file URIs using 'file:////auth/share/file.cpp' notation instead of recommended 'file://auth/share/file.cpp'.

As an example, VS.Code cannot understand file URIs with 4 starting slashes, so features such as go-to-definition, jump-to-file, hover tooltip, etc. stop working. This also applies to files which reside on Windows network-mapped drives, because clangd internally resolves file paths to real paths in some cases and such paths get resolved to UNC paths.
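
The recommended form can be illustrated with a short sketch. This is not
clangd's implementation (which is C++); the helper name unc_to_file_uri is
hypothetical and the sketch only covers the simple \\host\share\path case.

```python
from urllib.parse import quote


def unc_to_file_uri(path):
    """Convert a Windows UNC path into the 'file://host/share/...' form."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    # Drop the leading '\\', then split off the host as the URI authority.
    authority, _, rest = path[2:].replace("\\", "/").partition("/")
    return "file://" + authority + "/" + quote(rest)


uri = unc_to_file_uri(r"\\auth\share\file.cpp")
print(uri)
```

Placing the host in the authority component is what avoids the four leading
slashes.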

Reviewers: sammccall, kadircet

Reviewed By: sammccall

Subscribers: ormris, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, kbobyrev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84172

[llvm-readobj/readelf] - Don't fail dumping when unable to read the name of the SHT_DYNSYM section.

We have an issue currently: we are trying to read the name of the SHT_DYNSYM section
very early and using `unwrapOrError` call for that.

The name is needed only for the GNU output. Because of the current logic, the tool
fails to dump the whole object when something is wrong with the name of the .dynsym section.

This patch delays reading the name and also allows it to be broken.

Differential revision: https://reviews.llvm.org/D84173

[Test] Add more simple tests for PR46786

[sanitizer,NFC] InternalAlloc cleanup

[analyzer][solver] Track symbol disequalities

Summary:
This commit adds another relation that we can track separately from
range constraints.  Symbol disequality can help us understand that
two equivalence classes are not equal to each other.  We can generalize
this knowledge to classes because for every a, b, c, and d such that
a == b, c == d, and b != c, it is true that a != d.

As a result, we can reason about other equalities/disequalities of symbols
that we know nothing else about, i.e. no constraint ranges associated
with them.  However, we also benefit from the knowledge of disequal
symbols by following the rule:
  if a != b and b == C where C is a constant, a != C
This information can refine associated ranges for different classes
and reduce the number of false positives and paths to explore.
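
A minimal sketch of the idea, assuming a plain union-find for equivalence
classes plus a per-class set of disequal classes. The class and method names
are hypothetical and the data structures are not the ones used by the
analyzer; contradictory input (merging two classes already known to be
disequal) is not handled.

```python
class SymbolClasses:
    """Toy equivalence classes with disequality tracked between classes."""

    def __init__(self):
        self.parent = {}    # symbol -> parent symbol (union-find forest)
        self.disequal = {}  # class representative -> set of representatives

    def find(self, sym):
        self.parent.setdefault(sym, sym)
        while self.parent[sym] != sym:
            sym = self.parent[sym]
        return sym

    def mark_equal(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[rb] = ra
        # Re-point every disequality that mentioned rb at the merged class.
        for other in self.disequal.pop(rb, set()):
            self.disequal[other].discard(rb)
            self.disequal[other].add(ra)
            self.disequal.setdefault(ra, set()).add(other)

    def mark_disequal(self, a, b):
        ra, rb = self.find(a), self.find(b)
        self.disequal.setdefault(ra, set()).add(rb)
        self.disequal.setdefault(rb, set()).add(ra)

    def known_disequal(self, a, b):
        return self.find(b) in self.disequal.get(self.find(a), set())


classes = SymbolClasses()
classes.mark_equal("a", "b")
classes.mark_equal("c", "d")
classes.mark_disequal("b", "c")
print(classes.known_disequal("a", "d"))
```

Because the relation is stored per class, a single b != c fact answers the
query for every pair of members of the two classes.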

Differential Revision: https://reviews.llvm.org/D83286

[analyzer][solver] Track symbol equivalence

Summary:
In most cases, we try to reason about a symbol either based on the
information we know about that symbol in particular or about its
composite parts.  This is faster and eliminates costly brute force
searches through existing constraints.

However, we do want to support some cases that are widespread enough
and involve reasoning about different existing constraints at once.
These include:
  * reasoning about 'a - b' based on what we know about 'b - a'
  * reasoning about 'a <= b' based on what we know about 'a > b' or 'a < b'

This commit expands on that part by tracking symbols known to be equal
while still avoiding brute force searches.  It changes the way we track
constraints for individual symbols.  If we know for a fact that 'a == b'
then there is no need to track constraints for both 'a' and 'b', especially
if these constraints are different.  This additional relationship makes
dead/live logic for constraints harder as we want to maintain as much
information on the equivalence class as possible, but we still won't
carry the information that we don't need anymore.
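
The two special cases listed above can be sketched as follows. This is a
simplified illustration with hypothetical helper names; it assumes unbounded
integers and therefore ignores the wraparound that fixed-width types
introduce.

```python
def negate_range(lo, hi):
    """Given lo <= b - a <= hi, return the bounds for a - b."""
    return (-hi, -lo)


def le_from_gt(gt_is_true):
    """'a <= b' is the logical negation of 'a > b'; None means unknown."""
    return None if gt_is_true is None else not gt_is_true


print(negate_range(1, 5))
print(le_from_gt(True))
```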

Differential Revision: https://reviews.llvm.org/D82445

[analyzer] Introduce small improvements to the solver infra

Summary:
* Add a new function to delete points from range sets.
* Introduce an internal generic interface for range set intersections.
* Remove unnecessary bits from a couple of solver functions.
* Add in-code sections.
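
Deleting a point from a range set, the first item above, can be sketched like
this. The representation (a sorted list of inclusive (lo, hi) pairs) and the
function name are assumptions for the example, not the analyzer's RangeSet
API.

```python
def delete_point(ranges, point):
    """Remove a single value from a list of inclusive (lo, hi) ranges."""
    result = []
    for lo, hi in ranges:
        if point < lo or point > hi:
            result.append((lo, hi))  # range does not contain the point
            continue
        if lo < point:
            result.append((lo, point - 1))  # part below the point
        if point < hi:
            result.append((point + 1, hi))  # part above the point
    return result


print(delete_point([(0, 10)], 5))
```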

Differential Revision: https://reviews.llvm.org/D82381

[lldb/test] Delete result formatter machinery entirely

After more investigation, I realised this part of the code is totally
unused. It was used for communicating the test results from the
"inferior" dotest process to the main "dosep" process running
everything. Now that everything is being orchestrated through lit, this
is not used for anything.

[AArch64][SVE] Correctly allocate scavenging slot in presence of SVE.

This patch addresses two issues:

* Forces the availability of the base-pointer (x19) when the frame has
  both scalable vectors and variable-length arrays. Otherwise it will
  be expensive to access non-SVE locals.

* In presence of SVE stack objects, it will allocate the emergency
  scavenging slot close to the SP, so that they can be accessed from
  the SP or BP if available. If accessed from the frame-pointer, it will
  otherwise need an extra register to access the scavenging slot because
  of mixed scalable/non-scalable addressing modes.

Reviewers: efriedma, ostannard, cameron.mcinally, rengolin, david-arm

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70174

[lldb/interpreter] Fix formatting in CommandInterpreter.cpp (NFC)

This patch addresses some formatting issues introduced by commit
5bb742b10dafd595223172ae985687765934ebe9

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

Make lit TestRunner.py work in Python 3

Summary: In Python 3, SubstituteCaptures objects are no longer converted to strings implicitly behind the scenes. Converting them explicitly makes the TestRunner work in Python 3.

Reviewers: gribozavr2, compnerd

Reviewed By: gribozavr2

Subscribers: tbkka, delcypher, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81361

[lldb/interpreter] Add ability to save lldb session to a file

This patch introduces a new feature that allows users to save their
debugging session's transcript (commands + outputs) to a file.

It differs from the reproducers since it doesn't require capturing a
session preemptively and replaying the reproducer file in lldb.
The user can choose to save the session manually using the session save
command or automatically by setting interpreter.save-session-on-quit
in their init file.

To do so, the patch adds a Stream object to the CommandInterpreter that
will hold the input command from the IOHandler and the CommandReturnObject
output and error. This way, that stream object accumulates passively all
the interactions throughout the session and will save them to disk on demand.

The user can specify a file path where the session's transcript will be
saved. However, it is optional, and when it is not provided, lldb will
create a temporary file name according to the session date and time.

rdar://63347792

Differential Revision: https://reviews.llvm.org/D82155

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[ARM] Predicated binary operation tests. NFC

[lldb/test] Do a better job at setting (DY)LD_LIBRARY_PATH

Summary:
registerSharedLibrariesWithTarget was setting the library path
environment variable to the process build directory, but the function is
also accepting libraries in other directories (in which case they won't
be found automatically).

This patch makes the function set the path variable correctly for these
libraries too. This enables us to remove the code for setting the path
variable in TestWeakSymbols.py, which was working only accidentally --
it was relying on the fact that
launch_info.SetEnvironmentEntries(..., append=True)
would not overwrite the path variable it has set, but that is going to
change with D83306.

Reviewers: davide, jingham

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D83552

[PowerPC] Extend .reloc directive on PowerPC

When the compiler generates a GOT indirect load it must generate two loads. One
that loads the address of the element from the GOT and a second to load the
actual element based on the address just loaded from the GOT. However, the
linker can optimize these two loads into one load if it knows that it is safe
to do so. The compiler can tell the linker that the optimization is safe
by using the R_PPC64_PCREL_OPT relocation.

This patch extends the .reloc directive to allow the following setup

  pld 3, vec@got@pcrel(0), 1
.Lpcrel1=.-8
      ... More instructions possible here ...
.reloc .Lpcrel1,R_PPC64_PCREL_OPT,.-.Lpcrel1
  lwa 3, 4(3)

Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach, MaskRay

Reviewed By: nemanjai, MaskRay

Differential Revision: https://reviews.llvm.org/D79625

[clangd] Fix Origin and MainFileOnly-ness for macros

Summary:
This was resulting in macros coming from preambles vanishing when the user
has opened the source header. For example:

```
// test.h:
#define X
```

and

```
// test.cc
#include "test.h"
^
```

If user only opens test.cc, we'll get `X` as a completion candidate,
since it is indexed as part of the preamble. But if the user opens
test.h afterwards we would index it as part of the main file and lose
the symbol (as new index shard for test.h will override the existing one
in dynamic index).

We were also not setting origins for macros correctly; this patch fixes
that too.

Fixes https://github.com/clangd/clangd/issues/461

Reviewers: hokein

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84297

[Thumb] set code alignment for 16-bit load from constant pool

Summary:
LLVM miscompiles this code when compiling for a target with v8.2-A FP16 and the Thumb ISA at -O0:

extern void bar(__fp16 P5);

int main() {
__fp16 P5 = 1.96875;
bar(P5);
}

The code section containing main has 2 byte alignment.
It needs to have 4 byte alignment,
because the load literal instruction has an offset from the
load address with the low 2 bits zeroed.
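
The effect of the alignment can be illustrated with a small arithmetic sketch.
It assumes the architectural rule that a Thumb load-literal computes its
address as Align(PC, 4) plus an immediate, with PC reading as the instruction
address plus 4; the function name and the section offsets are hypothetical.

```python
def ldr_literal_target(insn_addr, imm):
    """Address read by a Thumb load-literal at insn_addr with offset imm."""
    pc = insn_addr + 4      # in Thumb state PC reads as instruction address + 4
    return (pc & ~3) + imm  # Align(PC, 4) + imm


# Hypothetical layout: instruction at section offset 2, literal at offset 8.
# The immediate is chosen assuming the section starts on a 4-byte boundary.
INSN_OFF, LITERAL_OFF, IMM = 2, 8, 4

for base in (0x1000, 0x1002):  # 4-byte aligned vs. only 2-byte aligned
    target = ldr_literal_target(base + INSN_OFF, IMM)
    print(hex(base), target == base + LITERAL_OFF)
```

With the 2-byte aligned base the computed address no longer matches the
literal, which is why the section needs 4-byte alignment.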

I do not include a test case in this check-in.
llc and llvm-mc do not exhibit this bug. They do not set code section alignment
in the same manner as clang.

Reviewers: dnsampaio

Reviewed By: dnsampaio

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84169

[Matrix] Add LowerMatrixIntrinsics to the NPM

Pass LowerMatrixIntrinsics wasn't yet running under the new pass
manager, and this adds LowerMatrixIntrinsics to the pipeline (to the
same place as where it is running in the old PM).

Differential Revision: https://reviews.llvm.org/D84180

[SCEV] Remove premature assert. PR46786

This assert was added to verify the assumption that a GEP's SCEV will be of pointer type,
based on the fact that it should be a SCEVAddExpr with (at least) the last operand being
a pointer. Two notes:
- GEP's SCEV does not have to be a SCEVAddExpr after all simplifications;
- In the current state, GEP's SCEV does not have to have at least one pointer operand
(all of them can become int during the transforms).

However, we might want to be at a point where it is true. We are currently removing
this assert and will try to enumerate the cases where "is pointer" notion might be
lost during the transforms. When all of them are fixed, we can return it.

Differential Revision: https://reviews.llvm.org/D84294
Reviewed By: lebedev.ri

AMDGPU: Simplify f16 to i64 custom lowering

The range that f16 can represent fits into i32.
Lower as f16->i32->i64 instead of f16->f32->i64,
since f32->i64 has a long expansion.
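
The range claim can be checked numerically. The sketch below uses Python's
struct module to decode the largest finite IEEE half-precision value; the
helper name is hypothetical and the code only models the values, not the
AMDGPU lowering itself.

```python
import struct

# Largest finite f16: bit pattern 0x7BFF.
F16_MAX = struct.unpack("<e", struct.pack("<H", 0x7BFF))[0]
I32_MIN, I32_MAX = -2**31, 2**31 - 1


def f16_to_i64_via_i32(value):
    """Round value to f16, convert to i32 (truncating), then widen to i64."""
    as_f16 = struct.unpack("<e", struct.pack("<e", value))[0]
    as_i32 = int(as_f16)  # truncates toward zero
    assert I32_MIN <= as_i32 <= I32_MAX
    return as_i32  # sign extension to i64 keeps the value unchanged


print(F16_MAX, f16_to_i64_via_i32(-1.5))
```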

Differential Revision: https://reviews.llvm.org/D84166

[ARM] Fix Asm/Disasm of TBB/TBH instructions

Summary:
This fixes Bugzilla #46616 in which it was reported
that "tbb  [pc, r0]" was marked as SoftFail
(aka unpredictable) incorrectly.

Expected behaviour is:
* ARMv8 is required to use sp as rn or rm
  (tbb/tbh only have a Thumb encoding so using Arm mode
  is not an option)
* If rm is the pc then the instruction is always
  unpredictable

Some of this was implemented already and this fixes the
rest. Added tests cover the new and pre-existing handling.

Reviewers: ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84227

add -fpch-codegen/debuginfo mapping to -fmodules-codegen/debuginfo

Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.

Differential Revision: https://reviews.llvm.org/D83623

accept 'clang++ -c a.pch -o a.o' to create PCH's object file

This should work the same way as with a.pcm for modules.
An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o
-Xclang -building-pch-with-obj', which is what clang-cl's /Yc does
internally.

Differential Revision: https://reviews.llvm.org/D83716

[AST][RecoveryExpr] Error-dependent expression should not be treated as a null pointer constant.

If an expression is contains-error and its type is unknown (dependent), we
don't treat it as a null pointer constant.

Fix a recovery-ast crash on C.

Differential Revision: https://reviews.llvm.org/D84222

Fix the clang-tidy build after get/isIntegerConstantExpression
refactoring.

Reland [lldb] Unify type name matching in FormattersContainer

This was originally reverted because the Linux bots were red after this landed,
but it seems that was actually caused by a different commit. I double checked
that this works on Linux, so let's reland this on Linux.

Summary:

FormattersContainer stores LLDB's formatters. It's implemented as a templated
map-like data structure that supports any kind of value type and only allows
ConstString and RegularExpression as the key types. The keys are used for
matching type names (e.g., the ConstString key `std::vector` matches the type
with the same name while RegularExpression keys match any type where the
RegularExpression instance matches).

The fact that a single FormattersContainer can only match either by string
comparison or regex matching (depending on the KeyType) causes us to always have
two FormattersContainer instances in all the formatting code. This also leads to
us having every type name matching logic in LLDB twice. For example,
TypeCategory has to implement every method twice (one string matching one, one
regex matching one).

This patch changes FormattersContainer to instead have a single `TypeMatcher`
key that wraps the logic for string-based and regex-based type matching and is
now the only possible KeyType for the FormattersContainer. This means that a
single FormattersContainer can now match types with both regex and string
comparison.

To summarize the changes in this patch:
* Remove all the `*_Impl` methods from `FormattersContainer`
* Instead call the FormatMap functions from `FormattersContainer` with a
  `TypeMatcher` type that does the respective matching.
* Replace `ConstString` with `TypeMatcher` in the few places that directly
  interact with `FormattersContainer`.

I'm working on some follow up patches that I split up because they deserve their
own review:

* Unify FormatMap and FormattersContainer (they are nearly identical now).
* Delete the duplicated half of all the type matching code that can now use one
  interface.
* Propagate TypeMatcher through all the formatter code interfaces instead of
  always offering two functions for everything.

There is one ugly design part that I couldn't get rid of yet and that is that we
have to support getting back the string used to construct a `TypeMatcher` later
on. The reason for this is that LLDB only supports referencing existing type
matchers by just typing their respective input string again (without even
supplying if it's a regex or not).

Reviewers: davide, mib

Reviewed By: mib

Subscribers: mgorny, JDevlieghere

Differential Revision: https://reviews.llvm.org/D84151

[MLIR] Set alignment in AllocOp of normalizeMemref()

AllocOp is updated in normalizeMemref(AllocOp allocOp), but, when the
AllocOp has `alignment` attribute, it was ignored and updated AllocOp
does not have `alignment` attribute. This patch fixes it.

Differential Revision: https://reviews.llvm.org/D83656

[NFC][Reduce] Group llvm-reduce options into a group, uncluttering --help

[SimplifyCFG] Do not create unneeded PR Phi in block with convergent calls

We do not thread blocks with convergent calls, but this check was missing
when we decide to insert PR Phis into it (which we only do for threading).

Differential Revision: https://reviews.llvm.org/D83936
Reviewed By: nikic

[PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc

The current PowerPC backend generates a wrong code sequence if the stack pointer
has to be realigned when `-fstack-clash-protection` is enabled. When probing
dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes
`NegSizeReg` as input and returns
`FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated
correctly, however code following `PREPARE_PROBED_ALLOCA` still uses
value of `NegSizeReg`, which does not contain `ActualNegSize` if
`MaxAlign > TargetAlign`, to calculate loop trip count and residual
number of bytes.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84152

[PowerPC] Fix wrong codegen when stack pointer has to realign in prologue

The current PowerPC backend generates a wrong code sequence if the stack pointer
has to be realigned when -fstack-clash-protection is enabled. When probing in
the prologue, the backend should generate a subtraction instruction rather
than a `stux` instruction to realign the stack pointer.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84218

[lldb] Adjust for getIntegerConstantExpression refactor

[PowerPC] Fix the implicit operands in PredicateInstruction()

Summary:
In the function `PPCInstrInfo::PredicateInstruction()`, we replace
non-predicated instructions with predicated instructions, but we forgot to add
the new implicit operands that the new predicated instruction needs. This
patch fixes that.

Reviewed By: jsji, efriedma

Differential Revision: https://reviews.llvm.org/D82390

[OpenMP] Add missing RUN lines for OpenMP 4.5

Summary: This was missed when default version was upgraded to 5.0 (part of D81098)

Reviewers: saiislam, ABataev, jdoerfert

Reviewed By: saiislam

Subscribers: yaxunl, guansong, sstefan1, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84221

Follow-on fixes for get/isIntegerConstantExpression

[DWARFYAML] Make the length field of compilation units optional. NFC.

This patch makes the length field of compilation units optional (0 by
default).

Reapply "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression"

Reapply 49e5f603d40083dce9c05796e3cde3a185c3beba
which had been reverted in c94332919bd922032e979b3ae3ced5ca5bdf9650.

Originally reverted because I hadn't updated it in quite a while when I
got around to committing it, so there were a bunch of missing changes to
new code since I'd written the patch.

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D76646

[DWARFYAML] Use yaml::Hex64 rather than uint64_t as length. NFC.

It's better to use yaml::Hex64 as length in the compilation unit.

[Coverage] fix failed test case.

[flang] Replay a FORMAT at the right position

When FORMAT control reaches the final parenthesis and data items
remain, we advance a record and revert to the beginning of the
FORMAT for further items. But when the FORMAT contains any
nested parenthesized group of editing descriptors, possibly
repeated, reversion must be to the beginning of the last such
top-level parenthesized group, including its repetition count.
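
The reversion point described above can be located with a single scan over
the format text. This is a simplified sketch, not the runtime's code: the
function name is hypothetical, and it ignores parentheses inside character
string edit descriptors and blanks between a repeat count and its group.

```python
def reversion_offset(fmt):
    """Index in fmt where format control restarts after the final ')'."""
    depth = 0
    last_group_start = None
    for i, ch in enumerate(fmt):
        if ch == "(":
            depth += 1
            if depth == 2:  # a group directly inside the outer parentheses
                start = i
                while start > 0 and fmt[start - 1].isdigit():
                    start -= 1  # include the repetition count
                last_group_start = start
        elif ch == ")":
            depth -= 1
    # Without any nested group, revert to just after the opening parenthesis.
    return 1 if last_group_start is None else last_group_start


fmt = "(I5,2(F8.2,A),3(I3))"
print(fmt[reversion_offset(fmt):])
```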

Reviewed By: sscalpone, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D84281

[flang] Fix source line continuation in potential macro calls (bugzilla 46768)

The prescanner looks for implicit continuation lines when
there are unclosed parentheses at the end of a line, so that
source preprocessing macro references with arguments that span
lines are recognized. The condition that determines this
implicit continuation has been put into a predicate member
function and corrected to apply only when the following line
is source (not a preprocessing directive, comment, &c.).

Fixes bugzilla #46768.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D84280

[flang] Implement byte-swapped external unformatted I/O in runtime

Add SetConvert() to the OPEN statement's runtime API.
Add ByteswapOption() to the main program's runtime API.
Check a $FORT_CONVERT environment variable, too, for
a swapping specifier.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D84284

[flang] Handle leading zeroes after decimal in REAL formatted input

Leading zero digits after the decimal mark were being dropped.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D84282

[Coverage] Fix coverage test cases.

[flang] Check for misplaced labels

In fixed form source, complain when a label digit appears
outside the label field & when a non-digit appears in the label
field.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D84283

[PowerPC] add store (load float*) pattern to isProfitableToHoist

store (load float*) can be optimized to store(load i32*) in InstCombine pass.

Add store (load float*) to isProfitableToHoist to make sure we don't break
the opt in InstCombine pass.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D82341

[test-release] fallback to py3's venv module

If virtualenv is not available, we can look for py3's venv instead. We only
use this particular env for installing and running the test suite.

Disable -Wsuggest-override for all remaining unittests/ directories

[lld] Disable -Wsuggest-override for unittests

[Coverage] Add comment to skipped regions

Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757.
Add comment to skipped regions so we don't track execution count for lines containing only comments.

Differential Revision: https://reviews.llvm.org/D84208

[ValueTracking] Fix incorrect handling of canCreateUndefOrPoison

.. in isGuaranteedNotToBeUndefOrPoison.

This caused early exit of isGuaranteedNotToBeUndefOrPoison, making it return
imprecise result.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84251

[CFE] Add nomerge function attribute to inline assembly.

Sometimes we also want to avoid merging inline assembly. This patch adds
the nomerge function attribute to inline assembly.

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D84225

[PDB][NativeSession] Clean up some things in NativeSession.

-Use the actual sect/offset to keep track of symbols in the cache so they don't get created multiple times with different addresses.
-Remove getSymTag from PDBFunctionSymbol/PDBPublicSymbol because it's already implemented in the base class
-Merge the symbolizer test files for DIA and native, since the tests are the same.
-Implement getCompilandId for NativeLineNumber

Reviewed By: amccarth

Differential Revision: https://reviews.llvm.org/D84208

[clang] Disable -Wsuggest-override for unittests/

[NFC] Clean up doc comment and implementation for Module::isSubModuleOf.

Patch by Varun Gandhi!

Differential Revision: https://reviews.llvm.org/D84087

GlobalISel: Use Register and update comment physical register syntax

[PowerPC][Power10] Add Vector Multiply/Mod/Divide Instruction Definitions and MC Tests

This patch adds the td definitions and asm/disasm tests for the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivuw, vdivsd, vdivud
- Vector Multiply High Word/Doubleword: vmulhsw, vmulhsd, vmulhuw, vmulhud
- Vector Divide Extended Word/Doubleword: vdivesw, vdiveuw, vdivesd, vdiveud

Differential Revision: https://reviews.llvm.org/D82929

Revert "[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy"

This reverts commit 64eb3a4915f00cca9af4c305a9ff36209003cd7b.

It caused miscompiles with optimizations enabled. Reverting while I investigate.

[AArch64][GlobalISel] Fix TLS accesses clobbering registers incorrectly.

This was happening because the BLR didn't have a use of the X0 arg register,
which would end up being re-used in high reg pressure situations.
The change also avoids hard coding the use of X0 for the sequence except to
copy the value for the call. ld64 should still be able to optimize it.

rdar://65438258

AMDGPU/GlobalISel: Add some baseline degenerate call argument tests

AMDGPU/GlobalISel: Fix not erasing inst when lowering G_FRINT

GlobalISel: Legalize G_FPOWI

GlobalISel: Translate llvm.powi intrinsic

There are a few questionable things about this intrinsic and existing
DAG implementation. For some reason the intrinsic hardcodes the second
operand to be scalar-only i32, and SelectionDAG builder makes a
legalization decision based on whether the operand is constant.

AMDGPU: Start interpreting byref on kernel arguments

These are treated identically to value aggregates placed in the kernel
argument list. A %struct.foo or %struct.foo addrspace(4)*
byref(sizeof(%struct.foo)) align(alignof(%struct.foo)) argument should
produce the same offsets and argument metadata.

This handles all 3 kernel ABI implementations, and the two HSA
metadata emission paths.

[mlir][docs] Fix Markdown format in Language Reference

Differential Revision: https://reviews.llvm.org/D84271

Fix pow and ldexp in HIP header

CodeGen: Add support for lowering byref attribute

Add implementations for fmin, fminf, and fminl. The testing infrastructure update is split out to https://reviews.llvm.org/D83931.

[SCCP] Add switch+range tests (NFC)

[X86][AVX] getTargetShuffleMask - don't decode VBROADCAST(EXTRACT_SUBVECTOR(X,0)) patterns.

getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal.

[llvm-libtool-darwin] Allow flattening archives

Add support for flattening archives while creating static libraries.
Hence, archives can now be passed as input in addition to Mach-O binaries.
Furthermore, archives themselves must only contain Mach-O binaries. As
per cctools' libtool's behavior, llvm-libtool-darwin does not flatten
archives recursively.

Reviewed by alexshap, smeenai, jhenderson

Differential Revision: https://reviews.llvm.org/D83520

Update Test (EXPECT_EQ and friends) to accept __uint128_t and floating point types (float, double, long double).

Summary: Update Test (EXPECT_EQ and friends) to accept __uint128_t and floating point types (float, double, long double).

Reviewers: sivachandra

Subscribers: mgorny, libc-commits

Tags: #libc-project

Differential Revision: https://reviews.llvm.org/D83931

DebugInfo: Add missing comment from llvm/test/DebugInfo/X86/debug-macro-dwo.ll

Meant to include this in 63a45091e5f3fce525d7bb8823df95a468ae69d0

Revert "[clangd] Fixes in lit tests"

This reverts commit ff63d6be93dc5958bf35d92919ce6fafcc611e89.

DAG: Handle expanding strict_fsub into fneg and strict_fadd

The AMDGPU handling of f16 vectors is terrible still since it gets
scalarized even when the vector operation is legal.

The code is essentially duplicated between the non-strict and
strict case. Apparently no other expansions are currently trying to do
this. This is mostly because I found the behavior of
getStrictFPOperationAction to be confusing. In the ARM case, it would
expand strict_fsub even though it shouldn't due to the later check. At
that point, the logic required to check for legality was more complex
than just duplicating the 2 instruction expansion.
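
For illustration only, the two-instruction expansion amounts to rewriting x - y as x + (-y). The scalar C++ below is a sketch of that identity, not the SelectionDAG code; `expandFSub` is a hypothetical name, and the sketch does not model strict floating-point exception semantics.

```cpp
#include <cassert>

// Scalar model of the expansion: one negate followed by one add.
float expandFSub(float x, float y) {
  float negY = -y;  // corresponds to fneg (flips the sign of y)
  return x + negY;  // corresponds to the (strict) fadd
}
```

For non-NaN inputs the result compares equal to a direct `x - y`.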

[llvm-libtool-darwin] Add support for -static option

Add support for creating static libraries when the input includes only
Mach-O binaries (and not libraries/archives themselves).

Reviewed by alexshap, Ktwu, smeenai, jhenderson, MaskRay, mtrent

Differential Revision: https://reviews.llvm.org/D83002

[AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol

SUMMARY:

When C source code calls memset, memcpy, memmove, etc. (which are LLVM intrinsic functions), LLVM generates IR
such as call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false).
For example, given the C source code
bash> cat test_memset.call

struct S{
int a;
int b;
};
extern struct  S s;
void bar() {
  memset(&s, s.b, s.b);
}
LLVM generates:

%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
  %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  %1 = trunc i32 %0 to i8
  %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1
For the AIX assembler (as) to assemble the output without -u,
the following assembly code is needed:
.extern .memset
(Currently we do not output extern linkage for LLVM intrinsic functions.
Even when we do, we should not emit .extern llvm.memset.p0i8.i32;
instead we should emit .extern memset.)

The same applies to other LLVM builtin functions such as __floatdidf. Even if the C source code does not call __floatdidf (and the generated IR does not contain the call either), the call
can be generated by an LLVM optimization.
The function is not in the function list of the Module, but we still need to emit .extern .__floatdidf.

The solution:
We record all the LLVM intrinsic extern symbols in transformCallee(), and emit all of these symbols in AsmPrinter::doFinalization(Module &M).
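
The record-then-emit approach can be sketched as below. This is a simplified illustration under stated assumptions: `ExternSymbolRecorder` and its methods are hypothetical names, not the classes touched by the patch, and the symbol spellings are taken from the examples in this message.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical recorder; stands in for the bookkeeping done during lowering.
class ExternSymbolRecorder {
public:
  // Called wherever a call to a library routine is lowered.
  void record(const std::string &symbol) { symbols_.insert(symbol); }

  // Called once when the module is finalized: one .extern line per symbol.
  std::vector<std::string> emitDirectives() const {
    std::vector<std::string> lines;
    for (const std::string &s : symbols_)
      lines.push_back(".extern " + s);
    return lines;
  }

private:
  std::set<std::string> symbols_; // deduplicated and kept in sorted order
};
```

Recording a symbol twice produces a single directive, so each lowering site can record unconditionally.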

Reviewers:  jasonliu, Sean Fertile, hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D78929

Fix the data layout mangling specification for 'i686-pc-macho'

Use 'o' for the mangling specification instead of 'e'. This fixes an
error in the backend caused by a mismatch between the data layouts
generated by the backend and the frontend.

rdar://problem/64168540

Avoid failing a CHECK in `DlAddrSymbolizer::SymbolizePC`.

Summary:
It turns out the `CHECK(addr >= reinterpret_cast<uptr>(info.dli_saddr))`
can fail because on armv7s on iOS 9.3 `dladdr()` returns
`info.dli_saddr` with an address larger than the address we provided.

We should avoid crashing here because crashing in the middle of reporting
an issue is very unhelpful. Instead we now try to compute a function offset
if the value we get back from `dladdr()` looks sane, otherwise we don't
set the function offset.
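
A minimal sketch of the "compute the offset only if the value looks sane" idea follows. The helper name `functionOffset` and the exact sanity test (non-null start that is not past `addr`) are assumptions made for illustration; the patch itself may use a different check.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the offset of addr within its function, or
// nothing when the reported symbol start cannot be right.
std::optional<std::uintptr_t> functionOffset(std::uintptr_t addr,
                                             std::uintptr_t symbolStart) {
  if (symbolStart == 0 || symbolStart > addr)
    return std::nullopt; // implausible start: skip the offset, don't crash
  return addr - symbolStart;
}
```

With the values from the scenario in this message (`addr` of `0x2195c870` and a reported start of `0x2195c871`) the helper returns no offset rather than aborting.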

A test case is included. It's basically a slightly modified version of
the existing `test/sanitizer_common/TestCases/Darwin/symbolizer-function-offset-dladdr.cpp`
test case that doesn't run on iOS devices right now.

More details:

In the concrete scenario on armv7s `addr` is `0x2195c870` and the returned
`info.dli_saddr` is `0x2195c871`.

This is what LLDB says when disassembling the code.

```
(lldb) dis -a 0x2195c870
libdyld.dylib`<redacted>:
    0x2195c870 <+0>: nop
    0x2195c872 <+2>: blx    0x2195c91c                ; symbol stub for: exit
    0x2195c876 <+6>: trap
```

The value returned by `dladdr()` doesn't make sense because it points
into the middle of an instruction.

There might also be other bugs lurking here because I noticed that the PCs we
gather during stack unwinding (before changing them with
`StackTrace::GetPreviousInstructionPc()`) look a little suspicious (e.g. the
PC stored for the frame we fail to symbolicate is 0x2195c873) as they don't
look properly aligned. This probably warrants further investigation in the future.

rdar://problem/65621511

Reviewers: kubamracek, yln

Subscribers: kristof.beyls, llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D84262

[OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.

Summary:
Need to avoid an optimization for base pointer mapping for target data
directives.

Reviewers: jdoerfert, ye-luo

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84182

Revert D82927 "[Loop Fusion] Integrate Loop Peeling into Loop Fusion"

This reverts commit bb8850d34d601d4edd75fd30c07821c05a726c42.

It broke 3 check-llvm-transforms-loopfusion tests in an ASAN build.

LoopFuse.cpp `for (BasicBlock *Pred : predecessors(BB)) {` may operate on a deleted BB.

[ARM] Add MVE_TwoOpPattern. NFC

This commons out a chunk of the different two operand MVE patterns into
a single helper multidef. Or technically two multidef patterns so that
the Dup qr patterns can also get the same treatment. This is most of the
two address instructions that we have some codegen pattern for (not ones
that we select purely from intrinsics). It does not include shifts,
which are more spread out and will need some extra work to be given the
same treatment.

Differential Revision: https://reviews.llvm.org/D83219

Remove the "bool" return from OptionValue::Clear and its subclasses.

Every override returns true and its return value is never checked. I can't
see how clearing an OptionValue could fail, or what you would
do if it did. The return serves no purpose.

Differential Revision: https://reviews.llvm.org/D84253

DebugInfo: make test/DebugInfo/X86/debug-macro-dwo.ll more comprehensive

The test doesn't really demonstrate the use of the debug_loc.dwo section
distinct from the debug_loc section for strings in debug_macro.dwo -
because there are no strings that appear in debug_loc.dwo that weren't
already in debug_loc, so the indexes would remain the same even if the
section that was used was fixed (to use debug_loc.dwo as per spec).

[lldb/test] Skip test in TestBitfieldIvars.py instead of xfailing it

The test triggers an ASan exception, causing job failures on the
sanitizer bot.

As suggested by Shafik.

[clangd] Fixes in lit tests

Summary:
Changes:
- `background-index.test` Add Windows support.
- `did-change-configuration-params.test` Replace `cat | FileCheck` with `FileCheck --input-file`
- `test-uri-windows.test` This test did not run on Windows despite `REQUIRES: windows-gnu || windows-msvc` (replacement: `UNSUPPORTED: !(windows-gnu || windows-msvc)`).

Reviewers: sammccall, kadircet

Reviewed By: kadircet

Subscribers: njames93, ormris, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83759

[MBP] Use profile count to compute tail dup cost if it is available

The current tail duplication in the machine block placement pass uses block frequency
information in its cost model. But a frequency number has only relative meaning
compared to other basic blocks in the same function. A large frequency number
doesn't mean a block is hot and a small frequency number doesn't mean it is cold.

To overcome this problem, this patch uses the profile count in the cost model if it's
available, so we can tail duplicate basic blocks that are really hot.
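
The preference for an absolute count over a relative frequency can be sketched as follows. `BlockInfo` and `costWeight` are hypothetical names for illustration and do not correspond to the pass's data structures; a real cost model would also keep the two units separate rather than returning them through one value.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical per-block data.
struct BlockInfo {
  std::uint64_t relativeFreq;                // meaningful only within one function
  std::optional<std::uint64_t> profileCount; // absolute execution count, if profiled
};

// Weight fed to a toy cost model: use the absolute count when it exists,
// otherwise fall back to the relative frequency.
std::uint64_t costWeight(const BlockInfo &block) {
  return block.profileCount ? *block.profileCount : block.relativeFreq;
}
```

A block with a large relative frequency but a tiny profile count is then weighted by the count, so it is not mistaken for a hot block.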

Differential Revision: https://reviews.llvm.org/D83265

[PGO][PGSO] Add profile guided size optimization to loop vectorization legality.

[compiler-rt][asan] decommit shadow memory for unmaps in fuchsia.

This CL allows asan allocator in fuchsia to decommit shadow memory
for memory allocated using mmap.

Big allocations in asan end up being allocated via `mmap` and freed with
`munmap`. However, when that memory is freed, asan returns the
corresponding shadow memory back to the OS via a call to
`ReleaseMemoryPagesToOs`.

In fuchsia, `ReleaseMemoryPagesToOs` is a no-op: to be able to free
memory back to the OS, you have to hold a handle to the vmo you want to
modify, which is tricky at the ReleaseMemoryPagesToOs level as that
function is not exclusively used for shadow memory.

The function `__sanitizer_fill_shadow` fills a given shadow memory range
with a specific value, and if that value is 0 (unpoison) and the memory
range is bigger than a threshold parameter, it will decommit that memory
if it is all zeroes.

This CL modifies the `FlushUnneededASanShadowMemory` function in
`asan_poisoning.cpp` to add a call to `__sanitizer_fill_shadow` with
value and threshold = 0. This way, all the unneeded shadow memory gets
returned back to the OS.
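
The fill-and-decommit behavior described above can be modeled with a toy structure. Everything here is hypothetical: `ToyShadow` tracks a "committed" flag per byte purely for illustration and is not how Fuchsia or `__sanitizer_fill_shadow` manage memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a shadow region; not the Fuchsia implementation.
struct ToyShadow {
  std::vector<std::uint8_t> bytes;
  std::vector<bool> committed; // one flag per byte, for simplicity

  explicit ToyShadow(std::size_t n) : bytes(n, 0xff), committed(n, true) {}

  // Fill [begin, begin + size) with value. If the value is 0 (unpoison) and
  // the range is larger than threshold, mark the range as decommitted.
  void fill(std::size_t begin, std::size_t size, std::uint8_t value,
            std::size_t threshold) {
    assert(begin <= bytes.size() && size <= bytes.size() - begin);
    for (std::size_t i = begin; i < begin + size; ++i)
      bytes[i] = value;
    if (value == 0 && size > threshold)
      for (std::size_t i = begin; i < begin + size; ++i)
        committed[i] = false;
  }
};
```

With a threshold of 0, every non-empty zero fill qualifies for decommit, which matches the intent of passing value and threshold = 0 described above.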

A test for this behavior can be found in fxrev.dev/391974

Differential Revision: https://reviews.llvm.org/D80355

Change-Id: Id6dd85693e78a222f0329d5b2201e0da753e01c0

[libTooling] In Clang Transformer, change `Metadata` field to deferred evaluation.

`Metadata` is being changed from an `llvm::Any` to a `MatchConsumer<llvm::Any>`
so that its evaluation can be dependent on the `MatchResult`s passed in.
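
The difference between an eagerly stored value and a deferred one can be sketched with standard-library types. This is only an analogy: `MatchResult`, `MetadataFn`, `constantMetadata` and `nameMetadata` are hypothetical stand-ins, with `std::any` and `std::function` used in place of `llvm::Any` and `MatchConsumer`.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a match result; not the Clang type.
struct MatchResult {
  std::string boundName;
};

// Deferred form: metadata is computed from the match when it is needed.
using MetadataFn = std::function<std::any(const MatchResult &)>;

// Wraps a fixed value, so code written for the eager form keeps working.
MetadataFn constantMetadata(std::any value) {
  return [value = std::move(value)](const MatchResult &) { return value; };
}

// Metadata that depends on the match it is evaluated against.
MetadataFn nameMetadata() {
  return [](const MatchResult &result) { return std::any(result.boundName); };
}
```

The constant wrapper ignores its argument, while `nameMetadata` produces a different value for each match, which is what deferred evaluation makes possible.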

Reviewed By: ymandel, gribozavr2

Differential Revision: https://reviews.llvm.org/D83820

Revert "[Windows] Fix limit on command line size"

This reverts commit d4020ef7c474b5e695d77aa100d7f68dc0c66b4e. It broke
LLDB buildbot: http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/17702.

[compiler-rt][test][profile] Fix missing include

... on systems where wait() isn't one of the declarations transitively included
via unistd.h (i.e. Darwin).

Differential Revision: https://reviews.llvm.org/D84207