review.tizen.org Git - platform/upstream/llvm.git/log

[NFC][CodeGen] Use while loop instead for loop in MachineBlockPlacement::optimizeBranches()
This will pass EXPENSIVE check.

llvm-svn: 368532

[ARM] MVE spill vector test. NFC

llvm-svn: 368531

[MVE] Don't try to unroll vectorised MVE loops

Due to the nature of the beat system in the MVE architecture, along with tail
predication and low-overhead loops, unrolling has less benefit compared to
normal loops. You can not, for example, hide the latency of a load with other
instructions as you can for scalar code. Preventing unrolling also makes the
code easier to read and reason about.

So if a loop contains vector code, don't enable the runtime unrolling. At least
for the time being.

Differential Revision: https://reviews.llvm.org/D65803

llvm-svn: 368530

[ARM] Permit auto-vectorization using MVE

With enough codegen complete, we can now correctly report the number and size
of vector registers for MVE, allowing auto vectorisation. This also allows FP
auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE
FP arithmetic and parity between scalar and vector FP behaviour.

Patch by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63728

llvm-svn: 368529

Properly handle reference initialization when detecting gsl::Pointer initialization chains

llvm-svn: 368528

Fix __clang_call_termiante's argument for foreign exceptions

Summary:
When exceptions are repeatedly thrown in the middle of handling another
exception, we call `__clang_call_terminate` with the exception pointer
(i32) as an argument. But in case of foreign exceptions, we don't have
the pointer, so we call the function with 0. (This requires
`__clang_call_terminate` can deal with 0 argument, which will be done
later)

But previously the 0 argument was not added as a `i32.const 0` but an
immediate by mistake, causing the `call` instruction to take not an i32
but rather an exnref, because an `exnref` is left on top of the value
stack if `br_on_exn` is not taken.

```
block i32
  br_on_exn 0, __cpp_exception
                               ;; exnref is on top of stack now
  i32.const 0                  ;; This was missing!
  call __clang_call_terminate
  unreachable
end
call __clang_call_terminate    ;; This takes i32 extracted by br_on_exn
```

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65475

llvm-svn: 368527

[LICM] Make Loop ICM profile aware

Summary:
Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking anly moves instruction to colder block.

Test Plan:

ninja check

Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk

Reviewed By: asbirlea

Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65060

llvm-svn: 368526

Revert "test commit"

This reverts commit ad92a4a2769425ad0d39ac1dbb6282f6f51a1af7.

llvm-svn: 368525

test commit

llvm-svn: 368524

[X86] Remove some more code from combineShuffle that is no longer needed with widening legalization.

llvm-svn: 368523

[X86] Remove some code from combineShuffle that seems largely unnecessary with widening legalization.

The test case that changed is probably better served through
allowing combineTruncatedArithmetic to create narrow vectors. It
also appears InstCombine would have simplified this test case
to remove the zext and trunc anyway.

llvm-svn: 368522

[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, )

llvm-svn: 368521

[NFC][InstCombine] Tests for shift amount reassociation in bittest with truncated shl (PR42399)

trunc-of-shl:
https://rise4fun.com/Alive/zGx
https://rise4fun.com/Alive/sl0L
I.e. no extra legality check needed.

https://bugs.llvm.org/show_bug.cgi?id=42399

llvm-svn: 368520

[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant

If one of the values being shifted is a constant, since the new shift
amount is known-constant, the new shift will end up being constant-folded
so, we don't need that one-use restriction then.

llvm-svn: 368519

[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction

That one-use restriction is not needed for correctness - we have already
ensured that one of the shifts will go away, so we know we won't increase
the instruction count. So there is no need for that restriction.

llvm-svn: 368518

[NFC][InstCombine] Tests for shift amount reassociation in bittest with shift of const

llvm-svn: 368517

Add support for FreeBSD's LD_32_LIBRARY_PATH

Summary:
Because the dynamic linker for 32-bit executables on 64-bit FreeBSD uses
the environment variable `LD_32_LIBRARY_PATH` instead of
`LD_LIBRARY_PATH` to find needed dynamic libraries, running the 32-bit
parts of the dynamic ASan tests will fail with errors similar to:

```
ld-elf32.so.1: Shared object "libclang_rt.asan-i386.so" not found, required by "Asan-i386-inline-Dynamic-Test"
```

This adds support for setting up `LD_32_LIBRARY_PATH` for the unit and
regression tests. It will likely also require a minor change to the
`TestingConfig` class in `llvm/utils/lit/lit`.

Reviewers: emaste, kcc, rnk, arichardson

Reviewed By: arichardson

Subscribers: kubamracek, krytarowski, fedor.sergeev, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D65772

llvm-svn: 368516

[X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREG

On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits.

This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes.

Differential Revision: https://reviews.llvm.org/D65741

llvm-svn: 368515

[NFC][CodeGen] Modify the PI++ to ++PI in MachineBlockPlacement::optimizeBranches()

llvm-svn: 368514

[TableGen] Correct the shift to the proper bit width.

- Replace the previous 32-bit shift with 64-bit one matching `OpInit`.

llvm-svn: 368513

[Reassociate] try harder to convert negative FP constants to positive

This is an extension of a transform that tries to produce positive floating-point
constants to improve canonicalization (and hopefully lead to more reassociation
and CSE).

The original patches were:
D4904
D5363 (rL221721)

But as the test diffs show, these were limited to basic patterns by walking from
an instruction to its single user rather than recursively moving up the def-use
sequence. No fast-math is required here because we're only rearranging implicit
FP negations in intermediate ops.

A motivating bug is:
https://bugs.llvm.org/show_bug.cgi?id=32939

Differential Revision: https://reviews.llvm.org/D65954

llvm-svn: 368512

[lldb] Fix dynamic_cast by no longer failing on variable without metadata

Summary:
Our IR rewriting infrastructure currently fails when it encounters a variable which has no metadata associated.
This causes dynamic_cast to fail as in this case IRForTarget considers the type info pointers ('@_ZTI...') to be
variables without associated metadata. As there are no variables for these internal variables, this is actually
not an error and dynamic_cast would work fine if we didn't throw this error.

This patch fixes this by removing this diagnostics code. In case we would actually hit a variable that has no
metadata (but is supposed to have), we still have the error in the expression log so this shouldn't make it
harder to diagnose any missing metadata errors.

This patch should fix dynamic_cast and also adds a bunch of test coverage to that language feature.

Fixes rdar://10813639

Reviewers: davide, labath

Reviewed By: labath

Subscribers: friss, labath, abidh, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D65932

llvm-svn: 368511

[clang] Fixed x86 cpuid NSC signature

Summary:
The signature "Geode by NSC" for NSC vendor is wrong.
In lib/Headers/cpuid.h, signature_NSC_edx and signature_NSC_ecx constants are inverted (cpuid signature order is ebx # edx # ecx).

Reviewers: teemperor, rsmith, craig.topper

Reviewed By: teemperor, craig.topper

Subscribers: craig.topper, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65978

llvm-svn: 368510

[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks

Summary:

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 368509

[modulemap] Add AArch64SVEACLETypes.def

Update modulemap with a new textual header.

llvm-svn: 368508

[clang-format] Add SpaceInEmptyBlock option for WebKit

See PR40840

Differential Revision: https://reviews.llvm.org/D65925

llvm-svn: 368507

[X86] Match the IR pattern form movmsk on SSE1 only targets where v4i32 isn't legal

Summary:
This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where its obvious the input won't be scalarized resulting in building a vector just do to the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that.

This fixes the case from PR42870. Buts its probably still broken the presence of logic ops feeding the movmsk pattern which would further hide the v4f32 type.

Reviewers: spatel, RKSimon, xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65689

llvm-svn: 368506

[X86] Improve the diagnostic for larger than 4-bit immediate for vpermil2pd/ps. Only allow MCConstantExprs.

llvm-svn: 368505

[Sanitizer] Reenable getusershell interception

and disabling it forAndroid.

Reviewers: krytarowski, vitalybuka

Reviewed By: krytarowski

Differential Revision: https://reviews.llvm.org/D66027

llvm-svn: 368504

[X86] Fix stack probe issue on windows32.

Summary:
On windows if the frame size exceed 4096 bytes, compiler need to
generate a call to _alloca_probe. X86CallFrameOptimization pass
changes the reserved stack size and cause of stack probe function
not be inserted. This patch fix the issue by detecting the call
frame size, if the size exceed 4096 bytes, drop X86CallFrameOptimization.

Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon

Reviewed By: rnk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65923

llvm-svn: 368503

[MemDep] allow to select block-scan-limit when constructing MemoryDependenceAnalysis

Introducing non-global control for default block-scan-limit in MemDep analysis.
Useful when there are many compilations per initialized LLVM instance (e.g. JIT).

Reviewed By: asbirlea
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65806

llvm-svn: 368502

Fix a false positive warning when initializing members with gsl::Owners.

llvm-svn: 368501

[clangd] Disallow extraction of expression-statements.

Summary:
I split out the "extract parent instead of this" logic from the "this isn't
worth extracting" logic (now in eligibleForExtraction()), because I found it
hard to reason about.

While here, handle overloaded as well as builtin assignment operators.

Also this uncovered a bug in getCallExpr() which I fixed.

Reviewers: SureYeaah

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65337

llvm-svn: 368500

Attempt to reapply "Even more warnings utilizing gsl::Owner/gsl::Pointer annotations"

llvm-svn: 368499

clangd: use -j for background index pool

Summary:
clangd supports a -j option to limit the amount of threads to use for parsing
TUs. However, when using -background-index (the default in later versions of
clangd), the parallelism used by clangd defaults to the hardware_parallelisn,
i.e. number of physical cores.

On shared hardware environments, with large projects, this can significantly
affect performance with no way to tune it down.

This change makes the -j parameter apply equally to parsing and background
index. It's not perfect, because the total number of threads is 2x the -j value,
which may still be unexpected. But at least this change allows users to prevent
clangd using all CPU cores.

Reviewers: kadircet, sammccall

Reviewed By: sammccall

Subscribers: javed.absar, jfb, sammccall, ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66031

llvm-svn: 368498

Small format fix

Differential Revision: https://reviews.llvm.org/D66034

llvm-svn: 368497

Detects whether RESOURCE_TYPE_IO is defined.

Summary: This fixes lldb build on macOS SDK prior to 10.12.

Reviewers: JDevlieghere

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D66034

llvm-svn: 368496

cfi-icall: Allow the jump table to be optionally made non-canonical.

The default behavior of Clang's indirect function call checker will replace
the address of each CFI-checked function in the output file's symbol table
with the address of a jump table entry which will pass CFI checks. We refer
to this as making the jump table `canonical`. This property allows code that
was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address
of a function, but it comes with a couple of caveats that are especially
relevant for users of cross-DSO CFI:

- There is a performance and code size overhead associated with each
  exported function, because each such function must have an associated
  jump table entry, which must be emitted even in the common case where the
  function is never address-taken anywhere in the program, and must be used
  even for direct calls between DSOs, in addition to the PLT overhead.

- There is no good way to take a CFI-valid address of a function written in
  assembly or a language not supported by Clang. The reason is that the code
  generator would need to insert a jump table in order to form a CFI-valid
  address for assembly functions, but there is no way in general for the
  code generator to determine the language of the function. This may be
  possible with LTO in the intra-DSO case, but in the cross-DSO case the only
  information available is the function declaration. One possible solution
  is to add a C wrapper for each assembly function, but these wrappers can
  present a significant maintenance burden for heavy users of assembly in
  addition to adding runtime overhead.

For these reasons, we provide the option of making the jump table non-canonical
with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump
table is made non-canonical, symbol table entries point directly to the
function body. Any instances of a function's address being taken in C will
be replaced with a jump table address.

This scheme does have its own caveats, however. It does end up breaking
function address equality more aggressively than the default behavior,
especially in cross-DSO mode which normally preserves function address
equality entirely.

Furthermore, it is occasionally necessary for code not compiled with
``-fsanitize=cfi-icall`` to take a function address that is valid
for CFI. For example, this is necessary when a function's address
is taken by assembly code and then called by CFI-checking C code. The
``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make
the jump table entry of a specific function canonical so that the external
code will end up taking a address for the function that will pass CFI checks.

Fixes PR41972.

Differential Revision: https://reviews.llvm.org/D65629

llvm-svn: 368495

Add missing REQUIRES to r368487

llvm-svn: 368494

[Bugpoint redesign] Fix nonlocal URI link in doc

Summary: Fixes documentation bot build http://lab.llvm.org:8011/builders/llvm-sphinx-docs

Reviewers: JDevlieghere

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66022

llvm-svn: 368493

[Sanitizer][Darwin] Add interceptor for malloc_zone_from_ptr

Ensure that malloc_default_zone and malloc_zone_from_ptr return the
sanitizer-installed malloc zone even when MallocStackLogging (MSL) is
requested. This prevents crashes in certain situations. Note that the
sanitizers and MSL cannot be used together. If both are enabled, MSL
functionality is essentially deactivated since it only hooks the default
allocator which is replaced by a custom sanitizer allocator.

rdar://53686175

Reviewed By: kubamracek

Differential Revision: https://reviews.llvm.org/D65990

llvm-svn: 368492

[OpenMP] Add support for close map modifier in Clang

Summary:
This patch adds support for the close map modifier in Clang.

This ensures that the new map type is marked and passed to the OpenMP runtime appropriately.

Additional regression tests have been merged from patch D55892 (author @saghir).

Reviewers: ABataev, caomhin, jdoerfert, kkwli0

Reviewed By: ABataev

Subscribers: kkwli0, Hahnfeld, saghir, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65341

llvm-svn: 368491

[DAGCombiner] exclude x*2.0 from normal negation profitability rules

This is the codegen part of fixing:
https://bugs.llvm.org/show_bug.cgi?id=32939

Even with the optimal/canonical IR that is ideally created by D65954,
we would reverse that transform in DAGCombiner and end up with the same
asm on AArch64 or x86.

I see 2 options for trying to correct this:

  1. Limit isNegatibleForFree() by special-casing the fmul pattern (this patch).
  2. Avoid creating (fmul X, 2.0) in the 1st place by adding a special-case
     transform to SelectionDAG::getNode() and/or SelectionDAGBuilder::visitFMul()
     that matches the transform done by DAGCombiner.

This seems like the less intrusive patch, but if there's some other reason to
prefer 1 option over the other, we can change to the other option.

Differential Revision: https://reviews.llvm.org/D66016

llvm-svn: 368490

Remove leftover MF->dump()'s from r368487 that break release builds

llvm-svn: 368489

[OpenMP][libomptarget] Add support for close map modifier

Summary:
This patch adds support for the close map modifier.

The close map modifier will overwrite the unified shared memory requirement and create a device copy of the data.

Reviewers: ABataev, Hahnfeld, caomhin, grokos, jdoerfert, AlexEichenberger

Reviewed By: Hahnfeld, AlexEichenberger

Subscribers: guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65340

llvm-svn: 368488

[globalisel] Add G_SEXT_INREG

Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 23

All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.

To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
  %1 = G_CONSTANT 16
  %2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
  %2 = G_SEXT_INREG %0, 16
or:
  %1 = G_CONSTANT 16
  %2:_(s32) = G_SHL %0, %1
  %3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.

Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61289

llvm-svn: 368487

Remove variable only used in an assert.

llvm-svn: 368486

Revert the test commit

llvm-svn: 368485

[clang-doc] Generate an HTML index file

clang-doc now generates a file that contains only an index to all the
infos that can be used as the landing page for the generated website.

Differential Revision: https://reviews.llvm.org/D65918

llvm-svn: 368484

Test commit.

llvm-svn: 368483

[clangd] Give absolute path to clang-tidy and include-fixer. HintPath should always be absolute, some URI schemes care.

llvm-svn: 368482

Revert "[sanitizers] MSVC warning disable for clean build" and follow-up that tried to fix the build as it's still broken.

This reverts commit 368476 and 368480.

llvm-svn: 368481

Fix compilation after SVN r368476

That revision broke compilation with this error:

lib/builtins/fixunsxfdi.c:13:2: error: unterminated conditional directive
#if !_ARCH_PPC

llvm-svn: 368480

[X86] Remove custom handling for extloads from LowerLoad.

We don't appear to need this with widening legalization.

llvm-svn: 368479

[CodeGen] Require a name for a block addr target

Summary:
A block address may be used in inline assembly. In which case it
requires a name so that the asm parser has something to parse. Creating
a name for every block address is a large hammer, but is necessary
because at the point when a temp symbol is created we don't necessarily
know if it's used in inline asm. This ensures that it exists regardless.

Reviewers: nickdesaulniers, craig.topper

Subscribers: nathanchance, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65352

llvm-svn: 368478

[MC] Don't recreate a label if it's already used

Summary:
This patch keeps track of MCSymbols created for blocks that were
referenced in inline asm. It prevents creating a new symbol which
doesn't refer to the block.

Inline asm may have a reference to a label. The asm parser however
doesn't recognize it as a label and tries to create a new symbol. The
result being that instead of the original symbol (e.g. ".Ltmp0") the
parser replaces it in the inline asm with the new one (e.g. ".Ltmp00")
without updating it in the symbol table. So the machine basic block
retains the "old" symbol (".Ltmp0"), but the inline asm uses the new one
(".Ltmp00").

Reviewers: nickdesaulniers, craig.topper

Subscribers: nathanchance, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65304

llvm-svn: 368477

[sanitizers] MSVC warning disable for clean build
- https://reviews.llvm.org/D66023

llvm-svn: 368476

Don't diagnose errors when a file matches an include component

This regressed in r368322, and was reported as PR42948 and on the
mailing list. The fix is to ignore the specific error code for this
case. The problem doesn't seem to reproduce on Windows, where a
different error code is used instead.

llvm-svn: 368475

Update test to explicity test with -fintegrated-as and -fno-integrated-as and to expect warnings when appropriate.

Reviewed by: thakis

Differential Revision: https://reviews.llvm.org/D65974

llvm-svn: 368474

[Docs][llvm-strip] Fix an indentation issue.

llvm-svn: 368473

Revert "[asan_symbolize] Fix bug where the frame counter was not incremented."

This reverts commit 52a36fae2a3f8560a5be690a67304db5edafc3fe.

This commit broke the sanitizer_android buildbot. See comments at
https://reviews.llvm.org/rL368373 for more details.

llvm-svn: 368472

CodeGen: ensure 8-byte aligned String Swift CF ABI

CFStrings should be 8-byte aligned when built for the Swift CF runtime
ABI as the atomic CF info field must be properly aligned. This is a
problem on 32-bit platforms which would give the structure 4-byte
alignment rather than 8-byte alignment.

llvm-svn: 368471

gn build: Merge r368432.

llvm-svn: 368470

gn build: Merge r368439.

llvm-svn: 368469

gn build: Merge r368402.

llvm-svn: 368468

gn build: Merge r368392.

llvm-svn: 368467

gn build: Merge r368358.

llvm-svn: 368466

[libomptarget] Remove duplicate RTLRequiresFlags per device

We have one global RTLs.RequiresFlags, I don't see a need to make a
copy per device that the runtime manages. This was problematic anyway
because the copy happened during the first __tgt_register_lib(). This
made it impossible to call __tgt_register_requires() from normal user
funtions for testing.
Hence, this change also fixes unified_shared_memory/shared_update.c for
older versions of Clang that don't call __tgt_register_requires() before
__tgt_register_lib().

Differential Revision: https://reviews.llvm.org/D66019

llvm-svn: 368465

[Docs][llvm-strip] Add help text to llvm-strip rst doc

Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=42383

Reviewers: jhenderson, alexshap, rupprecht

Reviewed By: jhenderson

Subscribers: wolfgangp, jakehehrlich, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65384

llvm-svn: 368464

Revert Even more warnings utilizing gsl::Owner/gsl::Pointer annotations

This reverts r368454 (git commit 7c3c8ba8daf40534e09f6fe8701b723e25e4e2dc)

llvm-svn: 368463

Revert Fix a build bot failure and multiple warnings instances for range base for loops

This reverts r368459 (git commit 2bf522aea62e4fb653cacb68072167d25149099e)

llvm-svn: 368462

[libFuzzer] Merge: print stats after reading the output corpus dir.

Summary:
The purpose is to be able to extract the number of new edges added to
the original (i.e. output) corpus directory after doing the merge. Use case
example: in ClusterFuzz, we do merge after every fuzzing session, to avoid
uploading too many corpus files, and we also record coverage stats at that
point. Having a separate line indicating stats after reading the initial output
corpus directory would make the stats extraction easier for both humans and
parsing scripts.

Context: https://github.com/google/clusterfuzz/issues/802.

Reviewers: morehouse, hctim

Reviewed By: hctim

Subscribers: delcypher, #sanitizers, llvm-commits, kcc

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D66020

llvm-svn: 368461

[clang-format] Add link to source code in file definitions

Two command line options have been added to clang-doc.
--repository=<string> - URL of repository that hosts code; used for links to definition locations.
--source-root=<string> - Directory where processed files are stored. Links to definition locations will only be generated if the file is in this dir.

If the file is in the source-root and a repository options is passed;
a link to the source code will be rendered by the HTML generator.

Differential Revision: https://reviews.llvm.org/D65483

llvm-svn: 368460

Fix a build bot failure and multiple warnings instances for range base for loops

llvm-svn: 368459

[TableGen] Add "InitValue": Handle operands with set bit values in decoder methods

Summary:
The problem:
When an operand had bits explicitly set to "1" (as in the InitValue.td test case attached), the decoder was ignoring those bits, and the DecoderMethod was receiving an input where the bits were still zero.

The solution:
We added an "InitValue" variable that stores the initial value of the operand based on what bits were explicitly initialized to 1 in TableGen code. The generated decoder code then uses that initial value to initialize the "tmp" variable, then calls fieldFromInstruction to read the values for the remaining bits that were left unknown in TableGen.

This is mainly useful when there are variations of an instruction that differ based on what bits are set in the operands, since this change makes it possible to access those bits in a DecoderMethod. The DecoderMethod can use those bits to know how to handle the input.

Patch by Nicolas Guillemot

Reviewers: craig.topper, dsanders, fhahn

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D63741

llvm-svn: 368458

[InstCombine] Refactor optimizeExp2() (NFC)

Refactor `LibCallSimplifier::optimizeExp2()` to use the new
`emitBinaryFloatFnCall()` version that fetches the function name from TLI.

llvm-svn: 368457

Rename PCH/leakfiles test so it runs on bots.

llvm-svn: 368455

Even more warnings utilizing gsl::Owner/gsl::Pointer annotations

Differential Revision: https://reviews.llvm.org/D65127

llvm-svn: 368454

[Transforms] Add a emitBinaryFloatFnCall() version that fetches the function name from TLI

Add the counterpart to a similar function for single operands.

Differential revision: https://reviews.llvm.org/D65976

llvm-svn: 368453

[Transforms] Fix comments for hasFloatFn() and getFloatFnName() (NFC)

llvm-svn: 368452

Print reasonable representations of type names in llvm-nm, readelf and readobj

For type values that do not have proper names, print reasonable representation
in llvm-nm, llvm-readobj and llvm-readelf, matching GNU tools.s

Fixes PR41713.

Differential Revision: https://reviews.llvm.org/D65537

llvm-svn: 368451

Title: Improve Loop Cache Analysis LIT tests.
Summary: Make LIT tests unsensitive to analysis output order.
Authored By: etiotto

llvm-svn: 368450

[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC)

Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`.

llvm-svn: 368449

[compiler-rt] FuzzedDataProvider: use C++ headers only instead of a C/C++ mix.

Reviewers: Dor1s

Reviewed By: Dor1s

Subscribers: dberris, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D66017

llvm-svn: 368448

[NFC] Added tests for D65898

llvm-svn: 368447

More warnings regarding gsl::Pointer and gsl::Owner attributes

Differential Revision: https://reviews.llvm.org/D65120

llvm-svn: 368446

[AArch64][x86] add tests for pessimization of expression with X*2.0 (PR32939); NFC

llvm-svn: 368445

[lldb][NFC] Assert on invalid cursors positions when creating CompletionRequest

Before we just triggered undefined behavior on invalid positions.

llvm-svn: 368444

[DAGCombiner] remove redundant fold for X*1.0; NFC

This is handled at node creation time (similar to X/1.0)
after:
rL357029
(no fast-math-flags needed)

llvm-svn: 368443

[lldb][NFC] Remove unused IRForTarget::BuildRelocation

llvm-svn: 368442

[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter

Summary:
This is exposed by adding a new testcase in PowerPC in
https://reviews.llvm.org/rL367732

The testcase got different output on different platform, hence breaking
buildbots.

The problem is that we get differnt FuncUnitOrder when calculateResMII.

The root cause is:
1. Two MachineInstr might get SAME priority(MFUsx) from minFuncUnits.
2. Current comparison operator() will return `MFUs1 > MFUs2`.
3. We use iterators for MachineInstr, so the input to FuncUnitSorter
might be different on differnt platform due to the iterator nature.

So for two MI with same MFU, their order is actually depends on the
iterator order, which is platform (implemtation) dependent.

This is risky, and may cause cross-compiling problems.

The fix is to check make sure we assign a determine order when they are
equal.

Reviewers: bcahoon, hfinkel, jmolloy

Subscribers: nemanjai, hiraditya, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65992

llvm-svn: 368441

[sanitizer] Update symbolizer/scripts/global_symbols.txt

llvm-svn: 368440

Title: Loop Cache Analysis
Summary: Implement a new analysis to estimate the number of cache lines
required by a loop nest.
The analysis is largely based on the following paper:

Compiler Optimizations for Improving Data Locality
By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng
http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf
The analysis considers temporal reuse (accesses to the same memory
location) and spatial reuse (accesses to memory locations within a cache
line). For simplicity the analysis considers memory accesses in the
innermost loop in a loop nest, and thus determines the number of cache
lines used when the loop L in loop nest LN is placed in the innermost
position.

The result of the analysis can be used to drive several transformations.
As an example, loop interchange could use it determine which loops in a
perfect loop nest should be interchanged to maximize cache reuse.
Similarly, loop distribution could be enhanced to take into
consideration cache reuse between arrays when distributing a loop to
eliminate vectorization inhibiting dependencies.

The general approach taken to estimate the number of cache lines used by
the memory references in the inner loop of a loop nest is:

Partition memory references that exhibit temporal or spatial reuse into
reference groups.
For each loop L in the a loop nest LN: a. Compute the cost of the
reference group b. Compute the 'cache cost' of the loop nest by summing
up the reference groups costs
For further details of the algorithm please refer to the paper.
Authored By: etiotto
Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet,
fhahn
Reviewed By: Meinersbur
Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595,
venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO,
Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63459

llvm-svn: 368439

[X86][SSE] Swap X86ISD::BLENDV inputs with an inverted selection mask (PR42825)

As discussed on PR42825, if we are inverting the selection mask we can just swap the inputs and avoid the inversion.

Differential Revision: https://reviews.llvm.org/D65522

llvm-svn: 368438

[GlobalOpt] prevent crashing on large integer types (PR42932)

This is a minimal fix (copy the predicate for the assert) to
prevent the crashing seen in:
https://bugs.llvm.org/show_bug.cgi?id=42932
...when converting a constant integer of arbitrary width to uint64_t.

Differential Revision: https://reviews.llvm.org/D65970

llvm-svn: 368437

[MCA] Fix MSVC 19.16 build with libc++

MSVC (19.16) wants to see the definition of Instruction in
`std::pair<unsigned, const Instruction &> SourceRef` to decide
if it is assignable.

Patch by Orivej Desh.

Differential Revision: https://reviews.llvm.org/D65844

llvm-svn: 368436

[llvm-readelf]Print filename for multiple inputs and fix formatting regression

This patch addresses two closely related bugs:
https://bugs.llvm.org/show_bug.cgi?id=42930 and
https://bugs.llvm.org/show_bug.cgi?id=42931.

GNU readelf prints the file name for every input unless there is only
one input and that input is not an archive. This patch adds the printing
for multiple inputs. A previous change did it for archives, but
introduced a regression with GNU compatibility for single-output
formatting, resulting in a spurious initial blank line. This is fixed in
this patch too.

Reviewed by: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D65953

llvm-svn: 368435

[clangd] Added highlighting for constructor initializers.

Summary: Constructor initializers were not being highlighted. This adds highlighting for them by using TraverseConstructorInitializer. Uses the Traverse* because there is no visit for CXXCtorInitializer.

Reviewers: hokein, ilya-biryukov

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66001

llvm-svn: 368434

[Mips][Codegen] Fix fast-isel mixing of FGR64 and AFGR64 registers

Fast-isel was picking AFGR64 register class for processing call
arguments when +fp64 options was used. We simply check is option +fp64
is used and pick appropriate register.

Patch by Mirko Brkusanin.

Differential Revision: https://reviews.llvm.org/D65886

llvm-svn: 368433

[MCA] Add flag -show-encoding to llvm-mca.

Flag -show-encoding enables the printing of instruction encodings as part of the
the instruction info view.

Example (with flags -mtriple=x86_64--  -mcpu=btver2):

Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[7]: Encoding Size

[1]    [2]    [3]    [4]    [5]    [6]    [7]    Encodings:     Instructions:
1      2     1.00                         4     c5 f0 59 d0    vmulps   %xmm0, %xmm1, %xmm2
1      4     1.00                         4     c5 eb 7c da    vhaddps  %xmm2, %xmm2, %xmm3
1      4     1.00                         4     c5 e3 7c e3    vhaddps  %xmm3, %xmm3, %xmm4

In this example, column Encoding Size is the size in bytes of the instruction
encoding. Column Encodings reports the actual instruction encodings as byte
sequences in hex (objdump style).

The computation of encodings is done by a utility class named mca::CodeEmitter.

In future, I plan to expose the CodeEmitter to the instruction builder, so that
information about instruction encoding sizes can be used by the simulator. That
would be a first step towards simulating the throughput from the decoders in the
hardware frontend.

Differential Revision: https://reviews.llvm.org/D65948

llvm-svn: 368432