review.tizen.org Git - platform/upstream/llvm.git/log

bpf: Hook target feature "alu32" with LLVM

LLVM has supported a new target feature "alu32" which could be enabled or
disabled by "-mattr=[+|-]alu32" when using llc.

This patch link Clang with it, so it could be also done by passing related
options to Clang, for example:

-Xclang -target-feature -Xclang +alu32

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325996

[AMDGPU] Fixed madak.ll test on VI, added GFX10. NFC.

llvm-svn: 325995

[Sema][ObjC] Process category attributes before checking protocol uses

This ensures that any availability attributes are attached to the
category before the availability for the referenced protocols is checked.

rdar://37829755

llvm-svn: 325994

bpf: New disassembler testcases for 32-bit subregister support

This patch test disassembler output for load/store instructions when
-mattr=+alu32 specified for which we want to use "w" register format.

Also, this patch extended the existing insn-unit.s and insn-unit-32.s to
make sure disassemblers for all other instructions are not affected.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325993

bpf: New codegen testcases for 32-bit subregister support

This patch adds some unit tests for 32-bit subregister support.
We want to make sure ALU32, subregister load/store and new peephole
optimization are truely enabled once -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325992

bpf: New optimization pass for eliminating unnecessary i32 promotions

This pass performs peephole optimizations to cleanup ugly code sequences at
MachineInstruction layer.

Currently, the only optimization in this pass is to eliminate type
promotion
sequences for zero extending 32-bit subregisters to 64-bit registers.

If the compiler could prove the zero extended source come from 32-bit
subregistere then it is safe to erase those promotion sequece, because the
upper half of the underlying 64-bit registers were zeroed implicitly
already.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325991

bpf: New decoder namespace for 32-bit subregister load/store

When -mattr=+alu32 passed to the disassembler, use decoder namespace for
32-bit subregister.

This is to disassemble load and store instructions in preferred B format
as described in previous commit:

      w = *(u8 *) (r + off) // BPF_LDX | BPF_B
      w = *(u16 *)(r + off) // BPF_LDX | BPF_H
      w = *(u32 *)(r + off) // BPF_LDX | BPF_W

      *(u8 *) (r + off) = w // BPF_STX | BPF_B
      *(u16 *)(r + off) = w // BPF_STX | BPF_H
      *(u32 *)(r + off) = w // BPF_STX | BPF_W

NOTE: all other instructions should still use the default decoder
      namespace.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325990

bpf: Enable 32-bit subregister support for -mattr=+alu32

After all those preparation patches, now we could enable 32-bit subregister
support once -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325989

bpf: Support 32-bit subregister in various InstrInfo hooks

This patch support 32-bit subregister in three InstrInfo hooks, i.e.
copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot,

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325988

bpf: New instruction patterns for 32-bit subregister load and store

The instruction mapping between eBPF/arm64/x86_64 are:

         eBPF              arm64        x86_64
LD1   BPF_LDX | BPF_B       ldrb        movzbl
LD2   BPF_LDX | BPF_H       ldrh        movzwl
LD4   BPF_LDX | BPF_W       ldr         movl

movzbl/movzwl/movl on x86_64 accept 32-bit sub-register, for example %eax,
the same for ldrb/ldrh on arm64 which accept 32-bit "w" register. And
actually these instructions only accept sub-registers. There is no point
to have LD1/2/4 (unsigned) for 64-bit register, because on these arches,
upper 32-bits are guaranteed to be zeroed by hardware or VM, so load into
the smallest available register class is the best choice for maintaining
type information.

For eBPF we should adopt the same philosophy, to change current
format (A):

  r = *(u8 *) (r + off) // BPF_LDX | BPF_B
  r = *(u16 *)(r + off) // BPF_LDX | BPF_H
  r = *(u32 *)(r + off) // BPF_LDX | BPF_W

  *(u8 *) (r + off) = r // BPF_STX | BPF_B
  *(u16 *)(r + off) = r // BPF_STX | BPF_H
  *(u32 *)(r + off) = r // BPF_STX | BPF_W

into B:

  w = *(u8 *) (r + off) // BPF_LDX | BPF_B
  w = *(u16 *)(r + off) // BPF_LDX | BPF_H
  w = *(u32 *)(r + off) // BPF_LDX | BPF_W

  *(u8 *) (r + off) = w // BPF_STX | BPF_B
  *(u16 *)(r + off) = w // BPF_STX | BPF_H
  *(u32 *)(r + off) = w // BPF_STX | BPF_W

There is no change on encoding nor how should they be interpreted,
everything is as it is, load the specified length, write into low bits of
the register then zeroing all remaining high bits.

The only change is their associated register class and how compiler view
them.

Format A still need to be kept, because eBPF LLVM backend doesn't support
sub-registers at default, but once 32-bit subregister is enabled, it should
use format B.

This patch implemented this together with all those necessary extended load
and truncated store patterns.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325987

bpf: Support i32 in getScalarShiftAmountTy method

getScalarShiftAmount method should be implemented for eBPF backend to make
sure shift amount could still get correct type once 32-bit subregisters
support are enabled.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325986

bpf: Support condition comparison on i32

We need to support condition comparison on i32. All these comparisons are
supposed to be combined into BPF_J* instructions which only support i64.

For ISD::BR_CC we need to promote it to i64 first, then do custom lowering.

For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64.

For ISD::SELECT_CC, we also want to do custom lower for i32. However, after
32-bit subregister support enabled, it is possible the comparison operands
are i32 while the selected value are i64, or the comparison operands are
i64 while the selected value are i32. We need to define extra instruction
pattern and support them in custom instruction inserter.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325985

bpf: Handle i32 for ALU operations without ISA support

There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU,
ADDC, ADDE etc on i32.

They could be emulated by other basic BPF_ALU operations, we'd set their
lowering action the same as i64.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325984

bpf: New calling convention for 32-bit subregisters

This patch add new calling conventions to allow GPR32RegClass as valid
register class for arguments and return types.

New calling convention will only be choosen when -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325983

bpf: New target attribute "alu32" for 32-bit subregister support

This new attribute aims to control the enablement of 32-bit subregister
support on eBPF backend.

Name the interface as "alu32" is because we in particular want to enable
the generation of BPF_ALU32 instructions by enable subregister support.

This attribute could be used in the following format with llc:

llc -mtriple=bpf -mattr=[+|-]alu32

It is disabled at default.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325982

bpf: Define instruction patterns for extensions and truncations between i32 to i64

For transformations between i32 and i64, if it is explicit signed extension:
  - first cast the operand to i64
  - then use SLL + SRA to finish the extension.

if it is explicit zero extension:
  - first cast the operand to i64
  - then use SLL + SRL to finish the extension.

if it is explicit any extension:
  - just refer to 64-bit register.

if it is explicit truncation:
  - just refer to 32-bit subregister.

NOTE: Some of the zero extension sequences might be unnecessary, they will be
removed by an peephole pass on MachineInstruction layer.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325981

bpf: Tighten the immediate predication for 32-bit alu instructions

These 32-bit ALU insn patterns which takes immediate as one operand were
initially added to enable AsmParser support, and the AsmMatcher uses "ins"
and "outs" fields to deduct the operand constraint.

However, the instruction selector doesn't work the same as AsmMatcher. The
selector will use the "pattern" field for which we are not setting the
predication for immediate operands correctly.

Without this patch, i32 would eventually means all i32 operands are valid,
both imm and gpr, while these patterns should allow imm only.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325980

bpf: Use markSuperRegs to mark reserved registers

markSuperRegs is the canonical helper function used to mark reserved
registers. It could mark any overlapping sub-registers automatically.

Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 325979

[CFG] Try to narrow down MSVC compiler crash via binary search.

Split the presumably offending function in two to see which part of it causes
the crash to occur.

The crash was introduced in r325966.
r325969 did not help.

llvm-svn: 325978

[analyzer] Relax the assert used when traversing the node graph.

The assertion gets exposed when changing the exploration order.
This is a quick hacky fix, but the intention is that if the nodes do
merge, it should not matter which predecessor should be traverse.
A proper fix would be not to traverse predecessors at all, as all
information relevant for any decision should be avilable locally.

rdar://37540480

Differential Revision: https://reviews.llvm.org/D42773

llvm-svn: 325977

[analyzer] mark returns of functions where the region passed as parameter was not initialized

In the wild, many cases of null pointer dereference, or uninitialized
value read occur because the value was meant to be initialized by the
inlined function, but did not, most often due to error condition in the
inlined function.
This change highlights the return branch taken by the inlined function,
in order to help user understand the error report and see why the value
was uninitialized.

rdar://36287652

Differential Revision: https://reviews.llvm.org/D41848

llvm-svn: 325976

[analyzer] Consider switch- and goto- labels when constructing the set of executed lines

When viewing the report in the collapsed mode the label signifying where
did the execution go is often necessary for properly understanding the
context.

Differential Revision: https://reviews.llvm.org/D43145

llvm-svn: 325975

Fix a compiler warning in ModuleCacheTest.cpp, NFC

llvm-svn: 325974

[DebugInfo] Add remaining files to r325970

Add files which I missed in the original check-in

llvm-svn: 325973

[PowerPC] Disable shrink-wrapping when getting PC address through the LR

The instruction sequence used to get the address of the PC into a GPR requires
that we clobber the link register. Doing so without having first saved it in
the prologue leaves the function unable to return. Currently, this sequence is
emitted into the entry block. To ensure the prologue is inserted before this
sequence, disable shrink-wrapping.

This fixes PR33547.

Differential Revision: https://reviews.llvm.org/D43677

llvm-svn: 325972

[MemorySSA] Fix a cache invalidation bug with removed accesses

I suspect there's a deeper issue here, but we probably shouldn't be
using INVALID_MEMORYSSA_ID as liveOnEntry's ID anyway.

llvm-svn: 325971

[DebugInfo] Support DWARF v5 source code embedding extension

In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.

Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.

Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.

Differential Revision: https://reviews.llvm.org/D42765

llvm-svn: 325970

[CFG] NFC: Speculative attempt to fix MSVC internal compiler error on buildbot.

Don't use fancy initialization and member access in a DenseMap.

llvm-svn: 325969

[InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC

llvm-svn: 325968

Sink the verification code around the assert where it's handled and wrap in NDEBUG.

This has the advantage of making release only builds more warning
free and there's no need to make this routine a class function if
it isn't using class members anyhow.

llvm-svn: 325967

[CFG] [analyzer] NFC: Allow more complicated construction contexts.

ConstructionContexts introduced in D42672 are an additional piece of information
included with CFGConstructor elements that help the client of the CFG (such as
the Static Analyzer) understand where the newly constructed object is stored.

The patch refactors the ConstructionContext class to prepare for including
multi-layered contexts that are being constructed gradually, layer-by-layer,
as the AST is traversed.

Differential Revision: https://reviews.llvm.org/D43428

llvm-svn: 325966

[InstSimplify] sqrt(X) * sqrt(X) --> X

This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.

llvm-svn: 325965

[Utility] Simplify and generalize the CleanUp helper, NFC

Removing the template arguments and most of the mutating methods from
CleanUp makes it easier to understand and reuse.

In its present state, CleanUp would be too cumbersome to adapt to cases
where multiple objects need to be released. Take for example this change
in swift-lldb:

https://github.com/apple/swift-lldb/pull/334/files#diff-6f474df750f75c8ba675f2a8408a5629R219

This change is simple to express with the new CleanUp, but not so simple
with the old version.

Differential Revision: https://reviews.llvm.org/D43662

llvm-svn: 325964

[ELF] Fix IsPreemptible comment and typo. NFC

llvm-svn: 325963

Intrinsics calls should avoid the PLT when "RtLibUseGOT" metadata is present.

Differential Revision: https://reviews.llvm.org/D42216

llvm-svn: 325962

Set Module Metadata "RtLibUseGOT" when fno-plt is used.

Differential Revision: https://reviews.llvm.org/D42217

llvm-svn: 325961

[InstCombine] allow fmul-sqrt folds with less than full -ffast-math

Also, add a Builder method for intrinsics to reduce code duplication for clients.

llvm-svn: 325960

Simplify a DEBUG statement to remove a set but not used variable in release builds.

llvm-svn: 325959

Fix breakpoint thread name conditionals after breakpoint options refactor.

PR36435

llvm-svn: 325958

[X86] Add assembler/disassembler support for blendm with zero masking and broacast.

Fixes PR31617

llvm-svn: 325957

[Power9] Add missing instructions to the Power 9 scheduler

This is the first in a series of patches that will define more
instructions using InstRW so that we can move away from ItinRW
and ultimately have a complete Power 9 scheduler.

Differential Revision: https://reviews.llvm.org/D43635

llvm-svn: 325956

Remove dead code.

llvm-svn: 325955

[Hexagon] Recognize non-immediate constants in HexagonConstPropagation

llvm-svn: 325954

Inline printHelp.

Differential Revision: https://reviews.llvm.org/D43526

llvm-svn: 325953

Handle --version before handling --mllvm.

Because it's a waste of time to handle --mllvm before --version.

Differential Revision: https://reviews.llvm.org/D43527

llvm-svn: 325952

Fixed unused variable warning. NFCI.

llvm-svn: 325950

[X86] Add DAG combine to remove (and X, 1) from in front of a v1i1 scalar to vector.

These can be created by type legalization promoting the inputs to select to match scalar boolean contents.

We were trying to pattern match them away during isel, but its better to just remove them from the DAG.

I've cleaned up some patterns to not check for this 'and' anymore. But I suspect this has also opened up opportunities for pattern removal.

llvm-svn: 325949

Inline a trivial ctor.

Differential Revision: https://reviews.llvm.org/D43525

llvm-svn: 325948

[WebAssembly] Fix macro metaprogram to not duplicate code as much.

No functionality change intended.

llvm-svn: 325947

Because of CVE-2018-6574, some compiler options and linker options are restricted to prevent arbitrary code execution.

https://github.com/golang/go/issues/23672

By this change, building a Go code with LLVM Go bindings causes a compilation error as follows.

go build llvm.org/llvm/bindings/go/llvm: invalid flag in #cgo LDFLAGS: -Wl,-headerpad_max_install_names

llvm-go tool generates cgo LDFLAGS directive from `llvm-config --ldflags` and it contains -Wl,option options. But -Wl,option is banned by default. To avoid this problem, we need to set $CGO_LDFLAGS_ALLOW environment variable to notify a compiler that the flags should be allowed.

$ export CGO_LDFLAGS_ALLOW='-Wl,(-search_paths_first|-headerpad_max_install_names)'

By default for go 1.10 and go 1.9.5 these options should appear in the accepted set of options, however, if you're running into the error it's useful to have this documented.

Patch by Ryuichi Hayashida

llvm-svn: 325946

[Driver] Make -fno-common default for Fuchsia

We never want to generate common symbols on Fuchsia.

Differential Revision: https://reviews.llvm.org/D43545

llvm-svn: 325945

[X86][SSE] Generalize x > C-1 ? x+-C : 0 --> subus x, C combine for non-uniform constants

llvm-svn: 325944

Really fix test on windows.

Sorry for the noise.

llvm-svn: 325943

Fix one last test on a windows host.

llvm-svn: 325942

Shrink various scheduling tables by using narrower types.

16 bits ought to be enough for everyone. This shrinks clang by ~1MB.

llvm-svn: 325941

Bring r325915 back.

The tests that failed on a windows host have been fixed.

Original message:

Start setting dso_local for COFF.

With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.

llvm-svn: 325940

[PATCH] [AArch64] Add new target feature to fuse conditional select

This feature enables the fusion of the comparison and the conditional select
instructions together.

Differential revision: https://reviews.llvm.org/D42392

llvm-svn: 325939

Fix compiler warning introduced in r325931. NFC.

llvm-svn: 325938

[Test] Fix the test to output to /dev/null instead of redirecting.

The redirection was confusing the windows build machine.

llvm-svn: 325937

[X86][SSE] Add x > C-1 ? x+-C : 0 --> subus x, C test caaes for non-uniform constants

llvm-svn: 325936

[MemorySSA] Use fewer magic numbers. NFC

INVALID_MEMORYACCESS_ID == 0.

This patch also makes this initialization consistent with the rest of
the "invalid" ones in this file.

llvm-svn: 325935

[MemorySSA] Reduce padding in MemoryDefs. NFC

llvm-svn: 325934

[X86] Custom split v32i16/v64i8 bitcasts when AVX512F is available, but BWI is not.

The test changes you can see are related to the changes in ReplaceNodeResults. Though shuffle-vs-trunc-512.ll does have a test that exercises the code in LowerBITCAST. Looks like the test output didn't change because DAG combining is able to clean up the resulting type legalization. Adding the custom hook just makes type legalization work less hard.

Differential Revision: https://reviews.llvm.org/D43447

llvm-svn: 325933

[MachineOperand][Target] MachineOperand::isRenamable semantics changes

Summary:
Add a target option AllowRegisterRenaming that is used to opt in to
post-register-allocation renaming of registers. This is set to 0 by
default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
fields of all opcodes to be set to 1, causing
MachineOperand::isRenamable to always return false.

Set the AllowRegisterRenaming flag to 1 for all in-tree targets that
have lit tests that were effected by enabling COPY forwarding in
MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
RISCV, Sparc, SystemZ and X86).

Add some more comments describing the semantics of the
MachineOperand::isRenamable function and how it is set and maintained.

Change isRenamable to check the operand's opcode
hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
relying on it being consistently reflected in the IsRenamable bit
setting.

Clear the IsRenamable bit when changing an operand's register value.

Remove target code that was clearing the IsRenamable bit when changing
registers/opcodes now that this is done conservatively by default.

Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in
one place covering all opcodes that have constant pipe read limit
restrictions.

Reviewers: qcolombet, MatzeB

Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D43042

llvm-svn: 325931

Convert test to FileCheck. NFC.

llvm-svn: 325930

Revert "Start setting dso_local for COFF."

This reverts commit r325915.

It will take some time to fix the failures on a windows host.

llvm-svn: 325929

[Darwin] Add a test to make sure clang emits __apple accelerator tables.

llvm-svn: 325928

Replace HashStringUsingDJB with llvm::djbHash

Summary:
The llvm function is equivalent to this one. Where possible I tried to
replace const char* with llvm::StringRef to avoid extra strlen
computations. In most places, I was able to track the c string back to
the ConstString it was created from.

I also create a test that verifies we are able to lookup names with
unicode characters, as a bug in the llvm compiler (it accidentally used
a different hash function) meant this was not working until recently.

This also removes the unused ExportTable class.

Reviewers: aprantl, davide

Subscribers: JDevlieghere, lldb-commits

Differential Revision: https://reviews.llvm.org/D43596

llvm-svn: 325927

[Debug] Add dbg.value intrinsics for PHIs created during LCSSA.

Summary:
This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA.
I noticed a case where a value carried across a loop was reported as <optimized out>.

Specifically this case:
```
int bar(int x, int y) {
  return x + y;
}

int foo(int size) {
  int val = 0;
  for (int i = 0; i < size; ++i) {
    val = bar(val, i);  // Both val and i are correct
  }
  return val; // <optimized out>
}
```

In the above case, after all of the interesting computation completes our value
is reported as "optimized out." This change will add a dbg.value to correct this.

This patch also moves the dbg.value insertion routine from LoopRotation.cpp
into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

Reviewers: mzolotukhin, aprantl, vsk, davide

Reviewed By: aprantl, vsk

Subscribers: dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D42551

llvm-svn: 325926

[BPI] Detect branches in loops that make themselves not taken

If we have a loop like this:
int n = 0;
while (...) {
  if (++n >= MAX) {
    n = 0;
  }
}
then the body of the 'if' statement will only be executed once every MAX
iterations. Detect this by looking for branches in loops where taking the branch
makes the branch condition evaluate to 'not taken' in the next iteration of the
loop, and reduce the probability of such branches.

This slightly improves EEMBC benchmarks on cortex-m4/cortex-m33 due to making
better choices in if-conversion, but has no effect on any other cpu/benchmark
that I could detect.

Differential Revision: https://reviews.llvm.org/D35804

llvm-svn: 325925

[InstCombine] refactor fmul with negated op folds; NFCI

The existing code was inefficiently looking for 'nsz' variants.
That's unnecessary because we canonicalize those to the expected
form with -0.0.

We may also want to adjust or remove the fold that sinks negation.
We don't do that for fdiv (or integer ops?). That should be uniform?
It may also lead to missed optimization as in PR21914:
https://bugs.llvm.org/show_bug.cgi?id=21914
...or we just have to fix other passes to avoid that problem.

llvm-svn: 325924

[InstCombine] use FMF-copying functions to reduce code; NFCI

llvm-svn: 325923

[OMPT] Fix parallel_data in implicit barrier-end

This is required to be NULL for implicit barriers at the end of a
parallel region. Noticed in review of D43191.

Differential Revision: https://reviews.llvm.org/D43308

llvm-svn: 325922

[OMPT] Fix test tasks/serialized.c with optimization

The compiler inlines the user code in the task. Check for that case at
runtime by comparing the frame addresses and print the expected exit
address.

Also showcase how I think the OMPT tests could be reformatted to match
LLVM's code style. In my opinion it would be great to that kind of change
to all tests that need to be touched for whatever reason...

Differential Revision: https://reviews.llvm.org/D43191

llvm-svn: 325921

Revert "[Darwin] Add a test to check clang produces accelerator tables."

This reverts commit 7e24e5f8bff77b7e78da3bfcc68abf42457a66c9.
aka r325850. Clang should not have end-to-end tests.

llvm-svn: 325920

[X86] Regenerate i128 multiply tests

llvm-svn: 325919

[PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9.

The following set of instructions was originally planned to be added for Power 9
and so code was added to support them. However, a decision was made later on to
withdraw support for these instructions in the hardware.
xscmpnedp
xvcmpnesp
xvcmpnedp
This patch removes support for the instructions that were not added.

Differential Revision: https://reviews.llvm.org/D43641

llvm-svn: 325918

[mips] finish removal of unused fields in MipsInstructionSelector

r325916 missed to remove calls in constructor.

llvm-svn: 325917

[mips] remove unused fields in MipsInstructionSelector

Unused fields cause buildbreak if -Werror,-Wunused-private-field is passed.

llvm-svn: 325916

Start setting dso_local for COFF.

With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.

llvm-svn: 325915

Allow passing additional compiler/linker flags for the tests

Summary:
These flags can be specified using the CMake variables
LIBCXX_TEST_LINKER_FLAGS and LIBCXX_TEST_COMPILER_FLAGS.
When building the tests for CHERI I need to pass additional
flags (such as -mabi=n64 or -mabi=purecap) to the compiler
for our test configurations

Reviewers: EricWF

Reviewed By: EricWF

Subscribers: christof, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D42139

llvm-svn: 325914

Support for the mno-stack-arg-probe flag

Adds support for this flag. There is also another piece for llvm
(separate review). More info:
https://bugs.llvm.org/show_bug.cgi?id=36221

By Ruslan Nikolaev!

Differential Revision: https://reviews.llvm.org/D43108

llvm-svn: 325901

Support for the mno-stack-arg-probe flag

Adds support for this flag. There is also another piece for clang
(separate review). More info:
https://bugs.llvm.org/show_bug.cgi?id=36221

By Ruslan Nikolaev!

Differential Revision: https://reviews.llvm.org/D43107

llvm-svn: 325900

[mips] Revert r325872

There are still outstanding issues with byVal arguments
that prevent this from being committed. Revert for now.

llvm-svn: 325899

[SystemZ] Also update the CHECK line for VPDI

llvm-svn: 325898

[SystemZ] Fix VPDI argument in test.

To select element 1 from each half with VPDI, a constant of 5 should be used.

llvm-svn: 325897

[X86][F16C] Regenerate half conversion tests

llvm-svn: 325896

llvm-config: Add advapi32 to --system-libs on Windows (PR36372)

llvm-svn: 325894

[WebAssembly] NDEBUG is spelled without a leading underscore.

llvm-svn: 325893

[DAGCOmbine] Ensure that (brcond (setcc ...)) is handled in a canonical manner.

Summary:
There are transformation that change setcc into other constructs, and transform that try to reconstruct a setcc from the brcond condition. Depending on what order these transform are done, the end result differs.

Most of the time, it is preferable to get a setcc as a brcond argument (and this is why brcond try to recreate the setcc in the first place) so we ensure this is done every time by also doing it at the setcc level when the only user is a brcond.

Reviewers: spatel, hfinkel, niravd, craig.topper

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D41235

llvm-svn: 325892

Revert "TableGen: Fix typeIsConvertibleTo for record types"

This reverts r325884.

Clang's TableGen has dependencies on the exact ordering of superclasses.
Revert this change fully for now to fix the build.

Change-Id: Ib297f5571cc7809f00838702ad7ab53d47335b26
llvm-svn: 325891

[ELF][MIPS] Set EI_ABIVERSION flag accordingly to MIPS ABIs requirement

MIPS ABIs require that if an executable file uses non-PIC model, the
EI_ABIVERSION entry in the ELF header should be incremented from 0 to 1.
That allows obsoleted / limited dynamic linkers refuse to link them.

llvm-svn: 325890

[MIPS GlobalISel] Adding GlobalISel

Add GlobalISel infrastructure up to the point where we can select a ret
void.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D43583

llvm-svn: 325888

[ELF] - Do not remove empty output sections that are explicitly assigned to phdr in script.

This continues direction started in D43069.

We can keep sections that are explicitly assigned to segment in script.
It helps to simplify code.

Differential revision: https://reviews.llvm.org/D43571

llvm-svn: 325887

TableGen: Avoid using resolveListElementReference in TGParser

A subsequent change intends to remove resolveListElementReference
entirely. This part of the removal can be split out for better
bisectability.

Change-Id: Ibd762d88fd2d1e2cc116a259e2a27a5e9f9a8b10

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43561

Change-Id: Ifb695041cef1964ad8a3102f448249501a9243f0
llvm-svn: 325886

TableGen: BitInit and VarBitInit are typed

Summary: Change-Id: I54e337a0b525e9649534bc5f90e5e07c0772e334

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43560

Change-Id: I07f78e793192974c2b90690ce644589fe4891e41
llvm-svn: 325885

TableGen: Fix typeIsConvertibleTo for record types

Summary:
Only check whether the left-hand side type is a subclass (or equal to)
the right-hand side type.

This requires a further fix in handling !if expressions and in type
resolution.

Furthermore, reverse the order of superclasses so that resolveTypes will
find a least common ancestor at least in simple cases.

Add a test that used to be accepted without flagging the obvious type
error.

Change-Id: Ib366db1a4e6a079f1a0851e469b402cddae76714

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43559

llvm-svn: 325884

TableGen: Add !size operation

Summary:
Returns the size of a list. I have found this to be rather useful in some
development for the AMDGPU backend where we could simplify our .td files
by concatenating list<LLVMType> for complex intrinsics. Doing so requires
us to compute the position argument for LLVMMatchType.

Basically, the usage is in a pattern that looks somewhat like this:

    list<LLVMType> argtypes =
        !listconcat(base,
                    [llvm_any_ty, LLVMMatchType<!size(base)>]);

Change-Id: I360a0b000fd488d18bea412228230fd93722bd2c

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits, tpr

Differential Revision: https://reviews.llvm.org/D43553

llvm-svn: 325883

AMDGPU: Track physreg uses in SILoadStoreOptimizer

Summary:
This handles def-after-use of physregs, and allows us to merge loads and
stores even across some physreg defs (typically M0 defs).

Change-Id: I076484b2bda27c2cf46013c845a0380c5b89b67b

Reviewers: arsenm, mareko, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D42647

llvm-svn: 325882

StructurizeCFG: Test for branch divergence correctly

Summary:
This fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform.

This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.

Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4

Reviewers: arsenm, rampitec, jlebar

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D40546

llvm-svn: 325881