James Park [Tue, 1 Dec 2020 22:28:46 +0000 (14:28 -0800)]
Avoid redundant inline with LLVM_ATTRIBUTE_ALWAYS_INLINE
Fix MSVC warning when __forceinline is paired with inline.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D85264
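A minimal sketch of the idea behind this change, assuming a hypothetical macro `MY_ALWAYS_INLINE` (a stand-in, not LLVM's actual definition): when the macro itself supplies the inline semantics for every compiler, call sites no longer need to write `inline` next to it, which is what MSVC warns about when `__forceinline` is combined with `inline`.

```cpp
// Hypothetical stand-in for LLVM_ATTRIBUTE_ALWAYS_INLINE; the real macro
// lives in llvm/Support/Compiler.h and may be defined differently.
#if defined(_MSC_VER)
#define MY_ALWAYS_INLINE __forceinline
#elif defined(__GNUC__)
#define MY_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define MY_ALWAYS_INLINE inline
#endif

// The macro already carries the inline semantics, so the declaration does
// not repeat the `inline` keyword next to it.
MY_ALWAYS_INLINE int addOne(int X) { return X + 1; }
```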
Jez Ng [Tue, 1 Dec 2020 05:07:16 +0000 (21:07 -0800)]
[lld-macho] Extend PIE option handling
* Enable PIE by default if targeting 10.6 or above on x86-64. (The
manpage says 10.7, but that actually applies only to i386, and in
general varies based on the target platform. I didn't update the
manpage because listing all the different behaviors would make for a
pretty long description.)
* Add support for `-no_pie`
* Remove `HelpHidden` from `-pie`
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D92362
David Blaikie [Tue, 1 Dec 2020 21:23:30 +0000 (13:23 -0800)]
Revert "[FastISel] Flush local value map on every instruction" and dependent patches
This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1.
This change caused several regressions in the gdb test suite - at least
a sample of which was due to line zero instructions making breakpoints
un-lined. I think they're worth investigating/understanding more (&
possibly addressing) before moving forward with this change.
Revert "[FastISel] NFC: Clean up unnecessary bookkeeping"
This reverts commit 3fd39d3694d32efa44242c099e923a7f4d982095.
Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option"
This reverts commit a474657e30edccd9e175d92bddeefcfa544751b2.
Revert "Remove static function unused after cf1c774."
This reverts commit dc35368ccf17a7dca0874ace7490cc3836fb063f.
Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction""
This reverts commit 53a14a47ee89dadb8798ca8ed19848f33f4551d5.
Michał Górny [Tue, 1 Dec 2020 22:00:54 +0000 (23:00 +0100)]
[lldb] [test] Reenable two passing tests on FreeBSD
Reenable TestReproducerAttach and TestThreadSpecificBpPlusCondition
on FreeBSD -- both seem to pass correctly now.
Muhammad Omair Javaid [Tue, 1 Dec 2020 22:09:14 +0000 (03:09 +0500)]
Make offset field optional in RegisterInfo packet for Arm64
This patch carries forward our aim to remove the offset field from qRegisterInfo
packets and XML register descriptions. I have created a new function which
reports whether offset fields are dynamic, meaning the client can calculate
offsets on its own based on the register number sequence and register size. For
now this function only returns true for NativeRegisterContextLinux_arm64, but we
can test this for other architectures and make it standard later.
As a consequence, we do not send the offset field from lldb-server (arm64 for
now), while other stubs don't have an offset field, so it won't affect them for now.
On the client side we have replaced the previous offset calculation algorithm
with a new scheme, where we sort all primary registers in increasing
order of remote regnum and then calculate offsets incrementally.
This commit also includes a test to verify all of the above functionality
on Arm64.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D91241
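The client-side scheme described above (sort by remote regnum, then assign offsets incrementally) can be sketched roughly as follows. This is only an illustration: the `Reg` struct and `assignOffsets` are hypothetical names, not LLDB's actual types or functions.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical register record; LLDB's real RegisterInfo has more fields.
struct Reg {
  uint32_t RemoteRegnum;
  uint32_t ByteSize;
  uint32_t Offset;
};

// Sort by remote register number, then pack offsets with no gaps.
inline void assignOffsets(std::vector<Reg> &Regs) {
  std::sort(Regs.begin(), Regs.end(), [](const Reg &A, const Reg &B) {
    return A.RemoteRegnum < B.RemoteRegnum;
  });
  uint32_t Next = 0;
  for (Reg &R : Regs) {
    R.Offset = Next;
    Next += R.ByteSize;
  }
}
```

Because each offset is derived purely from register order and size, the stub does not need to transmit it.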
Muhammad Omair Javaid [Tue, 1 Dec 2020 22:09:14 +0000 (03:09 +0500)]
RegisterInfoPOSIX_arm64 remove unused bytes from g/G packet
This came up while putting together our new strategy to create g/G packets
in compliance with GDB RSP protocol where register offsets are calculated in
increasing order of register numbers without any unused spacing.
RegisterInfoPOSIX_arm64::GPR size was being calculated after alignment
correction to 8 bytes, which meant there were 4 bytes of unused space between
the last gpr (cpsr) and the first vector register V. We have put the
LLVM_PACKED_START decorator on RegisterInfoPOSIX_arm64::GPR to make sure
single byte alignment is enforced. Moreover, we are now going to use the arm64
user_pt_regs struct defined in ptrace.h for accessing ptrace user registers.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D92063
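To illustrate the padding issue, here is a sketch with two hypothetical structs (not the actual RegisterInfoPOSIX_arm64::GPR definition), assuming 31 x-registers, sp and pc at 8 bytes each plus a 4-byte cpsr. `#pragma pack` is used here in place of LLVM_PACKED_START purely to keep the example self-contained.

```cpp
#include <cstdint>

// Naturally aligned layout: on typical 64-bit ABIs the struct is rounded up
// to a multiple of 8 bytes, leaving 4 bytes of tail padding after Cpsr.
struct GprNatural {
  uint64_t X[31];
  uint64_t Sp;
  uint64_t Pc;
  uint32_t Cpsr;
};

// Packed layout: single byte alignment, so no tail padding.
#pragma pack(push, 1)
struct GprPacked {
  uint64_t X[31];
  uint64_t Sp;
  uint64_t Pc;
  uint32_t Cpsr;
};
#pragma pack(pop)

static_assert(sizeof(GprPacked) == 33 * 8 + 4, "packed GPR has no padding");
```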
cchen [Tue, 1 Dec 2020 22:07:00 +0000 (16:07 -0600)]
[OpenMP51][DOCS] Claim "add present modifier in defaultmap clause", NFC.
Arthur Eubanks [Wed, 25 Nov 2020 04:40:47 +0000 (20:40 -0800)]
Reland [CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.
Rename it internally to LLVM_ENABLE_NEW_PASS_MANAGER.
The #define for it is now in llvm-config.h.
The initial land accidentally set the value of
LLVM_ENABLE_NEW_PASS_MANAGER to the string
ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER instead of its value.
Reviewed By: rnk, hans
Differential Revision: https://reviews.llvm.org/D92072
Louis Dionne [Tue, 1 Dec 2020 21:49:48 +0000 (16:49 -0500)]
[libc++] NFC: Remove unused macros in <__config>
Arthur Eubanks [Tue, 1 Dec 2020 19:51:04 +0000 (11:51 -0800)]
[LLD][ELF][NewPM] Add option to force legacy PM
In preparation for the NPM switch.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92417
Rahul Joshi [Tue, 1 Dec 2020 19:19:59 +0000 (11:19 -0800)]
[MLIR] Fix genTypeInterfaceMethods() to work correctly with InferTypeOpInterface
- Change InferTypeOpInterface::inferResultTypes to use fully qualified types matching
the ones generated by genTypeInterfaceMethods, so the redundancy can be detected.
- Move genTypeInterfaceMethods() before genOpInterfaceMethods() so that the
inferResultTypes method generated by genTypeInterfaceMethods() takes precedence
over the declaration that might be generated by genOpInterfaceMethods()
- Modified an op in the test dialect to exercise this (the modified op would fail to
generate valid C++ code due to duplicate inferResultTypes methods).
Differential Revision: https://reviews.llvm.org/D92414
Arthur Eubanks [Tue, 1 Dec 2020 21:12:12 +0000 (13:12 -0800)]
Revert "[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/"
The new pass manager was accidentally enabled by default with this change.
This reverts commit a36bd4c90dcca82be9b64f65dbd22e921b6485ef.
Zahira Ammarguellat [Tue, 1 Dec 2020 20:34:18 +0000 (12:34 -0800)]
Fix erroneous edit in https://github.com/llvm/llvm-project/actions/runs/394499364
Arthur Eubanks [Tue, 1 Dec 2020 20:22:27 +0000 (12:22 -0800)]
[LTO][wasm][NewPM] Allow using new pass manager for wasm LTO
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D92150
Terry Wilmarth [Tue, 1 Dec 2020 20:03:40 +0000 (14:03 -0600)]
[OpenMP] Add support for Intel's umonitor/umwait
These changes add support for Intel's umonitor/umwait usage in wait
code, for architectures that support those intrinsic functions. Usage of
umonitor/umwait is off by default, but can be turned on by setting the
KMP_USER_LEVEL_MWAIT environment variable.
Differential Revision: https://reviews.llvm.org/D91189
ergawy [Tue, 1 Dec 2020 20:06:33 +0000 (20:06 +0000)]
[MLIR][LLVM] Fix a tiny typo in the dialect docs.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D92333
Sylvain Audi [Mon, 30 Nov 2020 16:56:37 +0000 (11:56 -0500)]
[clang-scan-deps] Improve argument parsing to find target object file path.
Support the joined version of -o (-ofilepath), and ensure we use the last provided -o option.
Differential Revision: https://reviews.llvm.org/D92330
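A rough sketch of the parsing rule described above, using a hypothetical helper `findOutputPath` (not the clang-scan-deps implementation). It assumes any argument starting with `-o` and longer than two characters is the joined form, which is a simplification.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the path given by the last -o option, or "" if there is none.
// Handles both "-o path" and the joined "-opath" spelling.
inline std::string findOutputPath(const std::vector<std::string> &Args) {
  std::string Last;
  for (std::size_t I = 0; I < Args.size(); ++I) {
    const std::string &A = Args[I];
    if (A == "-o") {
      if (I + 1 < Args.size())
        Last = Args[++I]; // separate form: the next argument is the path
    } else if (A.size() > 2 && A.compare(0, 2, "-o") == 0) {
      Last = A.substr(2); // joined form: the path follows "-o" directly
    }
  }
  return Last;
}
```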
Arthur Eubanks [Wed, 25 Nov 2020 04:40:47 +0000 (20:40 -0800)]
[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.
The #define for it is now in llvm-config.h.
Reviewed By: rnk, hans
Differential Revision: https://reviews.llvm.org/D92072
Nico Weber [Tue, 1 Dec 2020 19:35:21 +0000 (14:35 -0500)]
[gn build] sync script: try to make sync script even clearer
Turns out startswith() takes an optional start parameter :)
No behavior change.
Layton Kifer [Tue, 1 Dec 2020 19:09:04 +0000 (22:09 +0300)]
[DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D92246
Saleem Abdulrasool [Mon, 30 Nov 2020 23:54:08 +0000 (23:54 +0000)]
APINotes: constify `dump` methods (NFC)
This simply marks the functions as const as they do not mutate the
value. This is useful for debugging iterations during development.
NFCI.
Zahira Ammarguellat [Fri, 6 Nov 2020 14:38:22 +0000 (06:38 -0800)]
Argument dependent lookup with class argument is recursing into base
classes that haven't been instantiated. This is generating an assertion
in DeclTemplate.h. Fix for Bug25668.
Fangrui Song [Tue, 1 Dec 2020 18:33:18 +0000 (10:33 -0800)]
static const char *const foo => const char foo[]
By default, a non-template variable of non-volatile const-qualified type
having namespace-scope has internal linkage, so no need for `static`.
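As a small illustration of the rewrite in the title, with hypothetical names `OldName` and `NewName`:

```cpp
#include <cstddef>

// Before: a const pointer object that points at a separate string literal.
static const char *const OldName = "foo";

// After: a const array at namespace scope, which already has internal
// linkage, so `static` is not needed.
const char NewName[] = "foo";

// The array form also keeps the length in its type.
constexpr std::size_t NewNameLen = sizeof(NewName) - 1;
```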
Fangrui Song [Tue, 1 Dec 2020 18:22:32 +0000 (10:22 -0800)]
[ELF][test] Fix lto/version-script2.ll
Arthur Eubanks [Tue, 1 Dec 2020 18:14:38 +0000 (10:14 -0800)]
[LTO][NewPM] Run verifier when doing LTO
This matches the legacy PM.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D92138
Bardia Mahjour [Tue, 1 Dec 2020 17:48:36 +0000 (12:48 -0500)]
Revert "[LV] Epilogue Vectorization with Optimal Control Flow"
This reverts commit 9c5504adceb544d9954ddb8ff3035a414f4b1423.
Reverting to investigate build failure in http://lab.llvm.org:8011/#/builders/98/builds/1461/steps/9
Louis Dionne [Tue, 24 Nov 2020 17:29:08 +0000 (12:29 -0500)]
[libc++] Optimize the number of assignments in std::exclusive_scan
Reported in https://twitter.com/blelbach/status/1169807347142676480
Differential Revision: https://reviews.llvm.org/D67273
Rahman Lavaee [Tue, 1 Dec 2020 17:20:34 +0000 (09:20 -0800)]
Let .llvm_bb_addr_map section use the same unique id as its associated .text section.
Currently, `llvm_bb_addr_map` sections are generated per section name because we use
the `LinkedToSymbol` argument of getELFSection. This causes the address map tables of functions
to be grouped into the same section when `-function-sections=true -unique-section-names=false`
is used, which is not the intended behaviour. This patch lets the unique id of every `.text`
section propagate to the associated `.llvm_bb_addr_map` section.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92113
Nikita Popov [Tue, 1 Dec 2020 17:19:40 +0000 (18:19 +0100)]
[BasicAA] Add test for suboptimal result with unknown sizes (NFC)
Roman Lebedev [Tue, 1 Dec 2020 16:50:56 +0000 (19:50 +0300)]
[NFC][clang-tidy] Port rename_check.py to Python3
Nico Weber [Tue, 1 Dec 2020 16:46:15 +0000 (11:46 -0500)]
clang/darwin: Use response files with ld64.lld.darwinnew
The new MachO lld just grew support for response files in D92149, so let
the clang driver use it.
Differential Revision: https://reviews.llvm.org/D92399
Bardia Mahjour [Tue, 1 Dec 2020 16:57:16 +0000 (11:57 -0500)]
[LV] Epilogue Vectorization with Optimal Control Flow
This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.
Similar to D88819, this patch achieves epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D89566
Fangrui Song [Tue, 1 Dec 2020 16:59:54 +0000 (08:59 -0800)]
[ELF] Error for undefined foo@v1
If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).
GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).
This patch implements the error.
Depends on D92258
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92260
Nikita Popov [Fri, 2 Oct 2020 19:41:19 +0000 (21:41 +0200)]
[MemCpyOpt] Port to MemorySSA
This is a straightforward port of MemCpyOpt to MemorySSA following
the approach of D26739. MemDep queries are replaced with MSSA queries
without changing the overall structure of the pass. Some care has
to be taken to account for differences between these APIs
(MemDep also returns reads, MSSA doesn't).
Differential Revision: https://reviews.llvm.org/D89207
Fangrui Song [Tue, 1 Dec 2020 16:54:01 +0000 (08:54 -0800)]
[ELF] Make foo@@v1 resolve undefined foo@v1
The symbol resolution rules for versioned symbols are:
* foo@@v1 (default version) resolves both undefined foo and foo@v1
* foo@v1 (non-default version) resolves undefined foo@v1
Note, foo@@v1 must be defined (the assembler errors if attempting to
create an undefined foo@@v1).
For defined foo@@v1 in a shared object, we call `SymbolTable::addSymbol` twice,
one for foo and the other for foo@v1. We don't do the same for object files, so
foo@@v1 defined in one object file incorrectly does not resolve a foo@v1
reference in another object file.
This patch fixes the issue by reusing the --wrap code to redirect symbols in
object files. This has to be done after processing input files because
foo and foo@v1 are two separate symbols if we haven't seen foo@@v1.
Add a helper `Symbol::getVersionSuffix` to retrieve the optional trailing
`@...` or `@@...` from the possibly truncated symbol name.
Depends on D92258
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D92259
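For illustration only, a suffix helper of the kind mentioned above might look like the following. `versionSuffix` and `isDefaultVersion` are hypothetical free functions operating on plain strings; they are not lld's `Symbol::getVersionSuffix`.

```cpp
#include <string>

// Returns the trailing "@..." or "@@..." part of a symbol name, or an empty
// string when the name carries no version.
inline std::string versionSuffix(const std::string &Name) {
  std::string::size_type Pos = Name.find('@');
  return Pos == std::string::npos ? std::string() : Name.substr(Pos);
}

// A default version is spelled with a double '@', as in foo@@v1.
inline bool isDefaultVersion(const std::string &Name) {
  return versionSuffix(Name).compare(0, 2, "@@") == 0;
}
```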
Fangrui Song [Tue, 1 Dec 2020 16:49:14 +0000 (08:49 -0800)]
[ELF][test] Add some tests for versioned symbols in object files
Test the symbol resolution related to
* defined foo@@v1 and foo@v1 in object files/shared objects
* undefined foo@v1
* weak foo@@v1 and foo@v1
* visibility
* interaction with --wrap.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92258
Fangrui Song [Tue, 1 Dec 2020 16:39:00 +0000 (08:39 -0800)]
[X86] Support modifier @PLTOFF for R_X86_64_PLTOFF64
`gcc -mcmodel=large` can emit @PLTOFF.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92294
Juneyoung Lee [Tue, 1 Dec 2020 16:01:42 +0000 (01:01 +0900)]
[InstSimplify] Add tests that fold instructions with poison operands (NFC)
Clement Courbet [Tue, 1 Dec 2020 08:44:23 +0000 (09:44 +0100)]
[MergeICmps] Fix missing split.
We were not correctly splitting blocks for chains of length 1.
Before this change, additional instructions for blocks in chains of
length 1 were not split off from the block before removing it (this was
done correctly for longer chains).
If this first block contained an instruction referenced elsewhere,
deleting the block would invalidate the produced value.
This caused a miscompile which motivated D92297 (before D17993,
nonnull and dereferenceable attributes were not added, so MergeICmps was
not triggered). The new test gep-references-bb.ll demonstrates the issue.
The regression was introduced in rG0efadbbcdeb82f5c14f38fbc2826107063ca48b2.
This supersedes D92364.
Test case by MaskRay (Fangrui Song).
Differential Revision: https://reviews.llvm.org/D92375
Aaron En Ye Shi [Tue, 1 Dec 2020 15:46:19 +0000 (15:46 +0000)]
[HIP] Fix static-lib test CHECK bug
Fix hip test failures that were introduced by
previous changes to hip-toolchain-rdc-static-lib.hip
test. The .*lld.* is matching a longer string than
expected.
Differential Revision: https://reviews.llvm.org/D92342
Sanjay Patel [Tue, 1 Dec 2020 15:35:24 +0000 (10:35 -0500)]
[x86] adjust cost model values for minnum/maxnum with fast-math-flags
Without FMF, we lower these intrinsics into something like this:
vmaxsd %xmm0, %xmm1, %xmm2
vcmpunordsd %xmm0, %xmm0, %xmm0
vblendvpd %xmm0, %xmm1, %xmm2, %xmm0
But if we can ignore NANs, the single min/max instruction is enough
because there is no need to fix up the x86 logic that corresponds to
X > Y ? X : Y.
We probably want to make other adjustments for FP intrinsics with FMF
to account for specialized codegen (for example, FSQRT).
Differential Revision: https://reviews.llvm.org/D92337
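The difference between the two lowerings can be modeled in scalar C++. This is only a sketch: `selectMax` stands in for the `X > Y ? X : Y` behavior mentioned above, and the operand order chosen in `maxnumWithFixup` is an assumption made for the example, not a transcription of the assembly.

```cpp
#include <cmath>

// Models "X > Y ? X : Y": if either input is NaN the comparison is false,
// so the second operand is returned.
inline double selectMax(double X, double Y) { return X > Y ? X : Y; }

// maxnum-style result: if exactly one input is NaN, return the other one.
inline double maxnumWithFixup(double X, double Y) {
  double M = selectMax(Y, X);   // a NaN in Y falls through to X
  return std::isnan(X) ? Y : M; // a NaN in X is patched up with Y
}

// When NaNs can be ignored, the single selection is enough.
inline double maxnumNoNaNs(double X, double Y) { return selectMax(X, Y); }
```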
Benjamin Kramer [Tue, 1 Dec 2020 15:29:02 +0000 (16:29 +0100)]
[DAG] Remove unused variable. NFC.
David Green [Tue, 1 Dec 2020 15:05:55 +0000 (15:05 +0000)]
[ARM] Mark select and selectcc of MVE vector operations as expand.
We already expand select and select_cc in codegenprepare, but they can
still be generated under some situations. Explicitly mark them as expand
to ensure they are not produced, leading to a failure to select the
nodes.
Differential Revision: https://reviews.llvm.org/D92373
Sanjay Patel [Tue, 1 Dec 2020 13:51:19 +0000 (08:51 -0500)]
[InstCombine] canonicalize sign-bit-shift of difference to ext(icmp)
icmp is the preferred spelling in IR because icmp analysis is
expected to be better than any other analysis. This should
lead to more follow-on folding potential.
It's difficult to say exactly what we should do in codegen to
compensate. For example on AArch64, which of these is preferred:
sub w8, w0, w1
lsr w0, w8, #31
vs:
cmp w0, w1
cset w0, lt
If there are perf regressions, then we should deal with those in
codegen on a case-by-case basis.
A possible motivating example for better optimization is shown in:
https://llvm.org/PR43198 but that will require other transforms
before anything changes there.
Alive proof:
https://rise4fun.com/Alive/o4E
Name: sign-bit splat
Pre: C1 == (width(%x) - 1)
%s = sub nsw %x, %y
%r = ashr %s, C1
=>
%c = icmp slt %x, %y
%r = sext %c
Name: sign-bit LSB
Pre: C1 == (width(%x) - 1)
%s = sub nsw %x, %y
%r = lshr %s, C1
=>
%c = icmp slt %x, %y
%r = zext %c
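The two Alive proofs above can be mirrored by a small scalar model for 32-bit values. The function names are hypothetical, and the "no overflow" precondition stands in for `sub nsw`.

```cpp
#include <cstdint>

// Precondition (mirrors `sub nsw`): X - Y must not overflow.
inline uint32_t signBitLsbViaShift(int32_t X, int32_t Y) {
  int32_t S = X - Y;
  return static_cast<uint32_t>(S) >> 31; // lshr by width - 1
}

// zext(icmp slt X, Y)
inline uint32_t signBitLsbViaCompare(int32_t X, int32_t Y) {
  return X < Y ? 1u : 0u;
}

// sext(icmp slt X, Y): all ones when X < Y, otherwise zero.
inline int32_t signBitSplatViaCompare(int32_t X, int32_t Y) {
  return X < Y ? -1 : 0;
}
```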
Raphael Isemann [Tue, 1 Dec 2020 14:49:51 +0000 (15:49 +0100)]
[lldb][NFC] Modernize and cleanup TestClassTemplateParameterPack
* Un-inline the test.
* Use expect_expr everywhere and also check all involved types.
* Clang-format the test sources.
* Explain what we're actually testing with the 'C' and 'D' templates.
* Split out the non-template-parameter-pack part of the test into its own small test.
Simon Pilgrim [Tue, 1 Dec 2020 14:21:22 +0000 (14:21 +0000)]
[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111)
Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.
Florian Hahn [Mon, 30 Nov 2020 15:43:39 +0000 (15:43 +0000)]
[ConstraintElimination] Decompose GEP %ptr, ZEXT(SHL()).
Add support to decompose a GEP with a ZEXT(SHL()) operand.
Nico Weber [Tue, 1 Dec 2020 00:54:04 +0000 (19:54 -0500)]
lld/ELF: Make three rarely-used flags work with --reproduce
All three use readFile() for their argument so their argument file is
already copied to the tar, but we weren't rewriting the argument to
point to the path used in the tar file.
No test because the change is trivial (several other flags in
createResponseFile() also aren't tested, likely for the same reason.)
Differential Revision: https://reviews.llvm.org/D92356
Alexey Baturo [Tue, 1 Dec 2020 12:58:31 +0000 (15:58 +0300)]
[RISCV][crt] support building without init_array
Reviewed By: luismarques, phosek, kito-cheng
Differential Revision: https://reviews.llvm.org/D87997
Kazushi (Jam) Marukawa [Tue, 1 Dec 2020 11:08:22 +0000 (20:08 +0900)]
[VE] Add vmul and vdiv intrinsic instructions
Add vmul and vdiv intrinsic instructions and regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92377
Simon Pilgrim [Tue, 1 Dec 2020 13:51:27 +0000 (13:51 +0000)]
[X86] Add PR48223 usubsat test case
Bhramar Vatsa [Tue, 1 Dec 2020 13:35:04 +0000 (16:35 +0300)]
[InstCombine] Optimize away the unnecessary multi-use sign-extend
C.f. https://bugs.llvm.org/show_bug.cgi?id=47765
Added a case for handling the sign-extend (Shl+AShr) for multiple uses,
to optimize it away for an individual use,
when the demanded bits aren't affected by sign-extend.
https://rise4fun.com/Alive/lgf
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D91343
Roman Lebedev [Tue, 1 Dec 2020 12:48:32 +0000 (15:48 +0300)]
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2
If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.
AndreyChurbanov [Tue, 1 Dec 2020 13:53:21 +0000 (16:53 +0300)]
[OpenMP] libomp: add UNLIKELY hints to rarely executed branches
Added UNLIKELY hint to one-time or rarely executed branches.
This improves performance of the library on some tasking benchmarks.
Differential Revision: https://reviews.llvm.org/D92322
Sanjay Patel [Tue, 1 Dec 2020 12:37:06 +0000 (07:37 -0500)]
[InstCombine] add tests for sign-bit-shift-of-sub; NFC
Hans Wennborg [Tue, 1 Dec 2020 12:50:49 +0000 (13:50 +0100)]
Remove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now
Roman Lebedev [Tue, 1 Dec 2020 12:47:04 +0000 (15:47 +0300)]
Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold"
It seems I missed some check lines; temporarily reverting,
will reland momentarily.
This reverts commit aa1aa135097ecfab6d9917a435142030eff0a226.
Roman Lebedev [Tue, 1 Dec 2020 12:33:12 +0000 (15:33 +0300)]
[NFC][InstCombine] sext.ll: @test9: avoid only differently-cased names for values and block names
Roman Lebedev [Tue, 1 Dec 2020 12:11:14 +0000 (15:11 +0300)]
[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold
If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.
Roman Lebedev [Tue, 1 Dec 2020 12:04:40 +0000 (15:04 +0300)]
[NFC][InstCombine] Improve vector undef test coverage for sext(ashr(shl(trunc()))) fold
Roman Lebedev [Tue, 1 Dec 2020 12:00:15 +0000 (15:00 +0300)]
[InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide type (PR48343)
It is not correct to compute that new shift amount in its narrow type
and only then extend it into the wide type:
----------------------------------------
Optimization: PR48343 good
Precondition: (width(%X) == width(%r))
%o0 = trunc %X
%o1 = shl %o0, %Y
%o2 = ashr %o1, %Y
%r = sext %o2
=>
%n0 = sext %Y
%n1 = sub width(%o0), %n0
%n2 = sub width(%X), %n1
%n3 = shl %X, %n2
%r = ashr %n3, %n2
Done: 2016
Optimization is correct!
----------------------------------------
Optimization: PR48343 bad
Precondition: (width(%X) == width(%r))
%o0 = trunc %X
%o1 = shl %o0, %Y
%o2 = ashr %o1, %Y
%r = sext %o2
=>
%n0 = sub width(%o0), %Y
%n1 = sub width(%X), %n0
%n2 = sext %n1
%n3 = shl %X, %n2
%r = ashr %n3, %n2
Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i9 %r
Example:
%X i9 = 0x000 (0)
%Y i4 = 0x3 (3)
%o0 i4 = 0x0 (0)
%o1 i4 = 0x0 (0)
%o2 i4 = 0x0 (0)
%n0 i4 = 0x1 (1)
%n1 i4 = 0x8 (8, -8)
%n2 i9 = 0x1F8 (504, -8)
%n3 i9 = 0x000 (0)
Source value: 0x000 (0)
Target value: undef
I.e. we should be computing it in the wide type from the beginning.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48343
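The counterexample above (i4 narrow type, i9 wide type, %Y = 3) can be replayed with masked integer arithmetic. This is a sketch with hypothetical helper names; the widths 4 and 9 are hard-coded to match the example.

```cpp
#include <cstdint>

// Sign-extend a 4-bit value (held in the low bits) to 9 bits.
inline uint32_t sext4To9(uint32_t V) {
  return (V & 0x8) ? ((V & 0xF) | 0x1F0) : (V & 0xF);
}

// "Bad" order: do both subtractions in i4, then sign-extend the result.
inline uint32_t shiftAmountNarrow(uint32_t Y) {
  uint32_t N0 = (4 - Y) & 0xF;  // sub width(%o0), %Y
  uint32_t N1 = (9 - N0) & 0xF; // sub width(%X), %n0, wraps in 4 bits
  return sext4To9(N1);
}

// "Good" order: sign-extend %Y first, then do both subtractions in i9.
inline uint32_t shiftAmountWide(uint32_t Y) {
  uint32_t N0 = sext4To9(Y);
  uint32_t N1 = (4 - N0) & 0x1FF;
  return (9 - N1) & 0x1FF;
}
```

For %Y = 3 the narrow computation yields 0x1F8 (504), which is not a valid shift amount for a 9-bit value, while the wide computation yields 8.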
Roman Lebedev [Tue, 1 Dec 2020 11:49:28 +0000 (14:49 +0300)]
[NFC][InstCombine] Add PR48343 miscompiled testcase
Roman Lebedev [Tue, 1 Dec 2020 11:48:46 +0000 (14:48 +0300)]
[NFC][InstCombine] Autogenerate sext.ll test checklines
Roman Lebedev [Tue, 1 Dec 2020 08:07:28 +0000 (11:07 +0300)]
[SimplifyCFG] FoldBranchToCommonDest: don't require that cmp of br is last instruction
There is no correctness need for that, and since we allow live-out
uses, this could theoretically happen, because currently nothing
will move the cond to right before the branch in those tests.
But regardless, lifting that restriction even makes the transform
easier to understand.
This makes the transform happen in 81 more cases (+0.55%)
Roman Lebedev [Tue, 1 Dec 2020 07:59:08 +0000 (10:59 +0300)]
[NFC][SimplifyCFG] fold-branch-to-common-dest: add tests with cond of br not being the last op
Simon Pilgrim [Tue, 1 Dec 2020 11:56:12 +0000 (11:56 +0000)]
[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111)
Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds.
The SSE42 test diffs are relatively benign - it's avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away.
Differential Revision: https://reviews.llvm.org/D91876
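A scalar model of why a select on an unsigned compare of a wrapped sum is equivalent to a saturating add. This sketch uses 8-bit lanes and hypothetical function names; the exact operand pattern matched by the DAG combine is not spelled out in the message, so the select form below is an assumption.

```cpp
#include <cstdint>

// select(icmp ult (x + y), x), -1, (x + y): if the wrapped sum is smaller
// than an operand, the addition overflowed, so clamp to all ones.
inline uint8_t addSatViaSelect(uint8_t X, uint8_t Y) {
  uint8_t Sum = static_cast<uint8_t>(X + Y);
  return Sum < X ? 0xFF : Sum;
}

// Reference: compute in a wider type and clamp.
inline uint8_t addSatReference(uint8_t X, uint8_t Y) {
  unsigned Wide = static_cast<unsigned>(X) + static_cast<unsigned>(Y);
  return Wide > 0xFF ? 0xFF : static_cast<uint8_t>(Wide);
}
```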
Sven van Haastregt [Tue, 1 Dec 2020 11:33:10 +0000 (11:33 +0000)]
[OpenCL] Allow pointer-to-pointer kernel args beyond CL 1.2
The restriction on pointer-to-pointer kernel arguments has been
relaxed in OpenCL 2.0. Apply the same address space restrictions for
pointer argument types to the inner pointer types.
Differential Revision: https://reviews.llvm.org/D92091
Cullen Rhodes [Mon, 2 Nov 2020 13:02:32 +0000 (13:02 +0000)]
[LV] Clamp VF hint when unsafe
In the following loop the dependence distance is 2 and can only be
vectorized if the vector length is no larger than this.
  void foo(int *a, int *b, int N) {
  #pragma clang loop vectorize(enable) vectorize_width(4)
    for (int i=0; i<N; ++i) {
      a[i + 2] = a[i] + b[i];
    }
  }
However, when specifying a VF of 4 via a loop hint this loop is
vectorized. According to [1][2], loop hints are ignored if the
optimization is not safe to apply.
This patch introduces a check to bail out of vectorization if the user
specified VF is greater than the maximum feasible VF, unless explicitly
forced with '-force-vector-width=X'.
[1] https://llvm.org/docs/LangRef.html#llvm-loop-vectorize-and-llvm-loop-interleave
[2] https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations
Reviewed By: sdesmalen, fhahn, Meinersbur
Differential Revision: https://reviews.llvm.org/D90687
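To see why a VF of 4 is unsafe for a dependence distance of 2, the loop above can be simulated by loading a whole chunk before storing it, which is roughly what a vector iteration does. This is a simplified model with hypothetical function names, not the vectorizer's actual code generation.

```cpp
#include <algorithm>
#include <vector>

// Reference: the scalar loop from the commit message. A needs N + 2 elements.
inline std::vector<int> runScalar(std::vector<int> A,
                                  const std::vector<int> &B, int N) {
  for (int I = 0; I < N; ++I)
    A[I + 2] = A[I] + B[I];
  return A;
}

// Simulated vector execution: each chunk of VF elements is loaded and
// computed in full before any of its results are stored.
inline std::vector<int> runChunked(std::vector<int> A,
                                   const std::vector<int> &B, int N, int VF) {
  for (int I = 0; I < N; I += VF) {
    int Len = std::min(VF, N - I);
    std::vector<int> Tmp(Len);
    for (int K = 0; K < Len; ++K)
      Tmp[K] = A[I + K] + B[I + K];
    for (int K = 0; K < Len; ++K)
      A[I + K + 2] = Tmp[K];
  }
  return A;
}
```

With VF = 2 every load still sees the value the scalar loop would have stored; with VF = 4 the last two lanes of a chunk read elements that the first two lanes have not yet written.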
Simon Pilgrim [Tue, 1 Dec 2020 10:59:53 +0000 (10:59 +0000)]
[InstCombine][X86] Fold addsub intrinsic to fadd/fsub depending on demanded elts (PR46277)
Caroline Concatto [Mon, 16 Nov 2020 10:14:28 +0000 (10:14 +0000)]
[NFC][CostModel]Extend class IntrinsicCostAttributes to use ElementCount Type
This patch replaces the attribute `unsigned VF` in the class
IntrinsicCostAttributes by `ElementCount VF`.
This is a non-functional change to help upcoming patches to compute the cost
model for scalable vector inside this class.
Differential Revision: https://reviews.llvm.org/D91532
Kadir Cetinkaya [Tue, 1 Dec 2020 08:58:56 +0000 (09:58 +0100)]
[clang] Enable code completion of designated initializers in Compound Literal Expressions
PreferredType was not set when parsing compound literals, hence
designated initializers were not available as code completion suggestions.
This patch sets the PreferredType to the parsed type for the following
initializer list.
Fixes https://github.com/clangd/clangd/issues/142.
Differential Revision: https://reviews.llvm.org/D92370
Florian Hahn [Mon, 30 Nov 2020 15:26:51 +0000 (15:26 +0000)]
[ConstraintElimination] Decompose GEP %ptr, SHL().
Add support to decompose a GEP with an SHL operand.
Kazushi (Jam) Marukawa [Fri, 27 Nov 2020 13:19:43 +0000 (22:19 +0900)]
[VE] Add vadd and vsub intrinsic instructions
Add vadd and vsub intrinsic instructions and regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92332
Simon Pilgrim [Tue, 1 Dec 2020 10:35:52 +0000 (10:35 +0000)]
[InstCombine][X86] Add test coverage showing failure to simplify addsub intrinsics to fadd/fsub
If we only use odd/even lanes then we just need fadd/fsub ops
Sjoerd Meijer [Tue, 1 Dec 2020 09:17:10 +0000 (09:17 +0000)]
ExtractValue instruction costs
Instruction ExtractValue wasn't handled in
LoopVectorizationCostModel::getInstructionCost(). As a result, it was modeled
as a mul, which is not really accurate. Since it is free (most of the time),
this now gets a cost of 0 using getInstructionCost.
This is a follow-up of D92208, that required changing this regression test.
In a follow up I will look at InsertValue which also isn't handled yet.
Differential Revision: https://reviews.llvm.org/D92317
David Green [Tue, 1 Dec 2020 10:40:04 +0000 (10:40 +0000)]
[AArch64] Update pass pipeline test. NFC
David Green [Tue, 1 Dec 2020 09:04:36 +0000 (09:04 +0000)]
[ARM] PREDICATE_CAST demanded bits
The PREDICATE_CAST node is used to model moves between MVE predicate
registers and GPRs, and eventually becomes a VMSR p0, rn. When moving to
a predicate only the bottom 16 bits of the source register are
demanded. This adds a simple fold for that, allowing it to potentially
remove instructions like uxth.
Differential Revision: https://reviews.llvm.org/D92213
Jay Foad [Tue, 1 Dec 2020 10:15:22 +0000 (10:15 +0000)]
[AMDGPU] Simplify some generation checks. NFC.
Hans Wennborg [Tue, 1 Dec 2020 10:14:48 +0000 (11:14 +0100)]
[gn build] Manually merge 40659cd
Georgii Rymar [Tue, 10 Nov 2020 13:33:57 +0000 (16:33 +0300)]
[obj2yaml] - Teach tool to emit the "SectionHeaderTable" key and sort sections by file offset.
Currently when we dump sections, we dump them in the order
specified in the section header table.
As a result, the order in the output might not match the order in the file.
This patch starts sorting them by file offset when dumping.
When the order in the section header table doesn't match the order
in the file, we should emit the "SectionHeaderTable" key. This patch does it.
Differential revision: https://reviews.llvm.org/D91249
Jan Svoboda [Tue, 1 Dec 2020 09:40:50 +0000 (10:40 +0100)]
[clang][cli] Port HeaderSearch option flags to new option parsing system
Depends on D83697.
Reviewed By: dexonsmith
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83940
Georgii Rymar [Mon, 30 Nov 2020 13:16:34 +0000 (16:16 +0300)]
[llvm-readobj][test] - Merge 2 test cases together.
This merges `invalid-attr-section-size.test` and `invalid-attr-version.test`
into `invalid-attributes-sec.test`.
This gives us a single place where other related test cases can be added.
Differential revision: https://reviews.llvm.org/D92316
David Chisnall [Tue, 1 Dec 2020 09:48:25 +0000 (09:48 +0000)]
[GNU ObjC] Fix a regression listing methods twice.
Methods synthesized from declared properties were being added to the
method lists twice. This came from the change to list them in the
class's method list, which missed removing the place in CGObjCGNU that
added them again.
Reviewed By: lanza
Differential Revision: https://reviews.llvm.org/D91874
Georgii Rymar [Tue, 1 Dec 2020 09:08:46 +0000 (12:08 +0300)]
[llvm-readobj] - Introduce `ObjDumper::reportUniqueWarning(const Twine &Msg)`.
This introduces an overload of `reportUniqueWarning` which avoids
the need to use `createError` in many places.
Differential revision: https://reviews.llvm.org/D92371
Jan Svoboda [Fri, 20 Nov 2020 11:49:51 +0000 (12:49 +0100)]
[clang][cli] Port DependencyOutput option flags to new option parsing system
Depends on D91861.
Reviewed By: dexonsmith
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83694
Eugene Zhulenev [Tue, 1 Dec 2020 08:44:32 +0000 (00:44 -0800)]
[mlir] AsyncRuntime: disable threading until test flakiness is fixed
ExecutionEngine/LLJIT do not run globals destructors in loaded dynamic libraries when destroyed, and threads managed by ThreadPool can race with program termination, and it leads to segfaults.
TODO: Re-enable threading after fixing a problem with destructors, or removing static globals from dynamic library.
Differential Revision: https://reviews.llvm.org/D92368
Jan Svoboda [Fri, 20 Nov 2020 11:49:51 +0000 (12:49 +0100)]
[clang][cli] Port Frontend option flags to new option parsing system
Depends on D91861.
Reviewed By: dexonsmith
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83697
Jan Svoboda [Fri, 20 Nov 2020 11:49:51 +0000 (12:49 +0100)]
[clang][cli] Split DefaultAnyOf into a default value and ImpliedByAnyOf
This makes the options API composable, allows boolean flags to imply non-boolean values and makes the code more logical (IMO).
Differential Revision: https://reviews.llvm.org/D91861
Jan Svoboda [Fri, 20 Nov 2020 09:26:07 +0000 (10:26 +0100)]
[clang][cli] Factor out call to EXTRACTOR in generateCC1CommandLine (NFC)
Reviewed By: Bigcheese, dexonsmith
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83211
Kristof Beyls [Mon, 30 Nov 2020 12:43:44 +0000 (13:43 +0100)]
collect_and_build_with_pgo.py: adapt to monorepo
Differential Revision: https://reviews.llvm.org/D92328
Georgii Rymar [Fri, 27 Nov 2020 08:57:03 +0000 (11:57 +0300)]
[llvm-readelf] - Switch from `reportWarning` to `reportUniqueWarning` in `DynRegionInfo`.
This is part of our earlier plan to convert all calls to
`reportUniqueWarning` and then rename it to just `reportWarning`.
I was a bit unsure about this particular change at first, because it doesn't add
new functionality: it seems impossible to trigger a duplicate warning currently.
At the same time, I find the plan very reasonable.
With this change we can be sure that `DynRegionInfo` can't report duplicate
warnings, which is a nice property for possible refactorings and further tool development.
Differential revision: https://reviews.llvm.org/D92224
Martin Storsjö [Fri, 20 Nov 2020 09:29:27 +0000 (11:29 +0200)]
[compiler-rt] [emutls] Handle unused parameters in a compiler agnostic way
The MSVC specific pragmas disable this warning, but the pragmas themselves
(when not guarded by any _MSC_VER ifdef) cause warnings for other targets,
e.g. when targeting mingw.
Instead silence the MSVC warnings about unused parameters by casting
the parameters to void.
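The cast-to-void idiom described above can be sketched as follows. The function name `scaleValue` and its parameters are hypothetical and are not taken from emutls.c; the point is only that `(void)param;` needs no compiler-specific pragma.

```cpp
// Hypothetical function with a fixed signature: the second parameter exists
// only to match an interface and is intentionally unused.
int scaleValue(int Value, void *UnusedContext) {
  // Casting to void marks the parameter as deliberately unused. This is
  // plain C/C++, so it works the same for MSVC, GCC, and Clang.
  (void)UnusedContext;
  return Value * 2;
}
```

Because the cast is ordinary language syntax, no `_MSC_VER` guard is needed around it.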
Differential Revision: https://reviews.llvm.org/D91851
Georgii Rymar [Fri, 27 Nov 2020 10:34:30 +0000 (13:34 +0300)]
[llvm-readelf/obj] - Move unique warning handling logic to the `ObjDumper`.
This moves the `reportUniqueWarning` method to the base class.
My motivation is the following:
I've experimented with replacing `reportWarning` calls with `reportUniqueWarning`
in the ELF dumper. I've found that, for example, to remove them from the `DynRegionInfo` helper
class, it is worth passing a dumper instance to it (to be able to call dumper()->reportUniqueWarning()).
The problem was that `ELFDumper<ELFT>` is a template class. I had to make `DynRegionInfo` a template
and make lots of minor changes everywhere, which did not look reasonable/nice.
At the same time, I guess one day other dumpers like COFF/MachO/Wasm etc. might want to
start using the `reportUniqueWarning` API too. So it looks reasonable to move the logic to the
base class.
With that, the problem of passing the dumper instance is gone.
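A minimal sketch of the design motivation, assuming simplified stand-in types: `DumperBase`, `FormatDumper`, and `RegionHelper` are hypothetical names, and `std::string` replaces `Twine`. It is not the actual llvm-readobj code; it only illustrates why keeping the deduplication in a non-templated base class lets a helper stay non-templated.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the dumper base class.
class DumperBase {
public:
  virtual ~DumperBase() = default;

  // Records Msg only the first time it is seen; returns true if recorded.
  bool reportUniqueWarning(const std::string &Msg) {
    if (!Seen.insert(Msg).second)
      return false; // Duplicate: already reported once.
    Emitted.push_back(Msg);
    return true;
  }

  const std::vector<std::string> &emitted() const { return Emitted; }

private:
  std::unordered_set<std::string> Seen;
  std::vector<std::string> Emitted;
};

// A templated dumper, standing in for something like ELFDumper<ELFT>.
template <typename FormatT> class FormatDumper : public DumperBase {};

// A non-templated helper, standing in for DynRegionInfo. It only needs a
// pointer to the base class, so it does not have to become a template.
struct RegionHelper {
  explicit RegionHelper(DumperBase *D) : Dumper(D) {}

  void check(unsigned long Size) {
    if (Size == 0)
      Dumper->reportUniqueWarning("region has zero size");
  }

  DumperBase *Dumper;
};
```

Calling `check(0)` twice on the same helper records the warning once, regardless of which `FormatDumper` instantiation owns the state.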
Differential revision: https://reviews.llvm.org/D92218
Kazu Hirata [Tue, 1 Dec 2020 06:28:26 +0000 (22:28 -0800)]
[CodeView] Remove unused declaration collectInlineSiteChildren (NFC)
The function definition was removed on Sep 7, 2016 in commit
a9f4cc9510546f5728258524d344a3e03e43500b. The declaration seems to be
unused since then.
Wei Wang [Tue, 17 Nov 2020 18:43:02 +0000 (10:43 -0800)]
[Remarks][2/2] Expand remarks hotness threshold option support in more tools
This is change #2 of 2 that make the remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
the hotness threshold from the profile summary via the special value 'auto'.
This change expands the remarks hotness threshold option
-fdiagnostics-hotness-threshold in clang and *-remarks-hotness-threshold in
other tools to use the hotness threshold from the profile summary.
Remarks hotness filtering relies on several driver options. The table below lists
how the options interact and affect the final remarks output:
| profile | hotness | threshold | remarks printed |
|---------|---------|-----------|-----------------|
| No | No | No | All |
| No | No | Yes | None |
| No | Yes | No | All |
| No | Yes | Yes | None |
| Yes | No | No | All |
| Yes | No | Yes | None |
| Yes | Yes | No | All |
| Yes | Yes | Yes | >=threshold |
In the presence of a profile summary, it is often more desirable to directly use
the hotness threshold from the profile summary. The new argument value 'auto'
indicates that the threshold will be synced with the hotness threshold from the
profile summary during compilation. The 'auto' threshold relies on the availability
of a profile summary. If that information is missing, no remarks are generated.
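The table above can be restated as a small decision function. This is only a sketch of the table's logic: the enum and function names are hypothetical and do not correspond to code in clang or LLVM.

```cpp
// Which remarks end up being printed, per the table above.
enum class RemarkOutput { All, None, AtOrAboveThreshold };

// Hypothetical helper mapping the three table columns to the result column.
RemarkOutput remarkOutput(bool HasProfile, bool HotnessRequested,
                          bool ThresholdSet) {
  // Without a threshold nothing is filtered.
  if (!ThresholdSet)
    return RemarkOutput::All;
  // A threshold can only be applied when hotness information is both
  // available (profile) and requested (hotness option).
  if (HasProfile && HotnessRequested)
    return RemarkOutput::AtOrAboveThreshold;
  // Threshold set but no usable hotness: every remark is filtered out.
  return RemarkOutput::None;
}
```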
Differential Revision: https://reviews.llvm.org/D85808
Wei Wang [Tue, 17 Nov 2020 18:37:59 +0000 (10:37 -0800)]
[Remarks][1/2] Expand remarks hotness threshold option support in more tools
This is change #1 of 2 that make the remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
the hotness threshold from the profile summary via the special value 'auto'.
This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept a remarks hotness threshold. All the tools that use it are updated with
remarks hotness threshold options:
* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='
Differential Revision: https://reviews.llvm.org/D85809
Greg Parker [Sun, 29 Nov 2020 01:54:29 +0000 (17:54 -0800)]
[DSE] Remove a redundant call to getLocForWriteEx()
Differential Revision: https://reviews.llvm.org/D92263
Raman Tenneti [Tue, 1 Dec 2020 05:06:47 +0000 (21:06 -0800)]
Initial commit of mktime.
This introduces mktime to LLVM libc, based on C99/C2X/Single Unix Spec.
Co-authored-by: Jeff Bailey <jeffbailey@google.com>
This change doesn't handle TIMEZONE, tm_isdst, or leap seconds. It returns -1 for invalid dates. I have verified the return values for all possible dates against glibc's mktime.
TODO:
+ Handle leap seconds.
+ Handle out of range time and date values that don't overflow or underflow.
+ Implement the following suggestion from Siva: as we start accumulating the seconds, we should be able to check whether the next amount of seconds to be added can lead to an overflow. If it does, return the overflow value. If not, keep accumulating. The benefit is that we don't have to validate every input, and we also don't need the special cases for sizeof(time_t) == 4.
+ Handle the timezone and updating of tm_isdst.
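The overflow-checked accumulation suggested in the TODO could look roughly like the sketch below. It is not the LLVM libc implementation: `addChecked` and `accumulateSeconds` are hypothetical names, all inputs are assumed to be non-negative and already split into days, hours, minutes, and seconds, and failure is signalled with a `bool` rather than a specific overflow value.

```cpp
#include <cstdint>
#include <limits>

// Adds Amount to Total, or returns false (leaving Total unchanged) if the
// sum would exceed the maximum of TimeT. Assumes both are non-negative.
template <typename TimeT> bool addChecked(TimeT &Total, TimeT Amount) {
  if (Amount > std::numeric_limits<TimeT>::max() - Total)
    return false;
  Total += Amount;
  return true;
}

// Accumulates a second count step by step, checking before each addition.
// The same code works for 32-bit and 64-bit TimeT without special cases.
template <typename TimeT>
bool accumulateSeconds(TimeT Days, TimeT Hours, TimeT Minutes, TimeT Seconds,
                       TimeT &Out) {
  const TimeT SecondsPerDay = 86400;
  // Check the multiplication before performing it.
  if (Days > std::numeric_limits<TimeT>::max() / SecondsPerDay)
    return false;
  TimeT Total = static_cast<TimeT>(Days * SecondsPerDay);
  // Hours, Minutes, and Seconds are assumed to be small (one day or less),
  // so the products below cannot overflow on their own.
  if (!addChecked<TimeT>(Total, static_cast<TimeT>(Hours * 3600)))
    return false;
  if (!addChecked<TimeT>(Total, static_cast<TimeT>(Minutes * 60)))
    return false;
  if (!addChecked<TimeT>(Total, Seconds))
    return false;
  Out = Total;
  return true;
}
```

With a 32-bit `TimeT`, 24855 days plus 03:14:07 is the largest representable value, and one more second is rejected by the final check.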
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D91551
Craig Topper [Tue, 1 Dec 2020 04:15:04 +0000 (20:15 -0800)]
[RISCV] Rename RISCVGenSystemOperands.inc to RISCVGenSearchableTables.inc to prepare for more tables. NFC
D89449 adds more tables so renaming as a pre-commit for that.
Hendrik Greving [Thu, 12 Nov 2020 01:56:14 +0000 (17:56 -0800)]
Add MachineModuleInfo constructor with external MCContext
Adds a constructor to MachineModuleInfo and MachineModuleInfoWrapperPass that
takes an external MCContext. If provided, the external context will be used
throughout codegen instead of MMI's default one.
This enables external drivers to take ownership of data put on the MMI's context
during codegen. Otherwise, the internal context is used and destroyed when
codegen finishes.
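The ownership pattern described above can be illustrated with simplified stand-in types. `Context` and `ModuleInfo` below are hypothetical and are not the real `MCContext` or `MachineModuleInfo` classes; the sketch only shows how an optional external context can outlive the object that uses it.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a context that accumulates data during codegen.
struct Context {
  std::vector<std::string> Symbols;
};

// Hypothetical stand-in for a module-info object with two constructors.
class ModuleInfo {
public:
  // Default: use the internal context, destroyed together with this object.
  ModuleInfo() = default;

  // External: use a caller-owned context that outlives this object.
  explicit ModuleInfo(Context *ExternalContext) : External(ExternalContext) {}

  // Returns the external context if one was provided, else the internal one.
  Context &getContext() { return External ? *External : Internal; }

private:
  Context Internal;
  Context *External = nullptr;
};
```

In this sketch, data written through `getContext()` on an externally constructed `ModuleInfo` remains available to the caller after the `ModuleInfo` is destroyed.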
Differential Revision: https://reviews.llvm.org/D91313