Rafael Auler [Wed, 9 Mar 2022 18:46:15 +0000 (10:46 -0800)]
[BOLT] Increase coverage of shrink wrapping [4/5]
Change shrink-wrapping to try a priority list of save
positions, instead of trying the best one and giving up if it doesn't
work. This also increases coverage.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126114
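The "priority list" idea above can be sketched as follows. This is a minimal illustration, not BOLT's code: the function name `pick_save_position`, the (block, execution count) tuples, and the `is_valid` callback are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple


def pick_save_position(
    candidates: Iterable[Tuple[str, int]],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Return the cheapest candidate block that passes validation, if any."""
    # Assumption: a lower execution count means a cheaper place for the save.
    for block, _count in sorted(candidates, key=lambda c: c[1]):
        if is_valid(block):
            return block
    return None  # no candidate worked; leave the function untouched


candidates = [("entry", 1000), ("cold_path", 3), ("loop_exit", 40)]
# Pretend the coldest block is rejected by some legality check.
chosen = pick_save_position(candidates, lambda b: b != "cold_path")
print(chosen)  # loop_exit
```

Trying the next-best candidate instead of giving up is what increases coverage: a function is skipped only when every candidate fails.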
Rafael Auler [Sat, 21 May 2022 02:43:07 +0000 (19:43 -0700)]
[BOLT] Increase coverage of shrink wrapping [3/5]
Add the option to run -equalize-bb-counts before shrink
wrapping to avoid unnecessarily optimizing some CFGs where profile is
inaccurate but we can prove two blocks have the same frequency.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126113
Rafael Auler [Sat, 21 May 2022 02:31:07 +0000 (19:31 -0700)]
[BOLT] Increase coverage of shrink wrapping [2/5]
Refactor isStackAccess() to reflect updates by D126116. Now we only
handle simple stack accesses and delegate the rest of the cases to
getMemDataSize.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126112
Rafael Auler [Wed, 9 Mar 2022 18:46:15 +0000 (10:46 -0800)]
[BOLT] Increase coverage of shrink wrapping [1/5]
Change how function score is calculated and provide more
detailed statistics when reporting back frame optimizer and shrink
wrapping results. In these new statistics, we provide dynamic coverage
numbers. The main metric for shrink wrapping is the number of executed
stores that were saved because of shrink wrapping (push instructions
that were either entirely moved away from the hot block or converted
to a stack adjustment instruction). There is still a number of reduced
load instructions (pop) that we are not counting at the moment. Also
update alloc combiner to report dynamic numbers, as well as frame
optimizer.
For debugging purposes, we also include a list of top 10 functions
optimized by shrink wrapping. These changes are aimed at better
understanding the impact of shrink wrapping in a given binary.
We also remove an assertion in dataflow analysis so that it does not choke on
empty functions (which makes no sense).
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126111
Nico Weber [Tue, 12 Jul 2022 00:14:26 +0000 (20:14 -0400)]
[gn build] (manually) port ce233e714665
Michael Jones [Tue, 28 Jun 2022 22:08:00 +0000 (15:08 -0700)]
[libc] clean up printf error codes
Move the constants for printf's return values into core_structs, and
update the converters to match.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D128767
Raphael Isemann [Mon, 11 Jul 2022 23:41:44 +0000 (16:41 -0700)]
[lldb] Add support for escaping fish arguments
LLDB supports having globbing regexes in the process launch arguments
that will be resolved using the user's shell. This requires that we pass
the launch args to the shell and then read back the expanded arguments
using LLDB's argdumper utility.
As the shell will not just expand the globbing regexes but all special
characters, we need to escape all non-globbing characters such as $, &,
<, >, etc. as those otherwise are interpreted and removed in the step
where we expand the globbing characters. Also because the special
characters are shell-specific, LLDB needs to maintain a list of all the
characters that need to be escaped for each specific shell.
This patch adds the list of special characters that need to be escaped
for fish. Without this patch, on systems where fish is the user's shell,
having any of these special characters in your arguments or path to
the binary will cause the process launch to fail. E.g., `lldb -- ./calc 1<2`
fails without this patch. The same happens if the absolute
path to calc is in a directory that contains for example parentheses
or other special characters.
Differential revision: https://reviews.llvm.org/D104635
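The per-shell escaping described above might look roughly like this. It is a sketch under stated assumptions: the `SPECIAL_CHARS` table and the `escape_for_shell` helper are hypothetical, and the real character lists maintained by LLDB may differ.

```python
# Hypothetical per-shell tables; the real lists live in LLDB and may differ.
SPECIAL_CHARS = {
    "bash": " '\"<>()&;",
    "fish": " '\"<>()&\\|;",
}


def escape_for_shell(arg: str, shell: str) -> str:
    """Backslash-escape shell-special characters, leaving glob characters alone."""
    special = SPECIAL_CHARS[shell]
    return "".join("\\" + ch if ch in special else ch for ch in arg)


print(escape_for_shell("1<2", "fish"))    # 1\<2
print(escape_for_shell("*.txt", "fish"))  # *.txt
```

Glob characters such as `*` are deliberately absent from the table so that the shell still expands them.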
Jonas Devlieghere [Mon, 11 Jul 2022 23:29:55 +0000 (16:29 -0700)]
[lldb] Add a test to prefer exact triple matches in platform selection
Add a test that ensures we always prioritize exact triple matches when
creating platforms. This is a regression test for a (now resolved) bug
that resulted in the remote tvOS platform being selected for a tvOS
simulator binary because the ArchSpecs are compatible.
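The selection rule being tested can be sketched as a two-pass lookup. This is illustrative only: the triples, the `-simulator` compatibility rule, and `select_platform` are hypothetical stand-ins for LLDB's ArchSpec logic.

```python
from typing import List, Optional, Tuple

SIM_SUFFIX = "-simulator"


def base_triple(triple: str) -> str:
    """Toy rule: two triples are compatible if they differ only by '-simulator'."""
    return triple[: -len(SIM_SUFFIX)] if triple.endswith(SIM_SUFFIX) else triple


def select_platform(platforms: List[Tuple[str, str]], binary_triple: str) -> Optional[str]:
    # First pass: an exact triple match always wins.
    for name, triple in platforms:
        if triple == binary_triple:
            return name
    # Second pass: fall back to the first merely compatible platform.
    for name, triple in platforms:
        if base_triple(triple) == base_triple(binary_triple):
            return name
    return None


platforms = [
    ("remote-tvos", "arm64-apple-tvos"),
    ("tvos-simulator", "arm64-apple-tvos-simulator"),
]
print(select_platform(platforms, "arm64-apple-tvos-simulator"))  # tvos-simulator
```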
Florian Hahn [Mon, 11 Jul 2022 23:34:07 +0000 (16:34 -0700)]
[GlobalOpt] Add test that requires splitting up global into many.
Add test that hits the limit introduced in 4796b4ae7bccc7.
Florian Hahn [Mon, 11 Jul 2022 23:01:04 +0000 (16:01 -0700)]
[LV] Move VPBlendRecipe::execute to VPlanRecipes.cpp (NFC).
Alex Brachet [Mon, 11 Jul 2022 22:46:06 +0000 (22:46 +0000)]
Fix build on Windows
It seems like the `sed` on Windows is not particularly
smart. It's not actually needed in this place, so I've
removed its usage and just created an invalid yaml
another way.
Craig Topper [Mon, 11 Jul 2022 22:06:00 +0000 (15:06 -0700)]
[RISCV] Use MVT for the argument to getMaskTypeFor. NFC
Only one caller didn't already have an MVT and that was easy to
fix. Since the return type is MVT and it uses MVT::getVectorVT,
taking an MVT as input makes the most sense.
Christopher Bate [Thu, 7 Jul 2022 22:50:53 +0000 (16:50 -0600)]
[mlir] Register linalg external TilingInterface models in InitAllDialects
Differential Revision: https://reviews.llvm.org/D129333
Jonas Devlieghere [Mon, 11 Jul 2022 21:03:53 +0000 (14:03 -0700)]
[lldb] Use the just-built libc++ for testing the LLDB data formatters
Make sure we use the libc++ from the build dir. Currently, by passing
-stdlib=libc++, we might pick up the system libc++. This change ensures
that if LLVM_LIBS_DIR is set, we try to use the libc++ from there.
Differential revision: https://reviews.llvm.org/D129166
Aart Bik [Sat, 9 Jul 2022 04:12:25 +0000 (21:12 -0700)]
[mlir][sparse] implement sparse2sparse reshaping (expand/collapse)
A previous revision implemented expand/collapse reshaping between
dense and sparse tensors for sparse2dense and dense2sparse since those
could use the "cheap" view reshape on the already materialized
dense tensor (at either the input or output side), and do some
reshuffling from or to sparse. The dense2dense case, as always,
is handled with a "cheap" view change.
This revision implements the sparse2sparse cases. Lacking any "view"
support on sparse tensors this operation necessarily has to perform
data reshuffling on both ends.
Tracker for improving this:
https://github.com/llvm/llvm-project/issues/56477
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D129416
Petr Hosek [Mon, 11 Jul 2022 20:09:51 +0000 (20:09 +0000)]
[Driver] Don't use frame pointer on Fuchsia when optimizations are enabled
This matches the standard behavior on other platforms.
Differential Revision: https://reviews.llvm.org/D129512
Alex Brachet [Mon, 11 Jul 2022 21:44:28 +0000 (21:44 +0000)]
Fix build on Windows
Error message is not capitalized on Windows
Alex Brachet [Mon, 11 Jul 2022 21:31:01 +0000 (21:31 +0000)]
[COFF] Add vfsoverlay flag
This patch adds a new flag vfsoverlay similar to clang’s
ivfsoverlay flag. This is helpful when compiling on case
sensitive file systems when cross compiling to Windows.
Particularly when compiling third party code containing
#pragma comment(“linker”, “/defaultlib:...”) which
can’t be easily changed.
Differential Revision: https://reviews.llvm.org/D125800
Alex Brachet [Mon, 11 Jul 2022 21:28:21 +0000 (21:28 +0000)]
[libc] Add imaxabs
Differential Revision: https://reviews.llvm.org/D129517
Jonas Devlieghere [Mon, 11 Jul 2022 20:58:52 +0000 (13:58 -0700)]
Revert "[C++20][Modules] Update handling of implicit inlines [P1779R3]"
This reverts commit ef0fa9f0ef3e as a follow up to b19d3ee7120b which
reverted commit ac507102d258. See https://reviews.llvm.org/D126189 for
more details.
Craig Topper [Mon, 11 Jul 2022 20:31:07 +0000 (13:31 -0700)]
[SelectionDAG] Simplify how we drop poison flags in SimplifyDemandedBits.
As far as I can tell what was happening in the original code is
that the getNode call receives the same operands as the original
node with different SDNodeFlags. The logic inside getNode detects
that the node already exists and intersects the flags into the
existing node and returns it. This results in Op and NewOp for the
TLO.CombineTo call always being the same node.
We may have already called CombineTo as part of the recursive handling.
A second call to CombineTo as we unwind the recursion overwrites
the previous CombineTo. I think this means any time we updated the
poison flags that was the only change that ends up getting made
and we relied on DAGCombiner to revisit and call SimplifyDemandedBits
again. The second time the poison flags wouldn't need to be dropped
and we would keep the CombineTo call from further down the recursion.
We can instead call setFlags to drop the poison flags and remove the
call to TLO.CombineTo. This way we keep the CombineTo from deeper in
the recursion which should be more efficient.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D129511
Martin Storsjö [Fri, 8 Jul 2022 21:36:16 +0000 (00:36 +0300)]
[lldb] Reduce the stack alignment requirements for the Windows x86_64 ABI
This fixes https://github.com/llvm/llvm-project/issues/56095.
Differential Revision: https://reviews.llvm.org/D129455
Piotr Sobczak [Mon, 11 Jul 2022 19:24:56 +0000 (21:24 +0200)]
[AMDGPU] Fix bitcast v4i64/v16i16
Fix a regression introduced in D128865.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129375
Aiden Grossman [Mon, 11 Jul 2022 20:12:52 +0000 (13:12 -0700)]
[mlgo] Simplify autogenerated regalloc model
Currently the autogenerated regalloc model will sometimes
output an incorrect LR index to evict instead of the first LR
with the mask set to 1. This trips an assertion within
the MLRegallocAdvisor that the evicted LR has a mask of 1. This
patch, made possible by https://reviews.llvm.org/D124565, simplifies
the autogenerated model by taking away all unnecessary features and
getting rid of the functions that were previously used to mix in all
the necessary inputs so they wouldn't get pruned by the Tensorflow
XLA AOT compiler. This is no longer necessary after the previously
mentioned patch. This also fixes the nondeterministic behavior
that is sometimes observed where the autogenerated model will
simply output 0 instead of the correct index.
Reviewed By: yundiqian
Differential Revision: https://reviews.llvm.org/D129254
George Petterson [Mon, 11 Jul 2022 19:37:03 +0000 (15:37 -0400)]
Fix an issue with grouped conv2d op
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D128880
Nirvedh [Mon, 11 Jul 2022 20:03:16 +0000 (20:03 +0000)]
Revert "Fix an issue with grouped conv2d op"
This reverts commit 45ef20ca71aaba9ad50c4641fe7fcbb786724af8.
George Petterson [Mon, 11 Jul 2022 19:37:03 +0000 (15:37 -0400)]
Fix an issue with grouped conv2d op
Hui Xie [Mon, 11 Jul 2022 19:56:14 +0000 (21:56 +0200)]
[libc++] Rename variables to use the snake case instead of camel case
For some reason the pre-commit CI of https://reviews.llvm.org/D129233 was all green so I didn't spot this
https://reviews.llvm.org/B174525
Reviewed By: #libc, philnik, Mordante
Differential Revision: https://reviews.llvm.org/D129503
Kai Sasaki [Mon, 11 Jul 2022 19:55:31 +0000 (21:55 +0200)]
[mlir][complex] Lower complex.log to libm log call
Lower complex.log to corresponding function call with libm.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D129417
Fangrui Song [Mon, 11 Jul 2022 19:53:34 +0000 (12:53 -0700)]
[sanitizer] Remove #include <linux/fs.h> to resolve fsconfig_command/mount_attr conflict with glibc 2.36
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Android sys/mount.h doesn't define BLKBSZGET and it still needs linux/fs.h.
In the long term we should move Linux specific definitions to sanitizer_platform_limits_linux.cpp
but this commit is easy to cherry pick into older compiler-rt releases.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
Fangrui Song [Mon, 11 Jul 2022 19:44:37 +0000 (12:44 -0700)]
Revert "[sanitizer] Remove #include <linux/fs.h> to resolve fsconfig_command/mount_attr conflict with glibc 2.36"
This reverts commit b379129c4beb3f26223288627a1291739f33af02.
Breaks Android build. Android sys/mount.h doesn't define macros like BLKBSZGET.
Joseph Huber [Thu, 30 Jun 2022 11:59:24 +0000 (07:59 -0400)]
[HIP] Add support for handling HIP in the linker wrapper
This patch adds the necessary changes required to bundle and wrap HIP
files. The bundling is done using `clang-offload-bundler` currently to
mimic `fatbinary` and the wrapping is done using very similar runtime
calls to CUDA. This still does not support managed / surface / texture
variables, that would require some additional information in the entry.
One difference in the code generation with AMD is that I don't check if
the handle is null before destructing it; I'm not sure if that's
required.
With this we should be able to support HIP with the new driver.
Depends on D128850
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D128914
Joseph Huber [Wed, 29 Jun 2022 19:48:16 +0000 (15:48 -0400)]
[HIP] Generate offloading entries for HIP with the new driver.
This patch adds the small change required to output offloading entries
for HIP instead of CUDA. These should be placed in different sections
because they need to be distinct to the offloading toolchain, otherwise
we'd have HIP trying to register CUDA kernels or vice-versa. This patch will
precede support for HIP in the linker wrapper.
Reviewed By: yaxunl, tra
Differential Revision: https://reviews.llvm.org/D128850
Joseph Huber [Mon, 11 Jul 2022 19:43:18 +0000 (15:43 -0400)]
[llvm-objdump][docs] Fix documentation for offloading flags
mphschmitt [Mon, 11 Jul 2022 19:39:33 +0000 (15:39 -0400)]
[llvm-objdump][docs] fix typo in llvm-objdump documentation.
Fix a typo in llvm-objdump documentation.
Differential Revision: https://reviews.llvm.org/D129445
Reviewed by: jhuber6
Joseph Huber [Sun, 10 Jul 2022 02:20:04 +0000 (22:20 -0400)]
[Clang] Parse toolchain-specific offloading arguments directly
OpenMP supports multiple offloading toolchains and architectures. In
order to support this we originally used `getArgsForToolchain` to get
the arguments only intended for each toolchain. This allowed users to
manually specify which toolchain an `--offload-arch=` argument was intended
for using `-Xopenmp-target=` or other methods. For example,
```
clang input.c -fopenmp -fopenmp-targets=nvptx64,amdgcn -Xopenmp-target=nvptx64 --offload-arch=sm_70 -Xopenmp-target=amdgcn --offload-arch=gfx908
```
However, this was causing problems with the AMDGPU toolchain. This is
because the AMDGPU toolchain for OpenMP uses an `amdgpu` arch to determine the
architecture. If this tool is not available the compiler will exit with an error
even when manually specifying the architecture. This patch pulls out the logic in
`getArgsForToolchain` and specializes it for extracting `--offload-arch`
arguments to avoid this.
Reviewed By: JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D129435
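Extracting `--offload-arch=` values per toolchain directly from the argument list could be sketched like this. It is not Clang's implementation: `offload_archs_by_toolchain` is a hypothetical helper, and the choice to apply an unbound `--offload-arch=` to every toolchain is an assumption made for the sketch.

```python
from typing import Dict, List


def offload_archs_by_toolchain(args: List[str], triples: List[str]) -> Dict[str, List[str]]:
    """Collect --offload-arch= values per toolchain triple from a flat argument list."""
    prefix_x = "-Xopenmp-target="
    prefix_arch = "--offload-arch="
    result: Dict[str, List[str]] = {t: [] for t in triples}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith(prefix_x) and i + 1 < len(args):
            # "-Xopenmp-target=<triple> <arg>" binds the next argument to <triple>.
            triple = arg[len(prefix_x):]
            nxt = args[i + 1]
            if nxt.startswith(prefix_arch) and triple in result:
                result[triple].append(nxt[len(prefix_arch):])
            i += 2
            continue
        if arg.startswith(prefix_arch):
            # Assumption: an unbound arch applies to every toolchain.
            for t in triples:
                result[t].append(arg[len(prefix_arch):])
        i += 1
    return result


args = [
    "-fopenmp", "-fopenmp-targets=nvptx64,amdgcn",
    "-Xopenmp-target=nvptx64", "--offload-arch=sm_70",
    "-Xopenmp-target=amdgcn", "--offload-arch=gfx908",
]
print(offload_archs_by_toolchain(args, ["nvptx64", "amdgcn"]))
# {'nvptx64': ['sm_70'], 'amdgcn': ['gfx908']}
```

Because the values are read straight from the arguments, nothing in this path needs to query an external tool to detect the architecture.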
David Green [Mon, 11 Jul 2022 19:36:46 +0000 (20:36 +0100)]
[AArch64] Move fp16 intrinsics tests to new file. NFC
The enabled features for the existing test do not always include FP16,
which is required for the intrinsics.
Nick Desaulniers [Mon, 11 Jul 2022 19:33:52 +0000 (12:33 -0700)]
[llvm][docs] commit phabricator patch
Users upgrading to PHP 8.1 might start observing failures with `arc`.
Commit @ychen's suggestions as a patch in tree that can be applied since
arcanist is no longer accepting patches.
Also, remove the suggestion to apply an external patch updating CA
certs. It seems that this was fixed in upstream arcanist before they
stopped accepting patches. Compare
https://github.com/rashkov/arcanist/commit/e3659d43d8911e91739f3b0c5935598bceb859aa
vs
https://github.com/rashkov/arcanist/commit/13d3a3c3b100979c34dda261fe21253e3571bc46
Link: https://secure.phabricator.com/book/phabcontrib/article/contributing_code/
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D129232
Kaining Zhong [Mon, 11 Jul 2022 19:21:57 +0000 (15:21 -0400)]
[lld-macho] Handle user-provided dtrace symbols to avoid linking failure
This fixes https://github.com/llvm/llvm-project/issues/56238. ld64.lld currently does not generate a __dof section in Mach-O, and the -no_dtrace_dof option is on by default. However, when there are user-defined dtrace symbols, ld64.lld treats them as undefined symbols, which causes linking to fail because lld cannot find their definitions. This patch allows ld64.lld to rewrite the instructions calling dtrace symbols into instructions such as nop, as ld64 does; therefore, when it encounters user-provided dtrace probes, linking can still succeed.
I'm not sure whether support for dtrace is expected in lld, so for now I didn't add code to make lld emit a __dof section like ld64, and only made it possible to link when dtrace symbols are provided. If this feature is needed, I can add that part in Dtrace.cpp & Dtrace.h.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D129062
Mitch Phillips [Mon, 11 Jul 2022 18:44:55 +0000 (11:44 -0700)]
Update DynInit generation for ASan globals.
Address a follow-up TODO for Sanitizer Metadata.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D128672
LLVM GN Syncbot [Mon, 11 Jul 2022 19:17:30 +0000 (19:17 +0000)]
[gn build] Port 7d426a392f73
Craig Topper [Mon, 11 Jul 2022 18:00:50 +0000 (11:00 -0700)]
[RISCV] Pre-commit tests for D121833. NFC
Nikolas Klauser [Mon, 11 Jul 2022 15:07:35 +0000 (17:07 +0200)]
[libc++] Implement ranges::{reverse, rotate}_copy
Reviewed By: var-const, #libc
Spies: huixie90, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D127211
Iain Sandoe [Mon, 11 Jul 2022 18:50:31 +0000 (19:50 +0100)]
Revert "[C++20][Modules] Build module static initializers per P1874R1."
This reverts commit ac507102d258b6fc0cb57eb60c9dfabd57ff562f.
Reverting while we figure out why one of the Green Dragon lldb tests fails.
Iain Sandoe [Mon, 11 Jul 2022 18:49:48 +0000 (19:49 +0100)]
Revert "[C++20][Modules] Fix two tests for CTORs that return pointers [NFC]."
This reverts commit 4328b960176f4394416093e640ad4265bde65ad7.
reverting while we figure out why one of the Greendragon lldb tests fails.
Dylan Fleming [Mon, 11 Jul 2022 18:08:48 +0000 (18:08 +0000)]
[Flang] Fix formatting for FIRLangRef.html
Previously, FIRLangRef.md was incorrectly formatted.
This was due to how FIRLangRef.md had no page header,
and so the first entry would render incorrectly.
This patch introduces a header file, which is prepended to the FIRLangRef
before it becomes a HTML file. The header is currently brief
but can be expanded upon at a later date if required.
This formatting fix also means the index page
can correctly generate a link to FIRLangRef.html and as such,
this patch also removes FIRLangRef from the sidebar and adds it to the main list of links.
Depends on D128650
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D129186
Fangrui Song [Mon, 11 Jul 2022 18:38:28 +0000 (11:38 -0700)]
[sanitizer] Remove #include <linux/fs.h> to resolve fsconfig_command/mount_attr conflict with glibc 2.36
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
Justin Cady [Mon, 11 Jul 2022 18:29:20 +0000 (11:29 -0700)]
[InstrProf] Mark __llvm_profile_runtime hidden to match libclang_rt.profile definition
Mark the symbol hidden to match INSTR_PROF_PROFILE_RUNTIME_VAR in compiler-rt.
Fixes second issue discussed at https://discourse.llvm.org/t/63090
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D128842
Dylan Fleming [Mon, 11 Jul 2022 17:34:26 +0000 (17:34 +0000)]
[Flang] Add a link from the docs html page to the FIR html page
The Fortran Language Reference is currently generated via tablegen,
however isn't present on flang.llvm.org/docs/
This patch adds FIRLangRef.md to the flang/docs directory,
and adds a link to the generated HTML file in sidebar
under the 'Documentation' heading.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128650
Katherine Rasmussen [Wed, 29 Jun 2022 21:44:00 +0000 (14:44 -0700)]
[flang] Add semantics test for image_status and add a check
Add a semantics test for the intrinsic function image_status. Add
a check and restriction on the image argument in image_status,
ensuring that it is a positive value. Add same check on the
size argument of the intrinsic ishftc. Add another check on
the shift argument of ishftc, ensuring that it is less than or
equal to the size argument. Add a short semantics test checking
these restrictions in ishftc function calls.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D128009
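The argument restrictions described above can be written down as small validation functions. This is a sketch, not flang's semantics code: the function names and messages are hypothetical, and comparing the magnitude of `shift` (rather than its signed value) against `size` is an assumption beyond what the commit message states.

```python
from typing import List


def check_image_status(image: int) -> List[str]:
    """IMAGE must be a positive value."""
    return [] if image > 0 else ["IMAGE must be positive"]


def check_ishftc(shift: int, size: int) -> List[str]:
    """SIZE must be positive and SHIFT must not exceed SIZE."""
    errors = []
    if size <= 0:
        errors.append("SIZE must be positive")
    elif abs(shift) > size:  # assumption: compare the magnitude of SHIFT
        errors.append("SHIFT must be less than or equal to SIZE")
    return errors


print(check_ishftc(3, 8))   # []
print(check_ishftc(9, 8))   # ['SHIFT must be less than or equal to SIZE']
print(check_ishftc(1, 0))   # ['SIZE must be positive']
```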
Craig Topper [Mon, 11 Jul 2022 17:30:56 +0000 (10:30 -0700)]
[RISCV] Remove doPeepholeLoadStoreADDI.
All of the cases should be handled by SelectAddrRegImm now.
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D129451
Craig Topper [Mon, 11 Jul 2022 17:16:36 +0000 (10:16 -0700)]
[RISCV] Move the custom isel for (add X, imm) into SelectAddrRegImm.
This custom isel was used to split the lo12 bits of the imm so that
they could be folded into load/store addresses via a post-isel
peephole.
This patch instead splits the immediate during isel and folds the
lo12 removing the need for the post-isel peephole to do anything.
After this we'll be able to remove the post-isel peephole.
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D129450
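Splitting off the lo12 bits of an immediate is plain arithmetic, sketched below. The helper `split_imm` is hypothetical and only shows the decomposition; it says nothing about how SelectAddrRegImm is actually written.

```python
def split_imm(imm: int):
    """Split imm into (hi, lo12) with hi + lo12 == imm and lo12 in [-2048, 2047]."""
    lo12 = imm & 0xFFF
    if lo12 >= 0x800:  # sign-extend the low 12 bits
        lo12 -= 0x1000
    hi = imm - lo12    # always a multiple of 4096
    return hi, lo12


print(split_imm(0x12345))  # (73728, 837), i.e. (0x12000, 0x345)
print(split_imm(0x12FFF))  # (77824, -1), i.e. (0x13000, -1)
```

The signed 12-bit `lo12` part fits in the offset field of a RISC-V load or store, so only `hi` has to be added to the base register separately.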
Alex Brachet [Mon, 11 Jul 2022 17:41:37 +0000 (17:41 +0000)]
[scudo][NFC] Clang-format c823cbf699
Ran `git clang-format` but didn't add the changed file...
Alex Brachet [Mon, 11 Jul 2022 17:39:44 +0000 (17:39 +0000)]
[scudo][Fuchsia] Don't assume MapPlatformData::Vmar is valid
After https://reviews.llvm.org/D129237, the assumption
that any non-null data contains a valid vmar handle is no
longer true. Generally this code here needs cleanup, but
in the meantime this fixes errors on Fuchsia.
Differential Revision: https://reviews.llvm.org/D129331
Ivan Trofimov [Mon, 11 Jul 2022 17:12:31 +0000 (10:12 -0700)]
[libasan] Remove 4Mb stack limit for swapcontext unpoisoning
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D129219
Dominic Chen [Fri, 8 Jul 2022 19:44:26 +0000 (12:44 -0700)]
[scudo] Satisfy -Wstrict-prototypes
Differential Revision: https://reviews.llvm.org/D129391
Prabhdeep Singh Soni [Mon, 11 Jul 2022 17:27:26 +0000 (13:27 -0400)]
[OMPIRBuilder] Add support for simdlen clause
This patch adds OMPIRBuilder support for the simdlen clause for the
simd directive. It uses the simdlen support in OpenMPIRBuilder when
it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by
generating the loop.vectorize.width metadata.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D129149
jungpark-mlir [Mon, 11 Jul 2022 17:18:09 +0000 (17:18 +0000)]
[MLIR][TOSA] Fix converting tosa.clamp and tosa.relu to linalg
Tosa to Linalg conversion crashes when input tensor is a float type other than fp32.
This is because tosa.clamp and tosa.reluN have an fp32 min/max attribute, which is converted to an arith.constant with the attribute type.
This commit fixes the crash by correctly setting the float constant type from the input tensor.
Reviewed By: eric-k256
Differential Revision: https://reviews.llvm.org/D128630
Ivan Trofimov [Mon, 11 Jul 2022 16:56:13 +0000 (09:56 -0700)]
[NFC][asan] Clang-format a test
Part of D129219.
LLVM GN Syncbot [Mon, 11 Jul 2022 16:57:49 +0000 (16:57 +0000)]
[gn build] Port c8a28ae214c0
Venkata Ramanaiah Nalamothu [Mon, 11 Jul 2022 13:20:08 +0000 (18:50 +0530)]
[llvm][docs] Fix typos to say subclasses need to override virtual methods but not overload
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D129484
spupyrev [Mon, 11 Jul 2022 16:49:41 +0000 (09:49 -0700)]
Revert "Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations""
This reverts commit 76029cc53e838e6d86b13b0c39152f474fb09263.
spupyrev [Mon, 11 Jul 2022 16:48:22 +0000 (09:48 -0700)]
Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type"
This reverts commit 6d0528636ae54fba75938a79ae7a98dfcc949f72.
spupyrev [Mon, 11 Jul 2022 16:43:39 +0000 (09:43 -0700)]
Revert "Rebase: [Facebook] Add clang driver options to test debug info and BOLT"
This reverts commit f921985a29fc9787b3ed98dbc897146cc3fd91f7.
Than McIntosh [Mon, 11 Jul 2022 12:37:06 +0000 (08:37 -0400)]
tsan: update Go x86 build rules to back off to sse3
This is a partial revert of https://reviews.llvm.org/D106948, changing
just the Go build rules to remove -msse4.2 and revert back to -msse3,
so as to preserve support for older x86 machines. More details at
https://github.com/golang/go/issues/53743.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D129482
Craig Topper [Mon, 11 Jul 2022 16:26:34 +0000 (09:26 -0700)]
[RISCV] Make shouldConvertConstantLoadToIntImm return true unless enableUnalignedScalarMem is true.
This restores the old behavior before D129402 when
enableUnalignedScalarMem is false. This fixes a regression spotted
by @asb.
To fix this correctly, we need to consider alignment of the load
we'd be replacing, but that's not possible in the current interface.
Sanjay Patel [Mon, 11 Jul 2022 16:02:42 +0000 (12:02 -0400)]
[SDAG] enhance sub->xor fold to ignore signbit
As suggested in the post-commit feedback for D128123,
we can ease the mask constraint to ignore the MSB
(and make the code easier to read by adjusting the check).
https://alive2.llvm.org/ce/z/bbvqWv
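The relaxed constraint can be checked exhaustively on a small bit width. The condition used here is my reading of the commit message and is an assumption: `C - X` may be replaced by `X ^ C` when every bit that X may set, other than the sign bit, is also set in C. The names `fold_is_safe` and `check_all` are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1
MSB = 1 << (BITS - 1)


def fold_is_safe(c: int, x_maybe_set: int) -> bool:
    """True if every bit x may set, ignoring the sign bit, is also set in c."""
    return (x_maybe_set & ~c & ~MSB & MASK) == 0


def check_all() -> bool:
    """Brute-force check that (c - x) == (c ^ x) whenever the fold is allowed."""
    for c in range(1 << BITS):
        for x in range(1 << BITS):
            if fold_is_safe(c, x) and ((c - x) & MASK) != (c ^ x):
                return False
    return True


print(check_all())  # True
```

The sign bit can be ignored because a borrow out of the most significant bit is discarded by the wrap-around, so at that position subtraction and xor agree.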
Sanjay Patel [Mon, 11 Jul 2022 15:24:42 +0000 (11:24 -0400)]
[InstCombine] add test for possible sub->xor fold; NFC
Sanjay Patel [Mon, 11 Jul 2022 15:08:22 +0000 (11:08 -0400)]
[AArch64] add test for possible sub->xor enhancement; NFC
spupyrev [Fri, 8 Jul 2022 17:14:26 +0000 (10:14 -0700)]
[BOLT] Do not merge cold and hot chains of basic blocks
There is a post-processing step in ext-tsp block reordering that merges some blocks
into chains. This allows us to maintain the original block order in the absence of
profile data and can be beneficial for code size (when fallthroughs are merged).
In the earlier version we could merge hot and cold (with zero execution count)
chains, that later were split by SplitFunction.cpp (when split-all-cold=1). The
diff eliminates the redundant merging.
It is unlikely the change will affect the performance of a binary in a
measurable way, as it mostly operates on cold basic blocks. However, after
the diff the impact of split-all-cold is almost negligible and we can avoid the
extra function splitting.
Measuring on the clang binary (negative is good, positive is a regression):
**clang12**
benchmark1: `0.0253`
benchmark2: `-0.1843`
benchmark3: `0.3234`
benchmark4: `0.0333`
**clang10**
benchmark1: `-0.2517`
benchmark2: `-0.3703`
benchmark3: `-0.1186`
benchmark4: `-0.3822`
**clang7**
benchmark1: `0.2526`
benchmark2: `0.0500`
benchmark3: `0.3024`
benchmark4: `-0.0489`
**Overall**: `-0.0671 ± 0.1172` (insignificant)
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129397
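The merging rule from this change can be expressed as a simple predicate. This is a sketch only: representing a chain as a list of block execution counts, treating a chain as cold when all of its counts are zero, and the names `is_cold` and `may_merge` are assumptions, not BOLT's data structures.

```python
from typing import List


def is_cold(chain_counts: List[int]) -> bool:
    """Assumption: a chain is cold when every block has a zero execution count."""
    return all(count == 0 for count in chain_counts)


def may_merge(a: List[int], b: List[int]) -> bool:
    """Allow merging only when both chains are hot or both are cold."""
    return is_cold(a) == is_cold(b)


print(may_merge([10, 0], [5]))  # True: both chains contain hot blocks
print(may_merge([0, 0], [5]))   # False: cold chain and hot chain stay apart
```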
Maksim Panchenko [Tue, 29 Mar 2022 21:02:58 +0000 (14:02 -0700)]
Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"
Summary:
This reverts commit 729d29e167a553ee1190c310b6a510db8d8731ac.
Needed as a workaround for T112872562.
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D35230076
https://phabricator.intern.facebook.com/D35681740
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: spupyrev
Differential Revision: https://phabricator.intern.facebook.com/D37098481
Rafael Auler [Thu, 5 Aug 2021 21:17:07 +0000 (14:17 -0700)]
Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary:
Introduce NeverAlign fragment type.
The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.
In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
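The padding decision described above reduces to one modulo check. This is a minimal sketch, not the MC layer's code: `never_align_padding`, its parameters, and the one-byte nop default are hypothetical.

```python
def never_align_padding(offset: int, inst_size: int,
                        alignment: int = 64, nop_size: int = 1) -> int:
    """Bytes of nop to emit so the instruction does not end on an alignment boundary."""
    ends_on_boundary = (offset + inst_size) % alignment == 0
    return nop_size if ends_on_boundary else 0


# A 3-byte instruction at offset 61 would end exactly at byte 64.
print(never_align_padding(61, 3))  # 1
# The same instruction at offset 60 ends at byte 63, so no padding is needed.
print(never_align_padding(60, 3))  # 0
```

A single minimum-size nop is enough: shifting the instruction by one byte moves its end just past the boundary.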
This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.
The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.
LLVM: https://reviews.llvm.org/D97982
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D31361547
Amir Ayupov [Tue, 1 Jun 2021 18:37:41 +0000 (11:37 -0700)]
Rebase: [Facebook] Add clang driver options to test debug info and BOLT
Summary:
This is an essential piece of infrastructure for us to be
continuously testing debug info with BOLT. We can't only make changes
to a test repo because we need to change debuginfo tests to call BOLT,
hence, this diff needs to sit in our opensource repo. But when upstreaming
to LLVM, this should be kept BOLT-only outside of LLVM. When upstreaming,
we need to git diff and check all folders that are being modified by our
commits and discard this one (and leave as an internal diff).
To test BOLT in debuginfo tests, configure it with -DLLVM_TEST_BOLT=ON.
Then run check-lldb and check-debuginfo.
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D29205224
https://phabricator.intern.facebook.com/D29564078
https://phabricator.intern.facebook.com/D33289118
https://phabricator.intern.facebook.com/D34957174
Test Plan:
tested locally
Configured with:
-DLLVM_ENABLE_PROJECTS="clang;lld;lldb;compiler-rt;bolt;debuginfo-tests"
-DLLVM_TEST_BOLT=ON
Ran test suite with:
ninja check-debuginfo
ninja check-lldb
Reviewers: #llvm-bolt
Subscribers: ayermolo, phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D35317341
Tasks: T92898286
Aaron Ballman [Mon, 11 Jul 2022 16:28:01 +0000 (12:28 -0400)]
Revert "Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface"
This reverts commit 69fcf4fd5a014b763061f13b5c4434d49c42c35a.
It broke at least one bot:
https://lab.llvm.org/buildbot/#/builders/91/builds/11328
Jonas Devlieghere [Mon, 11 Jul 2022 16:24:40 +0000 (09:24 -0700)]
Revert "jGetLoadedDynamicLibrariesInfos can inspect machos not yet loaded"
This reverts commit 77a38f6839980bfac61babb40d83772c51427011 because (I
suspect) it breaks TestAppleSimulatorOSType.py on GreenDragon [1].
[1] https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45191/
LLVM GN Syncbot [Mon, 11 Jul 2022 16:19:49 +0000 (16:19 +0000)]
[gn build] Port 69fcf4fd5a01
Vaibhav Yenamandra [Mon, 11 Jul 2022 16:18:13 +0000 (12:18 -0400)]
Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface
Create an interface for writing SARIF documents from within clang:
The primary intent of this change is to introduce the interface
clang::SarifDocumentWriter, which allows incrementally adding
diagnostic data to a JSON backed document. The proposed interface is
not yet connected to the compiler internals, which will be covered in
future work. As such this change will not change the input/output
interface of clang.
This change also introduces the clang::FullSourceRange type, which is
modeled after clang::SourceRange + clang::FullSourceLoc. It is useful
for packaging a pair of clang::SourceLocation objects with their
corresponding SourceManagers.
Previous discussions:
RFC for this change: https://lists.llvm.org/pipermail/cfe-dev/2021-March/067907.html
https://lists.llvm.org/pipermail/cfe-dev/2021-July/068480.html
SARIF Standard (2.1.0):
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html
Differential Revision: https://reviews.llvm.org/D109701
Michał Górny [Fri, 1 Jul 2022 14:46:56 +0000 (16:46 +0200)]
Reland "[lldb] [test] Improve stability of llgs vCont-threads tests"
Perform a major refactoring of vCont-threads tests in order to attempt
to improve their stability and performance.
Split test_vCont_run_subset_of_threads() into smaller test cases,
and split the whole suite into two files: one for signal-related tests,
the other for the running-subset-of tests.
Eliminate output_match checks entirely, as they are fragile to
fragmentation of output. Instead, for the initial thread list capture
raise an explicit SIGINT from inside the test program, and for
the remaining output let the test program run until exit, and check all
the captured output afterwards.
For resume tests, capture the LLDB's thread view before and after
starting new threads in order to determine the IDs corresponding
to subthreads rather than relying on program output for that.
Add a mutex for output to guarantee serialization. A barrier is used
to guarantee that all threads start before SIGINT, and an atomic bool
is used to delay prints from happening until after SIGINT.
Call std::this_thread::yield() to reduce the risk of one of the threads
not being run.
This fixes the test hangs on FreeBSD. Hopefully, it will also fix all
the flakiness on buildbots.
Thanks to Pavel Labath for figuring out why the original version did not
work on Debian.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D129012
Fangrui Song [Mon, 11 Jul 2022 16:04:45 +0000 (09:04 -0700)]
[llvm-objcopy][ELF] Allow --set-section-flags src=... and --rename-section src=tst
* GNU objcopy supports --set-section-flags src=... --rename-section src=tst and --set-section-flags runs first.
* GNU objcopy processes --update-section before --rename-section.
To match the two behaviors, postpone --rename-section and allow its use together
with --set-section-flags.
As a side effect, --rename-section=.foo1=.foo2 --add-section=.foo1=/dev/null
leads to .foo2 while GNU objcopy surprisingly produces .foo1 (so
--set-section-flags --add-section --rename-section do not form a total order).
I think the deviation is fine as a total order makes more sense.
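The ordering described above can be modelled with a small sketch. This is a simplified model and not llvm-objcopy's implementation: Section, Options and applyOptions are hypothetical names, and the relative order of the flags step and the add step is an assumption of the sketch. The only points taken from the text are that flags are matched against the original name and that renaming runs last.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of a section and of the three options.
struct Section {
  std::string Name;
  std::string Flags;
};

struct Options {
  std::map<std::string, std::string> SetSectionFlags; // name -> flags
  std::vector<std::string> AddSection;                // new section names
  std::map<std::string, std::string> RenameSection;   // old name -> new name
};

inline std::vector<Section> applyOptions(std::vector<Section> Sections,
                                         const Options &Opts) {
  // Flags are matched against the original (pre-rename) section names.
  for (Section &S : Sections) {
    auto It = Opts.SetSectionFlags.find(S.Name);
    if (It != Opts.SetSectionFlags.end())
      S.Flags = It->second;
  }
  // Assumption of this sketch: sections are added after flags are set.
  for (const std::string &Name : Opts.AddSection)
    Sections.push_back({Name, ""});
  // Renaming is postponed to the end, so it also sees added sections.
  for (Section &S : Sections) {
    auto It = Opts.RenameSection.find(S.Name);
    if (It != Opts.RenameSection.end())
      S.Name = It->second;
  }
  return Sections;
}

// Usage: --set-section-flags src=alloc --rename-section src=tst
inline void objcopyOrderExample() {
  Options Opts;
  Opts.SetSectionFlags["src"] = "alloc";
  Opts.RenameSection["src"] = "tst";
  std::vector<Section> Out = applyOptions({{"src", ""}}, Opts);
  assert(Out.size() == 1 && Out[0].Name == "tst" && Out[0].Flags == "alloc");
}
```

In this model, adding .foo1 together with a .foo1 to .foo2 rename yields a single section named .foo2, which matches the side effect noted above.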
Rename set-section-flags-and-rename.test to
set-section-attr-and-rename.test and additionally test --set-section-alignment
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129336
Thomas Raoux [Mon, 11 Jul 2022 07:01:13 +0000 (07:01 +0000)]
[mlir][vector] Add pattern to distribute splat constant
Distribute splat constant out of WarpExecuteOnLane0Op region.
Differential Revision: https://reviews.llvm.org/D129467
Thomas Raoux [Mon, 11 Jul 2022 06:45:05 +0000 (06:45 +0000)]
[mlir][vector] Avoid creating duplicate output in warpOp
Prevent creating multiple outputs for the same Value when distributing
operations out of WarpExecuteOnLane0Op. This avoids a combinatorial
explosion of outputs.
Differential Revision: https://reviews.llvm.org/D129465
Jay Foad [Mon, 11 Jul 2022 14:55:18 +0000 (15:55 +0100)]
[AMDGPU] Add testing for removal of null export target in GFX11
Code changes were submitted in D128185.
Arjun P [Mon, 11 Jul 2022 13:28:15 +0000 (14:28 +0100)]
[MLIR][Presburger] introduce MPInt to support fast arbitrary precision in Presburger
This uses an int64_t-based fastpath for the common case and falls back to
SlowMPInt to handle the rare cases where larger numbers occur.
It uses `__builtin_*` for performance through the support in LLVM MathExtras.
Using this in the Presburger library results in a minor performance
*improvement* over any commit before the sequence of patches
starting at d5e31cf38adfc2c240fb9717989792537cc9e819.
This was previously reverted in 1e10d35ea9c02e9b5694836fd3dcc0b9baf28b48
due to a build failure; relanding now with an attempted fix.
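As a minimal sketch of the fast-path idea, assuming a GCC- or Clang-compatible compiler: the helpers below use the overflow-checking builtins on int64_t and report when the caller would have to switch to an arbitrary-precision representation. fastAdd and fastMul are hypothetical names for illustration and are not the MPInt interface.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helpers, not the MPInt API. Each returns the int64_t result
// on the fast path, or std::nullopt when the operation overflows and the
// caller would have to fall back to a slow arbitrary-precision path.
inline std::optional<int64_t> fastAdd(int64_t A, int64_t B) {
  int64_t Result;
  if (__builtin_add_overflow(A, B, &Result)) // GCC/Clang builtin
    return std::nullopt;
  return Result;
}

inline std::optional<int64_t> fastMul(int64_t A, int64_t B) {
  int64_t Result;
  if (__builtin_mul_overflow(A, B, &Result)) // GCC/Clang builtin
    return std::nullopt;
  return Result;
}

inline void fastPathExample() {
  const int64_t Max = std::numeric_limits<int64_t>::max();
  assert(fastAdd(2, 3).value() == 5);   // stays on the fast path
  assert(!fastAdd(Max, 1).has_value()); // overflow: fall back
  assert(!fastMul(Max, 2).has_value()); // overflow: fall back
}
```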
Reviewed By: Groverkss, ftynse
Differential Revision: https://reviews.llvm.org/D128811
Nikita Popov [Mon, 11 Jul 2022 14:45:29 +0000 (16:45 +0200)]
[Bitcode] Add additional callbr tests (NFC)
Additional coverage for the auto-upgrade code in D129288.
Dawid Jurczak [Thu, 7 Jul 2022 11:38:36 +0000 (13:38 +0200)]
[NFC][Coroutines] Add regression test for heap allocation elision optimization
Recently, the C++ snippet included in this patch has popped up at least twice in different regression contexts:
https://github.com/llvm/llvm-project/issues/56262 and https://reviews.llvm.org/D123300
It appears that Clang users rely on HALO, so adding the C++ example originally from Gor Nishanov to the tests
should help avoid similar regressions in the future.
Differential Revision: https://reviews.llvm.org/D129279
Mircea Trofin [Fri, 8 Jul 2022 02:46:05 +0000 (19:46 -0700)]
[mlgo] Don't provide default model URLs
Pointed out in Issue #56432: the current reference models may not be
quite friendly to open source projects. Their purpose is only
illustrative - the expectation is that projects would train their own.
To avoid unintentionally pulling such a model, the URL cmake
setting must now be set explicitly by the user.
Differential Revision: https://reviews.llvm.org/D129342
Simon Pilgrim [Mon, 11 Jul 2022 14:29:44 +0000 (15:29 +0100)]
[X86] isTargetShuffleEquivalent - attempt to match SM_SentinelZero shuffle mask elements using known bits
If the combined shuffle mask requires zero elements, we don't currently have much chance of matching them against the expected source vector. This patch uses the SelectionDAG::MaskedVectorIsZero wrapper to attempt to determine if the expected element we want to use is already known to be zero.
I've also tightened up the ExpectedMask assertion to always be in range - we're never giving it a target shuffle mask that has sentinels at all - which allows some of the confusing bounds checks to be removed.
This attempts to address some of the regressions uncovered by D129150 where we more aggressively fold shuffles as AND / 'clear' masks which results in more combined shuffles using SM_SentinelZero.
Differential Revision: https://reviews.llvm.org/D129207
Nimish Mishra [Mon, 11 Jul 2022 15:53:41 +0000 (21:23 +0530)]
[flang][OpenMP] Allow default(none) to access variables with PARAMETER attribute
This patch fixes https://github.com/flang-compiler/f18-llvm-project/issues/1351.
Concretely, data-sharing attributes on PARAMETER data used in a block
with DEFAULT(NONE) should be ignored.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D129444
Arjun P [Mon, 11 Jul 2022 13:16:52 +0000 (14:16 +0100)]
Revert "[MLIR][Presburger] introduce MPInt to support fast arbitrary precision in Presburger"
This reverts commit c9035df2fad4da9ea75b9211b3a9b0a230925000.
Reverting due to build failure on Windows: https://lab.llvm.org/buildbot/#/builders/172/builds/14767
Venkata Ramanaiah Nalamothu [Fri, 1 Jul 2022 15:23:30 +0000 (20:53 +0530)]
[lldb] Fix thread step until to not set breakpoint(s) on incorrect line numbers
The requirements for "thread until <line number>" are:
a) If any code contributed by <line number>, or by the nearest subsequent line, is executed before leaving the function, stop
b) If you end up leaving the function w/o triggering (a), then stop
In case of (a), since <line number> may have multiple entries in the line table, the compiler might have scheduled/moved the relevant code across, and lldb does not know the control flow, set breakpoints on all the line table entries of the best match for <line number>, i.e. the exact line or the nearest subsequent line.
Along with the above, CommandObjectThreadUntil currently also sets breakpoints on all the subsequent line numbers after the best match, and this latter part is wrong.
This issue is discussed at http://lists.llvm.org/pipermail/lldb-dev/2018-August/013979.html.
In fact, currently `TestStepUntil.py` is not actually testing step until scenarios and `test_missing_one` test fails without this patch if tests are made to run. Fixed the test as well.
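The best-match rule from (a) can be sketched as follows. This is a simplified model and not LLDB's implementation: LineEntry and untilBreakpointAddresses are hypothetical names, and the line table is reduced to (line, address) pairs for a single function.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified line table entry for one function.
struct LineEntry {
  uint32_t Line;
  uint64_t Address;
};

// Returns the addresses of every entry whose line is the best match for
// Target: the exact line if present, else the nearest subsequent line.
// Entries for lines after the best match are deliberately not included.
inline std::vector<uint64_t>
untilBreakpointAddresses(const std::vector<LineEntry> &Table,
                         uint32_t Target) {
  std::optional<uint32_t> Best;
  for (const LineEntry &E : Table)
    if (E.Line >= Target && (!Best || E.Line < *Best))
      Best = E.Line;

  std::vector<uint64_t> Addresses;
  if (!Best)
    return Addresses; // no line at or after Target in this function
  for (const LineEntry &E : Table)
    if (E.Line == *Best)
      Addresses.push_back(E.Address);
  return Addresses;
}

inline void untilExample() {
  // Line 12 has two entries; line 11 has none, so 12 is its best match.
  std::vector<LineEntry> Table = {
      {10, 0x100}, {12, 0x110}, {12, 0x130}, {15, 0x140}};
  std::vector<uint64_t> Addresses = untilBreakpointAddresses(Table, 11);
  assert(Addresses.size() == 2);
  assert(Addresses[0] == 0x110 && Addresses[1] == 0x130);
}
```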
Reviewed By: jingham
Differential Revision: https://reviews.llvm.org/D50304
John Brawn [Fri, 8 Jul 2022 10:12:38 +0000 (11:12 +0100)]
[MVE] Don't distribute add of vecreduce if it has more than one use
If the add has more than one use then applying the transformation
won't cause it to be removed, so we can end up applying it again
causing an infinite loop.
Differential Revision: https://reviews.llvm.org/D129361
Arnamoy Bhattacharyya [Mon, 11 Jul 2022 13:01:15 +0000 (09:01 -0400)]
[flang][OpenMP] Fix firstprivate bug
In the case where the bound(s) of a workshare loop use firstprivate var(s), that use is currently not updated with the created clone; it still uses the shared variable. This patch fixes that.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D127137
David Sherwood [Tue, 10 May 2022 09:49:43 +0000 (10:49 +0100)]
[LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask
intrinsic we only use the mask for predicated vector operations,
such as masked loads and stores, etc. The loop itself is still
controlled by comparing the canonical induction variable with the
trip count. However, for some targets this is inefficient when it's
cheap to use the mask itself to control the loop.
This patch adds support for using the active lane mask for control
flow by:
1. Generating the active lane mask for the next iteration of the
vector loop, rather than the current one. If there are still any
remaining iterations then at least the first bit of the mask will
be set.
2. Extracting the first bit of this mask and using this bit for the
conditional branch.
I did this by creating a new VPActiveLaneMaskPHIRecipe that sets
up the initial PHI values in the vector loop pre-header. I've also
made use of the new BranchOnCond VPInstruction for the final
instruction in the loop region.
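The control flow described in steps 1 and 2 can be illustrated with a scalar simulation. This is a sketch under stated assumptions, not the code the vectorizer generates: activeLaneMask models the get.active.lane.mask semantics (lane I is active when Base + I is less than the trip count), while maskedSum and the early guard for an empty input are hypothetical additions for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Models get.active.lane.mask(Base, TripCount) for VF lanes:
// lane I is active iff Base + I < TripCount.
inline std::vector<bool> activeLaneMask(uint64_t Base, uint64_t TripCount,
                                        unsigned VF) {
  std::vector<bool> Mask(VF);
  for (unsigned I = 0; I < VF; ++I)
    Mask[I] = Base + I < TripCount;
  return Mask;
}

// Hypothetical example loop: sums Data in chunks of VF lanes. The loop exit
// tests the first bit of the mask computed for the *next* iteration instead
// of comparing the induction variable with the trip count.
inline int64_t maskedSum(const std::vector<int64_t> &Data, unsigned VF) {
  assert(VF > 0 && "need at least one lane");
  const uint64_t TripCount = Data.size();
  // Pre-header: initial mask value that feeds the mask PHI.
  std::vector<bool> Mask = activeLaneMask(0, TripCount, VF);
  if (!Mask[0])
    return 0; // assumed guard: nothing to do for an empty input

  int64_t Sum = 0;
  uint64_t Index = 0;
  do {
    // Predicated body: only active lanes touch memory.
    for (unsigned I = 0; I < VF; ++I)
      if (Mask[I])
        Sum += Data[Index + I];
    Index += VF;
    // Step 1: mask for the next iteration. Step 2: branch on its first bit.
    Mask = activeLaneMask(Index, TripCount, VF);
  } while (Mask[0]);
  return Sum;
}
```

For seven elements and VF = 4, the second iteration runs with only three active lanes, and the loop exits because the first lane of the following mask is inactive.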
Differential Revision: https://reviews.llvm.org/D125301
Stephen Tozer [Mon, 20 Jun 2022 09:41:15 +0000 (10:41 +0100)]
[DebugInfo][InstrRef] Fix error in copy handling in InstrRefLDV
Currently, an error exists when InstrRefBasedLDV observes transfers of
variables across copies, which causes it to lose track of variables
under certain circumstances, resulting in shorter lifetimes for those
variables as LDV gives up searching for live locations for them. This
patch fixes this issue by storing the currently tracked values in
the destination first, then updating them manually later without
clobbering or assigning them the wrong value.
Differential Revision: https://reviews.llvm.org/D128101
Abhina Sreeskantharajan [Mon, 11 Jul 2022 12:28:47 +0000 (08:28 -0400)]
[SystemZ][z/OS] Force alignment to fix build failure on z/OS
The following commit https://reviews.llvm.org/D125998 added a static_assert which was triggered on z/OS because bitfields are always aligned to 1 regardless of type.
```
error: static_assert failed due to requirement 'alignof(llvm::SmallVector<llvm::MDOperand, 0>) <= alignof(llvm::MDNode::Header)' "LargeStorageVector too strongly aligned"
```
The solution was to force the alignment to be size_t.
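A minimal sketch of this kind of fix, assuming a header struct made only of bitfields: PackedHeader and AlignedHeader are hypothetical stand-ins and not the actual llvm::MDNode::Header definition.

```cpp
#include <cstddef>

// Hypothetical stand-in for a header made only of bitfields. On a target
// where bitfields are aligned to 1, this struct's alignment could be 1.
struct PackedHeader {
  unsigned IsLarge : 1;
  unsigned SmallSize : 4;
};

// Forcing the alignment makes the guarantee independent of how the target
// aligns bitfields.
struct alignas(alignof(std::size_t)) AlignedHeader {
  unsigned IsLarge : 1;
  unsigned SmallSize : 4;
};

static_assert(alignof(AlignedHeader) >= alignof(std::size_t),
              "header must be at least as aligned as size_t");
```

With the forced alignment, a check of the form alignof(Storage) <= alignof(AlignedHeader) no longer depends on the target's bitfield layout rules.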
Reviewed By: wolfgangp
Differential Revision: https://reviews.llvm.org/D129369
David Green [Mon, 11 Jul 2022 12:03:30 +0000 (13:03 +0100)]
[ARM] Expand MVE i1 fptoint and inttofp if mve.fp is not present.
If MVE.fp is not present then we cannot select the vector i1 fp
operations to VCMP instructions, so we need to expand them.
Arjun P [Mon, 11 Jul 2022 10:34:10 +0000 (11:34 +0100)]
[MLIR][Presburger] introduce MPInt to support fast arbitrary precision in Presburger
This uses an int64_t-based fastpath for the common case and falls back to
SlowMPInt to handle the rare cases where larger numbers occur.
It uses `__builtin_*` for performance through the support in LLVM MathExtras.
Using this in the Presburger library results in a minor performance
*improvement* over any commit before the sequence of patches
starting at d5e31cf38adfc2c240fb9717989792537cc9e819.
Reviewed By: Groverkss, ftynse
Differential Revision: https://reviews.llvm.org/D128811
Tom Praschan [Mon, 11 Jul 2022 10:20:15 +0000 (12:20 +0200)]
[clangd] Include "final" when printing class declaration
Fixes https://github.com/clangd/clangd/issues/1184
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D128202
Tom Praschan [Mon, 11 Jul 2022 10:13:35 +0000 (12:13 +0200)]
Go-to-type on smart_ptr<Foo> now also shows Foo
Fixes clangd/clangd#1026
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D128826
David Sherwood [Fri, 1 Jul 2022 09:19:45 +0000 (10:19 +0100)]
[LoopVectorize][NFC] Add optional Name parameter to VPInstruction
This patch is a simple piece of refactoring that now permits users
to create VPInstructions and specify the name of the value being
generated. This is useful for creating more readable/meaningful
names in IR.
Differential Revision: https://reviews.llvm.org/D128982