Johannes Doerfert [Sun, 23 Jan 2022 20:06:22 +0000 (14:06 -0600)]
[Attributor][NFCI] Expose some nosync reasoning to outside users.
No-sync is a property that we need in more places as complex
transformations emerge. To simplify the query we provide an
`AA::isNoSyncInst` helper now and expose two existing helpers through
the `AANoSync` class.
Johannes Doerfert [Sun, 23 Jan 2022 20:08:06 +0000 (14:08 -0600)]
[Attributor][NFCI] Remove anonymous namespaces
The namespaces made it more complicated to implement static helpers,
among other things. We should not need them at all.
Johannes Doerfert [Sat, 22 Jan 2022 22:24:52 +0000 (16:24 -0600)]
[OpenMP] Eliminate redundant barriers in the same block
Patch originally by Giorgis Georgakoudis (@ggeorgakoudis), typos and
bugs introduced later by me.
This patch allows us to remove redundant barriers if they are part
of a "consecutive" pair of barriers in a basic block with no impacted
memory effect (read or write) in-between them. Memory accesses to
local (=thread private) or constant memory are allowed to appear.
Technically we could also allow any other memory that is not used to
share information between threads, e.g., the result of a malloc that
is also not captured. However, it will be easier to do more reasoning
once the code is put into an AA. That will also allow us to look through
phis/selects reasonably. At that point we should also deal with calls,
barriers in different blocks, and other complexities.
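The rule described above can be sketched on a toy model of a basic block. This is only an illustration, not the pass itself: the `Kind` enum and `dropRedundantBarriers` are hypothetical names, and the choice to drop the later barrier of a pair is an assumption of the sketch.

```cpp
#include <vector>

// Toy instruction kinds; a stand-in for real LLVM IR (hypothetical).
enum class Kind { Barrier, LocalAccess, ConstantAccess, SharedAccess, Arithmetic };

// Returns the block with redundant barriers dropped. A barrier is treated as
// redundant when an earlier barrier in the same block precedes it and no
// access to shared memory sits between the two. Accesses to local
// (thread-private) or constant memory do not block the removal.
std::vector<Kind> dropRedundantBarriers(const std::vector<Kind> &Block) {
  std::vector<Kind> Out;
  bool BarrierStillCovers = false; // true while a kept barrier has seen no shared access since
  for (Kind K : Block) {
    if (K == Kind::Barrier) {
      if (BarrierStillCovers)
        continue; // no shared access since the last barrier: drop this one
      BarrierStillCovers = true;
    } else if (K == Kind::SharedAccess) {
      BarrierStillCovers = false; // shared traffic: the next barrier is needed
    }
    Out.push_back(K);
  }
  return Out;
}
```

Calls and barriers in other blocks are deliberately left out of the model, matching the limitations listed above.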
Differential Revision: https://reviews.llvm.org/D118002
Johannes Doerfert [Fri, 21 Jan 2022 21:47:56 +0000 (15:47 -0600)]
[OpenMP] Ensure to remove noinline from all runtime functions eventually
We used to remove noinline from known OpenMP runtime functions (which
are declared in OMPKinds.td). Now we remove noinline from all functions
with the proper prefixes: __kmpc, _ZN4_OMP (= namespace omp), omp_
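A minimal sketch of such a prefix test, assuming the function name is available as a plain string. The helper name `hasOpenMPRuntimePrefix` is hypothetical; only the three prefixes come from the message above.

```cpp
#include <string>

// Prefixes taken from the commit message above.
static const char *const RuntimePrefixes[] = {"__kmpc", "_ZN4_OMP", "omp_"};

// Hypothetical helper: true if Name starts with one of the runtime prefixes.
bool hasOpenMPRuntimePrefix(const std::string &Name) {
  for (const char *Prefix : RuntimePrefixes) {
    const std::string P(Prefix);
    // compare() on the leading P.size() characters is a C++11-safe starts_with.
    if (Name.compare(0, P.size(), P) == 0)
      return true;
  }
  return false;
}
```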
Amir Ayupov [Mon, 31 Jan 2022 06:02:51 +0000 (22:02 -0800)]
[BOLT][CMAKE] Add extra BOLT_INCLUDE_TESTS condition for merge-fdata emit-relocs option
Only enable --emit-relocs linker option for merge-fdata target if tests are enabled.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D118580
Siva Chandra Reddy [Mon, 31 Jan 2022 17:32:07 +0000 (17:32 +0000)]
[libc] Add implementations of POSIX mkdir, mkdirat, rmdir, unlink and unlinkat.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D118641
Jez Ng [Tue, 1 Feb 2022 04:45:19 +0000 (23:45 -0500)]
[lld-macho][test] Add test for UUID format
Reviewed By: keith
Differential Revision: https://reviews.llvm.org/D118646
Serguei Katkov [Thu, 27 Jan 2022 05:21:09 +0000 (12:21 +0700)]
[RS4GC] Make PointerToBase mapping independent of the call site. NFC.
PointerToBase maps a potentially derived pointer to its base.
Because we are in SSA form, if a derived pointer has a base that is
available at the def of the derived pointer, the same base is available at any
point where the derived pointer is live.
So the mapping from derived pointer to base pointer is not a property
of a call site; it is the same across the whole function.
Reviewers: reames, yrouban
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D118604
Joseph Huber [Tue, 1 Feb 2022 04:32:33 +0000 (23:32 -0500)]
[OpenMP] Remove new driver tests for AMDGPU
Some of the new driver tests are flaky on AMDGPU, remove for now.
Joseph Huber [Mon, 31 Jan 2022 19:31:54 +0000 (14:31 -0500)]
[Libomptarget] Run GPU offloading tests using the new driver
This patch adds a new target to the tests to run using the new driver as
the method for generating offloading code.
Depends on D116541
Differential Revision: https://reviews.llvm.org/D118637
Joseph Huber [Fri, 21 Jan 2022 20:43:20 +0000 (15:43 -0500)]
[PassBuilder] Add OpenMPOpt to default LTO pipeline
The LTO support for OpenMP offloading allows us to run the OpenMPOpt
pass during the LTO pipeline. This patch introduces an early run of the
Module pass and a late run of the CGSCC pass. These are quick no-ops if
there is no OpenMP in the module.
Depends on D118198
Differential Revision: https://reviews.llvm.org/D118611
Joseph Huber [Tue, 25 Jan 2022 22:46:01 +0000 (17:46 -0500)]
[OpenMP] Remove call to 'clang-offload-wrapper' binary
Summary:
This patch removes the system call to the `clang-offload-wrapper` tool
by replicating its functionality in a new file. This improves
performance and makes the future wrapping functionality easier to
change.
Differential Revision: https://reviews.llvm.org/D118198
Joseph Huber [Tue, 25 Jan 2022 19:25:39 +0000 (14:25 -0500)]
[OpenMP] Replace system call to `llc` with target machine
Summary:
This patch replaces the system call to the `llc` binary with a library
call to the target machine interface. This should be faster than
relying on an external system call to compile the final wrapper binary.
Differential Revision: https://reviews.llvm.org/D118197
Joseph Huber [Tue, 25 Jan 2022 16:23:27 +0000 (11:23 -0500)]
[OpenMP] Cleanup the Linker Wrapper
Summary:
Various changes and cleanup for the Linker Wrapper tool.
Joseph Huber [Tue, 18 Jan 2022 15:56:12 +0000 (10:56 -0500)]
[OpenMP] Include the executable name in the temporary files
Summary:
This parses the executable name out of the linker arguments so we can
use it to give more informative temporary file names and so we don't
accidentally use it for device linking.
Joseph Huber [Sun, 16 Jan 2022 21:06:59 +0000 (16:06 -0500)]
[OpenMP] Implement save temps functionality in linker wrapper
Summary:
This patch implements the `-save-temps` flag for the linker wrapper.
This allows the user to inspect the intermediate output that the
linker wrapper creates.
Joseph Huber [Sun, 16 Jan 2022 04:10:52 +0000 (23:10 -0500)]
[OpenMP] Embed bitcode after optimizations instead of linking
Summary:
Various changes to the linker wrapper, and the bitcode embedding is now
done after the optimizations have run rather than after linking is done.
This saves time when doing JIT.
Joseph Huber [Fri, 14 Jan 2022 03:59:05 +0000 (22:59 -0500)]
[OpenMP] Improve symbol resolution for OpenMP Offloading LTO
This patch improves the symbol resolution done for LTO with offloading
applications. The symbol resolution done here allows the LTO backend to
internalize more functions. The symbol resolution done is a simplified
view that does not take into account various options like `--wrap` or
`--dynamic-list` and always assumes we are creating a shared object.
The actual target may be an executable, but semantically it is used as a
shared object because certain objects need to be visible outside of the
executable when they are read by the OpenMP plugin.
Depends on D117246
Differential Revision: https://reviews.llvm.org/D118155
Joseph Huber [Thu, 13 Jan 2022 17:42:02 +0000 (12:42 -0500)]
[OpenMP] Add support for linking AMDGPU images
This patch adds support for linking AMDGPU images using the LLD binary.
AMDGPU files are always bitcode images and will always use the LTO
backend. Additionally we now pass the default architecture found with
the `amdgpu-arch` tool to the argument list.
Depends on D117156
Differential Revision: https://reviews.llvm.org/D117246
Joseph Huber [Wed, 12 Jan 2022 21:14:52 +0000 (16:14 -0500)]
[OpenMP] Add extra flag handling to linker wrapper
This patch adds support for a few extra flags in the linker wrapper,
such as debugging flags, verbose output, and passing arguments to ptxas. We also
now forward pass remarks to the LLVM backend so they will show up in the LTO
passes.
Depends on D117049
Differential Revision: https://reviews.llvm.org/D117156
Joseph Huber [Tue, 11 Jan 2022 20:50:39 +0000 (15:50 -0500)]
[OpenMP] Add support for embedding bitcode images in wrapper tool
Summary:
This patch adds support for embedding device images in the linker
wrapper tool. This will be used for performing JIT functionality in the
future.
Depends on D117048
Differential Revision: https://reviews.llvm.org/D117049
Joseph Huber [Tue, 11 Jan 2022 15:53:59 +0000 (10:53 -0500)]
[OpenMP] Link the bitcode library late for device LTO
Summary:
This patch adds support for linking the OpenMP device bitcode library
late when doing LTO. This simply passes it in as an additional device
file when doing the final device linking phase with LTO. This has the
advantage that we don't link it multiple times, and the device
references do not get inlined and prevent us from doing needed OpenMP
optimizations when we have visibility of the whole module.
Fix some failures where the implicit conversion of an Error to an
Expected triggered the deleted copy constructor.
Depends on D116675
Differential revision: https://reviews.llvm.org/D117048
Joseph Huber [Fri, 7 Jan 2022 22:12:51 +0000 (17:12 -0500)]
[OpenMP] Initial Implementation of LTO and bitcode linking in linker wrapper
This patch implements the first support for handling LTO in the
offloading pipeline. The flag `-foffload-lto` is used to control if
bitcode is embedded into the device. If bitcode is found in the device,
the extracted files will be sent to the LTO pipeline to be linked and
sent to the backend. This implementation does not separately link the
device bitcode libraries yet.
Depends on D116675
Differential Revision: https://reviews.llvm.org/D116975
Joseph Huber [Wed, 5 Jan 2022 18:21:03 +0000 (13:21 -0500)]
[OpenMP] Search for static libraries in offload linker tool
This patch adds support for searching through the linker library paths
to identify static libraries that may contain device code. If device
code is present it will be extracted. This should ideally fully support
static linking with OpenMP offloading.
Depends on D116627
Differential Revision: https://reviews.llvm.org/D116675
Joseph Huber [Tue, 4 Jan 2022 22:20:04 +0000 (17:20 -0500)]
[Clang] Initial support for linking offloading code in tool
This patch adds the initial support for linking NVPTX offloading code
using the clang-linker-wrapper tool. This uses the extracted device
files and runs `nvlink` on them. Currently this is then passed to the
existing toolchain for creating linkable OpenMP offloading programs
using `clang-offload-wrapper` and compiling it manually using `llc`.
More work is required to support LTO, Bitcode linking, AMDGPU, and x86
offloading.
Depends on D116545
Differential Revision: https://reviews.llvm.org/D116627
Joseph Huber [Mon, 3 Jan 2022 17:31:52 +0000 (12:31 -0500)]
[OpenMP] Add support for extracting device code in linker wrapper
This patch adds support for extracting device offloading code from the
linker's input files. If the file contains a section with the name
`.llvm.offloading.<triple>.<arch>` it will be extracted to a new
temporary file to be linked. Additionally, the host file containing it
will have the section stripped so it does not remain in the executable
once linked.
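As an illustration of the naming scheme, the sketch below splits a section name of the form `.llvm.offloading.<triple>.<arch>` into its parts. The `OffloadSection` struct and `parseOffloadSectionName` are hypothetical names, and the sketch assumes the triple itself contains no `.` character; the linker wrapper's actual parsing may differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for the two components of the section name.
struct OffloadSection {
  std::string Triple;
  std::string Arch;
};

// Returns true and fills Out if Name looks like
// ".llvm.offloading.<triple>.<arch>"; returns false otherwise.
// Assumption: the triple contains no '.', so the first '.' after the
// prefix separates triple and arch.
bool parseOffloadSectionName(const std::string &Name, OffloadSection &Out) {
  const std::string Prefix = ".llvm.offloading.";
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  const std::string Rest = Name.substr(Prefix.size());
  const std::size_t Dot = Rest.find('.');
  if (Dot == std::string::npos || Dot == 0 || Dot + 1 == Rest.size())
    return false; // missing triple or missing arch
  Out.Triple = Rest.substr(0, Dot);
  Out.Arch = Rest.substr(Dot + 1);
  return true;
}
```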
Depends on D116544
Differential Revision: https://reviews.llvm.org/D116545
Sam Clegg [Mon, 12 Oct 2020 13:59:51 +0000 (06:59 -0700)]
llvm-readobj: support globals in initializer expressions
Differential Revision: https://reviews.llvm.org/D117747
River Riddle [Fri, 21 Jan 2022 08:38:30 +0000 (00:38 -0800)]
[mlir] Add isa/dyn_cast support for dialect interfaces
This matches the same API usage as attributes/ops/types. For example:
```c++
Dialect *dialect = ...;
// Instead of this:
if (auto *interface = dialect->getRegisteredInterface<DialectInlinerInterface>())
// You can do this:
if (auto *interface = dyn_cast<DialectInlinerInterface>(dialect))
```
Differential Revision: https://reviews.llvm.org/D117859
Fangrui Song [Tue, 1 Feb 2022 03:16:11 +0000 (19:16 -0800)]
[AArch64] Temporarily use getPointerElementType to fix -Wdeprecated-declarations. NFC
Tanya Lattner [Tue, 1 Feb 2022 03:03:29 +0000 (19:03 -0800)]
Add status of migration.
Mircea Trofin [Tue, 1 Feb 2022 02:59:47 +0000 (18:59 -0800)]
[nfc][mlgo][regalloc] 'hasPreferredPhys' out of feature components
It isn't cacheable: it can be updated by events other than live interval
resizing.
Geoffrey Martin-Noble [Tue, 1 Feb 2022 01:50:59 +0000 (17:50 -0800)]
[Bazel] Don't fail the build on usage of deprecated APIs
Build failures are not a particularly helpful way to enforce not using
deprecated APIs and that isn't the point of the Bazel build.
At the same time, this removes `-Wno-unused`. This is a check that we do
enforce in the Google internal build and so are OK maintaining in our
maintenance of the upstream Bazel build (the comment about not wanting
to do so was from a time when this was in a separate repository and I was
the only one maintaining it).
Differential Revision: https://reviews.llvm.org/D118671
Changpeng Fang [Tue, 1 Feb 2022 02:07:47 +0000 (18:07 -0800)]
AMDGPU {NFC}: Add code object v5 support and generate metadata for implicit kernel args
Summary:
Add code object v5 support (default is still v4)
Generate metadata for implicit kernel args for the new ABI
Set the metadata version to be 1.2
Reviewers:
t-tye, b-sumner, arsenm, and bcahoon
Fixes:
SWDEV-307188, SWDEV-307189
Differential Revision:
https://reviews.llvm.org/D118272
Chris Bieneman [Tue, 1 Feb 2022 01:44:37 +0000 (19:44 -0600)]
Fix memory leak I introduced in 2d66ed370a40
This should fix the asan issue identified on the Linux asan bot.
David Blaikie [Tue, 1 Feb 2022 01:32:31 +0000 (17:32 -0800)]
Disable -Wmissing-prototypes for internal linkage functions that aren't explicitly marked "static"
Some functions can end up non-externally visible despite not being
declared "static" or in an unnamed namespace in C++ - such as by having
parameters that are of non-external types.
Such functions aren't mistakenly intended to be defining some function
that needs a declaration. They could be maybe more legible (except for
the `operator new` example) with an explicit static, but that's a
stylistic thing outside what should be addressed by a warning.
Jonas Devlieghere [Tue, 1 Feb 2022 00:51:07 +0000 (16:51 -0800)]
[lldb] Use the build's python interpreter in the shell tests
Make sure that the shell tests use the same python interpreter as the
rest of the build instead of picking up `python` from the PATH.
It would be nice if we could use the _disallow helper, but that triggers
on invocations that specify python as the scripting language.
Fangrui Song [Tue, 1 Feb 2022 00:46:11 +0000 (16:46 -0800)]
[BitcodeWriter] Fix cases of some functions
`WriteIndexToFile` is used by external projects so I do not touch it.
Fangrui Song [Tue, 1 Feb 2022 00:33:56 +0000 (16:33 -0800)]
[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils
D116542 adds EmbedBufferInModule which introduces a layer violation
(https://llvm.org/docs/CodingStandards.html#library-layering).
See 2d5f857a1eaf5f7a806d12953c79b96ed8952da8 for detail.
EmbedBufferInModule does not use BitcodeWriter functionality and should be moved
to LLVMTransformsUtils. While here, change the function case to the prevailing
convention.
It seems that EmbedBufferInModule just follows the steps of
EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR
update operations which ideally should be refactored to another library.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D118666
Joseph Huber [Mon, 31 Jan 2022 23:58:35 +0000 (18:58 -0500)]
[LLVM] Resolve layer violation in BitcodeWriter
Summary:
The changes introduced in D116542 added a dependency on TransformUtils
to use the `appendToCompilerUsed` method. This created a circular
dependency. This patch simply copies the needed function locally to
remove the dependency.
Keith Smiley [Sat, 29 Jan 2022 04:06:51 +0000 (20:06 -0800)]
[llvm-objcopy][MachO] Ignore LC_LINKER_OPTION when redefining symbols
Previously you would get this error:
```
error: unsupported load command (cmd=0x2d)
```
if the binary you were redefining the symbols of contained a
LC_LINKER_OPTION load command. This command does not need to be changed
when redefining symbols so we can ignore it like many others.
Differential Revision: https://reviews.llvm.org/D118526
Fangrui Song [Mon, 31 Jan 2022 23:41:45 +0000 (15:41 -0800)]
[Bazel] Add include/llvm/Transforms/Utils/ModuleUtils.h to work around layer violation after D116542
There is a layer violation that can break clang -fmodule-name=X -fmodules-strict-decluse builds:
* LLVMTransformUtils has `#include "llvm/Bitcode/BitcodeWriterPass.h"`
* LLVMBitWriter depends on LLVMTransformUtils after D116542
Temporarily work around the issue.
Michael Kruse [Mon, 31 Jan 2022 15:49:44 +0000 (09:49 -0600)]
[Clang][OpenMPIRBuilder] Fix off-by-one error when dividing by stepsize.
When the stepsize does not evenly divide the range, round up to ensure that the last multiple of the stepsize before the upper bound is reached. For instance, the trip count of
for (int i = 0; i < 7; i+=5)
is two (i=0 and i=5), not (7-0)/5 == 1.
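The rounding can be sketched as a ceiling division. This is a standalone illustration, not the OpenMPIRBuilder code; `tripCount` is a hypothetical helper and it assumes a positive step and a `<` comparison.

```cpp
#include <cstdint>

// Trip count of `for (i = Start; i < End; i += Step)`, assuming Step > 0.
// Truncating division undercounts when Step does not evenly divide the
// range, so one extra iteration is added for a non-zero remainder.
std::uint64_t tripCount(std::int64_t Start, std::int64_t End, std::int64_t Step) {
  if (Start >= End)
    return 0;
  // Unsigned subtraction avoids signed overflow for very wide ranges.
  const std::uint64_t Range =
      static_cast<std::uint64_t>(End) - static_cast<std::uint64_t>(Start);
  const std::uint64_t S = static_cast<std::uint64_t>(Step);
  return Range / S + (Range % S != 0 ? 1 : 0);
}
```

With the loop from the message, `tripCount(0, 7, 5)` yields 2, whereas the plain quotient `(7 - 0) / 5` yields 1.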
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D118542
Peter Klausler [Wed, 26 Jan 2022 17:53:12 +0000 (09:53 -0800)]
[flang] Make NEWUNIT= use a range suitable for INTEGER(KIND=1) and recycle unit numbers
Use a bit-set to manage runtime-generated I/O unit numbers, recycle
them after they're closed, and use a range of values that fits in
a minimal-sized integer.
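A minimal sketch of a bit-set based pool that recycles numbers, under stated assumptions: `UnitNumberPool`, the pool size of 64, and the range starting at -2 are arbitrary choices for illustration (picked so every value fits in a signed 8-bit integer), not the values the flang runtime uses.

```cpp
#include <bitset>

// Illustrative constants (assumptions, not the runtime's actual range).
constexpr int kFirstUnit = -2; // first number handed out
constexpr int kSlots = 64;     // numbers -2 down to -65 fit in 8 signed bits

// Hypothetical pool that hands out negative unit numbers and reuses them.
class UnitNumberPool {
public:
  // Returns the lowest-indexed free unit number, or 0 if none is free.
  int allocate() {
    for (int I = 0; I < kSlots; ++I) {
      if (!InUse.test(static_cast<unsigned>(I))) {
        InUse.set(static_cast<unsigned>(I));
        return kFirstUnit - I;
      }
    }
    return 0;
  }

  // Marks a previously allocated unit number as free again.
  void release(int Unit) {
    const int I = kFirstUnit - Unit;
    if (I >= 0 && I < kSlots)
      InUse.reset(static_cast<unsigned>(I));
  }

private:
  std::bitset<kSlots> InUse; // bit I set means unit (kFirstUnit - I) is open
};
```

Because `release` clears the bit, a later `allocate` returns the same number again instead of growing the range.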
Differential Revision: https://reviews.llvm.org/D118651
Mircea Trofin [Mon, 31 Jan 2022 22:43:03 +0000 (14:43 -0800)]
[mlgo][regalloc] Factor live interval feature calculation
Factoring it out so we can subsequently cache it. This should be a NFC,
however, for the float quantities, we see small errors in the least
significant digits. This is because, before, we were summing up one by
one. Now, we sum up results of sums.
This shouldn't matter for ML, and will require rework when we do
quantization (avoiding floats altogether), but meanwhile, it did require
an update to the reference file used for testing.
The patch also bumps the precision of the variables involved in this, to
reduce the error (note they are cast back to float at the end by the
SET macro, since we only work with float and not double in TF)
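The effect of the summation order can be reproduced in isolation. The sketch below is not the regalloc code: the three function names are hypothetical, the input is deliberately extreme to make the difference visible, and it assumes IEEE-754 single precision without extra intermediate precision.

```cpp
#include <cstddef>
#include <vector>

// Adds the values one by one into a float accumulator.
float sumSequential(const std::vector<float> &V) {
  float Acc = 0.0f;
  for (float X : V)
    Acc += X;
  return Acc;
}

// Sums the two halves separately, then adds the partial sums.
float sumOfSums(const std::vector<float> &V) {
  const std::size_t Mid = V.size() / 2;
  float Lo = 0.0f, Hi = 0.0f;
  for (std::size_t I = 0; I < Mid; ++I)
    Lo += V[I];
  for (std::size_t I = Mid; I < V.size(); ++I)
    Hi += V[I];
  return Lo + Hi;
}

// Accumulates in double and converts back to float only at the end.
float sumWide(const std::vector<float> &V) {
  double Acc = 0.0;
  for (float X : V)
    Acc += X;
  return static_cast<float>(Acc);
}
```

For the input `{1.0f, 1e8f, -1e8f, 1.0f}` the two float orderings disagree, because each rounds away a different `1.0f`, while the double accumulator keeps both.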
Differential Revision: https://reviews.llvm.org/D118659
Snehasish Kumar [Mon, 31 Jan 2022 22:15:36 +0000 (14:15 -0800)]
[instrprof][NFC] Refactor out the common logic for getProfileKind.
The logic for getProfileKind for RawInstrProfReader and
InstrProfReaderIndex is similar. To avoid duplication, move the logic
from the header to InstrProfReader.cpp and introduce a static method
which implements the common code.
Differential Revision: https://reviews.llvm.org/D118656
Snehasish Kumar [Wed, 29 Dec 2021 23:31:11 +0000 (15:31 -0800)]
[memprof] Move the meminfo block struct to MemProfData.inc.
The definition of the MemInfoBlock is shared between the memprof
compiler-rt runtime and llvm/lib/ProfileData/. This change removes the
memprof_meminfoblock header and moves the struct to the shared include
file. To enable this sharing, the Print method is moved to the
memprof_allocator (the only place it is used) and the remaining uses are
updated to refer to the MemInfoBlock defined in the MemProfData.inc
file.
Also a couple of other minor changes which improve usability of the
types in MemProfData.inc.
* Update the PACKED macro to handle commas.
* Add constructors and equality operators.
* Don't initialize the buildid field.
Differential Revision: https://reviews.llvm.org/D116780
Peter Klausler [Thu, 20 Jan 2022 22:09:05 +0000 (14:09 -0800)]
[flang] runtime perf: larger I/O buffer growth increments
When reallocating an I/O buffer to accommodate a large record,
ensure that the amount of growth is at least as large as the
minimum initial record size (64KiB). The previous policy was
causing input buffer reallocation for each byte after the minimum
buffer size when scanning input data for record termination
newlines.
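The growth policy can be sketched as follows. `nextBufferSize` and `kMinBufferBytes` are hypothetical names; only the 64KiB minimum comes from the message above, and the real runtime may round or cap sizes differently.

```cpp
#include <algorithm>
#include <cstddef>

// Minimum growth increment: 64 KiB, as stated in the commit message.
constexpr std::size_t kMinBufferBytes = 64 * 1024;

// Returns the new capacity for a buffer of size Current that must hold
// Required bytes. Growing by at least kMinBufferBytes avoids reallocating
// for every additional byte while scanning for a record terminator.
std::size_t nextBufferSize(std::size_t Current, std::size_t Required) {
  if (Required <= Current)
    return Current; // already large enough
  return std::max(Required, Current + kMinBufferBytes);
}
```

A request for one byte past a 64KiB buffer therefore doubles it to 128KiB instead of growing it by a single byte.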
Differential Revision: https://reviews.llvm.org/D118649
Dávid Bolvanský [Mon, 31 Jan 2022 22:45:56 +0000 (23:45 +0100)]
[Clang][NFC] Added testcase from #49549
The issue is fixed in trunk, so add testcase to avoid regression in the future.
Konstantin Varlamov [Mon, 31 Jan 2022 22:44:53 +0000 (14:44 -0800)]
[libc++][ranges][NFC] Fix formatting on newly-added links on the Ranges status page.
Sam Clegg [Sun, 30 Jan 2022 03:09:06 +0000 (19:09 -0800)]
[clang][WebAssembly] Imply -fno-threadsafe-static when threading is disabled
When we don't enable atomics we completely disable threading, in
which case there is no point in generating thread-safe code for
static initialization.
This should always be safe because, in WebAssembly, it is not
possible to link objects compiled without the atomics feature into a
multi-threaded program.
See https://github.com/emscripten-core/emscripten/pull/16152
Differential Revision: https://reviews.llvm.org/D118571
Chris Bieneman [Mon, 31 Jan 2022 21:44:55 +0000 (15:44 -0600)]
[NFC] Skip PassBuilderCTests if no default triple
This fixes the unit tests so that they are skipped if there is no default
target triple set. Unset default target triple is a supported build
configuration for LLVM.
Mircea Trofin [Mon, 31 Jan 2022 22:01:43 +0000 (14:01 -0800)]
[NFC][regalloc] Move evict advisor initialization before VRAI
This is because a subsequent patch will propose obtaining the VRAI from
the advisor, which will enable feature caching for the ML advisor, for
better compile time. Making this change first as it's both innocuous and
keeps the future patch to be reviewed small.
Joachim Protze [Mon, 31 Jan 2022 21:53:01 +0000 (22:53 +0100)]
[OpenMP][tests][NFC] Pin debug info to DWARF v4 for libarcher tests
Temporary solution for #53467, since debian test machines do not support
DWARF v5.
Kirill Stoimenov [Mon, 31 Jan 2022 20:51:03 +0000 (20:51 +0000)]
[ASan] Fixed null pointer bug introduced in D112098.
Also added some more tests to cover the "else if" part.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D118645
Joseph Huber [Mon, 31 Jan 2022 21:46:00 +0000 (16:46 -0500)]
[OpenMP] Remove hard-coded triple in new driver test
Summary:
Previously this test used a hard-coded triple value in the check lines
which failed on other architectures. This patch changes that to accept
any host triple.
Itay Bookstein [Sat, 29 Jan 2022 11:05:17 +0000 (13:05 +0200)]
[clang][CodeGen][NFC] Remove unused CodeGenModule fields
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D118619
Mircea Trofin [Mon, 31 Jan 2022 21:42:47 +0000 (13:42 -0800)]
[nfc][mlgo] De-const a parameter
We plan to pass the MachineFunction& to APIs that expect it non-const
(for legitimate reasons). The advisor still holds the ref as a const
ref, though, so we keep most of the maintainability value of that.
Peter Klausler [Wed, 26 Jan 2022 17:54:58 +0000 (09:54 -0800)]
[flang] Distinguish intrinsic from non-intrinsic modules
For "USE, INTRINSIC", search only for intrinsic modules;
for "USE, NON_INTRINSIC", do not recognize intrinsic modules.
Allow modules of both kinds with the same name to be used in
the same source file (but not in the same scoping unit, a
constraint of the standard that is now enforced).
The symbol table's scope tree now has a single instance of
a scope with a new kind, IntrinsicModules, whose children are
the USE'd intrinsic modules (explicit or not). This separate
"top-level" scope is a child of the single global scope and
it allows both intrinsic and non-intrinsic modules of the same
name to exist in the symbol table. Intrinsic modules' scopes'
symbols now have the INTRINSIC attribute set.
The search path directories need to make a distinction between
regular directories and the one(s) that point(s) to intrinsic
modules. I allow for multiple intrinsic module directories in
the second search path, although only one is needed today.
Differential Revision: https://reviews.llvm.org/D118631
William S. Moses [Sun, 17 Oct 2021 22:31:00 +0000 (18:31 -0400)]
[LoopIdiom] Keep TBAA when creating memcpy/memmove
When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D108221
Nico Weber [Mon, 31 Jan 2022 21:15:48 +0000 (16:15 -0500)]
[gn build] (manually) port 551b1774524
Martin Storsjö [Wed, 12 Jan 2022 09:26:49 +0000 (09:26 +0000)]
[libcxx] [Windows] Pick a unique bit for __regex_word
The old `__regex_word` aliased the mask for `xdigit`, causing stray
test failures.
The diff may look surprising, as if the previous faulty value had
been set specifically for Windows - but this is due to a restructuring
in 411c630bae0e0d50697651797709987e2cfea92d. Prior to that, there
were OS specific settings for some OSes, and one fallback used for
the rest (which turns out to not work for Windows).
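The aliasing problem can be shown with made-up mask values. Everything in this sketch is hypothetical: the bit assignments are NOT the libc++ or Windows values, and `isUniqueBit` and `isWord` are illustrative helpers.

```cpp
#include <cstdint>

// Hypothetical classification bits, NOT the real libc++ or Windows values.
constexpr std::uint32_t kAlpha = 1u << 0;
constexpr std::uint32_t kDigit = 1u << 1;
constexpr std::uint32_t kXDigit = 1u << 2;
constexpr std::uint32_t kSpace = 1u << 3;
constexpr std::uint32_t kPlatformBits = kAlpha | kDigit | kXDigit | kSpace;

// True if Candidate is a single bit that none of the Used masks occupy.
constexpr bool isUniqueBit(std::uint32_t Candidate, std::uint32_t Used) {
  return Candidate != 0 && (Candidate & (Candidate - 1)) == 0 &&
         (Candidate & Used) == 0;
}

// True if a character's class mask has the given "word" bit set.
constexpr bool isWord(std::uint32_t CharClass, std::uint32_t WordBit) {
  return (CharClass & WordBit) != 0;
}
```

If the word bit shares its value with `kXDigit`, every hex digit is also reported as a word character, which is the kind of stray match a unique bit avoids.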
Differential Revision: https://reviews.llvm.org/D118188
David Greene [Mon, 31 Jan 2022 15:06:08 +0000 (07:06 -0800)]
[UpdateTestChecks] Re-add --filter and --filter-out options
Re-add filtering options with fixes for failed tests. We were not passing the
is_filtered argument in all check generator calls in update_cc_test_checks.py
Enhance the various update_*_test_checks.py tools to allow filtering the tool
output with regular expressions. The --filter option will emit only tool output
lines matching the given regular expression while the --filter-out option will
emit only tools output lines not matching the given regular expression. Filters
are applied in order of appearance on the command line (or in UTC_ARGS) and the
first matching filter terminates the search.
This allows test authors to create more focused tests by removing irrelevant
tool output and checking only the pieces of output necessary to test the desired
functionality.
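The ordering rule can be sketched as below. The scripts themselves are Python; this C++ version only models the rule, with hypothetical names `Filter` and `keepLine`. What happens when no filter matches is an assumption of the sketch: the line is kept only if every filter was a `--filter-out`.

```cpp
#include <regex>
#include <string>
#include <vector>

// One command-line filter: a pattern plus whether it came from --filter-out.
struct Filter {
  std::regex Pattern;
  bool IsFilterOut;
};

// Decides whether a line of tool output is emitted. Filters are tried in
// command-line order and the first one that matches decides the outcome.
bool keepLine(const std::string &Line, const std::vector<Filter> &Filters) {
  bool SawKeepFilter = false;
  for (const Filter &F : Filters) {
    if (!F.IsFilterOut)
      SawKeepFilter = true;
    if (std::regex_search(Line, F.Pattern))
      return !F.IsFilterOut; // --filter keeps the line, --filter-out drops it
  }
  // Assumption: with no match, keep the line only if no --filter was given.
  return !SawKeepFilter;
}
```

Placing a `--filter-out` for `call` before a `--filter` for `ret|call` therefore drops call lines even though the later pattern would also match them.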
Differential Revision: https://reviews.llvm.org/D117694
tyb0807 [Thu, 20 Jan 2022 11:28:19 +0000 (11:28 +0000)]
[AArch64] Removing redundant PAuth flag
This removes `HasPAUTH` from `AArch64SubTarget`, as it seems to be a
redundant, unused copy of `HasPAuth`.
Differential Revision: https://reviews.llvm.org/D117782
tyb0807 [Wed, 19 Jan 2022 10:19:58 +0000 (10:19 +0000)]
[AArch64][SelectionDAG] CodeGen for Armv8.8/9.3 MOPS
New target SDNodes are added: AArch64ISD::MOPS_MEMSET, etc.
Each intrinsic is translated to one of these in SelectionDAGBuilder
via EmitTargetCodeForMOPS.
A custom lowering routine for INTRINSIC_W_CHAIN is added to handle
llvm.aarch64.mops.memset.tag. This takes a separate path from the common
intrinsics but ultimately ends up in the same EmitMOPS().
This is part 4/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.
Patch by Tomas Matheson, Lucas Prates and Son Tuan Vu.
Differential Revision: https://reviews.llvm.org/D117764
Joseph Huber [Thu, 30 Dec 2021 21:41:36 +0000 (16:41 -0500)]
[Clang] Introduce Clang Linker Wrapper Tool
This patch introduces a linker wrapper tool that allows us to preprocess
files before they are sent to the linker. This adds a dummy action and
job to the driver stage that builds the linker command as usual and then
replaces the command line with the wrapper tool.
Depends on D116543
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116544
Joseph Huber [Wed, 29 Dec 2021 21:29:13 +0000 (16:29 -0500)]
[OpenMP] Embed device files into the host IR
This patch adds support for embedding the device object files into the
host IR to create a fat binary. Each offloading file will be inserted
into a section with the following naming format
`.llvm.offloading.<triple>.<arch>.<filename>`.
Depends on D116542
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116543
Joseph Huber [Fri, 3 Dec 2021 20:48:36 +0000 (15:48 -0500)]
[OpenMP] Add a flag for embedding a file into the module
This patch adds support for a flag `-fembed-offload-binary` to embed a
file as an ELF section in the output by placing it in a global variable.
This can be used to bundle offloading files with the host binary so it
can be accessed by the linker. The section is named using the
`-fembed-offload-section` option.
Depends on D116541
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116542
Joseph Huber [Thu, 16 Dec 2021 17:40:13 +0000 (12:40 -0500)]
[OpenMP] Introduce new flag to change offloading driver pipeline
This patch introduces the `-fopenmp-new-driver` option which instructs
the compiler to use a new driver scheme for producing offloading code.
In this scheme we create a complete offloading object file and then pass
it as input to the host compilation phase. This will allow us to embed
the object code in the backend phase.
This is the start of a series of commits to rework the OpenMP offloading driver
pipeline. The goal of this is to simplify the steps required for creating an
offloading program. This patch changes the driver's configuration to simply pass
the device file back to the host as an input so it can be embedded as an LLVM IR
global during the backend, then simply passes that object file to the linker.
This driver implementation will currently create the following phases,
```
$ clang input.c -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-phases
+- 0: input, "input.c", c, (host-openmp)
+- 1: preprocessor, {0}, cpp-output, (host-openmp)
+- 2: compiler, {1}, ir, (host-openmp)
| | +- 3: input, "input.c", c, (device-openmp)
| | +- 4: preprocessor, {3}, cpp-output, (device-openmp)
| |- 5: compiler, {4}, ir, (device-openmp)
| +- 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {5}, ir
| +- 7: backend, {6}, assembler, (device-openmp)
|- 8: assembler, {7}, object, (device-openmp)
+- 9: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {8}, ir
+- 10: backend, {9}, assembler, (host-openmp)
+- 11: assembler, {10}, object, (host-openmp)
12: clang-linker-wrapper, {11}, image, (host-openmp)
```
Which will map to the following bindings
```
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["input.c"], output: "/tmp/input-bae62e.bc"
# "nvptx64" - "clang", inputs: ["input.c", "/tmp/input-bae62e.bc"], output: "/tmp/input-76784e.s"
# "nvptx64" - "NVPTX::Assembler", inputs: ["/tmp/input-76784e.s"], output: "/tmp/input-8f29db.o"
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["/tmp/input-bae62e.bc", "/tmp/input-8f29db.o"], output: "/tmp/input-545450.o"
# "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["/tmp/input-545450.o"], output: "a.out"
```
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116541
tyb0807 [Tue, 18 Jan 2022 22:34:48 +0000 (22:34 +0000)]
[AArch64][GlobalISel] CodeGen for Armv8.8/9.3 MOPS
This implements codegen for Armv8.8/9.3 Memory Operations extension (MOPS).
Any memcpy/memset/memmove intrinsics will always be emitted as a series
of three consecutive instructions P, M and E which perform the
operation. The SelectionDAG implementation is split into a separate
patch.
AArch64LegalizerInfo will now consider the following generic opcodes
if +mops is available, instead of legalising by expanding them to
libcalls: G_BZERO, G_MEMCPY_INLINE, G_MEMCPY, G_MEMMOVE, G_MEMSET
The s8 value of memset is legalised to s64 to match the pseudos.
AArch64O0PreLegalizerCombinerInfo will still be able to combine
G_MEMCPY_INLINE even if +mops is present, as it is unclear whether it is
better to generate fixed length copies or MOPS instructions for the
inline code of small or zero-sized memory operations, so we choose to be
conservative for now.
AArch64InstructionSelector will select the above as new pseudo
instructions: AArch64::MOPSMemory{Copy/Move/Set/SetTagging} These are
each expanded to a series of three instructions (e.g. SETP/SETM/SETE)
which must be emitted together during code emission to avoid scheduler
reordering.
This is part 3/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.
Patch by Tomas Matheson and Son Tuan Vu
Differential Revision: https://reviews.llvm.org/D117763
River Riddle [Mon, 31 Jan 2022 19:32:17 +0000 (11:32 -0800)]
[mlir:Standard][NFC] Remove the dead Arithmetic op classes from Ops.td
These were dead after the arithmetic operations moved from Standard to the Arithmetic dialect.
tyb0807 [Tue, 18 Jan 2022 19:24:11 +0000 (19:24 +0000)]
[AArch64] Modeling NZCV read/write for MOPS instructions
According to the specification, MOPS instructions define/use NZCV flags as
part of their semantics (see discussion in
https://reviews.llvm.org/D116157).
More specifically, the specification of the MOPS extension states that
each memcpy/memset/memmove operation will be performed by a series
of three MOPS instructions P, M and E. The P instruction writes to the
NZCV flags, while the others (M and E) read from the NZCV flags.
This is part 2/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.
Differential Revision: https://reviews.llvm.org/D117757
tyb0807 [Tue, 18 Jan 2022 14:12:03 +0000 (14:12 +0000)]
[AArch64] Support for memset tagged intrinsic
This introduces a new ACLE intrinsic for memset tagged
(https://github.com/ARM-software/acle/blob/next-release/main/acle.md#memcpy-family-of-operations-intrinsics---mops).
void *__builtin_arm_mops_memset_tag(void *, int, size_t)
A corresponding LLVM intrinsic is introduced:
i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64)
The types match llvm.memset but the return type is not void.
This is part 1/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.
Patch by Tomas Matheson
Differential Revision: https://reviews.llvm.org/D117753
Peter Klausler [Thu, 20 Jan 2022 21:37:58 +0000 (13:37 -0800)]
[flang] Correct interpretation of RECL=
When RECL= is set on OPEN(), ensure that it:
1) enforces a max output record payload size
(not including header+footer or newline), and
2) causes padding of short output records only
for ACCESS='DIRECT'
The previous code was causing some false overrun errors
and applying padding to sequential/stream output files.
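The two rules above can be sketched as a toy model. This is not the flang runtime: `emit_record`, `RecordOverrun`, the blank fill byte and the newline terminator are all hypothetical names and assumptions chosen for illustration.

```python
class RecordOverrun(Exception):
    """Raised when a record payload exceeds RECL= (hypothetical error type)."""


def emit_record(payload: bytes, recl: int, access: str, terminator: bytes = b"\n") -> bytes:
    # Rule 1: RECL= limits the payload only; framing such as a newline
    # is added after the check and does not count against the limit.
    if len(payload) > recl:
        raise RecordOverrun(f"payload of {len(payload)} bytes exceeds RECL={recl}")
    # Rule 2: only ACCESS='DIRECT' pads short records up to RECL.
    if access == "DIRECT":
        return payload.ljust(recl, b" ")  # blank fill is an assumption of this sketch
    # Sequential/stream output keeps the short record as written.
    return payload + terminator


direct = emit_record(b"abc", 5, "DIRECT")
sequential = emit_record(b"abcde", 5, "SEQUENTIAL")
```

Note that the sequential record is accepted even though payload plus newline is six bytes, which is the kind of case the previous code could flag as a false overrun.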
Differential Revision: https://reviews.llvm.org/D118630
Mircea Trofin [Mon, 31 Jan 2022 20:44:09 +0000 (12:44 -0800)]
[mlgo][regalloc][test] Add comprehensive log output testing
Sanjoy Das [Sun, 30 Jan 2022 02:34:48 +0000 (18:34 -0800)]
Remove `mutable` and stray comment
The `mutable` was added back when `scope` was a `DataLayoutOpInterface`.
Differential Revision: https://reviews.llvm.org/D118643
Martin Storsjö [Thu, 20 Jan 2022 11:46:49 +0000 (11:46 +0000)]
[libcxx] [Windows] Use the standard vsnprintf instead of _vsnprintf
In ancient Microsoft C runtimes, there might only have been
a nonstandard `_vsnprintf` instead of the standard `vsnprintf`, but
in modern versions (the only ones relevant for libc++), both
are available.
In MinGW configurations built with `__USE_MINGW_ANSI_STDIO=1` (as it
is built in CI), `vsnprintf` provides a more standards compliant
behaviour than what Microsoft's CRT provides, while `_vsnprintf` retains
the Microsoft C runtime specific quirks.
Differential Revision: https://reviews.llvm.org/D118187
Daniel McIntosh [Fri, 28 Jan 2022 19:18:56 +0000 (14:18 -0500)]
[docs] Update Prolog/Epilog Code Insertion docs to show it's still incomplete
Compact Unwind is a subsection, but that was lost in rGff9feeb520a32d076c3095468208ae116c428285
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D118499
Sam Clegg [Mon, 31 Jan 2022 20:19:54 +0000 (12:19 -0800)]
Revert "[WebAssembly] Refactor and fix emission of external IR global decls"
This reverts commit 00bf4755e90c89963a135739218ef49c2417109f.
This change broke the emscripten builder (among other things):
https://ci.chromium.org/ui/p/emscripten-releases/builders/try/linux/b8823500584349280721/overview
Sample failure:
```
test_unistd_unlink (test_core.core0) ...
wasm-ld: error: symbol type mismatch: __stdio_write
>>> defined as WASM_SYMBOL_TYPE_FUNCTION in /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/libc-debug.a(__stdio_write.o)
>>> defined as WASM_SYMBOL_TYPE_DATA in /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/libc-debug.a(stderr.o)
```
Joseph Huber [Mon, 31 Jan 2022 16:39:20 +0000 (11:39 -0500)]
[OpenMP][NFC] Change error message on offloading failure to mention documentation
This patch changes the error message to instead mention the
documentation page for the debugging options provided by libomptarget
and the bitcode runtimes. Add some extra information to the documentation to
help users more quickly identify debugging resources.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118626
Joseph Huber [Mon, 31 Jan 2022 16:47:19 +0000 (11:47 -0500)]
[Libomptarget] Reduce shared memory stack size to 512 and a message when it is exceeded
Reduces the shared memory size used for globalization to 512 bytes from
2048 to reduce the pressure on shared memory. This patch also adds a
debug message to indicate when the shared memory was insufficient.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118625
Sanjay Patel [Mon, 31 Jan 2022 19:19:05 +0000 (14:19 -0500)]
[x86] add tests for binop of select with identity constant; NFC
bakhtiyar [Mon, 31 Jan 2022 20:00:03 +0000 (12:00 -0800)]
[async] Get the number of worker threads from the runtime.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D117751
Adrian Prantl [Mon, 31 Jan 2022 19:57:18 +0000 (11:57 -0800)]
Work around a Clang modules build issue.
See:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/40636/consoleFull#-39956214149ba4694-19c4-4d7e-bec5-911270d8a58c
```
llvm/lib/Support/Valgrind.cpp:37:63: error: missing '#include <stddef.h>'; 'size_t' must be declared before it is used
void llvm::sys::ValgrindDiscardTranslations(const void *Addr, size_t Len) {
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/13.0.0/include/stddef.h:46:23: note: declaration here is not visible
typedef __SIZE_TYPE__ size_t;
^
1 error generated.
```
rdar://88049280
Florian Hahn [Mon, 31 Jan 2022 19:54:14 +0000 (19:54 +0000)]
[LV] Add additional complex first order recurrence test.
Add a new test case with 2 first-order recurrences, which share a user.
Louis Dionne [Mon, 31 Jan 2022 19:44:02 +0000 (14:44 -0500)]
[libc++][NFC] Mark a few issues and papers as implemented
Differential Revision: https://reviews.llvm.org/D118638
Eli Friedman [Mon, 31 Jan 2022 18:37:07 +0000 (10:37 -0800)]
[ScalarEvolution] Add bailout to avoid zext of pointer.
The RHS of an isImpliedCond call can be a pointer even if the LHS is
not. This is similar to bfa2a81e.
Not going to include a testcase; an IR testcase would be extremely
complicated and fragile.
Fixes https://github.com/llvm/llvm-project/issues/51936 .
Differential Revision: https://reviews.llvm.org/D114555
Paul Walker [Fri, 28 Jan 2022 13:25:08 +0000 (13:25 +0000)]
[SVE] By using SEL when orring predicates we forgo the need for a PTRUE.
Differential Revision: https://reviews.llvm.org/D118463
Chris Bieneman [Mon, 31 Jan 2022 19:31:46 +0000 (13:31 -0600)]
[NFC] Fix build when LLVM_DEFAULT_TARGET_TRIPLE=""
We do support building with a default target unspecified. This fixes
two small build issues that prevented LLVM's unit tests from building
and libSupport from building on Windows.
Ruslan Arutyunyan [Mon, 31 Jan 2022 18:40:16 +0000 (21:40 +0300)]
[libc++][pstl][NFC] Remove usage of std::result_of from Parallel STL
std::result_of creates problems when building with C++20 because it's
deprecated there.
The solution is to remove it and get return value type for a function
with decltype.
Substituting std::invoke_result for std::result_of is unnecessary because
we don't have std::invoke semantics within the function: we don't work
with pointers to members.
Reviewed by: ldionne, MikeDvorskiy, #libc
Differential Revision: https://reviews.llvm.org/D118457
Konstantin Varlamov [Mon, 31 Jan 2022 19:23:40 +0000 (11:23 -0800)]
[libc++][ranges][NFC] Add some missing links to the Ranges status page.
Arthur O'Dwyer [Wed, 26 Jan 2022 04:36:55 +0000 (23:36 -0500)]
[libc++] [ranges] ADL-proof ranges::iter_{swap,move}.
As discovered in D117817, `std::ranges::input_range<Holder<Incomplete>*[10]>`
hard-errored before this patch. That's because `input_range` requires
`iter_rvalue_reference_t`, which requires `iter_move`, which was
not ADL-proofed.
Add ADL-proofing tests to all the range refinements.
`output_range` and `common_range` shouldn't be affected,
and all the others subsume `input_range` anyway, but we might as
well be thorough.
Differential Revision: https://reviews.llvm.org/D118213
Alexey Bataev [Thu, 16 Dec 2021 16:55:52 +0000 (08:55 -0800)]
[SLP]Alternate vectorization for cmp instructions.
Added support for alternate ops vectorization of the cmp instructions.
It allows vectorizing either cmp instructions with the same/swapped
predicate but different (swapped) operand kinds, or cmp instructions
with different predicates and compatible operand kinds.
Differential Revision: https://reviews.llvm.org/D115955
Alexander Yermolovich [Mon, 31 Jan 2022 19:06:06 +0000 (11:06 -0800)]
[BOLT][DWARF] Handle shared abbrev section
We can have a scenario where multiple CUs share an abbrev table.
If we modify one CU but not another, the other CUs end up with an invalid abbrev section.
Example that caused it:
All of the CUs shared the same abbrev table. The first CU just had compile_unit and sub_program.
It was not modified. The next CU had DW_TAG_lexical_block with
DW_AT_low_pc/DW_AT_high_pc converted to DW_AT_low_pc/DW_AT_ranges.
We used the unmodified abbrev section for the first and subsequent CUs,
so when parsing subsequent CUs the debug info was corrupted.
In this patch we now duplicate all sections that are modified and are different.
This also means that if .debug_types is present and shares the abbrev table, which
it usually does, we can now have two abbrev tables: one for the CU that was modified,
and an unmodified one for the TU.
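The duplication idea can be sketched as a small copy-on-write model. This is not BOLT's implementation: `assign_abbrev_tables`, the unit names and the integer table ids are hypothetical, and a table is reduced to a tuple of attribute names.

```python
def assign_abbrev_tables(unit_to_table, tables, rewritten):
    """Give each rewritten unit its own table when its contents changed.

    unit_to_table: {unit name: table id}
    tables:        {table id: tuple of abbrev contents}
    rewritten:     {unit name: new tuple of abbrev contents}
    """
    new_tables = dict(tables)            # original tables stay for untouched units
    new_unit_to_table = dict(unit_to_table)
    next_id = max(tables) + 1
    emitted = {}                         # new contents -> table id, to avoid duplicates
    for unit, contents in rewritten.items():
        if contents == tables[unit_to_table[unit]]:
            continue                     # unchanged: keep sharing the original table
        if contents not in emitted:
            emitted[contents] = next_id
            new_tables[next_id] = contents
            next_id += 1
        new_unit_to_table[unit] = emitted[contents]
    return new_unit_to_table, new_tables


tables = {0: ("DW_AT_low_pc", "DW_AT_high_pc")}
unit_to_table = {"cu0": 0, "cu1": 0, "tu0": 0}
rewritten = {"cu1": ("DW_AT_low_pc", "DW_AT_ranges")}
mapping, result = assign_abbrev_tables(unit_to_table, tables, rewritten)
```

In this model the unmodified CU and the TU keep table 0, while the modified CU is pointed at a fresh copy, so two tables exist afterwards.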
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D118517
Paul Walker [Sun, 30 Jan 2022 15:24:23 +0000 (15:24 +0000)]
[SVE] Extend isel pattern coverage for INCP & DECP.
Adds patterns for:
add(x, cntp(p, p)) -> incp(x, p)
sub(x, cntp(p, p)) -> decp(x, p)
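A scalar toy model of why the rewrite is sound, assuming predicates are plain lists of booleans. The helper names mirror the instruction mnemonics but are only illustrative; element widths and the saturating forms are ignored.

```python
def cntp(governing, pred):
    """Count lanes that are active in `governing` and true in `pred`."""
    return sum(1 for g, p in zip(governing, pred) if g and p)


def incp(x, pred):
    """Increment x by the number of true lanes in `pred`."""
    return x + sum(1 for p in pred if p)


def decp(x, pred):
    """Decrement x by the number of true lanes in `pred`."""
    return x - sum(1 for p in pred if p)


pred = [True, False, True, True, False, False, True, False]
x = 100

# When the governing predicate is the predicate itself, cntp counts
# exactly the true lanes, so the add/sub collapses to incp/decp.
add_form = x + cntp(pred, pred)
sub_form = x - cntp(pred, pred)
```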
Differential Revision: https://reviews.llvm.org/D118567
Sanjoy Das [Sat, 29 Jan 2022 22:30:57 +0000 (14:30 -0800)]
Remove OpTrait, AttrTrait and TypeTrait
- Remove the `{Op,Attr,Type}Trait` TableGen classes and replace with `Trait`
- Rename `OpTraitList` to `TraitList` and use it in a few places
The bulk of this change is a mechanical s/OpTrait/Trait/ throughout the codebase.
Reviewed By: rriddle, jpienaar, herhut
Differential Revision: https://reviews.llvm.org/D118543
Ties Stuij [Mon, 31 Jan 2022 19:00:46 +0000 (19:00 +0000)]
Add info on PACBTI-M to the Clang release notes
Differential Revision: https://reviews.llvm.org/D118380
Jonas Devlieghere [Mon, 31 Jan 2022 18:28:51 +0000 (10:28 -0800)]
[lldb] Support Rosetta registers in crashlog.py
Rosetta crashlogs can have their own thread register state. Unlike the
other registers which are directly listed under "threadState", the
Rosetta registers are nested under their own key in the JSON, as
illustrated below:
{
  "threadState": {
    "rosetta": {
      "tmp2": { "value": 4935057216 },
      "tmp1": { "value": 4365863188 },
      "tmp0": { "value": 18446744073709551615 }
    }
  }
}
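A minimal sketch of how such a nested layout could be flattened, assuming the JSON shape shown above. The function name `collect_registers` and the directly listed `pc` entry are hypothetical and are not taken from crashlog.py.

```python
import json

# Sample shaped like the snippet above; "pc" is a hypothetical
# register listed directly under "threadState" for contrast.
SAMPLE = """
{
  "threadState": {
    "pc": {"value": 4096},
    "rosetta": {
      "tmp2": {"value": 4935057216},
      "tmp1": {"value": 4365863188},
      "tmp0": {"value": 18446744073709551615}
    }
  }
}
"""


def collect_registers(thread_state):
    """Return {name: value}, lifting entries nested under "rosetta" to the top level."""
    registers = {}
    for name, entry in thread_state.items():
        if name == "rosetta":
            # Rosetta registers live one level deeper, under their own key.
            registers.update(collect_registers(entry))
        elif isinstance(entry, dict) and "value" in entry:
            registers[name] = entry["value"]
    return registers


registers = collect_registers(json.loads(SAMPLE)["threadState"])
for name, value in sorted(registers.items()):
    print(f"{name} = {value:#018x}")
```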
Jon Chesterfield [Mon, 31 Jan 2022 18:43:03 +0000 (18:43 +0000)]
[openmp] Delete rpath test, too expensive to get it working across platforms
Christian Sigg [Mon, 31 Jan 2022 13:07:25 +0000 (14:07 +0100)]
[MLIR][arith] More float op folders
Fold `arith.fadd %x, -0.0 -> %x` and similarly for `fsub`, `fmul`, `fdiv`.
Fold `arith.fmin %x, %x -> %x`, `arith.fmin %x, +inf -> %x` and similarly for `fmax`.
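A short check, in plain IEEE 754 doubles rather than MLIR, of why the `fadd` fold uses -0.0 and not +0.0 as the identity. The helper `same_float` is a hypothetical name introduced for this sketch.

```python
import math


def same_float(a, b):
    """Equality that distinguishes +0.0 from -0.0 and treats NaN as matching NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -2.25, math.inf, -math.inf, math.nan]

# x + (-0.0) gives back x for every sample, including both signed zeros.
neg_zero_is_identity = all(same_float(x + -0.0, x) for x in samples)

# x + (+0.0) fails for x == -0.0, because -0.0 + 0.0 rounds to +0.0.
pos_zero_is_identity = all(same_float(x + 0.0, x) for x in samples)
```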
Reviewed By: pifon2a, mehdi_amini, bondhugula
Differential Revision: https://reviews.llvm.org/D118244
Florian Hahn [Mon, 31 Jan 2022 18:20:46 +0000 (18:20 +0000)]
[AArch64] Bail out for float operands in SetCC optimization.
The optimization added in D118139 causes a crash on the added test case
while trying to zero extend a vector of floats.
Fix the crash by bailing out for floating point operands.
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D118615