Guozhi Wei [Thu, 14 Jul 2022 17:04:44 +0000 (17:04 +0000)]
[MachineCombiner] Don't compute the latency of transient instructions
If an MI will not generate a target instruction, we should not compute its
latency. This lets us compute a more precise instruction sequence cost and get
better results.
Differential Revision: https://reviews.llvm.org/D129615
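A minimal sketch of the idea in the MachineCombiner change above, assuming a simple sum over the sequence. `Instr` and `sequenceLatency` are hypothetical stand-ins for illustration, not the LLVM API, and the real cost model is more involved than a plain sum.

```cpp
#include <vector>

// Hypothetical stand-in for a machine instruction; not llvm::MachineInstr.
struct Instr {
  unsigned Latency;  // latency a scheduling model would report
  bool IsTransient;  // true if no target instruction will be generated
};

// Sum latencies along a sequence, skipping transient instructions so they
// do not inflate the estimated cost.
unsigned sequenceLatency(const std::vector<Instr> &Seq) {
  unsigned Total = 0;
  for (const Instr &I : Seq)
    if (!I.IsTransient)
      Total += I.Latency;
  return Total;
}
```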
Thomas Raoux [Thu, 14 Jul 2022 15:34:22 +0000 (15:34 +0000)]
[mlir][vector] Pattern to clean up vector.extract during distribution
This prevents blocking propagation when converting between scalar and
vector<1>
Differential Revision: https://reviews.llvm.org/D129782
Craig Topper [Thu, 14 Jul 2022 17:03:58 +0000 (10:03 -0700)]
[SimplifyIndVar] Use enum class for ExtendKind. NFC
I happened to notice two places where the enum was being passed
directly to the bool IsSigned argument of createExtendInst. This
was functionally ok since SignExtended in the enum has value
of 1, but the code shouldn't rely on that.
Using an enum class prevents the enum from being convertible to bool,
but does make writing the enum values more verbose. Since we now
have to write ExtendKind:: in front of them, I've shortened the
names of ZeroExtended and SignExtended.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129733
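To illustrate why the enum class in the SimplifyIndVar change above helps, here is a small sketch. The enumerator names `Zero` and `Sign` and the helper functions are assumptions made for illustration; they are not copied from the patch.

```cpp
// Hypothetical sketch; names are illustrative only.
enum class ExtendKind { Zero, Sign };

// Stand-in for a function that takes a `bool IsSigned` argument.
bool extendIsSigned(bool IsSigned) { return IsSigned; }

bool extendIsSignedFor(ExtendKind Kind) {
  // `extendIsSigned(Kind)` would not compile: a scoped enum does not
  // convert implicitly to bool, so the comparison must be spelled out.
  return extendIsSigned(Kind == ExtendKind::Sign);
}
```

With an unscoped enum the direct call would compile and silently depend on the enumerator's numeric value, which is the pattern the commit removes.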
Nick Desaulniers [Thu, 14 Jul 2022 16:49:06 +0000 (09:49 -0700)]
[clang][test] fix typo in fn attr
While testing backports of
https://reviews.llvm.org/D129572#inline-1245936
commit 2240d72f15f3 ("[X86] initial -mfunction-return=thunk-extern support")
I noticed that one of my unit tests mistyped a function attribute. The
unit test was intended to test fn attr merging behavior, but with the
typo it was not. Small fixup.
Reviewed By: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D129691
Erich Keane [Thu, 14 Jul 2022 15:50:21 +0000 (08:50 -0700)]
[NFC] Move check for isEqualityOp to CheckFloatComparisons
So callers don't have to. Also, apply a clang-format/use-of-auto fix in
CheckFloatComparisons.
Florian Hahn [Thu, 14 Jul 2022 16:23:47 +0000 (09:23 -0700)]
[SCEV] Avoid creating unnecessary SCEVs for SelectInsts.
After 675080a4533b, we always create SCEVs for all operands of a
SelectInst. This can cause notable compile-time regressions compared to
the recursive algorithm, which only evaluates the operands if the select
is in a form we can create a usable expression.
This approach adds additional logic to getOperandsToCreate to only
queue operands for selects if we will later be able to construct a
usable SCEV.
Unfortunately this introduces a bit of coupling between actual SCEV
construction for selects and getOperandsToCreate, but I am not sure if
there are better alternatives to address the regression mentioned for
675080a4533b.
This doesn't have any notable compile-time impact on CTMark.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129731
Fraser Cormack [Wed, 13 Jul 2022 14:48:40 +0000 (15:48 +0100)]
[RISCV] Disable subregister liveness by default
We previously enabled subregister liveness by default when compiling
with RVV. This has been shown to cause miscompilations where RVV
register operand constraints are not met. A test was added for this in
D129639 which explains the issue in more detail.
Until this issue is fixed in some way, we should not be enabling
subregister liveness unless the user asks for it.
Reviewed By: craig.topper, rogfer01, kito-cheng
Differential Revision: https://reviews.llvm.org/D129646
Shilei Tian [Thu, 14 Jul 2022 16:06:43 +0000 (12:06 -0400)]
[OpenMP] Ignore .eggs file in OpenMP
The OMPD patches introduce a GDB plugin. When it is built, it creates a
couple of temp files in `.eggs`. This patch adds `.eggs` to `.gitignore` so that
these files do not interfere with git tracking.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D129711
Philip Reames [Thu, 14 Jul 2022 15:50:44 +0000 (08:50 -0700)]
[SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case
Motivation here is to unblock LSR's ability to use ICmpZero uses - the major effect of which is to enable count down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all.
SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values.
vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes.
(*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of every udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference.
Differential Revision: https://reviews.llvm.org/D129710
Dmitry Vyukov [Thu, 14 Jul 2022 14:58:07 +0000 (16:58 +0200)]
tsan: fix a bug in trace part switching
Callers of TraceSwitchPart expect that TraceAcquire will always succeed
after the call. It's possible that the TryTraceFunc/TraceMutexLock calls in
TraceSwitchPart that restore the current stack/mutexset fill the trace part
exactly up to the TracePart::kAlignment gap, so the next TraceAcquire won't succeed.
Skip the alignment gap after writing initial stack/mutexset to avoid that.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D129777
Brendon Cahoon [Thu, 14 Jul 2022 14:47:16 +0000 (09:47 -0500)]
Revert "[UnifyLoopExits] Reduce number of guard blocks"
This reverts commit e13248ab0e79b59d5e5ac73e2fe57d82ce485ce1.
Need to revert because the transformation cannot occur for basic
blocks that contain convergent instructions.
Dawid Jurczak [Thu, 14 Jul 2022 15:20:21 +0000 (17:20 +0200)]
[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>
This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses one of the comments
from that review: https://reviews.llvm.org/D129468#3643295
Differential Revision: https://reviews.llvm.org/D129565
Warren Ristow [Thu, 14 Jul 2022 15:21:04 +0000 (08:21 -0700)]
[Reassociate] Cleanup minor missed optimizations
In analyzing issue #56483, it was noticed that running `opt` with
`-reassociate` was missing some minor optimizations. For example,
there were cases where running `opt` on IR with floating-point
instructions that have the `fast` flags applied, sometimes resulted in
less efficient code than the input IR (things like dead instructions
left behind, and missed reassociations). These were sometimes noted
in the test-files with TODOs, to investigate further. This commit
fixes some of these problems, removing some TODOs in the process.
FTR, I refer to these as "minor" missed optimizations, because when
running a full clang/llvm compilation, these inefficiencies are not
happening, as other passes clean that residue up. Regardless, having
cleaner IR produced by `opt`, makes assessing the quality of fixes done
in `opt` easier.
Andy Yankovsky [Mon, 4 Jul 2022 18:17:51 +0000 (18:17 +0000)]
[lldb] Add support for using integral const static data members in the expression evaluator
This adds support for using const static integral data members as described by C++11 [class.static.data]p3
to LLDB's expression evaluator.
So far LLDB treated these data members as normal static variables. They already work as intended when they are declared in the class definition and then defined in a namespace scope. However, if they are declared and initialised in the class definition but never defined in a namespace scope, all LLDB expressions that use them will fail to link when LLDB can't find the respective symbol for the variable.
The reason for this is that the data members which are only declared in the class are not emitted into any object file so LLDB can never resolve them. Expressions that use these variables are expected to directly use their constant value if possible. Clang can do this for us during codegen, but it requires that we add the constant value to the VarDecl we generate for these data members.
This patch implements this by:
* parsing the constant values from the debug info and adding them to the variable declarations we encounter.
* ensuring that LLDB doesn't implicitly try to take the address of expressions that might be an lvalue that points to such a special data member.
The second change is caused by LLDB's way of storing lvalues in the expression parser. When LLDB parses an expression, it tries to keep the result around via two mechanisms:
1. For lvalues, LLDB generates a static pointer variable and stores the address of the last expression in it: `T *$__lldb_expr_result_ptr = &LastExpression`
2. For everything else, LLDB generates a static variable of the same type as the last expression and then direct initialises that variable: `T $__lldb_expr_result(LastExpression)`
If we try to print a special const static data member via something like `expr Class::Member`, then LLDB will try to take the address of this expression as it's an lvalue. This means LLDB will try to take the address of the variable, which prevents Clang from replacing the use with the constant value. There isn't any good way to detect this case (as there are a lot of different expressions that could yield an lvalue that points to such a data member), so this patch also changes the capture logic so that we only use the first way of capturing the result if the last expression does not have a type that could potentially indicate it's coming from such a special data member.
This change shouldn't break most workflows for users. The only observable side effect I could find is that the implicit persistent result variables for const int's now have their own memory address:
Before this change:
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x00007ffeefbff8e8
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```
After this change we capture `i` by value, so the result variable has its own address:
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x0000000100155320
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```
Reviewed By: Michael137
Differential Revision: https://reviews.llvm.org/D81471
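For readers unfamiliar with the kind of data member the LLDB change above targets, a small sketch follows. `Config`, `MaxRetries`, and `readMaxRetries` are made-up names for illustration.

```cpp
// A const static integral data member that is declared and initialised in
// the class definition but never defined at namespace scope.
struct Config {
  static const int MaxRetries = 123;
};

// Reading the value is fine: the compiler can use the constant directly,
// so no symbol for Config::MaxRetries is needed at link time.
int readMaxRetries() { return Config::MaxRetries; }

// Taking the address (for example `const int *P = &Config::MaxRetries;`)
// would odr-use the member and require a namespace-scope definition. That
// is the situation the expression evaluator now avoids creating implicitly.
```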
Nikolas Klauser [Fri, 24 Jun 2022 23:40:56 +0000 (01:40 +0200)]
[libc++] Test the size of basic_string
Reviewed By: ldionne, #libc
Spies: hubert.reinterpretcast, arichardson, mstorsjo, libcxx-commits
Differential Revision: https://reviews.llvm.org/D127672
Nikita Popov [Thu, 14 Jul 2022 14:24:37 +0000 (16:24 +0200)]
[Bitcode] Report metadata decoding error more gracefully
Brendon Cahoon [Thu, 14 Jul 2022 14:05:50 +0000 (09:05 -0500)]
Revert "[StructurizeCFG] Improve basic block ordering"
This reverts commit f1b05a0a2bbbea160002be709f8a1c59de366761.
Need to revert due to issues identified with testing. The
transformation is incorrect for blocks that contain convergent
instructions.
Thomas Raoux [Thu, 14 Jul 2022 02:15:22 +0000 (02:15 +0000)]
[mlir][vector] Support distribution of vector.reduce with accumulator
Until now, the pattern ignored the optional accumulator.
Differential Revision: https://reviews.llvm.org/D129719
Jeff Bailey [Wed, 13 Jul 2022 06:00:25 +0000 (06:00 +0000)]
Add support for three more string_view functions
1) starts_with(char)
2) ends_with(char)
3) find_first_of(char, size_t)
Reimplemented trim in terms of the new starts_with and ends_with.
Tested:
New unit tests.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D129618
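The three functions listed in the string_view commit above, and a trim built on the first two, can be sketched roughly as follows. This version uses free functions over `std::string_view` purely for illustration; the function names, the `trim` signature, and the choice of a single trim character are assumptions, not the code from the patch.

```cpp
#include <cstddef>
#include <string_view>

// Sketch only: free functions standing in for the new member functions.
bool startsWith(std::string_view S, char C) {
  return !S.empty() && S.front() == C;
}

bool endsWith(std::string_view S, char C) {
  return !S.empty() && S.back() == C;
}

// Index of the first occurrence of C at or after From, or npos if none.
std::size_t findFirstOf(std::string_view S, char C, std::size_t From = 0) {
  for (std::size_t I = From; I < S.size(); ++I)
    if (S[I] == C)
      return I;
  return std::string_view::npos;
}

// Strip a given character from both ends using the two predicates above.
std::string_view trim(std::string_view S, char C) {
  while (startsWith(S, C))
    S.remove_prefix(1);
  while (endsWith(S, C))
    S.remove_suffix(1);
  return S;
}
```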
Ella Ma [Thu, 14 Jul 2022 07:54:40 +0000 (15:54 +0800)]
[analyzer] Fixing SVal::getType returns Null Type for NonLoc::ConcreteInt in boolean type
In method `TypeRetrievingVisitor::VisitConcreteInt`, `ASTContext::getIntTypeForBitwidth` is used to get the type for `ConcreteInt`s.
However, the getter in ASTContext cannot handle the boolean type with the bit width of 1, which will make method `SVal::getType` return a Null `Type`.
In this patch, a check for this case is added to fix this problem by returning the bool type directly when the bit width is 1.
Differential Revision: https://reviews.llvm.org/D129737
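The shape of the analyzer fix above can be sketched with a toy lookup. `intTypeForBitWidth` here is a hypothetical stand-in that only models "fails for width 1"; it is not `ASTContext::getIntTypeForBitwidth`, and the type names are plain strings for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for an integer-type lookup by bit width. It knows
// only a few common widths and fails for anything else, including 1.
std::optional<std::string> intTypeForBitWidth(unsigned Bits) {
  switch (Bits) {
  case 8:  return "char";
  case 16: return "short";
  case 32: return "int";
  case 64: return "long long";
  default: return std::nullopt;
  }
}

// Handle the 1-bit case up front so a boolean value always gets a type.
std::optional<std::string> typeForConcreteInt(unsigned Bits) {
  if (Bits == 1)
    return "bool";
  return intTypeForBitWidth(Bits);
}
```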
Matthias Springer [Thu, 14 Jul 2022 08:15:09 +0000 (10:15 +0200)]
[mlir][linalg][NFC] Cleanup: Drop linalg.inplaceable attribute
bufferization.writable is used in most cases instead. All remaining test cases are updated. Some code that is no longer needed is deleted.
Differential Revision: https://reviews.llvm.org/D129739
Adam Czachorowski [Mon, 11 Jul 2022 15:29:12 +0000 (17:29 +0200)]
[clang] Do not crash on "requires" after a fatal error occurred.
The code would assume that SubstExpr() cannot fail on concept
specialization. This is incorrect - we give up on some things after a fatal
error has occurred, since there's no value in doing further work that the
user will not see anyway. In this case, this led to a crash.
The fatal error is simulated in tests with -ferror-limit=1, but this
could happen in other cases too.
Fixes https://github.com/llvm/llvm-project/issues/55401
Differential Revision: https://reviews.llvm.org/D129499
Michał Górny [Wed, 13 Jul 2022 15:33:28 +0000 (17:33 +0200)]
[lldb] [llgs] Convert m_debugged_processes into a map of structs
Convert the m_debugged_processes map from NativeProcessProtocol pointers
to structs, and combine the additional set(s) holding the additional
process properties into a flag field inside this struct. This is
desirable since there are more properties to come and having a single
structure with all information should be cleaner and more efficient than
using multiple sets for that.
Suggested by Pavel Labath in D128893.
Differential Revision: https://reviews.llvm.org/D129652
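A rough sketch of the data-structure change described in the llgs commit above: one map of structs with a flag field instead of a map of pointers plus separate sets. All type, field, and flag names below are hypothetical and chosen for illustration.

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for the process object owned by the server.
struct Process {
  int Pid;
};

// Per-process properties folded into one flag field (illustrative names).
enum ProcessFlags : std::uint8_t {
  PropertyA = 1u << 0,
  PropertyB = 1u << 1,
};

struct DebuggedProcess {
  std::unique_ptr<Process> ProcessUp;
  std::uint8_t Flags = 0;
};

using DebuggedProcessMap = std::map<int, DebuggedProcess>;

// One lookup answers a question that previously needed the map and a set.
bool hasFlag(const DebuggedProcessMap &Map, int Pid, std::uint8_t Flag) {
  auto It = Map.find(Pid);
  return It != Map.end() && (It->second.Flags & Flag) != 0;
}
```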
Alina Sbirlea [Mon, 13 Jun 2022 22:09:18 +0000 (15:09 -0700)]
Turn on flag to not re-run simplification pipeline.
This patch turns on the flag `-enable-no-rerun-simplification-pipeline`, which means the simplification pipeline will not be rerun on unchanged functions in the CGSCCPass Manager.
Compile time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=17457be1c393ff691cca032b04ea1698fedf0301&to=882301ebb893c8ef9f09fe1ea871f7995426fa07&stat=instructions
No meaningful run time regressions observed in the llvm test suite and
in additional internal workloads at this time.
The example test in `test/Other/no-rerun-function-simplification-pipeline.ll` is a good means to understand the effect of this change:
```
define void @f1(void()* %p) alwaysinline {
call void %p()
ret void
}
define void @f2() #0 {
call void @f1(void()* @f2)
call void @f3()
ret void
}
define void @f3() #0 {
call void @f2()
ret void
}
```
There are two SCCs formed by the ModuleToPostOrderCGSCCAdaptor: (f1) and (f2, f3).
The pass manager runs on the first SCC, leading to running the simplification pipeline (function and loop passes) on f1. With the flag on, after this, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f1`.
Next, the pass manager runs on the second SCC: (f2, f3). Since f1() was inlined, f2() now calls itself, and also calls f3(), while f3() only calls f2().
So the pass manager for the SCC first runs the Inliner on (f2, f3), then the simplification pipeline on f2.
With the flag on, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f2`; unless the inliner makes a change, this analysis remains preserved, which means there's no reason to rerun the simplification pipeline. With the flag off, there is a second run of the simplification pipeline on f2.
Next, the same flow occurs for f3. The simplification pipeline is run on f3 a single time with the flag on, along with `ShouldNotRunFunctionPassesAnalysis on f3`, and twice with the flag off.
The reruns occur only on f2 and f3 due to the additional ref edges.
Nikolas Klauser [Thu, 14 Jul 2022 13:04:36 +0000 (15:04 +0200)]
[libc++] Allow setting _LIBCPP_OVERRIDABLE_FUNC_VIS
Chromium changes this flag to be able to use a custom new/delete from a
dylib.
Nimish Mishra [Thu, 14 Jul 2022 12:54:57 +0000 (18:24 +0530)]
[flang][OpenMP] Added semantic checks for hint clause
This patch improves semantic checks for hint clause.
It checks "hint-expression is a constant expression
that evaluates to a scalar value with kind
`omp_sync_hint_kind` and a value that is a valid
synchronization hint."
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D127615
Nimish Mishra [Thu, 14 Jul 2022 12:50:28 +0000 (18:20 +0530)]
[flang][OpenMP] Lowering support for atomic update construct
This patch adds lowering support for atomic update construct. A region
is associated with every `omp.atomic.update` operation wherein resides:
(1) the evaluation of the expression on the RHS of the atomic assignment
statement, and (2) a `omp.yield` operation that yields the extended value
of expression evaluated in (1).
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D125668
Nikita Popov [Thu, 14 Jul 2022 12:49:07 +0000 (14:49 +0200)]
[LoopPredication] Use isSafeToExpandAt() member function (NFC)
As a followup to D129630, this switches a usage of the freestanding
function in LoopPredication to use the member variant instead. This
was the last use of the freestanding function, so drop it entirely.
Nikita Popov [Wed, 13 Jul 2022 10:18:40 +0000 (12:18 +0200)]
[SCEVExpander] Make CanonicalMode handling in isSafeToExpand() more robust (PR50506)
isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.
Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.
Fixes https://github.com/llvm/llvm-project/issues/50506.
Differential Revision: https://reviews.llvm.org/D129630
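The API shape described in the isSafeToExpand() commit above (a required mode argument on the free function, plus a member that supplies the mode itself) can be sketched as follows. The types, the name `isSafeToExpandImpl`, and the rule inside it are placeholders for illustration, not the LLVM implementation.

```cpp
// Simplified stand-in for a SCEV expression.
struct Expr {
  bool IsAddRec;
};

// Free function: the mode has no default value, so every caller must state
// it explicitly. The rule in the body is a placeholder that only models
// "the answer for addrecs depends on the mode".
bool isSafeToExpandImpl(const Expr &E, bool CanonicalMode) {
  return !E.IsAddRec || CanonicalMode;
}

class Expander {
  bool CanonicalMode;

public:
  explicit Expander(bool CanonicalMode) : CanonicalMode(CanonicalMode) {}

  // Member variant: forwards the expander's own mode, so the mode used for
  // the safety check cannot differ from the mode used for expansion.
  bool isSafeToExpand(const Expr &E) const {
    return isSafeToExpandImpl(E, CanonicalMode);
  }
};
```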
Namhyung Kim [Thu, 14 Jul 2022 06:58:38 +0000 (07:58 +0100)]
[llvm-objdump] Create fake sections for an ELF core file
The Linux perf tools use /proc/kcore for disassembling kernel functions.
Actually, perf copies the relevant parts to a temp file and then passes it to
objdump. But the file doesn't have section headers, so llvm-objdump cannot
handle it.
Let's create fake section headers for the program headers. It'd have a
single section for each segment to cover the entire range. And for this
purpose we can consider only executable code segments.
With this change, I can see the following command shows proper outputs.
perf annotate --stdio --objdump=/path/to/llvm-objdump
Differential Revision: https://reviews.llvm.org/D128705
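The approach in the llvm-objdump commit above (one fake section per executable segment, covering the segment's whole range) might look roughly like this. `Segment` and `FakeSection` are simplified, hypothetical structs; real ELF program and section headers carry more fields.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for an ELF program header.
struct Segment {
  bool IsExecutable;
  std::uint64_t VAddr, Offset, FileSize;
};

// Simplified stand-in for a synthesized section header.
struct FakeSection {
  std::uint64_t Addr, Offset, Size;
};

// Create one section per executable segment, spanning the entire segment.
std::vector<FakeSection> makeFakeSections(const std::vector<Segment> &Segs) {
  std::vector<FakeSection> Out;
  for (const Segment &S : Segs)
    if (S.IsExecutable)
      Out.push_back({S.VAddr, S.Offset, S.FileSize});
  return Out;
}
```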
Nicolas Vasilache [Wed, 13 Jul 2022 15:09:38 +0000 (08:09 -0700)]
[mlir][Linalg] Retire LinalgPromotion pattern
This revision removes the LinalgPromotion pattern and adds a `transform.structured.promotion` op.
Since the LinalgPromotion transform allows the injection of arbitrary C++ via lambdas, the current
transform op does not handle it.
It is left for future work to decide what the right transform op control is for those cases.
Note the underlying implementation remains unchanged and the mechanism is still controllable by
lambdas from the API.
During this refactoring it was also determined that the `dynamicBuffers` option does not actually
connect to a change of behavior in the algorithm.
This also shows that the related test is wrong (and dangerous).
Both the option and the test are therefore removed.
Lastly, a test that connects patterns using the filter-based mechanism is removed: all the independent
pieces are already tested separately.
Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785
Differential Revision: https://reviews.llvm.org/D129649
Muhammad Usman Shahid [Thu, 14 Jul 2022 11:44:51 +0000 (07:44 -0400)]
Rewording "static_assert" diagnostics
This patch rewords the static assert diagnostic output. Failing a
_Static_assert in C should not report that static_assert failed. This
changes the wording to be more like GCC and uses "static assertion"
when possible instead of hard coding the name. This also changes some
instances of 'static_assert' to instead be based on the token in the
source code.
Differential Revision: https://reviews.llvm.org/D129048
zhongyunde [Thu, 14 Jul 2022 11:40:49 +0000 (19:40 +0800)]
[IndVars] Eliminate redundant type cast between unsigned integer and float
Extend to unsigned integers, according to the comment on D129191.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129358
Aaron Puchert [Thu, 14 Jul 2022 11:36:35 +0000 (13:36 +0200)]
Thread safety analysis: Don't erase TIL_Opcode type (NFC)
This is mainly for debugging, but it also eliminates some casts.
Aaron Puchert [Thu, 14 Jul 2022 11:36:11 +0000 (13:36 +0200)]
Thread safety analysis: Support builtin pointer-to-member operators
We consider an access to x.*pm as access of the same kind into x, and
an access to px->*pm as access of the same kind into *px. Previously we
missed reads and writes in the .* case, and operations to the pointed-to
data for ->* (we didn't miss accesses to the pointer itself, because
that requires an LValueToRValue cast that we treat independently).
We added support for overloaded operator->* in D124966.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129514
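For reference, the two builtin pointer-to-member forms mentioned in the thread safety commit above look like this in plain C++. The `Account` type and helper names are made up for illustration, and the thread safety annotations themselves are left out so the sketch builds with any compiler.

```cpp
struct Account {
  int Balance = 0; // imagine this member guarded by a mutex
};

int readViaObject(Account &A, int Account::*PM) {
  return A.*PM; // x.*pm: considered a read into A
}

void writeViaPointer(Account *PA, int Account::*PM, int V) {
  PA->*PM = V; // px->*pm: considered a write into *PA
}
```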
Sunho Kim [Thu, 14 Jul 2022 11:14:22 +0000 (20:14 +0900)]
[JITLink] Silence unused variable warning. (NFC)
LLVM GN Syncbot [Thu, 14 Jul 2022 11:06:28 +0000 (11:06 +0000)]
[gn build] Port 3e9cc543f223
gbreynoo [Thu, 14 Jul 2022 11:04:38 +0000 (12:04 +0100)]
Revert "[llvm-ar][test] Add testing for bitcode file handling"
This reverts commit 264b9a4885e6f1beac3de72ee55c15dc78981927.
Due to build bot test failure.
Simon Moll [Thu, 14 Jul 2022 10:36:22 +0000 (12:36 +0200)]
[VP] Add test to show optimization opportunities
Add vp.add test cases that can be optimized with D92086 to show the
potential of generalized pattern rewriting.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D129746
Michał Górny [Thu, 14 Jul 2022 09:34:36 +0000 (11:34 +0200)]
[lldb] [gdb-remote] Remove stray GetSupportsThreadSuffix() method (NFC)
Remove stray GDBRemoteCommunicationClient::GetSupportsThreadSuffix()
method that is neither implemented nor used anywhere.
gbreynoo [Thu, 14 Jul 2022 09:48:52 +0000 (10:48 +0100)]
[llvm-ar][test] Add testing for bitcode file handling
This change adds testing for handling of bitcode files in archives,
particularly the creation of symbol tables and through MRI scripts.
Although there is some testing of bitcode handling in the archive
library testing, this was not covered.
Differential Revision: https://reviews.llvm.org/D129088
Cullen Rhodes [Thu, 14 Jul 2022 09:46:23 +0000 (09:46 +0000)]
Revert "[ORC] Add a shared-memory based orc::MemoryMapper."
This reverts commit 5acd471698849d9e322a29e6ca08791e8d447b7b.
Breaks shared library build with:
ld.lld-12: error: undefined symbol: shm_open
>>> referenced by ExecutorSharedMemoryMapperService.cpp:68 (/home/culrho01/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp:68)
>>>               lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/ExecutorSharedMemoryMapperService.cpp.o:(llvm::orc::rt_bootstrap::ExecutorSharedMemoryMapperService::reserve[abi:cxx11](unsigned long))
>>> did you mean: sem_open
>>> defined in: /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../aarch64-linux-gnu/libpthread.so
Cullen Rhodes [Thu, 14 Jul 2022 09:46:11 +0000 (09:46 +0000)]
Revert "[ORC] Fix compilation on mingw"
This reverts commit 46b1a7c5f9e6841016078d32728bb0d205336df5.
Parent commit breaks shared library build, reverting both commits.
Jay Foad [Wed, 13 Jul 2022 16:17:35 +0000 (17:17 +0100)]
[AMDGPU] Update LiveVariables after killing an immediate def
D114999 added code to kill an immediate def if it was folded into its
only use by convertToThreeAddress. This patch updates LiveVariables when
that happens in order to fix verification failures exposed by D129213.
Differential Revision: https://reviews.llvm.org/D129661
Nikita Popov [Thu, 14 Jul 2022 09:45:35 +0000 (11:45 +0200)]
[IndVars] Make sure header phi simplification preserves LCSSA form
When simplifying instructions, make sure that the replacement
preserves LCSSA form. This fixes the issue reported at:
https://reviews.llvm.org/D129293#3650851
Fraser Cormack [Wed, 13 Jul 2022 13:21:48 +0000 (14:21 +0100)]
[RISCV] Add a test showing a miscompilation with subreg liveness
This patch adds a test which shows that we may incorrectly register
allocate for RVV instructions which have no-overlap constraints on
source/dest registers of different LMUL groups.
The particular case shows that a vrgatherei16 instruction writes to a
LMUL=1 register group v11 and reads from an EMUL=2 register group
v10/v11. This breaks the overlap constraints of the vrgatherei16
instruction.
The test also shows that disabling subregister liveness fixes the test.
We use `early-clobber` on the `VR` dest and the `VRM2` source to enforce
the constraint but with subregister liveness this constraint is not met.
It's unclear to me at this point whether this is by design of
early-clobber in conjunction with subregisters (meaning we should find
another way of expressing this constraint) or whether it's a bug in the
register allocator somewhere.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D129639
Cullen Rhodes [Wed, 13 Jul 2022 10:29:02 +0000 (10:29 +0000)]
[AArch64][NFC] Drop 'V' from ASIMD FP convert, other, D/Q-form regex
In the Cortex A57 Optimization Guide [1] VCVTAU (AArch32) is incorrectly
listed in the AArch64 instructions for instruction groups:
- ASIMD FP convert, other, D-form
- ASIMD FP convert, other, Q-form
It's meant to be FCVTAU, this will be fixed in future releases of the guide.
[1] https://developer.arm.com/documentation/uan0015/b
Cullen Rhodes [Thu, 14 Jul 2022 09:01:08 +0000 (09:01 +0000)]
[NFC][SVE] Add tests for zext(cmpeq(x, splat(0)))
In preparation for follow up patch folding above to CNOT.
Reviewed By: paulwalker-arm, peterwaller-arm
Differential Revision: https://reviews.llvm.org/D129625
Weining Lu [Wed, 13 Jul 2022 08:03:35 +0000 (16:03 +0800)]
[LoongArch] Implement OR combination to generate bstrins.w/d
Differential Revision: https://reviews.llvm.org/D129357
Ingo Müller [Mon, 9 May 2022 15:05:47 +0000 (15:05 +0000)]
[mlir][doc] Fix usage of PatternApplicator.
PatternApplicator doesn't have a constructor that
accepts only a `RewritePatternSet` as currently used in the example
code in PatternRewriter.md. Instead, one has to turn it into a
`FrozenRewritePatternSet`.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D125236
Martin Storsjö [Thu, 14 Jul 2022 09:01:47 +0000 (12:01 +0300)]
[ORC] Fix compilation on mingw
Explicitly call the -W suffixed API functions when passing wchar based
strings.
Nikita Popov [Thu, 14 Jul 2022 08:52:19 +0000 (10:52 +0200)]
[SCCP] Make check for unknown/undef in unary op handling more explicit (NFCI)
Make the implementation more similar to other functions, by
explicitly skipping an unknown/undef first, and always falling
back to overdefined at the end. I don't think it makes a difference
now, but could make one once the constant evaluation can fail. In
that case we would directly mark the result as overdefined now,
rather than keeping it unknown (and later making it overdefined
because we think it's undef-based).
David Green [Thu, 14 Jul 2022 08:33:28 +0000 (09:33 +0100)]
[CodeGen] Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. This allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Recommitted with some fixes for the leftover MCII variables in release
builds.
Differential Revision: https://reviews.llvm.org/D129506
Fangrui Song [Thu, 14 Jul 2022 08:28:28 +0000 (01:28 -0700)]
[CommandLine] --help: print "-o <xxx>" instead of "-o=<xxx>"
Accepting -o= is a quirk of CommandLine. For --help, we should print the
conventional "-o <xxx>".
Amara Emerson [Thu, 14 Jul 2022 08:11:15 +0000 (01:11 -0700)]
Revert "[llvm] add zstd to llvm::compression namespace"
This reverts commit d449c600767284486615f3b79601ced15a00af61.
Breaks macOS builds with this:
llvm/lib/Support/Compression.cpp:24:10: fatal error: 'zstd.h' file not found
Nikita Popov [Wed, 22 Jun 2022 09:27:58 +0000 (11:27 +0200)]
[SCCP] Don't check for UndefValue before calling markConstant()
The value lattice explicitly represents undef, and markConstant()
internally checks for UndefValue and will create an undef rather
than constant lattice element in that case.
This is mostly a code simplification, it has little practical impact
because we usually get undef results from undef operands, and those
don't get processed.
Only leave the check behind for the CmpInst case, because it
currently goes through this incorrect code in the getCompare()
implementation: https://github.com/llvm/llvm-project/blob/f98697642cea761448dc0f84f750d3f5def8af6b/llvm/include/llvm/Analysis/ValueLattice.h#L456-L457
Differential Revision: https://reviews.llvm.org/D128330
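The behaviour the SCCP commit above relies on, where marking a value as constant yields an undef element when the value is undef, can be modelled with a toy lattice. Everything below is a simplified, hypothetical model and not the ValueLatticeElement API.

```cpp
enum class State { Unknown, Undef, Constant, Overdefined };

// Toy value: either undef or an integer constant.
struct Value {
  bool IsUndef;
  int Payload;
};

struct LatticeElement {
  State S = State::Unknown;
  int Constant = 0;

  // An undef input produces an undef element rather than a constant one,
  // so callers do not need their own undef check before calling this.
  void markConstant(const Value &V) {
    if (V.IsUndef) {
      S = State::Undef;
      return;
    }
    S = State::Constant;
    Constant = V.Payload;
  }
};
```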
Amara Emerson [Thu, 14 Jul 2022 07:57:29 +0000 (00:57 -0700)]
[GlobalISel] Re-generate some checks.
Jason Molenda [Thu, 14 Jul 2022 07:53:08 +0000 (00:53 -0700)]
jGetLoadedDynamicLibrariesInfos can inspect machos not yet loaded
jGetLoadedDynamicLibrariesInfos normally checks with dyld to find
the list of binaries loaded in the inferior, and gets the filepath,
before trying to parse the Mach-O binary in inferior memory.
This allows for debugserver to parse a Mach-O binary present in memory,
but not yet registered with dyld. This patch also adds some simple
sanity checks that we're reading a Mach-O header before we begin
stepping through load commands, because we won't have the sanity check
of consulting dyld for the list of loaded binaries before parsing.
Also adds a testcase.
[This patch was reverted after causing a testsuite failure on a CI bot;
I haven't been able to repro the failure outside the CI, but my
theory is that my sanity check on cputype only matched arm64 and
x86_64, and the CI machine may have a watch simulator that is still
using i386.]
Differential Revision: https://reviews.llvm.org/D128956
rdar://95737734
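A header sanity check of the kind described in the debugserver commit above could be sketched as below. The magic and cputype values match the definitions in the Mach-O headers, but the struct, the function name, and the exact set of accepted cputypes (including i386, following the theory in the bracketed note) are assumptions for illustration, not the code from the patch.

```cpp
#include <cstdint>

// Values as defined for Mach-O, repeated here so the sketch builds on any
// platform. Native byte order is assumed.
constexpr std::uint32_t kMagic32 = 0xfeedface;            // MH_MAGIC
constexpr std::uint32_t kMagic64 = 0xfeedfacf;            // MH_MAGIC_64
constexpr std::uint32_t kAbi64 = 0x01000000;              // CPU_ARCH_ABI64
constexpr std::uint32_t kCpuI386 = 7;                     // CPU_TYPE_I386
constexpr std::uint32_t kCpuX86_64 = kCpuI386 | kAbi64;   // CPU_TYPE_X86_64
constexpr std::uint32_t kCpuArm64 = 12 | kAbi64;          // CPU_TYPE_ARM64

// Hypothetical view of the first two fields of a Mach-O header.
struct HeaderPrefix {
  std::uint32_t Magic;
  std::uint32_t CpuType;
};

// Reject memory that does not look like a Mach-O header before walking
// any load commands.
bool looksLikeMachOHeader(const HeaderPrefix &H) {
  bool MagicOk = H.Magic == kMagic32 || H.Magic == kMagic64;
  bool CpuOk = H.CpuType == kCpuArm64 || H.CpuType == kCpuX86_64 ||
               H.CpuType == kCpuI386;
  return MagicOk && CpuOk;
}
```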
Matthias Springer [Thu, 14 Jul 2022 07:47:37 +0000 (09:47 +0200)]
[mlir][sparse] Switch to One-Shot Bufferize
This change removes the partial bufferization passes from the sparse compilation pipeline and replaces them with One-Shot Bufferize. One-Shot Analysis (and TensorCopyInsertion) is used to resolve all out-of-place bufferizations, dense and sparse. Dense ops are then bufferized with BufferizableOpInterface. Sparse ops are still bufferized in the Sparsification pass.
Details:
* Dense allocations are automatically deallocated, unless they are yielded from a block. (In that case the alloc would leak.) All test cases are modified accordingly. E.g., some funcs now have an "out" tensor argument that is returned from the function. (That way, the allocation happens at the call site.)
* Sparse allocations are *not* automatically deallocated. They must be "released" manually. (No change, this will be addressed in a future change.)
* Sparse tensor copies are not supported yet. (Future change)
* Sparsification no longer has to consider inplacability. If necessary, allocations and/or copies are inserted during TensorCopyInsertion. All tensors are inplaceable by the time Sparsification is running. Instead of marking a tensor as "not inplaceable", it can be marked as "not writable", which will trigger an allocation and/or copy during TensorCopyInsertion.
Differential Revision: https://reviews.llvm.org/D129356
Jannik Silvanus [Fri, 20 May 2022 15:21:15 +0000 (17:21 +0200)]
[AMDGPU] SIMachineScheduler: Add support for several MachineScheduler features
The SI machine scheduler inherits from ScheduleDAGMI.
This patch adds support for a few features that are implemented
in ScheduleDAGMI (or its base classes) that were missing so far
because their support is implemented in overridden functions.
* Support cl::opt -view-misched-dags
This option opens a graphical window of the scheduling DAG.
* Support cl::opt -misched-print-dags
This option prints the scheduling DAG in text form.
* After constructing the scheduling DAG, call postprocessDAG()
to apply any registered DAG mutations.
Note that currently there are no mutations defined in AMDGPUTargetMachine.cpp
in case SIScheduler is used.
Still add this to avoid surprises in the future in case mutations are added.
Differential Revision: https://reviews.llvm.org/D128808
Fangrui Song [Thu, 14 Jul 2022 07:32:48 +0000 (00:32 -0700)]
[obj2yaml] Add -o to specify output filename
-o is very common among tools. yaml2obj supports -o and it surprised me that
obj2yaml doesn't support -o. Just add it which doesn't take much code.
Differential Revision: https://reviews.llvm.org/D129713
Shoaib Meenai [Thu, 14 Jul 2022 07:21:09 +0000 (00:21 -0700)]
[clang] Add missing header include
With my version of the MSVC tools (14.11.25503), this was failing to
build because of missing declarations of `std::isalnum` and
`std::isdigit`. Include `<cctype>` to get these.
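For illustration, a self-contained use of the two functions with the header included directly. `isIdentifierLike` is a hypothetical helper, not code from this patch.

```cpp
#include <cassert>
#include <cctype> // declares std::isalnum and std::isdigit
#include <string>

// Hypothetical helper that exercises the two functions named above.
// The casts to unsigned char avoid undefined behaviour for negative chars.
bool isIdentifierLike(const std::string &S) {
  if (S.empty() || std::isdigit(static_cast<unsigned char>(S[0])))
    return false;
  for (char C : S)
    if (!std::isalnum(static_cast<unsigned char>(C)) && C != '_')
      return false;
  return true;
}
```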
Kazu Hirata [Thu, 14 Jul 2022 07:19:59 +0000 (00:19 -0700)]
[mlir] Use value instead of getValue (NFC)
Balázs Kéri [Thu, 14 Jul 2022 06:37:21 +0000 (08:37 +0200)]
[clang-tidy] Improve check cert-dcl58-cpp.
Detect template specializations that should be handled specially.
In some cases it is allowed to extend the `std` namespace with
template specializations.
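As an illustration of the allowed case (an example of mine, not taken from the patch or its tests): specializing `std::hash` for a program-defined type is permitted, whereas adding an unrelated declaration to `std` is not. `Point` is a hypothetical type.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical program-defined type.
struct Point {
  int X;
  int Y;
  bool operator==(const Point &O) const { return X == O.X && Y == O.Y; }
};

namespace std {
// Permitted: a specialization of a standard template for a
// program-defined type.
template <> struct hash<Point> {
  std::size_t operator()(const Point &P) const noexcept {
    return std::hash<int>{}(P.X) * 31u + std::hash<int>{}(P.Y);
  }
};
// Not permitted, and the kind of code the check is meant to flag:
//   int helper(); // a new declaration added to namespace std
} // namespace std
```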
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129353
Kazu Hirata [Thu, 14 Jul 2022 06:39:33 +0000 (23:39 -0700)]
[clang] Use value instead of getValue (NFC)
Huan Nguyen [Thu, 14 Jul 2022 06:35:51 +0000 (23:35 -0700)]
[BOLT] Support multiple parents for split jump table
There are two assumptions regarding jump table:
(a) It is accessed by only one fragment, say, Parent
(b) All entries target instructions in Parent
For (a), BOLT stores jump table entries as relative offset to Parent.
For (b), BOLT treats jump table entries that target somewhere outside
Parent as INVALID_OFFSET, including fragments of the same split function.
In this update, we extend (a) and (b) to include fragments of the same
split function. For (a), we store jump table entries in absolute offset
instead. In addition, jump table will store all fragments that access
it. A fragment uses this information to only create label for jump table
entries that target to that fragment.
For (b), using absolute offset allows jump table entries to target
fragments of same split function, i.e., extend support for split jump
table. This can be done using relocation (fragment start/size) and
fragment detection heuristics (e.g., using symbol name pattern for
non-stripped binaries).
For jump table targets that can only be reached by one fragment, we
mark them as local label; otherwise, they would be the secondary
function entry to the target fragment.
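A simplified model of the absolute-offset scheme, not BOLT's data structures. `Fragment`, `findFragment`, and `entriesTargeting` are hypothetical names; a fragment is modelled as an address range, and a jump table as a list of absolute addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one fragment of a split function.
struct Fragment {
  std::uint64_t Start;
  std::uint64_t Size;
  bool contains(std::uint64_t Addr) const {
    return Addr >= Start && Addr - Start < Size;
  }
};

// Index of the fragment containing Addr, or -1 if no fragment does.
int findFragment(const std::vector<Fragment> &Frags, std::uint64_t Addr) {
  for (std::size_t I = 0; I < Frags.size(); ++I)
    if (Frags[I].contains(Addr))
      return static_cast<int>(I);
  return -1;
}

// With absolute entries, each fragment that accesses the table can pick
// out only the entries that land inside it (the ones it creates labels for).
std::vector<std::uint64_t>
entriesTargeting(const std::vector<std::uint64_t> &Table, const Fragment &F) {
  std::vector<std::uint64_t> Result;
  for (std::uint64_t Entry : Table)
    if (F.contains(Entry))
      Result.push_back(Entry);
  return Result;
}
```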
Test Plan
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128474
Kazu Hirata [Thu, 14 Jul 2022 06:11:56 +0000 (23:11 -0700)]
[llvm] Use value instead of getValue (NFC)
Corentin Jabot [Wed, 13 Jul 2022 17:03:18 +0000 (19:03 +0200)]
[Clang] Adjust extension warnings for delimited sequences
WG21 approved delimited escape sequences and named escape
sequences.
Adjust the extension warnings accordingly, and update
the release notes.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129664
owenca [Thu, 14 Jul 2022 03:51:40 +0000 (20:51 -0700)]
[llvm] Make lib/Target/BPF/BTF.h self-contained
Zi Xuan Wu (Zeson) [Thu, 14 Jul 2022 03:25:37 +0000 (11:25 +0800)]
[CSKY] Fix the br target operand type in td
br target operand should be Operand<OtherVT> type instead of Operand<iPTR>
Cole Kissane [Thu, 14 Jul 2022 02:58:42 +0000 (19:58 -0700)]
[llvm] add zstd to llvm::compression namespace
- add `FindZSTD.cmake`
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
Reviewed By: leonardchan, MaskRay
Differential Revision: https://reviews.llvm.org/D128465
Cole Kissane [Thu, 14 Jul 2022 02:48:29 +0000 (19:48 -0700)]
Revert "[llvm] add zstd to `llvm::compression` namespace"
This reverts commit cef07169ec9f46fd25291a3218cf12bef324ea0c.
Alexander Potapenko [Thu, 14 Jul 2022 02:04:38 +0000 (19:04 -0700)]
[compiler-rt][hwasan] Support for new Intel LAM API
New version of Intel LAM patches
(https://lore.kernel.org/linux-mm/20220712231328.5294-1-kirill.shutemov@linux.intel.com/)
uses a different interface based on arch_prctl():
- arch_prctl(ARCH_GET_UNTAG_MASK, &mask) returns the current mask for
untagging the pointers. We use it to detect kernel LAM support.
- arch_prctl(ARCH_ENABLE_TAGGED_ADDR, nr_bits) enables pointer tagging
for the current process.
Because __NR_arch_prctl is defined in different headers, and no other
platforms need it at the moment, we only declare internal_arch_prctl()
on x86_64.
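To show what an untag mask does, here is a small arithmetic sketch that makes no system calls. The mask value is an assumption (a 6-bit tag in bits 62:57); in practice the mask would come from arch_prctl(ARCH_GET_UNTAG_MASK, &mask). `untag` and `withTag` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Assumed tag layout for this sketch: 6 tag bits starting at bit 57.
constexpr unsigned kTagShift = 57;
constexpr std::uint64_t kTagBitsMask = std::uint64_t{0x3f} << kTagShift;

// Assumed example of a mask the kernel could report: all bits set except
// the tag bits.
constexpr std::uint64_t kExampleUntagMask = ~kTagBitsMask;

// Clears the tag bits, leaving the plain address.
constexpr std::uint64_t untag(std::uint64_t Ptr, std::uint64_t Mask) {
  return Ptr & Mask;
}

// Places a 6-bit tag into the tag bits of an untagged pointer.
constexpr std::uint64_t withTag(std::uint64_t Ptr, std::uint64_t Tag) {
  return Ptr | ((Tag & 0x3f) << kTagShift);
}
```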
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D129645
Cole Kissane [Thu, 14 Jul 2022 02:06:26 +0000 (19:06 -0700)]
[llvm] add zstd to `llvm::compression` namespace
- add `FindZSTD.cmake`
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
Reviewed By: leonardchan, MaskRay
Differential Revision: https://reviews.llvm.org/D128465
Florian Hahn [Thu, 14 Jul 2022 01:53:39 +0000 (18:53 -0700)]
[VPlan] Move VPBB verification to separate function (NFC).
Joseph Huber [Wed, 13 Jul 2022 15:25:31 +0000 (11:25 -0400)]
[CUDA] Allow the new driver to compile CUDA in non-RDC mode
The new driver primarily allows us to support RDC-mode compilations with
proper linking. This is not needed for non-RDC mode compilation, but we
still would like the new driver to be able to handle this mode so we can
transition away from the old driver in the future. This patch adds the
necessary code to support creating a fatbinary for CUDA code generation
as well as removing old assumptions and errors about RDC-mode with the
new driver.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D129655
Jez Ng [Thu, 14 Jul 2022 01:13:45 +0000 (21:13 -0400)]
[lld-macho] Enable EH frame relocation / pruning
This just removes the code that gates the logic. The main issue here is
perf impact: without {D122258}, LLD takes a significant perf hit because
it now has to do a lot more work in the input parsing phase. But with
that change to eliminate unnecessary EH frames from input object files,
the perf overhead here is minimal. Concretely, here are the numbers for
some builds as measured on my 16-core Mac Pro:
**chromium_framework**
This is without the use of `-femit-dwarf-unwind=no-compact-unwind`:
             base           diff           difference (95% CI)
  sys_time   1.826 ± 0.019  1.962 ± 0.034  [ +6.5% .. +8.4%]
  user_time  9.306 ± 0.054  9.926 ± 0.082  [ +6.2% .. +7.1%]
  wall_time  8.225 ± 0.068  8.947 ± 0.128  [ +8.0% .. +9.6%]
  samples    15             22
With that flag enabled, the regression mostly disappears, as hoped:
             base           diff           difference (95% CI)
  sys_time   1.839 ± 0.062  1.866 ± 0.068  [ -0.9% .. +3.8%]
  user_time  9.452 ± 0.068  9.490 ± 0.067  [ -0.1% .. +0.9%]
  wall_time  8.383 ± 0.127  8.452 ± 0.114  [ -0.1% .. +1.8%]
  samples    17             21
**Unnamed internal app**
Without `-femit-dwarf-unwind`, this is the perf hit:
             base           diff           difference (95% CI)
  sys_time   1.372 ± 0.029  1.317 ± 0.024  [ -4.6% .. -3.5%]
  user_time  2.835 ± 0.028  2.980 ± 0.027  [ +4.8% .. +5.4%]
  wall_time  3.205 ± 0.079  3.383 ± 0.066  [ +4.9% .. +6.2%]
  samples    102            83
With `-femit-dwarf-unwind`, the perf hit almost disappears:
             base           diff           difference (95% CI)
  sys_time   1.274 ± 0.026  1.270 ± 0.025  [ -0.9% .. +0.3%]
  user_time  2.812 ± 0.023  2.822 ± 0.035  [ +0.1% .. +0.7%]
  wall_time  3.166 ± 0.047  3.174 ± 0.059  [ -0.2% .. +0.7%]
  samples    95             97
Just for fun, I measured the impact of `-femit-dwarf-unwind` on ld64
(`base` has the extra DWARF unwind info in the input object files,
`diff` doesn't):
             base           diff           difference (95% CI)
  sys_time   1.128 ± 0.010  1.124 ± 0.023  [ -1.3% .. +0.6%]
  user_time  7.176 ± 0.030  7.106 ± 0.094  [ -1.5% .. -0.4%]
  wall_time  7.874 ± 0.041  7.795 ± 0.121  [ -1.7% .. -0.3%]
  samples    16             25
And for LLD:
             base           diff           difference (95% CI)
  sys_time   1.315 ± 0.019  1.280 ± 0.019  [ -3.2% .. -2.0%]
  user_time  2.980 ± 0.022  2.822 ± 0.016  [ -5.5% .. -5.0%]
  wall_time  3.369 ± 0.038  3.175 ± 0.033  [ -6.2% .. -5.3%]
  samples    47             47
So parsing the extra EH frames is a lot more expensive for us than for
ld64. But given that we are quite a lot faster than ld64 to begin with,
I guess this isn't entirely unexpected...
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D129540
Nico Weber [Tue, 12 Jul 2022 14:54:54 +0000 (10:54 -0400)]
[docs] Document git-clang-format
clang-format's documentation documented the more general clang-format-diff.py
script. Add documentation for the less general but arguably easier-to-use
git integration as well.
Differential Revision: https://reviews.llvm.org/D129563
Nico Weber [Thu, 14 Jul 2022 01:05:36 +0000 (21:05 -0400)]
[gn build] fix building lldb after b5ccfeb6bfbb
Stefan Pintilie [Wed, 13 Jul 2022 19:08:55 +0000 (14:08 -0500)]
[PowerPC][LLD] Change PPC64R2SaveStub to only use non-PC-relative code
Currently the PPC64R2SaveStub thunk will produce Power 10 code by default.
This produced an issue when linking older code that made use of the st_other=1
bit but was never meant to be linked or run on Power 10.
This patch makes it so that only the R_PPC64_REL24_NOTOC relocation can produce
Power 10 code.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D129580
einvbri [Thu, 7 Jul 2022 09:38:56 +0000 (04:38 -0500)]
[analyzer] Fix use of length in CStringChecker
CStringChecker is using getByteLength to get the length of a string
literal. For targets where a "char" is 8 bits, getByteLength() and
getLength() will be equal for a C string, but for targets where a "char"
is 16 bits, getByteLength() returns the size in octets.
This is verified in our downstream target, but we have no way to add a
test case for this case since there is no target supporting 16-bit
"char" upstream. Since this cannot have a test case, I'm asserting that
this change is "correct by construction"; it was visually inspected to
be correct by way of the following example, where the problem was found.
The following case fails on a target with 16-bit chars.
getByteLength() for the string literal returns 4, which fails when
checked against "char x[4]". With the change, the string literal is
evaluated to a size of 2, which is the correct number of "char"s for a
16-bit target.
```
void strcpy_no_overflow_2(char *y) {
char x[4];
strcpy(x, "12"); // with getByteLength(), returns 4 using 16-bit chars
}
```
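The arithmetic behind this example can be sketched as follows. This is a toy model, not the Clang API: `LiteralInfo` and its members are hypothetical, with `lengthInChars()` playing the role of getLength() and `lengthInOctets()` the role of getByteLength() as described above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a narrow string literal on a target whose "char"
// is CharBits wide. Length counts "char" units without the terminator.
struct LiteralInfo {
  std::size_t Length;
  unsigned CharBits;

  // Plays the role of getLength(): number of "char" units.
  std::size_t lengthInChars() const { return Length; }

  // Plays the role of getByteLength() as described above: size in octets.
  std::size_t lengthInOctets() const { return Length * (CharBits / 8); }
};

// True if the literal plus its terminator fits in a buffer of BufChars
// "char" units. Comparing in "char" units is the point of the fix.
bool fitsInBuffer(const LiteralInfo &L, std::size_t BufChars) {
  return L.lengthInChars() + 1 <= BufChars;
}
```

For "12" with 16-bit chars, the octet count is 4 while the char count is 2, so only the char count gives the right answer against `char x[4]`.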
This change exposed that embedded nulls within the string are not
handled. This is documented as a FIXME for a future fix.
```
void strcpy_no_overflow_3(char *y) {
char x[3];
strcpy(x, "12\0");
}
```
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D129269
Philip Reames [Thu, 14 Jul 2022 00:12:48 +0000 (17:12 -0700)]
[LSR] Add test coverage for ICmpZero cases involving urem RHS
For the moment, we're pretty conservative here. My motivating case is the vscale one (as that is idiomatic for scalable vectorized loops on RISCV). There are two obvious approaches to fixing this, and I tried to add reasonable coverage for both even though I'll likely only fix one.
Florian Hahn [Thu, 14 Jul 2022 00:01:42 +0000 (17:01 -0700)]
[LV] Use PredRecipe directly instead of getOrAddVPValue (NFC).
There is no need to look up the VPValue for Instr, PredRecipe can be
used directly.
Peter Klausler [Mon, 11 Jul 2022 17:08:01 +0000 (10:08 -0700)]
[flang] Avoid crash from forward referenced derived type
Fortran permits forward references to derived types in contexts that don't
require knowledge of the derived type definition for semantic analysis,
such as in the declaration of a pointer or allocatable variable or component.
But when the forward-referenced derived type is used later for a component
reference, it is possible for the DerivedTypeSpec in the base variable or component
declaration to still have a null scope pointer even if the type has been defined,
since DerivedTypeSpec and TypeSpec objects are created in scopes of use
rather than in scopes of definition. The fix is to call
DerivedTypeSpec::Instantiate() in the name resolution of each component
name so that the scope gets filled in if it is still null.
Differential Revision: https://reviews.llvm.org/D129681
Dave Lee [Sat, 9 Jul 2022 00:34:10 +0000 (17:34 -0700)]
[lldb] Add image dump pcm-info command
Add `pcm-info` to the `target module dump` subcommands.
This dump command shows information about clang .pcm files. This command
effectively runs `clang -module-file-info` and produces identical output.
The .pcm file format is tightly coupled to the clang version. The clang
embedded in lldb is not guaranteed to match the version of the clang executable
available on the local system.
There have been times when I've needed to view the details about a .pcm file
produced by lldb's embedded clang, but because the clang executable was a
slightly different version, the `-module-file-info` invocation failed. With
this command, users can inspect .pcm files generated by lldb too.
Differential Revision: https://reviews.llvm.org/D129456
Peter Klausler [Fri, 8 Jul 2022 23:16:42 +0000 (16:16 -0700)]
[flang] Error detection/avoidance for TRANSFER with empty MOLD= type
When MOLD= is an array and there is no SIZE= in a call to TRANSFER(),
the size of an element of the MOLD= is used as the denominator in a
division to establish the extent of the vector result. When the
total storage size of the SOURCE= is known to be zero, the result is
empty and no division is needed.
To avoid a division by zero at runtime, we need to check for a zero-sized
MOLD= element type when the storage size of SOURCE= is nonzero and there
is no SIZE=. Further, in the compilation-time rewriting of calls to
SHAPE(TRANSFER(...)) and SIZE(TRANSFER(...)) for constant folding and
simplification purposes, we can't replace the call with an arithmetic
element count expression when the storage size of SOURCE= is not known
to be zero and the element size of MOLD= is not known to be nonzero at
compilation time.
These changes mostly affect tests using a MOLD= argument that is an
assumed-length character.
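The element-count computation and its zero check can be sketched like this. It is not the flang runtime code: `transferExtent` is a hypothetical helper, and rounding the division up is my assumption about how the extent is derived.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical helper: extent of the vector result of TRANSFER(SOURCE, MOLD)
// when MOLD is an array and SIZE= is absent. Returns std::nullopt where the
// division would be by zero, which is the case that must be diagnosed.
std::optional<std::size_t> transferExtent(std::size_t SourceBytes,
                                          std::size_t MoldElementBytes) {
  if (SourceBytes == 0)
    return std::size_t{0}; // empty result, no division needed
  if (MoldElementBytes == 0)
    return std::nullopt; // nonzero SOURCE with zero-sized MOLD element
  // Assumed rounding: enough elements to hold all of SOURCE.
  return (SourceBytes + MoldElementBytes - 1) / MoldElementBytes;
}
```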
Differential Revision: https://reviews.llvm.org/D129680
Fangrui Song [Wed, 13 Jul 2022 23:47:35 +0000 (16:47 -0700)]
[Support] Fix LLVM_ENABLE_ZLIB==0 builds
owenca [Mon, 11 Jul 2022 06:49:16 +0000 (23:49 -0700)]
[clang-format][NFC] Replace most of std::vector with SmallVector
Differential Revision: https://reviews.llvm.org/D129466
Peter Klausler [Fri, 8 Jul 2022 22:25:01 +0000 (15:25 -0700)]
[flang][runtime] Complete list-directed character input with DECIMAL='COMMA'
Most of the infrastructure for DECIMAL='COMMA' mode was in place
in the I/O runtime support library, but I dropped the ball for
list-directed character input, which has its own detection of
input separators. Finish the job.
Differential Revision: https://reviews.llvm.org/D129679
Peter Klausler [Fri, 8 Jul 2022 21:35:42 +0000 (14:35 -0700)]
[flang] Ensure name resolution visits "=>NULL()" in entity-decl
Most modern Fortran programs declare procedure pointers with a
procedure-declaration-stmt, but it's also possible to declare one
with a type-declaration-stmt with a POINTER attribute. In this
case, e.g. "real, external, pointer :: p => null()" the initializer
is required to be a null-init. The parse tree traversal in name
resolution would visit the null-init if the symbol were an object
pointer only, leading to a crash in the case of a procedure pointer.
That explanation of the bug is longer than the fix. In short,
ensure that a null-init in an entity-decl is visited for both
species of pointers.
Differential Revision: https://reviews.llvm.org/D129676
Fangrui Song [Wed, 13 Jul 2022 23:26:54 +0000 (16:26 -0700)]
[Support] Change compression::zlib::{compress,uncompress} to use uint8_t *
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to adapt anyway.
Alexander Shaposhnikov [Wed, 13 Jul 2022 23:21:45 +0000 (23:21 +0000)]
[SimplifyCFG] Improve SwitchToLookupTable optimization
Try to use the original value as an index (in the lookup table)
in more cases (to avoid one subtraction and shorten the dependency chain)
(https://github.com/llvm/llvm-project/issues/56189).
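A hand-written illustration of the difference, under the assumption that padding the table down to index 0 is acceptable; this is not output of the pass. Both functions model a switch mapping 1, 2, 3 to 10, 20, 30 with a default of -1.

```cpp
#include <array>
#include <cassert>

// Variant A: the table starts at the smallest case value, so the index is
// X - 1 and the subtraction sits on the dependency chain.
int lookupWithOffset(unsigned X) {
  static constexpr std::array<int, 3> Table = {10, 20, 30};
  unsigned Idx = X - 1; // wraps to a huge value for X == 0
  return Idx < Table.size() ? Table[Idx] : -1;
}

// Variant B: the table is padded with the default result for X == 0, so
// the original value indexes it directly.
int lookupDirect(unsigned X) {
  static constexpr std::array<int, 4> Table = {-1, 10, 20, 30};
  return X < Table.size() ? Table[X] : -1;
}
```

The padded table costs one extra entry and saves the subtraction.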
Test plan:
1/ ninja check-all
2/ bootstrapped LLVM + Clang pass tests
Differential revision: https://reviews.llvm.org/D128897
Peter Klausler [Thu, 7 Jul 2022 21:51:40 +0000 (14:51 -0700)]
[flang][runtime] Keep frame buffer in sync with file when truncating
When the I/O runtime is truncating an external file due to an
implied ENDFILE or explicit ENDFILE, ensure that the unit's frame
buffer for the file discards any data that have become obsolete.
This bug caused trouble with ACCESS='STREAM' I/O using POS= on
a WRITE, but it may not have been limited to that scenario.
Differential Revision: https://reviews.llvm.org/D129673
Peter Klausler [Thu, 7 Jul 2022 16:32:21 +0000 (09:32 -0700)]
[flang][runtime] Refine list-directed REAL(2) output
The rule used by list-directed REAL output editing to select
between Ew.d and Fw.d output editing breaks down for 16-bit
floating-point data, since the number of significant decimal
digits is so low that Ew.d output editing is nearly always selected.
Cap the test so that five-digit values will be output with Fw.d
editing.
Differential Revision: https://reviews.llvm.org/D129672
Nico Weber [Wed, 13 Jul 2022 22:35:25 +0000 (18:35 -0400)]
[gn build] (semi-manually) Port 5acd47169884
Peter Klausler [Tue, 5 Jul 2022 23:32:59 +0000 (16:32 -0700)]
[flang] Fold TRANSFER()
Fold usage of the raw data reinterpretation intrinsic function TRANSFER().
Differential Revision: https://reviews.llvm.org/D129671
Anubhab Ghosh [Tue, 12 Jul 2022 23:15:19 +0000 (16:15 -0700)]
[ORC] Add a shared-memory based orc::MemoryMapper.
This is an implementation of orc::MemoryMapper that maps shared memory
pages in both executor and controller process and writes directly to
them avoiding transferring content over EPC. All allocations are properly
deinitialized automatically on the executor side at shutdown by the
ExecutorSharedMemoryMapperService.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D128544
Yonghong Song [Tue, 12 Jul 2022 19:21:11 +0000 (12:21 -0700)]
[BPF] Handle anon record for CO-RE relocations
While experimenting with the kernel data structure sockptr_t in a
CO-RE operation, I hit an assertion error. The sockptr_t definition
and usage look like the following:
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
typedef struct {
union {
void *kernel;
void *user;
};
unsigned is_kernel : 1;
} sockptr_t;
#pragma clang attribute pop
int test(sockptr_t *arg) {
return arg->is_kernel;
}
The assertion error looks like
clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:878: llvm::Value*
{anonymous}::BPFAbstractMemberAccess::computeBaseAndAccessKey(llvm::CallInst*,
{anonymous}::BPFAbstractMemberAccess::CallInfo&, std::__cxx11::string&,
llvm::MDNode*&): Assertion `TypeName.size()' failed.
In this particular case, the clang frontend attaches the debuginfo metadata
associated with the anon structure to the preserve_access_info IR intrinsic.
But the first debuginfo type has to be a named type so libbpf can have a
sound start to do CO-RE relocation.
Besides the above approach, which uses a pragma to push the attribute, the
typedef/struct definition below applies preserve_access_index directly to the anon struct.
typedef struct {
union {
void *kernel;
void *user;
};
unsigned is_kernel : 1;
} __attribute__((preserve_access_index)) sockptr_t;
This patch fixed the issue by preprocessing function argument/return types
and local variable types used by other CO-RE intrinsics. For any
typedef struct/union { ... } typedef_name
an association of <anon struct/union, typedef> is recorded to replace
the IR intrinsic metadata 'anon struct/union' with 'typedef'.
It is possible that two different 'typedef' types may have identical
anon struct/union type. For such a case, the association will be
<anon struct/union, nullptr> to indicate the invalid case.
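The association rule can be sketched with a plain map. This is a toy model, not the BPF backend code: anonymous records are identified by a hypothetical integer id, typedefs by name, and an empty string plays the role of nullptr.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical id for an anonymous struct/union type.
using AnonId = int;

// Records <anon struct/union, typedef>. If a second, different typedef
// refers to the same anonymous record, the entry is marked invalid by
// clearing the name (the stand-in for nullptr).
void recordTypedef(std::map<AnonId, std::string> &Assoc, AnonId Anon,
                   const std::string &Typedef) {
  assert(!Typedef.empty() && "typedef must be named");
  auto It = Assoc.find(Anon);
  if (It == Assoc.end())
    Assoc.emplace(Anon, Typedef); // first typedef seen for this record
  else if (It->second != Typedef)
    It->second.clear(); // ambiguous: no typedef can be substituted
}
```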
Differential Revision: https://reviews.llvm.org/D129621
Leonard Chan [Wed, 13 Jul 2022 22:07:59 +0000 (15:07 -0700)]
[hwasan] Add __hwasan_add_frame_record to the hwasan interface
Hwasan includes instructions in the prologue that mix the PC and SP and store
it into the stack ring buffer stored at __hwasan_tls. This is a thread_local
global exposed from the hwasan runtime. However, if TLS-mechanisms or the
hwasan runtime haven't been setup yet, it will be invalid to access __hwasan_tls.
This is the case for Fuchsia where we instrument libc, so some functions that
are instrumented but can run before hwasan initialization will incorrectly
access this global. Additionally, libc cannot have any TLS variables, so we
cannot weakly define __hwasan_tls until the runtime is loaded.
A way we can work around this is by moving the instructions into a hwasan
function that does the store into the ring buffer and creating a weak definition
of that function locally in libc. This way __hwasan_tls will not actually be
referenced. This is not our long-term solution, but this will allow us to roll
out hwasan in the meantime.
This patch includes:
- A new llvm flag for choosing to emit a libcall rather than instructions in the
prologue (off by default)
- The libcall for storing into the ringbuffer (__hwasan_add_frame_record)
Differential Revision: https://reviews.llvm.org/D128387
Cole Kissane [Wed, 13 Jul 2022 22:08:40 +0000 (15:08 -0700)]
[llvm] fix zlib buffer truncate edge cases and fix nits in tests
- add check before truncating (un)compressed data buffer if the buffer is already a perfect length, to avoid triggering truncate assertion in edge case.
- explicitly coerce LLVM_ENABLE_ZLIB to a 0 or 1 value in OFF case, to match current ON, FORCE_ON behavior.
- fix code style nits in zlib tests
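The first bullet's guard can be sketched on a std::vector; the real code operates on LLVM's own buffer type, and `trimToProducedSize` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: shrink Buf to the number of bytes actually produced
// by (un)compression, but skip the shrink when the buffer already has
// exactly that length.
void trimToProducedSize(std::vector<std::uint8_t> &Buf, std::size_t Produced) {
  assert(Produced <= Buf.size() && "producer wrote past the buffer");
  if (Produced < Buf.size())
    Buf.resize(Produced);
}
```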
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D129698