review.tizen.org Git - platform/upstream/llvm.git/log

[SanitizerCoverage] add weak definitions for the load/store callbacks.

Add weak definitions for the load/store callbacks.

This matches the weak definitions for all other SanitizerCoverage
callbacks.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D129801

[clang] Implement ElaboratedType sugaring for types written bare

Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to a pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as 'struct'/'class' tags and name specifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover it is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an innovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) This patch could expose a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypedefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.
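
Items 2) and 4) can be illustrated with a toy model of type sugar. This is
only a sketch and not the clang API: `Kind`, `Node`, `dynCastTop` and
`getAsKind` are hypothetical names that mimic how a check on the outermost
node differs from one that walks through sugar first.

```cpp
#include <cassert>

// Toy stand-ins for type nodes; every name here is hypothetical.
enum class Kind { Builtin, Typedef, Elaborated };

struct Node {
  Kind kind;
  const Node *inner; // the node this one is sugar for, or nullptr
};

// Mimics a dyn_cast-style check: inspects only the outermost node.
const Node *dynCastTop(const Node *n, Kind k) {
  assert(n && "null node");
  return n->kind == k ? n : nullptr;
}

// Mimics a getAs-style lookup: walks inward through sugar until a node
// of the requested kind is found, or the chain ends.
const Node *getAsKind(const Node *n, Kind k) {
  assert(n && "null node");
  for (; n; n = n->inner)
    if (n->kind == k)
      return n;
  return nullptr;
}
```

With an Elaborated node wrapping a Typedef node, `dynCastTop` for
`Kind::Typedef` returns nullptr while `getAsKind` finds the inner node.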

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374

[X86] Use generic tuning for "x86-64" if "tune-cpu" is not specified

This is an alternative to D129154. See discussions on https://discourse.llvm.org/t/fast-scalar-fsqrt-tuning-in-x86/63605

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129647

[llvm-dwp][test] Add nocompress.test testing LLVM_ENABLE_ZLIB==0

Rewrite a prebuilt file removed by D129728.

[RISCV] Refine the heuristics for our custom (mul (and X, C2), C1) isel.

Prefer to use SLLI instead of zext.w/zext.h in more cases. SLLI
might be better for compression.

[BOLT] Support split landing pad

We previously supported split jump tables, where some jump table entries
target different fragments of the same function. In this fix, we provide
support for another type of intra-indirect transfer: landing pads.

When C++ exception handling is used, the compiler emits .gcc_except_table,
which describes the location of the catch block (landing pad) for a specific
range that potentially invokes a throw(). Normally landing pads reside
in the function, but with -fsplit-machine-functions, landing pads can
be moved to another fragment. The intuition is that landing pads are rarely
executed, so the compiler can move them to the .cold section.

This update marks all fragments that have a landing pad in another
fragment as non-simple, and later propagates non-simple to all related
fragments.

This update also includes one manual test case: split-landing-pad.s

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128561

[test] Remove llvm-dwp/X86/nocompress.test

It requires !zlib and isn't so useful.

[RISCV] Fix mistake in RISCVTTIImpl::getIntImmCostInst.

zext.w requires Zba not Zbb. The test was also wrong, but had the
correct comment.

[AMDGPU] Fix for the test failure caused by the 2e29b0138ca243c7d288622524a004c84acbbb9e

Fixing the idiv-licm.ll test failure

Differential Revision: https://reviews.llvm.org/D129819

[MLIR][Presburger] MPInt: use /// for top-level comment, not // (NFC)

[test] Remove zlib-gnu tests

[AMDGPU] Update the mechanism used to check for cycles and add edges in power-sched mutation

[llvm-dwp] Add SHF_COMPRESSED support and remove .zdebug support

clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support
recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and
remove .zdebug support.

Simplify llvm::object::Decompressor which has no .zdebug user now.

While here, add tests for ELF32LE, ELF32BE, and ELF64BE.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D129728

[SelectionDAG][RISCV][AMDGPU][ARM] Improve SimplifyDemandedBits for SHL with variable shift amount.

If we have a variable shift amount and the demanded mask has leading
zeros, we can propagate those leading zeros to not demand those bits
from operand 0. This can allow zero_extend/sign_extend to become
any_extend. This pattern can occur due to C integer promotion rules.

This transform is already done by InstCombineSimplifyDemanded.cpp where
sign_extend can be turned into zero_extend for example.
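
The property behind this transform can be checked on plain 32-bit integers.
The sketch below is not the SelectionDAG code; `countLeadingZeros32`,
`demandedOperandBits` and `shlMasked` are hypothetical helpers. A left shift
only moves bits upward, so if the demanded mask has leading zeros, the same
number of high bits of operand 0 can never reach a demanded position,
whatever the shift amount is.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading zero bits in a 32-bit mask (32 for a zero mask).
unsigned countLeadingZeros32(uint32_t mask) {
  unsigned n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (mask & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Bits of operand 0 that can still influence (x << amt) & demanded for
// any shift amount in [0, 31].
uint32_t demandedOperandBits(uint32_t demanded) {
  unsigned lz = countLeadingZeros32(demanded);
  return lz == 32 ? 0u : (0xFFFFFFFFu >> lz);
}

// The shifted value restricted to the demanded bits.
uint32_t shlMasked(uint32_t x, unsigned amt, uint32_t demanded) {
  assert(amt < 32 && "shift amount out of range");
  return (x << amt) & demanded;
}
```

Since the high bits of operand 0 are not demanded, it does not matter
whether an extension feeding the shift filled them with zeros, sign bits,
or anything else.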

Reviewed By: spatel, foad

Differential Revision: https://reviews.llvm.org/D121833

[RISCV] Add additional tests for D121833. NFC

[Clang] Modify CXXMethodDecl::isMoveAssignmentOperator() to look through type sugar

Currently CXXMethodDecl::isMoveAssignmentOperator() does not look through type
sugar and so if the parameter is a type alias it will not be able to detect
that the method is a move assignment operator. This PR fixes that and adds a set
of tests that covers that we correctly detect special member functions when
defaulting or deleting them.
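
A minimal sketch of the kind of declaration involved, assuming a compiler
that includes this fix. `S`, `SRef` and `moveAssignThroughAlias` are
hypothetical names, not taken from the tests added by the patch.

```cpp
#include <cassert>
#include <utility>

struct S;
using SRef = S &&; // type alias: sugar for an rvalue reference to S

struct S {
  int value;
  // The parameter type is written through the alias, yet this is still
  // the move assignment operator, so it may be defaulted.
  S &operator=(SRef) = default;
};

// Exercises the defaulted operator and returns the assigned value.
int moveAssignThroughAlias() {
  S a{1};
  S b{2};
  a = std::move(b);
  assert(a.value == 2);
  return a.value;
}
```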

This fixes: https://github.com/llvm/llvm-project/issues/56456

Differential Revision: https://reviews.llvm.org/D129591

[RISCV] Make TuneSiFive7 depend on TuneNoDefaultUnroll instead of listing it for every SiFive7 CPU

Remove testing for zlib-gnu llvm-mc support in the absence of zlib

[mlir][NVGPU] Verifier for nvgpu.ldmatrix

* Adds verifiers for `nvgpu.ldmatrix` op
* Adds tests to `mlir/test/Dialect/NVGPU/invalid.mlir`

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D129669

Remove zlibgnu support in llvm-mc

The feature's been removed from most other tools in LLVM at this point.

[clang-format] Fix invalid-code-generation by RemoveBracesLLVM

When removing an r_brace that is the first token of an annotated line, if the
line above ends with a line comment, clang-format generates invalid code by
merging the tokens after the r_brace into the line comment.

Fixes #56488.

Differential Revision: https://reviews.llvm.org/D129742

[libc++] Update RangesAlgorithms.csv

[mlir][sparse][bufferization] fix a few memory leaks

Fixed some new memory leaks after migration to new
bufferization. One is expected, the other may need
some more careful analysis.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D129805

[AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.

Since the divergence-driven instruction selection has been enabled for AMDGPU,
all the uniform instructions are expected to be selected to SALU form, except those not having one.
VGPR to SGPR copies appear in MIR to connect value producers and consumers. This change implements an algorithm
that strikes a reasonable tradeoff between the profit achieved from keeping the uniform instructions in SALU form
and the overhead introduced by the data transfer between the VGPRs and SGPRs.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128252

Remove left over merge marker from 4b1e3d19370694dd2b2c04a5945f3f9e43917456

[gold] Ignore bitcode from sections inside object files

-fembed-bitcode will put bitcode into special sections within object
files, but this is not meant to be used by LTO, so the gold plugin
should ignore it.

https://github.com/llvm/llvm-project/issues/47216

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D116995

Revert "[flang] Add co_sum to the list of intrinsics and update test"

This reverts commit d2460d90080f2ff8564ceed745998f821544ec98.

Reverting this commit because after pushing to main it caused
unexpected test failures.

[libc] Enable a few stdlib and time functions on aarch64.

[analyzer] Evaluate construction of non-POD type arrays

Introducing the support for evaluating the constructor
of every element in an array. The idea is to record the
index of the current array member being constructed and
create a loop during the analysis. We loop over the
same CXXConstructExpr as many times as the array has
elements.
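
For reference, this is the sort of source construct being modelled: a single
array declaration that runs the element constructor once per element, in
index order. The sketch is ordinary C++ and not analyzer code; `Widget`,
`g_constructed` and `constructArrayAndSumIds` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

int g_constructed = 0; // counts constructor calls

struct Widget {
  int id;
  // A user-provided constructor makes Widget a non-POD type.
  Widget() : id(g_constructed++) {}
};

// Constructs an array of N widgets and returns the sum of their ids.
template <std::size_t N> int constructArrayAndSumIds() {
  g_constructed = 0;
  Widget arr[N]; // one constructor call per element, in index order
  int sum = 0;
  for (std::size_t i = 0; i < N; ++i) {
    assert(arr[i].id == static_cast<int>(i));
    sum += arr[i].id;
  }
  return sum;
}
```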

Differential Revision: https://reviews.llvm.org/D127973

[compiler-rt][CMake] Use linker semantics for unwinder and C++ library

Try the shared library first, and if it doesn't exist fallback onto
the static one. When the static library is requested, skip the shared
library.

Differential Revision: https://reviews.llvm.org/D129470

[LLD][COFF] On Windows, fix the date formatting in the 'incremental' test.

On my system the date formatting is a bit different from what the test used to
support. I'm using:

  Windows 11 version 21H2, build 22000.795 using the English(Canada) region.
  ls from BusyBox 1.36
  VS 2022 17.2.5
  WinSDK 10.0.22000

[mlir][sparse][bufferization] initialize reduction variable

After recent bufferization improvement, this test
started failing due to missed zero initialization.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D129800

[clang] Document -femit-compact-unwind option in the User’s Manual

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D129772

[flang] Add co_sum to the list of intrinsics and update test

Add the collective subroutine, co_sum, to the list of intrinsics.
In accordance with 16.9.50 and 16.9.137, add a check for and an
error if coindexed objects are being passed to certain arguments
in co_sum and in move_alloc. Add a semantics test to check that
this error is successfully caught in calls to move_alloc. Remove
the XFAIL directive, update the ERROR directives and add both
standard-conforming and non-standard conforming calls to the
semantics test for co_sum.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D114134

[ELF][test] Fix a typo in aarch64-ifunc-bti.s to actually test what was intended

Thanks to Alex Brachet for spotting it in D110217.

[mlir][AMDGPU] Add lds_barrier op

The lds_barrier op allows workgroups to wait at a barrier for
operations to/from their local data store (LDS) to complete without
incurring the performance penalties of a full memory fence.

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D129522

[libc] Enable a few stdio functions on aarch64.

[mlir] (NFC) run clang-format on all files

[libc] Enable a few pthread and threads functions on aarch64.

[libc] Add implementations of pthread_equal and pthread_self.

Reviewed By: michaelrj, lntue

Differential Revision: https://reviews.llvm.org/D129729

[BOLT] Replace uses of layout with basic block list

As we are moving towards support for multiple fragments, loops that
iterate over all basic blocks of a function, but do not depend on the
order of basic blocks in the final layout, should iterate over binary
functions directly, rather than the layout.

Eventually, all loops using the layout list should either iterate over
the function, or be aware of multiple layouts. This patch replaces
references to binary function's block layout with the binary function
itself where only little code changes are necessary.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129585

[libc++][ranges] implement `std::ranges::set_union`

Differential Revision: https://reviews.llvm.org/D129657

[CVP] Add coverage for missing mul/shl nowrap variants

[gn build] Port a83004f4ff9e

[test] Fix D129789 for 32bit platforms

[libcxx][AIX][z/OS] Remove headers included via `__IBMCPP__`

D127650 removed support for non-clang-based XL compilers, but left some
of the headers used only by this compiler and included under the
__IBMCPP__ macro. This change cleans this up by deleting these headers.

Reviewed By: hubert.reinterpretcast, fanbo-meng

Differential Revision: https://reviews.llvm.org/D129491

[libc++] Add missing UNSUPPORTED annotations to experimental tests that use RTTI

[libcxxabi][CMake] Set --unwindlib=none when using LLVM libunwind

We already link libunwind explicitly so avoid trying to link toolchain's
default libunwind which may be missing. This matches what we already do
for libcxx.

Differential Revision: https://reviews.llvm.org/D129469

[InstrProf] Add options to profile function groups

Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i` used to partition functions into `N` groups and only instrument the functions in group `i`. Similar options were added to xray in https://reviews.llvm.org/D87953 and the goal is the same; to reduce instrumented size overhead by spreading the overhead across multiple builds. Raw profiles from different groups can be added like normal using the `llvm-profdata merge` command.
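
The partitioning idea can be sketched as follows. This is an illustration
under assumptions, not the patch itself: the commit message does not say how
a function is mapped to a group, so the FNV-1a hash and the helper names
`stableHash`, `functionGroup` and `shouldInstrument` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stable hash (32-bit FNV-1a) of a function name.
uint32_t stableHash(const std::string &name) {
  uint32_t h = 2166136261u;
  for (unsigned char c : name) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Maps a function name to one of numGroups groups.
unsigned functionGroup(const std::string &name, unsigned numGroups) {
  assert(numGroups > 0 && "need at least one group");
  return stableHash(name) % numGroups;
}

// A function is instrumented only in the build that selects its group.
bool shouldInstrument(const std::string &name, unsigned numGroups,
                      unsigned selectedGroup) {
  return functionGroup(name, numGroups) == selectedGroup;
}
```

Because each function lands in exactly one group, N builds with the
selected group set to 0 through N-1 cover every function once, and the
resulting raw profiles can be merged.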

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D129594

[clang][CodeGen] add fn_ret_thunk_extern to synthetic fns

Follow up fix to
commit 2240d72f15f3 ("[X86] initial -mfunction-return=thunk-extern
support")
https://reviews.llvm.org/D129572

@nathanchance reported that -mfunction-return=thunk-extern was failing
to annotate the asan and tsan constructors.
https://lore.kernel.org/llvm/Ys7pLq+tQk5xEa%2FB@dev-arch.thelio-3990X/

I then noticed the same occurring for gcov synthetic functions.

Similar to
commit 2786e67 ("[IR][sanitizer] Add module flag "frame-pointer" and set
it for cc1 -mframe-pointer={non-leaf,all}")
define a new module-level metadata flag, "fn_ret_thunk_extern", which, when set,
adds the fn_ret_thunk_extern IR function attribute to synthetically created
functions.

Fixes https://github.com/llvm/llvm-project/issues/56514

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129709

[PhaseOrdering][SystemZ] add test for combining/unrolling; NFC

As discussed in D128123, this test is based on an example
that ends up with codegen regressions if sub is converted
to xor.

[InstCombine] add/edit tests for masked sub from constant; NFC

[test][CodeGen] Don't miss lifetime markers in lifetime tests

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129789

[GlobalISel] Change widenScalar of G_FCONSTANT to mutate into G_CONSTANT.

Widening a G_FCONSTANT by extending and then generating G_FPTRUNC doesn't produce
the same result all the time. Instead, we can just transform it to a G_CONSTANT
of the same bit pattern and truncate using a plain G_TRUNC instead.
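
A small sketch of why the integer route is exact by construction. It works
on the raw bits of a 32-bit float widened to 64 bits; the helper names
`floatBits` and `widenThenTruncate` are hypothetical, and the commit message
does not say which constants go wrong with the extend/G_FPTRUNC route.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");

// Returns the raw bit pattern of a float.
uint32_t floatBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Widens the bit pattern as an integer constant, then truncates it back.
// No floating-point conversion is involved, so every pattern survives.
uint32_t widenThenTruncate(uint32_t bits) {
  uint64_t wide = bits;                          // integer extension
  uint32_t narrow = static_cast<uint32_t>(wide); // plain truncation
  assert(narrow == bits && "integer widen/truncate must be lossless");
  return narrow;
}
```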

Fixes https://github.com/llvm/llvm-project/issues/56454

Differential Revision: https://reviews.llvm.org/D129743

[libc++] Use __unwrap_iter_impl for both unwrapping and rewrapping

Reviewed By: ldionne, #libc

Spies: arichardson, sstefan1, libcxx-commits

Differential Revision: https://reviews.llvm.org/D129039

Revert "Rewording "static_assert" diagnostics"

This reverts commit b7e77ff25fb2412f6ab6d6cc756666b0e2f97bd3.

Reason: Broke sanitizer builds bots + libcxx. 'static assertion
expression is not an integral constant expression'. More details
available in the Phabricator review: https://reviews.llvm.org/D129048

[RISCV][LSR] Add coverage for ICmpZero with scaled vscale values

Follow up to 3bc09c7da5 - remove a fixme I forgot to remove, and add test cases showing remaining work.

Note that scaled vscales show up in vectorized code from a couple of sources:
* Element types smaller than vector block size (i.e. everything under i64)
* Unrolling
* LMUL > 1

The largest scaling we can currently have is 256 (e8 in every possible vector register). More practically useful scales are in the 2-16 range.

Revert "[lldb] Add support for using integral const static data members in the expression evaluator"

This reverts commit 486787210df5ce5eabadc90a7de353ae81101feb.

This broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/21186

[libc++] Error if someone tries to use MSVC and tell them to contact the libc++ developers

Nobody knows if there are users of libc++ with MSVC. Let's try to find that out and encourage them to upstream their changes to make that configuration work.

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D129055

[gn build] Port 1a8468ba6114

[NFC] Clang-format D129645

[lldb] [llgs] Remove not-really-used m_inferior_prev_state

Remove m_inferior_prev_state that's not suitable for multiprocess
debugging and that does not seem to be really used at all.

The only use of the variable right now is to "prevent" sending the stop
reason after attach/launch. However, this code is never actually run
since none of the process plugins actually use eStateLaunching or
eStateAttaching. Through adding an assert, I've confirmed that it's
never hit in any of the LLDB tests or while attaching/launching debugged
process via lldb-server and via lldb CLI.

Differential Revision: https://reviews.llvm.org/D128878
Sponsored by: The FreeBSD Foundation

Pass -DLIBXML2_INCLUDE_DIRS in the Windows release package script

As pointed out on https://reviews.llvm.org/D129571 this seems to
be the preferred variable to set.

[RISCV] Add a RISCV specific CodeGenPrepare pass.

Initial optimization is to convert (i64 (zext (i32 X))) to
(i64 (sext (i32 X))) if the dominating condition for the basic block
guaranteed the sign bit of X is zero.

This frequently occurs in loop preheaders where a signed induction
variable that can never be negative has been widened. There will be
a dominating check that the 32-bit trip count isn't negative or zero.
The check here is not restricted to that specific case though.

A i32->i64 sext is cheaper than zext on RV64 without the Zba
extension. Later optimizations can often remove the sext from the
preheader basic block because the dominating block also needs a sext to
evaluate the greater than 0 check.
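
The equivalence this relies on can be checked on plain integers: when the
sign bit of a 32-bit value is zero, zero-extension and sign-extension to 64
bits give the same result. A minimal sketch with hypothetical helper names
(this is not the pass itself):

```cpp
#include <cassert>
#include <cstdint>

// Zero-extends a 32-bit value to 64 bits.
uint64_t zext32to64(uint32_t x) { return static_cast<uint64_t>(x); }

// Sign-extends a 32-bit value to 64 bits using only unsigned arithmetic.
uint64_t sext32to64(uint32_t x) {
  return (static_cast<uint64_t>(x) ^ 0x80000000ull) - 0x80000000ull;
}

bool signBitIsZero(uint32_t x) { return (x & 0x80000000u) == 0; }

// If a dominating check proved the sign bit is zero, either extension may
// be used; this sketch picks the sign-extending form.
uint64_t extendKnownNonNegative(uint32_t x) {
  assert(signBitIsZero(x) && "caller must have proven the sign bit is zero");
  return sext32to64(x); // same value as zext32to64(x) here
}
```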

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D129732

[libc] Add nearest integer instructions to fputil.

Add round to nearest integer instructions to fputil. This will be
used in sinf implementation https://reviews.llvm.org/D123154

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D129776

[lldb] Remove ELF .zdebug support

clang 14 removed -gz=zlib-gnu support and ld.lld/llvm-objcopy removed zlib-gnu
support recently. Remove lldb support by migrating away from
llvm::object::Decompressor::isCompressedELFSection.
The API has another user llvm-dwp, so it is not removed in this patch.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D129724

[MachineCombiner] Don't compute the latency of transient instructions

If an MI will not generate a target instruction, we should not compute its
latency. Then we can compute a more precise instruction sequence cost and get
better results.

Differential Revision: https://reviews.llvm.org/D129615

[mlir][vector] Pattern to clean up vector.extract during distribution

This prevents blocking propagation when converting between scalar and
vector<1>

Differential Revision: https://reviews.llvm.org/D129782

[SimplifyIndVar] Use enum class for ExtendKind. NFC

I happened to notice two places where the enum was being passed
directly to the bool IsSigned argument of createExtendInst. This
was functionally ok since SignExtended in the enum has value
of 1, but the code shouldn't rely on that.

Using an enum class prevents the enum from being convertible to bool,
but does make writing the enum values more verbose. Since we now
have to write ExtendKind:: in front of them, I've shortened the
names of ZeroExtended and SignExtended.
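
A standalone sketch of the difference. The shortened enumerator names
`Zero` and `Sign` are an assumption (the message does not spell them out),
and `LegacyExtendKind`, `extendOpcode` and `extendOpcodeFor` are
hypothetical stand-ins rather than the SimplifyIndVar code.

```cpp
#include <cassert>
#include <type_traits>

// Unscoped enum: converts implicitly to bool, so it can be passed by
// accident where a bool parameter is expected.
enum LegacyExtendKind { LegacyZeroExtended, LegacySignExtended };

// Scoped enum: no implicit conversion to bool.
enum class ExtendKind { Zero, Sign };

static_assert(std::is_convertible<LegacyExtendKind, bool>::value,
              "unscoped enums convert to bool implicitly");
static_assert(!std::is_convertible<ExtendKind, bool>::value,
              "scoped enums do not convert to bool");

// Stand-in for an API that takes a bool IsSigned argument.
const char *extendOpcode(bool isSigned) { return isSigned ? "sext" : "zext"; }

// With the scoped enum the caller has to spell out the comparison.
const char *extendOpcodeFor(ExtendKind kind) {
  assert(kind == ExtendKind::Zero || kind == ExtendKind::Sign);
  return extendOpcode(kind == ExtendKind::Sign);
}
```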

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129733

[clang][test] fix typo in fn attr

While testing backports of
https://reviews.llvm.org/D129572#inline-1245936
commit 2240d72f15f3 ("[X86] initial -mfunction-return=thunk-extern support")
I noticed that one of my unit tests mistyped a function attribute. The
unit test was intended to test fn attr merging behavior, but with the
typo it was not. Small fixup.

Reviewed By: aaron.ballman, erichkeane

Differential Revision: https://reviews.llvm.org/D129691

[NFC] Move check for isEqualityOp to CheckFloatComparisons

So callers don't have to. Also, fix clang-format and use-of-auto issues in
CheckFloatComparisons.

[SCEV] Avoid creating unnecessary SCEVs for SelectInsts.

After 675080a4533b, we always create SCEVs for all operands of a
SelectInst. This can cause notable compile-time regressions compared to
the recursive algorithm, which only evaluates the operands if the select
is in a form we can create a usable expression.

This approach adds additional logic to getOperandsToCreate to only
queue operands for selects if we will later be able to construct a
usable SCEV.

Unfortunately this introduces a bit of coupling between actual SCEV
construction for selects and getOperandsToCreate, but I am not sure if
there are better alternatives to address the regression mentioned for
675080a4533b.

This doesn't have any notable compile-time impact on CTMark.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129731

[RISCV] Disable subregister liveness by default

We previously enabled subregister liveness by default when compiling
with RVV. This has been shown to cause miscompilations where RVV
register operand constraints are not met. A test was added for this in
D129639 which explains the issue in more detail.

Until this issue is fixed in some way, we should not be enabling
subregister liveness unless the user asks for it.

Reviewed By: craig.topper, rogfer01, kito-cheng

Differential Revision: https://reviews.llvm.org/D129646

[OpenMP] Ignore .eggs file in OpenMP

The OMPD patches introduce a GDB plugin. When it is built, it creates a
couple of temporary files in `.eggs`. This patch adds `.eggs` to `.gitignore` so
that these files do not interfere with git tracking.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D129711

[SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case

Motivation here is to unblock LSRs ability to use ICmpZero uses - the major effect of which is to enable count down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all.

SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values.

vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes.

(*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of every udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference.

Differential Revision: https://reviews.llvm.org/D129710

tsan: fix a bug in trace part switching

Callers of TraceSwitchPart expect that TraceAcquire will always succeed
after the call. It's possible that TryTraceFunc/TraceMutexLock in TraceSwitchPart
that restore the current stack/mutexset filled the trace part exactly up
to the TracePart::kAlignment gap and the next TraceAcquire won't succeed.
Skip the alignment gap after writing initial stack/mutexset to avoid that.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D129777

Revert "[UnifyLoopExits] Reduce number of guard blocks"

This reverts commit e13248ab0e79b59d5e5ac73e2fe57d82ce485ce1.

Need to revert because the transformation cannot occur for basic
blocks that contain convergent instructions.

[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>

This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses one of the comments
from that review: https://reviews.llvm.org/D129468#3643295

Differential Revision: https://reviews.llvm.org/D129565

[Reassociate] Cleanup minor missed optimizations

In analyzing issue #56483, it was noticed that running `opt` with
`-reassociate` was missing some minor optimizations. For example,
there were cases where running `opt` on IR with floating-point
instructions that have the `fast` flags applied, sometimes resulted in
less efficient code than the input IR (things like dead instructions
left behind, and missed reassociations). These were sometimes noted
in the test-files with TODOs, to investigate further. This commit
fixes some of these problems, removing some TODOs in the process.

FTR, I refer to these as "minor" missed optimizations, because when
running a full clang/llvm compilation, these inefficiencies are not
happening, as other passes clean that residue up. Regardless, having
cleaner IR produced by `opt`, makes assessing the quality of fixes done
in `opt` easier.

[lldb] Add support for using integral const static data members in the expression evaluator

This adds support for using const static integral data members as described by C++11 [class.static.data]p3
to LLDB's expression evaluator.

So far LLDB treated these data members as normal static variables. They already work as intended when they are declared in the class definition and then defined in a namespace scope. However, if they are declared and initialised in the class definition but never defined in a namespace scope, all LLDB expressions that use them will fail to link when LLDB can't find the respective symbol for the variable.

The reason for this is that the data members which are only declared in the class are not emitted into any object file so LLDB can never resolve them. Expressions that use these variables are expected to directly use their constant value if possible. Clang can do this for us during codegen, but it requires that we add the constant value to the VarDecl we generate for these data members.
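
A minimal sketch of the kind of data member in question, in plain C++
rather than LLDB code. `Config`, `MaxRetries` and `retriesLeft` are
hypothetical names.

```cpp
#include <cassert>

struct Config {
  // Declared and initialised in the class definition, never defined at
  // namespace scope.
  static const int MaxRetries = 3;
};

// Usable in constant expressions without any definition.
static_assert(Config::MaxRetries == 3, "value is known at compile time");

// Only the constant value is read here, so no symbol for
// Config::MaxRetries is needed. Taking its address or binding it to a
// reference would require an out-of-class definition.
int retriesLeft(int used) {
  assert(used >= 0 && used <= Config::MaxRetries);
  return Config::MaxRetries - used;
}
```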

This patch implements this by:
* parsing the constant values from the debug info and adding it to variable declarations we encounter.
* ensuring that LLDB doesn't implicitly try to take the address of expressions that might be an lvalue that points to such a special data member.

The second change is caused by LLDB's way of storing lvalues in the expression parser. When LLDB parses an expression, it tries to keep the result around via two mechanisms:

1. For lvalues, LLDB generates a static pointer variable and stores the address of the last expression in it: `T *$__lldb_expr_result_ptr = &LastExpression`
2. For everything else, LLDB generates a static variable of the same type as the last expression and then direct initialises that variable: `T $__lldb_expr_result(LastExpression)`

If we try to print a special const static data member via something like `expr Class::Member`, then LLDB will try to take the address of this expression as it's an lvalue. Taking the address of the variable prevents Clang from replacing the use with the constant value. There isn't any good way to detect this case (as there are a lot of different expressions that could yield an lvalue that points to such a data member), so with this patch we only use the first way of capturing the result if the last expression does not have a type that could potentially indicate it's coming from such a special data member.

This change shouldn't break most workflows for users. The only observable side effect I could find is that the implicit persistent result variables for const int's now have their own memory address:

Before this change:
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x00007ffeefbff8e8
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```

After this change we capture `i` by value, so the result variable has its own memory address:
```
(lldb) p i
(const int) $0 = 123
(lldb) p &$0
(const int *) $1 = 0x0000000100155320
(lldb) p &i
(const int *) $2 = 0x00007ffeefbff8e8
```

Reviewed By: Michael137

Differential Revision: https://reviews.llvm.org/D81471

[libc++] Test the size of basic_string

Reviewed By: ldionne, #libc

Spies: hubert.reinterpretcast, arichardson, mstorsjo, libcxx-commits

Differential Revision: https://reviews.llvm.org/D127672

[Bitcode] Report metadata decoding error more gracefully

Revert "[StructurizeCFG] Improve basic block ordering"

This reverts commit f1b05a0a2bbbea160002be709f8a1c59de366761.

Need to revert due to issues identified with testing. The
transformation is incorrect for blocks that contain convergent
instructions.

[mlir][vector] Support distribution of vector.reduce with accumulator

Previously, the pattern ignored the optional accumulator.

Differential Revision: https://reviews.llvm.org/D129719

Add support for three more string_view functions

1) starts_with(char)
2) ends_with(char)
3) find_first_of(char, size_t)

Reimplemented trim in terms of the new starts_with and ends_with.
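As a rough illustration of how `trim` can be built from the two predicates, here is a sketch on top of `std::string_view`. The free functions below are hypothetical helpers with assumed semantics; the patch adds members to its own string_view class, whose code is not shown in this log.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers; an empty view neither starts nor ends with any char.
inline bool starts_with(std::string_view s, char c) {
  return !s.empty() && s.front() == c;
}

inline bool ends_with(std::string_view s, char c) {
  return !s.empty() && s.back() == c;
}

// Assumed semantics: index of the first occurrence of c at or after `from`,
// or std::string_view::npos if there is none.
inline std::size_t find_first_of(std::string_view s, char c,
                                 std::size_t from = 0) {
  for (std::size_t i = from; i < s.size(); ++i)
    if (s[i] == c)
      return i;
  return std::string_view::npos;
}

// trim expressed through the two predicates: drop matching characters from
// the front, then from the back.
inline std::string_view trim(std::string_view s, char c) {
  while (starts_with(s, c))
    s.remove_prefix(1);
  while (ends_with(s, c))
    s.remove_suffix(1);
  return s;
}
```

Because both predicates return false on an empty view, the loops in `trim` terminate without a separate emptiness check.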

Tested:
New unit tests.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D129618

[analyzer] Fixing SVal::getType returns Null Type for NonLoc::ConcreteInt in boolean type

In method `TypeRetrievingVisitor::VisitConcreteInt`, `ASTContext::getIntTypeForBitwidth` is used to get the type for `ConcreteInt`s.
However, the getter in ASTContext cannot handle the boolean type with the bit width of 1, which will make method `SVal::getType` return a Null `Type`.
In this patch, a check for this case is added to fix this problem by returning the bool type directly when the bit width is 1.
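The shape of the fix can be sketched with a toy model. Nothing below is Clang's API: `ToyType`, `int_type_for_bit_width`, and `type_of_concrete_int` are hypothetical names, and the lookup only imitates a getter that has no answer for a 1-bit width.

```cpp
#include <optional>

// Toy model only; these names are hypothetical and are not Clang's API.
enum class ToyType { Bool, Int8, Int16, Int32, Int64 };

// Models a width-to-type lookup that only knows the ordinary integer widths
// and therefore returns nothing for a 1-bit value.
inline std::optional<ToyType> int_type_for_bit_width(unsigned bits) {
  switch (bits) {
  case 8:
    return ToyType::Int8;
  case 16:
    return ToyType::Int16;
  case 32:
    return ToyType::Int32;
  case 64:
    return ToyType::Int64;
  default:
    return std::nullopt;
  }
}

// Models the fix: answer the 1-bit case directly before consulting the lookup.
inline std::optional<ToyType> type_of_concrete_int(unsigned bits) {
  if (bits == 1)
    return ToyType::Bool;
  return int_type_for_bit_width(bits);
}
```

With the early return in place, a width of 1 maps to the boolean type while every other width still goes through the unchanged lookup.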

Differential Revision: https://reviews.llvm.org/D129737

[mlir][linalg][NFC] Cleanup: Drop linalg.inplaceable attribute

bufferization.writable is used in most cases instead. All remaining test cases are updated. Some code that is no longer needed is deleted.

Differential Revision: https://reviews.llvm.org/D129739

[clang] Do not crash on "requires" after a fatal error occurred.

The code would assume that SubstExpr() cannot fail on concept
specialization. This is incorrect - we give up on some things after a fatal
error has occurred, since there's no value in doing further work that the
user will not see anyway. In this case, this led to a crash.

The fatal error is simulated in tests with -ferror-limit=1, but this
could happen in other cases too.

Fixes https://github.com/llvm/llvm-project/issues/55401

Differential Revision: https://reviews.llvm.org/D129499

[lldb] [llgs] Convert m_debugged_processes into a map of structs

Convert the m_debugged_processes map from NativeProcessProtocol pointers
to structs, and combine the additional set(s) holding the additional
process properties into a flag field inside this struct. This is
desirable since there are more properties to come and having a single
structure with all information should be cleaner and more efficient than
using multiple sets for that.
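The data-structure change described above might look roughly like the following sketch. All names and flag values here (`ToyProcess`, `DebuggedProcess`, `Stopped`, `Detaching`) are hypothetical and only illustrate replacing several side sets with one flag field per map entry.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for the per-process object owned by the server.
struct ToyProcess {
  std::uint64_t pid;
};

struct DebuggedProcess {
  // Illustrative flags; each one replaces what would otherwise be a
  // separate set of process IDs.
  enum Flag : std::uint32_t { None = 0, Stopped = 1u << 0, Detaching = 1u << 1 };
  std::unique_ptr<ToyProcess> process;
  std::uint32_t flags = None;
};

using ProcessMap = std::unordered_map<std::uint64_t, DebuggedProcess>;

inline void add_process(ProcessMap &map, std::uint64_t pid) {
  DebuggedProcess &entry = map[pid];
  entry.process = std::make_unique<ToyProcess>(ToyProcess{pid});
}

// Precondition: pid is already in the map (at() throws otherwise).
inline void set_flag(ProcessMap &map, std::uint64_t pid,
                     DebuggedProcess::Flag flag) {
  map.at(pid).flags |= flag;
}

inline bool has_flag(const ProcessMap &map, std::uint64_t pid,
                     DebuggedProcess::Flag flag) {
  auto it = map.find(pid);
  return it != map.end() && (it->second.flags & flag) != 0;
}
```

A single lookup now returns both the process and all of its properties, and adding a property means adding one enumerator instead of another container.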

Suggested by Pavel Labath in D128893.

Differential Revision: https://reviews.llvm.org/D129652

Turn on flag to not re-run simplification pipeline.

This patch turns on the flag `-enable-no-rerun-simplification-pipeline`, which means the simplification pipeline will not be rerun on unchanged functions in the CGSCC pass manager.

Compile time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=17457be1c393ff691cca032b04ea1698fedf0301&to=882301ebb893c8ef9f09fe1ea871f7995426fa07&stat=instructions

No meaningful run time regressions observed in the llvm test suite and
in additional internal workloads at this time.

The example test in `test/Other/no-rerun-function-simplification-pipeline.ll` is a good means to understand the effect of this change:
```
define void @f1(void()* %p) alwaysinline {
  call void %p()
  ret void
}

define void @f2() #0 {
  call void @f1(void()* @f2)
  call void @f3()
  ret void
}

define void @f3() #0 {
  call void @f2()
  ret void
}
```

There are two SCCs formed by the ModuleToPostOrderCGSCCAdaptor: (f1) and (f2, f3).

The pass manager runs on the first SCC, leading to running the simplification pipeline (function and loop passes) on f1. With the flag on, after this, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f1`.

Next, the pass manager runs on the second SCC: (f2, f3). Since f1() was inlined, f2() now calls itself, and also calls f3(), while f3() only calls f2().
So the pass manager for the SCC first runs the Inliner on (f2, f3), then the simplification pipeline on f2.
With the flag on, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f2`; unless the inliner makes a change, this analysis remains preserved, which means there's no reason to rerun the simplification pipeline. With the flag off, the simplification pipeline is run on f2 a second time.

Next, the same flow occurs for f3. The simplification pipeline is run on f3 a single time with the flag on, along with `ShouldNotRunFunctionPassesAnalysis on f3`, and twice with the flag off.
The reruns occur only on f2 and f3 due to the additional ref edges.
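The skip-if-still-preserved idea can be modelled in a few lines. This is a toy model and not LLVM's pass manager: `ToyCGSCCManager` and its methods are hypothetical, and a per-function marker stands in for the cached `ShouldNotRunFunctionPassesAnalysis` result.

```cpp
#include <string>
#include <unordered_set>

// Toy model: a cached marker per function records that the simplification
// pipeline already ran and that nothing has changed the function since.
class ToyCGSCCManager {
public:
  explicit ToyCGSCCManager(bool no_rerun) : no_rerun_(no_rerun) {}

  // Returns true if the pipeline actually ran on the function.
  bool run_simplification(const std::string &fn) {
    if (no_rerun_ && marker_.count(fn))
      return false; // marker still preserved: skip the rerun
    ++runs_;
    marker_.insert(fn); // analogous to computing the marker analysis
    return true;
  }

  // A transformation such as inlining changed the function: drop the marker.
  void invalidate(const std::string &fn) { marker_.erase(fn); }

  int runs() const { return runs_; }

private:
  bool no_rerun_;
  int runs_ = 0;
  std::unordered_set<std::string> marker_;
};
```

With the flag modelled as on, a second request for the same function is skipped unless `invalidate` was called in between; with it off, every request runs the pipeline again.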

[libc++] Allow setting _LIBCPP_OVERRIDABLE_FUNC_VIS

Chromium changes this flag to be able to use a custom new/delete from a
dylib.

[flang][OpenMP] Added semantic checks for hint clause

This patch improves semantic checks for the hint clause.
It checks "hint-expression is a constant expression
that evaluates to a scalar value with kind
`omp_sync_hint_kind` and a value that is a valid
synchronization hint."
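To make "a valid synchronization hint" concrete, the following sketch checks a hint value against the combination rules in the OpenMP specification. It is not flang's implementation: `is_valid_sync_hint` and the `k...` constant names are hypothetical, while the numeric values are those the specification gives for the `omp_sync_hint_*` constants.

```cpp
#include <cstdint>

// Values of omp_sync_hint_uncontended, _contended, _nonspeculative and
// _speculative as defined by the OpenMP specification.
constexpr std::int64_t kUncontended = 0x1;
constexpr std::int64_t kContended = 0x2;
constexpr std::int64_t kNonspeculative = 0x4;
constexpr std::int64_t kSpeculative = 0x8;

// Hypothetical helper: a hint is valid if it only uses known bits and does
// not combine uncontended with contended, or nonspeculative with speculative.
constexpr bool is_valid_sync_hint(std::int64_t hint) {
  constexpr std::int64_t all =
      kUncontended | kContended | kNonspeculative | kSpeculative;
  if (hint < 0 || (hint & ~all) != 0)
    return false;
  const bool both_contention =
      (hint & kUncontended) != 0 && (hint & kContended) != 0;
  const bool both_speculation =
      (hint & kNonspeculative) != 0 && (hint & kSpeculative) != 0;
  return !both_contention && !both_speculation;
}
```

Under these rules `kContended | kSpeculative` is accepted, while `kUncontended | kContended` is rejected because the two hints contradict each other.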

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D127615

[flang][OpenMP] Lowering support for atomic update construct

This patch adds lowering support for the atomic update construct. A region
is associated with every `omp.atomic.update` operation wherein resides:
(1) the evaluation of the expression on the RHS of the atomic assignment
statement, and (2) an `omp.yield` operation that yields the extended value
of the expression evaluated in (1).

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D125668

[LoopPredication] Use isSafeToExpandAt() member function (NFC)

As a followup to D129630, this switches a usage of the freestanding
function in LoopPredication to use the member variant instead. This
was the last use of the freestanding function, so drop it entirely.

[SCEVExpander] Make CanonicalMode handling in isSafeToExpand() more robust (PR50506)

isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.

Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.
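The API shape described in (a) and (b) can be sketched as follows. This is a toy model with hypothetical types, not the SCEVExpander interface, and the safety rule inside the free function is a simplified placeholder. The free function is given a distinct name here only to keep the sketch free of name-hiding issues.

```cpp
// Toy model with hypothetical types; this is not the SCEVExpander API.
struct ToyExpr {
  bool is_add_rec;    // stands in for an add recurrence
  bool has_preheader; // whether its loop has a preheader to insert into
};

// Freestanding check: the mode parameter has no default, so every caller
// must state it explicitly. Placeholder rule: outside canonical mode an
// add recurrence needs a preheader.
inline bool is_safe_to_expand_with_mode(const ToyExpr &e, bool canonical_mode) {
  if (e.is_add_rec && !canonical_mode)
    return e.has_preheader;
  return true;
}

class ToyExpander {
public:
  explicit ToyExpander(bool canonical_mode) : canonical_mode_(canonical_mode) {}

  // Member variant: forwards the expander's own mode, so the mode used for
  // the check cannot differ from the mode used for expansion.
  bool is_safe_to_expand(const ToyExpr &e) const {
    return is_safe_to_expand_with_mode(e, canonical_mode_);
  }

private:
  bool canonical_mode_;
};
```

Callers that already hold an expander use the member and never pass the mode themselves, which removes the opportunity for a mismatch.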

Fixes https://github.com/llvm/llvm-project/issues/50506.

Differential Revision: https://reviews.llvm.org/D129630

[llvm-objdump] Create fake sections for a ELF core file

The Linux perf tools use /proc/kcore to disassemble kernel functions.
Specifically, perf copies the relevant parts to a temp file and then passes it to
objdump. But the file doesn't have section headers, so llvm-objdump cannot
handle it.

Let's create fake section headers from the program headers: a single
section for each segment, covering its entire range. For this purpose
we only need to consider executable code segments.
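A minimal sketch of the idea, assuming reduced header structs rather than llvm-objdump's real data structures: `ProgramHeader`, `SectionHeader`, and `make_fake_sections` are hypothetical, and using the segment's file size as the section size is an assumption. The numeric constants are the standard ELF values.

```cpp
#include <cstdint>
#include <vector>

// Constants from the ELF specification.
constexpr std::uint32_t kPT_LOAD = 1;
constexpr std::uint32_t kPF_X = 0x1;
constexpr std::uint32_t kSHT_PROGBITS = 1;
constexpr std::uint64_t kSHF_ALLOC = 0x2;
constexpr std::uint64_t kSHF_EXECINSTR = 0x4;

// Reduced, hypothetical views of a 64-bit program header and section header.
struct ProgramHeader {
  std::uint32_t type, flags;
  std::uint64_t offset, vaddr, filesz;
};
struct SectionHeader {
  std::uint32_t type;
  std::uint64_t flags, addr, offset, size;
};

// Builds one fake section per executable PT_LOAD segment, covering the
// segment's whole range. Other segments are skipped.
inline std::vector<SectionHeader>
make_fake_sections(const std::vector<ProgramHeader> &phdrs) {
  std::vector<SectionHeader> out;
  for (const ProgramHeader &ph : phdrs) {
    if (ph.type != kPT_LOAD || (ph.flags & kPF_X) == 0)
      continue;
    out.push_back({kSHT_PROGBITS, kSHF_ALLOC | kSHF_EXECINSTR, ph.vaddr,
                   ph.offset, ph.filesz});
  }
  return out;
}
```

Given one executable load segment, one writable load segment, and one note segment, this produces a single section whose address, offset, and size mirror the executable segment.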

With this change, the following command produces proper output:

perf annotate --stdio --objdump=/path/to/llvm-objdump

Differential Revision: https://reviews.llvm.org/D128705

[mlir][Linalg] Retire LinalgPromotion pattern

This revision removes the LinalgPromotion pattern and adds a `transform.structured.promotion` op.
Since the LinalgPromotion transform allows the injection of arbitrary C++ via lambdas, the current
transform op does not handle it.
It is left for future work to decide what the right transform op control is for those cases.

Note the underlying implementation remains unchanged and the mechanism is still controllable by
lambdas from the API.

During this refactoring it was also determined that the `dynamicBuffers` option does not actually
connect to a change of behavior in the algorithm.
This also shows that the related test is wrong (and dangerous).
Both the option and the test are therefore removed.

Lastly, a test that connects patterns using the filter-based mechanism is removed: all the independent
pieces are already tested separately.

Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785

Differential Revision: https://reviews.llvm.org/D129649

Rewording "static_assert" diagnostics

This patch rewords the static assert diagnostic output. Failing a
_Static_assert in C should not report that static_assert failed. This
changes the wording to be more like GCC and uses "static assertion"
when possible instead of hard coding the name. This also changes some
instances of 'static_assert' to instead be based on the token in the
source code.

Differential Revision: https://reviews.llvm.org/D129048

[IndVars] Eliminate redundant type cast between unsigned integer and float

Extend the transform to unsigned integers, following the review comment on D129191.
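As background on why such a cast pair can be redundant, the sketch below round-trips an unsigned value through `float`. It illustrates the underlying arithmetic only and is not the pass's legality check, whose exact conditions are not stated in this log; `round_trip_through_float` and `kExactLimit` are hypothetical names, and `float` is assumed to be IEEE-754 binary32.

```cpp
#include <cstdint>

// Converts an unsigned value to float and back, as a redundant cast pair
// in a loop body would.
inline std::uint32_t round_trip_through_float(std::uint32_t x) {
  return static_cast<std::uint32_t>(static_cast<float>(x));
}

// Assuming IEEE-754 binary32, float has a 24-bit significand, so every
// integer below 2^24 is represented exactly.
constexpr std::uint32_t kExactLimit = 1u << 24;
```

For any value below `kExactLimit` the round trip returns its input unchanged, so the two casts cancel. Above that limit some values are rounded (for example `kExactLimit + 1`), which is why a transform of this kind has to reason about the value range.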

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129358

Thread safety analysis: Don't erase TIL_Opcode type (NFC)

This is mainly for debugging, but it also eliminates some casts.