review.tizen.org Git - platform/upstream/llvm.git/log

[NewPM] Add option to prevent rerunning function pipeline on functions in CGSCC adaptor

In a CGSCC pass manager, we may visit the same function multiple times
due to SCC mutations. In the inliner pipeline, this results in running
the function simplification pipeline on a function multiple times even
if it hasn't been changed since the last function simplification
pipeline run.

We use a newly introduced analysis to keep track of whether or not a
function has changed since the last time the function simplification
pipeline has run on it. If we see this analysis available for a function
in a CGSCCToFunctionPassAdaptor, we skip running the function passes on
the function. The analysis is queried at the end of the function passes
so that it's available after the first time the function simplification
pipeline runs on a function. This is a per-adaptor option so it doesn't
apply to every adaptor.

The goal of this is to improve compile times. However, currently we
can't turn this on by default at least for the higher optimization
levels since the function simplification pipeline is not robust enough
to be idempotent in many cases, resulting in performance regressions if
we stop running the function simplification pipeline on a function
multiple times. We may be able to turn this on for -O1 in the near
future, but turning this on for higher optimization levels would require
more investment in the function simplification pipeline.

Heavily inspired by D98103.

Example compile time improvements with flag turned on:
https://llvm-compile-time-tracker.com/compare.php?from=998dc4a5d3491d2ae8cbe742d2e13bc1b0cacc5f&to=5c27c913687d3d5559ef3ab42b5a3d513531d61c&stat=instructions

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D113947

[SLP][NFC]Add a test for multiple alternate nodes with cost estimation,
NFC.

[Format, Sema] Use range-based for loops with llvm::reverse (NFC)

[OpenMP] Silence build warnings when built with MinGW

There's an attempt to upstream this change in
https://github.com/intel/ittapi/pull/25 too.

Differential Revision: https://reviews.llvm.org/D114069

[libc++] Remove _LIBCPP_HAS_NO_SPACESHIP_OPERATOR

All supported compilers support spaceship in C++20 nowadays.

Differential Revision: https://reviews.llvm.org/D113938

[libc++abi] Don't re-define _LIBCPP_HAS_NO_THREADS in single-threaded mode

Libc++ already defines the macro inside its __config_site header, so
libc++abi doesn't need to do it. Doing it just leads to -Wmacro-redefined
warnings when building libc++abi.

[NFC][gn build] Inclusive language: replace master with main in sync_source_lists_from_cmake.py

[NFC] As part of using inclusive language within the llvm project and to
match the renamed master branch, this patch replaces master with main in
sync_source_lists_from_cmake.py.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D113926

[libc] Use more consistent if defined syntax

[libc] Fix missing restricts

[libc] Fix documentation typo

[mlir][Vector] First step for 0D vector type

There seems to be a consensus that we should allow 0D vectors:
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097

This commit is only the first step: it changes the verifier and the parser to
allow vectors like `vector<f32>` (but does not allow explicit 0 dimensions,
i.e., `vector<0xf32>` is not allowed).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114086

[analyzer][NFC] Make the API of CallDescription safer slightly

The new //deleted// constructor overload makes sure that no implicit
conversion from `0` would happen to `ArrayRef<const char*>`.

Also adds nodiscard to the `CallDescriptionMap::lookup()`

[NFC][clang] Inclusive terms: replace uses of blacklist in clang/test/

Replace filenames, variable names, check prefixes uses of blacklist with ignore list.

Reviewed By: jkorous

Differential Revision: https://reviews.llvm.org/D113211

[NFC][clangd] cleanup llvm-else-after-return findings

Cleanup of clang-tidy findings: removing "else" after a return statement
to improve readability of the code.

This patch was created by applying the clang-tidy fixes automatically.

Differential Revision: https://reviews.llvm.org/D113892

[clangd] Fix assertion crashes on unmatched NOLINTBEGIN comments.

The overload shouldSuppressDiagnostic seems unnecessary, and it is only
used in clangd.

This patch removes it and use the real one (suppression diagnostics are
discarded in clangd at the moment).

Fixes https://github.com/clangd/clangd/issues/929

Differential Revision: https://reviews.llvm.org/D113999

asan: don't use thread user_id

asan does not use user_id for anything,
so don't pass it to ThreadCreate.
Passing a random uninitialized field of AsanThread
as user_id does not make much sense anyway.

Depends on D113921.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113922

memprof: don't use thread user_id

memprof does not use user_id for anything,
so don't pass it to ThreadCreate.
Passing a random field of MemprofThread as user_id
does not make much sense anyway.

Depends on D113920.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113921

lsan: remove pthread_detach/join interceptors

They don't seem to do anything useful in lsan.
They are needed only if a tools needs to execute
some custom logic during detach/join, or if it uses
thread registry quarantine. Lsan does none of this.
And if a tool cares then it would also need to intercept
pthread_tryjoin_np and pthread_timedjoin_np, otherwise
it will mess thread states.
Fwiw, asan does not intercept these functions either.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113920

tsan: don't consider debug calls as calls

Tsan pass does 2 optimizations based on presence of calls:
1. Don't emit function entry/exit callbacks if there are no calls
and no memory accesses.
2. Combine read/write of the same variable if there are no
intervening calls.
However, all debug info is represented as CallInst as well
and thus effectively disables these optimizations.
Don't consider debug info calls as calls.

Reviewed By: glider, melver

Differential Revision: https://reviews.llvm.org/D114079

Add a clang-transformer tutorial

Differential Revision: https://reviews.llvm.org/D114011

[NFC][AMDGPU][GlobalISel] Fix some legalizer tests

Instructions being tested were accidentally left dead.

[AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods

If possible fold fneg into instruction above if users cannot fold mods and we
know it will decrease instruction count.
Follows same logic as SDAG combiner in choosing opportunities to combine.

Differential Revision: https://reviews.llvm.org/D112827

Improve docs & test for #pragma clang attribute's any clause; NFC

There was some confusion during the discussion of a patch as to whether
`any` can be used to blast an attribute with no subject list onto
basically everything in a program by not specifying a subrule. This
patch adds documentation and tests to make it clear that this situation
is not supported and will be diagnosed.

[Analysis] Ensure getTypeLegalizationCost returns a simple VT for TypeScalarizeScalableVector

When getTypeConversion returns TypeScalarizeScalableVector we were
sometimes returning a non-simple type from getTypeLegalizationCost.
However, many callers depend upon this being a simple type and will
crash if not. This patch changes getTypeLegalizationCost to ensure
that we always a return sensible simple VT. If the vector type
contains unusual integer types, e.g. <vscale x 2 x i3>, then we just
set the type to MVT::i64 as a reasonable default.

A test has been added here that demonstrates the vectoriser can
correctly calculate the cost of vectorising a "zext i3 to i64"
instruction with a VF=vscale x 1:

Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll

Differential Revision: https://reviews.llvm.org/D113777

[AMDGPU] Generate test checks for mad_64_32.ll

Differential Revision: https://reviews.llvm.org/D113985

[DAG] SimplifyDemandedVectorElts - zero_extend_vector_inreg(and(x,c)) -> and(x,c')

If we've only demanded the 0'th element, and it comes from a (one-use) AND, try to convert the zero_extend_vector_inreg into a mask and constant fold it with the AND.

[fir] !fir.tdesc type conversion

Add !fir.tdesc type conversion.
!fir.tdesc is converted to a llvm.ptr<i8>.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113769

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[Analysis] Fix getNumberOfParts to return 0 when the answer is unknown

When asking how many parts are required for a scalable vector type
there are occasions when it cannot be computed. For example, <vscale x 1 x i3>
is one such vector for AArch64+SVE because at the moment no matter how we
promote the i3 type we never end up with a legal vector. This means
that getTypeConversion returns TypeScalarizeScalableVector as the
LegalizeKind, and then getTypeLegalizationCost returns an invalid cost.
This then causes BasicTTImpl::getNumberOfParts to dereference an invalid
cost, which triggers an assert. This patch changes getNumberOfParts to
return 0 for such cases, since the definition of getNumberOfParts in
TargetTransformInfo.h states that we can use a return value of 0 to represent
an unknown answer.

Currently, LoopVectorize.cpp is the only place where we need to check for
0 as a return value, because all other instances will not currently
ask for the number of parts for <vscale x 1 x iX> types.

In addition, I have changed the target-independent interface for
getNumberOfParts to return 1 and assume there is a single register
that can fit the type. The loop vectoriser has lots of tests that are
target-independent and they relied upon the 0 value to mean the
answer is known and that we are not scalarising the vector.

I have added tests here that show we correctly return an invalid cost
for VF=vscale x 1 when the loop contains unusual types such as i7:

Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll

Differential Revision: https://reviews.llvm.org/D113772

[DebugInfo][NFC] Force some tests to not use instruction-referencing

There are various tests that need to be adjusted to test the right
thing with instruction referencing -- usually because the internal
representation of variables is different, sometimes that location lists
change. This patch makes a bunch of tests explicitly not use
instruction referencing, so that a check-llvm test with instruction
referencing on for x86_64 doesn't fail. I'll then convert the tests
to have instr-ref CHECK lines, and similar.

Differential Revision: https://reviews.llvm.org/D113194

[Thumb2] Regenerate test impacted by e8b55cf7b70a695d158d.

[fir] Add fir.box_tdesc conversion

This patch adds the conversion pattern for
`fir.box_tdes`.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113931

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[SCEV] Apply loop guards when computing max BTC for arbitrary steps.

Similar other cases in the current function (e.g. when the step is 1 or
-1), applying loop guards can lead to tighter upper bounds for the
backedge-taken counts.

Fixes PR52464.

Reviewed By: reames, nikic

Differential Revision: https://reviews.llvm.org/D113578

[lldb/test] TestRegisterVariables test fix

Revert "[runtimes] Fix building initial libunwind+libcxxabi+libcxx with compiler implied -lunwind"

This reverts commit 7c3d19ab7bcb79636bd65ee55a0fefef224fcb25.

This commit was reported as causing build problems for the amdgpu
buildbot in https://reviews.llvm.org/D113253#3137097.

[IR] Change CreateStepVector to work with element types smaller than i8

Currently the stepvector intrinsic only supports element types that
are integers of size 8 bits or more. This patch adds support for the
creation of stepvectors with smaller element types by creating
the intrinsic with i8 elements that we then truncate to the requested
size.

It's not currently possible to write a vectoriser test to exercise
this code path so I have added a unit test here:

llvm/unittests/IR/IRBuilderTest.cpp

Differential Revision: https://reviews.llvm.org/D113767

[RISCV] Add extra -early-live-intervals test coverage

Add test coverage for a problem that was fixed by D113493: when updating
live intervals, fix handling of live ranges that were previously tied to
an early-clobber def but no longer are.

[CodeGen] Update LiveIntervals in TargetInstrInfo::convertToThreeAddress

Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.

Differential Revision: https://reviews.llvm.org/D113493

[fir] Add conversion patterns for slice, shape, shapeshift and shift ops

The information in these perations is used by other operation.
At this point they should not have anymore uses.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113971

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[libc][benchmark] add memmove to size distribution, also update other distributions

Differential Revision: https://reviews.llvm.org/D113260

[InstCombine] Use SpecificBinaryOp_match in two more places

Differential Revision: https://reviews.llvm.org/D114038

[NFC][X86][Costmodel] Improve test coverage for i32->i64 vector *ext

[X86][Costmodel] `*ext v64i1 to v32i16` can appear after legalization, cost is same as for `*ext v32i1 to v32i16`

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113914

[X86][Costmodel] `trunc v32i16 to v64i1` can appear after legalization, cost is same as for `trunc v32i16 to v32i1`

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113913

[SelectionDAG] Make WidenVecRes_SELECT work for scalable vectors

This change make WidenVecRes_SELECT work for scalable vectors.

This patch is split from [D110319](https://reviews.llvm.org/D110319)

Signed-off-by: Eric Tang <tangxingxin1008@gmail.com>
Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D110388

[lldb/test] Added lldbutil function to test a breakpoint

Testing the breakpoint itself rather than the lldb string.

Differential Revision: https://reviews.llvm.org/D111899

[libcxx] [ci] Add CI configurations for MinGW

Mention support for MinGW in the docs. Rename the existing windows
CI jobs to Clang-cl, as both Clang-cl and MinGW are equally much
"Windows", just different toolchain environments.

Add an XFAIL for a recently added test that fails in the MinGW DLL
configuration (with an explanation of what's causing the failure).

Differential Revision: https://reviews.llvm.org/D112215

[libc] Fix incorrect revert of 1ee3205

The revert, b2fbd45d2395f1f6ef39db72b7156724fc101e40, incorrectly
re-introduced a few lines removed in 7c3d19ab7bcb79636bd65ee55a0fefef224fcb25

[flang] Remove default argument from function template specialization. NFC.

Patch D113697 added default function arguments to template specializations of `ConvertToBinary`.

According to https://en.cppreference.com/w/cpp/language/template_specialization this not allowed:
> Default function arguments cannot be specified in explicit specializations of function templates, member function templates, and member functions of class templates when the class is implicitly instantiated.

It happens to compile with gcc, clang and msvc 14.30 (Visual Studio 2022), but not msvc 14.29 (Visual Studio 2020). Even for the compilers that syntactically accept it, the default argument will never be used (only the default argument of the template declaration). From https://en.cppreference.com/w/cpp/language/function_template
> Note that only non-template and primary template overloads participate in overload resolution.

That is, the explicit function template specialization is not added to the overload candidate set. Only after all the parameter types are known, are the explicit specializations chosen, at which point the default function argument is ignored.

Also see D85657.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D114032

[X86][FP16] add alias for f*mul_*ch intrinsics

*_mul_*ch is to align with *_mul_*s, *_mul_*d and *_mul_*h.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D112777

[scudo] Handle mallinfo2

mallinfo is deprecated by GLIBC

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D113951

ADT: Adding a key_type definition to MapVector

The key_type type definition for map containers is useful in some
generic, template-based programming scenarios. The addition of key_type
to MapVector is consistent with other map types like DenseMap.

Differential Revision: https://reviews.llvm.org/D113242

[MachO] Move type size asserts to source files. NFC

As discussed in https://reviews.llvm.org/D113809#3128636. It's a bit
unfortunate to move the asserts away from the structs whose sizes
they're checking, but it's a far better developer experience when one of
the asserts is violated, because you get a single error instead of every
single source file including the header erroring out.

Revert "[gn build] (manually) port 1ee32055ea1d (benchmark move)"

1ee32055ea1d was reverted in 67de95b8c955.

This reverts commit a8e8e2d5a22636dba2f38d5eda7440a31933e8be
and follow-up a0dc6001df20fcd92e8eb9e196d403859f1b5192

[lld-macho][nfc] Sanity check on template type

Differential Revision: https://reviews.llvm.org/D114044

[mlir] Fix formatting in Ops.td files (NFC)

MemRefOps.td has some inconsistencies in its formatting of argument
lists.

Revert "[libc][NFC][Obvious] Fix the benchmarks after the switch to llvm/third-party"

This reverts commit 39e9f5d3685f3cfca0df072928ad96d973704dff.

Reverting, as we needed to re-revert the benchmarks move because it was
causing a build failure in the Fuchsia bots due to the way they consume
libcxx's CMakeLists. I want to make sure I understand where the fix
should be for that. After that, I'll incorporate the change here in the
re-reland.

[Bazel] Ignore both old and new benchmark directories

This is getting reverted and relanded a lot, breaking the build each
time.

Differential Revision: https://reviews.llvm.org/D114043

Revert "Make it possible for lldb to launch a remote binary with no local file."

The reworking of the gdb client tests into the PlatformClientTestBase broke
the test for this. I did the mutatis mutandis for the move, but the test
still fails. Reverting till I have time to figure out why.

This reverts commit b715b79d54d5ca2d4e8c91089b8f6a9389d9dc48.

[mlir][lsp] Use ResultGroupDefinition struct

This struct was added and was intended to be used, but it was missed in the original patch.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D114041

Revert "Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'"""

This reverts commit 1ee32055ea1dd4db70d1939cbd4f5105c2dce160.

We hit additional bot failures; in particular, Fuchsia's seems to be
related to how CMakeLists are ingested, see https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8830380874445931681/overview

[MachO] Shrink reloc from 32 bytes to 24 bytes

The `r_address` field of `relocation_info` is only 4 bytes, so our
offset field (which is the `r_address` field adjusted for subsection
splitting) also only needs to be 4 bytes. This reduces the structure
size from 32 bytes to 24 bytes.

Combined with https://reviews.llvm.org/D113813, this is a minor perf
improvement for linking an internal app, tested on two machines:

```
           smol-relocs     baseline        difference (95% CI)
sys_time   7.367 ± 0.138   7.543 ± 0.157   [  +0.9% ..   +3.8%]
user_time  21.843 ± 0.351  21.861 ± 0.450  [  -1.3% ..   +1.4%]
wall_time  20.301 ± 0.307  20.556 ± 0.324  [  +0.1% ..   +2.4%]
samples    16              16

           smol-relocs     baseline        difference (95% CI)
sys_time   2.923 ± 0.050   2.992 ± 0.018   [  +1.4% ..   +3.4%]
user_time  10.345 ± 0.039  10.448 ± 0.023  [  +0.8% ..   +1.2%]
wall_time  12.068 ± 0.071  12.229 ± 0.021  [  +1.0% ..   +1.7%]
samples    15              12
```

More importantly though, this change by itself reduces our maximum
resident set size by 220 MB (2.75%, from 7.85 GB to 7.64 GB) on the
first machine. On the second machine, it reduces it by 125 MB (1.94%,
from 6.31 GB to 6.19 GB).

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113818

[MachO] Reduce size of Symbol and Defined

We can lay out Symbol more optimally to reduce its size from 56 bytes to
48 bytes by eliminating unnecessary padding, and we can lay out Defined
such that its bitfield members are placed in the tail padding of Symbol
(on ABIs which support this), to reduce it from 96 bytes to 80 bytes (8
bytes from the Symbol reduction, and 8 bytes from the tail padding
reuse).

This is perf-neutral for an internal app (results from two different
machines):

```
           smol-syms       baseline        difference (95% CI)
sys_time   7.430 ± 0.202   7.440 ± 0.193   [  -2.6% ..   +2.9%]
user_time  21.443 ± 0.513  21.206 ± 0.396  [  -3.3% ..   +1.1%]
wall_time  20.453 ± 0.534  20.222 ± 0.488  [  -3.7% ..   +1.5%]
samples    9               8

           smol-syms       baseline        difference (95% CI)
sys_time   3.011 ± 0.050   3.040 ± 0.052   [  -0.4% ..   +2.3%]
user_time  10.416 ± 0.075  10.496 ± 0.091  [  +0.1% ..   +1.4%]
wall_time  12.229 ± 0.144  12.354 ± 0.192  [  -0.1% ..   +2.1%]
samples    14              13
```

However, on the first machine, it reduces maximum resident set size by
65.9 MB (0.8%, from 7.92 GB to 7.85 GB). On the second machine, it
reduces it by 92 MB (1.4%, from 6.40 GB to 6.31 GB).

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113813

[MachO] Fix struct size assertion

It was checking for 64-bit builds incorrectly. Unfortunately,
ConcatInputSection has grown a bit in the meantime, and I don't see any
obvious way to shrink it. Perhaps icfEqClass could use 32-bit hashes
instead of 64-bit ones, but xxHash64 is supposed to be much faster than
xxHash32 (https://github.com/Cyan4973/xxHash#benchmarks), so that sounds
like a loss. (Unrelatedly, we should really look at using XXH3 instead
of xxHash64 now.)

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113809

Make it possible for lldb to launch a remote binary with no local file.

We don't actually need a local copy of the main executable to debug
a remote process. So instead of treating "no local module" as an error,
see if the LaunchInfo has an executable it wants lldb to use, and if so
use it. Then report whatever error the remote server returns.

Differential Revision: https://reviews.llvm.org/D113521

[gn build] (manually) port 1ee32055ea1d more (benchmark move)

The new benchmark dep apparently has more source files than the old one.

[gn build] (manually) port 1ee32055ea1d (benchmark move)

[ADT] Add unit test for EquivalanceClasses comparator

This unit tests tests new functionality added by D112052.

Differential Revision: https://reviews.llvm.org/D113461

Don't add irrelevant items to queue in DwarfCompileUnit::createScopeChildrenDIE (NFC)

Instead of popping them and then immediately throwing them away, we can
just filter out globals and items in different scopes before adding them
to WorkList. Shouldn't change anything but keep the queue smaller.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D113864

[DebugInfo] Use DbgEntityKind in DbgEntity interface (NFC)

It was being used occasionally already, and using it on the constructor
and getDbgEntityID has obvious type safety benefits.

Also use llvm_unreachable in the switch as usual, but since only these
two values are used in constructor calls I think it's still NFC.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D113862

[PowerPC] Fix a nullptr dereference

LiMI1/LiMI2 can be null, so don't call a method on them before checking.
Found by ubsan.

[ARM] Update test comments after D114018. NFC

The TODOs are now fixed as of D114018, so can be removed. Which is nice.

Limit test to x86 for now.

Coverage: Fix iterated type for LineCoverageIterator

LineCoverageIterator is not providing access to a mutable object. Fix it
to iterate over `const LineCoverageStats` so that `operator->()`
compiles again after 6b9b86db9dd974585a5c71cf2e5231d1e3385f7c.

[lldb] use EXT_SUFFIX for python extension

LLDB doesn't use only the python stable ABI, which means loading
it into an incompatible python can cause the process to crash.
_lldb.so should be named with the full EXT_SUFFIX from sysconfig
-- such as _lldb.cpython-39-darwin.so -- so this doesn't happen.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D112972

[libc][NFC][Obvious] Fix the benchmarks after the switch to llvm/third-party

[llvm-objcopy] Add --update-section

This is another attempt at D59351 which attempted to add --update-section, but
with some heuristics for adjusting segment/section offsets/sizes in the event
the data copied into the section is larger than the original size of the section.
We are opting to not support this case. GNU's objcopy was able to do this because
the linker and objcopy are tightly coupled enough that segment reformatting was
simpler. This is not the case with llvm-objcopy and lld where they like to be separated.

This will attempt to copy data into the section without changing any other
properties of the parent segment (if the section is part of one).

Differential Revision: https://reviews.llvm.org/D112116

[lldb] fix -print-script-interpreter-info on windows

Apparently "{sys.prefix}/bin/python3" isn't where you find the
python interpreter on windows, so the test I wrote for
-print-script-interpreter-info is failing.

We can't rely on sys.executable at runtime, because that will point
to lldb.exe not python.exe.

We can't just record sys.executable from build time, because python
could have been moved to a different location.

But it should be OK to apply relative path from sys.prefix to sys.executable
from build-time to the sys.prefix at runtime.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D113650

[scudo] Regression test for the MTE crash in storeEndMarker.

The original problem was fixed in D105261.

Differential Revision: https://reviews.llvm.org/D114022

[runtimes] Fix building initial libunwind+libcxxabi+libcxx with compiler implied -lunwind

This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:

Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.

Differential Revision: https://reviews.llvm.org/D113253

[runtimes] Fix incorrect comment about the purpose of LLVM_DEFAULT_TARGET_TRIPLE

5beec6fb04e7 added LLVM_DEFAULT_TARGET_TRIPLE to the runtimes build with
a comment, however I believe that comment had been copied from the LLVM
build tree. In the context of the runtimes, LLVM_DEFAULT_TARGET_TRIPLE
is used to set what targets we are building for, not the target for which
we "generate code".

Differential Revision: https://reviews.llvm.org/D114007

[libc++] Unspecified behavior randomization in libc++

This effort is dedicated to deflake the tests of the users which depend
on the unspecified behavior of algorithms and containers. This also
might help updating the sorting algorithm in libcxx which has the
quadratic worst case in the future or at least create a new one under
flag.

For detailed design, please see the design doc I provide in the patch.

Differential Revision: https://reviews.llvm.org/D96946

[X86] Add shift by splat modulo amount vector tests

Shows failure to fold zero_extend_vector_inreg(and(x, c)) -> bitcast(and(x,c')) when we're only demanding the 0'th extended element, such as with the SSE variable shift ops.

[libc++] Adjust comment about ABI change and std::bad_function_call

[fir] Add fir.string_lit conversion

This patch adds the conversion pattern for
the fir.string_lit operation.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113992

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[SCEV] Canonicalize X - urem X, Y patterns

There are multiple possible ways to represent the X - urem X, Y pattern. SCEV was not canonicalizing, and thus, depending on which you analyzed, you could get different results. The sub representation appears to produce strictly inferior results in practice, so I decided to canonicalize to the Y * X/Y version.

The motivation here is that runtime unroll produces the sub X - (and X, Y-1) pattern when Y is a power of two. SCEV is thus unable to recognize that an unrolled loop exits because we don't figure out that the new unrolled step evenly divides the trip count of the unrolled loop. After instcombine runs, we convert the the andn form which SCEV recognizes, so essentially, this is just fixing a nasty pass ordering dependency.

The ARM loop hardware interaction in the test diff is opague to me, but the comments in the review from others knowledge of the infrastructure appear to indicate these are improvements in loop recognition, not regressions.

Differential Revision: https://reviews.llvm.org/D114018

[mlir] Fix clang5 build after D113641

[compiler-rt/profile] Reland mark __llvm_profile_raw_version as hidden

Since libclang_rt.profile is added later in the command line, a
definition of __llvm_profile_raw_version is not included if it is
provided from an earlier object, e.g.  from a shared dependency.

This causes an extra dependence edge where if libA.so depends on libB.so
and both are coverage-instrumented, libA.so uses libB.so's definition of
__llvm_profile_raw_version.  This leads to a runtime link failure if the
libB.so available at runtime does not provide this symbol (but provides
the other dependent symbols).  Such a scenario can occur in Android's
mainline modules.
E.g.:
  ld -o libB.so libclang_rt.profile-x86_64.a
  ld -o libA.so -l B libclang_rt.profile-x86_64.a

libB.so has a global definition of __llvm_profile_raw_version.  libA.so
uses libB.so's definition of __llvm_profile_raw_version.  At runtime,
libB.so may not be coverage-instrumented (i.e. not export
__llvm_profile_raw_version) so runtime linking of libA.so will fail.

Marking this symbol as hidden forces each binary to use the definition
of __llvm_profile_raw_version from libclang_rt.profile.  The visiblity
is unchanged for Apple platforms where its presence is checked by the
TAPI tool.

Reviewed By: MaskRay, phosek, davidxl

Differential Revision: https://reviews.llvm.org/D111759

[flang] Fix a bug in INQUIRE(IOLENGTH=) output

The inquire by output list form of the INQUIRE statement calculates the
number of file storage units that would be required to store the data
of an output list in an unformatted file. Currently, the result is
incorrectly multiplied by the number of bytes for a data type. A query
for "INTEGER(KIND=4) A(10)" should be 40, not 160.

Update formatting.

[libc] Correct rounding for hexadecimalStringToFloat with long inputs.

Update binaryExpTofloat so that it will round correctly for long inputs when converting hexadecimal strings to floating points.

Differential Revision: https://reviews.llvm.org/D113815

[libc++] Always define a key function for std::bad_function_call in the dylib

However, whether applications rely on the std::bad_function_call vtable
being in the dylib is still controlled by the ABI macro, since changing
that would be an ABI break.

Also separate preprocessor definitions for whether to use a key function
and whether to use a `bad_function_call`-specific `what` message
(`what` message is mandated by [LWG2233](http://wg21.link/LWG2233)).

Differential Revision: https://reviews.llvm.org/D92397

[Loads] Handle addrspacecast constant expressions when determining dereferenceability

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D114015

[libc++] [NFC] Disable clang-tidy's readability-identifier-naming check

In libc++ most of the names are not conforming to the llvm style. Removing the readability-identifier-naming check removes almost all clang-tidy warnings. For example in `<string>` the warning count goes from 1001 warnings down to 7.

Reviewed By: #libc, Mordante, ldionne

Spies: Mordante, Quuxplusone, aheejin, libcxx-commits, carlosgalvezp

Differential Revision: https://reviews.llvm.org/D113849

[PowerPC] PPC backend optimization on conditional trap intrustions

This patch adds PPC back end optimization to analyze the arguments of a
conditional trap instruction to execute one of the following:
1. Delete it if never trap
2. Replace it if always trap
3. Otherwise keep it

Reviewed By: nemanjai, amyk, PowerPC

Differential revision: https://reviews.llvm.org/D111434

[test] Precommit test for D114015

[CSSPGO] Fix a hash code truncating issue in ContextTrieNode.

std::hash returns a 64bit hash code while previously we were using only lower 32 bits which caused hash collision for large workloads.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D113688

[llvm] Add a SFINAE template parameter to DenseMapInfo

This allows for using SFINAE partial specialization for DenseMapInfo.
In MLIR, this is particularly useful as it will allow for defining partial
specializations that support all Attribute, Op, and Type classes without
needing to specialize DenseMapInfo for each individual class.

Differential Revision: https://reviews.llvm.org/D113641

[NFC][Regalloc] Factor out eviction decision from eviction attempt

This splits tryEvict into a const tryFindEvictionCandidate, which
attempts to find a candidate, and the actual eviction (should the former
be successful)

The newly introduced tryFindEvictionCandidate will move subsequently
into the RegAllocEvictionAdvisor.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D113941

[Bazel] Update .bazelignore for moved google/benchmark

We need to avoid directly processing the Bazel config in LLVM's copy of
google/benchmark, which was moved in
https://github.com/llvm/llvm-project/commit/1ee32055ea.

Differential Revision: https://reviews.llvm.org/D114014

Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'""

This reverts commit e7568b68da8a216dc22cdc1c6d8903c94096c846 and relands
c6f7b720ecfa6db40c648eb05e319f8a817110e9.

The culprit was: missed that libc also had a dependency on one of the
copies of `google-benchmark`

Also opportunistically fixed indentation from prev. change.

Differential Revision: https://reviews.llvm.org/D112012

DebugInfo: Make DWARFExpression::iterator a const iterator

3d1d8c767be5537eb5510ee0522e2f3475fe7c04 made
DWARFExpression::iterator's Operation member `mutable`. After a few prep
commits, the iterator can instead be made a `const` iterator since no
caller can change the Operation.

Differential Revision: https://reviews.llvm.org/D113958