review.tizen.org Git - platform/upstream/llvm.git/log

[mlir:GreedyPatternRewriter] Add debug logging for pattern rewriter actions

This effectively mirrors the logging in dialect conversion, which has proven
very useful for understanding the pattern application process.

Differential Revision: https://reviews.llvm.org/D112120

[NFC] Clean up a few methods within GreedyPatternRewriter

Move a few methods out of line and clean up comments.

Avoid infinity arithmetics when computing exp approximations

Otherwise this can result a poison value on some platforms see https://bugs.llvm.org/show_bug.cgi?id=51204

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D112115

[test][ORC-RT] Disable x86_64 tests when target arch does not match

When cross-compiling, these tests will fail. For now leave the host arch
check that was already there since I don't know why it was added.

[fir] Add Character helper

This patch is extracted from D111337. It introduce the
CharacterExprHelper that helps dealing with character in FIR.

Reviewed By: schweitz, awarzynski

Differential Revision: https://reviews.llvm.org/D112140

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

[VectorCombine] fold shuffle-of-binops with common operand

shuf (bo X, Y), (bo X, W) --> bo (shuf X), (shuf Y, W)

This is motivated by an example in D111800
(although that patch avoids the problem for that particular example).

The pattern is shown in reduced form with:
https://llvm.org/PR52178
https://alive2.llvm.org/ce/z/d8zB4D

There is no difference on the PhaseOrdering test from D111800
because the aarch64 cost model says that the shuffle cost is 3 while
the fadd cost is 2.

Differential Revision: https://reviews.llvm.org/D111901

Reland [clang] Pass -clear-ast-before-backend in Clang::ConstructJob()

This clears the memory used for the Clang AST before we run LLVM passes.

https://llvm-compile-time-tracker.com/compare.php?from=d0a5f61c4f6fccec87fd5207e3fcd9502dd59854&to=b7437fee79e04464dd968e1a29185495f3590481&stat=max-rss
shows significant memory savings with no slowdown (in fact -O0 slightly speeds up).

For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.

Turn this off for the interpreter since it does codegen multiple times.

Relanding with fix for -print-stats: D111973

Relanding with fix for plugins: D112190

If you'd like to use this even with plugins, consider using the features
introduced in D112096.

This can be turned off with -Xclang -no-clear-ast-before-backend.

Differential Revision: https://reviews.llvm.org/D111270

[RISCV] Add a test showing incorrect VSETVLI insertion

This test case, reduced from an internal test failure, shows how we may
incorrectly skip the insertion of VSETVLI instructions when doing
cross-basic-block analysis.

The entry block ends in a `e32,mf2`. Its single successor, %bb.1, ends with a
`e8,mf8`, but for a mask-type instruction, so is considered compatible.
This means that the info %bb.1 is merged into its predecessor so
produces a `e32,mf2`. When it comes to the last block, which requires a
`e32,mf2`, we skip the insertion of a vsetvli because all predecessors
were determined to preserve the right vtype.

However, when %bb.1 is actually laid out it does actually need a
`e8,mf8` vsetvli, since the previous instruction has a different tail
policy. This means that when execution flows from %bb.1 to %bb.3, the
`vadd.vx` is misconfigured.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112223

[IPT] Restructure cache to allow lazy update following invalidation [NFC]

This change restructures the cache used in IPT to point not to the first special instruction, but to the first instruction which *could* be special. That is, the cached reference is always equal to the first special, or comes before it in the block.

This avoids expensive block scans when we are removing special instructions from the beginning of the block. At the moment, this case is not heavily used, though it does trigger in GVN when doing CSE of calls. The main motivation was a change I'm no longer planning to move forward with, but the cache optimization seemed worthwhile as a minor perf win at low cost.

Differential Revision: https://reviews.llvm.org/D111768

Update the title and encoding for the C++ status page

Update the C++ and C status pages now that Clang 13 has been released

[clang] Don't clear AST if we have consumers running after the main action

Downstream users may have Clang plugins. By default these plugins run
after the main action if they are specified on the command line.

Since these plugins are ASTConsumers, presumably they inspect the AST.
So we shouldn't clear it if any plugins run after the main action.

Reviewed By: dblaikie, hans

Differential Revision: https://reviews.llvm.org/D112190

Reapply [ORC-RT] Configure the ORC runtime for more architectures and platforms

Reapply 5692ed0cce8c95, but with the ORC runtime disabled explicitly on
CrossWinToARMLinux to match the other compiler-rt runtime libraries.

Differential Revision: https://reviews.llvm.org/D112229

---

Enable building the ORC runtime for 64-bit and 32-bit ARM architectures,
and for all Darwin embedded platforms (iOS, tvOS, and watchOS). This
covers building the cross-platform code, but does not add TLV runtime
support for the new architectures, which can be added independently.

Incidentally, stop building the Mach-O TLS support file unnecessarily on
other platforms.

Differential Revision: https://reviews.llvm.org/D112111

[clang] Use StringRef::contains (NFC)

[DebugInfo] Support typedef with btf_decl_tag attributes

Clang patch ([1]) added support for btf_decl_tag attributes with typedef
types. This patch added llvm support including dwarf generation.
For example, for typedef
   typedef unsigned * __u __attribute__((btf_decl_tag("tag1")));
   __u u;
the following shows llvm-dwarfdump result:
   0x00000033:   DW_TAG_typedef
                   DW_AT_type      (0x00000048 "unsigned int *")
                   DW_AT_name      ("__u")
                   DW_AT_decl_file ("/home/yhs/work/tests/llvm/btf_tag/t.c")
                   DW_AT_decl_line (1)

   0x0000003e:     DW_TAG_LLVM_annotation
                     DW_AT_name    ("btf_decl_tag")
                     DW_AT_const_value     ("tag1")

   0x00000047:     NULL

  [1] https://reviews.llvm.org/D110127

Differential Revision: https://reviews.llvm.org/D110129

[Clang] Support typedef with btf_decl_tag attributes

Previously, btf_del_tag attribute supports record, field, global variable,
function and function parameter ([1], [2]). This patch added support for typedef.
The main reason is for typedef of an anonymous struct/union, we can only apply
btf_decl_tag attribute to the anonymous struct/union like below:
typedef struct { ... } __btf_decl_tag target_type
In this case, the __btf_decl_tag attribute applies to anonymous struct,
which increases downstream implementation complexity. But if
typedef with btf_decl_tag attribute is supported, we can have
typedef struct { ... } target_type __btf_decl_tag
which applies __btf_decl_tag to typedef "target_type" which make it
easier to directly associate btf_decl_tag with a named type.
This patch permitted btf_decl_tag with typedef types with this reason.

[1] https://reviews.llvm.org/D106614
[2] https://reviews.llvm.org/D111588

Differential Revision: https://reviews.llvm.org/D110127

[libc++] Use addressof in vector.

This addresses the usage of `operator&` in `<vector>`.

I now added tests for the current offending cases. I wonder whether it
would be better to add one addressof test per directory and test all
possible violations. Also to guard against possible future errors?

(Note there are still more headers with the same issue.)

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D111961

[lld-macho] Simplify lc-linker-option.ll and re-enable it on Windows

While attempting to simplify it, I discovered a concerning discrepancy
between our handling of LC_LINKER_OPTION vs ld64's. In particular, ld64
does not appear to check for `-all_load` nor `-ObjC` when processing
those options. Thus, if/when we fix this behavior, no duplicate symbol
error will be expected regardless of the use-after-free. As such, I've
removed the test logic that tries to induce the duplicate symbol error.
We can just rely on ASAN to do the verification.

In order to make the test run on Windows, I've removed the symlink
logic. Both ld64 and LLD handle this un-symlinked framework just fine.

I also capitalized the framework name, since that's the typical
convention.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D112195

[ORC-RT] Remove stray printf debugging output.

These were accidentally picked up in an earlier commit.

[mlir][Linalg] Improve conv vectorization for the stride==1 case.

In the stride == 1 case, conv1d reads contiguous data along the input dimension. This can be advantageaously used to bulk memory transfers and compute while avoiding unrolling. Experimentally, this can yield speedups of up to 50%.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D112139

[libomptarget][DeviceRTL] Generalise and simplify cmakelists

Step towards building the DeviceRTL for amdgpu.

Mostly replaces cuda-specific toolchain finding logic with the
generic logic currently found in the amdgpu deviceRTL cmake. Also
deletes dead code and changes the default to build on systems
without cuda installed, as the library doesn't use cuda and the
amdgpu-only systems generally won't have cuda installed.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111983

[InstCombine] generalize reassociated Demorgan folds

This updates the recent D112108 / b92412fb286be26d
to handle the flipped logic ('or') sibling:
https://alive2.llvm.org/ce/z/Y2L6Ch

[InstCombine] add tests for DeMorgan with reassociation; NFC

These are direct mutations of the tests added for D112108 -
we should handle the sibling folds for 'or'.

Do not downcast uint64_t to unsigned in UniqueID hash computation

Context: https://reviews.llvm.org/D110925#inline-1070046

[runtimes] Properly handle the sysroot/triple/gcc-toolchain

In 395271a, I simplified how we handled the target triple for the
runtimes. However, in doing so, we stopped considering the default
in CMAKE_CXX_COMPILER_TARGET, so we'd use the LLVM_DEFAULT_TARGET_TRIPLE
(which is the host triple) even if CMAKE_CXX_COMPILER_TARGET was specified.
This commit fixes that problem and also refactors the code so that it's
easy to see what the default value is.

The fact that nobody seems to have been broken by this makes me think
that perhaps nobody is using CMAKE_CXX_COMPILER_TARGET to specify the
triple -- but it should still work.

Differential Revision: https://reviews.llvm.org/D111672

[SystemZ][z/OS] Initial implementation for lowerCall on z/OS

- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying a f64 to a gr64 and a f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added and has been bitcasted appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D111662

[DAGCombiner] fold bit-hack form of usubsat

(i8 X ^ 128) & (i8 X s>> 7) --> usubsat X, 128

I haven't found a generalization of this identity:
https://alive2.llvm.org/ce/z/_sriEQ

Note: I was actually looking at the first form of the pattern in that link,
but that's part of a long chain of potential missed transforms in codegen
and IR....that I hope ends here!

The predicates for when this is profitable are a bit tricky. This version of
the patch excludes multi-use but includes custom lowering (as opposed to
legal only).

On x86 for example, we have custom lowering for some vector types, and that
uses umax and sub. So to enable that fold, we need add use checks to avoid
regressions. Even with legal-only lowering, we could see code with extra
reg move instructions for extra uses, so that constraint would have to be
eased very carefully to avoid penalties.

Differential Revision: https://reviews.llvm.org/D112085

[SystemZ][z/OS] Additional test coverage for validating dialect instructions for SystemZ

- There are certain instructions most notably those with extended mnemonics that restricted to only the gnu/att variant
- There are also certain instruction aliases/mnemonic aliases that are restricted only to the HLASM variant (see https://reviews.llvm.org/D97581, https://reviews.llvm.org/D94250 and https://reviews.llvm.org/D92185 for reference)
- This patch adds a few tests to check for the behaviour introduced in the above patches. The testing coverage could not be added in at the same time, due to parallel work being done introducing the HLASM syntax

Reviewed By: uweigand, abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D112172

[SLP]Unify vectorization of PHI and store nodes with improved tiny tree vectorization.

Vectorization of PHIs and stores very similar, it might be beneficial to
try to revectorize stores (like PHIs) if the total number of stores with
the same/alternate opcode is less than the vector size but number of
stores with the same type is larger than the vector size.

Differential Revision: https://reviews.llvm.org/D109831

[mlir][linalg][bufferize] Fix bufferizesToMemoryWrite for TiledLoopOp

This is the same fix as for scf.for.

Differential Revision: https://reviews.llvm.org/D112218

[mlir][linalg][bufferize] Fix bug in getInplaceableOpResult

Differential Revision: https://reviews.llvm.org/D112123

[mlir][linalg][bufferize] Avoid creating copies that are never read

Differential Revision: https://reviews.llvm.org/D111956

[mlir][linalg][bufferize] Eliminate InitTensorOps of InsertSliceOp sources

An InitTensorOp is replaced with an ExtractSliceOp on the InsertSliceOp's destination. This optimization is applied after analysis and only to InsertSliceOps that were decided to bufferize inplace. Another analysis on the new ExtractSliceOp is needed after the rewrite.

Differential Revision: https://reviews.llvm.org/D111955

Relax assert in ExprConstant to a return None.

Fixes a compiler assert on passing a compile time integer to atomic builtins.

Assert introduced in D61522
Function changed from ->bool to ->Optional in D76646
Simplifies call sites to getIntegerConstantExpr to elide the now-redundant
isValueDependent checks.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D112159

[clang][deps] Make resource directory deduction configurable

The `clang-scan-deps` CLI tool invokes the compiler with `-print-resource-dir` in case the `-resource-dir` argument is missing from the compilation command line. This is to enable running the tool on compilation databases that use compiler from a different toolchain than `clang-scan-deps` itself. While this doesn't make sense when scanning modular builds (due to the `-cc1` arguments the tool generates), the tool can can be used to efficiently scan for file dependencies of non-modular builds too.

This patch stops deducing the resource directory by invoking the compiler by default. This mode can still be enabled by invoking `clang-scan-deps` with `--resource-dir-recipe invoke-compiler`. The new default is `--resource-dir-recipe modify-compiler-path` which relies on the resource directory deduction taking place in `Driver::Driver` which is based on the compiler path. This makes the default more aligned with the intended usage of the tool while still allowing it to serve other use-cases.

Note that this functionality was also influenced by D108979, where the dependency scanner stopped going through `ClangTool::run`. The function tried to deduce the resource directory based on the current executable path, which might not be what the users expect when invoked from within a shared library.

Depends on D108979.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108366

[lldb] Fix a thinko in 2ace1e57

An empty plugin name means we should try everything.

Picked up by the windows bot.

[SVE] Fix selection failure when splitting extended masked loads

When splitting a masked load, `GetDependentSplitDestVTs` is used to get the
MemVTs of the high and low parts. If the masked load is extended, this
may return VTs with different element types which are used to create the
high & low masked load instructions.
This patch changes `GetDependentSplitDestVTs` to ensure we return VTs with
the same element type.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D111996

[MIPS] Fix switching between 32/64-bit variants of r6 target triples

If clang driver gets 64-bit r6 target triple like `mipsisa64r6` and
additional option forces switching to generation of 32-bit code, it
loses r6 abi and generates 32-bit r2-r5 abi code.

```
$ clang -target mipsisa64r6-linux-gnu -mabi=32
```

This patch fixes the problem.

- Add optional `SubArchType` argument to the `Triple::setArch()` method.
- Implement generation of mips r6 target triples in the
`Triple::getArchName()` method.

Differential Revision: https://reviews.llvm.org/D110514.diff

[ARM] Add new abs test. NFC

[clang][deps] NFC: Rename building CompilerInvocation

The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives descriptive name to the generated `CompilerInvocation` that can be used to derive the command-line to build a modular dependency.

Depends on D111725.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111728

[clang][deps] NFC: Rename scanning CompilerInstance

The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives a distinct name to the `CompilerInstance` that's used to run the implicit build during dependency scan.

Depends on D111724.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111725

[clang][deps] NFC: Remove redundant CompilerInstance reference

The `ModuleDepCollectorPP` class holds a reference to `ModuleDepCollector` as well as `ModuleDepCollector`'s `CompilerInstance`. The fact that these refer to the same object is non-obvious.

This patch removes the `CompilerInvocation` reference from `ModuleDepCollectorPP` and accesses it through `ModuleDepCollector` instead.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111724

[clang][deps] Ensure reported context hash is strict

One of main goals of the dependency scanner is to be strict about module compatibility. This is achieved through strict context hash. This patch ensures that strict context hash is enabled not only during the scan itself (and its minimized implicit build), but also when actually reporting the dependency.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111720

Revert "AddGlobalAnnotations for function with or without function body."

This reverts commit 121b2252de0eed68f2ddf5f09e924a6c35423d47.

The following code causes a crash in some circumstances:

  struct k {
    ~k() __attribute__((annotate(""))) {}
  };
  void m() { k(); }

[lldb] Silence -Wpessimizing-move warning

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp:3635:10: error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
return std::move(merged);
^

[lldb] Remove ConstString from GetPluginNameStatic of some plugins

This patch deals with ObjectFile, ObjectContainer and OperatingSystem
plugins. I'll convert the other types in separate patches.

In order to enable piecemeal conversion, I am leaving some ConstStrings
in the lowest PluginManager layers. I'll convert those as the last step.

Differential Revision: https://reviews.llvm.org/D112061

[mlir] Fix a crash when creating a 1d zero element LLVM constant

Fixes a regression introduced in f9be7a7afda3c90b99c9f50e5eff1624da5a6511

Differential Revision: https://reviews.llvm.org/D112208

[mlir] Use empty() calls where possible.

These are based on findings from the ClangTidy
readability-container-size-empty check.

[lldb] Add omitted abstract formal parameters in DWARF symbol files

This patch fixes a problem introduced by clang change
https://reviews.llvm.org/D95617 and described by
https://bugs.llvm.org/show_bug.cgi?id=50076#c6, where inlined functions
omit unused parameters both in the stack trace and in `frame var`
command. With this patch, the parameters are listed correctly in the
stack trace and in `frame var` command.

Specifically, we parse formal parameters from the abstract version of
inlined functions and use those formal parameters if they are missing
from the concrete version.

Differential Revision: https://reviews.llvm.org/D110571

[NFC][LoopIdiom] Make for loops more readable

Patch simplifies for loops in LIR following LLVM guidelines: https://llvm.org/docs/CodingStandards.html#use-range-based-for-loops-wherever-possible.

Differential Revision: https://reviews.llvm.org/D112077

[libcxx] Throw correct exception from std::vector::reserve

According to the standard [vector.capacity]/5, std::vector<T>::reserve
shall throw an exception of type std::length_error when the requested
capacity exceeds max_size().

This behavior is not implemented correctly: the function 'reserve'
simply propagates the exception from allocator<T>::allocate. Before
D110846 that exception used to be of type std::length_error (which is
correct for vector<T>::reserve, but incorrect for
allocator<T>::allocate).

This patch fixes the issue and adds regression tests.

Reviewed By: Quuxplusone, ldionne, #libc

Differential Revision: https://reviews.llvm.org/D112068

[libcxx] Support allocators with explicit c-tors in vector<bool>

std::vector<bool> rebinds the supplied allocator to construct objects
of type '__storage_type' rather than 'bool'. Allocators are allowed to
use explicit conversion constructors, so care must be taken when
performing conversions.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D112150

Revert "[fir] Add Character helper"

This reverts commit e4ce92245c96cea9492767d7149eb9e30dee0d16.

Buildbots not happy with the tests.

[clang] Support __float128 on DragonFlyBSD.

Differential Revision: https://reviews.llvm.org/D111760

[docs] Fix broken link rendering in the LLVM Coding Standards.

[lldb] [Host/SerialPort] Add std::moves for better compatibility

[lldb] [Host/Terminal] Add missing #ifdef for baudRateToConst()

[lldb] [unittest] Disable SetParity() tests on Linux entirely

Attempting to enable PARENB causes tcsetattr() to fail on the Debian
and Ubuntu buildbots, so let's skip these tests on Linux entirely.

[lldb] Add serial:// protocol for connecting to serial port

Add a new serial:// protocol along with SerialPort that provides a new
API to open serial ports.  The URL consists of serial device path
followed by URL-style options, e.g.:

    serial:///dev/ttyS0?baud=115200&parity=even

If no options are provided, the serial port is only set to raw mode
and the other attributes remain unchanged.  Attributes provided via
options are modified to the specified values.  Upon closing the serial
port, its original attributes are restored.

Differential Revision: https://reviews.llvm.org/D111355

[NARY-REASSOCIATE][NFC] Simplify min/max handling

In order to explore different variants of reassociation current implementation uses "swap in a loop" approach. Unfortunately, the implementation is more complicated than it could be. This is an attempt to streamline the code. New approach is to extract core functionality into a helper function and call it explicitly as many times as required.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112128

[mlir][linalg][bufferize][NFC] Change findValueInReverseUseDefChain signature

This commit is in preparation for scf.if support.

* `condition` in findValueInReverseUseDefChain takes a Value instead of OpOperand*.
* Return a SetVector<Value> instead of a single Value. This SetVector always contains exactly one Value at the moment.

Differential Revision: https://reviews.llvm.org/D111928

[SVE][Analysis] Tune the cost model according to the tune-cpu attribute

This patch introduces a new function:

  AArch64Subtarget::getVScaleForTuning

that returns a value for vscale that can be used for tuning the cost
model when using scalable vectors. The VScaleForTuning option in
AArch64Subtarget is initialised according to the following rules:

1. If the user has specified the CPU to tune for we use that, else
2. If the target CPU was specified we use that, else
3. The tuning is set to "generic".

For CPUs of type "generic" I have assumed that vscale=2.

New tests added here:

  Analysis/CostModel/AArch64/sve-gather.ll
  Analysis/CostModel/AArch64/sve-scatter.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D110259

[lldb] [Host] Add setters for common teletype properties to Terminal

Add setters for common teletype properties to the Terminal class:

- SetRaw() to enable common raw mode options

- SetBaudRate() to set the baud rate

- SetStopBits() to select the number of stop bits

- SetParity() to control parity bit in the output

- SetHardwareControlFlow() to enable or disable hardware control flow
(if supported)

Differential Revision: https://reviews.llvm.org/D111030

[MLIR][OpenMP] Add support for ordered construct

This patch supports the ordered construct in OpenMP dialect following
Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR
using OpenMP IRBduiler. Lowering to LLVM IR for ordered simd directive
is not supported yet since LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015

[mlir][linalg][bufferize][NFC] Check return value of getResultBuffer

In a subsequent commit, getResultBuffer can return a "null" Value. This is the case when the returned buffer from an scf.if is not unique.

This commit is in preparation for scf.if support to keep the next commit smaller.

Differential Revision: https://reviews.llvm.org/D111927

[mlir][linalg][bufferize] Bufferize using PostOrder traversal

This is required for bufferization of scf::IfOp, which is added in a subsequent commit.

Some ops (scf::ForOp, TiledLoopOp) require PreOrder traversal to make sure that bbArgs are mapped before bufferizing the loop body.

Differential Revision: https://reviews.llvm.org/D111924

[lldb][NFC] clang-format CPlusPlusLanguage.cpp

[fir] Add Character helper

This patch is extracted from D111337. It introduce the
CharacterExprHelper that helps dealing with character in FIR.

Reviewed By: schweitz, awarzynski

Differential Revision: https://reviews.llvm.org/D112140

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

[NFC][LoopIdiom] Add more test case to runtime-determined memset size

This patch supplements missing test case for D107353.
- Fix wrong descriptions in 64-bit mode test case
- Added testcase under 32-bit mode

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D108507

[LLDB] [NFC] Typo fix in usage text for "type filter" command

When you invoke "help type filter" the resulting help shows:

Syntax: type synthetic [<sub-command-options>]

This patch fixes the help so it says "type filter" instead of "type synthetic".

patch by: "Daniel Jalkut <jalkut@red-sweater.com>"

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D112199

[opt-viewer] Use safe yaml load_all

Differential Revision: https://reviews.llvm.org/D112075

Revert "[MLIR][OpenMP] Add support for ordered construct"

This reverts commit dc2be87ecf10f2f1cf05f638a72256387c78f1c1.

Seems like this broke all the CI bots.

[ELF] Avoid adding an orphan section to a less suitable segment

If segments are defined in a linker script, placing an orphan section
before the found closest-rank section can result in adding it in a
previous segment and changing flags of that segment. This happens if
the orphan section has a lower sort rank than the found section. To
avoid that, the patch forces orphan sections to be moved after the
found section if segments are explicitly defined.

Differential Revision: https://reviews.llvm.org/D111717

[NFC][msan] Add NormalArgAfterNoUndef testcase

[NFC][msan] Rerun update_test_checks.py for a test

[NFC][msan] Break the loop when done

We have nothing to do after the Argument
is found.

[lld-macho][nfc] Added some notes on deliberate differences btw LD64 vs LLD-MACHO

For future references and to help with debugging crashes, this could be useful.

Differential Revision: https://reviews.llvm.org/D110464

[Codegen] Set ARITH_FENCE as meta-instruction

ARITH_FENCE, which was added by https://reviews.llvm.org/D99675,
should be a meta-instruction b/c it only emits comments "ARITH_FENCE".

Reviewed By: pengfei, LuoYuanke

Differential Revision: https://reviews.llvm.org/D112127

[modules] While merging ObjCInterfaceDecl definitions, merge them as decl contexts too.

While working on https://reviews.llvm.org/D110280 I've tried to merge
decl contexts as it seems to be correct and matching our handling of
decl contexts from different modules. It's not required for the fix in
https://reviews.llvm.org/D110280 but it revealed a missing diagnostic,
so separating this change into a separate commit.

Renamed some variables to distinguish diagnostic like "declaration of
'x' does not match" for different cases.

Differential Revision: https://reviews.llvm.org/D110287

[MLIR][OpenMP] Add support for ordered construct

This patch supports the ordered construct in OpenMP dialect following
Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR
using OpenMP IRBduiler. Lowering to LLVM IR for ordered simd directive
is not supported yet since LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015

[Driver][OpenBSD] Some improvements to the external assembler handling

- Pass CPU variant for ARM
- Pass MIPS CPU in addition to the ABI

[ARM] Use correct name of floating point ceil intrinsic in test.

The intrinsic is called llvm.ceil not llvm.fceil. The checks weren't
strong enough to notice that a call to llvm.fceil was emitted in
the final assembly.

[msan] Add stat-family interceptors on Linux

Add following interceptors on Linux: stat, lstat, fstat, fstatat.

This fixes use-of-uninitialized value on platforms with GLIBC 2.33+.
In particular: Arch Linux, Ubuntu hirsute/impish.

The tests should have also been failing during the release on the mentioned platforms, but I cannot find any related discussion.

Most likely, the regression was introduced by glibc commit [[ https://github.com/bminor/glibc/commit/8ed005daf0ab03e142500324a34087ce179ae78e | 8ed005daf0ab03e14250032 ]]:
all stat-family functions are now exported as shared functions.

Before, some of them (namely stat, lstat, fstat, fstatat) were provided as a part of libc_noshared.a and called their __xstat dopplegangers. This is still true for Debian Sid and earlier Ubuntu's. stat interceptors may be safely provided for them, no problem with that.

Closes https://github.com/google/sanitizers/issues/1452.
See also https://jira.mariadb.org/browse/MDEV-24841

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D111984

[lld-macho] Temporarily disable lc-linker-option.ll on Windows

It's currently using a symlink, which is not supported on Windows.

[SelectionDAG] Bail out of mergeTruncStores when not optimizing

With unoptimized code, we may see lots of stores and spend too much time in mergeTruncStores.

Fixes PR51827.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D111596

[ARM] Fix inline assembly referencing floating point registers on soft-float targets

Fixes PR: https://bugs.llvm.org/show_bug.cgi?id=52230

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112135

OS Laboratory, Huawei Russian Research Institute, Saint-Petersburg

Revert "[ORC-RT] Configure the ORC runtime for more architectures and platforms"

Broke on aarch64-linux. Reverting while I investigate.

This reverts commit 5692ed0cce8c9506eef40ffe6ca2d9629956c51c.

[runtimes] Rename CI job from "Runtimes build" to "Bootstrapping build"

[libunwind] Revert "Use the from-scratch testing configuration by default"

This reverts commit 5a8ad80b6fa5cbad58b78384f534b78fca863e7f, which broke
the Bootstrapping build. I'm reverting until we've fixed the issue.

Differential Revision: https://reviews.llvm.org/D112082

[libc++abi] Guard include of <unistd.h> behind __has_include

This doesn't change anything on platforms that have <unistd.h>, but
it will allow this file to compile on platforms that do not.

[Tests] Add tests for non-speculatable ephemeral values

The loads in these examples are currently not considered ephemeral
because they are not speculatable.

Revert "[fir] Add Character helper"

This reverts commit 02d7089c239075a5c2e148087d2824d253fc3d5f.

[x86] add special-case lowering for usubsat for AVX512

This is a small extension of D112095 to avoid another regression
seen with D112085.
In this case, we allow the same conversion from usubsat to ALU
ops if the target supports vpternlog.

That pattern will get converted later in X86DAGToDAGISel::tryVPTERNLOG().
This seems better than putting a magic immediate constant directly in
this code to create the exact vpternlog that we need. It's possible that
there are other special-cases along these lines, so we should try to
keep all of the vpternlog magic in one place.

Differential Revision: https://reviews.llvm.org/D112138

[libc++] Fix incorrect main() signatures in the tests

Those creep up from time to time. We need to use `int main(int, char**)`
because in freestanding mode, `main` doesn't get special treatment and
special mangling, so we setup a symbol alias from the mangled version of
`main(int, char**)` to `extern "C" main`. That only works if all the tests
are consistent about how they define their main function.

[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b | c)`

  %not1 = xor i32 %b, -1
  %not2 = xor i32 %c, -1
  %and1 = and i32 %a, %not1
  %and2 = and i32 %and1, %not2
=>
  %i1 = or i32 %b, %c
  %i2 = xor i32 %1, -1
  %and2 = and i32 %i2, %a

Differential Revision: https://reviews.llvm.org/D112108

Remove include of 'type_info' from ext-int test.

Originally I thought that I needed to do a #include to trick the
compiler into letting me use typeid I believe, but Aaron explained that
it was just looking for the type_info type. I had to give it some
public/private members to make it emit the same as before, but this
ought to be a 'perfect' replacement.

Precommit updated InstCombine/and-xor-or.ll test. NFC.

[IndVars] Invalidate SCEV when IR is changed in rewriteLoopExitValue.

At the moment, rewriteLoopExitValue forgets the current phi node in the
loop that collects phis to rewrite. A few lines after the value is
forgotten, SCEV is used again to analyze incoming values and
potentially expand SCEV expression. This means that another SCEV is
created for PN, before the IR is actually updated in the next loop.

This leads to accessing invalid cached expression in combination with
D71539.

PN should only be changed once the actual incoming exit value is set in
the next loop. Moving invalidation there should ensure that PN is
invalidated in all relevant cases.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D111495

[mlir][sparse] make index type explicit in public API of support library

The current implementation used explicit index->int64_t casts for some, but
not all instances of passing values of type "index" in and from the sparse
support library. This revision makes the situation more consistent by
using new "index_t" type at all such places (which allows for less trivial
casting in the generated MLIR code). Note that the current revision still
assumes that "index" is 64-bit wide. If we want to support targets with
alternative "index" bit widths, we need to build the support library different.
But the current revision is a step forward by making this requirement explicit
and more visible.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D112122

Make dr177x.cpp test work with Windows-32 bit platfroms with 'thiscall'.

My downstream noticed that the test failed on windows-32 bit machines
since the types have __attribute__((thiscall)) on them in a few places.
This patch just adds a wildcard to handle that, since it isn't
particularly important to the test.