review.tizen.org Git - platform/upstream/llvm.git/log

[mlir][sparse][taco] Support reduction to scalar tensors.

The PyTACO DSL doesn't support reduction to scalars. This change
enhances the MLIR-PyTACO implementation to support reduction to scalars.

Extend an existing test to show the syntax of reduction to scalars and
two methods to retrieve the scalar values.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120572

[Symbolizer] Move default ctor into .cpp file

Follow up to 1e396affca6a0d21247d960c93a415e8f6fe0301. On some standard
library configurations these have a dependency on the complete type of
SymbolizableModule.

[Mangler] Mangle aliases to fastcall/vectorcall functions correctly

These aliases are produced by MergeFunctions and need to be mangled according to the calling convention of the function they are pointing to instead of defaulting to the C calling convention.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D120382

[Triple] Add llvm::Triple::isSPARC{,32,64}

Reviewed By: ro, MaskRay

Differential Revision: https://reviews.llvm.org/D120381

[libcxx] Fix the error checking for wctob_l, fixing locale narrow function on Windows

According to POSIX.1 (and Glibc docs, and Microsoft docs), the wctob
function returns EOF on error, not WEOF. (And wctob_l should consequently
do the same.)

The previous misconception about what this function returns on errors
seems to stem from incorrect documentation in macOS, stemming from BSD
docs with the same issue. The corresponding documentation bug in FreeBSD
was fixed in 2012 in
https://github.com/freebsd/freebsd-src/commit/945aab90991bdaeabeb6ef25112975a96c01dd4e,
but it hasn't been fixed for macOS yet.

The issue seems to only be a documentation issue; the implementation
on macOS actually does use EOF, not WEOF:
https://opensource.apple.com/source/Libc/Libc-1439.40.11/locale/FreeBSD/wctob.c.auto.html

On most Unices, EOF and WEOF are the same value, but on Windows,
EOF is -1, while WEOF is (unsigned short)0xFFFF. By fixing this,
two tests start passing on Windows.
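
A minimal sketch of the corrected check, assuming a hypothetical helper `narrow_or` (this is not the libc++ code): the `int` returned by `wctob` is compared against `EOF`, never `WEOF`.

```cpp
#include <cstdio>   // EOF
#include <cwchar>   // std::wctob, std::wint_t

// Hypothetical helper, not the libc++ implementation: narrow `wc` to a single
// byte, or return `fallback` when no single-byte representation exists.
char narrow_or(wchar_t wc, char fallback) {
    const int r = std::wctob(static_cast<std::wint_t>(wc));
    // wctob reports failure with EOF (an int), not with WEOF (a wint_t).
    // Comparing against WEOF only happens to work where both values coincide.
    return r == EOF ? fallback : static_cast<char>(r);
}
```

With the default "C" locale, `narrow_or(L'a', '?')` yields `'a'`.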

Differential Revision: https://reviews.llvm.org/D120088

[libcxx] [test] Fix the monetary locale negative_sign test for en_US.UTF-8 on Windows

On Windows, the en_US.UTF-8 locale returns `n_sign_posn == 0`, which
means that the sign for a negative currency is parentheses around
the whole value, instead of a leading minus.
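
To illustrate what the sign position means, here is a rough sketch with a hypothetical formatter `format_negative` (not part of libc++ or its tests) that models only the positions 0, 1 and 2:

```cpp
#include <string>

// Hypothetical formatter for a negative monetary value. `value` is the already
// formatted magnitude (for example "$1.00") and `sign` is the negative sign.
std::string format_negative(const std::string& value, const std::string& sign,
                            int n_sign_posn) {
    switch (n_sign_posn) {
    case 0:  return "(" + value + ")";  // parentheses surround the whole value
    case 1:  return sign + value;       // sign precedes the value
    case 2:  return value + sign;       // sign follows the value
    default: return sign + value;       // positions 3 and 4 are not modeled here
    }
}
```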

Differential Revision: https://reviews.llvm.org/D120549

[lldb] Fix check for TARGET_OS_IPHONE

Instead of checking whether TARGET_OS_IPHONE is set to 1, the current
code just checks for the existence of TARGET_OS_IPHONE, which either always
succeeds or always fails, depending on whether you have
TargetConditionals.h included.
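
The difference between the two kinds of check can be sketched with a stand-in macro. `DEMO_TARGET_OS_IPHONE` is hypothetical and only mimics a macro that is defined to 0:

```cpp
// Hypothetical stand-in for a macro that a header defines as 0 or 1.
#define DEMO_TARGET_OS_IPHONE 0

// Existence check: true as soon as the macro is defined, whatever its value.
constexpr bool existence_check() {
#ifdef DEMO_TARGET_OS_IPHONE
    return true;
#else
    return false;
#endif
}

// Value check: true only when the macro expands to a non-zero value.
constexpr bool value_check() {
#if DEMO_TARGET_OS_IPHONE
    return true;
#else
    return false;
#endif
}

static_assert(existence_check(), "the macro is defined, so #ifdef succeeds");
static_assert(!value_check(), "the macro is 0, so #if fails");
```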

[dsymutil] Copy symbol table regardless of LINKEDIT segment

Ensure we copy the symbol table for MH_PRELOAD Mach-Os, which don't have
a LINKEDIT segment, but (can) have a symbol table.

rdar://88919473

Differential revision: https://reviews.llvm.org/D120583

Validate chained fixup image formats

This is part of a series of patches to upstream support for Mach-O
chained fixups.

Differential Revision: https://reviews.llvm.org/D113725

Don't append the working directory to absolute paths

This fixes a bug that happens when using -fdebug-prefix-map to remap
an absolute path to a relative path. Since the path was absolute
before remapping, it is safe to assume that concatenating the remapped
working directory would be wrong.
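
The rule can be sketched as follows. `debug_file_path` is a hypothetical helper, not the Clang implementation, and it assumes POSIX-style paths where a leading `/` marks an absolute path:

```cpp
#include <string>

// Hypothetical helper: `original` is the path before -fdebug-prefix-map was
// applied, `remapped` the path after, and `remapped_cwd` the remapped
// working directory.
std::string debug_file_path(const std::string& original,
                            const std::string& remapped,
                            const std::string& remapped_cwd) {
    const bool was_absolute = !original.empty() && original.front() == '/';
    // A path that was absolute before remapping never gets the working
    // directory prepended, even if remapping turned it into a relative path.
    if (was_absolute || remapped_cwd.empty())
        return remapped;
    return remapped_cwd + "/" + remapped;
}
```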

Differential Revision: https://reviews.llvm.org/D113718

[clang-format] Handle trailing comment for InsertBraces

Differential Revision: https://reviews.llvm.org/D120503

Enable tests from rG8e67982384d4a11892c04d16c2d10d7533e56094 that seem to work now

I noticed randomly that the only reason these tests from
rG8e67982384d4a11892c04d16c2d10d7533e56094 seemed to still be failing is
that they are missing CHECK lines. I don't know any more than that they
don't appear to crash or assert when I ran them today.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D115377

[HIP] Fix test hip-link-bundled-archive.hip

The match pattern should also match lld.exe on Windows.

Reviewed by: Shangwu Yao

Differential Revision: https://reviews.llvm.org/D120563

[flang] Lower logical comparison and logical operations

This handles the lowering of the logical comparison
to the `arith.cmpi` operation. The logical operations `.OR.`, `.AND.`
and `.NOT.` are lowered to `arith.ori`, `arith.andi` and `arith.xori`, respectively.

This patch is part of the upstreaming effort from fir-dev branch.

Depends on D120559

Reviewed By: schweitz, rovka

Differential Revision: https://reviews.llvm.org/D120560

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[flang] Lower real comparison operations

This patch handles the lowering of real
comparison operations. The real comparison operations
are lowered to `arith.cmpf` operation.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld, schweitz

Differential Revision: https://reviews.llvm.org/D120561

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[flang] Lower integer comparison operation

This patch handles the lowering of the comparison
operator between integers.
The comparison is lowered to an `arith.cmpi` operation.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld, schweitz, rovka

Differential Revision: https://reviews.llvm.org/D120559

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[libunwind][test] remember_state_leak.pass.sh.s: link with -no-pie

The no-pic large code model style `movabsq $callback, %rsi` does not work with -pie.

[mlir][Linalg] Add support for tileFuseAndDistribute on tensors.

This extends TileAndFuse to handle distribution on tensors.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D120441

[RISCV] Remove tab character from test. Autogenerate CHECK lines. NFC

This was a test for an infinite loop so the CHECK lines don't really
matter, but they'd get generated the next time someone runs the script
on the file so might as well do it while I'm touching it.

mark getTargetTransformInfo and getTargetIRAnalysis as const

Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518

[mlir][Vector] Prevent AVX2 lowering for non-f32 transpose ops

The AVX2 lowering for transpose operations is only applicable to f32 vector types.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120427

[mlir][Vector] Generalize AVX2 transpose lowering to n-D vectors

The existing AVX2 lowering patterns for the transpose op only trigger if the
input vector is 2-D. This patch extends the patterns to trigger for n-D vectors
which are effectively 2-D vectors (e.g., vector<1x4x1x8x1). The main constraint
for the generalized AVX2 patterns to be applicable to these vectors is that the
dimensions that are greater than one must be transposed. Otherwise, the existing
patterns are not applicable.
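
One way to read the constraint, as a rough sketch: `is_effective_2d_transpose` is a hypothetical check written for illustration, not the MLIR pattern code, and it assumes that `perm[k]` names the source dimension placed at result position `k`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: the shape must have exactly two dimensions greater than
// one, and the permutation must swap their relative order.
bool is_effective_2d_transpose(const std::vector<std::int64_t>& shape,
                               const std::vector<std::size_t>& perm) {
    if (shape.size() != perm.size())
        return false;
    // Result positions of the non-unit source dimensions, in source order.
    std::vector<std::size_t> positions;
    for (std::size_t src = 0; src < shape.size(); ++src) {
        if (shape[src] == 1)
            continue;
        for (std::size_t k = 0; k < perm.size(); ++k)
            if (perm[k] == src)
                positions.push_back(k);
    }
    // Transposed means the first non-unit dimension ends up after the second.
    return positions.size() == 2 && positions[0] > positions[1];
}
```

For the shape `1x4x1x8x1`, the permutation `[0, 3, 2, 1, 4]` swaps the two non-unit dimensions and is accepted, while the identity permutation is rejected.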

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D119505

[ELF] Support some absolute/PC-relative relocation types for REL format

ctfconvert seems to use REL-format `.rel.SUNW_dof` for 32-bit architectures.
```
Binary file usr/ports/lang/perl5.32/work/perl-5.32.1/dtrace_mini.o matches
[alfredo.junior@dell-a ~/tmp/llvm-bug]$ readelf -r dtrace_mini.o

Relocation section (.rel.SUNW_dof):
r_offset r_info r_type st_value st_name
00000184 0000281a R_PPC_REL32 00000000 $dtrace1772974259.Perl_dtrace_probe_load
```

Support R_PPC_REL32 to fix `ld.lld: error: drti.c:(.SUNW_dof+0x4E4): internal linker error: cannot read addend for relocation R_PPC_REL32`.
While here, add some common relocation types for AArch64, PPC, and PPC64.
We perform minimum tests.

Reviewed By: adalava, arichardson

Differential Revision: https://reviews.llvm.org/D120535

[mlir] Support verification order (2/3)

    This change gives explicit order of verifier execution and adds
    `hasRegionVerifier` and `verifyWithRegions` to increase the granularity
    of verifier classification. The orders are as below,

    1. InternalOpTrait will be verified first, they can be run independently.
    2. `verifyInvariants` which is constructed by ODS, it verifies the type,
       attributes, etc.
    3. Other Traits/Interfaces that have marked their verifier as
       `verifyTrait` or `verifyWithRegions=0`.
    4. Custom verifier which is defined in the op and has marked
       `hasVerifier=1`

    If an operation has regions, then it may have the second phase,

    5. Traits/Interfaces that have marked their verifier as
       `verifyRegionTrait` or
       `verifyWithRegions=1`. This implies the verifier needs to access the
       operations in its regions.
    6. Custom verifier which is defined in the op and has marked
       `hasRegionVerifier=1`

    Note that the second phase will be run after the operations in the
    region are verified. Based on the verification order, you will be able to
    avoid verifying duplicate things.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116789

[OPENMP]Fix PR50347: Mapping of global scope deep object fails.

Changed the way we handle llvm::Constants in sizes arrays. ConstExprs and
GlobalValues cannot be used as initializers and need to be set at
runtime, otherwise there might be compilation errors.

Differential Revision: https://reviews.llvm.org/D105297

Revert "[lsan][test] Temporarily disable ppc64 and ppc64le to appease clang-ppc64le-rhel"

This reverts commit cb76c4d71c41bbbae47852d7980e74b57c5a28df.

The failures were in test/sanitizer_common, not in test/lsan.

[compiler-rt][test] Temporarily disable ppc64 and ppc64le test/sanitizer_common and test/crt

to appease clang-ppc64le-rhel: https://github.com/llvm/llvm-project/issues/54084

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
  Emit metadata for hidden_heap_v1 kernarg

Reviewers:
  sameerds, b-sumner

Fixes:
  SWDEV-307188

Differential Revision:
  https://reviews.llvm.org/D119027

[BOLT][DWARF] Fix how DW_AT_high_pc [DW_FORM_udata] is handled

We were not correctly handling the conversion from DW_AT_high_pc into DW_AT_ranges
when the size of DW_AT_high_pc is not 4/8 bytes.
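
For context, DW_FORM_udata values are ULEB128-encoded, so their encoded size varies with the value, and a constant-class DW_AT_high_pc is an offset from DW_AT_low_pc. A minimal decoding sketch follows; the helper names are hypothetical and this is not BOLT's code:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one ULEB128 value from the start of `bytes`. `size` receives the
// number of bytes consumed, which is the encoded size of the attribute.
std::uint64_t decode_uleb128(const std::vector<std::uint8_t>& bytes,
                             std::size_t& size) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    size = 0;
    for (std::uint8_t byte : bytes) {
        ++size;
        if (shift < 64)  // ignore payload bits that do not fit in 64 bits
            value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)  // a clear high bit marks the last byte
            break;
    }
    return value;
}

// The end of the address range is low_pc plus the decoded offset.
std::uint64_t range_end(std::uint64_t low_pc, std::uint64_t high_pc_offset) {
    return low_pc + high_pc_offset;
}
```

The three bytes `0xE5 0x8E 0x26` decode to 624485, which shows an encoded size that is neither 4 nor 8 bytes.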

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D120528

[lsan][test] Temporarily disable ppc64 and ppc64le to appease clang-ppc64le-rhel

Seems that ppc64 lsan doesn't work with default PIE (see D120305):
https://lab.llvm.org/buildbot/#/builders/57/builds/15506

[NFC] Remove unnecessary function pass managers

[AArch64] Add tests for tbl + cmp splitting.

Additional tests showing potential for follow-ups after
D120571.

Lower Fortran intrinsic to a runtime call/llvm intrinsic

This patch brings in code which can lower a Fortran intrinsic to
a runtime call or an llvm intrinsic. For math intrinsics the
runtime call is to the `math` or `pgmath` library. Non-math
intrinsics are covered by the Flang runtime. A distance computation
mechanism is introduced to find the runtime function that closely
matches the types of the intrinsic call.

In this patch, the `abs` intrinsic is lowered in the following way,
-> Integer version is lowered as a group of MLIR/FIR operations
-> Real version is lowered to llvm intrinsics
-> Complex version is lowered to the `math_hypot` runtime function

This patch is part of upstreaming from the fir-dev branch of https://github.com/flang-compiler/f18-llvm-project

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D120403

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: zacharyselk <zrselk@gmail.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>

[Sample-PGO] Emit FS discriminators only when -fdebug-info-for-profiling is set

IR level addDiscriminator pass is guarded by DebugInfoForProfiling
(set by option -fdebug-info-for-profiling).
This patch syncs the logic for the MIR and IR level implementations.

Differential Revision: https://reviews.llvm.org/D120536

[PowerPC][NFC] Split out the MMA instructions from the P10 instructions.

Currently all of the MMA instructions as well as the MMA related register info
is bundled with the Power 10 instructions. This patch just splits them out.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D120515

Avoid comparisons between types of different widths in a loop condition to prevent the loop from behaving unexpectedly

This change fixes the code violations flagged in AMD compute CodeQL scan -
Query Description: "Comparisons between types of different widths in a loop condition can cause the loop to behave unexpectedly."
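
A small sketch of the hazard, using hypothetical functions rather than the code touched by this change: an 8-bit counter compared against a 32-bit limit wraps around before it can reach any limit above 255, so the usual fix is to give the counter the same width as the limit.

```cpp
#include <cstdint>

// Same-width counter: the loop terminates for every possible limit.
std::uint32_t count_iterations(std::uint32_t limit) {
    std::uint32_t n = 0;
    for (std::uint32_t i = 0; i < limit; ++i)
        ++n;
    return n;
}

// Value an 8-bit counter holds after `steps` increments from zero. It wraps
// modulo 256, so `i < limit` would stay true forever when limit > 255.
std::uint8_t narrow_counter_after(std::uint32_t steps) {
    return static_cast<std::uint8_t>(steps);
}
```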

Differential Revision: https://reviews.llvm.org/D120355

[flang] Lower simple character return

Handles function with character return.

Character scalar results are passed as arguments in lowering so
that an assumed length character function callee can access the result
length.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld, schweitz

Differential Revision: https://reviews.llvm.org/D120558

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

[AArch64] Add scalar min/max costs. NFC

The vector costs were already added, this adds scalar variants to
complete the test coverage.

[SVE] Add missing splat patterns for bfloat vectors.

Differential Revision: https://reviews.llvm.org/D120496

[MergeICmps] Don't require GEP

With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.

[PowerPC][NFC] Add file info and license that was missing from this file.

Added the license info as well as description about how classes should be named
based on existing documentation.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120530

[Clang][Sema] Do not evaluate value-dependent immediate invocations

Value-dependent ConstantExprs are not meant to be evaluated.
There is an assert in Expr::EvaluateAsConstantExpr that ensures this condition.
But before this patch the method was called without prior check.

Fixes https://github.com/llvm/llvm-project/issues/52768

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D119375

[NFC][SVE] Refactor SelectSVEAddSubImm to match SelectSVECpyDupImm.

They're equivalent other than one is signed and the other unsigned.

[SVE] Refactor complex immediate pattern used by CPY/DUP.

SelectSVE8BitLslImm didn't account for constant values that have a
larger bit width than the result vector's element type. This only
seems to affect a single corner case when lowering fixed length
vectors but the code itself is also not consistent with how other
related complex patterns are implemented so I've taken the
opportunity to refactor the code.

Differential Revision: https://reviews.llvm.org/D120440

[libcxx] String format class marked as packed

This patch marks the class _Flags as packed because the design assumes that it
is packed and a number of tests also assume that it is packed. However on AIX
the class is not packed unless it is marked as such.

Reviewed By: hubert.reinterpretcast, #libc, Mordante, ldionne, Quuxplusone

Differential Revision: https://reviews.llvm.org/D119567

[analyzer] Don't crash if the analyzer-constraint is set to Z3, but llvm is not built with it

Exactly what it says on the tin! We had a nasty crash with the following invocation:

$ clang --analyze -Xclang -analyzer-constraints=z3 test.c
fatal error: error in backend: LLVM was not compiled with Z3 support, rebuild with -DLLVM_ENABLE_Z3_SOLVER=ON
... <stack trace> ...

Differential Revision: https://reviews.llvm.org/D120325

[mlir][sparse][taco] Use np.array_equal to compare integer values.

Fix MLIR-PyTACO and some tests to use np.array_equal to compare integer
values.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120526

[AArch64] Add test cases where zext can be lowered to series of tbl.

Add a set of tests for upcoming patches that allow lowering vector zext
using AArch64 tbl instructions instead of shifts.

[mlir][OpDSL] Split arithmetic functions.

Split arithmetic function into unary and binary functions. The revision prepares the introduction of unary and binary function attributes that work similar to type function attributes.

Depends On D120108

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120109

[clang-tidy] Fix `readability-suspicious-call-argument` crash for arguments without name-like identifier

As originally reported by @steakhal in
http://github.com/llvm/llvm-project/issues/54074, the name extraction logic of
`readability-suspicious-call-argument` crashes if the argument passed to a
function was a function call to a non-trivially named entity (e.g. an operator).

Fixed this crash case by ignoring such constructs and considering them as having
no name.

Reviewed By: aaron.ballman, steakhal

Differential Revision: http://reviews.llvm.org/D120555

[mlir][sparse][taco] Add support for scalar tensors.

This change allows the use of scalar tensors with index 0 in tensor index
expressions. In this case, the scalar value is broadcast to match the
dimensions of other tensors in the same expression.

Using scalar tensors as a destination in tensor index expressions is not
supported in the PyTACO DSL.

Add a PyTACO test to show the use of scalar tensors.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120524

[MC][WebAssembly] Fix crash when relocation addend underflows U32

For the object file writer we need to allow the underflow (ar write
zero), but for the final linker output we should probably generate an
error (I've left that as a TODO for now).

Fixes: https://github.com/llvm/llvm-project/issues/54012

Differential Revision: https://reviews.llvm.org/D120522

[WebAssembly] Convert llvm/test/MC/WebAssembly/reloc-code.ll to asm. NFC

Also increase coverage of call_indirect via explicit function table
(enabled when reference types is enabled) in
llvm/test/CodeGen/WebAssembly/call-indirect.ll (I believe this
was an oversight that it was not added in https://reviews.llvm.org/D90948)

Differential Revision: https://reviews.llvm.org/D120521

[mlir][OpDSL] Refactor function handling.

Prepare the OpDSL function handling to introduce more function classes. A follow up commit will split ArithFn into UnaryFn and BinaryFn. This revision prepares the split by adding a function kind enum to handle different function types using a single class on the various levels of the stack (for example, there is now one TensorFn and one ScalarFn).

Depends On D119718

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120108

[libcxx] Fix the type in __estimate_column_width

It seems that we are using wchar_t in __estimate_column_width and assume that
it is a 32 bit type. However, on AIX 32 the size of wchar_t is only 16 bits.

Changed wchar_t to uint32_t since the variable is being passed to a function
that uses uint32_t anyway.
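
Why the width matters can be sketched with a hypothetical helper `can_hold` (not part of libc++), using `std::uint16_t` as a stand-in for a 16-bit `wchar_t`:

```cpp
#include <cstdint>

// Hypothetical helper: true if the code point survives a round trip through
// the unsigned integer type T without losing bits.
template <class T>
constexpr bool can_hold(std::uint32_t code_point) {
    return static_cast<std::uint32_t>(static_cast<T>(code_point)) == code_point;
}

// U+1F600 lies outside the Basic Multilingual Plane and needs more than 16 bits.
static_assert(!can_hold<std::uint16_t>(0x1F600u), "truncated in a 16-bit type");
static_assert(can_hold<std::uint32_t>(0x1F600u), "fits in uint32_t");
```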

Reviewed By: hubert.reinterpretcast, daltenty, Mordante, #libc, Quuxplusone

Differential Revision: https://reviews.llvm.org/D119770

[gn build] Port 53dcd9efd16f

[clang][dataflow] Add SAT solver interface and implementation

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D120289

[IPO] AAFunctionReachabilityFunction.updateImpl - reduce AAReachability scope. NFCI.

We already have a check for !InstQueries.empty(), so move the for-range over InstQueries inside to avoid the AAReachability uninitialized variable static analysis warnings.

Use function prototypes when appropriate; NFC

This prepares the clang-tools-extra project for -Wstrict-prototypes
being enabled by default.

[MLIR][Presburger] coalesce fixups: inline comments /// -> //, i++ -> ++i (NFC)

Also use empty() instead of size() == 0.

[X86] Regenerate x86-cmov-converter.ll checks

[X86] Combine ADC(ADD(X,Y),0,Carry) -> ADC(X,Y,Carry)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120435

[sanitizer] Disable pc guard coverage test on PPC64/s390x

Reviewed By: benshi001, uweigand

Differential Revision: https://reviews.llvm.org/D120541

Revert "[lldb/test] Fix TestProgressReporting.py race issue with the event listener"

This reverts commit 3e3e79a9e4c378b59f5f393f556e6a84edcd8898.

MemorySanitizer: use-of-uninitialized-value

[MLIR][Presburger] Refactor looping strategy in coalesce

This patch refactors the looping strategy of coalesce for future patches. The new strategy works in-place and uses IneqType to organize inequalities into vectors of the same type. Future coalesce cases will pattern match on this organization. E.g. the contained case needs all inequalities and equalities to be redundant, so this case becomes checking whether the respective vectors are empty. For other cases, the patterns consider the types of all inequalities of both sets making it wasteful to only consider whether a can be coalesced with b in one step, as inequalities would need to be typed again for the opposite case. Therefore, the new strategy tries to coalesce a with b and b with a in a single step.

Reviewed By: Groverkss, arjunp

Differential Revision: https://reviews.llvm.org/D120392

[RISCV] Fix a mistake in PostprocessISelDAG

With the condition N->use_empty(), the root node of DAG always
misses peephole optimization. So a dummy node is needed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119934

[gn build] Port f9e8e92cf586

Revert "[clang][analyzer] Add modeling of 'errno'."

This reverts commit 29b512ba322cb6dd2c45d5e07645e20db47fad0d.

This broke several build bots:

https://lab.llvm.org/buildbot/#/builders/86/builds/30183
https://lab.llvm.org/buildbot/#/builders/216/builds/488

[MergeICmps] Add opaque pointer test (NFC)

We fail to merge the icmps here, because the zero-offset access
does not go through a GEP.

[LLDB] XFAIL TestUnambiguousTailCalls.py for Arm/Linux

This patch marks TestUnambiguousTailCalls.py as XFAIL on Arm/Linux.
Test started failing after 3c4ed02698afec021c6bca80740d1e58e3ee019e.

Differential Revision: https://reviews.llvm.org/D120305

[MLIR][Presburger] Use Matrix utilities for IntegerPolyhedron

This patch replaces various functions over inequalities/equalities in
IntegerPolyhedron with Matrix functions already implementing them or refactors
them to a Matrix function.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D120482

[gn build] Port 29b512ba322c

[clang][analyzer] Add modeling of 'errno'.

Add a checker to maintain the system-defined value 'errno'.
The value is supposed to be set in the future by existing or
new checkers that evaluate errno-modifying function calls.

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D120310

[SVE] Don't custom lower constant predicate ISD:SPLAT_VECTOR operations.

Differential Revision: https://reviews.llvm.org/D120340

[AArch64] Async unwind - Refactor generation of shadow call stack prologue/epilogue

This patch is in preparation for the async unwind CFI.

Move the emission of the shadow call stack prologue/epilogue
instructions to the `emitPrologue`/`emitEpilogue`. This greatly
simplifies especially epilogue generation and makes unnecessary some
quite fragile code, that tries to skip over those

Reviewed By: MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D112329

[IndVars] Use phis() (NFC)

[OpenCL] opencl-c.h: Fix incorrect get_image_width guard

`cl_khr_3d_image_writes` should not guard `read_only image3d_t`.

[LLDB] Remove XFAIL from minidebuginfo-set-and-hit-breakpoint.test

This patch removes XFAIL from minidebuginfo-set-and-hit-breakpoint.test.
Test started passing after 3c4ed02698afec021c6bca80740d1e58e3ee019e.

Differential Revision: https://reviews.llvm.org/D120305

[MLIR][Presburger] Move Presburger/ files to presburger namespace

This patch moves the Presburger library to a new `presburger` namespace.

This allows shortening some names, helps to avoid polluting the mlir namespace,
and also provides some structure.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D120505

[InstCombine] Remove SPF min/max canonicalization

Now that we canonicalize SPF min/max to intrinsics, there's no
need to canonicalize the structure of the SPF min/max itself
anymore. This is conceptually NFC, but in practice does slightly
impact results due to folding order differences.

[MLIR][Presburger][NFC] Refactor redundant code in fourierMotzkinEliminate

This patch removes redundant code from fourierMotzkinEliminate implementation
using existing functions in IntegerPolyhedron.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D120502

[flang][driver] Add support for `--target`/`--triple`

This patch adds support for:
* `--target` in the compiler driver (`flang-new`)
* `--triple` in the frontend driver (`flang-new -fc1`)
The semantics of these flags are inherited from `clangDriver`, i.e.
consistent with `clang --target` and `clang -cc1 --triple`,
respectively.

A new structure is defined, `TargetOptions`, that will hold various
Frontend options related to the target. Currently, this is mostly a
placeholder that contains the target triple. In the future, it will be
used for storing e.g. the CPU to tune for or the target features to
enable.

Additionally, the following target/triple related options are enabled
[*]: `-print-effective-triple`, `-print-target-triple`. Definitions in
Options.td are updated accordingly and, to facilitate testing,
`-emit-llvm` is added to the list of options available in `flang-new`
(previously it was only enabled in `flang-new -fc1`).

[*] These options were actually available before (like all other options
defined in `clangDriver`), but not included in `flang-new --help`.
Before this change, `flang-new` would just use `native` for defining the
target, so these options were of little value.

Differential Revision: https://reviews.llvm.org/D120246

[C++20][Modules][4/8] Handle generation of partition implementation CMIs.

Partition implementations are special, they generate a CMI, but it
does not have an 'export' line, and we cannot export anything from
it [that is, it can only make decls available to other members of the
owning module, not to importers of that module].

Add initial testcases for partition handling, derived from the examples in
Section 10 of the C++20 standard, which identifies what should be accepted
and/or rejected.

Differential Revision: https://reviews.llvm.org/D118587

[NFC][AArch64][SME] Remove '#' prefix in PSEL test cases

There is no need to prefix '#' for element index in PSEL.
This is a follow up of D111213.

Differential Revision: https://reviews.llvm.org/D120543

[SCEVExpander] Use early returns in FindValueInExprValueMap() (NFC)

[IR] Use CallBase::getParamElementType() (NFC)

As this method now exists on CallBase, use it rather than the
one on AttributeList.

Revert rG87753cebf5f861eee418d6bce155dfa0b00f9878 "[X86] combineX86ShufflesRecursively - don't both widening inputs before calling combineX86ShuffleChain"

Reverting while we investigate codegen regression reports

PlatformMacOSX should be activated for lldb built to run on an iOS etc device

In the changes Jonas made in https://reviews.llvm.org/D117340 , a
small oversight was that PlatformMacOSX (despite the name) is active
for any native Darwin operating system, where lldb and the target
process are running on the same system. This patch uses compile-time
checks to return the appropriate OSType for the OS lldb is being
compiled to, so the "host" platform will correctly be selected when
lldb & the inferior are both running on that OS. And a small change
to PlatformMacOSX::GetSupportedArchitectures which adds additional
recognized triples when running on macOS but not other native Darwin
systems.

Differential Revision: https://reviews.llvm.org/D120517
rdar://89247060

[SCEV] Return ArrayRef from getSCEVValues() (NFC)

Return a read-only view on this set. For the one internal use,
directly access ExprValueMap.

[mlir][OpDSL] Add type function attributes.

Previously, OpDSL operations used hardcoded type conversion operations (cast or cast_unsigned). Supporting signed and unsigned casts thus meant implementing two different operations. Type function attributes allow us to define a single operation that has a cast type function attribute which at operation instantiation time may be set to cast or cast_unsigned. We may, for example, define a matmul operation with a cast argument:

```
@linalg_structured_op
def matmul(A=TensorDef(T1, S.M, S.K), B=TensorDef(T2, S.K, S.N), C=TensorDef(U, S.M, S.N, output=True),
           cast=TypeFnAttrDef(default=TypeFn.cast)):
  C[D.m, D.n] += cast(U, A[D.m, D.k]) * cast(U, B[D.k, D.n])
```

When instantiating the operation the attribute may be set to the desired cast function:

```
linalg.matmul(lhs, rhs, outs=[out], cast=TypeFn.cast_unsigned)
```

The revision introduces an enum in the Linalg dialect that maps one-to-one to the type functions defined by OpDSL.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D119718

[NVPTX][AsmPrinter] Emit .attribute(.managed) for global variable declarations

Declaration and definition attributes must match,
otherwise it may cause issues on linking.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120493

[SCEV] Don't try to reuse expressions with offset

SCEVs ExprValueMap currently tracks not only which IR Values
correspond to a given SCEV expression, but additionally stores that
it may be expanded in the form X+Offset. In theory, this allows
reusing existing IR Values in more cases.

In practice, this doesn't seem to be particularly useful (the test
changes are rather underwhelming) and adds a good bit of complexity.
Per https://github.com/llvm/llvm-project/issues/53905, we have an
invalidation issue with these offseted expressions.

Differential Revision: https://reviews.llvm.org/D120311

[SystemZ] [z/OS] Add support for generating huge (1 MiB) stack frames in XPLINK64

This patch extends support for generating huge stack frames on 64-bit XPLINK by implementing the ABI-mandated call to the stack extension routine.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120450

[MLIR][Presburger] enable copy assignment operator for Simplex

This patch removes the `const` from `usingBigM` to enable the implicit copy assignment operator for Simplex.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D120542

[clang-tidy] Fix `readability-non-const-parameter` for parameter referenced by an lvalue

The checker missed a check for a case when the parameter is referenced by an lvalue and this could cause build breakages.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D117090

[RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm

vcpop and vfirst are still useful when VL=0.
vcpop is equivalent to li 0 and vfirst is equivalent to li -1,
since no mask elements are active.
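
A scalar model makes the VL=0 results easy to see. The functions below are a hypothetical software model of the two instructions over the first `vl` mask elements, not the DAG combine itself:

```cpp
#include <cstddef>
#include <vector>

// Model of vcpop: number of set mask bits among the first `vl` elements.
long vcpop(const std::vector<bool>& mask, std::size_t vl) {
    long count = 0;
    for (std::size_t i = 0; i < vl && i < mask.size(); ++i)
        if (mask[i])
            ++count;
    return count;
}

// Model of vfirst: index of the first set mask bit among the first `vl`
// elements, or -1 if none is set.
long vfirst(const std::vector<bool>& mask, std::size_t vl) {
    for (std::size_t i = 0; i < vl && i < mask.size(); ++i)
        if (mask[i])
            return static_cast<long>(i);
    return -1;
}
```

With `vl == 0` neither loop body runs, so the model returns 0 and -1 for any mask, matching the constants named above.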

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120302

[RISCV] Update computeTargetABI from llc as well as clang

Clang computes the default ABI if -mabi is empty
and encodes it in an LLVM IR module flag since D105555.
For correctness, llc needs to use the same target-abi
(Options.MCOptions.ABIName) as the ABI encoded in IR.
getSubtargetImpl already checks that they match, but only if
Options.MCOptions.ABIName is not empty.

For more robustness we could add a check for an
explicit ABI, but for now we have two different pieces of logic
to compute the default ABI.

The front-end ABI defaults to ilp32/ilp32e/lp64, and to
ilp32d/lp64d when the hardware supports the D extension.
The backend ABI defaults to ilp32/ilp32e/lp64.

Reviewed by: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D118333

[AggressiveInstCombine] Fix `TruncInstCombine` (fix f84d732f)

Erase phi nodes from `InstInfoMap` before erasing the nodes themselves.

[AggressiveInstCombine] Add `phi` nodes support to `TruncInstCombine`

Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

(recommit of fixed f84d732f, reverted by 8ad6d5e after sanitizer breakage)

Differential Revision: https://reviews.llvm.org/D109817

[CUDA][SPIRV] Assign global address space to CUDA kernel arguments

(resubmit https://reviews.llvm.org/D119207 after fixing the test for
some build settings)

This patch converts CUDA pointer kernel arguments with default address
space to CrossWorkGroup address space (__global in OpenCL). This is
because Generic or Function (OpenCL's private) is not supported as
storage class for kernel pointer types.

Differential revision: https://reviews.llvm.org/D120366