review.tizen.org Git - platform/upstream/llvm.git/log

AMDGPU: Stop setting attributes based on TargetOptions

Having arbitrary passes looking at the TargetOptions is pretty
messy. This was also disregarding if a function already had an
explicit attribute setting on it. opt/llc now add the attributes to
functions that don't specify the attribute. clang and lld do not call
the function to do this, which they maybe should.

This was also treating unsafe-fp-math as implying the others, and
setting the other attributes based on it. This is not done anywhere
else, and I'm not sure is correct based on the current description of
the option bit.

Effectively reverts 1d8cf2be89087a2babc1dc38b16040fad0a555e2

Revert "[Dexter] Add support for Windows to regression test suite."

This reverts commit 89025da9f676aebff7daf055824d6fd102a70c34 as
it breaks the lldb macOS bot.

Fix denormal-fp-math flag and attribute interaction

Make these behave the same way unsafe-fp-math and co. The command line
flag should add the attribute to functions that do not already have
it, and leave existing attributes. The attribute is the actual
implementation, but the flag is useful in some testing situations.

AMDGPU has a variety of tests with denormals enabled/disabled that
would require a painful level of test duplication without a flag. This
doesn't expose setting the separate input/output modes, or add a flag
for the f32 version yet.

Tests will be included in future patch.

[COFF] Don't treat DWARF sections as GC roots

DWARF sections are typically live and not COMDAT, so they would be
treated as GC roots. Enabling DWARF would essentially keep all code with
debug info alive, preventing any section GC.

Fixes PR45273

Reviewed By: mstorsjo, MaskRay

Differential Revision: https://reviews.llvm.org/D76935

[lldb/PlatformMacOSX] Re-implement GetDeveloperDirectory

GetDeveloperDirectory returns a const char* which is NULL when we cannot
find the developer directory. This crashes in
PlatformDarwinKernel::CollectKextAndKernelDirectories because we're
unconditionally assigning it to a std::string. Coincidentally I just
refactored a bunch of code in PlatformMacOSX so instead of a ad-hoc fix
I've reimplemented the method based on GetXcodeContentsDirectory.

The change is mostly NFC. Obviously it fixes the crash, but it also
removes support for finding the Xcode directory through he legacy
$XCODE_SELECT_PREFIX_DIR/usr/share/xcode-select/xcode_dir_path.

Differential revision: https://reviews.llvm.org/D76938

[MC][AArch64] Make .reloc support arbitrary relocation types

Depends on D76746. Generalizes D61973.

Differential Revision: https://reviews.llvm.org/D76754

[MC][ARM] Make .reloc support arbitrary relocation types

Generalizes D61992. In GNU as, the .reloc directive supports arbitrary relocation types.

A MCFixupKind value `V` larger than or equal to FirstLiteralRelocationKind
is used to represent the relocation type whose number is V-FirstLiteralRelocationKind.

This is useful for linker tests. Without the feature the assembler
cannot produce certain relocation records (e.g. R_ARM_ALU_PC_G0/R_ARM_LDR_PC_G0)
This helps move forward D75349 and D76575.

Differential Revision: https://reviews.llvm.org/D76746

Fix a Diag call not to assume option spelling

[lit] Avoid global imports in module declaration

A previous attempt to cleanup module imports broke installing via
pip/setup.py [1]. This should be fixed now.

[1] cf252240e8819d0c90a5e10f773078bdeba33e44

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D76940

[ELF][test] Split basic.s

[AST] Fix typo on NoInitExpr dependence computation

[LLDB] Fix handling of bit-fields when there is a base class when parsing DWARF

When parsing DWARF and laying out bit-fields we currently don't take into account whether we have a base class or not.
Currently if the first field is a bit-field but the bit offset is due a field we inherit from a base class we currently
treat it as an unnamed bit-field and therefore add an extra field.

This fix will not check if we have a base class and assume that this offset is due to members we are inheriting from the base.
We are currently seeing asserts during codegen when debugging clang::DiagnosticOptions.

This assumption will fail in the case where the first field in the derived class in an unnamed bit-field. Fixing the first field
being an unnamed bit-field looks like it will require a larger change since we will need a way to track or discover the last field offset of the bases(s).

Differential Revision: https://reviews.llvm.org/D76808

Only add `darwin_log_cmd` lit shell test feature when the log can be queried.

Summary:
Follow up fix to 445b810fbd4. The `log show` command only works for
privileged users so run a quick test of the command during lit config to
see if the command works and only add the `darwin_log_cmd` feature if
this is the case.

Unfortunately this means the `asan/TestCases/Darwin/duplicate_os_log_reports.cpp`
test and any other tests in the future that use this feature won't run
for unprivileged users which is likely the case in CI.

rdar://problem/55986279

Reviewers: kubamracek, yln, dcoughlin

Subscribers: Charusso, #sanitizers, llvm-commits

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D76899

[Dexter] Add support for Windows to regression test suite.

This patch addresses the issue of the regression suite not running on windows hardware. It changes the following things:

add new dexter regression suite command to lit.cfg.py that makes use of the clang-cl_vs2015 and dbgend builder and debuggers.
sprinkle the new regression suite command through the feature and tool tests that require them.
mark certain problem tests on windows

There's a couple of tests that fail (or pass) in unexpected ways on Windows.

Problem tests are both the penalty and perfect expect_watch_type.cpp tests. Type information reporting parity is not possible a this time in dexter due to the nature of how different debuggers report type information back to their users.

reviewers: Orlando

Differential Revision: https://reviews.llvm.org/D76609

[ORC] Introduce JITSymbolFlags::HasMaterializeSideEffectsOnly flag.

This flag can be used to mark a symbol as existing only for the purpose of
enabling materialization. Such a symbol can be looked up to trigger
materialization with the lookup returning only once materialization is
complete. Symbols with this flag will never resolve however (to avoid
permanently polluting the symbol table), and should only be looked up using
the SymbolLookupFlags::WeaklyReferencedSymbol flag. The primary use case for
this flag is initialization symbols.

[ORC] Don't create MaterializingInfo entries unnecessarily.

[OPENMP50]Add basic support for inscan reduction modifier.

Added basic support (parsing/sema checks) for the inscan modifier in the
reduction clauses.

[X86] Don't form masked instructions if the operation has an additional user.

This will cause the operation to be repeated in both a mask and another masked
or unmasked form. This can a wasted of execution resources.

Differential Revision: https://reviews.llvm.org/D60940

[AST][SVE] Treat built-in SVE types as trivial

Built-in SVE types are trivial, since they're trivially copyable
and support default construction.

Differential Revision: https://reviews.llvm.org/D76692

[AST][SVE] Treat built-in SVE types as trivially copyable

SVE types are trivially copyable: they can be copied simply
by reproducing the byte representation of the source object.

Differential Revision: https://reviews.llvm.org/D76691

[X86][SSE] Add some additional v8i16 'truncation' style shuffle tests

Export Segment.IsGapRegion to JSON

Summary:
So that external tools can make use of that information and not display such lines as uncovered.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45300

Reviewers: vsk

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D76763

[AST][SVE] Treat built-in SVE types as POD

Built-in SVE types are POD in much the same that scalars and
fixed-length vectors are.

Differential Revision: https://reviews.llvm.org/D76690

[X86] Remove orphan LowerSTRICT_FSETCC declaration. NFCI.

LowerSETCC handles strict cases as well, we don't have a separate function.

Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."

This reverts commit 6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6.
The patch breaks CUDA copmilation.

Differential Revision: https://reviews.llvm.org/D76365

[mlir] On Windows, silence warning on functions definition

This fixes a number of warnings, where a function is re-defined after it is tagged as "being imported":

D:\llvm-project\mlir\lib\ExecutionEngine\CRunnerUtils.cpp(24,17): warning: 'print_i32' redeclared without 'dllimport' attribute: 'dllexport' attribute added [-Winconsistent-dllimport]
extern "C" void print_i32(int32_t i) { fprintf(stdout, "%" PRId32, i); }
^
D:\llvm-project\mlir\include\mlir/ExecutionEngine/CRunnerUtils.h(168,42): note: previous declaration is here
extern "C" MLIR_CRUNNERUTILS_EXPORT void print_i32(int32_t i);
^

Differential Revision: https://reviews.llvm.org/D76654

[gn build] Port d60d7d69de9

[llvm-objdump][XCOFF][AIX] Implement -r option

Summary:
Implement several XCOFF hooks to get '-r' option working for llvm-objdump -r.

Reviewer: DiggerLin, hubert.reinterpretcast, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D75131

[ARM,CDE] Improve CDE intrinsics testing

Summary:
This patch:
* adds tests for vreinterpret intinsics in big-endian mode
* adds C++ runs to the CDE+MVE header compatibility test

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76927

[Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign

Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76925

[lldb-vscode] fix breakpoint result ordering

Summary:
The DAP specifies the following for the SetBreakpoints request:

  The breakpoints returned are in the same order as the elements of the 'breakpoints' arguments

This was not followed, as lldb-vscode was returning the breakpoints in a different order, because they were first stored into a map, and then traversed. Of course, maps normally don't preserve ordering.

See this log I captured:

  -->
  {"command":"setBreakpoints",
   "arguments":{
     "source":{
       "name":"main.cpp",
       "path":"/Users/wallace/fbsource/xplat/sand/test-projects/buck-cpp/main.cpp"
     },
     "lines":[6,10,11],
     "breakpoints":[{"line":6},{"line":10},{"line":11}],
     "sourceModified":false
   },
   "type":"request",
   "seq":3
  }

  <--
  {"body":{
     "breakpoints":[
       {"id":1, "line":11,"source":{"name":"main.cpp","path":"xplat/sand/test-projects/buck-cpp/main.cpp"},"verified":true},
       {"id":2,"line":6,"source":{"name":"main.cpp","path":"xplat/sand/test-projects/buck-cpp/main.cpp"},"verified":true},
       {"id":3,"line":10,"source":{"name":"main.cpp","path":"xplat/sand/test-projects/buck-cpp/main.cpp"},"verified":true}]},
     "command":"setBreakpoints",
     "request_seq":3,
     "seq":0,
     "success":true,
     "type":"response"
  }

As you can see, the order was not respected. This was causing the IDE not to be able to disable/enable breakpoints by clicking on them in the breakpoint view in the lower corner of the Debug tab.

This diff fixes the ordering problem. The traversal + querying was done very fast in O(nlogn) time. I'm keeping the same complexity.

I also updated a couple of tests to account for the ordering.

Reviewers: clayborg, aadsm, kusmour, labath

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D76891

[libc++] Use braces around %file_dependencies substitution

This one was left out from a previous commit.

[ARM][LowOverheadLoops] DoubleWidthResult instructions canGenerateZeros

Given that some instructions generate wider result elements than
their inputs, flag them as being able to generate non zeros in the
false lanes.

Differential Revision: https://reviews.llvm.org/D76766

Revert "[OPENMP50]Add basic support for inscan reduction modifier."

This reverts commit 36ed0ceec7d3b0bff9de462d4f4f544d4b5285e4 to fix a
crash in scan_messages.cpp test.

Fix build after 09158252f777c2e2f06a86b154c44abcbcf9bb74

[libc++] NFC: Simplify substitutions by using lit recursive substitutions

Since lit supports expanding substitutions recursively, we can define
substitutions in terms of other substitutions. This allows us to simplify
how libc++ substitutions are defined.

This doesn't change the substitutions at all, it only makes them simpler
to define.

[InstCombine][X86] Add repeated ops demanded elts tests for SSE intrinsics (PR24523)

[InstCombine][X86] Regenerate SSE2 tests

[OPENMP50]Add basic support for inscan reduction modifier.

Added basic support (parsing/sema checks) for the inscan modifier in the
reduction clauses.

[libc++/libc++abi] Properly delimit lit substitutions

lit is not very clever when it performs substitution on RUN lines. It
simply looks for a match anywhere in the line (without tokenization)
and replaces it by the expansion. This means that a RUN line containing
e.g. `-verify-ignore-unexpected=note` wouod be expanded to
`-verify-ignore-unexpected=<substitution for not>e`, which is
surprising and nonsensical.

It also means that something like `%compile_module` could be expanded
to `<substitution-for-%compile>_module` or to the correct substitution,
depending on the order in which substitutions are evaluated by lit.

To avoid such problems, it is a good habit to delimit custom substitutions
with some token. This commit does that for all substitutions used in the
libc++ and libc++abi test suites.

Simplify implementation of Type::isXXXType(); NFC

[ThinLTO] Allow usage of all hardware threads in the system

Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.

One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.

When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).

Differential Revision: https://reviews.llvm.org/D75153

[libc++] Remove unused lit substitutions

[mlir] Extended Dominance analysis with a function to find the nearest common dominator of two given blocks.

The Dominance analysis currently misses a utility function to find the nearest common dominator of two given blocks. This is required for a huge variety of different control-flow analyses and transformations. This commit adds this function and moves the getNode function from DominanceInfo to DominanceInfoBase, as it also works for post dominators.

Differential Revision: https://reviews.llvm.org/D75507

[ARM][MVE] Add DoubleWidthResult flag

Add a flag for those instructions which read from the top/bottom
halves of their inputs and produce a vector of results with double
width elements.

Differential Revision: https://reviews.llvm.org/D76762

[analyzer][NFC] Change LangOptions to CheckerManager in the shouldRegister* functions

Some checkers may not only depend on language options but also analyzer options.
To make this possible this patch changes the parameter of the shouldRegister*
function to CheckerManager to be able to query the analyzer options when
deciding whether the checker should be registered.

Differential Revision: https://reviews.llvm.org/D75271

[lit] NFC: Move the flaky test logic to _runShTest

This minor refactoring allows reducing the amount of processing that
is duplicated when we re-run a flaky test. It also has the nice
side effect that libc++'s current test format supports flaky .sh.cpp
tests, because those are built on top of _runShTest, not executeShTest.

[lit] Recursively expand substitutions

This allows defining substitutions in terms of other substitutions. For
example, a %build substitution could be defined in terms of a %cxx
substitution as '%cxx %s -o %t.exe' and the script would be properly
expanded.

Differential Revision: https://reviews.llvm.org/D76178

[LV] Refactor widenIntOrFpInduction. NFC.

This untangles the logic in widenIntOrFpInduction in order to make more
explicit and visible how exactly the induction variable is lowered.

Differential Revision: https://reviews.llvm.org/D76686

[Alignment] Fix overaligning bug

Summary:
This was discovered while converting to Align type.

See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76914

[analyzer][MallocChecker] Fix that kfree only takes a single argument

Exactly what it says on the tin!

https://www.kernel.org/doc/htmldocs/kernel-api/API-kfree.html

Differential Revision: https://reviews.llvm.org/D76917

Revert rGa3c715e9788d829031989b0a5ea4eb43c7288be9 "Twine - fix uninitialized variable warnings. NFCI."

@dblaikie noticed that this may interfere with msan analysis

Revert rG6ff1ea3244c543ad24fc99c7f4979db2f2078593 "Fix "use of uninitialized variable" static analyzer warning. NFCI."

@dblaikie noticed that this may interfere with msan analysis

[MLIR][LLVM] Make index type bitwidth configurable.

This change adds a new option to the StandardToLLVM lowering to configure
the bitwidth of the index type independently of the target architecture's
pointer size.

Differential revision: https://reviews.llvm.org/D76353

[SystemZ] Fix typos in comments.

[ARM] Fix MVE VCMPr f16 pattern

This patterns seemed to be using the f32 instruction, not f16. Fix it to
use the correct one.

Differential Revision: https://reviews.llvm.org/D76841

[mlir][vulkan-runner] Add support for 2D memref.

Summary:
This patch adds support for 2D memref in mlir-vulkan-runner.

Differential Revision: https://reviews.llvm.org/D76737

[llvm-readobj] - Fix a crash when DT_STRTAB is broken.

We might have a crash scenario when we have an invalid DT_STRTAB value
that is larger than the file size. I've added a test case to demonstrate.

Differential revision: https://reviews.llvm.org/D76706

[mlir] StandardToLLVM: use template aliases instead of dummy classes

Multiple operation conversions from the Standard dialect to the LLVM dialect
are trivial one-to-one conversions that use only the pattern defined in base
utility classes such as OneToOneConvertToLLVMPattern and
VectorConvertToLLVMPattern. Use template aliases ("using" declarations) instead
of creating derived classes without new functionality.

clang-format: Fix pointer alignment for overloaded operators (PR45107)

This fixes a regression from D69573 which broke the following example:

$ echo 'operator C<T>*();' | bin/clang-format --style=Chromium
operator C<T> *();

(There should be no space before the asterisk.)

It seems the problem is in TokenAnnotator::spaceRequiredBetween(),
which only looked at the token to the left of the * to see if it was a
type or not. That code only handled simple types or identifiers, not
templates or qualified types. This patch addresses that.

Differential revision: https://reviews.llvm.org/D76850

Fix TBAA for unsigned fixed-point types

Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.

Patch by: materi

Reviewers: ebevhan, leonardchan

Reviewed By: leonardchan

Subscribers: kosarev, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76856

[Alignment][NFC] Update MachineMemOperand implementation to use MaybeAlign

Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76625

[MCInstPrinter] Add parameter `Address` to printCustomAliasOperand. NFC

Follow-up of D72172 and llvmorg-11-init-6896-gb3cc5dcef0f.

[OpenMP] `omp begin/end declare variant` - part 2, sema ("+CG")

This is the second part loosely extracted from D71179 and cleaned up.

This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.

A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].

The logic is as follows:

If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
  1) Create a function declaration for the definition we were about to generate.
  2) Create a function definition but with a mangled name (according to
     `<SELECTOR>`).
  3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
     one used already for `omp declare variant`, using and the mangled
     function definition as specialization for the context defined by
     `<SELECTOR>`.

When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman

Subscribers: bollu, guansong, openmp-commits, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75779

[OpenMP] `omp begin/end declare variant` - part 1, parsing

This is the first part extracted from D71179 and cleaned up.

This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].

A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].

[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D74941

[OpenMP][NFC] Open `llvm` and `llvm::omp` namespace in OpenMPClause.cpp

[OpenMP][NFC] Outline common functionality (skipUntilPragmaOpenMPEnd)

The same code was repeated multiple times, we put it in a function now.

[MCInstPrinter] Add parameter `Address` to MCInstPrinter::printAliasInstr. NFC

Follow-up of D72172.

[X86][MC] Fix the bug for prefix padding support

Summary:
There is a tiny logic error of D75300, making branch is not
correctly aligned with option -x86-pad-max-prefix-size

Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight

Reviewed By: reames

Subscribers: hiraditya, llvm-commits, annita.zhang

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76285

[PowerPC] Fix test for PR45297 to adapt build without asserts. NFC.

[PowerPC] Enhance test for PR45297. NFC.

[MLIR][NFC] drop some unnecessary includes

Drop unnecessary includes

Differential Revision: https://reviews.llvm.org/D76898

[DAGCombine] Add basic optimizations for FREEZE in SelDag

Summary: This patch is the first effort to adding basic optimizations for FREEZE in SelDag.

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76707

Use llvm_unreachable after a fully covered/always-returning switch

Make llvm::function_ref's operator bool explicit

This can avoid all sorts of mistakes with implicit conversion
(indirectly) to int, etc. I'm quite surprise there aren't any things to
fixup with this - but I guess most uses of function_ref aren't
optional/nullable.

Fix typo, targetFeature should be lowercase.

this fixing also enable llc -mattr=+cpuhelp

Reviewers: ziangwan, kongyi

Reviewed By: kongyi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76757

[NFC] Clang format for the ELF header and ARM build attributes.

Differential Revision: https://reviews.llvm.org/D76819

Move setBugReportMsg() out from under a conditional

Fixes a build break with LLVM_ENABLE_BACKTRACES=OFF.

Differential Revision: https://reviews.llvm.org/D76893

[llvm][TextAPI/MachO] silence clang-tidy warnings, NFC

* applies only to tests

[WebAssembly] Support wasm exports with zero-length names.

Zero-length strings are valid export names in WebAssembly, so allow
users to specify them.

Differential Revision: https://reviews.llvm.org/D71793

[WebAssembly] Fix the order of destructors in the LowerGlobalDtors pass.

Fix the LowerGlobalDtors pass to run destructors in the same order as the
regular LLVM destructor lowering -- in reverse order. Adjacent
destructors with the same associated object are grouped, but destructors
are not reordered based on associated objects.

Differential Revision: https://reviews.llvm.org/D70685

Make PS4 use -fno-use-init-array only as the ABI does not support .init_array.

Reviewed by Paul Robinson

[Hexagon] Add support for Linux/Musl ABI (part 2)

A continuation of https://reviews.llvm.org/D72701. This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638

[AMDGPU] Propagate amdgpu-waves-per-eu to callees

Differential Revision: https://reviews.llvm.org/D76868

[OPENMP50]Fix the checks for the nesting of scan directives.

Fixed the check for the orhaned scan directives and improved checks for
parallel for and parallel for simd directives.

[lld][Wasm] Wasm-ld emits invalid .debug_ranges entries for non-live symbols

When the debug info contains a relocation against a dead symbol, wasm-ld
may emit spurious range-list terminator entries (entries with Start==0
and End==0). This change fixes this by emitting the WasmRelocation
Addend as End value for a non-live symbol.

Reviewed by: sbc100, dblaikie

Differential Revision: https://reviews.llvm.org/D74781

[gn build] Port 9f7d4150b9e

[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.

These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This pass adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649

[clang] Allow -DDEFAULT_SYSROOT to be a relative path

In this case we interpret the path as relative the clang driver binary.

This allows SDKs to be built that include clang along with a custom
sysroot without requiring users to specify --sysroot to point to the
directory where they installed the SDK.

See https://github.com/WebAssembly/wasi-sdk/issues/58

Differential Revision: https://reviews.llvm.org/D76653

[X86] Prefer PACKUS(AND(),AND()) to SHUFFLE(PSHUFB(),PSHUFB()) on all targets

Extends rG9d1721ce3926 to support AVX2+ targets.

[AMDGPU] Rename overloaded getMaxWavesPerEU to getWavesPerEUForWorkGroup

Summary: I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76860

[AMDGPU] Remove getMaxWavesPerCU in favour of getWavesPerWorkGroup.

Summary:
These methods were identical. I chose to remove getMaxWavesPerCU because
I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76859

[WEbAssembly] Clear frame base vreg in explicit-locals when stack pointer is dead

Having an alloca in a function causes the stack pointer to be generated in the
prolog, but if it's unused other than for debug info, explicit-locals will drop
it and not allocate a local. In this case we need to reset the FrameBaseVreg.

Differential Revision: https://reviews.llvm.org/D76784

[X86] lowerV16I8Shuffle - create v8i16 mask for PACKUS(AND(),AND()) patterns.

We can improve computeKnownBits results by avoiding excess bitcasts.

For this pattern we were doing:

(v16i8 PACKUS(v8i16 BITCAST(v16i8 AND(V1, MASK)), v8i16 BITCAST(v16i8 AND(V2, MASK))))

By performing the MASK/AND with a v8i16 type and bitcasting V1/V2 directly we can help computeKnownBits see that the mask is clearing the upper bits and allows shuffle combining to peek through later on.

This will be necessary to extend rG9d1721ce3926 to AVX2+ targets in a future patch.

Revert "[OPENMP50]Add basic support for inscan reduction modifier."

This reverts commit 8099e0fe82ce78c15bc6c4cf52caca5b6fbe28f5 to fix the
problems with the Windows-based buildbots.

[sanitizer][RISCV] Implement SignalContext::GetWriteFlag for RISC-V

This patch follows the approach also used for MIPS, where we decode the
offending instruction to determine if the fault was caused by a read or
write operation, as that seems to be the only relevant information we have
in the signal context structure to determine that.

Differential Revision: https://reviews.llvm.org/D75168

[AIX] discard the label in the csect of function description and use qualname for linkage

SUMMARY:

SUMMARY
for a source file  "test.c"

void foo() {};

llc will generate assembly code as (assembly patch)
     .globl  foo
     .globl  .foo
     .csect foo[DS]
foo:

        .long   .foo
        .long   TOC[TC0]
        .long   0

   and symbol table as (xcoff object file)
   [4]     m   0x00000004     .data     1  unamex                    foo
   [5]     a4  0x0000000c       0    0     SD       DS    0    0
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     LD       DS    0    0

   After first patch, the assembly will be as

        .globl  foo[DS]                 # -- Begin function foo
        .globl  .foo
        .align  2
        .csect foo[DS]
        .long   .foo
        .long   TOC[TC0]
        .long   0

    and symbol table will as
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     DS      DS    0    0
Change the code for the assembly path and xcoff objectfile patch for llc.

Reviewers: Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76162

[libomptarget] Add missing elf_end call in elf_common.c

Summary:
[libomptarget] Add missing elf_end call in elf_common.c
Noticed when reviewing D76843.

Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D76874

[OPENMP50]Add basic support for inscan reduction modifier.

Added basic support (parsing/sema checks) for the inscan modifier in the
reduction clauses.

[cuda][hip] Add CUDA builtin surface/texture reference support.

Summary:
- Even though the bindless surface/texture interfaces are promoted,
  there are still code using surface/texture references. For example,
  [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
  compilation issue for code using `tex2D` with texture references. For
  better compatibility, this patch proposes the support of
  surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
  `nvcc` does use builtins for texture support. From the limited NVVM
  documentation[^nvvm] and NVPTX backend texture/surface related
  tests[^test], it's believed that surface/texture references are
  supported by replacing their reference types, which are annotated with
  `device_builtin_surface_type`/`device_builtin_texture_type`, with the
  corresponding handle-like object types, `cudaSurfaceObject_t` or
  `cudaTextureObject_t`, in the device-side compilation. On the host
  side, that global handle variables are registered and will be
  established and updated later when corresponding binding/unbinding
  APIs are called[^bind]. Surface/texture references are most like
  device global variables but represented in different types on the host
  and device sides.
- In this patch, the following changes are proposed to support that
  behavior:
  + Refine `device_builtin_surface_type` and
    `device_builtin_texture_type` attributes to be applied on `Type`
    decl only to check whether a variable is of the surface/texture
    reference type.
  + Add hooks in code generation to replace that reference types with
    the correponding object types as well as all accesses to them. In
    particular, `nvvm.texsurf.handle.internal` should be used to load
    object handles from global reference variables[^texsurf] as well as
    metadata annotations.
  + Generate host-side registration with proper template argument
    parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used.  But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365