Bing1 Yu [Wed, 24 May 2023 02:15:23 +0000 (10:15 +0800)]
[LegalizeType][X86] Support WidenVecRes_AssertZext and SplitVecRes_AssertZext for ISD::AssertZext during LegalizeType procedure
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D150941
Rahman Lavaee [Wed, 24 May 2023 01:44:10 +0000 (01:44 +0000)]
[Propeller] Add HasIndirectBranch to BBEntry::Metadata.
This information helps to avoid considering cloning for blocks with indirect branches.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150611
Galen Elias [Tue, 23 May 2023 03:11:17 +0000 (20:11 -0700)]
[clang-format] Adjust braced list detection (reland 6dcde65)
This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions.
Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.
Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).
To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.
TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452.
Fixes #33891.
Fixes #52911.
Differential Revision: https://reviews.llvm.org/D150403
Owen Pan [Wed, 24 May 2023 01:33:37 +0000 (18:33 -0700)]
[clang-format] Revert 6dcde65 due to missing commit message title
This reverts commit
6dcde658b2380d7ca1451ea5d1099af3e294ea16.
Aart Bik [Tue, 23 May 2023 20:47:10 +0000 (13:47 -0700)]
[mlir][sparse][gpu] fix F32 bug for SpMV and SpMM
The alpha/beta variables, residing on the host, should have the
32-bit or 64-bit width of the result type. It was formerly always
passed as double.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D151255
Leonard Chan [Tue, 23 May 2023 23:38:29 +0000 (23:38 +0000)]
[compiler-rt] Allow 64-bit sanitizer allocator to be used if using RISCV64 and Fuchsia
This way, Fuchsia can use the 64-bit allocator settings in D151157 without changing the default behavior for others.
Differential Revision: https://reviews.llvm.org/D151159
Akira Hatanaka [Tue, 23 May 2023 23:32:19 +0000 (16:32 -0700)]
[CodeGen] Fix the type of the constant that is used to zero-initialize a
flexible array member
A zero-element array type was incorrectly being used when an incomplete
array was being initialized with a non-empty initializer.
This fixes an assertion failure in AddInitializerToStaticVarDecl. See
the discussion here: https://reviews.llvm.org/D123649#4362210
Differential Revision: https://reviews.llvm.org/D151172
Craig Topper [Tue, 23 May 2023 23:31:22 +0000 (16:31 -0700)]
[RISCV] Expand rotate by non-constant for XTHeadBb during lowering.
Avoids multi instruction isel patterns and enables mask optimizations
on shift amount.
Reviewed By: philipp.tomsich
Differential Revision: https://reviews.llvm.org/D151263
Med Ismail Bennani [Tue, 23 May 2023 23:01:39 +0000 (16:01 -0700)]
Revert "[lldb] Move PassthroughScriptedProcess to `lldb.scripted_process` module"
This reverts commit
273a2d337f675f3ee050f281b1fecc3e806b9a3c, since it
might be the cause for `TestStackCoreScriptedProcess` and
`TestInteractiveScriptedProcess` failures on GreenDragon:
https://green.lab.llvm.org/green/job/lldb-cmake/55460/
Peter Klausler [Mon, 22 May 2023 18:56:14 +0000 (11:56 -0700)]
[flang][runtime] Complete partial output records when positioning/closing after non-advancing output
Before positioning or closing a unit after a non-advancing output statement
has left a partial record in its buffer, complete the record by calling
AdvanceRecord(). Fixes https://github.com/llvm/llvm-project/issues/59761.
Differential Revision: https://reviews.llvm.org/D151134
Peiming Liu [Sat, 20 May 2023 00:55:44 +0000 (00:55 +0000)]
[mlir][sparse] extend unpack operation to unpack arbitrary encodings.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D151174
Shubham Sandeep Rastogi [Fri, 5 May 2023 00:48:19 +0000 (17:48 -0700)]
Add support for salvaging debug info from icmp instructions.
salvageDebugInfo is a function that allows us to retain debug info for
instructions that have been optimized out. Currently, it doesn't support
salvaging the debug information from icmp instructions, but DWARF
expressions can emulate an icmp by using the DWARF conditional
expressions. This patch adds support for salvaging debug information
from icmp instructions.
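To illustrate how a DWARF expression can emulate an icmp, here is a minimal sketch of a stack evaluator for a few DWARF relational opcodes (DW_OP_eq, DW_OP_ne, DW_OP_lt, DW_OP_gt, DW_OP_le, DW_OP_ge), which pop two entries, compare them as signed values, and push 1 or 0. The tuple encoding, the `evaluate` helper and the `ICMP_SLT_10` example are assumptions made for this sketch; they are not the expressions the patch emits.

```python
# Toy evaluator for a tiny subset of DWARF expression opcodes.
# The tuple-based expression encoding is hypothetical, chosen for this sketch.

BITS = 64


def to_signed(value, bits=BITS):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# DWARF relational operators compare the former second entry to the former top.
RELATIONAL = {
    "DW_OP_eq": lambda a, b: a == b,
    "DW_OP_ne": lambda a, b: a != b,
    "DW_OP_lt": lambda a, b: a < b,
    "DW_OP_gt": lambda a, b: a > b,
    "DW_OP_le": lambda a, b: a <= b,
    "DW_OP_ge": lambda a, b: a >= b,
}


def evaluate(expr, initial_stack=()):
    """Run expr (a list of (opcode, *operands) tuples) and return the top of stack."""
    stack = list(initial_stack)
    for op, *args in expr:
        if op == "DW_OP_constu":
            stack.append(args[0])
        elif op in RELATIONAL:
            rhs = to_signed(stack.pop())
            lhs = to_signed(stack.pop())
            stack.append(1 if RELATIONAL[op](lhs, rhs) else 0)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack[-1]


# Emulates `icmp slt i64 %x, 10`, assuming %x is already on the stack.
ICMP_SLT_10 = [("DW_OP_constu", 10), ("DW_OP_lt",)]
```

With the value of `%x` as the initial stack entry, `evaluate(ICMP_SLT_10, [3])` yields 1 and `evaluate(ICMP_SLT_10, [10])` yields 0.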
Differential Revision: https://reviews.llvm.org/D150216
Aaron Siddhartha Mondal [Tue, 23 May 2023 22:24:05 +0000 (00:24 +0200)]
[bazel] Add clang-offload-packager and clang-linker-wrapper
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D151131
Alex Langford [Tue, 23 May 2023 17:35:32 +0000 (10:35 -0700)]
[lldb][NFCI] Remove unused member from ObjectFileMachO
From what I can see, `m_mach_segments` is completely unused. Let's
remove it.
Differential Revision: https://reviews.llvm.org/D151236
Valentin Clement [Tue, 23 May 2023 22:08:27 +0000 (15:08 -0700)]
[flang][openacc][NFC] Add API to create acc.private.recipe from FIR type
Simply make the creation of acc.private.recipe accessible through an API.
This will be useful when we implement passes like the implicit
privatization.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D151230
Valentin Clement [Tue, 23 May 2023 22:07:19 +0000 (15:07 -0700)]
[flang] Do not omit fir.ref in getTypeAsString
Do not omit fir.ref when creating the string representation so we can
have different representations for `!fir.ref<i32>` and `i32`.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D151261
Nick Desaulniers [Tue, 23 May 2023 21:41:43 +0000 (14:41 -0700)]
Reland: [clang][ExprConstant] fix __builtin_object_size for flexible array members
As reported by @kees, GCC treats __builtin_object_size of structures
containing flexible array members (aka arrays with incomplete type) not
just as the sizeof the underlying type, but additionally the size of the
members in a designated initializer list.
Fixes: https://github.com/llvm/llvm-project/issues/62789
Reviewed By: erichkeane, efriedma
Differential Revision: https://reviews.llvm.org/D150892
Valentin Clement [Tue, 23 May 2023 21:14:59 +0000 (14:14 -0700)]
[flang] Add IndexType support in getTypeAsString
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D151250
Diego Caballero [Tue, 23 May 2023 20:51:09 +0000 (20:51 +0000)]
[mlir][Vector] Extend xfer drop unit dim patterns
This patch extends the transfer drop unit dim patterns to support cases where the vector shape should also be reduced
(e.g., transfer_read(memref<1x4x1xf32>, vector<1x4x1xf32>) -> transfer_read(memref<4xf32>, vector<4xf32>)).
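The shape reduction in the example above can be sketched as follows. This is only an illustration of the shape arithmetic; `drop_unit_dims` is a hypothetical helper, and the real patterns have additional legality conditions that are not modeled here.

```python
def drop_unit_dims(shape):
    """Return shape with every size-1 dimension removed.

    A shape made only of unit dimensions reduces to the empty (0-d) shape.
    """
    return [dim for dim in shape if dim != 1]


# memref<1x4x1xf32> / vector<1x4x1xf32> -> memref<4xf32> / vector<4xf32>
reduced = drop_unit_dims([1, 4, 1])
```

Applied to the same shape on both the memref and the vector, this gives `[4]` for the example in the commit message.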
Reviewed By: hanchung, pzread
Differential Revision: https://reviews.llvm.org/D151007
Scott Linder [Tue, 23 May 2023 20:31:18 +0000 (20:31 +0000)]
[llvm-debuginfo-analyzer] Support both Reference and Type attrs in single DIE
Relax the assumption that at most one Reference-or-Type-like attribute is
present on a DWARF DIE.
Add support for at most one Type attribute (i.e. DW_AT_import xor
DW_AT_type) and separately at most one Reference attribute (i.e.
DW_AT_specification xor DW_AT_abstract_origin xor ...).
Update comment describing old assumption and tag it as a "FIXME" to
reflect the fact that this is perhaps still not general enough.
Add a test based on the case which led me to encounter the bug in the
wild.
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D150713
Keith Smiley [Mon, 22 May 2023 20:28:42 +0000 (13:28 -0700)]
[lld-macho] Add support for .so file discovery
While not the recommended extension on macOS, .so is supported by ld64.
This mirrors that behavior.
Related report: https://github.com/bazelbuild/bazel/issues/18464
Differential Revision: https://reviews.llvm.org/D151147
Leonard Chan [Tue, 23 May 2023 20:40:22 +0000 (20:40 +0000)]
[lld][ELF] Do not emit warning for NOLOAD output sections
Much of NOLOAD's intended use is to explicitly change the type of an
output section, so we shouldn't flag these as warnings.
Differential Revision: https://reviews.llvm.org/D151144
Nikolas Klauser [Tue, 23 May 2023 15:45:52 +0000 (08:45 -0700)]
[libc++] Apply _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION only in classes that we have instantiated externally
To make sure all member functions that require it are marked `_LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION` I compared the output of `objdump --syms lib/libc++.1.0.dylib` before and after, ignoring addresses.
Reviewed By: #libc, ldionne
Spies: Mordante, libcxx-commits, ldionne, arichardson, mstorsjo
Differential Revision: https://reviews.llvm.org/D150896
Craig Topper [Tue, 23 May 2023 20:32:18 +0000 (13:32 -0700)]
[TableGen] Filter duplicate predicates in PatternToMatch::getPredicateRecords.
Recent changes to RISC-V cause the same predicate to appear in the
predicate list multiple times in some cases. This patch filters the
duplicates to reduce the number of predicate string variations.
Kyle Huey [Tue, 23 May 2023 17:44:25 +0000 (17:44 +0000)]
[X86] Use the CFA when appropriate for better variable locations around calls.
Without frame pointers, the locations of variables on the stack are emitted
relative to the stack pointer (via the stack pointer being the value of
DW_AT_frame_base on the subprogram). If a call modifies the stack pointer
this results in the locations being wrong and the debugger displaying the
wrong values for variables.
By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer
(provided LLVM is emitting the correct CFI directives elsewhere, of course).
The CFA needs to be adjusted for the size of the stack frame (including the
return address) to allow the variable locations themselves to remain
unchanged by this patch.
Certain LLDB features cannot cope with DW_OP_call_frame_cfa, so this change
is heuristically limited to the cases where it's necessary for correctness
to minimize the fallout there.
Reviewed By: #debug-info, scott.linder, jryans, jmorse
Differential Revision: https://reviews.llvm.org/D143463
Congcong Cai [Tue, 23 May 2023 20:14:10 +0000 (22:14 +0200)]
[Sema] `setInvalidDecl` for error deduction declaration
Fixed: https://github.com/llvm/llvm-project/issues/62408
`setInvalidDecl` for invalid `CXXDeductionGuideDecl` to
avoid crashes during semantic analysis.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D149516
Eugene Burmako [Tue, 23 May 2023 19:34:12 +0000 (12:34 -0700)]
[MLIR] Update Bazel build to remove references to PybindUtils.cpp
This file has been removed in https://reviews.llvm.org/D151167.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D151242
Florian Hahn [Tue, 23 May 2023 19:36:15 +0000 (20:36 +0100)]
[VPlan] Print IR flags for VPRecipeWithIRFlags.
Now that IR flags are modeled as part of VPRecipeWithIRFlags, include
the flags when printing recipes.
Depends on D150027.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D150029
Piotr Zegar [Tue, 23 May 2023 18:45:22 +0000 (18:45 +0000)]
[clang-tidy] Ignore implicit code in bugprone-branch-clone
Implicit code, such as template instantiations and compiler-generated
code, is now excluded from this check by using
TK_IgnoreUnlessSpelledInSource.
Fixes #62693
Reviewed By: donat.nagy
Differential Revision: https://reviews.llvm.org/D151133
Piotr Zegar [Tue, 23 May 2023 18:44:55 +0000 (18:44 +0000)]
[clang-tidy] Improve bugprone-branch-clone with support for fallthrough attribute
Ignore duplicated switch cases with [[fallthrough]] attribute to reduce false positives.
Fixes: #47588
Reviewed By: donat.nagy
Differential Revision: https://reviews.llvm.org/D147889
Craig Topper [Tue, 23 May 2023 19:14:07 +0000 (12:14 -0700)]
[RISCV] Add scalable vector cast cost model tests. NFC
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D151132
Joshua Cranmer [Tue, 23 May 2023 19:00:19 +0000 (15:00 -0400)]
[CodeGen] Fix crash in CodeGenPrepare::optimizeGatherScatterInst.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151141
Noah Goldstein [Tue, 23 May 2023 16:18:01 +0000 (11:18 -0500)]
[ValueTracking] Use `select` condition to help determine if `select` is non-zero
In `select c, x, y` the condition `c` dominates the resulting `x` or
`y` chosen by the `select`. This adds logic to `isKnownNonZero` to try
and use the `icmp` for the `c` condition to see if it implies the
select `x` or `y` are known non-zero.
For example in:
```
%c = icmp ugt i8 %x, %C
%r = select i1 %c, i8 %x, i8 %y
```
The true arm of select `%x` is non-zero (when "returned" by the
`select`) because `%c` being true implies `%x` is non-zero.
Alive2 Links (with `x {pred} C`):
- EQ iff `C != 0`:
- https://alive2.llvm.org/ce/z/umLabn
- NE iff `C == 0`:
- https://alive2.llvm.org/ce/z/DQvy8Y
- UGT [always]:
- https://alive2.llvm.org/ce/z/HBkjgQ
- UGE iff `C != 0`:
- https://alive2.llvm.org/ce/z/LDNifB
- SGT iff `C s>= 0`:
- https://alive2.llvm.org/ce/z/QzWDj3
- SGE iff `C s> 0`:
- https://alive2.llvm.org/ce/z/rR4g3D
- SLT iff `C s<= 0`:
- https://alive2.llvm.org/ce/z/uysayx
- SLE iff `C s< 0`:
- https://alive2.llvm.org/ce/z/2jYc7e
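As a cross-check that is independent of the Alive2 proofs, the table above can be verified by brute force on i8: for each predicate and each `C`, `x {pred} C` implies `x != 0` exactly when the listed condition on `C` holds. This is a sketch; `TABLE`, `predicate_implies_nonzero` and `check_table` are names invented here and do not come from the patch.

```python
BITS = 8


def signed(value):
    """Interpret an unsigned BITS-wide pattern as a two's complement value."""
    return value - (1 << BITS) if value >= 1 << (BITS - 1) else value


# name -> (predicate on x and C, condition on C claimed in the list above)
TABLE = {
    "eq": (lambda x, c: x == c, lambda c: c != 0),
    "ne": (lambda x, c: x != c, lambda c: c == 0),
    "ugt": (lambda x, c: x > c, lambda c: True),
    "uge": (lambda x, c: x >= c, lambda c: c != 0),
    "sgt": (lambda x, c: signed(x) > signed(c), lambda c: signed(c) >= 0),
    "sge": (lambda x, c: signed(x) >= signed(c), lambda c: signed(c) > 0),
    "slt": (lambda x, c: signed(x) < signed(c), lambda c: signed(c) <= 0),
    "sle": (lambda x, c: signed(x) <= signed(c), lambda c: signed(c) < 0),
}


def predicate_implies_nonzero(pred, c):
    """True if every x satisfying pred(x, c) is non-zero."""
    return all(x != 0 for x in range(1 << BITS) if pred(x, c))


def check_table():
    """Map each predicate name to whether its claimed condition is exact."""
    return {
        name: all(
            predicate_implies_nonzero(pred, c) == cond(c)
            for c in range(1 << BITS)
        )
        for name, (pred, cond) in TABLE.items()
    }
```

Every entry of `check_table()` should come out true; an entry would be false if a listed condition were either too strong or too weak for i8.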
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D147900
Noah Goldstein [Sun, 30 Apr 2023 15:44:35 +0000 (10:44 -0500)]
[ValueTracking] Add tests for using condition in select for non-zero analysis; NFC
Differential Revision: https://reviews.llvm.org/D147899
Noah Goldstein [Tue, 23 May 2023 16:11:23 +0000 (11:11 -0500)]
[ValueTracking] Use `KnownBits` functions for `computeKnownBits` of saturating add/sub functions
The knownbits implementation covers all the cases previously handled
by `uadd.sat`/`usub.sat` as well as some additional ones. We previously
were not handling the `ssub.sat`/`sadd.sat` cases at all.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D150103
Noah Goldstein [Tue, 23 May 2023 16:13:13 +0000 (11:13 -0500)]
[KnownBits] Add implementations for saturating add/sub functions
These were previously missing. Even in the case where overflow is
indeterminate we can still deduce some of the low/high bits.
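One fact of this kind, shown here as a sketch rather than as the patch's actual algorithm: `uadd.sat(a, b)` is never smaller than either operand, so any leading one bits known in an operand also appear in the result, whether or not the addition saturates. The helper names below are made up for the illustration, which checks the claim exhaustively on 8-bit values.

```python
BITS = 8
MAX = (1 << BITS) - 1


def uadd_sat(a, b):
    """Unsigned saturating add on BITS-wide values."""
    return min(a + b, MAX)


def leading_ones(value):
    """Count consecutive one bits starting from the most significant bit."""
    count = 0
    for bit in range(BITS - 1, -1, -1):
        if not (value >> bit) & 1:
            break
        count += 1
    return count


def leading_ones_survive():
    """Check the claim for every pair of 8-bit operands."""
    return all(
        leading_ones(uadd_sat(a, b)) >= max(leading_ones(a), leading_ones(b))
        for a in range(MAX + 1)
        for b in range(MAX + 1)
    )
```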
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D150102
Noah Goldstein [Tue, 23 May 2023 16:11:14 +0000 (11:11 -0500)]
[ValueTracking] Add tests for knownbits of saturating add/sub functions; NFC
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D150101
Noah Goldstein [Tue, 23 May 2023 16:11:09 +0000 (11:11 -0500)]
[KnownBits] Improve implementation of `KnownBits::abs`
`abs` preserves the lowest set bit, so if we know the lowest set bit,
set it in the output.
As well, implement the case where the operand is known negative.
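The lowest-set-bit claim can be checked exhaustively on 8-bit values. This is a minimal sketch with hypothetical helper names; it models `abs` as wrapping two's complement negation for negative inputs, so the minimum signed value maps to itself.

```python
BITS = 8
MASK = (1 << BITS) - 1


def abs_wrapping(pattern):
    """abs on a BITS-wide two's complement bit pattern (INT_MIN maps to itself)."""
    is_negative = pattern >> (BITS - 1)
    return (-pattern) & MASK if is_negative else pattern


def lowest_set_bit(pattern):
    """Isolate the lowest set bit of a non-negative integer (0 for 0)."""
    return pattern & -pattern


def abs_preserves_lowest_set_bit():
    """Check every 8-bit pattern."""
    return all(
        lowest_set_bit(abs_wrapping(x)) == lowest_set_bit(x)
        for x in range(1 << BITS)
    )
```

The property holds because negation leaves all bits up to and including the lowest set bit unchanged and inverts only the bits above it.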
Reviewed By: foad, RKSimon
Differential Revision: https://reviews.llvm.org/D150100
Manna, Soumi [Tue, 23 May 2023 18:41:28 +0000 (11:41 -0700)]
[NFC][CLANG] Fix issue with dereference null return value found by Coverity static analyzer tool
Reported by Coverity static analyzer tool:
Inside "ItaniumCXXABI.cpp" file, in <unnamed>::ItaniumCXXABI::EmitLoadOfMemberFunctionPointer(clang::CodeGen::CodeGenFunction &, clang::Expr const *, clang::CodeGen::Address, llvm::Value *&, llvm::Value *, clang::MemberPointerType const *): Return value of function which returns null is dereferenced without checking.
//returned_null: getAs returns nullptr (checked 130 out of 156 times).
//var_assigned: Assigning: FPT = nullptr return value from getAs.
const FunctionProtoType *FPT =
MPT->getPointeeType()->getAs<FunctionProtoType>();
auto *RD =
cast<CXXRecordDecl>(MPT->getClass()->castAs<RecordType>()->getDecl());
// Dereference null return value (NULL_RETURNS)
//dereference: Dereferencing a pointer that might be nullptr FPT when calling arrangeCXXMethodType.
llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(
CGM.getTypes().arrangeCXXMethodType(RD, FPT, /*FD=*/nullptr));
This patch uses castAs instead of getAs which will assert if the type doesn't match.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151054
LLVM GN Syncbot [Tue, 23 May 2023 18:40:17 +0000 (18:40 +0000)]
[gn build] Port
f237513cda8e
walter erquinigo [Wed, 10 May 2023 20:41:07 +0000 (15:41 -0500)]
[LLDB] Add some declarations related to REPL support for mojo
This simple diff declares some enum values needed to create a REPL for the mojo language.
Differential Revision: https://reviews.llvm.org/D150303
Nitin John Raj [Tue, 23 May 2023 17:47:53 +0000 (10:47 -0700)]
[RISCV][GlobalISel] Add lowerReturn for calling conv
Add minimal support to lower return, and introduce an OutgoingValueHandler and an OutgoingValueAssigner for returns.
Supports return values with integer, pointer and aggregate types.
(Update of D69808 - avoiding commandeering that revision)
Co-authored By: lewis-revill
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D117318
Thurston Dang [Fri, 19 May 2023 23:19:44 +0000 (23:19 +0000)]
hwasan: lay groundwork for importing subset of sanitizer_common interceptors [NFC]
This patch does the bare minimum to import sanitizer_common_interceptors, but
without actually enabling any interceptors or meaningfully defining the
COMMON_INTERCEPT macros.
This will allow selectively enabling sanitizer_common interceptors (if the
appropriate macros are defined), as suggested by Vitaly in D149701.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D150708
Tue Ly [Tue, 23 May 2023 17:34:12 +0000 (13:34 -0400)]
[libc] Change UInt integer conversion operators to use standard types.
This fixes an issue with missing `unsigned long` conversion on macOS.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D151234
Vitaly Buka [Tue, 23 May 2023 17:42:39 +0000 (10:42 -0700)]
Revert "[HWASan] unflake test"
https://reviews.llvm.org/D150742 is the fix.
This reverts commit
edd0981e71af87a686365d40e6410a8a377c153d.
Vlad Serebrennikov [Tue, 23 May 2023 18:03:10 +0000 (21:03 +0300)]
Reland "[clang] Add tests for CWG issues 977, 1482, 2516"
Now with support for MSVC-specific triples.
CWG977 focuses on point of /completeness/ of enums. Wording provided in CWG1482.
CWG1482 and CWG2516 focus on locus (point) of /declaration/. Wording provided in CWG2516.
Reviewed By: clang-language-wg, shafik
Differential Revision: https://reviews.llvm.org/D151042
Rahul Kayaith [Tue, 23 May 2023 17:40:00 +0000 (13:40 -0400)]
[mlir][python] Bump min pybind11 version to 2.9.0
2.9.0 was released on December 28, 2021, and some following changes
require at least this version.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D150247
Aaron Ballman [Tue, 23 May 2023 17:38:35 +0000 (13:38 -0400)]
Add a reminder to update docs when updating default; NFC
Markus Böck [Tue, 23 May 2023 17:11:43 +0000 (19:11 +0200)]
[llvm][ADT] Fix invalid `reference` type of depth-first, breadth-first and post order iterators
C++'s iterator concept requires operator* to return the same type as is specified by the iterator's reference type. This functionality is especially important for older generic code that did not yet make use of auto.
An example from within LLVM is iterator_adaptor_base which uses the reference type of the iterator it is wrapping as its return type for operator* (this class is used as base for a lot of other functionality like filter iterators and so on).
Using any of the graph traversal iterators listed above with it would previously fail to compile due to reference being non-const while operator* returned a const reference.
This patch fixes that by correctly specifying reference and using it as the return type of operator* explicitly to prevent further issues in the future.
Differential Revision: https://reviews.llvm.org/D151198
Azat Khuzhin [Tue, 23 May 2023 17:28:30 +0000 (19:28 +0200)]
[libcxx][tests] Introduce 32-bit feature and use it for stringstream gcount test
This will avoid hardcoding all unsupported targets, since even after one
more follow up fix [1], there is one more failure.
[1]: https://reviews.llvm.org/D150886
Plus, if you want to run it locally on some target that CI does not
cover, it could also fail with a false positive, which is not good.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D151046
Alex Langford [Sat, 20 May 2023 00:51:08 +0000 (17:51 -0700)]
[lldb][NFCI] Merge implementations of ObjectFileMachO::GetMinimumOSVersion and ObjectFileMachO::GetSDKVersion
These functions do the exact same thing (even if they look slightly
different). I yanked the common implementation, cleaned it up, and
shoved it into its own function.
Differential Revision: https://reviews.llvm.org/D151120
Diego Caballero [Mon, 22 May 2023 22:59:46 +0000 (22:59 +0000)]
[mlir][Vector] Add 0-d vector support to `vector.shape_cast`
This patch adds support to shape cast a vector<1x1x1...1xElementType> to
a vector<ElementType> and the other way around.
Differential Revision: https://reviews.llvm.org/D151169
Joseph Huber [Tue, 23 May 2023 17:19:56 +0000 (12:19 -0500)]
[libc][obvious] Correctly hoist mask out of the loop
Summary:
This was accidentally dropped from a previous patch following a rebase.
Fix it to where it's consistent.
Differential Revision: https://reviews.llvm.org/D151232
Aaron Ballman [Tue, 23 May 2023 17:11:19 +0000 (13:11 -0400)]
Correct stale documentation for default MSVC version numbers
We documented -fmsc-version as defaulting to 1300 and
-fms-compatibility-version as defaulting to 1800, neither of which
were accurate. We currently default to 1920.
See MSVCToolChain::computeMSVCVersion() for details.
Jin Xin Ng [Mon, 22 May 2023 19:20:55 +0000 (19:20 +0000)]
[hwasan] Move RunFreeHooks call
Ensures a subsequent call (via an external caller) to
__sanitizer_get_allocated_size via hooks will return a valid size.
This allows a faster version of __sanitizer_get_allocated_size
to be implemented, which can skip checks.
Test to ensure RunFreeHooks' call order will come with
__sanitizer_get_allocated_size_fast
Differential Revision: https://reviews.llvm.org/D151151
Mark de Wever [Sun, 7 May 2023 17:50:41 +0000 (19:50 +0200)]
[libc++][doc] Updates the tasks to do for a release.
This is a followup of the review comments in D144499.
Reviewed By: ldionne, philnik, #libc
Differential Revision: https://reviews.llvm.org/D150585
Mark de Wever [Wed, 17 May 2023 15:38:13 +0000 (17:38 +0200)]
[NFC][libc++][format] Uses stringstream::view.
This member has been added in D148641 so it can be used in the formatter
to avoid creating a "temporary" string.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D150791
Mark de Wever [Tue, 28 Feb 2023 19:29:26 +0000 (20:29 +0100)]
[libc++][modules] Adds std module cppm files.
This adds the cppm files of D144994. These files by themselves will do
nothing. The goal is to reduce the size of D144994 and making it easier
to review the real changes of the patch.
Implements parts of
- P2465R3 Standard Library Modules std and std.compat
Reviewed By: ldionne, ChuanqiXu, aaronmondal, #libc
Differential Revision: https://reviews.llvm.org/D151030
Fangrui Song [Tue, 23 May 2023 16:49:57 +0000 (09:49 -0700)]
[IR] Make stack protector symbol dso_local according to -f[no-]direct-access-external-data
There are two motivations.
`-fno-pic -fstack-protector -mstack-protector-guard=global` created
`__stack_chk_guard` is referenced directly on all ELF OSes except FreeBSD.
This patch allows referencing the symbol indirectly with
-fno-direct-access-external-data.
Some Linux kernel folks want
`-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard`
created `__stack_chk_guard` to be referenced directly, avoiding
R_X86_64_REX_GOTPCRELX (even if the relocation may be optimized out by the linker).
https://github.com/llvm/llvm-project/issues/60116
Why they need this isn't so clear to me.
---
Add module flag "direct-access-external-data" and set the dso_local property of
the stack protector symbol. The module flag can benefit other LLVMCodeGen
synthesized symbols that are not represented in LLVM IR.
Nowadays, with `-fno-pic` being uncommon, ideally we should set
"direct-access-external-data" when it is true. However, doing so would require
~90 clang/test tests to be updated, which is too many.
As a compromise, we set "direct-access-external-data" only when it's different
from the implied default value.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D150841
Mark de Wever [Wed, 17 May 2023 15:54:53 +0000 (17:54 +0200)]
[libc++] Updates C++2b to C++23.
During the ISO C++ Committee meeting plenary session the C++23 Standard
has been voted as technically complete.
This updates the reference to c++2b to c++23 and updates the __cplusplus
macro.
Note that since we use clang-tidy 16, a small workaround is needed. Clang
knows -std=c++23 but clang-tidy does not, so for now force the lit compiler
flag to use -std=c++2b instead of -std=c++23.
Reviewed By: #libc, philnik, jloser, ldionne
Differential Revision: https://reviews.llvm.org/D150795
Jin Xin Ng [Mon, 22 May 2023 21:13:46 +0000 (21:13 +0000)]
[lsan] Invoke hooks on realloc
Previously lsan would not invoke hooks on reallocations.
An accompanying regression test is included in sanitizer_common.
This change also moves hook calls to a location where subsequent
calls (via an external caller) to __sanitizer_get_allocated_size
via hooks will return a valid size.
This allows a faster version of __sanitizer_get_allocated_size
to be implemented, which can skip checks.
Test to ensure RunFreeHooks' call order will come with
__sanitizer_get_allocated_size_fast
Differential Revision: https://reviews.llvm.org/D151175
Slava Zakharin [Tue, 23 May 2023 16:10:26 +0000 (09:10 -0700)]
[flang] Fixed managing copy-in/copy-out temps.
There are several observations regarding the copy-in/copy-out:
* Actual argument associated with INTENT(OUT) dummy argument that
requires finalization (7.5.6.3 p. 7) may be read by the finalization
function, so a copy-in is required.
* A temporary created for the copy-in/copy-out must be destroyed
without finalization after the call (or after the corresponding copy-out),
otherwise, memory leaks may occur.
* The copy-out assignment must not perform finalization for the LHS.
* The copy-out assignment from the temporary to the actual argument
may or may not need to initialize the LHS.
This change-set introduces new runtime methods: CopyOutAssign and
DestroyWithoutFinalization. They are called by the compiler generated
code to match the behavior described above.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D151135
max [Mon, 22 May 2023 22:30:12 +0000 (17:30 -0500)]
[MLIR][python bindings] use pybind C++ APIs for throwing python errors.
Differential Revision: https://reviews.llvm.org/D151167
Pavel Iliin [Thu, 18 May 2023 11:02:04 +0000 (12:02 +0100)]
[AArch64][FMV] Prevent target attribute using for multiversioning.
On AArch64, the target_version/target_clones attributes should be used
for function multiversioning. This patch fixes a defect that allowed the
target attribute to cause multiversioning.
Differential Revision: https://reviews.llvm.org/D150867
Craig Topper [Tue, 23 May 2023 16:19:37 +0000 (09:19 -0700)]
[LegalizeTypes][ARM][AArch64][RISCV][VE][WebAssembly] Add special case for smin(X, -1) and smax(X, 0) to ExpandIntRes_MINMAX.
We can compute a simpler expression for Lo for these cases. This
is an alternative for the test cases in D151180 that works for
more targets.
This is similar to some of the special cases we have for expanding
setcc operands.
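To show why these two cases admit a simpler Lo, here is a sketch that splits an 8-bit signed value into a signed 4-bit Hi and an unsigned 4-bit Lo and checks one possible expansion exhaustively. The select-on-sign form below is derived for this illustration; it is not copied from the patch, and the node sequence the patch emits may differ.

```python
HALF = 4                      # bits per half in this toy model
HALF_MASK = (1 << HALF) - 1
FULL = 2 * HALF


def to_signed(pattern, bits):
    """Interpret a bits-wide pattern as a two's complement value."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def split(x):
    """Split signed x into (signed Hi, unsigned Lo)."""
    pattern = x & ((1 << FULL) - 1)
    return to_signed(pattern >> HALF, HALF), pattern & HALF_MASK


def join(hi, lo):
    """Rebuild the signed full-width value from its halves."""
    return to_signed(((hi & HALF_MASK) << HALF) | lo, FULL)


def expand_smin_minus_one(hi, lo):
    """smin(X, -1): keep Lo if X is negative, else Lo is all ones."""
    return min(hi, -1), (lo if hi < 0 else HALF_MASK)


def expand_smax_zero(hi, lo):
    """smax(X, 0): keep Lo if X is non-negative, else Lo is zero."""
    return max(hi, 0), (lo if hi >= 0 else 0)


def expansions_match():
    """Compare both expansions against min/max on every 8-bit signed value."""
    values = range(-(1 << (FULL - 1)), 1 << (FULL - 1))
    return all(
        join(*expand_smin_minus_one(*split(x))) == min(x, -1)
        and join(*expand_smax_zero(*split(x))) == max(x, 0)
        for x in values
    )
```

In both cases only the sign of Hi decides Lo, so no comparison of the low halves is needed.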
Differential Revision: https://reviews.llvm.org/D151182
Joseph Huber [Tue, 23 May 2023 16:09:16 +0000 (11:09 -0500)]
[OpenMP][NFC] clang-format the OpenMP device runtime
These files aren't fully formatted. I'm guessing this was a holdover
from when `clang-format` was totally broken for OpenMP offloading.
Format the files to be more consistent.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D151226
Joseph Huber [Fri, 19 May 2023 16:17:42 +0000 (11:17 -0500)]
[libc] More efficiently send bytes via `send_n` and `recv_n`
Currently we have the `send_n` and `recv_n` routines to stream data,
such as a string to print, to the other side. The first operation is to
send the size so the other side knows the number of bytes to receive.
However, this wasted 56 bytes that could've been sent. This meant that
small values, like the arguments to a function to call on the host for
example, needed to perform an extra send. This patch sends the first 56
bytes in the first packet and continues if necessary.
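The packing scheme can be sketched as below. The 64-byte packet and the 8-byte little-endian size header are assumptions inferred from the "56 bytes" figure above; the function bodies are a host-side illustration only and do not reflect the real RPC interface.

```python
import struct

PACKET_SIZE = 64                              # assumed packet size
HEADER_SIZE = 8                               # assumed 64-bit size field
FIRST_PAYLOAD = PACKET_SIZE - HEADER_SIZE     # 56 bytes ride with the size


def send_n(data):
    """Split data into fixed-size packets; the first also carries the size."""
    first = struct.pack("<Q", len(data)) + data[:FIRST_PAYLOAD]
    packets = [first.ljust(PACKET_SIZE, b"\0")]
    for offset in range(FIRST_PAYLOAD, len(data), PACKET_SIZE):
        chunk = data[offset:offset + PACKET_SIZE]
        packets.append(chunk.ljust(PACKET_SIZE, b"\0"))
    return packets


def recv_n(packets):
    """Reassemble the bytes produced by send_n."""
    (size,) = struct.unpack_from("<Q", packets[0])
    data = packets[0][HEADER_SIZE:]
    for packet in packets[1:]:
        data += packet
    return data[:size]
```

Under these assumptions, any payload of up to 56 bytes fits in a single packet, which is the case the commit message describes for small values such as function arguments.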
Depends on D150992
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D151041
Joseph Huber [Fri, 19 May 2023 19:58:32 +0000 (14:58 -0500)]
[libc] Fix the `send_n` and `recv_n` utilities under divergent lanes
We provide the `send_n` and `recv_n` utilities as a generic way to
stream data between both sides of the process. This was previously
tested and performed as expected when using a string of constant size.
However, when the size was allowed to diverge between the threads in the
warp or wavefront this could deadlock. This did not occur on NVPTX
because of the use of the explicit warp sync. However, on AMD one of the
work items in the wavefront could continue executing and hit the next
`recv` call before the other threads, then we would deadlock as we
violated the RPC invariants.
This patch replaces the for loop with a thread ballot. This will cause
every thread in the warp or wavefront to continue executing the loop
until all of them can exit. This acts as a more explicit wavefront sync.
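The ballot-driven loop can be modeled sequentially as follows. This is only a simulation of the control flow, with made-up names and an assumed chunk size of 64 bytes; real device code would use a hardware ballot across the warp or wavefront rather than a Python list.

```python
def stream_with_ballot(lane_sizes, chunk=64):
    """Simulate lanes that stream different amounts of data in lockstep.

    Returns (rounds, sent): the number of loop iterations each lane took
    part in, and the number of bytes each lane transferred.
    """
    sent = [0] * len(lane_sizes)
    rounds = [0] * len(lane_sizes)
    while True:
        # The "ballot": one vote per lane, true while it still has data.
        ballot = [sent[i] < size for i, size in enumerate(lane_sizes)]
        if not any(ballot):
            break
        for lane, has_data in enumerate(ballot):
            rounds[lane] += 1  # every lane executes this iteration
            if has_data:
                sent[lane] = min(sent[lane] + chunk, lane_sizes[lane])
    return rounds, sent


rounds, sent = stream_with_ballot([10, 200, 0, 64])
```

Every lane ends up with the same iteration count, set by the lane with the most data, so no lane can run ahead to the next `recv` while others are still in the loop.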
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D150992
Nikolas Klauser [Tue, 23 May 2023 15:59:13 +0000 (08:59 -0700)]
[libc++] Remove tests from ranges.pass.cpp which violate semantic requirements
This also removes some tests which we have grouped together into robust_from_*.pass.cpp tests.
Specifically, checking that
- `ranges::dangling` is returned is done in `libcxx/test/std/algorithms/ranges_robust_against_dangling.pass.cpp`
- `std::invoke` is used is done in `libcxx/test/std/algorithms/ranges_robust_against_omitting_invoke.pass.cpp`.
- implicit conversion to bool works is done in `libcxx/test/std/algorithms/ranges_robust_against_nonbool_predicates.pass.cpp`
Checking the comparison order is invalid because the `operator==` isn't symmetric.
Checking what the exact type of `operator==` is, is invalid because comparing the same object has to yield the same results if the objects are not modified.
Reviewed By: ldionne, #libc
Spies: EricWF, libcxx-commits
Differential Revision: https://reviews.llvm.org/D150588
Nikolas Klauser [Tue, 23 May 2023 15:58:14 +0000 (08:58 -0700)]
[libc++][NFC] Move basic_ios extern instantiations into <ios>
`basic_ios` is defined in `<ios>`, so it seems weird that we declare the explicit instantiation for it in `<streambuf>`, which is technically unrelated.
Reviewed By: #libc, EricWF, ldionne
Spies: ldionne, EricWF, libcxx-commits
Differential Revision: https://reviews.llvm.org/D150912
Yaxun (Sam) Liu [Thu, 11 May 2023 16:57:21 +0000 (12:57 -0400)]
[HIP] Allow std::malloc in device function
D106463 caused a regression that prevents std::malloc from being
called in a device function, which is allowed with nvcc.
The standard C++ header introduces malloc into the std
namespace with a using-declaration for ::malloc, so the device
::malloc function needs to be declared before ::malloc is
introduced into the std namespace.
Revert D106463 and add a test.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D150965
Nikolas Klauser [Tue, 23 May 2023 15:40:47 +0000 (08:40 -0700)]
[libc++][NFC] Fix whitespace problems in the files added to ignore_format.txt in D151115
Reviewed By: ldionne, #libc, Mordante
Spies: arichardson, Mordante, libcxx-commits
Differential Revision: https://reviews.llvm.org/D151119
Felipe de Azevedo Piovezan [Thu, 11 May 2023 13:01:12 +0000 (09:01 -0400)]
[lldb][NFCI] Use llvm's libDebugInfo for DebugRanges
In an effort to unify the different dwarf parsers available in the codebase,
this commit removes LLDB's custom parsing for the `.debug_ranges` DWARF section,
instead calling into LLVM's parser.
Subsequent work should look into unifying `llvm::DWARFDebugRangeList` (whose
entries are pairs of (start, end) addresses) with `lldb::DWARFRangeList` (whose
entries are pairs of (start, length)). The lists themselves are also different
data structures, but functionally equivalent.
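As a rough illustration of the mismatch described above, the sketch below converts (start, end) entries into (start, length) entries. The struct and function names are hypothetical stand-ins, not the real LLVM or LLDB classes, and the half-open interpretation of `end` is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the two entry shapes; not the real classes.
struct StartEndEntry {
  uint64_t start;
  uint64_t end; // assumed to be one past the last address
};
struct StartLengthEntry {
  uint64_t start;
  uint64_t length;
};

// Convert (start, end) pairs into (start, length) pairs.
// Entries whose end precedes their start are skipped as malformed.
inline std::vector<StartLengthEntry>
toStartLength(const std::vector<StartEndEntry> &in) {
  std::vector<StartLengthEntry> out;
  out.reserve(in.size());
  for (const StartEndEntry &e : in) {
    if (e.end < e.start)
      continue;
    out.push_back({e.start, e.end - e.start});
  }
  return out;
}
```

A unified representation would make this kind of conversion unnecessary at the boundary between the two parsers.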
Depends on D150363
Differential Revision: https://reviews.llvm.org/D150366
Tue Ly [Sun, 21 May 2023 05:27:38 +0000 (01:27 -0400)]
[libc][math] Implement double precision log1p correctly rounded to all rounding modes.
Implement double precision log1p function correctly rounded to all
rounding modes.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.93%.
- Benchmarks with `./perf.sh` tool from the CORE-MATH project, unit is (CPU clocks / call).
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log1p
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 39.792 + 1.011 clc/call; Median-Min = 0.940 clc/call; Max = 41.373 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 87.285 + 1.135 clc/call; Median-Min = 1.299 clc/call; Max = 89.715 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 20.666 + 0.123 clc/call; Median-Min = 0.125 clc/call; Max = 20.828 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.928 + 0.771 clc/call; Median-Min = 0.725 clc/call; Max = 22.767 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 31.461 + 0.528 clc/call; Median-Min = 0.602 clc/call; Max = 36.809 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log1p --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 77.875 + 0.062 clc/call; Median-Min = 0.051 clc/call; Max = 78.003 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 101.958 + 1.202 clc/call; Median-Min = 1.325 clc/call; Max = 104.452 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 60.581 + 1.443 clc/call; Median-Min = 1.611 clc/call; Max = 62.285 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.817 + 1.108 clc/call; Median-Min = 1.300 clc/call; Max = 50.282 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 61.121 + 0.599 clc/call; Median-Min = 0.761 clc/call; Max = 62.020 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log1p --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
760.444
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
827.880
-- LIBC latency -- with FMA
711.837
-- LIBC latency -- without FMA
764.317
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D151049
Nikita Popov [Tue, 23 May 2023 14:59:02 +0000 (16:59 +0200)]
[InstCombine] Add droppable users back to worklist (NFCI)
When droppable users are dropped while sinking, add those using
instructions to the worklist, as they can likely be removed as well.
This should be NFC apart from worklist order effects.
Jean Perier [Tue, 23 May 2023 15:00:15 +0000 (17:00 +0200)]
[flang][NFC] Move Array constructor inlined temp management into a utility
This patch moves the counter and storage management part of the array
constructor inlined temporary strategy into its own utility so that it
can be reused for the simple cases of temporary creations inside WHERE
and FORALL.
It actually fixes a bug where the counter first value used for addressing
was "2" leading to read/write after the allocated storage... It seems
I ran the tests end-to-end without the HLFIR flag when previously testing
this. So this may clear some segfaults.
Differential Revision: https://reviews.llvm.org/D151106
Tom Eccles [Wed, 17 May 2023 16:07:41 +0000 (16:07 +0000)]
[flang] use greedy mlir driver for stack arrays pass
In upstream mlir, the dialect conversion infrastructure is used for
lowering from one dialect to another: the passes are of the form
XToYPass. Whereas, transformations within the same dialect tend to use
applyPatternsAndFoldGreedily.
In this case, the full complexity of applyPatternsAndFoldGreedily isn't
needed so we can get away with the simpler applyOpPatternsAndFold.
This change was suggested by @jeanPerier
Differential Revision: https://reviews.llvm.org/D150853
Tue Ly [Thu, 11 May 2023 15:10:02 +0000 (11:10 -0400)]
[libc][math] Implement double precision log2 function correctly rounded to all rounding modes.
Implement double precision log2 function correctly rounded to all
rounding modes.
See https://reviews.llvm.org/D150014 for a more detailed description of the algorithm.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.91%.
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log2
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 15.458 + 0.204 clc/call; Median-Min = 0.224 clc/call; Max = 15.867 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 23.711 + 0.524 clc/call; Median-Min = 0.443 clc/call; Max = 25.307 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 14.807 + 0.199 clc/call; Median-Min = 0.211 clc/call; Max = 15.137 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.666 + 0.274 clc/call; Median-Min = 0.298 clc/call; Max = 18.531 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 26.534 + 0.418 clc/call; Median-Min = 0.462 clc/call; Max = 27.327 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log2 --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 46.048 + 1.643 clc/call; Median-Min = 1.694 clc/call; Max = 48.018 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 62.333 + 0.138 clc/call; Median-Min = 0.119 clc/call; Max = 62.583 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 45.206 + 1.503 clc/call; Median-Min = 1.467 clc/call; Max = 47.229 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 43.042 + 0.454 clc/call; Median-Min = 0.484 clc/call; Max = 43.912 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 57.016 + 1.636 clc/call; Median-Min = 1.655 clc/call; Max = 58.816 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log2 --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
177.632
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
231.332
-- LIBC latency -- with FMA
459.751
-- LIBC latency -- without FMA
463.850
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D150374
Krzysztof Parzyszek [Tue, 23 May 2023 12:46:15 +0000 (05:46 -0700)]
[Hexagon] Fix safety check in moving instructions in HVC::AlignVectors
A prior commit accidentally affected a safety check allowing aliased memory
instructions to be moved across one another.
Manna, Soumi [Tue, 23 May 2023 14:36:15 +0000 (07:36 -0700)]
[NFC][CLANG] Fix static code analyzer concerns
Reported by Static Code Analyzer Tool, Coverity:
Dereference null return value
Inside "ExprConstant.cpp" file, in <unnamed>::RecordExprEvaluator::VisitCXXStdInitializerListExpr(clang::CXXStdInitializerListExpr const *): Return value of function which returns null is dereferenced without checking.
bool RecordExprEvaluator::VisitCXXStdInitializerListExpr(
const CXXStdInitializerListExpr *E) {
// returned_null: getAsConstantArrayType returns nullptr (checked 81 out of 93 times).
//var_assigned: Assigning: ArrayType = nullptr return value from getAsConstantArrayType.
const ConstantArrayType *ArrayType =
Info.Ctx.getAsConstantArrayType(E->getSubExpr()->getType());
LValue Array;
//Condition !EvaluateLValue(E->getSubExpr(), Array, this->Info, false), taking false branch.
if (!EvaluateLValue(E->getSubExpr(), Array, Info))
return false;
// Get a pointer to the first element of the array.
//Dereference null return value (NULL_RETURNS)
//dereference: Dereferencing a pointer that might be nullptr ArrayType when calling addArray.
Array.addArray(Info, E, ArrayType);
This patch adds an assert for unexpected type for array initializer.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151040
Kadir Cetinkaya [Fri, 5 May 2023 10:39:09 +0000 (12:39 +0200)]
[include-cleaner] Treat references to nested types implicit
Differential Revision: https://reviews.llvm.org/D149948
Nikita Popov [Tue, 23 May 2023 14:36:11 +0000 (16:36 +0200)]
[InstCombine] Fix worklist management in select value equiv fold (NFCI)
Requeue the modified instruction.
This should be NFC apart from worklist order effects.
Tue Ly [Mon, 8 May 2023 18:03:52 +0000 (14:03 -0400)]
[libc][math] Implement double precision log function correctly rounded to all rounding modes.
Implement double precision log function correctly rounded to all
rounding modes.
See https://reviews.llvm.org/D150014 for a more detailed description of the algorithm.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.93%.
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.465 + 0.596 clc/call; Median-Min = 0.602 clc/call; Max = 18.389 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 54.961 + 2.606 clc/call; Median-Min = 2.180 clc/call; Max = 59.583 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 12.608 + 0.276 clc/call; Median-Min = 0.359 clc/call; Max = 13.147 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.952 + 0.468 clc/call; Median-Min = 0.602 clc/call; Max = 21.881 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 18.569 + 0.552 clc/call; Median-Min = 0.601 clc/call; Max = 19.259 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.431 + 0.699 clc/call; Median-Min = 0.073 clc/call; Max = 51.269 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 64.865 + 3.235 clc/call; Median-Min = 3.475 clc/call; Max = 71.788 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 42.151 + 2.090 clc/call; Median-Min = 2.270 clc/call; Max = 44.773 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 35.266 + 0.479 clc/call; Median-Min = 0.373 clc/call; Max = 36.798 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.518 + 0.484 clc/call; Median-Min = 0.500 clc/call; Max = 49.896 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
598.306
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
632.925
-- LIBC latency -- with FMA
455.632
-- LIBC latency -- without FMA
488.564
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D150131
Manna, Soumi [Tue, 23 May 2023 14:22:40 +0000 (07:22 -0700)]
[NFC][Clang] Fix Coverity bug with dereference null return value in clang::CodeGen::CodeGenFunction::EmitOMPArraySectionExpr()
Reported by Coverity:
Inside "CGExpr.cpp" file, in clang::CodeGen::CodeGenFunction::EmitOMPArraySectionExpr(clang::OMPArraySectionExpr const *, bool): Return value of function which returns null is dereferenced without checking.
} else {
//returned_null: getAsConstantArrayType returns nullptr (checked 83 out of 95 times).
// var_assigned: Assigning: CAT = nullptr return value from getAsConstantArrayType.
auto *CAT = C.getAsConstantArrayType(ArrayTy);
//identity_transfer: Member function call CAT->getSize() returns an offset off CAT (this).
// Dereference null return value (NULL_RETURNS)
//dereference: Dereferencing a pointer that might be nullptr CAT->getSize() when calling APInt.
ConstLength = CAT->getSize();
}
This patch adds an assert to resolve the bug.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151137
Nikita Popov [Tue, 23 May 2023 14:24:41 +0000 (16:24 +0200)]
[InstCombine] Regenerate test checks (NFC)
Nikita Popov [Tue, 23 May 2023 14:20:41 +0000 (16:20 +0200)]
[InstCombine] Fix worklist management in replaceGEPIdxWithZero() fold (NFCI)
Make sure the old load/store operand is queued for DCE.
This should be NFC apart from worklist order effects.
Joseph Huber [Tue, 23 May 2023 14:16:30 +0000 (09:16 -0500)]
[libc][AMDGPU] Disable the AMDGPU backend's ctor/dtor lowering for libc
The AMDGPU backend has a built-in pass to lower constructors. We do this
manually in the `start.cpp` implementation so we can disable this to
keep the binaries smaller.
Differential Revision: https://reviews.llvm.org/D151213
Jonathan Peyton [Mon, 22 May 2023 19:08:51 +0000 (14:08 -0500)]
[OpenMP] Insert missing variable update inside loop
The while loop within the task priority code was missing a necessary
update of a variable, which could lead to hangs if two threads collided
when both attempted to execute the compare_and_exchange.
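The general shape of this class of bug can be sketched as follows. This is not the OpenMP runtime code: `casBool` and `incrementWithCas` are hypothetical names, and the helper deliberately mimics a C-style compare-and-exchange that only reports success or failure, so the caller must refresh its own copy of the shared value before retrying.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical CAS helper that, like many C-style primitives, returns only
// success or failure and does not hand back the value it observed.
inline bool casBool(std::atomic<int> &target, int expected, int desired) {
  return target.compare_exchange_strong(expected, desired);
}

// Increment `counter` with a CAS retry loop and return the new value.
inline int incrementWithCas(std::atomic<int> &counter) {
  int observed = counter.load();
  while (!casBool(counter, observed, observed + 1)) {
    // The update that must not be omitted: re-read the shared value.
    // Without it, a thread that lost the race would retry with a stale
    // `observed` forever, which shows up as a hang.
    observed = counter.load();
  }
  return observed + 1;
}
```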
Fixes: https://github.com/llvm/llvm-project/issues/62867
Differential Revision: https://reviews.llvm.org/D151138
Tue Ly [Sat, 6 May 2023 02:08:42 +0000 (22:08 -0400)]
[libc][math] Make log10 correctly rounded for non-FMA targets and improve its performance.
Make log10 correctly rounded for non-FMA targets and improve its
performance.
Implemented fast pass and accurate pass:
**Fast Pass**:
- Range reduction step 0: Extract exponent and mantissa
```
x = 2^(e_x) * m_x
```
- Range reduction step 1: Use lookup tables of size 2^7 = 128 to reduce the argument to:
```
-2^-8 <= v = r * m_x - 1 < 2^-7
where r = 2^-8 * ceil( 2^8 * (1 - 2^-8) / (1 + k * 2^-7) )
and k = trunc( (m_x - 1) * 2^7 )
```
- Polynomial approximation: approximate `log(1 + v)` by a degree-7 polynomial generated by Sollya with:
```
> P = fpminimax((log(1 + x) - x)/x^2, 5, [|D...|], [-2^-8, 2^-7]);
```
- Combine the results:
```
log10(x) ~ ( e_x * log(2) - log(r) + v + v^2 * P(v) ) * log10(e)
```
- Perform additive Ziv's test with errors bounded by `P_ERR * v^2`. Return the result if Ziv's test passed.
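The fast-pass range reduction above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `Reduced`, `reduceStep1`, and `log10Sketch` are hypothetical names, `std::log` and `std::log1p` stand in for the lookup tables and the Sollya polynomial, Ziv's test is omitted, and the input is assumed to be positive, finite, and normal. It is not correctly rounded.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the quantities named in the steps above.
struct Reduced {
  int e_x;  // exponent, x = 2^(e_x) * m_x
  int k;    // table index, 0 <= k < 128
  double r; // reduction constant for index k
  double v; // reduced argument, -2^-8 <= v < 2^-7
};

// Range reduction steps 0 and 1. Assumes x is positive, finite and normal.
inline Reduced reduceStep1(double x) {
  int e = 0;
  double m = std::frexp(x, &e); // x = m * 2^e with m in [0.5, 1)
  m *= 2.0;                     // renormalize so that m_x is in [1, 2)
  e -= 1;
  int k = static_cast<int>((m - 1.0) * 128.0); // trunc((m_x - 1) * 2^7)
  // r = 2^-8 * ceil(2^8 * (1 - 2^-8) / (1 + k * 2^-7)); note 2^8 * (1 - 2^-8) = 255.
  double r = std::ceil(255.0 / (1.0 + k / 128.0)) / 256.0;
  double v = r * m - 1.0;
  return {e, k, r, v};
}

// Combine step, with library calls swapped in for the tables and polynomial.
inline double log10Sketch(double x) {
  const double LOG_2 = 0.6931471805599453;   // log(2)
  const double LOG10_E = 0.4342944819032518; // log10(e)
  Reduced s = reduceStep1(x);
  return (s.e_x * LOG_2 - std::log(s.r) + std::log1p(s.v)) * LOG10_E;
}
```

In the real implementation the values of `r` and `-log(r)` would come from precomputed tables indexed by `k`; computing them on the fly here just keeps the sketch short.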
**Accurate Pass**:
- Take `e_x`, `v`, and the lookup table index from the range reduction step of fast pass.
- Perform 3 more range reduction steps:
- Range reduction step 2: Use look-up tables of size 193 to reduce the argument to `[-0x1.3ffcp-15, 0x1.3e3dp-15]`
```
v2 = r2 * (1 + v) - 1 = (1 + s2) * (1 + v) - 1 = s2 + v + s2 * v
where r2 = 2^-16 * round ( 2^16 / (1 + k * 2^-14) )
and k = trunc( v * 2^14 + 0.5 ).
```
- Range reduction step 3: Use look-up tables of size 161 to reduce the argument to `[-0x1.01928p-22 , 0x1p-22]`
```
v3 = r3 * (1 + v2) - 1 = (1 + s3) * (1 + v2) - 1 = s3 + v2 + s3 * v2
where r3 = 2^-21 * round ( 2^21 / (1 + k * 2^-21) )
and k = trunc( v2 * 2^21 + 0.5 ).
```
- Range reduction step 4: Use look-up tables of size 130 to reduce the argument to `[-0x1.0002143p-29 , 0x1p-29]`
```
v4 = r4 * (1 + v3) - 1 = (1 + s4) * (1 + v3) - 1 = s4 + v3 + s4 * v3
where r4 = 2^-28 * round ( 2^28 / (1 + k * 2^-28) )
and k = trunc( v3 * 2^28 + 0.5 ).
```
- Polynomial approximation: approximate `log10(1 + v4)` by a degree-4 minimax polynomial generated by Sollya with:
```
> P = fpminimax(log10(1 + x)/x, 3, [|128...|], [-0x1.0002143p-29 , 0x1p-29]);
```
- Combine the results:
```
log10(x) ~ e_x * log10(2) - log10(r) - log10(r2) - log10(r3) - log10(r4) + v4 * P(v4)
```
- The combined results are computed using floating points of 128-bit precision.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.92%.
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log10
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.402 + 0.589 clc/call; Median-Min = 0.277 clc/call; Max = 22.752 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 75.797 + 3.317 clc/call; Median-Min = 3.407 clc/call; Max = 79.371 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 22.668 + 0.184 clc/call; Median-Min = 0.181 clc/call; Max = 23.205 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 25.977 + 0.183 clc/call; Median-Min = 0.138 clc/call; Max = 26.283 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 22.140 + 0.980 clc/call; Median-Min = 0.853 clc/call; Max = 23.790 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log10 --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 54.613 + 0.357 clc/call; Median-Min = 0.287 clc/call; Max = 55.701 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 79.681 + 0.482 clc/call; Median-Min = 0.294 clc/call; Max = 81.604 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 61.532 + 0.208 clc/call; Median-Min = 0.199 clc/call; Max = 62.256 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 41.510 + 0.205 clc/call; Median-Min = 0.244 clc/call; Max = 41.867 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 55.669 + 0.240 clc/call; Median-Min = 0.280 clc/call; Max = 56.056 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log10 --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
640.688
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
667.354
-- LIBC latency -- with FMA
495.593
-- LIBC latency -- without FMA
504.143
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D150014
Manna, Soumi [Tue, 23 May 2023 14:07:09 +0000 (07:07 -0700)]
[NFC][CLANG] Fix static code analyzer concerns with dereference null return value
Reported by Static Code Analyzer Tool, Coverity:
Inside "SemaExprMember.cpp" file, in clang::Sema::BuildMemberReferenceExpr(clang::Expr *, clang::QualType, clang::SourceLocation, bool, clang::CXXScopeSpec &, clang::SourceLocation, clang::NamedDecl *, clang::DeclarationNameInfo const &, clang::TemplateArgumentListInfo const *, clang::Scope const *, clang::Sema::ActOnMemberAccessExtraArgs *): Return value of function which returns null is dereferenced without checking
//Condition !Base, taking true branch.
if (!Base) {
TypoExpr *TE = nullptr;
QualType RecordTy = BaseType;
//Condition IsArrow, taking true branch.
if (IsArrow) RecordTy = RecordTy->castAs<PointerType>()->getPointeeType();
//returned_null: getAs returns nullptr (checked 279 out of 294 times).
//Condition TemplateArgs != NULL, taking true branch.
//Dereference null return value (NULL_RETURNS)
//dereference: Dereferencing a pointer that might be nullptr RecordTy->getAs() when calling LookupMemberExprInRecord.
if (LookupMemberExprInRecord(
*this, R, nullptr, RecordTy->getAs<RecordType>(), OpLoc, IsArrow,
SS, TemplateArgs != nullptr, TemplateKWLoc, TE))
return ExprError();
if (TE)
return TE;
This patch uses castAs instead of getAs which will assert if the type doesn't match.
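The difference between the two accessors can be sketched with a miniature hierarchy. The classes below are hypothetical and built on `dynamic_cast`; they are not Clang's real `Type` classes, and only illustrate that `getAs` reports a mismatch by returning null while `castAs` asserts at the cast itself.

```cpp
#include <cassert>

// Hypothetical miniature type hierarchy; not Clang's real classes.
struct Type {
  virtual ~Type() = default;

  // Returns nullptr when the dynamic type does not match, so every caller
  // has to check the result before dereferencing it.
  template <typename T> const T *getAs() const {
    return dynamic_cast<const T *>(this);
  }

  // Asserts on a mismatch, so a wrong assumption fails at the cast rather
  // than at a later null dereference.
  template <typename T> const T *castAs() const {
    const T *result = dynamic_cast<const T *>(this);
    assert(result && "castAs<T>() called on an incompatible type");
    return result;
  }
};

struct RecordType : Type {
  int fieldCount = 2;
};
struct PointerType : Type {};
```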
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151130
Nikita Popov [Tue, 23 May 2023 14:04:24 +0000 (16:04 +0200)]
[Driver] Try to fix linux-ld.c test with DEFAULT_LINKER set (NFC)
The test fails on the clang-ppc64le-rhel build bot, which has
DEFAULT_LINKER set and an ld.lld binary in the LLVM build directory.
Joseph Huber [Mon, 15 May 2023 12:59:53 +0000 (07:59 -0500)]
[AMDGPU] Add an option to disable manual ctor / dtor lowering
Currently AMDGPU offers extra ctor / dtor lowering by emitting a kernel
that can be called. It's possible to handle ctors and dtors using the
standard method as shown in D149340's commit message, in which case we
don't need these extra kernels as they won't be called. This patch simply
adds a way to conditionally turn off this handling if we do not want to
get extra kernels in the output.
Unrelated, but we could convert this handling to an ODR function that simply
calls the code in D149340 constructed via LLVM-IR. That would handle priority
correctly and would then be correct if not run in LTO mode.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D150565
Fangrui Song [Tue, 23 May 2023 13:59:01 +0000 (06:59 -0700)]
[ubsan][test] Remove --check-prefix=UNIQUE for x86_64-apple from
e215996a2932ed7c472f4e94dc4345b30fd0c373
After switching to use a type hash instead of possibly-non-unique typeinfo
objects, we no longer have unique/non-unique distinction.
Nikita Popov [Tue, 23 May 2023 13:39:53 +0000 (15:39 +0200)]
[InstCombine] Remove dead extractelements (NFCI)
Directly remove these dead extractelement instructions, rather than
leaving them for the next InstCombine iteration to clean up.
Should be mostly NFC, apart from worklist order differences.
Matthias Springer [Tue, 23 May 2023 13:22:20 +0000 (15:22 +0200)]
[mlir][bufferization] Fix bug in findValueInReverseUseDefChain
This bug was recently introduced in D143927 and manifests as a dominance violation.
Differential Revision: https://reviews.llvm.org/D151077
Aaron Ballman [Tue, 23 May 2023 13:28:05 +0000 (09:28 -0400)]
Silence switch statement contains 'default' but no 'case' labels warning; NFC
These are showing up in MSVC builds.
Dinar Temirbulatov [Tue, 23 May 2023 13:24:01 +0000 (13:24 +0000)]
[AArch64][LV] Disable maximising bandwidth for streaming compatible sve
Fixing last commit by adding actual change to AArch64TargetTransformInfo.cpp
Differential Revision: https://reviews.llvm.org/D150336
Thomas Preud'homme [Tue, 16 May 2023 09:24:57 +0000 (09:24 +0000)]
Add StringRef::consumeInteger(APInt)
This will be required to allow arbitrary precision support to
FileCheck's numeric variables and expressions. Note: as with
getAsInteger(), this does not support negative values. If there is
interest in that, it can be added in a separate patch.
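The consume-style semantics can be sketched without LLVM as follows. This is not the `StringRef` API: `consumeUnsignedDecimal` and `Limbs` are hypothetical, a vector of little-endian 32-bit limbs stands in for `llvm::APInt`, and only radix 10 is handled. Returning true on error and leaving the input untouched in that case are assumed here to match the existing `consumeInteger` overloads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical arbitrary-precision value: little-endian 32-bit limbs.
using Limbs = std::vector<uint32_t>;

// Parse a leading run of decimal digits from `s` into `result` and remove
// it from the front of `s`. Returns true on error (no leading digit, which
// includes a leading '-'), in which case `s` and `result` are unchanged.
inline bool consumeUnsignedDecimal(std::string_view &s, Limbs &result) {
  std::size_t n = 0;
  Limbs value{0};
  while (n < s.size() && s[n] >= '0' && s[n] <= '9') {
    // value = value * 10 + digit, propagating the carry across limbs.
    uint64_t carry = static_cast<uint64_t>(s[n] - '0');
    for (uint32_t &limb : value) {
      uint64_t t = static_cast<uint64_t>(limb) * 10 + carry;
      limb = static_cast<uint32_t>(t);
      carry = t >> 32;
    }
    if (carry != 0)
      value.push_back(static_cast<uint32_t>(carry));
    ++n;
  }
  if (n == 0)
    return true;
  result = value;
  s.remove_prefix(n);
  return false;
}
```

Because the value grows by appending limbs, inputs wider than 64 bits parse without overflow, which is the property arbitrary-precision support relies on.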
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D150878
Dinar Temirbulatov [Tue, 23 May 2023 12:58:19 +0000 (12:58 +0000)]
[AArch64][LV] Disable maximising bandwidth for streaming compatible sve
We noticed some runtime performance improvements by disabling maximising
bandwidth for streaming compatible sve.
Differential Revision: https://reviews.llvm.org/D150336
Thomas Preud'homme [Tue, 16 May 2023 09:22:01 +0000 (09:22 +0000)]
Turn unreachable error into assert
Function valueFromStringRepr() throws an error on missing 0x prefix when
parsing a number string into a value. However, getWildcardRegex() already
ensures that only text with the 0x prefix will match and be parsed,
making that error throwing code dead code. This commit turns the code
into an assert and removes the unit tests exercising that error path
accordingly.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150797