Kirill Okhotnikov [Mon, 6 Jun 2022 08:10:24 +0000 (10:10 +0200)]
[libc][math] fmod/fmodf implementation.
This is an implementation of the fmod (floating-point remainder) function from the standard libm.
The underlying algorithm was developed by myself, but it was probably
invented before.
Some features of the implementation:
1. The code is written in more-or-less modern C++.
2. One general implementation for both float and double precision numbers.
3. Platform/architecture-dependent and independent code and tests are split.
4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc.
5. The new implementation is in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for huge x/y values, but can also be 2 times slower in the double denormalized range (according to the perf tests provided).
6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation. Depending on the platform, it can be 3-10 times slower than multiplication.
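For readers unfamiliar with how a remainder can be computed exactly, here is a minimal sketch of one generic approach: split both operands into integer mantissas and exponents, then reduce the shifted mantissa of x modulo the mantissa of y a few bits at a time. This is an illustration only and is not taken from the patch; the name `fmod_sketch` and the chunk size `CHUNK` are assumptions made for the example.

```python
import math

CHUNK = 11  # bits shifted per step; keeps (r << CHUNK) below 2**64 since r < 2**53


def fmod_sketch(x: float, y: float) -> float:
    """Remainder of x / y carrying the sign of x, computed exactly on integers."""
    if math.isnan(x) or math.isnan(y) or math.isinf(x) or y == 0.0:
        return math.nan
    ax, ay = abs(x), abs(y)
    if math.isinf(y) or ax < ay:
        return x
    # Write |x| = mx * 2**(ex - 53) and |y| = my * 2**(ey - 53) with integer mx, my.
    fx, ex = math.frexp(ax)
    fy, ey = math.frexp(ay)
    mx, my = int(fx * 2**53), int(fy * 2**53)
    # Reduce (mx * 2**(ex - ey)) mod my a few bits at a time.
    r = mx % my
    d = ex - ey  # non-negative because |x| >= |y| and both mantissas are normalized
    while d > 0 and r != 0:
        s = min(d, CHUNK)
        r = (r << s) % my
        d -= s
    return math.copysign(math.ldexp(float(r), ey - 53), x)
```

Each loop step costs one integer modulo, which is why the relative cost of division on a given platform matters for this kind of algorithm.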
Performance tests:
The test is based on the core-math project (https://gitlab.inria.fr/core-math/core-math). At Tue Ly's suggestion I took the hypot function and used it as a template for fmod, preserving all test cases.
`./check.sh <--special|--worst> fmodf` passed.
`CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are
```
GNU libc version: 2.35
GNU libc release: stable
21.166 <-- FPU
51.031 <-- current glibc
37.659 <-- this fmod version.
```
Fangrui Song [Fri, 24 Jun 2022 20:52:27 +0000 (13:52 -0700)]
Revert "[Driver][test] Replace ^//$ with empty string"
This reverts commit 4817b7729a1846b709ec02b98bfe11b0125f8e8f.
It caused some `^/\n` lines, and there were objections about whether it improves readability.
Philip Reames [Fri, 24 Jun 2022 20:08:39 +0000 (13:08 -0700)]
[RISCV] Simplify 16 bit index handling in lowerVECTOR_REVERSE [nfc]
getRealMaxVLen returns an upper bound on the value of VLEN. We can use this upper bound (which, unless explicitly set at the command line, results in an e8 MaxVLMax much greater than 256) instead of handling the unknown case separately from the case bounded by a number greater than 256.
Note as well that this code already implicitly depends on a capped value for VLEN. If infinite VLEN were possible, then 16 bit indices wouldn't be enough.
Philip Reames [Fri, 24 Jun 2022 19:59:06 +0000 (12:59 -0700)]
[RISCV] Replace two calls to getMinRVVVectorSizeInBits in fixed length lowering [nfc]
Both of these are only reached if useRVVForFixedLengthVectors is true. Given that, we know that getRealMinVLen() == getMinRVVVectorSizeInBits().
Wei Yi Tee [Fri, 24 Jun 2022 19:39:23 +0000 (21:39 +0200)]
[clang][dataflow] Store flow condition constraints in a single `FlowConditionConstraints` map.
A flow condition is represented with an atomic boolean token, and it is bound to a set of constraints: `(FC <=> C1 ^ C2 ^ ...)`.
This was internally represented as `(FC v !C1 v !C2 v ...) ^ (C1 v !FC) ^ (C2 v !FC) ^ ...` and tracked by 2 maps:
- `FlowConditionFirstConjunct` stores the first conjunct `(FC v !C1 v !C2 v ...)`
- `FlowConditionRemainingConjuncts` stores the remaining conjuncts `(C1 v !FC) ^ (C2 v !FC) ^ ...`
This patch simplifies the tracking of the constraints by using a single `FlowConditionConstraints` map which stores `(C1 ^ C2 ^ ...)`, eliminating the use of two maps.
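As a sanity check on the two encodings described above, the following sketch brute-forces all truth assignments and confirms that the two-map CNF form is equivalent to the biconditional `FC <=> (C1 ^ C2 ^ ...)`. It is not code from the patch; the function names are hypothetical and exist only for this illustration.

```python
from itertools import product


def biconditional(fc: bool, constraints: list) -> bool:
    """FC <=> (C1 ^ C2 ^ ...), with ^ read as conjunction."""
    return fc == all(constraints)


def two_map_cnf(fc: bool, constraints: list) -> bool:
    """(FC v !C1 v !C2 v ...) ^ (C1 v !FC) ^ (C2 v !FC) ^ ..."""
    first_conjunct = fc or any(not c for c in constraints)
    remaining_conjuncts = all(c or not fc for c in constraints)
    return first_conjunct and remaining_conjuncts


def encodings_agree(num_constraints: int) -> bool:
    """Compare both encodings on every assignment of FC and the constraints."""
    return all(
        biconditional(fc, cs) == two_map_cnf(fc, cs)
        for fc, *cs in product([False, True], repeat=num_constraints + 1)
    )
```

Calling `encodings_agree(n)` for small `n` exercises every assignment of FC and the constraints.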
Reviewed By: gribozavr2, sgatev, xazax.hun
Differential Revision: https://reviews.llvm.org/D128357
Alexander Yermolovich [Fri, 24 Jun 2022 19:37:01 +0000 (12:37 -0700)]
[BOLT][DWARF] Add support for DW_AT_call_pc/DW_AT_call_return_pc
DWARF 5 added two new attributes DW_AT_call_pc and DW_AT_call_return_pc.
Adding support for them.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D128526
Thomas Raoux [Fri, 24 Jun 2022 19:11:36 +0000 (19:11 +0000)]
[mlir][vector] Fix bug when swapping scf.for and vector warp op
When creating a scf.for without argument a scf.yield is automatically
created. Make sure we don't create a second one.
Differential Revision: https://reviews.llvm.org/D128405
Philip Reames [Fri, 24 Jun 2022 19:03:33 +0000 (12:03 -0700)]
[RISCV] Replace two calls to getMinRVVVectorSizeInBits with getRealMinVLen [nfc]
This doesn't change behavior, it just makes it slightly more obvious what's
going on. Note that getRealMinVLen is always >= getMinRVVVectorSizeInBits.
The first case is a bit tricky, as you have to know that
getMinRVVVectorSizeInBits returns 0 when not set, and thus is equivalent
to the else value clause. The new code structure makes it more obvious we
return 0 unless using RVV for fixed length vectors.
Valentin Clement [Fri, 24 Jun 2022 19:06:05 +0000 (21:06 +0200)]
[flang][OpenACC] Lower parallel loop
Lower the `parallel loop` construct and refactor some of the code
of parallel and loop lowering to be reused.
Also add tests for loop and parallel since they were not upstreamed.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128510
Valentin Clement [Fri, 24 Jun 2022 19:04:24 +0000 (21:04 +0200)]
[flang][lowering] handle MERGE with different FSOURCE and TSOURCE types
In merge FSOURCE and TSOURCE must have the same Fortran dynamic types,
but this does not imply that FSOURCE and TSOURCE will be lowered to the
same MLIR types. For instance, TSOURCE may be a character expression
with a compile-time constant length (!fir.char<1,4>) while FSOURCE may
have dynamic length (!fir.char<1,?>).
Cast FSOURCE to TSOURCE MLIR types to handle these cases.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128507
Co-authored-by: Jean Perier <jperier@nvidia.com>
Mitch Phillips [Fri, 24 Jun 2022 17:48:18 +0000 (10:48 -0700)]
Add no_sanitize('hwaddress') (and 'memtag', but that's a no-op).
Currently, `__attribute__((no_sanitize('hwaddress')))` is not possible. Add this piece of plumbing, and now that we properly support copying attributes between an old and a new global variable, add a regression test for the GlobalOpt bug that previously lost the attribute.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D127544
Mitch Phillips [Fri, 24 Jun 2022 17:47:34 +0000 (10:47 -0700)]
[HWASan] Use new IR attribute for communicating unsanitized globals.
Globals that shouldn't be sanitized are currently communicated to HWASan
through the use of the llvm.asan.globals IR metadata. Now that we have
an on-GV attribute, use it.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D127543
Valentin Clement [Fri, 24 Jun 2022 19:03:00 +0000 (21:03 +0200)]
[flang] Explicitly map host associated symbols
Explicitly map host associated symbols in DoConcurrent with shared
locality-spec, clauses in OpenMP/OpenACC. The mapping of host-assoc
symbols is set to their parent SymbolBox. This is achieved through
a new interface function in the AbstractConverter.
This was already upstream for OpenMP.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128518
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Thomas Raoux [Fri, 24 Jun 2022 18:40:40 +0000 (18:40 +0000)]
[mlir][vector] Relax transfer_write vector distribution pattern
Small change to relax the pattern to support any vector containing a
single element.
Differential Revision: https://reviews.llvm.org/D128545
Valentin Clement [Fri, 24 Jun 2022 19:01:25 +0000 (21:01 +0200)]
[flang] Fix LBOUND with assumed size array and non constant DIM
LBOUND with a non-constant DIM argument uses the runtime to allow runtime
verification of DIM <= RANK. The interface uses a descriptor. This caused
undefined behavior because the runtime believed it was seeing an explicit
shape array with zero extent and returned `1` (the runtime descriptor
does not allow distinguishing between an explicit shape and an
assumed size; assumed-size arrays are not meant to be described by runtime
descriptors).
Fix the issue by setting the last extent of assumed size to `1` when
creating the descriptor to inquire about the LBOUND with the runtime.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128509
Co-authored-by: Jean Perier <jperier@nvidia.com>
Philip Reames [Fri, 24 Jun 2022 18:35:19 +0000 (11:35 -0700)]
[LV][RISCV] Add coverage showing scalable codegen when etype != ELEN
We currently have a costing bug around the etype == ELEN case, so add otherwise duplicate tests to show test diffs as I work on other parts of costing.
Venkata Ramanaiah Nalamothu [Thu, 2 Jun 2022 17:49:30 +0000 (23:19 +0530)]
[lldb] Fix thread step until to not set breakpoint(s) on incorrect line numbers
The requirements for "thread until <line number>" are:
a) If any code contributed by <line number> or the nearest subsequent of <line number> is executed before leaving the function, stop
b) If you end up leaving the function w/o triggering (a), then stop
In case of (a), since <line number> may have multiple entries in the line table, the compiler might have scheduled/moved the relevant code across them, and lldb does not know the control flow, set breakpoints on all the line table entries of the best match of <line number>, i.e. the exact line or the nearest subsequent line.
Along with the above, currently, CommandObjectThreadUntil is also setting the breakpoints on all the subsequent line numbers after the best match and this latter part is wrong.
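The intended selection rule can be sketched as follows: pick the best-matching line (the exact line, or else the nearest subsequent line that has line table entries) and place breakpoints only on the entries for that one line. This is a simplified model, not lldb code; the `(line, address)` tuple representation and the function name are assumptions for the example.

```python
def until_breakpoint_addresses(line_table, target_line):
    """Return addresses of all line table entries for the best match of target_line.

    line_table is a list of (line, address) pairs for the current function.
    The best match is the exact line if present, else the nearest subsequent line.
    """
    candidates = [line for line, _ in line_table if line >= target_line]
    if not candidates:
        return []
    best = min(candidates)
    # Only entries for the best match get breakpoints, not every later line.
    return [addr for line, addr in line_table if line == best]
```

With a table such as `[(10, 0x100), (12, 0x110), (12, 0x140), (15, 0x150)]` and a target of 11, only the two entries for line 12 are selected; line 15 is left alone.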
This issue is discussed at http://lists.llvm.org/pipermail/lldb-dev/2018-August/013979.html.
In fact, currently `TestStepUntil.py` is not actually testing step until scenarios and `test_missing_one` test fails without this patch if tests are made to run. Fixed the test as well.
Reviewed By: jingham
Differential Revision: https://reviews.llvm.org/D50304
Fangrui Song [Fri, 24 Jun 2022 18:25:03 +0000 (11:25 -0700)]
[Driver][test] Replace ^//$ with empty string
The convention does not add //\n. Having all RUN/CHECK lines separated by //\n
makes editor movement difficult (e.g. { } in Vim).
Jonas Devlieghere [Fri, 24 Jun 2022 18:17:38 +0000 (11:17 -0700)]
[lldb] Move Host::SystemLog out of !defined(_WIN32)
The definition of Host::SystemLog was (unintentionally) guarded by
!defined(_WIN32).
Aart Bik [Thu, 23 Jun 2022 19:17:01 +0000 (12:17 -0700)]
[mlir][bufferization][sparse] put restriction on sparse tensor allocation
Putting some direct use restrictions on tensor allocations in the
sparse case enables the use of simplifying assumptions in the
bufferization analysis.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D128463
Jonas Devlieghere [Fri, 24 Jun 2022 17:52:13 +0000 (10:52 -0700)]
[lldb] Fix flakiness in shell tests that mixed stderr and stdout
Because the diagnostic events are processed by the default event handler
in its own thread, tests cannot rely on output ordering. Split stdout
and stderr to make the test reliable again.
Jonas Devlieghere [Fri, 24 Jun 2022 17:03:03 +0000 (10:03 -0700)]
[lldb] Add SystemLogHandler for emitting log messages to the system log
Add a system log handler that emits log messages to the operating system
log. In addition to the log handler itself, this patch also introduces a
new Host::SystemLog helper function to abstract over writing to the
system log.
Differential revision: https://reviews.llvm.org/D128321
David Blaikie [Fri, 24 Jun 2022 16:47:57 +0000 (16:47 +0000)]
Revert "DebugInfo: Fully integrate ctor type homing into 'limited' debug info"
Reverting to simplify some Google-internal rollout issues. Will recommit
in a week or two.
This reverts commit 517bbc64dbe493644eff8d55fd9566435e930520.
Mingming Liu [Thu, 12 May 2022 07:12:20 +0000 (00:12 -0700)]
[Inline] Annotate inline pass name with link phase information for analysis.
The annotation is flag gated; flag is turned off by default.
Differential Revision: https://reviews.llvm.org/D125495
Daniel Douglas [Fri, 24 Jun 2022 16:59:22 +0000 (11:59 -0500)]
[OpenMP][libomp] avoid spin wait and yield on arm64 macOS
This patch changes the default behavior to avoid spin waiting and
yielding. (See “Don’t Keep Threads Active And Idle” section here:
https://developer.apple.com/documentation/apple-silicon/tuning-your-code-s-performance-for-apple-silicon)
We verified using instruments traces that the changes improve scheduling
behavior on macOS.
We also collected results using EPCC schedbench
(https://github.com/LangdalP/EPCC-OpenMP-micro-benchmarks) that are
attached here that show a reduction in standard deviation and max test
run time across all scheduling types. Static scheduling sees dramatic
improvements with these changes: we see a 2-4x average runtime
improvement in the benchmark.
Differential Revision: https://reviews.llvm.org/D126510
Fazlay Rabbi [Fri, 24 Jun 2022 15:42:21 +0000 (08:42 -0700)]
[OpenMP] Initial parsing and sema support for 'masked taskloop' construct
This patch gives basic parsing and semantic support for "masked taskloop"
construct introduced in OpenMP 5.1 (section 2.16.7)
Differential Revision: https://reviews.llvm.org/D128478
Eli Friedman [Fri, 24 Jun 2022 16:58:31 +0000 (09:58 -0700)]
[clang codegen] Add dso_local/hidden/etc. markings to VTT declarations
We were marking definitions, but not declarations. Marking declarations
makes computing the address more efficient.
Fixes issue reported at https://discourse.llvm.org/t/63090
Differential Revision: https://reviews.llvm.org/D128482
Akira Hatanaka [Wed, 22 Jun 2022 18:52:22 +0000 (11:52 -0700)]
[Sema] Check whether `__auto_type` has been deduced before merging
This fixes a bug in clang where it emits the following diagnostic when
compiling the test case:
"argument to 'sizeof' in 'memset' call is the same pointer type 'S' as
the destination"
The code that merges __auto_type with other types was committed in
https://reviews.llvm.org/D122029.
Differential Revision: https://reviews.llvm.org/D128373
Richard [Fri, 24 Jun 2022 16:47:51 +0000 (10:47 -0600)]
[clang-tidy] Update release notes (NFC)
- Sort changes to existing checks by check name
- Correct check link
Jonas Devlieghere [Fri, 24 Jun 2022 16:36:29 +0000 (09:36 -0700)]
[lldb] Replace Host::SystemLog with Debugger::Report{Error,Warning}
As it exists today, Host::SystemLog is used exclusively for error
reporting. With the introduction of diagnostic events, we have a better
way of reporting those. Instead of printing directly to stderr, these
messages now get printed to the debugger's error stream (when using the
default event handler). Alternatively, if someone is listening for these
events, they can decide how to display them, for example in the context
of an IDE such as Xcode.
This change also means we no longer write these messages to the system
log on Darwin. As far as I know, nobody is relying on this, but I think
this is something we could add to the diagnostic event mechanism.
Differential revision: https://reviews.llvm.org/D128480
Alexey Bataev [Thu, 9 Dec 2021 18:34:08 +0000 (10:34 -0800)]
[SLP]Improve shuffles cost estimation where possible.
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.
Differential Revision: https://reviews.llvm.org/D115462
Joshua Root [Fri, 24 Jun 2022 16:12:55 +0000 (09:12 -0700)]
[ObjCopy] Fix type mismatch in writeCodeSignatureData()
The result of pointer subtraction is of type ptrdiff_t, which is not necessarily the same underlying type as ssize_t. This can lead to a compilation error since std::min requires both parameters to be the same type.
Fixes: https://github.com/llvm/llvm-project/issues/54846
Reviewed By: alexander-shaposhnikov, drodriguez, jhenderson
Differential Revision: https://reviews.llvm.org/D128117
Arthur Eubanks [Sat, 18 Jun 2022 21:14:04 +0000 (14:14 -0700)]
[GlobalOpt] Perform store->dominated load forwarding for stored once globals
The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway.
The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store.
Compile time tracker:
https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions
Reviewed By: nikic, asbirlea, jdoerfert
Differential Revision: https://reviews.llvm.org/D128128
Casey Carter [Fri, 24 Jun 2022 16:06:39 +0000 (09:06 -0700)]
[libcxx][test] barrier completion functions must be non-throwing
... per N4910 [thread.barrier.class]/5.
Siva Chandra Reddy [Tue, 21 Jun 2022 18:14:13 +0000 (18:14 +0000)]
[libc] Add Uint128 type as a fallback when __uint128_t is not available.
Also, the unused specializations of __int128_t have been removed.
Differential Revision: https://reviews.llvm.org/D128304
Philip Reames [Fri, 24 Jun 2022 15:47:03 +0000 (08:47 -0700)]
[RISCV] Split a vectorizer test runline so that upcoming changes in defaults are visible
Philip Reames [Fri, 24 Jun 2022 15:45:53 +0000 (08:45 -0700)]
[RISCV] Modify a test line so it exercises the intended configuration once we turn on scalable vectorization
Peter Collingbourne [Fri, 24 Jun 2022 04:47:14 +0000 (21:47 -0700)]
ELF: Do not relax ADRP/LDR -> ADRP/ADD for absolute symbols in PIC.
GOT references to absolute symbols can't be relaxed to use ADRP/ADD in
position-independent code because these instructions produce a relative
address.
Differential Revision: https://reviews.llvm.org/D128492
Florian Hahn [Fri, 24 Jun 2022 15:42:11 +0000 (17:42 +0200)]
[LV] Create RT checks once VF/IC are selected, track scalar cost.
This patch updates LV to generate runtime checks after the VF & IC are selected. It
allows deciding whether to vectorize with runtime checks or not based on
their cost compared to the vector loop.
It also updates VectorizationFactor to include the scalar cost.
Reviewed By: lebedev.ri, dmgreen
Differential Revision: https://reviews.llvm.org/D75981
Walter Erquinigo [Fri, 24 Jun 2022 00:45:24 +0000 (17:45 -0700)]
[NFC][lldb][trace] Rename trace session to trace bundle
As previously discussed with @jj10306, we didn't really have a name for
the post-mortem (or offline) trace session representation, which is in
fact a folder with a bunch of files. We decided to call this folder
"trace bundle", and the main JSON file in it "trace bundle description
file". This naming is pretty decent, so I'm refactoring all the existing
code to account for that.
Differential Revision: https://reviews.llvm.org/D128484
Joe Nash [Wed, 25 May 2022 18:09:11 +0000 (14:09 -0400)]
[AMDGPU] gfx11 VOPD instructions MC support
VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.
A VOPD instruction is constituted of an X component (for which there are
13 possible opcodes) and a Y component (for which there are the 13 X
opcodes plus 3 more). Most of the complexity in defining and parsing
a VOPD operation arises from the possible different total numbers of
operands and deferred parsing of certain operands depending on the
constituent X and Y opcodes.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D128218
Craig Topper [Fri, 24 Jun 2022 15:21:05 +0000 (08:21 -0700)]
[RISCV] Change how we isel (add X, [-4096, -2049]) or (add X, [2048,4095]).
We currently split the immediate almost equally between two addis.
If the immediate is odd, it won't be split exactly equal.
This patch instead gives one addi an immediate of 2047 or -2048 and the
other gets the remainder. If the original immediate is near -2049 or 2048,
this might allow the use of c.addi for the addi that receives the
smaller immediate.
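The new split can be sketched in a few lines. This is an illustration rather than the actual isel pattern; the function names are hypothetical. One detail is an observation of this sketch and not a statement from the commit message: 4095 - 2047 = 2048 does not fit in a 12-bit signed immediate, so the sketch only accepts positive values up to 4094. The c.addi check assumes the standard 6-bit signed, nonzero immediate of the compressed encoding.

```python
def split_add_immediate(imm: int):
    """Split imm into two addi immediates: one saturated, one remainder."""
    if 2048 <= imm <= 4094:
        first = 2047
    elif -4096 <= imm <= -2049:
        first = -2048
    else:
        raise ValueError("immediate outside the two-addi ranges")
    remainder = imm - first
    assert -2048 <= remainder <= 2047  # must still fit a 12-bit signed addi
    return first, remainder


def fits_c_addi(imm: int) -> bool:
    """True if imm fits the compressed c.addi immediate (6-bit signed, nonzero)."""
    return -32 <= imm <= 31 and imm != 0
```

For example, 2050 splits into 2047 and 3, and the second addi can then use the compressed form, whereas an even split (1025 + 1025) could not.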
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D128500
Konstantin Zhuravlyov [Fri, 24 Jun 2022 15:28:59 +0000 (11:28 -0400)]
AMDGPU: Clear kill flags when optimizing vcmp save exec sequence
It was causing bad machine code for several blender scenes:
*** Bad machine code: Using an undefined physical register ***
- function: kernel_holdout_emission_blurring_pathtermination_ao
- basic block: %bb.28 if.end40.i (0x7f84861a2320)
- instruction: V_CMPX_EQ_U32_nosdst_e64 0, $vgpr3, implicit-def $exec, implicit $exec
- operand 1: $vgpr3
Differential Revision: https://reviews.llvm.org/D127768
Michał Górny [Wed, 22 Jun 2022 17:25:13 +0000 (19:25 +0200)]
[lldb] [test] Move part of fork tests to common helper
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128361
Michał Górny [Wed, 22 Jun 2022 06:32:05 +0000 (08:32 +0200)]
[lldb] [llgs] Introduce an AppendThreadIDToResponse() helper
Introduce a helper function to append GDB Remote Serial Protocol "thread
IDs", with optional PID in multiprocess mode.
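For reference, the GDB Remote Serial Protocol writes thread IDs as hexadecimal numbers, and in multiprocess mode prefixes them with `p<pid>.`. A minimal sketch of that formatting follows; it is not the C++ helper from this patch, and the function name and parameters are assumptions.

```python
def format_thread_id(tid: int, pid: int, multiprocess: bool) -> str:
    """Format a thread ID the way the remote protocol expects it.

    Both numbers are lowercase hex; the pid prefix appears only in
    multiprocess mode.
    """
    if multiprocess:
        return f"p{pid:x}.{tid:x}"
    return f"{tid:x}"
```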
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128324
Michał Górny [Mon, 20 Jun 2022 09:34:23 +0000 (11:34 +0200)]
[lldb] [llgs] Implement the 'T' packet
Implement the 'T' packet that is used to verify whether the specified
thread belongs to the debugged processes.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128170
Michał Górny [Mon, 20 Jun 2022 06:59:27 +0000 (08:59 +0200)]
[lldb] [llgs] Include PID in QC response in multiprocess mode
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128156
Michał Górny [Sun, 19 Jun 2022 07:00:54 +0000 (09:00 +0200)]
[lldb] [llgs] Add a test for multiprocess register read/write
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128153
Michał Górny [Sat, 18 Jun 2022 19:06:41 +0000 (21:06 +0200)]
[lldb] [llgs] Support multiprocess in qfThreadInfo
Update the `qfThreadInfo` handler to report threads of all debugged
processes and include PIDs when in multiprocess mode.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128152
Michał Górny [Sat, 18 Jun 2022 16:56:10 +0000 (18:56 +0200)]
[lldb] [llgs] Add a test for multiprocess memory read/write
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128150
Michał Górny [Wed, 15 Jun 2022 14:48:48 +0000 (16:48 +0200)]
[lldb] [llgs] Support resuming one process with PID!=current via vCont
Extend vCont function to support resuming a process with an arbitrary
PID, that could be different than the one selected via Hc (or no process
at all may be selected). Resuming more than one process simultaneously
is not supported yet.
Remove the ReadTid() method that was only used by Handle_vCont(),
and furthermore it was wrongly using m_current_process rather than
m_continue_process.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127862
Michał Górny [Tue, 14 Jun 2022 14:07:11 +0000 (16:07 +0200)]
[lldb] [llgs] Add test for resuming via c in multiprocess scenarios
Add a test verifying that it is possible to resume a single process
via the `c` packet when multiple processes are being debugged. This
includes a tiny change to the test program — when `fork()` is called,
the child process is no longer terminated immediately but continues
performing the same tasks as queued for the parent.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127755
Michał Górny [Sun, 12 Jun 2022 06:55:41 +0000 (08:55 +0200)]
[lldb] [llgs] Implement the vKill packet
Implement the support for the vKill packet. This is the modern packet
used by the GDB Remote Serial Protocol to kill one of the debugged
processes. Unlike the `k` packet, it has well-defined semantics.
The `vKill` packet takes the PID of the process to kill, and always
replies with an `OK` reply (rather than the exit status, as LLGS does
for `k` packets at the moment). Additionally, unlike the `k` packet
it does not cause the connection to be terminated once the last process
is killed — the client needs to close it explicitly.
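To make the wire format concrete, here is a small sketch of how a client might frame a `vKill` request under the generic GDB Remote Serial Protocol rules: `$`, the payload, `#`, then a two-digit hex checksum equal to the sum of the payload bytes modulo 256, with the PID written in hex. The helper names are hypothetical and this is not code from LLGS.

```python
def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<two-digit hex checksum>."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def vkill_packet(pid: int) -> str:
    """Build a vKill request for the given process ID (hex-encoded)."""
    return frame_packet(f"vKill;{pid:x}")
```

The expected reply to a successful `vKill` is the plain `OK` payload, framed the same way by `frame_packet("OK")`.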
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127667
Michał Górny [Fri, 10 Jun 2022 13:53:11 +0000 (15:53 +0200)]
[lldb] [llgs] Make `k` kill all processes, and fix multiple exits
Modify the behavior of the `k` packet to kill all inferiors rather than
just the current one. The specification leaves the exact behavior
of this packet up to the implementation but since vKill is specifically
meant to be used to kill a single process, it seems logical to use `k`
to provide the alternate function of killing all of them.
Move starting stdio forwarding from the "running" response
to the packet handlers that trigger the process to start. This avoids
attempting to start it multiple times when multiple processes are killed
on Linux which implicitly causes LLGS to receive "started" events
for all of them. This is probably also more correct as the ability
to send "O" packets is implied by the continue-like command being issued
(and therefore the client waiting for responses) rather than the start
notification.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127500
Yaxun (Sam) Liu [Fri, 17 Jun 2022 01:26:33 +0000 (21:26 -0400)]
[HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument
name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Brian Sumner
Differential Revision: https://reviews.llvm.org/D128022
chenglin.bi [Fri, 24 Jun 2022 15:14:20 +0000 (23:14 +0800)]
[SelectionDAG][DAGCombiner] Reuse exist node by reassociate
When (op N0, N2) already exists, reassociate (op (op N0, N1), N2) to (op (op N0, N2), N1) to reuse the existing (op N0, N2).
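The rewrite can be illustrated on a toy expression representation. The sketch below models nodes as `(op, lhs, rhs)` tuples and the set of already-built nodes as a Python set; it assumes `op` is associative and commutative. None of these names come from SelectionDAG.

```python
def reassociate_to_reuse(expr, existing):
    """Rewrite (op (op n0 n1) n2) to (op (op n0 n2) n1) when (op n0 n2) exists.

    expr is an (op, lhs, rhs) tuple; existing is a set of such tuples.
    Returns expr unchanged when the pattern does not apply.
    """
    op, lhs, n2 = expr
    if isinstance(lhs, tuple) and lhs[0] == op:
        _, n0, n1 = lhs
        if (op, n0, n2) in existing:
            # The inner node is now the existing one, so no new node is needed for it.
            return (op, (op, n0, n2), n1)
    return expr
```

With `existing = {("add", "a", "c")}`, the expression `("add", ("add", "a", "b"), "c")` becomes `("add", ("add", "a", "c"), "b")`, so the inner add is shared instead of duplicated.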
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122539
Stephen Long [Wed, 22 Jun 2022 17:54:50 +0000 (10:54 -0700)]
[MSVC] Add initial support for MSVC pragma optimize
MSVC's pragma optimize turns optimizations on or off based on the list
passed. At the moment, we only support an empty optimization list.
i.e. `#pragma optimize("", on | off)`
From MSVC's docs:
| Parameter | Type of optimization |
|-----------|--------------------------------------------------|
| g | Enable global optimizations. Deprecated |
| s or t | Specify short or fast sequences of machine code |
| y | Generate frame pointers on the program stack |
https://docs.microsoft.com/en-us/cpp/preprocessor/optimize?view=msvc-170
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125723
Tapasweni Pathak [Fri, 24 Jun 2022 11:11:20 +0000 (11:11 +0000)]
Implement soft reset of the diagnostics engine.
This patch implements soft reset and adds tests for soft reset success of the
diagnostics engine. This allows us to recover from errors in clang-repl without
resetting the pragma handlers' state.
Differential revision: https://reviews.llvm.org/D126183
Sam Estep [Fri, 24 Jun 2022 14:37:59 +0000 (14:37 +0000)]
[clang][dataflow] Allow MatchSwitch to return a value
Reland of D128467. This version replaces `return {};` with `return Result();`, since the former failed on GCC with `Result = void`.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D128533
Nicolas Vasilache [Fri, 24 Jun 2022 14:31:47 +0000 (07:31 -0700)]
[mlir][Vector]Fix bug where vector::WarpExecuteOnLane0Op are created with 2 blocks in the region
Differential Revision: https://reviews.llvm.org/D128534
Dawid Jurczak [Fri, 24 Jun 2022 14:23:46 +0000 (16:23 +0200)]
[InlineCost] Improve debugging experience by adding print about initial inlining cost
Differential Revision: https://reviews.llvm.org/D127597
serge-sans-paille [Thu, 2 Jun 2022 06:53:45 +0000 (08:53 +0200)]
[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] considers trailing arrays to be flexible, whatever their size.
Support for this legacy code was introduced in
f8f632498307d22e10fab0704548b270b15f1e1e but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legitimate cases.
Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.
n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
n = 3: any trailing array member of undefined size is a flexible array member (strict c99 conformance)
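The four levels listed above can be restated as a small predicate. This sketch only encodes the list as written in this message, for a trailing array member; it is not clang's implementation, and the function name and the use of `None` for an array of undefined size are assumptions.

```python
from typing import Optional


def is_flexible_array_member(level: int, size: Optional[int]) -> bool:
    """Decide whether a trailing array member counts as flexible at a given level.

    size is the declared element count, or None for an array of undefined size.
    """
    if level == 0:
        return True  # any trailing array member
    if level == 1:
        return size is None or size in (0, 1)
    if level == 2:
        return size is None or size == 0
    if level == 3:
        return size is None  # strict c99 conformance
    raise ValueError("level must be 0, 1, 2 or 3")
```

For instance, a trailing array of size 1 is treated as flexible at levels 0 and 1 but not at levels 2 and 3.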
A similar patch for GCC is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
Nikita Popov [Wed, 8 Jun 2022 08:12:08 +0000 (10:12 +0200)]
[MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC)
Drop the requirement that getInitialValueOfAllocation() must be
passed an allocator function, shifting the responsibility for
checking that into the function (which it does anyway). The
motivation is to avoid some calls to isAllocationFn(), which has
somewhat ill-defined semantics (given the number of
allocator-related attributes we have floating around...)
(For this function, all we eventually need is an allockind of
zeroed or uninitialized.)
Differential Revision: https://reviews.llvm.org/D127274
Nikita Popov [Fri, 24 Jun 2022 14:06:04 +0000 (16:06 +0200)]
[GlobalOpt] Add tests for PR55859 (NFC)
Joseph Huber [Fri, 24 Jun 2022 13:46:05 +0000 (09:46 -0400)]
[Binary] Further improve malformed input handling for the OffloadBinary
Summary:
This patch adds some new sanity checks to make sure that the sizes of
the offsets are within the bounds of the file or what is expected by the
binary. This also improves the error handling of the version structure
to be built into the binary itself so we can change it more easily.
Kai Luo [Fri, 24 Jun 2022 13:55:35 +0000 (13:55 +0000)]
[AIX][libatomic] Fix link flags after 30dfe016d4 for libatomic on AIX
After 30dfe016d4, we no longer use string as link flags.
Patch by @tingwang.
Reviewed By: tingwang
Differential Revision: https://reviews.llvm.org/D128524
Sam Estep [Fri, 24 Jun 2022 13:52:11 +0000 (13:52 +0000)]
Revert "[clang][dataflow] Allow MatchSwitch to return a value"
This reverts commit 4eecd194b073492a309b87c8f60da6614bba9153.
Sam Estep [Fri, 24 Jun 2022 13:32:47 +0000 (13:32 +0000)]
[clang][dataflow] Allow MatchSwitch to return a value
This patch adds another `typename` parameter to `MatchSwitch` class: `Result` (defaults to `void`), corresponding to the return type of the function. This necessitates a couple minor changes to the `MatchSwitchBuilder` class, and is tested via a new `ReturnNonVoid` test in `clang/unittests/Analysis/FlowSensitive/MatchSwitchTest.cpp`.
Reviewed By: gribozavr2, sgatev, xazax.hun
Differential Revision: https://reviews.llvm.org/D128467
Haojian Wu [Fri, 24 Jun 2022 09:34:58 +0000 (11:34 +0200)]
[clang-tidy] Make the cert/uppercase-literal-suffix-integer fully hermetic.
After the test-reorg commit (89a1d03e2b379e325daa5249411e414bbd995b5e), the
cert/uppercase test started to fail in our internal environment -- it accesses
a header file from "../readability", which is not friendly to a hermetic test environment.
This change makes the test fully hermetic, and does some cleanup on the
uppercase header (I think it is better to move it to the shared
Inputs/Header directory, and rename it).
Differential Revision: https://reviews.llvm.org/D128511
Matthias Springer [Fri, 24 Jun 2022 11:43:30 +0000 (13:43 +0200)]
[mlir][sparse][bufferize] Implement BufferizableOpInterface
Only the analysis part of the interface is implemented. The bufferization itself is performed by the SparseTensorConversion pass.
Differential Revision: https://reviews.llvm.org/D128138
LLVM GN Syncbot [Fri, 24 Jun 2022 11:33:41 +0000 (11:33 +0000)]
[gn build] Port 7a3918b540c3
Aaron Ballman [Fri, 24 Jun 2022 11:32:18 +0000 (07:32 -0400)]
Revert "[clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface"
This reverts commit 6546fdbe36fd1227d7f23f89fd9a9825813b3de9.
This broke some build bots due to a layering issue:
https://lab.llvm.org/buildbot/#/builders/57/builds/19244
LLVM GN Syncbot [Fri, 24 Jun 2022 11:18:35 +0000 (11:18 +0000)]
[gn build] Port 6546fdbe36fd
Vaibhav Yenamandra [Fri, 24 Jun 2022 11:16:54 +0000 (07:16 -0400)]
[clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface
Create an interface for writing SARIF documents from within clang:
The primary intent of this change is to introduce the interface
clang::SarifDocumentWriter, which allows incrementally adding
diagnostic data to a JSON backed document. The proposed interface is
not yet connected to the compiler internals, which will be covered in
future work. As such this change will not change the input/output
interface of clang.
This change also introduces the clang::FullSourceRange type that is
modeled after clang::SourceRange + clang::FullSourceLoc, this is useful
for packaging a pair of clang::SourceLocation objects with their
corresponding SourceManagers.
Previous discussions:
RFC for this change: https://lists.llvm.org/pipermail/cfe-dev/2021-March/067907.html
https://lists.llvm.org/pipermail/cfe-dev/2021-July/068480.html
SARIF Standard (2.1.0):
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html
Differential Revision: https://reviews.llvm.org/D109701
Nikita Popov [Fri, 24 Jun 2022 10:30:00 +0000 (12:30 +0200)]
[InlineFunction] Slightly clarify noalias scope calculation (NFC)
Rename CanDeriveViaCapture -> RequiresNoCaptureBefore, drop an
unnecessary const cast, and reformat some code to avoid an ugly
super-indented comment.
Nabeel Omer [Wed, 22 Jun 2022 09:56:41 +0000 (09:56 +0000)]
[SLP] Add cost model for `llvm.powi.*` intrinsics (REAPPLIED)
Patch was reverted in 4c5f10a due to buildbot failures, now being
reapplied with updated AArch64 and RISCV tests.
This patch adds handling for the llvm.powi.* intrinsics in
BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization.
Closes #53887.
Differential Revision: https://reviews.llvm.org/D128172
Muhammad Omair Javaid [Fri, 24 Jun 2022 09:41:30 +0000 (13:41 +0400)]
[LLDB] Mark TestExprsChar Xfail for Windows/AArch64
The test_unsigned_char test in TestExprsChar.py fails on the AArch64/Windows
platform. There is already a known bug for this failure on various
arch/os combinations. This patch marks the test as xfail for
Windows/AArch64.
Nikita Popov [Fri, 24 Jun 2022 09:58:16 +0000 (11:58 +0200)]
[AA] Export isEscapeSource() API (NFC)
Export API that was previously private to BasicAliasAnalysis and
will be used in D127202.
Frederic Cambus [Fri, 24 Jun 2022 09:09:34 +0000 (11:09 +0200)]
[clang] Update Clang version from 14 to 15 in scan-build.1.
Similar to D110763.
David Green [Fri, 24 Jun 2022 09:04:28 +0000 (10:04 +0100)]
[AArch64] Convert vector add(ext, ext) into ext(add(ext, ext))
Given a vector add or sub from extends that needs more than one 'step'
(i.e. i8 to i32 or i16 to i64), we can transform the sequence to
sext(add(ext, ext)), to allow the add(ext, ext) to become a single uaddl
and a larger extend, producing fewer instructions in total.
https://alive2.llvm.org/ce/z/S2T4k-
Differential Revision: https://reviews.llvm.org/D128426
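The arithmetic behind this transform can be checked exhaustively for the i8 to i32 case. The sketch below is scalar C++, not the AArch64 lowering code, and all function names are hypothetical. It models 16-bit values as raw bit patterns and confirms that extending to i16, adding or subtracting with 16-bit wraparound, and then sign-extending to i32 matches doing the add or sub directly at i32, for both zero-extended and sign-extended inputs.

```cpp
#include <cstdint>

// Bit-level helpers (hypothetical names). 16-bit values are kept as raw
// bit patterns in uint16_t so that wraparound is well defined.
static uint16_t zext8to16(uint8_t V) { return V; }
static uint16_t sext8to16(uint8_t V) {
  return (V & 0x80) ? static_cast<uint16_t>(V | 0xFF00) : V;
}
static int32_t sext16to32(uint16_t Bits) {
  return (Bits & 0x8000) ? static_cast<int32_t>(Bits) - 0x10000
                         : static_cast<int32_t>(Bits);
}
static int32_t sext8to32(uint8_t V) { return (V & 0x80) ? V - 0x100 : V; }

// Returns true if sext32(op16(ext16(a), ext16(b))) == op32(ext32(a), ext32(b))
// for every pair of 8-bit inputs, for add and sub, with zext and sext inputs.
bool extAddSubNarrowingHolds() {
  for (int A = 0; A < 256; ++A) {
    for (int B = 0; B < 256; ++B) {
      uint8_t UA = static_cast<uint8_t>(A), UB = static_cast<uint8_t>(B);

      // Zero-extended inputs.
      uint16_t ZA = zext8to16(UA), ZB = zext8to16(UB);
      int32_t WideZA = UA, WideZB = UB;
      if (sext16to32(static_cast<uint16_t>(ZA + ZB)) != WideZA + WideZB)
        return false;
      if (sext16to32(static_cast<uint16_t>(ZA - ZB)) != WideZA - WideZB)
        return false;

      // Sign-extended inputs.
      uint16_t SA = sext8to16(UA), SB = sext8to16(UB);
      int32_t WideSA = sext8to32(UA), WideSB = sext8to32(UB);
      if (sext16to32(static_cast<uint16_t>(SA + SB)) != WideSA + WideSB)
        return false;
      if (sext16to32(static_cast<uint16_t>(SA - SB)) != WideSA - WideSB)
        return false;
    }
  }
  return true;
}
```

The check passes because the sum or difference of two extended 8-bit values always fits in a signed 16-bit value, so the narrower operation loses no information before the final sign extension.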
Nikita Popov [Thu, 23 Jun 2022 14:36:16 +0000 (16:36 +0200)]
[BasicAA] Handle passthru calls in isEscapeSource()
isEscapeSource() currently considers all call return values as
escape sources. However, CaptureTracking can look through certain
calls, so we shouldn't consider these as escape sources either.
The corresponding CaptureTracking code is:
https://github.com/llvm/llvm-project/blob/7c9a3825b8420f5d37c5bb8919a9e46684a87089/llvm/lib/Analysis/CaptureTracking.cpp#L332-L333
Differential Revision: https://reviews.llvm.org/D128444
Kiran Chandramohan [Fri, 24 Jun 2022 08:45:33 +0000 (08:45 +0000)]
[Flang] enable fir.is_present and fir.absent with function types
Fortran dummy procedures and procedure pointers can be OPTIONAL, and
there is no technical reason to prevent fir.is_present and
fir.absent from accepting function types, so allow it and add a test.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. This patch is
basically upstreaming the following PR.
https://github.com/flang-compiler/f18-llvm-project/pull/1568
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D128464
Co-authored-by: Jean Perier <jperier@nvidia.com>
David Green [Fri, 24 Jun 2022 08:41:10 +0000 (09:41 +0100)]
[AArch64] Add addition extend of add/sub neon tests. NFC
Vassil Vassilev [Fri, 24 Jun 2022 07:17:20 +0000 (07:17 +0000)]
Reland "[clang-repl] Recover the lookup tables of the primary context."
The asan issue was fixed in llvm/llvm-project@7bc00ce5cd41
This reverts commit 575e297fcb289f0a9b0ac4b01d1d0fa051f5cc29.
Differential revision: https://reviews.llvm.org/D123674
Petr Hosek [Fri, 24 Jun 2022 08:24:45 +0000 (08:24 +0000)]
Revert "[CMake][compiler-rt] Clean up the use of libcxx and libcxxabi"
This reverts commit c0d4f2282d8335cd15338663b18cd7f22155456e which
broke clang-x86_64-debian-fast:
https://lab.llvm.org/buildbot/#/builders/109/builds/41268
Siva Chandra Reddy [Thu, 23 Jun 2022 18:18:50 +0000 (18:18 +0000)]
[libc][NFC] Remove the templatization from the linux implementation of thread.
This enables setting up a single "self" thread object to be returned by
APIs like thrd_self and pthread_self.
Cullen Rhodes [Thu, 23 Jun 2022 17:24:07 +0000 (17:24 +0000)]
[AArch64] NFC: Fix PRFS -> PRFW inst def name
Florian Hahn [Fri, 24 Jun 2022 08:16:55 +0000 (10:16 +0200)]
[VPlan] Set VFs included in plan before last set of VPTransforms (NFC).
This allows VPlanTransforms to query the VFs included in the plan in the
future.
Petr Hosek [Thu, 2 Jun 2022 18:45:26 +0000 (18:45 +0000)]
[CMake][compiler-rt] Clean up the use of libcxx and libcxxabi
We no longer support the use of LLVM_ENABLE_PROJECTS for libcxx and
libcxxabi. We don't use paths to libcxx and libcxxabi in compiler-rt.
Differential Revision: https://reviews.llvm.org/D126905
Matt Devereau [Fri, 24 Jun 2022 07:33:50 +0000 (07:33 +0000)]
[AArch64][SVE] Add sve.dupq.lane(insert(constant vector, 0), 0) ld1rq tests
Fangrui Song [Fri, 24 Jun 2022 07:36:26 +0000 (00:36 -0700)]
[gdb-scripts] Fix PointerIntPairPrinter.to_string after D127969
Peixin-Qiao [Fri, 24 Jun 2022 07:33:09 +0000 (15:33 +0800)]
[flang][OpenMP] Initial support the lowering of copyin clause
This adds initial support for lowering the copyin clause. Pointer,
allocatable, common block, and polymorphic variables will be supported
later.
This also includes the following changes:
1. Resolve the COPYIN clause and mark the entity as host associated.
2. Fix collectSymbolSet by adding an option to control whether it collects
the symbol itself or its ultimate symbol, so that it can be used to
explicitly differentiate the host and associated variables in
host-association.
3. Add one helper function `lookupOneLevelUpSymbol` to differentiate the
usage of host and associated variables explicitly. The previous
lowering of firstprivate depends on the order of
`createHostAssociateVarClone` and `lookupSymbol` of host symbol. With
this fix, this dependence is removed.
4. Reuse `copyHostAssociateVar` for copying operation of COPYIN clause.
Reviewed By: kiranchandramohan, NimishMishra
Differential Revision: https://reviews.llvm.org/D127468
Florian Hahn [Fri, 24 Jun 2022 07:27:14 +0000 (09:27 +0200)]
Recommit "[ConstraintElimination] Transfer info from ULT to signed system."
This reverts commit 94ed2caf708818dd3a0b376bbce56e53c0956f1e.
The issue with non-determinism in the test has been fixed in
d9526e8a52ca9d5.
Valentin Clement [Fri, 24 Jun 2022 07:10:03 +0000 (09:10 +0200)]
[flang] Keep PURE in IEEE functions
PURE keyword should be kept in `__fortran_ieee_exceptions.f90`
and `ieee_arithmetic.f90` and not removed as done in
https://reviews.llvm.org/D128431
Reviewed By: vdonaldson
Differential Revision: https://reviews.llvm.org/D128498
Valentin Clement [Fri, 24 Jun 2022 07:07:03 +0000 (09:07 +0200)]
[flang] Fix forall issue with substring operation
When there is a substring operation on a scalar assignment in a FORALL
context, we have to lower the entire substring and not the entire
CHARACTER variable.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld, klausler
Differential Revision: https://reviews.llvm.org/D128459
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Fangrui Song [Fri, 24 Jun 2022 07:04:55 +0000 (00:04 -0700)]
[llvm-readobj] Simplify startswith+drop_front pattern with consume_front. NFC
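As an illustration of the pattern being simplified, the sketch below uses `std::string_view` instead of LLVM's `StringRef`. The helper `consumeFront` is a hypothetical stand-in modeled on the idea of `consume_front`: it tests for a prefix and, when the prefix is present, removes it in the same call. `stripDashes` is likewise an invented example caller.

```cpp
#include <string_view>

// Hypothetical helper: if S begins with Prefix, drop the prefix from S and
// return true; otherwise leave S unchanged and return false.
bool consumeFront(std::string_view &S, std::string_view Prefix) {
  // substr clamps the length, so this is safe when Prefix is longer than S.
  if (S.substr(0, Prefix.size()) != Prefix)
    return false;
  S.remove_prefix(Prefix.size());
  return true;
}

// Example caller: strips a leading "--" or "-" from an argument.
// The two-step form would be:
//   if (starts with "--") Arg = Arg dropped by 2; else if (starts with "-") ...
std::string_view stripDashes(std::string_view Arg) {
  if (!consumeFront(Arg, "--"))
    consumeFront(Arg, "-");
  return Arg;
}
```

Combining the test and the removal avoids repeating the prefix length at the call site, which is where the two-step form tends to go out of sync.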
Craig Topper [Fri, 24 Jun 2022 06:35:15 +0000 (23:35 -0700)]
[RISCV] Move vfma_vl+fneg_vl matching to DAG combine.
This patch adds 3 new _VL RISCVISD opcodes to represent VFMA_VL with
different portions negated. It also adds a DAG combine to peek
through FNEG_VL to create these new opcodes.
This is modeled after similar code from X86.
This makes the isel patterns more regular and reduces the size of
the isel table by ~37K.
The test changes look like regressions, but they point to a bug that
was already there. We aren't able to commute a masked FMA instruction
to improve register allocation because we always use a mask undisturbed
policy. Prior to this patch we matched two multiply operands in a
different order and hid this issue for these test cases, but a different
test still could have encountered it.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D128310
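The idea of folding a negation into the choice of fused multiply-add variant can be sketched in scalar C++. This is not the RISC-V DAG combine: the `FmaKind` enumerators, `evalFma`, and `foldNegation` are hypothetical names, and the four kinds simply enumerate which parts of `a*b + c` are negated.

```cpp
#include <cmath>

// Hypothetical variants of a fused multiply-add:
//   MulAdd:     (a*b) + c      NegMulAdd: -(a*b) + c
//   MulSub:     (a*b) - c      NegMulSub: -(a*b) - c
enum class FmaKind { MulAdd, NegMulAdd, MulSub, NegMulSub };

// Evaluates the given variant with a single rounding via std::fma.
double evalFma(FmaKind K, double A, double B, double C) {
  switch (K) {
  case FmaKind::MulAdd:
    return std::fma(A, B, C);
  case FmaKind::NegMulAdd:
    return std::fma(-A, B, C);
  case FmaKind::MulSub:
    return std::fma(A, B, -C);
  case FmaKind::NegMulSub:
    return std::fma(-A, B, -C);
  }
  return 0.0; // Unreachable for valid enumerators.
}

// Given a variant and the knowledge that one multiplicand and/or the addend
// is wrapped in a negation, returns the variant that absorbs the negation,
// so the operands can be used un-negated.
FmaKind foldNegation(FmaKind K, bool NegMul, bool NegAdd) {
  bool MulNegated =
      (K == FmaKind::NegMulAdd || K == FmaKind::NegMulSub) != NegMul;
  bool AddNegated =
      (K == FmaKind::MulSub || K == FmaKind::NegMulSub) != NegAdd;
  if (MulNegated)
    return AddNegated ? FmaKind::NegMulSub : FmaKind::NegMulAdd;
  return AddNegated ? FmaKind::MulSub : FmaKind::MulAdd;
}
```

For example, `foldNegation(FmaKind::MulAdd, true, false)` yields `FmaKind::NegMulAdd`, so `fma(-a, b, c)` can be expressed as the negated-product variant applied to `a`, `b`, `c` directly.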
Petr Hosek [Wed, 30 Mar 2022 17:33:40 +0000 (10:33 -0700)]
[CMake][compiler-rt] Use COMPILE_OPTIONS and LINK_OPTIONS
This avoids the need for string-ification and lets CMake deduplicate
potentially duplicate flags.
Differential Revision: https://reviews.llvm.org/D122750
Fangrui Song [Fri, 24 Jun 2022 06:26:02 +0000 (23:26 -0700)]
[CodeGen] Simplify isVirtualRegister. NFC
Hui Xie [Thu, 12 May 2022 12:23:11 +0000 (13:23 +0100)]
[libc++] P2321R2 section [tuple.tuple]. Adding C++23 constructors, assignment operators and swaps to `tuple`
1. For constructors that take cvref variations of tuple<UTypes...>, there
used to be two SFINAE helpers, _EnableCopyFromOtherTuple and
_EnableMoveFromOtherTuple, whose implementations
seem to differ slightly from the spec. Now we need 4 variations.
Instead of adding another two, this change refactors them into a single helper,
_EnableCtrFromUTypesTuple, which directly maps to the spec without
changing the C++11 behaviour. However, we need the helper __copy_cvref_t
to get the type of std::get<i>(cvref tuple<Utypes...>) for different
cvref, so I made __copy_cvref_t available in C++11.
2. For constructors that take variations of std::pair, there used to be
four helpers: _EnableExplicitCopyFromPair, _EnableImplicitCopyFromPair,
_EnableImplicitMoveFromPair, and _EnableExplicitMoveFromPair. Instead of
adding another four, this change refactors them into two helpers,
_EnableCtrFromPair and _BothImplicitlyConvertible. This also removes the
need to use _nat.
3. For the const member assignment operators, since the requirement is very
simple, I haven't refactored the old code but instead directly added
the new C++23 code.
4. For const swap, I pretty much copy-pasted the non-const version to make
these overloads look consistent.
5. While making these changes, I found that two of the old constructors weren't
marked constexpr for C++20 but should have been. Fixed them and added unit
tests.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D116621