sergey ignatov [Wed, 6 Sep 2017 01:06:50 +0000 (04:06 +0300)]
implementing profiler ELT callbacks for AMD64 Linux (#12603)
* implement profiler ELT callbacks for AMD64 Linux
* Some formatting fixes
* Fixed profiler
* Added aligning frame option
* Added aligning stack for quad values stores
José Rivero [Wed, 6 Sep 2017 00:38:54 +0000 (17:38 -0700)]
Moving the Windows Performance runs from Server 2012 to Server 2016. (#13725)
Omair Majid [Tue, 5 Sep 2017 23:22:55 +0000 (19:22 -0400)]
Add support for building under glibc 2.26 (#13785)
glibc 2.26 renames a number of identifiers so they are reserved under
POSIX. Specifically, `padding` becomes `__glibc_reserved1`. Add a
configure test for it and use the appropriate field name.
See https://sourceware.org/bugzilla/show_bug.cgi?id=21457 for more
information.
Resolves #13009
Jarret Shook [Tue, 5 Sep 2017 22:20:39 +0000 (15:20 -0700)]
Merge pull request #13791 from jashook/revert_13687
revert_13687
Koundinya Veluri [Tue, 5 Sep 2017 21:52:23 +0000 (14:52 -0700)]
Fix SemaphoreSlim throughput (#13766)
In https://github.com/dotnet/coreclr/pull/13670, I mistakenly made the spin loop infinite; that is now fixed.
As a result the numbers I had provided in that PR for SemaphoreSlim were skewed, and fixing it caused the throughput to get even lower. To compensate, I have found and fixed one culprit for the low throughput problem:
- Every release wakes up a waiter. Effectively, when there is a thread acquiring and releasing the semaphore, waiters don't get to remain in a wait state.
- Added a field to keep track of how many waiters were pulsed to wake but have not yet woken, and took that into account in Release() to not wake up more waiters than necessary.
- Retuned and increased the number of spin iterations. The total spin delay is still less than before the above PR.
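The waiter bookkeeping described above can be sketched as follows. This is a minimal Python model, not the SemaphoreSlim source: the function name and the parameters `wait_count` and `pulsed_not_woken` are hypothetical, and the formula is one assumed way to discount waiters that were pulsed but have not yet woken.

```python
def waiters_to_wake(current_count, wait_count, pulsed_not_woken, release_count):
    """Return how many additional waiters a release should pulse.

    current_count    -- semaphore count after this release was added
    wait_count       -- threads currently blocked in a wait
    pulsed_not_woken -- waiters already pulsed that have not run yet
    release_count    -- number of units released by this call
    """
    # Waiters that could make progress with the units now available.
    runnable = min(current_count, wait_count)
    # Discount waiters that an earlier release already pulsed.
    needed = runnable - pulsed_not_woken
    # Never pulse more waiters than this call released, and never go negative.
    return max(0, min(needed, release_count))
```

In this model, a release that finds one waiter already pulsed and only one unit available wakes nobody else, which is the "not wake up more waiters than necessary" behavior the message describes.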
jashoo [Tue, 5 Sep 2017 18:42:15 +0000 (11:42 -0700)]
revert_13687
Fei Peng [Tue, 5 Sep 2017 18:34:56 +0000 (11:34 -0700)]
Add Intel hardware intrinsic API (#13576)
Victor "Nate" Graf [Tue, 5 Sep 2017 16:36:53 +0000 (09:36 -0700)]
Fix access order for double pointer (#13759)
* Fix access order for double pointer
* Reinforce test to catch more errors
Jarret Shook [Tue, 5 Sep 2017 16:36:35 +0000 (09:36 -0700)]
Merge pull request #13747 from jashook/lst_file_updates
Lst File updates.
Andy Ayers [Tue, 5 Sep 2017 14:23:55 +0000 (07:23 -0700)]
JIT: allow inlines of methods with calli (#13756)
Provided the call sig has the default calling convention. Added test case.
Continuation of #12714.
Shiming Ge [Tue, 5 Sep 2017 07:01:41 +0000 (15:01 +0800)]
Merge pull request #13763 from shimingsg/v-shige/addtestdependencyxmlfiles
add test dependency xml files
Hanjoung Lee [Tue, 5 Sep 2017 05:26:33 +0000 (14:26 +0900)]
Print zapper stats on verbose mode (#13774)
Hanjoung Lee [Mon, 4 Sep 2017 23:33:13 +0000 (08:33 +0900)]
Fix uninitialized fields of ZapperStats (#13773)
shimingsg [Sat, 2 Sep 2017 10:01:37 +0000 (18:01 +0800)]
update correct file path if the working folder is not test case folder
Victor "Nate" Graf [Fri, 1 Sep 2017 23:19:07 +0000 (16:19 -0700)]
Change identifier for EventProviders from GUID to string name (#13370)
* [WIP] Changed event provider to use String identifiers
* [WIP] Remove GUID from generated code
* [WIP] Many small fixes
* [WIP] Fix error in constructing GUID
* Pass EventSource to abstract away GUID/Name references
* Fix various small errors
* Delay construction of SString objects
* Change GUIDs to names
* Change hardcoded GUID strings to names
* Revert testing changes
* Remove extra line
* Use the EventSource name
* Use provider full names
* Use full-names for Rundown
* Bump version number for eventpipe file
* Address review comments
Bruce Forstall [Fri, 1 Sep 2017 23:03:18 +0000 (16:03 -0700)]
Merge pull request #13755 from BruceForstall/AddLsraAssert
Add an assert to getRegisterRecord() that regNum is legal
Maoni Stephens [Fri, 1 Sep 2017 22:21:47 +0000 (15:21 -0700)]
need casting for size calculation (#13754)
Bruce Forstall [Fri, 1 Sep 2017 20:59:23 +0000 (13:59 -0700)]
Add an assert to getRegisterRecord() that regNum is legal
Koundinya Veluri [Fri, 1 Sep 2017 20:09:40 +0000 (13:09 -0700)]
Add normalized equivalent of YieldProcessor, retune some spin loops (#13670)
* Add normalized equivalent of YieldProcessor, retune some spin loops
Part of fix for https://github.com/dotnet/coreclr/issues/13388
Normalized equivalent of YieldProcessor
- The delay incurred by YieldProcessor is measured once lazily at run-time
- Added YieldProcessorNormalized that yields for a specific duration (the duration is approximately equal to what was measured for one YieldProcessor on a Skylake processor, about 125 cycles). The measurement calculates how many YieldProcessor calls are necessary to get a delay close to the desired duration.
- Changed Thread.SpinWait to use YieldProcessorNormalized
Thread.SpinWait divide count by 7 experiment
- At this point I experimented with changing Thread.SpinWait to divide the requested number of iterations by 7, to see how it fares on perf. On my Sandy Bridge processor, 7 * YieldProcessor == YieldProcessorNormalized. See numbers in PR below.
- Not too many regressions, and the overall perf is somewhat as expected - not much change on Sandy Bridge processor, significant improvement on Skylake processor.
- I'm discounting the SemaphoreSlim throughput score because it seems to be heavily dependent on Monitor. It would be more interesting to revisit SemaphoreSlim after retuning Monitor's spin heuristics.
- ReaderWriterLockSlim seems to perform worse on Skylake; the current spin heuristics are not translating well
Spin tuning
- At this point, I abandoned the experiment above and tried to retune spins that use Thread.SpinWait
- General observations
- YieldProcessor stage
- At this stage in many places we're currently doing very long spins on YieldProcessor per iteration of the spin loop. In the last YieldProcessor iteration, it amounts to about 70 K cycles on Sandy Bridge and 512 K cycles on Skylake.
- Long spins on YieldProcessor don't let other work run efficiently. Especially when many scheduled threads all issue a long YieldProcessor, a significant portion of the processor can go unused for a long time.
- Long spins on YieldProcessor are in some cases helping to reduce contention in high-contention cases, effectively taking away some threads into a long delay. Sleep(1) works much better but has a much higher delay so it's not always appropriate. In other cases, I found that it's better to do more iterations with a shorter YieldProcessor. It would be even better to reduce the contention in the app or to have a proper wait in the sync object, where appropriate.
- Updated the YieldProcessor measurement above to calculate the number of YieldProcessorNormalized calls that amount to about 900 cycles (this was tuned based on perf), and modified SpinWait's YieldProcessor stage to cap the number of iterations passed to Thread.SpinWait. Effectively, the first few iterations have a longer delay than before on Sandy Bridge and a shorter delay than before on Skylake, and the later iterations have a much shorter delay than before on both.
- Yield/Sleep(0) stage
- Observed a couple of issues:
- When there are no threads to switch to, Yield and Sleep(0) become no-op and it turns the spin loop into a busy-spin that may quickly reach the max spin count and cause the thread to enter a wait state, or may just busy-spin for longer than desired before a Sleep(1). Completing the spin loop too early can cause excessive context switching if a wait follows, and entering the Sleep(1) stage too early can cause excessive delays.
- If there are multiple threads doing Yield and Sleep(0) (typically from the same spin loop due to contention), they may switch between one another, delaying work that can make progress.
- I found that it works well to interleave a Yield/Sleep(0) with YieldProcessor, it enforces a minimum delay for this stage. Modified SpinWait to do this until it reaches the Sleep(1) threshold.
- Sleep(1) stage
- I didn't see any benefit in the tests to interleave Sleep(1) calls with some Yield/Sleep(0) calls, perf seemed to be a bit worse actually. If the Sleep(1) stage is reached, there is probably a lot of contention and the Sleep(1) stage helps to remove some threads from the equation for a while. Adding some Yield/Sleep(0) in-between seems to add back some of that contention.
- Modified SpinWait to use a Sleep(1) threshold, after which point it only does Sleep(1) on each spin iteration
- For the Sleep(1) threshold, I couldn't find one constant that works well in all cases
- For spin loops that are followed by a proper wait (such as a wait on an event that is signaled when the resource becomes available), they benefit from not doing Sleep(1) at all, and spinning in other stages for longer
- For infinite spin loops, they usually seemed to benefit from a lower Sleep(1) threshold to reduce contention, but the threshold also depends on other factors like how much work is done in each spin iteration, how efficient waiting is, and whether waiting has any negative side-effects.
- Added an internal overload of SpinWait.SpinOnce to take the Sleep(1) threshold as a parameter
- SpinWait - Tweaked the spin strategy as mentioned above
- ManualResetEventSlim - Changed to use SpinWait, retuned the default number of iterations (total delay is still significantly less than before). Retained the previous behavior of having Sleep(1) if a higher spin count is requested.
- Task - It was using the same heuristics as ManualResetEventSlim, copied the changes here as well
- SemaphoreSlim - Changed to use SpinWait, retuned similarly to ManualResetEventSlim but with double the number of iterations because the wait path is a lot more expensive
- SpinLock - SpinLock was using very long YieldProcessor spins. Changed to use SpinWait, removed process count multiplier, simplified.
- ReaderWriterLockSlim - This one is complicated as there are many issues. The current spin heuristics performed better even after normalizing Thread.SpinWait but without changing the SpinWait iterations (the delay is longer than before), so I left this one as is.
- The perf (see numbers in PR below) seems to be much better than both the baseline and the Thread.SpinWait divide by 7 experiment
- On Sandy Bridge, I didn't see many significant regressions. ReaderWriterLockSlim is a bit worse in some cases and a bit better in other similar cases, but at least the really low scores in the baseline got much better and not the other way around.
- On Skylake, some significant regressions are in SemaphoreSlim throughput (which I'm discounting as I mentioned above in the experiment) and CountdownEvent add/signal throughput. The latter can probably be improved later.
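The normalization step described at the top of this message can be sketched as below. This is a minimal Python model under stated assumptions, not the runtime's code: the function name is hypothetical, the 125-cycle target is the Skylake figure quoted above, and the roughly 18 cycles per yield used for Sandy Bridge is inferred from the "7 * YieldProcessor" remark rather than stated in the message.

```python
# Approximate cost of one YieldProcessor on Skylake, per the message above.
TARGET_CYCLES = 125


def yields_per_normalized(measured_cycles_per_yield, target_cycles=TARGET_CYCLES):
    """Return how many raw yields approximate one normalized yield.

    measured_cycles_per_yield is the lazily measured cost of a single
    YieldProcessor on the current machine (hypothetical input here).
    """
    if measured_cycles_per_yield <= 0:
        raise ValueError("measured cost must be positive")
    # At least one yield is always issued, even on slow-yield processors.
    return max(1, round(target_cycles / measured_cycles_per_yield))
```

With a measured cost near 18 cycles the model asks for 7 raw yields per normalized yield, and with a measured cost near 125 cycles it asks for 1, matching the Sandy Bridge and Skylake observations in the message.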
Andy Ayers [Fri, 1 Sep 2017 18:20:39 +0000 (11:20 -0700)]
JIT: fix some instruction size estimates (#13432)
The jit might double-count the REX prefix for certain reg-reg moves, and
could overestimate the size of a code-referent LEA.
While the downstream emitter code can tolerate overestimates, it is better
not to have them.
Closes #13398
Also fixes overestimates for the following:
* `call reg`
* `call [reg]`
* `call [reg + disp-byte]`
* `cmp al,byte`
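To illustrate what double-counting a REX prefix means, here is a small Python sketch of a size estimate for an x64 reg-reg `mov`. It is not the JIT emitter's logic: the function names and the register numbering (0-15) are hypothetical, and 8-bit and 16-bit operands are ignored. It relies only on the encoding rule that an instruction carries at most one REX byte, whether it is needed for a 64-bit operand size, for an extended register (r8-r15), or for both.

```python
def needs_rex(dst, src, is_64bit):
    # One REX byte covers both the 64-bit operand size (REX.W) and
    # extended registers r8-r15 (REX.R / REX.B).
    return is_64bit or dst >= 8 or src >= 8


def mov_reg_reg_size(dst, src, is_64bit):
    """Size in bytes of `mov dst, src`: optional REX + opcode + ModRM."""
    return (1 if needs_rex(dst, src, is_64bit) else 0) + 1 + 1


def mov_reg_reg_size_double_counted(dst, src, is_64bit):
    """Overestimate that charges REX once for size and once for registers."""
    rex_for_size = 1 if is_64bit else 0
    rex_for_regs = 1 if (dst >= 8 or src >= 8) else 0
    return rex_for_size + rex_for_regs + 1 + 1
```

For a 64-bit move between two extended registers, the first function gives 3 bytes and the double-counting variant gives 4, which is the kind of overestimate the message says the emitter tolerates but should avoid.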
Jan Kotas [Fri, 1 Sep 2017 17:57:33 +0000 (10:57 -0700)]
Clear the init-locals bit for CoreLib to workaround #1279 (#13728)
* Clear the init-locals bit for CoreLib to workaround #1279
* Yet another place that depends on zero init locals
Atsushi Kanamori [Fri, 1 Sep 2017 17:04:47 +0000 (10:04 -0700)]
Update Type.GetMethods() to be generics friendly (#13745)
Update Type.GetMethods() to be generics friendly
Dan Moseley [Fri, 1 Sep 2017 17:03:50 +0000 (10:03 -0700)]
Share four exception types (#13492)
* AE
* Move RuntimeWrappedException to shared
* Move AccessViolationException to shared
* Move IOException to shared
* nit
* assert to catch when message ought to include path
* AE feedback
* RWE feedback
* Remove most of __Error in favor of duplicate shared code in Win32Marshal
* Remove duplicated MakeHRFromErrorCode
* Extra using
* Revert RWE field rename
* Rename to wrappedException
* Make DNFE public for corefx
* Add no path case to match corefx
* Share DNFE
* Add underscore
* EOL
* Temporarily make DrNFE internal again
* Make RWE public
* Fixup __HResults
* Remove dead entries from Win32Native
* Redirect Kernel32 to PAL on Unix
* Remove dead targets file
* Unify on Interop.Libraries
* Refactor to expression-bodied property
jashook [Fri, 1 Sep 2017 16:42:02 +0000 (09:42 -0700)]
Lst File updates.
This change includes:
1) lst_creator updates to allow adding priority tags automatically
2) arm32 lstFile updates: 29 new tests, 50 removed
3) arm64 lstFile updates: 80 new tests, 55 removed
Bruce Forstall [Fri, 1 Sep 2017 15:44:09 +0000 (08:44 -0700)]
Merge pull request #13743 from alpencolt/ryu-arm-13056-tests
[RyuJIT/ARM32] Add regression tests
Pankaj Gode [Fri, 1 Sep 2017 15:42:26 +0000 (21:12 +0530)]
[ARM64/Windows] Add JIT_Stelem_Ref helper (#13687)
* [ARM64/Windows] Add JIT_Stelem_Ref helper
* [ARM64/Windows] Modified JIT_Stelem_Ref helper to save and restore x0-x2 instead of x0-x8
* [ARM64/Windows] using EPILOG_BRANCH for correct unwind info
Hanjoung Lee [Fri, 1 Sep 2017 15:41:01 +0000 (00:41 +0900)]
Initialize m_failedILStubs of ZapperStats (#13742)
Justin Van Patten [Fri, 1 Sep 2017 14:12:30 +0000 (07:12 -0700)]
Avoid StringBuilder allocation in ResourceManager (#13732)
Michal Strehovský [Thu, 31 Aug 2017 01:05:28 +0000 (18:05 -0700)]
Merge pull request dotnet/corert#4370 from dotnet/nmirror
Merge nmirror to master
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Alexander Soldatov [Fri, 1 Sep 2017 10:46:26 +0000 (13:46 +0300)]
[RyuJIT/ARM32] Add regression tests
Tests for #13056
Eugene Rozenfeld [Fri, 1 Sep 2017 04:29:51 +0000 (21:29 -0700)]
Enable checking of GTF_EXCEPT and GTF_ASG flags. (#13668)
* Enable checking of GTF_EXCEPT and GTF_ASG flags.
fgDebugCheckFlags is modified to check that GTF_EXCEPT and GTF_ASG are set precisely when needed.
It's also modified to handle several special operators correctly.
fgAddrCouldBeNull is updated to check for handles, implicit byref locals, and stack byrefs.
OperMayThrow is modified to handle several operators correctly.
GTF_IND_NONFAULTING is reused on operations for which OperIsIndir() is true and on GT_ARR_LENGTH.
Various places in morph are updated to set side effect flags correctly.
gtUpdateSideEffects is re-written so that it's precise for GTF_ASG and GTF_EXCEPT
and conservatively correct for the other side effects. It's now called from more places
to keep the flags up-to-date after transformations.
NoThrow in HelperCallProperties is updated and GTF_EXCEPT flag is set on helper calls according to
that property.
optRemoveRangeCheck is cleaned up and simplified.
Sergey Andreenko [Fri, 1 Sep 2017 01:31:12 +0000 (18:31 -0700)]
spmi: fix prevEnviroment delete statement. (#13729)
Carol Eidt [Fri, 1 Sep 2017 00:28:04 +0000 (17:28 -0700)]
Merge pull request #13215 from CarolEidt/CrossHFA
Support checking for HFA types in altjit
dotnet-maestro-bot [Thu, 31 Aug 2017 21:03:20 +0000 (14:03 -0700)]
Update CoreClr, CoreFx to preview2-25631-03, preview2-25631-02, respectively (#13678)
Bruce Forstall [Thu, 31 Aug 2017 20:44:48 +0000 (13:44 -0700)]
Merge pull request #13669 from BruceForstall/ArmTailcallViaHelper
[RyuJIT/arm32] Add support for tailcall via helper
Carol Eidt [Thu, 31 Aug 2017 20:37:49 +0000 (13:37 -0700)]
Merge pull request #13724 from alpencolt/ryu-arm-13056
[RyuJIT/ARM32] Correct handling of double registers in registerIsAvai…
Ahson Ahmed Khan [Thu, 31 Aug 2017 18:34:20 +0000 (11:34 -0700)]
Adding {ReadOnly}Memory, OwnedMemory, MemoryHandle, and IRetainable (#13583)
* Adding {ReadOnly}Memory<T>, OwnedMemory<T>, MemoryHandle, and IRetainable.
* Adding types {ReadOnly}Memory<T>, OwnedMemory<T>, MemoryHandle, and IRetainable.
* Adding Unsafe.As and Unsafe.Add + other fixes of build errors
* Addressing PR feedback.
* Add a check for length > 0 before indexing the array.
* Removing use of Unsafe.As and changing signature to RefTFrom_RetRefTTo.
* Fixing metasig definition.
* Removing unnecessary call to Unsafe.Add.
Alexander Soldatov [Thu, 31 Aug 2017 18:14:07 +0000 (21:14 +0300)]
[RyuJIT/ARM32] Correct handling of double registers in registerIsAvailable()
José Rivero [Thu, 31 Aug 2017 17:00:54 +0000 (10:00 -0700)]
Bug fix on measurement.py loop (#13711)
The loop was iterating through a fixed file name and a pattern. The fixed file name did not exist, and the whole function failed.
With this change, we loop through the files if they exist.
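A minimal sketch of the "loop through the files if they exist" idea, assuming a helper that gathers one fixed name plus everything matching a glob pattern. The function name and its parameters are hypothetical and do not come from measurement.py.

```python
import glob
import os


def existing_files(directory, fixed_name, pattern):
    """Return the fixed file and pattern matches that exist, without duplicates."""
    candidates = [os.path.join(directory, fixed_name)]
    # sorted() keeps the order deterministic across platforms.
    candidates += sorted(glob.glob(os.path.join(directory, pattern)))

    found = []
    seen = set()
    for path in candidates:
        # Skip anything that is missing instead of failing the whole run.
        if os.path.isfile(path) and path not in seen:
            seen.add(path)
            found.append(path)
    return found
```

A caller can then iterate over the returned list directly; a missing fixed file simply contributes nothing.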
Bruce Forstall [Thu, 31 Aug 2017 16:34:45 +0000 (09:34 -0700)]
Merge pull request #13685 from kbaladurin/ryujit-fix-opcodeoffs
JIT: Fix calculation of opcodeOffsets in Compiler::impImportBlockCode
Bruce Forstall [Thu, 31 Aug 2017 16:32:51 +0000 (09:32 -0700)]
Merge pull request #13719 from wateret/fix-multiregop
[RyuJIT/armel] Fix MultiRegOp definition
Jan Kotas [Thu, 31 Aug 2017 14:36:39 +0000 (07:36 -0700)]
Cleanup CoreLib defines (#13713)
Daniel Podder [Thu, 31 Aug 2017 14:20:03 +0000 (09:20 -0500)]
Update optdata to version 20170830-0123 (#13717)
Shiming Ge [Thu, 31 Aug 2017 10:23:17 +0000 (18:23 +0800)]
Merge pull request #13394 from shimingsg/v-shige/add-perftc-0816
Add perf test (making cards) to coreclr
Hanjoung Lee [Thu, 31 Aug 2017 08:04:38 +0000 (17:04 +0900)]
[RyuJIT/armel] Fix MultiRegOp definition
GT_COPY should not be MultiRegOp since it is CopyOrReload.
Shiming Ge [Thu, 31 Aug 2017 06:03:06 +0000 (14:03 +0800)]
add test dependency files
Carol Eidt [Thu, 31 Aug 2017 04:34:52 +0000 (21:34 -0700)]
Merge pull request #13690 from CarolEidt/NullcheckNotContained
Assert that Nullcheck child is not contained
Brian Chavez [Thu, 31 Aug 2017 04:00:22 +0000 (21:00 -0700)]
Spelling and grammar corrections - M through Z (#13698)
Pat Gavlin [Thu, 31 Aug 2017 03:56:13 +0000 (20:56 -0700)]
Merge pull request #13701 from pgavlin/IndexAddr2
Restore `GT_INDEX_ADDR` after #13682.
Maggie Tsang [Thu, 31 Aug 2017 03:08:40 +0000 (20:08 -0700)]
Random span-based API (#13708)
* Span NextBytes
* int fixes
* undo ints
Pat Gavlin [Thu, 31 Aug 2017 01:36:32 +0000 (18:36 -0700)]
PR feedback.
Pat Gavlin [Thu, 31 Aug 2017 01:27:15 +0000 (18:27 -0700)]
Merge pull request #13704 from pgavlin/PerfTestsPri0
Move performance tests back into Priority 0.
Andy Ayers [Thu, 31 Aug 2017 00:36:07 +0000 (17:36 -0700)]
JIT: don't reuse box temps when optimizing (#13703)
The importer reuses temps for different box operations that do not overlap.
This keeps the number of temps to a minimum and reduces prolog zeroing if the
temps end up untracked. But reuse prevents the importer from accurately typing
the temp.
So now, when optimizing, allocate a new temp for each box operation, and type
the temp with the type of the box.
This, along with a small update in `gtGetClassHandle` to obtain the box type,
enables some devirtualization of interface calls on boxes and will facilitate
future changes to optimize away boxes entirely (eg #5626).
Atsushi Kanamori [Wed, 30 Aug 2017 23:33:28 +0000 (16:33 -0700)]
Preemptively adding a resource string. (#13705)
Adding this now to save time later as
the CoreCLR/CoreRT bot will soon be
porting over a big feature change
and the only thing blocking the bot
from a green CI is this one new error string.
Carol Eidt [Wed, 30 Aug 2017 21:34:12 +0000 (14:34 -0700)]
Fix a typo
José Rivero [Wed, 30 Aug 2017 20:51:23 +0000 (13:51 -0700)]
Error out if specified testBinLoc path does not exist. (#13700)
Pat Gavlin [Wed, 30 Aug 2017 20:25:23 +0000 (13:25 -0700)]
Move performance tests back into Priority 0.
Just what it says on the tin. Should fix #13697.
Jarret Shook [Wed, 30 Aug 2017 19:55:38 +0000 (12:55 -0700)]
Merge pull request #13684 from hseok-oh/ci/set_xml_file
Use --xunitOutputPath option in runtest.sh
Pat Gavlin [Mon, 28 Aug 2017 22:45:15 +0000 (15:45 -0700)]
Fix `INDEX_ADDR` codegen on ARM for large element sizes.
We were attempting to generate `base + index * size` using `MADD`, but
had the registers in the wrong order and were generating
`base * index + size`. This change fixes the register order s.t. the
expected instruction is generated.
Fixes #13593.
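The operand mix-up can be illustrated with a small Python model of a multiply-add, where `madd(rn, rm, ra)` computes `ra + rn * rm` as the A64 `MADD` instruction does. The helper names are hypothetical, and this is a sketch of the arithmetic only, not the codegen change itself.

```python
def madd(rn, rm, ra):
    """Model of a multiply-add: returns ra + rn * rm."""
    return ra + rn * rm


def index_addr(base, index, elem_size):
    # Intended operand order: multiply index by size, then add the base.
    return madd(index, elem_size, base)


def index_addr_swapped(base, index, elem_size):
    # The mistake described above: base and index are multiplied, size is added.
    return madd(base, index, elem_size)
```

For a base of 0x1000, index 3, and element size 24, the intended order yields 0x1000 + 72, while the swapped order yields 0x1000 * 3 + 24.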
Pat Gavlin [Wed, 30 Aug 2017 19:44:27 +0000 (12:44 -0700)]
Fix `gtCloneExpr` for `GT_IND(GT_INDEX_ADDR)`.
This function does not need to update the array info map when cloning
a `GT_IND` if the address is a `GT_INDEX_ADDR`.
Pat Gavlin [Wed, 30 Aug 2017 19:25:30 +0000 (12:25 -0700)]
Fix the GC info for `INDEX_ADDR` codegen.
`genConsumeReg` marks the consumed register as not a GC pointer, as it
assumes that the input register dies at the first instruction
generated by the node. This is not the case for `INDEX_ADDR`, however,
as the base register is multiply-used. As such, we need to mark the base
register as containing a GC pointer until we are finished generating the
code for this node.
Fixes
Pat Gavlin [Wed, 30 Aug 2017 19:41:25 +0000 (12:41 -0700)]
Revert commit bec6ac10f3968a8f699aad6233657ac59df37a73.
This restores the `GT_INDEX_ADDR` changes.
Matt Mitchell [Wed, 30 Aug 2017 18:48:52 +0000 (11:48 -0700)]
Remove EOL openSuSE 42.1 (#13693)
Sung Yoon Whang [Wed, 30 Aug 2017 17:45:52 +0000 (10:45 -0700)]
Change FinalizerThreadCreate location to after profiler is initialized (#13663)
* Change FinalizerThreadCreate location to after profiler is initialized to ensure finalizer creation notification to profilerAPI
* Fixes issue #13499
Carol Eidt [Wed, 2 Aug 2017 22:10:50 +0000 (15:10 -0700)]
Support checking for HFA types in altjit
When using an HFA altjit with a non-HFA coreclr, `FEATURE_HFA` will not
be defined in the runtime. In this case, instead of caching the information
on the `MethodTable`, it will be recomputed for every call to query for
an HFA type or to get the base element type.
In order to do this, the functionality must be available on a built class,
so implement these methods on `EEClass` instead of `MethodTableBuilder`
(which was already using `GetHalfBakedClass()` to access some of the
`EEClass` functionality).
Fix #13092
Daniel Podder [Wed, 30 Aug 2017 16:56:09 +0000 (11:56 -0500)]
Revert #13647 and #13638 (#13666)
#13638 and #13647 attempted to replace the generic 'dotnet-bot' e-mail
address with PR authors' e-mail addresses when uploading perf data to
BenchView. Unfortunately, these changes are still broken for some users
(specifically, if a user's e-mail address is not published/visible on
their GitHub profile).
There is no clean way to implement the proper fix, and the right fix
will change once pipeline support is available. Rather than putting in
something hacky now, I'm reverting these changes to unblock PRs. We
should revisit these changes after pipeline jobs are available.
Carol Eidt [Wed, 30 Aug 2017 16:03:18 +0000 (09:03 -0700)]
Assert that Nullcheck child is not contained
TreeNodeInfoInit doesn't call `TreeNodeInfoInitIndir` on `GT_NULLCHECK`, though it does an indirection, because we never create an LEA for these. Assert that the child is not contained.
Andy Ayers [Wed, 30 Aug 2017 16:01:51 +0000 (09:01 -0700)]
JIT: allow nulls in gtCanOptimizeTypeEquality (#13680)
This is a follow-on to #13657. I looked at the remaining calls to
`Type::op_Equality` in the jit-diffs output and saw many of the calls had a
null pointer argument. This pattern comes about from explicit null checks in
the sources, often as part of argument validation.
Such calls can also be optimized into simple pointer equality checks, so
add another clause to `gtCanOptimizeTypeEquality` to look for nulls.
Bruce Forstall [Wed, 30 Aug 2017 16:01:26 +0000 (09:01 -0700)]
Formatting
Carol Eidt [Wed, 30 Aug 2017 15:09:44 +0000 (08:09 -0700)]
Merge pull request #13677 from hseok-oh/ryujit/fix_13675
[RyuJIT/ARMARCH] TreeNodeInfoInit for GT_NULLCHECK
Konstantin Baladurin [Thu, 6 Jul 2017 11:21:42 +0000 (14:21 +0300)]
JIT: Fix calculation of opcodeOffsets in Compiler::impImportBlockCode
Jan Kotas [Wed, 30 Aug 2017 13:35:44 +0000 (06:35 -0700)]
Revert "Merge pull request #13245 from pgavlin/NoExpandIndex" (#13682)
This reverts commit a7ffdeca6fed927dbd457293d97b07237db95e82, reversing
changes made to f5f622db2a00d7687f256c0d1cdda5e6f6da7ad4.
Omair Majid [Wed, 30 Aug 2017 09:15:42 +0000 (05:15 -0400)]
Remove -sequential build-flag (#13658)
The flag is not implemented anywhere and is completely ignored. Remove
it from various help notices too.
Fixes #12035
Hyeongseok Oh [Wed, 30 Aug 2017 08:40:23 +0000 (17:40 +0900)]
Use --xunitOutputPath option in runtest.sh
Fix runtest.sh so that --xunitOutputPath is actually used.
This option had no effect because xunitOutputPath was always set to the default.
Andy Ayers [Wed, 30 Aug 2017 02:48:40 +0000 (19:48 -0700)]
JIT: rework gtCanOptimizeTypeEquality (#13657)
Instead of looking at the verifier type for a local (which is currently never
set for ref classes in CoreCLR), use the new utilities for finding ref type
class handles.
Since the utility works on all tree types, also remove the restriction that
the tree must be a local.
Closes #13555.
Hyeongseok Oh [Wed, 30 Aug 2017 02:28:58 +0000 (11:28 +0900)]
Fix comment
Fix comment for TreeNodeInfoInitIndir
Hyeongseok Oh [Wed, 30 Aug 2017 02:06:45 +0000 (11:06 +0900)]
[RyuJIT/ARMARCH] TreeNodeInfoInit for GT_NULLCHECK
Change dstCount, internalcount for GT_NULLCHECK on ARM32
Remove calling TreeNodeInfoInitIndir() for GT_NULLCHECK on ARM32, ARM64
José Rivero [Wed, 30 Aug 2017 01:10:32 +0000 (18:10 -0700)]
IlLink perf stopped running because the VM pool name does not exist. (#13672)
Konstantin Baladurin [Wed, 30 Aug 2017 00:01:32 +0000 (03:01 +0300)]
callsignalhandlerwrapper: improve unwinding (#13566)
* Fix free_stack macro for ARM
free_stack shouldn't contain unwinder annotations for stack adjustment
* callsignalhandlerwrapper: improve unwinding
For Linux: make CallSignalHandlerWrapper's frame a sigtramp frame
for gdb and lldb:
- Save all registers on stack
- Add sigreturn syscall after call of signal_handler_worker
This lets gdb and lldb unwind a frame with an invalid pc
(due to a jump to an invalid address).
For non linux:
- Save r11 on stack as it also can be used as frame pointer
- Set instruction set flag (thumb / arm) for saved pc. It is
necessary for gdb because it uses lr's lsb to determine
function mode
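The saved-pc tagging mentioned in the last item can be sketched in Python as below. This is an illustration of the low-bit convention only, under the assumption that bit 0 of a code address marks Thumb mode; the function name is hypothetical and the real change is made in assembly.

```python
THUMB_BIT = 0x1  # low bit of a code address marks Thumb mode


def tag_saved_pc(pc, is_thumb):
    """Return pc with its low bit set for Thumb code and cleared for ARM code."""
    if is_thumb:
        return pc | THUMB_BIT
    return pc & ~THUMB_BIT
```

A debugger that inspects the low bit of the saved address can then tell which instruction set the interrupted function was using.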
William Godbe [Tue, 29 Aug 2017 21:35:36 +0000 (14:35 -0700)]
Merge pull request #13650 from dagood/fix-BuildVersionFile
Fix dir.props BuildVersionFile override
Bruce Forstall [Tue, 29 Aug 2017 21:04:48 +0000 (14:04 -0700)]
[RyuJIT/arm32] Add support for tailcall via helper
Essentially, just do what AMD64 was doing in fgMorphTailCall()
and LowerTailCallViaHelper(). All the VM support for tailcall
via helper is already there (and used by legacy backend).
Fixes #11844, #11836
Joseph Tremoulet [Tue, 29 Aug 2017 20:47:00 +0000 (16:47 -0400)]
Merge pull request #13652 from JosephTremoulet/EntryNext
Check for new blocks after `entry`
Jan Kotas [Tue, 29 Aug 2017 16:17:31 +0000 (09:17 -0700)]
Use nameof for parse failures (#13640)
Carol Eidt [Tue, 29 Aug 2017 15:11:17 +0000 (08:11 -0700)]
Merge pull request #13649 from wateret/fix-13622
[RyuJIT/armel] Fix double reg arg passing
José Rivero [Tue, 29 Aug 2017 15:08:25 +0000 (08:08 -0700)]
Fixing IlLink job leg name. (#13646)
- CoreClr Scenario and CoreClr IlLink jobs were aliased on CI.
Carol Eidt [Tue, 29 Aug 2017 05:19:05 +0000 (22:19 -0700)]
Merge pull request #13648 from hseok-oh/ryujit/remove_regset_fieldlist
[RyuJIT/ARM32] Fix setting register of GT_FIELD_LIST for long
Carol Eidt [Tue, 29 Aug 2017 05:14:18 +0000 (22:14 -0700)]
Merge pull request #13628 from CarolEidt/FixNullCheck
Fix NullCheck register modeling
Joseph Tremoulet [Tue, 29 Aug 2017 04:31:07 +0000 (00:31 -0400)]
Check for new blocks after `entry`
Loop construction has a check for the case that an in-loop block has a
`bbNext` block that is a new block but not visited in the loop flow
walk; make sure that check fires for `entry` as well as other loop
blocks.
Fixes #13507.
Davis Goodin [Tue, 29 Aug 2017 02:07:57 +0000 (21:07 -0500)]
Fix dir.props BuildVersionFile override
The override needs to be before the Build.Common.props import, because BuildVersion.targets is in Build.Common.props and it's what's responsible for importing BuildVersionFile if it exists.
Hanjoung Lee [Tue, 29 Aug 2017 01:51:40 +0000 (10:51 +0900)]
[RyuJIT/armel] Fix double reg arg passing
Fix reg count for double arg reg
Fix #13622
Hyeongseok Oh [Tue, 29 Aug 2017 01:48:23 +0000 (10:48 +0900)]
[RyuJIT/ARM32] Fix setting register of GT_FIELD_LIST for long
Remove setting gtRegNum of GT_FIELD_LIST in LowerArg()
gtRegNum of GT_FIELD_LIST is reset in NewPutArg()
Daniel Podder [Tue, 29 Aug 2017 01:20:15 +0000 (20:20 -0500)]
Replace ghprbTriggerAuthorEmail -> ghprbPullAuthorEmail (#13647)
Perf smoke tests have been broken since #13638. The issue appears to be
that `ghprbTriggerAuthorEmail` is an empty string in most cases, so its
value is not expanded.
The attempted fix is to use `ghprbPullAuthorEmail` instead, which is
always defined (since there's always an owner to a PR).
Sergey Andreenko [Tue, 29 Aug 2017 01:14:50 +0000 (18:14 -0700)]
SuperPMI replay: fix environment variables initialization. (#13596)
SuperPMI replay: fix environment variables initialization.
If we have mch with mc files with different ENV_variables, we ran them
with the set for the first mc.
Brian Sullivan [Tue, 29 Aug 2017 00:53:06 +0000 (17:53 -0700)]
Merge pull request #13643 from dotnet-bot/from-tfs
Merge changes from TFS
Pat Gavlin [Tue, 29 Aug 2017 00:21:56 +0000 (17:21 -0700)]
Merge pull request #13627 from pgavlin/TagPri1Lst
Tag Pri1+ tests as such in the ARM32/64 LST files.
Brian Sullivan [Mon, 28 Aug 2017 23:31:08 +0000 (16:31 -0700)]
Fix for clang-format issue with previous Changeset
(previous CS is #1672052)
[tfs-changeset: 1672055]
Brian Sullivan [Mon, 28 Aug 2017 23:00:53 +0000 (16:00 -0700)]
Fixes VSO 469476 - Check for the MARSHAL_BYREF class attribute before performing tail recursion optimization
[tfs-changeset: 1672052]
Michelle McDaniel [Mon, 28 Aug 2017 22:21:16 +0000 (15:21 -0700)]
Merge pull request #13638 from adiaaida/updatePerfUser
Use the PR trigger email for benchview submit
Michelle McDaniel [Mon, 28 Aug 2017 21:41:27 +0000 (14:41 -0700)]
Use the PR trigger email for benchview submit
When users trigger perf runs for PRs, the alias used to submit in
submission-metadata.json should be that user's email. We should only
use dotnet-bot for official runs.
Jan Kotas [Mon, 28 Aug 2017 20:04:24 +0000 (13:04 -0700)]
Fix build breaks - delete parse tests that are redundant with CoreFX