Improve call counting mechanism (#1457)
author Koundinya Veluri <kouvel@users.noreply.github.com>
Tue, 28 Jan 2020 22:19:27 +0000 (14:19 -0800)
committer GitHub <noreply@github.com>
Tue, 28 Jan 2020 22:19:27 +0000 (14:19 -0800)
Improve call counting mechanism

- Call counting through the prestub is fairly expensive, and its overhead shows up in performance measurements as soon as call counting begins
- Added call counting stubs. When starting call counting for a method:
  - A `CallCountingInfo` is created; it initializes a remaining call count to the call count threshold
  - A `CallCountingStub` is created. It contains a small amount of code that decrements the remaining call count and checks it for zero. When nonzero, it jumps to the code version's native code entry point. When zero, it forwards to a helper function that handles tier promotion (see the first sketch after this list).
  - When the call count threshold is reached, the helper call enqueues completion of call counting for background processing
  - When completing call counting, the code version is enqueued for promotion, and the call counting stub is removed from the call chain
  - Once all work queued for promotion is completed and the methods have transitioned to the optimized tier, call counting stubs are deleted based on some heuristics and under runtime suspension
- The `CallCountingManager` is the main class with most of the logic. Its private subclasses are just simple data structures.
- Call counting is done at a `NativeCodeVersion` level (stub association is with the code version)
- The code versioning lock protects the data structures used for call counting. Since installing a call counting stub requires knowing the currently active code version, it made sense to use the same lock.
- Call counting stubs contain hardcoded machine code. x64 has short and long stubs; short stubs are used when possible (most of the time) and use IP-relative branches to the method's code and to the helper stub. Other platforms have only one type of stub (a short stub).
- For tiered methods that don't have a precode (virtual and interface methods), a forwarder stub (a precode) is created, and it forwards to the call counting stub. This allows the call counting stub to be deleted safely and easily. Forwarder stubs are used only while counting calls; there is one per method (not per code version), and they are not deleted. See `CallCountingManager::SetCodeEntryPoint()` for more info.
- The `OnCallCountThresholdReachedStub()` takes a "stub-identifying token". The helper call gets the stub's address from the token, which also indicates whether the stub is short or long. From the stub, the remaining call count pointer leads to the `CallCountingInfo`, and from that to the `NativeCodeVersion` associated with the stub (see the second sketch after this list).
- The `CallCountingStubManager` traces through a call counting stub so that VS-like debuggers can step into a method through the call counting stub
- Exceptions (OOM)
  - On foreground threads, exceptions are propagated unless they can be handled without any compromise
  - On background threads, exceptions are caught and logged as before. The scope of an exception is limited to one method or code version where possible, so that one exception does not abort a loop over many.
- Fixed a latent race where a method is recorded for call counting and then the method's code entry point is set to tier 0 code
  - With that order, the tiering delay may expire and the method's entry point may be updated for call counting in the background before the recording thread sets the code entry point; that last action would then disable call counting for the method and prevent it from being optimized. The only protection against this was the delay itself, and configuring a shorter delay increases the chance of hitting the race.
  - Inverted the order such that the method's code entry point is set before the method is recorded for call counting, on both first and subsequent calls (see the ordering sketch after this list)
  - Changed the tiered compilation lock to be an any-GC-mode lock so that it can be taken inside the code versioning lock, as some things fit more naturally inside the code versioning lock, where the active code version is known, such as checking the tiering delay before call counting and promoting the code version when the call count threshold is reached
    - Unfortunately, that makes code inside the lock a GC-no-trigger scope, so things like scheduling a timer or queuing a work item to the thread pool cannot be done inside it. This tradeoff seems better than the alternatives, so those pieces were refactored to occur outside the scope.
- Publishing an entry point after changing the active code version now takes call counting into account; fixes https://github.com/dotnet/coreclr/issues/22426
- After the changes:
  - Call counting overhead is much smaller and is no longer many orders of magnitude greater than the cost of a method call
  - Some config modes for tuning tiering, such as increasing the call count threshold or disabling or decreasing the tiering delay, are now much more reasonable and no longer hurt performance nearly as much as before. This also enables dynamic thresholds in the future, which the previous overhead made infeasible.
  - No change to startup or steady-state perf
- Left for later
  - Eventing work to report call counting stub code ranges and method names (also needs to be done for other stubs)
  - Some tests that consume events to verify run-time behavior in a few config modes
  - A debugger test to verify debugging while call counting is in progress. Debugger tests also need to be fixed for tiering.
  - The call count threshold has not been changed for now. As there aren't many tests that measure performance between startup and steady state, some will need to be created, perhaps from existing tests, to determine the effects.
- Fixes https://github.com/dotnet/coreclr/issues/23596
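
For illustration, here is a minimal C++ sketch of what each call counting stub does conceptually. The real stubs are a few hand-written machine instructions (see the `CallCountingStubShort`/`CallCountingStubLong` definitions in the cgencpu.h diff below); all names in this sketch are hypothetical stand-ins.

```cpp
#include <cstdint>

using CallCount = std::uint16_t;
using CodePointer = const void *;

// Assumed helper (trivial stand-in body): in the runtime this enqueues
// completion of call counting and handles tier promotion.
CodePointer OnCallCountThresholdReached() { return nullptr; }

struct ConceptualCallCountingStub
{
    CallCount *remainingCallCountCell; // initialized to the call count threshold
    CodePointer targetForMethod;       // the code version's native code entry point

    CodePointer Dispatch()
    {
        // Decrement the remaining call count; in the common (nonzero) case,
        // branch straight to the method's native code.
        if (--*remainingCallCountCell != 0)
            return targetForMethod;
        // Threshold reached: forward to the helper for tier promotion.
        return OnCallCountThresholdReached();
    }
};
```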
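
Similarly, a hedged sketch of the lookup chain from the stub back to the code version; the types and accessors here are simplified stand-ins, not the runtime's real classes.

```cpp
#include <cstddef>
#include <cstdint>

struct NativeCodeVersion { int id; }; // stand-in for the runtime's type

struct CallCountingInfo
{
    std::uint16_t remainingCallCount; // the cell the stub decrements
    NativeCodeVersion codeVersion;    // the code version being counted

    // Recover the owning CallCountingInfo from the address of its count cell
    // (a containing-object computation; the real accessor names differ).
    static CallCountingInfo *From(std::uint16_t *cell)
    {
        char *p = reinterpret_cast<char *>(cell);
        return reinterpret_cast<CallCountingInfo *>(
            p - offsetof(CallCountingInfo, remainingCallCount));
    }
};

// In the runtime, the stub-identifying token first yields the stub (on x64
// the token's low bit distinguishes short from long stubs, as in the
// cgencpu.h diff), the stub yields its count cell pointer, and the cell
// leads back to the CallCountingInfo and its NativeCodeVersion.
NativeCodeVersion ResolveCodeVersion(std::uint16_t *remainingCallCountCell)
{
    return CallCountingInfo::From(remainingCallCountCell)->codeVersion;
}
```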
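
And a short sketch of the ordering fix for the latent race; the function names are illustrative, not the runtime's actual API.

```cpp
// Illustrative stand-ins for the two actions involved in the race.
void SetCodeEntryPoint() {}      // publish the method's tier 0 code entry point
void RecordForCallCounting() {}  // make the method eligible for call counting

void OnTier0CodeReadyBefore()    // racy order (before the fix)
{
    RecordForCallCounting();     // the tiering delay may expire here and a
                                 // background thread may install call counting...
    SetCodeEntryPoint();         // ...which this entry point update then undoes,
                                 // disabling call counting and promotion
}

void OnTier0CodeReadyAfter()     // fixed order
{
    SetCodeEntryPoint();         // publish tier 0 code first
    RecordForCallCounting();     // then record for call counting
}
```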

54 files changed:
docs/design/features/code-versioning.md
src/coreclr/src/debug/daccess/request.cpp
src/coreclr/src/debug/ee/debugger.cpp
src/coreclr/src/debug/ee/functioninfo.cpp
src/coreclr/src/inc/CrstTypes.def
src/coreclr/src/inc/clrconfigvalues.h
src/coreclr/src/inc/crsttypes.h
src/coreclr/src/inc/dacvars.h
src/coreclr/src/inc/loaderheap.h
src/coreclr/src/inc/shash.h
src/coreclr/src/inc/shash.inl
src/coreclr/src/inc/vptr_list.h
src/coreclr/src/vm/CMakeLists.txt
src/coreclr/src/vm/amd64/AsmHelpers.asm
src/coreclr/src/vm/amd64/cgencpu.h
src/coreclr/src/vm/amd64/unixasmhelpers.S
src/coreclr/src/vm/appdomain.cpp
src/coreclr/src/vm/arm/asmhelpers.S
src/coreclr/src/vm/arm/asmhelpers.asm
src/coreclr/src/vm/arm/cgencpu.h
src/coreclr/src/vm/arm64/asmhelpers.S
src/coreclr/src/vm/arm64/asmhelpers.asm
src/coreclr/src/vm/arm64/cgencpu.h
src/coreclr/src/vm/callcounter.cpp [deleted file]
src/coreclr/src/vm/callcounter.h [deleted file]
src/coreclr/src/vm/callcounting.cpp [new file with mode: 0644]
src/coreclr/src/vm/callcounting.h [new file with mode: 0644]
src/coreclr/src/vm/ceemain.cpp
src/coreclr/src/vm/codeversion.cpp
src/coreclr/src/vm/codeversion.h
src/coreclr/src/vm/eeconfig.cpp
src/coreclr/src/vm/eeconfig.h
src/coreclr/src/vm/eventtrace.cpp
src/coreclr/src/vm/fptrstubs.cpp
src/coreclr/src/vm/frames.cpp
src/coreclr/src/vm/frames.h
src/coreclr/src/vm/i386/asmhelpers.S
src/coreclr/src/vm/i386/asmhelpers.asm
src/coreclr/src/vm/i386/cgencpu.h
src/coreclr/src/vm/jitinterface.cpp
src/coreclr/src/vm/loaderallocator.cpp
src/coreclr/src/vm/loaderallocator.hpp
src/coreclr/src/vm/method.cpp
src/coreclr/src/vm/method.hpp
src/coreclr/src/vm/method.inl
src/coreclr/src/vm/methoddescbackpatchinfo.cpp
src/coreclr/src/vm/methoddescbackpatchinfo.h
src/coreclr/src/vm/prestub.cpp
src/coreclr/src/vm/proftoeeinterfaceimpl.cpp
src/coreclr/src/vm/rejit.cpp
src/coreclr/src/vm/tieredcompilation.cpp
src/coreclr/src/vm/tieredcompilation.h
src/coreclr/src/vm/win32threadpool.cpp
src/coreclr/src/vm/win32threadpool.h

index fce9e9c..7d07e8a 100644 (file)
@@ -330,11 +330,7 @@ to update the active child at either of those levels (ReJIT uses SetActiveILCode
 In order to do step 3 the `CodeVersionManager` relies on one of three different mechanisms, a `FixupPrecode`, a `JumpStamp`, or backpatching entry point slots. In [method.hpp](https://github.com/dotnet/coreclr/blob/master/src/vm/method.hpp) these mechanisms are described in the `MethodDesc::IsVersionableWith*()` functions, and all methods have been classified to use at most one of the techniques, based on the `MethodDesc::IsVersionableWith*()` functions.
 
 ### Thread-safety ###
-CodeVersionManager is designed for use in a free-threaded environment, in many cases by requiring the caller to acquire a lock before calling. This lock can be acquired by constructing an instance of the
-
-```
-CodeVersionManager::TableLockHolder(CodeVersionManager*)
-```
+CodeVersionManager is designed for use in a free-threaded environment, in many cases by requiring the caller to acquire a lock before calling. This lock can be acquired by constructing an instance of `CodeVersionManager::LockHolder`.
 
 in some scope for the CodeVersionManager being operated on. CodeVersionManagers from different domains should not have their locks taken by the same thread with one exception, it is OK to take the shared domain  manager lock and one AppDomain manager lock in that order. The lock is required to change the shape of the tree or traverse it but not to read/write configuration properties from each node. A few special cases:
 
index 6039964..2d04124 100644 (file)
@@ -4281,7 +4281,7 @@ HRESULT ClrDataAccess::GetPendingReJITID(CLRDATA_ADDRESS methodDesc, int *pRejit
     PTR_MethodDesc pMD = PTR_MethodDesc(TO_TADDR(methodDesc));
 
     CodeVersionManager* pCodeVersionManager = pMD->GetCodeVersionManager();
-    CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
     ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pMD);
     if (ilVersion.IsNull())
     {
@@ -4313,7 +4313,7 @@ HRESULT ClrDataAccess::GetReJITInformation(CLRDATA_ADDRESS methodDesc, int rejit
     PTR_MethodDesc pMD = PTR_MethodDesc(TO_TADDR(methodDesc));
 
     CodeVersionManager* pCodeVersionManager = pMD->GetCodeVersionManager();
-    CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
     ILCodeVersion ilVersion = pCodeVersionManager->GetILCodeVersion(pMD, rejitId);
     if (ilVersion.IsNull())
     {
@@ -4365,7 +4365,7 @@ HRESULT ClrDataAccess::GetProfilerModifiedILInformation(CLRDATA_ADDRESS methodDe
     PTR_MethodDesc pMD = PTR_MethodDesc(TO_TADDR(methodDesc));
 
     CodeVersionManager* pCodeVersionManager = pMD->GetCodeVersionManager();
-    CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
     ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pMD);
     if (ilVersion.GetRejitState() != ILCodeVersion::kStateActive || !ilVersion.HasDefaultIL())
     {
@@ -4398,7 +4398,7 @@ HRESULT ClrDataAccess::GetMethodsWithProfilerModifiedIL(CLRDATA_ADDRESS mod, CLR
 
     PTR_Module pModule = PTR_Module(TO_TADDR(mod));
     CodeVersionManager* pCodeVersionManager = pModule->GetCodeVersionManager();
-    CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
 
     LookupMap<PTR_MethodTable>::Iterator typeIter(&pModule->m_TypeDefToMethodTableMap);
     for (int i = 0; typeIter.Next(); i++)
index 91a927d..dbd79f1 100644 (file)
@@ -3634,7 +3634,7 @@ HRESULT Debugger::SetIP( bool fCanSetIPOnly, Thread *thread,Module *module,
 
     CodeVersionManager *pCodeVersionManager = module->GetCodeVersionManager();
     {
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         ILCodeVersion ilCodeVersion = pCodeVersionManager->GetActiveILCodeVersion(module, mdMeth);
         if (!ilCodeVersion.IsDefaultVersion())
         {
index b2c2ecd..e42a37b 100644 (file)
@@ -933,7 +933,7 @@ void DebuggerJitInfo::LazyInitBounds()
         LOG((LF_CORDB,LL_EVERYTHING, "DJI::LazyInitBounds: this=0x%x GetBoundariesAndVars success=0x%x\n", this, fSuccess));
 
         // SetBoundaries uses the CodeVersionManager, need to take it now for lock ordering reasons
-        CodeVersionManager::TableLockHolder lockHolder(mdesc->GetCodeVersionManager());
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         Debugger::DebuggerDataLockHolder debuggerDataLockHolder(g_pDebugger);
 
         if (!m_fAttemptInit)
@@ -1059,7 +1059,7 @@ void DebuggerJitInfo::SetBoundaries(ULONG32 cMap, ICorDebugInfo::OffsetMapping *
     // Pick a unique initial value (-10) so that the 1st doesn't accidentally match.
     int ilPrevOld = -10;
 
-    _ASSERTE(m_nativeCodeVersion.GetMethodDesc()->GetCodeVersionManager()->LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
 
     InstrumentedILOffsetMapping mapping;
 
@@ -1606,8 +1606,8 @@ DebuggerJitInfo *DebuggerMethodInfo::FindOrCreateInitAndAddJitInfo(MethodDesc* f
     NativeCodeVersion nativeCodeVersion;
     if (fd->IsVersionable())
     {
-        CodeVersionManager::TableLockHolder lockHolder(fd->GetCodeVersionManager());
         CodeVersionManager *pCodeVersionManager = fd->GetCodeVersionManager();
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         nativeCodeVersion = pCodeVersionManager->GetNativeCodeVersion(fd, startAddr);
         if (nativeCodeVersion.IsNull())
         {
@@ -2087,7 +2087,7 @@ void DebuggerMethodInfo::CreateDJIsForMethodDesc(MethodDesc * pMethodDesc)
     CodeVersionManager* pCodeVersionManager = pMethodDesc->GetCodeVersionManager();
     // grab the code version lock to iterate available versions of the code
     {
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         NativeCodeVersionCollection nativeCodeVersions = pCodeVersionManager->GetNativeCodeVersions(pMethodDesc);
 
         for (NativeCodeVersionIterator itr = nativeCodeVersions.Begin(), end = nativeCodeVersions.End(); itr != end; itr++)
index 42ba9fe..57119c0 100644 (file)
@@ -280,7 +280,7 @@ Crst NativeImageCache
 End
 
 Crst GCCover
-    AcquiredBefore LoaderHeap ReJITDomainTable
+    AcquiredBefore LoaderHeap CodeVersioning
 End
 
 Crst GCMemoryPressure
@@ -486,7 +486,7 @@ Crst Reflection
 End
 
 // Used to synchronize all rejit information stored in a given AppDomain.
-Crst ReJITDomainTable
+Crst CodeVersioning
     AcquiredBefore LoaderHeap SingleUseLock DeadlockDetection JumpStubCache DebuggerController FuncPtrStubs
     AcquiredAfter ReJITGlobalRequest ThreadStore GlobalStrLiteralMap SystemDomain DebuggerMutex MethodDescBackpatchInfoTracker
 End
@@ -495,7 +495,7 @@ End
 // new functions to rejit tables, or request Reverts on existing functions in the rejit
 // tables.  One of these crsts exist per runtime.
 Crst ReJITGlobalRequest
-    AcquiredBefore ThreadStore ReJITDomainTable SystemDomain JitInlineTrackingMap
+    AcquiredBefore ThreadStore CodeVersioning SystemDomain JitInlineTrackingMap
 End
 
 // ETW infrastructure uses this crst to protect a hash table of TypeHandles which is
@@ -679,7 +679,7 @@ Crst InlineTrackingMap
 End
 
 Crst JitInlineTrackingMap
-    AcquiredBefore ReJITDomainTable ThreadStore LoaderAllocator
+    AcquiredBefore CodeVersioning ThreadStore LoaderAllocator
 End
 
 Crst EventPipe
@@ -695,6 +695,7 @@ Crst ReadyToRunEntryPointToMethodDescMap
 End
 
 Crst TieredCompilation
+    AcquiredAfter CodeVersioning
     AcquiredBefore ThreadpoolTimerQueue
 End
 
index 77ecde7..25a447c 100644 (file)
@@ -633,10 +633,17 @@ RETAIL_CONFIG_DWORD_INFO(INTERNAL_HillClimbing_GainExponent,
 RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TieredCompilation, W("TieredCompilation"), 1, "Enables tiered compilation")
 RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TC_QuickJit, W("TC_QuickJit"), 1, "For methods that would be jitted, enable using quick JIT when appropriate.")
 RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_TC_QuickJitForLoops, W("TC_QuickJitForLoops"), 0, "When quick JIT is enabled, quick JIT may also be used for methods that contain loops.")
+RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TC_AggressiveTiering, W("TC_AggressiveTiering"), 0, "Transition through tiers aggressively.")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_CallCountThreshold, W("TC_CallCountThreshold"), 30, "Number of times a method must be called in tier 0 after which it is promoted to the next tier.")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_CallCountingDelayMs, W("TC_CallCountingDelayMs"), 100, "A perpetual delay in milliseconds that is applied call counting in tier 0 and jitting at higher tiers, while there is startup-like activity.")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_DelaySingleProcMultiplier, W("TC_DelaySingleProcMultiplier"), 10, "Multiplier for TC_CallCountingDelayMs that is applied on a single-processor machine or when the process is affinitized to a single processor.")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_CallCounting, W("TC_CallCounting"), 1, "Enabled by default (only activates when TieredCompilation is also enabled). If disabled immediately backpatches prestub, and likely prevents any promotion to higher tiers")
+RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_UseCallCountingStubs, W("TC_UseCallCountingStubs"), 1, "Uses call counting stubs for faster call counting.")
+#ifdef _DEBUG
+RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_DeleteCallCountingStubsAfter, W("TC_DeleteCallCountingStubsAfter"), 1, "Deletes call counting stubs after this many have completed. Zero to disable deleting.")
+#else
+RETAIL_CONFIG_DWORD_INFO(INTERNAL_TC_DeleteCallCountingStubsAfter, W("TC_DeleteCallCountingStubsAfter"), 4096, "Deletes call counting stubs after this many have completed. Zero to disable deleting.")
+#endif
 #endif
 
 ///
index f497a54..79cf46b 100644 (file)
@@ -35,97 +35,97 @@ enum CrstType
     CrstCLRPrivBinderMaps = 16,
     CrstCLRPrivBinderMapsAdd = 17,
     CrstCodeFragmentHeap = 18,
-    CrstCOMCallWrapper = 19,
-    CrstCOMWrapperCache = 20,
-    CrstConnectionNameTable = 21,
-    CrstContexts = 22,
-    CrstCrstCLRPrivBinderLocalWinMDPath = 23,
-    CrstCSPCache = 24,
-    CrstDataTest1 = 25,
-    CrstDataTest2 = 26,
-    CrstDbgTransport = 27,
-    CrstDeadlockDetection = 28,
-    CrstDebuggerController = 29,
-    CrstDebuggerFavorLock = 30,
-    CrstDebuggerHeapExecMemLock = 31,
-    CrstDebuggerHeapLock = 32,
-    CrstDebuggerJitInfo = 33,
-    CrstDebuggerMutex = 34,
-    CrstDelegateToFPtrHash = 35,
-    CrstDomainLocalBlock = 36,
-    CrstDynamicIL = 37,
-    CrstDynamicMT = 38,
-    CrstDynLinkZapItems = 39,
-    CrstEtwTypeLogHash = 40,
-    CrstEventPipe = 41,
-    CrstEventStore = 42,
-    CrstException = 43,
-    CrstExecuteManLock = 44,
-    CrstExecuteManRangeLock = 45,
-    CrstFCall = 46,
-    CrstFriendAccessCache = 47,
-    CrstFuncPtrStubs = 48,
-    CrstFusionAppCtx = 49,
-    CrstGCCover = 50,
-    CrstGCMemoryPressure = 51,
-    CrstGlobalStrLiteralMap = 52,
-    CrstHandleTable = 53,
-    CrstHostAssemblyMap = 54,
-    CrstHostAssemblyMapAdd = 55,
-    CrstIbcProfile = 56,
-    CrstIJWFixupData = 57,
-    CrstIJWHash = 58,
-    CrstILStubGen = 59,
-    CrstInlineTrackingMap = 60,
-    CrstInstMethodHashTable = 61,
-    CrstInterfaceVTableMap = 62,
-    CrstInterop = 63,
-    CrstInteropData = 64,
-    CrstIOThreadpoolWorker = 65,
-    CrstIsJMCMethod = 66,
-    CrstISymUnmanagedReader = 67,
-    CrstJit = 68,
-    CrstJitGenericHandleCache = 69,
-    CrstJitInlineTrackingMap = 70,
-    CrstJitPerf = 71,
-    CrstJumpStubCache = 72,
-    CrstLeafLock = 73,
-    CrstListLock = 74,
-    CrstLoaderAllocator = 75,
-    CrstLoaderAllocatorReferences = 76,
-    CrstLoaderHeap = 77,
-    CrstMda = 78,
-    CrstMetadataTracker = 79,
-    CrstMethodDescBackpatchInfoTracker = 80,
-    CrstModIntPairList = 81,
-    CrstModule = 82,
-    CrstModuleFixup = 83,
-    CrstModuleLookupTable = 84,
-    CrstMulticoreJitHash = 85,
-    CrstMulticoreJitManager = 86,
-    CrstMUThunkHash = 87,
-    CrstNativeBinderInit = 88,
-    CrstNativeImageCache = 89,
-    CrstNls = 90,
-    CrstNotifyGdb = 91,
-    CrstObjectList = 92,
-    CrstOnEventManager = 93,
-    CrstPatchEntryPoint = 94,
-    CrstPEImage = 95,
-    CrstPEImagePDBStream = 96,
-    CrstPendingTypeLoadEntry = 97,
-    CrstPinHandle = 98,
-    CrstPinnedByrefValidation = 99,
-    CrstProfilerGCRefDataFreeList = 100,
-    CrstProfilingAPIStatus = 101,
-    CrstPublisherCertificate = 102,
-    CrstRCWCache = 103,
-    CrstRCWCleanupList = 104,
-    CrstRCWRefCache = 105,
-    CrstReadyToRunEntryPointToMethodDescMap = 106,
-    CrstReDacl = 107,
-    CrstReflection = 108,
-    CrstReJITDomainTable = 109,
+    CrstCodeVersioning = 19,
+    CrstCOMCallWrapper = 20,
+    CrstCOMWrapperCache = 21,
+    CrstConnectionNameTable = 22,
+    CrstContexts = 23,
+    CrstCrstCLRPrivBinderLocalWinMDPath = 24,
+    CrstCSPCache = 25,
+    CrstDataTest1 = 26,
+    CrstDataTest2 = 27,
+    CrstDbgTransport = 28,
+    CrstDeadlockDetection = 29,
+    CrstDebuggerController = 30,
+    CrstDebuggerFavorLock = 31,
+    CrstDebuggerHeapExecMemLock = 32,
+    CrstDebuggerHeapLock = 33,
+    CrstDebuggerJitInfo = 34,
+    CrstDebuggerMutex = 35,
+    CrstDelegateToFPtrHash = 36,
+    CrstDomainLocalBlock = 37,
+    CrstDynamicIL = 38,
+    CrstDynamicMT = 39,
+    CrstDynLinkZapItems = 40,
+    CrstEtwTypeLogHash = 41,
+    CrstEventPipe = 42,
+    CrstEventStore = 43,
+    CrstException = 44,
+    CrstExecuteManLock = 45,
+    CrstExecuteManRangeLock = 46,
+    CrstFCall = 47,
+    CrstFriendAccessCache = 48,
+    CrstFuncPtrStubs = 49,
+    CrstFusionAppCtx = 50,
+    CrstGCCover = 51,
+    CrstGCMemoryPressure = 52,
+    CrstGlobalStrLiteralMap = 53,
+    CrstHandleTable = 54,
+    CrstHostAssemblyMap = 55,
+    CrstHostAssemblyMapAdd = 56,
+    CrstIbcProfile = 57,
+    CrstIJWFixupData = 58,
+    CrstIJWHash = 59,
+    CrstILStubGen = 60,
+    CrstInlineTrackingMap = 61,
+    CrstInstMethodHashTable = 62,
+    CrstInterfaceVTableMap = 63,
+    CrstInterop = 64,
+    CrstInteropData = 65,
+    CrstIOThreadpoolWorker = 66,
+    CrstIsJMCMethod = 67,
+    CrstISymUnmanagedReader = 68,
+    CrstJit = 69,
+    CrstJitGenericHandleCache = 70,
+    CrstJitInlineTrackingMap = 71,
+    CrstJitPerf = 72,
+    CrstJumpStubCache = 73,
+    CrstLeafLock = 74,
+    CrstListLock = 75,
+    CrstLoaderAllocator = 76,
+    CrstLoaderAllocatorReferences = 77,
+    CrstLoaderHeap = 78,
+    CrstMda = 79,
+    CrstMetadataTracker = 80,
+    CrstMethodDescBackpatchInfoTracker = 81,
+    CrstModIntPairList = 82,
+    CrstModule = 83,
+    CrstModuleFixup = 84,
+    CrstModuleLookupTable = 85,
+    CrstMulticoreJitHash = 86,
+    CrstMulticoreJitManager = 87,
+    CrstMUThunkHash = 88,
+    CrstNativeBinderInit = 89,
+    CrstNativeImageCache = 90,
+    CrstNls = 91,
+    CrstNotifyGdb = 92,
+    CrstObjectList = 93,
+    CrstOnEventManager = 94,
+    CrstPatchEntryPoint = 95,
+    CrstPEImage = 96,
+    CrstPEImagePDBStream = 97,
+    CrstPendingTypeLoadEntry = 98,
+    CrstPinHandle = 99,
+    CrstPinnedByrefValidation = 100,
+    CrstProfilerGCRefDataFreeList = 101,
+    CrstProfilingAPIStatus = 102,
+    CrstPublisherCertificate = 103,
+    CrstRCWCache = 104,
+    CrstRCWCleanupList = 105,
+    CrstRCWRefCache = 106,
+    CrstReadyToRunEntryPointToMethodDescMap = 107,
+    CrstReDacl = 108,
+    CrstReflection = 109,
     CrstReJITGlobalRequest = 110,
     CrstRemoting = 111,
     CrstRetThunkCache = 112,
@@ -179,158 +179,158 @@ enum CrstType
 // An array mapping CrstType to level.
 int g_rgCrstLevelMap[] =
 {
-    9,          // CrstAllowedFiles
-    9,          // CrstAppDomainCache
-    13,         // CrstAppDomainHandleTable
-    0,          // CrstArgBasedStubCache
-    0,          // CrstAssemblyDependencyGraph
-    0,          // CrstAssemblyIdentityCache
-    0,          // CrstAssemblyList
-    7,          // CrstAssemblyLoader
-    3,          // CrstAvailableClass
-    4,          // CrstAvailableParamTypes
-    7,          // CrstBaseDomain
-    -1,         // CrstCCompRC
-    9,          // CrstCer
-    12,         // CrstClassFactInfoHash
-    8,          // CrstClassInit
-    -1,         // CrstClrNotification
-    0,          // CrstCLRPrivBinderMaps
-    3,          // CrstCLRPrivBinderMapsAdd
-    6,          // CrstCodeFragmentHeap
-    0,          // CrstCOMCallWrapper
-    4,          // CrstCOMWrapperCache
-    0,          // CrstConnectionNameTable
-    16,         // CrstContexts
-    0,          // CrstCrstCLRPrivBinderLocalWinMDPath
-    7,          // CrstCSPCache
-    3,          // CrstDataTest1
-    0,          // CrstDataTest2
-    0,          // CrstDbgTransport
-    0,          // CrstDeadlockDetection
-    -1,         // CrstDebuggerController
-    3,          // CrstDebuggerFavorLock
-    0,          // CrstDebuggerHeapExecMemLock
-    0,          // CrstDebuggerHeapLock
-    4,          // CrstDebuggerJitInfo
-    10,         // CrstDebuggerMutex
-    0,          // CrstDelegateToFPtrHash
-    15,         // CrstDomainLocalBlock
-    0,          // CrstDynamicIL
-    3,          // CrstDynamicMT
-    3,          // CrstDynLinkZapItems
-    7,          // CrstEtwTypeLogHash
-    17,         // CrstEventPipe
-    0,          // CrstEventStore
-    0,          // CrstException
-    7,          // CrstExecuteManLock
-    0,          // CrstExecuteManRangeLock
-    3,          // CrstFCall
-    7,          // CrstFriendAccessCache
-    7,          // CrstFuncPtrStubs
-    5,          // CrstFusionAppCtx
-    10,         // CrstGCCover
-    0,          // CrstGCMemoryPressure
-    12,         // CrstGlobalStrLiteralMap
-    1,          // CrstHandleTable
-    0,          // CrstHostAssemblyMap
-    3,          // CrstHostAssemblyMapAdd
-    0,          // CrstIbcProfile
-    9,          // CrstIJWFixupData
-    0,          // CrstIJWHash
-    7,          // CrstILStubGen
-    3,          // CrstInlineTrackingMap
-    16,         // CrstInstMethodHashTable
-    0,          // CrstInterfaceVTableMap
-    17,         // CrstInterop
-    4,          // CrstInteropData
-    12,         // CrstIOThreadpoolWorker
-    0,          // CrstIsJMCMethod
-    7,          // CrstISymUnmanagedReader
-    8,          // CrstJit
-    0,          // CrstJitGenericHandleCache
-    15,         // CrstJitInlineTrackingMap
-    -1,         // CrstJitPerf
-    6,          // CrstJumpStubCache
-    0,          // CrstLeafLock
-    -1,         // CrstListLock
-    14,         // CrstLoaderAllocator
-    15,         // CrstLoaderAllocatorReferences
-    0,          // CrstLoaderHeap
-    0,          // CrstMda
-    -1,         // CrstMetadataTracker
-    13,         // CrstMethodDescBackpatchInfoTracker
-    0,          // CrstModIntPairList
-    4,          // CrstModule
-    14,         // CrstModuleFixup
-    3,          // CrstModuleLookupTable
-    0,          // CrstMulticoreJitHash
-    12,         // CrstMulticoreJitManager
-    0,          // CrstMUThunkHash
-    -1,         // CrstNativeBinderInit
-    -1,         // CrstNativeImageCache
-    0,          // CrstNls
-    0,          // CrstNotifyGdb
-    2,          // CrstObjectList
-    0,          // CrstOnEventManager
-    0,          // CrstPatchEntryPoint
-    4,          // CrstPEImage
-    0,          // CrstPEImagePDBStream
-    18,         // CrstPendingTypeLoadEntry
-    0,          // CrstPinHandle
-    0,          // CrstPinnedByrefValidation
-    0,          // CrstProfilerGCRefDataFreeList
-    0,          // CrstProfilingAPIStatus
-    0,          // CrstPublisherCertificate
-    3,          // CrstRCWCache
-    0,          // CrstRCWCleanupList
-    3,          // CrstRCWRefCache
-    4,          // CrstReadyToRunEntryPointToMethodDescMap
-    0,          // CrstReDacl
-    9,          // CrstReflection
-    9,          // CrstReJITDomainTable
-    16,         // CrstReJITGlobalRequest
-    19,         // CrstRemoting
-    3,          // CrstRetThunkCache
-    0,          // CrstRWLock
-    3,          // CrstSavedExceptionInfo
-    0,          // CrstSaveModuleProfileData
-    0,          // CrstSecurityStackwalkCache
-    4,          // CrstSharedAssemblyCreate
-    3,          // CrstSigConvert
-    5,          // CrstSingleUseLock
-    0,          // CrstSpecialStatics
-    0,          // CrstSqmManager
-    0,          // CrstStackSampler
-    -1,         // CrstStressLog
-    0,          // CrstStrongName
-    5,          // CrstStubCache
-    0,          // CrstStubDispatchCache
-    4,          // CrstStubUnwindInfoHeapSegments
-    3,          // CrstSyncBlockCache
-    0,          // CrstSyncHashLock
-    4,          // CrstSystemBaseDomain
-    12,         // CrstSystemDomain
-    0,          // CrstSystemDomainDelayedUnloadList
-    0,          // CrstThreadIdDispenser
-    0,          // CrstThreadpoolEventCache
-    7,          // CrstThreadpoolTimerQueue
-    7,          // CrstThreadpoolWaitThreads
-    12,         // CrstThreadpoolWorker
-    4,          // CrstThreadStaticDataHashTable
-    11,         // CrstThreadStore
-    9,          // CrstTieredCompilation
-    9,          // CrstTPMethodTable
-    3,          // CrstTypeEquivalenceMap
-    7,          // CrstTypeIDMap
-    3,          // CrstUMEntryThunkCache
-    0,          // CrstUMThunkHash
-    3,          // CrstUniqueStack
-    7,          // CrstUnresolvedClassLock
-    3,          // CrstUnwindInfoTableLock
-    3,          // CrstVSDIndirectionCellLock
-    3,          // CrstWinRTFactoryCache
-    3,          // CrstWrapperTemplate
+    9,                 // CrstAllowedFiles
+    9,                 // CrstAppDomainCache
+    14,                        // CrstAppDomainHandleTable
+    0,                 // CrstArgBasedStubCache
+    0,                 // CrstAssemblyDependencyGraph
+    0,                 // CrstAssemblyIdentityCache
+    0,                 // CrstAssemblyList
+    7,                 // CrstAssemblyLoader
+    3,                 // CrstAvailableClass
+    4,                 // CrstAvailableParamTypes
+    7,                 // CrstBaseDomain
+    -1,                        // CrstCCompRC
+    9,                 // CrstCer
+    13,                        // CrstClassFactInfoHash
+    8,                 // CrstClassInit
+    -1,                        // CrstClrNotification
+    0,                 // CrstCLRPrivBinderMaps
+    3,                 // CrstCLRPrivBinderMapsAdd
+    6,                 // CrstCodeFragmentHeap
+    10,                        // CrstCodeVersioning
+    0,                 // CrstCOMCallWrapper
+    4,                 // CrstCOMWrapperCache
+    0,                 // CrstConnectionNameTable
+    17,                        // CrstContexts
+    0,                 // CrstCrstCLRPrivBinderLocalWinMDPath
+    7,                 // CrstCSPCache
+    3,                 // CrstDataTest1
+    0,                 // CrstDataTest2
+    0,                 // CrstDbgTransport
+    0,                 // CrstDeadlockDetection
+    -1,                        // CrstDebuggerController
+    3,                 // CrstDebuggerFavorLock
+    0,                 // CrstDebuggerHeapExecMemLock
+    0,                 // CrstDebuggerHeapLock
+    4,                 // CrstDebuggerJitInfo
+    11,                        // CrstDebuggerMutex
+    0,                 // CrstDelegateToFPtrHash
+    16,                        // CrstDomainLocalBlock
+    0,                 // CrstDynamicIL
+    3,                 // CrstDynamicMT
+    3,                 // CrstDynLinkZapItems
+    7,                 // CrstEtwTypeLogHash
+    18,                        // CrstEventPipe
+    0,                 // CrstEventStore
+    0,                 // CrstException
+    7,                 // CrstExecuteManLock
+    0,                 // CrstExecuteManRangeLock
+    3,                 // CrstFCall
+    7,                 // CrstFriendAccessCache
+    7,                 // CrstFuncPtrStubs
+    5,                 // CrstFusionAppCtx
+    11,                        // CrstGCCover
+    0,                 // CrstGCMemoryPressure
+    13,                        // CrstGlobalStrLiteralMap
+    1,                 // CrstHandleTable
+    0,                 // CrstHostAssemblyMap
+    3,                 // CrstHostAssemblyMapAdd
+    0,                 // CrstIbcProfile
+    9,                 // CrstIJWFixupData
+    0,                 // CrstIJWHash
+    7,                 // CrstILStubGen
+    3,                 // CrstInlineTrackingMap
+    17,                        // CrstInstMethodHashTable
+    0,                 // CrstInterfaceVTableMap
+    18,                        // CrstInterop
+    4,                 // CrstInteropData
+    13,                        // CrstIOThreadpoolWorker
+    0,                 // CrstIsJMCMethod
+    7,                 // CrstISymUnmanagedReader
+    8,                 // CrstJit
+    0,                 // CrstJitGenericHandleCache
+    16,                        // CrstJitInlineTrackingMap
+    -1,                        // CrstJitPerf
+    6,                 // CrstJumpStubCache
+    0,                 // CrstLeafLock
+    -1,                        // CrstListLock
+    15,                        // CrstLoaderAllocator
+    16,                        // CrstLoaderAllocatorReferences
+    0,                 // CrstLoaderHeap
+    0,                 // CrstMda
+    -1,                        // CrstMetadataTracker
+    14,                        // CrstMethodDescBackpatchInfoTracker
+    0,                 // CrstModIntPairList
+    4,                 // CrstModule
+    15,                        // CrstModuleFixup
+    3,                 // CrstModuleLookupTable
+    0,                 // CrstMulticoreJitHash
+    13,                        // CrstMulticoreJitManager
+    0,                 // CrstMUThunkHash
+    -1,                        // CrstNativeBinderInit
+    -1,                        // CrstNativeImageCache
+    0,                 // CrstNls
+    0,                 // CrstNotifyGdb
+    2,                 // CrstObjectList
+    0,                 // CrstOnEventManager
+    0,                 // CrstPatchEntryPoint
+    4,                 // CrstPEImage
+    0,                 // CrstPEImagePDBStream
+    19,                        // CrstPendingTypeLoadEntry
+    0,                 // CrstPinHandle
+    0,                 // CrstPinnedByrefValidation
+    0,                 // CrstProfilerGCRefDataFreeList
+    0,                 // CrstProfilingAPIStatus
+    0,                 // CrstPublisherCertificate
+    3,                 // CrstRCWCache
+    0,                 // CrstRCWCleanupList
+    3,                 // CrstRCWRefCache
+    4,                 // CrstReadyToRunEntryPointToMethodDescMap
+    0,                 // CrstReDacl
+    9,                 // CrstReflection
+    17,                        // CrstReJITGlobalRequest
+    20,                        // CrstRemoting
+    3,                 // CrstRetThunkCache
+    0,                 // CrstRWLock
+    3,                 // CrstSavedExceptionInfo
+    0,                 // CrstSaveModuleProfileData
+    0,                 // CrstSecurityStackwalkCache
+    4,                 // CrstSharedAssemblyCreate
+    3,                 // CrstSigConvert
+    5,                 // CrstSingleUseLock
+    0,                 // CrstSpecialStatics
+    0,                 // CrstSqmManager
+    0,                 // CrstStackSampler
+    -1,                        // CrstStressLog
+    0,                 // CrstStrongName
+    5,                 // CrstStubCache
+    0,                 // CrstStubDispatchCache
+    4,                 // CrstStubUnwindInfoHeapSegments
+    3,                 // CrstSyncBlockCache
+    0,                 // CrstSyncHashLock
+    4,                 // CrstSystemBaseDomain
+    13,                        // CrstSystemDomain
+    0,                 // CrstSystemDomainDelayedUnloadList
+    0,                 // CrstThreadIdDispenser
+    0,                 // CrstThreadpoolEventCache
+    7,                 // CrstThreadpoolTimerQueue
+    7,                 // CrstThreadpoolWaitThreads
+    13,                        // CrstThreadpoolWorker
+    4,                 // CrstThreadStaticDataHashTable
+    12,                        // CrstThreadStore
+    9,                 // CrstTieredCompilation
+    9,                 // CrstTPMethodTable
+    3,                 // CrstTypeEquivalenceMap
+    7,                 // CrstTypeIDMap
+    3,                 // CrstUMEntryThunkCache
+    0,                 // CrstUMThunkHash
+    3,                 // CrstUniqueStack
+    7,                 // CrstUnresolvedClassLock
+    3,                 // CrstUnwindInfoTableLock
+    3,                 // CrstVSDIndirectionCellLock
+    3,                 // CrstWinRTFactoryCache
+    3,                 // CrstWrapperTemplate
 };
 
 // An array mapping CrstType to a stringized name.
@@ -355,6 +355,7 @@ LPCSTR g_rgCrstNameMap[] =
     "CrstCLRPrivBinderMaps",
     "CrstCLRPrivBinderMapsAdd",
     "CrstCodeFragmentHeap",
+    "CrstCodeVersioning",
     "CrstCOMCallWrapper",
     "CrstCOMWrapperCache",
     "CrstConnectionNameTable",
@@ -445,7 +446,6 @@ LPCSTR g_rgCrstNameMap[] =
     "CrstReadyToRunEntryPointToMethodDescMap",
     "CrstReDacl",
     "CrstReflection",
-    "CrstReJITDomainTable",
     "CrstReJITGlobalRequest",
     "CrstRemoting",
     "CrstRetThunkCache",
index 1b397de..0f1cac9 100644 (file)
@@ -95,6 +95,7 @@ DEFINE_DACVAR(ULONG, PTR_JumpStubStubManager, JumpStubStubManager__g_pManager, J
 DEFINE_DACVAR(ULONG, PTR_RangeSectionStubManager, RangeSectionStubManager__g_pManager, RangeSectionStubManager::g_pManager)
 DEFINE_DACVAR(ULONG, PTR_DelegateInvokeStubManager, DelegateInvokeStubManager__g_pManager, DelegateInvokeStubManager::g_pManager)
 DEFINE_DACVAR(ULONG, PTR_VirtualCallStubManagerManager, VirtualCallStubManagerManager__g_pManager, VirtualCallStubManagerManager::g_pManager)
+DEFINE_DACVAR(ULONG, PTR_CallCountingStubManager, CallCountingStubManager__g_pManager, CallCountingStubManager::g_pManager)
 
 DEFINE_DACVAR(ULONG, PTR_ThreadStore, ThreadStore__s_pThreadStore, ThreadStore::s_pThreadStore)
 
index c9ddf52..97d24f6 100644 (file)
@@ -439,17 +439,17 @@ public:
     LoaderHeap(DWORD dwReserveBlockSize,
                DWORD dwCommitBlockSize,
                RangeList *pRangeList = NULL,
-               BOOL fMakeExecutable = FALSE
+               BOOL fMakeExecutable = FALSE,
+               BOOL fUnlocked = FALSE
                )
       : UnlockedLoaderHeap(dwReserveBlockSize,
                            dwCommitBlockSize,
                            NULL, 0,
                            pRangeList,
-                           fMakeExecutable)
+                           fMakeExecutable),
+        m_CriticalSection(fUnlocked ? NULL : CreateLoaderHeapLock())
     {
         WRAPPER_NO_CONTRACT;
-        m_CriticalSection = NULL;
-        m_CriticalSection = CreateLoaderHeapLock();
         m_fExplicitControl = FALSE;
     }
 
@@ -459,18 +459,18 @@ public:
                const BYTE* dwReservedRegionAddress,
                SIZE_T dwReservedRegionSize,
                RangeList *pRangeList = NULL,
-               BOOL fMakeExecutable = FALSE
+               BOOL fMakeExecutable = FALSE,
+               BOOL fUnlocked = FALSE
                )
       : UnlockedLoaderHeap(dwReserveBlockSize,
                            dwCommitBlockSize,
                            dwReservedRegionAddress,
                            dwReservedRegionSize,
                            pRangeList,
-                           fMakeExecutable)
+                           fMakeExecutable),
+        m_CriticalSection(fUnlocked ? NULL : CreateLoaderHeapLock())
     {
         WRAPPER_NO_CONTRACT;
-        m_CriticalSection = NULL;
-        m_CriticalSection = CreateLoaderHeapLock();
         m_fExplicitControl = FALSE;
     }
 
index 6452140..d71d721 100644 (file)
@@ -117,10 +117,10 @@ class DefaultSHashTraits
 
     static const bool s_supports_remove = true;
 
-    static ELEMENT Null() { return (const ELEMENT) 0; }
-    static ELEMENT Deleted() { return (const ELEMENT) -1; }
-    static bool IsNull(const ELEMENT &e) { return e == (const ELEMENT) 0; }
-    static bool IsDeleted(const ELEMENT &e) { return e == (const ELEMENT) -1; }
+    static ELEMENT Null() { return (ELEMENT)(TADDR)0; }
+    static ELEMENT Deleted() { return (ELEMENT)(TADDR)-1; }
+    static bool IsNull(const ELEMENT &e) { return e == (ELEMENT)(TADDR)0; }
+    static bool IsDeleted(const ELEMENT &e) { return e == (ELEMENT)(TADDR)-1; }
 
     static inline void OnDestructPerEntryCleanupAction(const ELEMENT& e) { /* Do nothing */ }
     static const bool s_DestructPerEntryCleanupAction = false;
@@ -219,6 +219,10 @@ class SHash : public TRAITS
 
     count_t GetCount() const;
 
+    // Return the number of elements allocated in the table
+
+    count_t GetCapacity() const;
+
     // Resizes a hash table for growth.  The new size is computed based
     // on the current population, growth factor, and maximum density factor.
 
index 88e07a3..d0d6e1a 100644 (file)
@@ -52,6 +52,14 @@ typename SHash<TRAITS>::count_t SHash<TRAITS>::GetCount() const
 }
 
 template <typename TRAITS>
+typename SHash<TRAITS>::count_t SHash<TRAITS>::GetCapacity() const
+{
+    LIMITED_METHOD_CONTRACT;
+
+    return m_tableMax;
+}
+
+template <typename TRAITS>
 typename SHash<TRAITS>::element_t SHash<TRAITS>::Lookup(key_t key) const
 {
     CONTRACT(element_t)
index ee87d78..e3ee771 100644 (file)
@@ -40,6 +40,7 @@ VPTR_CLASS(ILStubManager)
 VPTR_CLASS(InteropDispatchStubManager)
 VPTR_CLASS(DelegateInvokeStubManager)
 VPTR_CLASS(TailCallStubManager)
+VPTR_CLASS(CallCountingStubManager)
 VPTR_CLASS(PEFile)
 VPTR_CLASS(PEAssembly)
 VPTR_CLASS(PEImageLayout)
@@ -84,6 +85,7 @@ VPTR_CLASS(ResumableFrame)
 VPTR_CLASS(RedirectedThreadFrame)
 #endif
 VPTR_CLASS(StubDispatchFrame)
+VPTR_CLASS(CallCountingHelperFrame)
 VPTR_CLASS(ExternalMethodFrame)
 #ifdef FEATURE_READYTORUN
 VPTR_CLASS(DynamicHelperFrame)
index 5e8978b..9ca927d 100644 (file)
@@ -41,7 +41,7 @@ set(VM_SOURCES_DAC_AND_WKS_COMMON
     baseassemblyspec.cpp
     binder.cpp
     castcache.cpp
-    callcounter.cpp
+    callcounting.cpp
     ceeload.cpp
     class.cpp
     classhash.cpp
@@ -143,6 +143,7 @@ set(VM_HEADERS_DAC_AND_WKS_COMMON
     baseassemblyspec.inl
     binder.h
     castcache.h
+    callcounting.h
     ceeload.h
     ceeload.inl
     class.h
@@ -409,7 +410,6 @@ set(VM_HEADERS_WKS
     assemblyspec.hpp
     assemblyspecbase.h
     cachelinealloc.h
-    callcounter.h
     callhelpers.h
     callsiteinspect.h
     ceemain.h
index 32b7ab6..f65c6f0 100644 (file)
@@ -760,5 +760,28 @@ NullObject:
 
 LEAF_END SinglecastDelegateInvokeStub, _TEXT
 
+ifdef FEATURE_TIERED_COMPILATION
+
+extern OnCallCountThresholdReached:proc
+
+LEAF_ENTRY OnCallCountThresholdReachedStub, _TEXT
+        ; Pop the return address (the stub-identifying token) into a non-argument volatile register that can be trashed
+        pop     rax
+        jmp     OnCallCountThresholdReachedStub2
+LEAF_END OnCallCountThresholdReachedStub, _TEXT
+
+NESTED_ENTRY OnCallCountThresholdReachedStub2, _TEXT
+        PROLOG_WITH_TRANSITION_BLOCK
+
+        lea     rcx, [rsp + __PWTB_TransitionBlock] ; TransitionBlock *
+        mov     rdx, rax ; stub-identifying token, see OnCallCountThresholdReachedStub
+        call    OnCallCountThresholdReached
+
+        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+        TAILJMP_RAX
+NESTED_END OnCallCountThresholdReachedStub2, _TEXT
+
+endif ; FEATURE_TIERED_COMPILATION
+
         end
 
index 53f7323..3638271 100644 (file)
@@ -530,4 +530,334 @@ inline BOOL ClrFlushInstructionCache(LPCVOID pCodeAddr, size_t sizeOfCode)
 
 #define JIT_Stelem_Ref              JIT_Stelem_Ref
 
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// Call counting
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+#define DISABLE_COPY(T) \
+    T(const T &) = delete; \
+    T &operator =(const T &) = delete
+
+typedef UINT16 CallCount;
+typedef DPTR(CallCount) PTR_CallCount;
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub
+
+class CallCountingStub;
+typedef DPTR(const CallCountingStub) PTR_CallCountingStub;
+
+class CallCountingStub
+{
+public:
+    static const SIZE_T Alignment = sizeof(void *);
+
+#ifndef DACCESS_COMPILE
+protected:
+    static const PCODE TargetForThresholdReached;
+
+    CallCountingStub() = default;
+
+public:
+    static const CallCountingStub *From(TADDR stubIdentifyingToken);
+
+    PCODE GetEntryPoint() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return PINSTRToPCODE((TADDR)this);
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    PTR_CallCount GetRemainingCallCountCell() const;
+    PCODE GetTargetForMethod() const;
+
+#ifndef DACCESS_COMPILE
+protected:
+    template<class T> static INT_PTR GetRelativeOffset(const T *relRef, PCODE target)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(T) != 0);
+        static_assert_no_msg(sizeof(T) <= sizeof(void *));
+        static_assert_no_msg((sizeof(T) & (sizeof(T) - 1)) == 0); // is a power of 2
+        _ASSERTE(relRef != nullptr);
+
+        TADDR targetAddress = PCODEToPINSTR(target);
+        _ASSERTE(targetAddress != NULL);
+        return (INT_PTR)targetAddress - (INT_PTR)(relRef + 1);
+    }
+#endif
+
+protected:
+    template<class T> static PCODE GetTarget(const T *relRef)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);
+        _ASSERTE(relRef != nullptr);
+
+        return PINSTRToPCODE((INT_PTR)(relRef + 1) + *relRef);
+    }
+
+    DISABLE_COPY(CallCountingStub);
+};
+
+////////////////////////////////////////////////////////////////
+// CallCountingStubShort
+
+class CallCountingStubShort;
+typedef DPTR(const CallCountingStubShort) PTR_CallCountingStubShort;
+class CallCountingStubLong;
+typedef DPTR(const CallCountingStubLong) PTR_CallCountingStubLong;
+
+#pragma pack(push, 1)
+class CallCountingStubShort : public CallCountingStub
+{
+private:
+    const UINT8 m_part0[2];
+    CallCount *const m_remainingCallCountCell;
+    const UINT8 m_part1[5];
+    const INT32 m_rel32TargetForMethod;
+    const UINT8 m_part2[1];
+    const INT32 m_rel32TargetForThresholdReached;
+    const UINT8 m_alignmentPadding[0];
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubShort(CallCount *remainingCallCountCell, PCODE targetForMethod)
+        : m_part0{                                              0x48, 0xb8},            //     mov  rax,
+        m_remainingCallCountCell(remainingCallCountCell),                               //               <imm64>
+        m_part1{                                                0x66, 0xff, 0x08,       //     dec  word ptr [rax]
+                                                                0x0f, 0x85},            //     jnz  
+        m_rel32TargetForMethod(                                                         //          <rel32>
+            GetRelative32BitOffset(
+                &m_rel32TargetForMethod,
+                targetForMethod)),
+        m_part2{                                                0xe8},                  //     call
+        m_rel32TargetForThresholdReached(                                               //          <rel32>
+            GetRelative32BitOffset(
+                &m_rel32TargetForThresholdReached,
+                TargetForThresholdReached)),
+                                                                                        // (rip == stub-identifying token)
+        m_alignmentPadding{}
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(CallCountingStubShort) % Alignment == 0);
+        _ASSERTE(remainingCallCountCell != nullptr);
+        _ASSERTE(PCODEToPINSTR(targetForMethod) != NULL);
+    }
+
+    static bool Is(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg((offsetof(CallCountingStubShort, m_alignmentPadding[0]) & 1) == 0);
+
+        return (stubIdentifyingToken & 1) == 0;
+    }
+
+    static const CallCountingStubShort *From(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(stubIdentifyingToken));
+        _ASSERTE(stubIdentifyingToken % Alignment == offsetof(CallCountingStubShort, m_alignmentPadding[0]) % Alignment);
+
+        const CallCountingStubShort *stub =
+            (const CallCountingStubShort *)(stubIdentifyingToken - offsetof(CallCountingStubShort, m_alignmentPadding[0]));
+        _ASSERTE(IS_ALIGNED(stub, Alignment));
+        return stub;
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    static bool Is(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        return dac_cast<PTR_CallCountingStubShort>(callCountingStub)->m_part1[4] == 0x85;
+    }
+
+    static PTR_CallCountingStubShort From(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(callCountingStub));
+
+        return dac_cast<PTR_CallCountingStubShort>(callCountingStub);
+    }
+
+    PCODE GetTargetForMethod() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return GetTarget(&m_rel32TargetForMethod);
+    }
+
+#ifndef DACCESS_COMPILE
+private:
+    static bool CanUseRelative32BitOffset(const INT32 *rel32Ref, PCODE target)
+    {
+        WRAPPER_NO_CONTRACT;
+
+        INT_PTR relativeOffset = GetRelativeOffset(rel32Ref, target);
+        return (INT32)relativeOffset == relativeOffset;
+    }
+
+public:
+    static bool CanUseFor(const void *allocationAddress, PCODE targetForMethod)
+    {
+        WRAPPER_NO_CONTRACT;
+
+        const CallCountingStubShort *fakeStub = (const CallCountingStubShort *)allocationAddress;
+        return
+            CanUseRelative32BitOffset(&fakeStub->m_rel32TargetForMethod, targetForMethod) &&
+            CanUseRelative32BitOffset(&fakeStub->m_rel32TargetForThresholdReached, TargetForThresholdReached);
+    }
+
+private:
+    static INT32 GetRelative32BitOffset(const INT32 *rel32Ref, PCODE target)
+    {
+        WRAPPER_NO_CONTRACT;
+
+        INT_PTR relativeOffset = GetRelativeOffset(rel32Ref, target);
+        _ASSERTE((INT32)relativeOffset == relativeOffset);
+        return (INT32)relativeOffset;
+    }
+#endif // !DACCESS_COMPILE
+
+    friend CallCountingStub;
+    friend CallCountingStubLong;
+    DISABLE_COPY(CallCountingStubShort);
+};
+#pragma pack(pop)
+
+////////////////////////////////////////////////////////////////
+// CallCountingStubLong
+
+#pragma pack(push, 1)
+class CallCountingStubLong : public CallCountingStub
+{
+private:
+    const UINT8 m_part0[2];
+    CallCount *const m_remainingCallCountCell;
+    const UINT8 m_part1[7];
+    const PCODE m_targetForMethod;
+    const UINT8 m_part2[4];
+    const PCODE m_targetForThresholdReached;
+    const UINT8 m_part3[2];
+    const UINT8 m_alignmentPadding[1];
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubLong(CallCount *remainingCallCountCell, PCODE targetForMethod)
+        : m_part0{                                              0x48, 0xb8},            //     mov  rax,
+        m_remainingCallCountCell(remainingCallCountCell),                               //               <imm64>
+        m_part1{                                                0x66, 0xff, 0x08,       //     dec  word ptr [rax]
+                                                                0x74, 0x0c,             //     jz   L0
+                                                                0x48, 0xb8},            //     mov  rax,
+        m_targetForMethod(targetForMethod),                                             //               <imm64>
+        m_part2{                                                0xff, 0xe0,             //     jmp  rax
+                                                                0x48, 0xb8},            // L0: mov  rax,
+        m_targetForThresholdReached(TargetForThresholdReached),                         //               <imm64>
+        m_part3{                                                0xff, 0xd0},            //     call rax
+                                                                                        // (rip == stub-identifying token)
+        m_alignmentPadding{                                     0xcc}                   //     int  3
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(CallCountingStubLong) % Alignment == 0);
+        static_assert_no_msg(sizeof(CallCountingStubLong) > sizeof(CallCountingStubShort));
+        _ASSERTE(remainingCallCountCell != nullptr);
+        _ASSERTE(PCODEToPINSTR(targetForMethod) != NULL);
+    }
+
+    static bool Is(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg((offsetof(CallCountingStubLong, m_alignmentPadding[0]) & 1) != 0);
+
+        return (stubIdentifyingToken & 1) != 0;
+    }
+
+    static const CallCountingStubLong *From(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(stubIdentifyingToken));
+        _ASSERTE(stubIdentifyingToken % Alignment == offsetof(CallCountingStubLong, m_alignmentPadding[0]) % Alignment);
+
+        const CallCountingStubLong *stub =
+            (const CallCountingStubLong *)(stubIdentifyingToken - offsetof(CallCountingStubLong, m_alignmentPadding[0]));
+        _ASSERTE(IS_ALIGNED(stub, Alignment));
+        return stub;
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    static bool Is(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(offsetof(CallCountingStubShort, m_part1[4]) == offsetof(CallCountingStubLong, m_part1[4]));
+        static_assert_no_msg(sizeof(CallCountingStubShort::m_part1[4]) == sizeof(CallCountingStubLong::m_part1[4]));
+
+        return dac_cast<PTR_CallCountingStubLong>(callCountingStub)->m_part1[4] == 0x0c;
+    }
+
+    static PTR_CallCountingStubLong From(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(callCountingStub));
+
+        return dac_cast<PTR_CallCountingStubLong>(callCountingStub);
+    }
+
+    PCODE GetTargetForMethod() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return m_targetForMethod;
+    }
+
+    friend CallCountingStub;
+    DISABLE_COPY(CallCountingStubLong);
+};
+#pragma pack(pop)
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub definitions
+
+#ifndef DACCESS_COMPILE
+inline const CallCountingStub *CallCountingStub::From(TADDR stubIdentifyingToken)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(stubIdentifyingToken != NULL);
+
+    return
+        CallCountingStubShort::Is(stubIdentifyingToken)
+            ? (const CallCountingStub *)CallCountingStubShort::From(stubIdentifyingToken)
+            : (const CallCountingStub *)CallCountingStubLong::From(stubIdentifyingToken);
+}
+#endif
+
+inline PTR_CallCount CallCountingStub::GetRemainingCallCountCell() const
+{
+    WRAPPER_NO_CONTRACT;
+    static_assert_no_msg(
+        offsetof(CallCountingStubShort, m_remainingCallCountCell) ==
+        offsetof(CallCountingStubLong, m_remainingCallCountCell));
+
+    return PTR_CallCount(dac_cast<PTR_CallCountingStubShort>(this)->m_remainingCallCountCell);
+}
+
+inline PCODE CallCountingStub::GetTargetForMethod() const
+{
+    WRAPPER_NO_CONTRACT;
+
+    return
+        CallCountingStubShort::Is(PTR_CallCountingStub(this))
+            ? CallCountingStubShort::From(PTR_CallCountingStub(this))->GetTargetForMethod()
+            : CallCountingStubLong::From(PTR_CallCountingStub(this))->GetTargetForMethod();
+}
+
+////////////////////////////////////////////////////////////////
+
+#undef DISABLE_COPY
+
+#endif // FEATURE_TIERED_COMPILATION
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
 #endif // __cgencpu_h__
index e7a983b..7ec5640 100644 (file)
@@ -229,3 +229,24 @@ NullObject:
         jmp     C_FUNC(JIT_InternalThrow)
 
 LEAF_END SinglecastDelegateInvokeStub, _TEXT
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+LEAF_ENTRY OnCallCountThresholdReachedStub, _TEXT
+        // Pop the return address (the stub-identifying token) into a non-argument volatile register that can be trashed
+        pop     rax
+        jmp     C_FUNC(OnCallCountThresholdReachedStub2)
+LEAF_END OnCallCountThresholdReachedStub, _TEXT
+
+NESTED_ENTRY OnCallCountThresholdReachedStub2, _TEXT, NoHandler
+        PROLOG_WITH_TRANSITION_BLOCK
+
+        lea     rdi, [rsp + __PWTB_TransitionBlock] // TransitionBlock *
+        mov     rsi, rax // stub-identifying token, see OnCallCountThresholdReachedStub
+        call    C_FUNC(OnCallCountThresholdReached)
+
+        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+        TAILJMP_RAX
+NESTED_END OnCallCountThresholdReachedStub2, _TEXT
+
+#endif // FEATURE_TIERED_COMPILATION
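
The division of labor between the two stubs above, restated as annotated steps (the helper's signature is the one declared later in callcounting.cpp in this same change):

    // 1. A call counting stub reaches this code via `call`, so the return address
    //    on the stack points back into that stub; `pop rax` captures it as the
    //    stub-identifying token without disturbing any argument registers.
    // 2. OnCallCountThresholdReachedStub2 spills the argument registers into a
    //    TransitionBlock (PROLOG_WITH_TRANSITION_BLOCK) and calls the helper:
    //
    //        extern "C" PCODE STDCALL OnCallCountThresholdReached(
    //            TransitionBlock *transitionBlock, TADDR stubIdentifyingToken);
    //
    // 3. The helper returns the code entry point to execute; the epilog restores
    //    the argument registers and TAILJMP_RAX branches there, so the method
    //    still runs with the caller's original arguments.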
index 0a61d8e..f3e7505 100644 (file)
@@ -669,11 +669,6 @@ BaseDomain::BaseDomain()
     m_JITLock.PreInit();
     m_ClassInitLock.PreInit();
     m_ILStubGenLock.PreInit();
-
-#ifdef FEATURE_CODE_VERSIONING
-    m_codeVersionManager.PreInit();
-#endif
-
 } //BaseDomain::BaseDomain
 
 //*****************************************************************************
@@ -1567,10 +1562,11 @@ void SystemDomain::Attach()
     ILStubManager::Init();
     InteropDispatchStubManager::Init();
     StubLinkStubManager::Init();
-
     ThunkHeapStubManager::Init();
-
     TailCallStubManager::Init();
+#ifdef FEATURE_TIERED_COMPILATION
+    CallCountingStubManager::Init();
+#endif
 
     PerAppDomainTPCountList::InitAppDomainIndexList();
 #endif // CROSSGEN_COMPILE
index 68e6f08..aa8bdca 100644 (file)
@@ -1485,3 +1485,19 @@ ProbeLoop:
         EPILOG_POP "{r7}"
         EPILOG_BRANCH_REG lr
         NESTED_END JIT_StackProbe, _TEXT
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+    NESTED_ENTRY OnCallCountThresholdReachedStub, _TEXT, NoHandler
+        PROLOG_WITH_TRANSITION_BLOCK
+
+        add     r0, sp, #__PWTB_TransitionBlock // TransitionBlock *
+        mov     r1, r12 // stub-identifying token
+        bl      C_FUNC(OnCallCountThresholdReached)
+        mov     r12, r0
+
+        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+        EPILOG_BRANCH_REG r12
+    NESTED_END OnCallCountThresholdReachedStub, _TEXT
+
+#endif // FEATURE_TIERED_COMPILATION
index 1df5420..4454626 100644 (file)
@@ -2180,5 +2180,23 @@ ProbeLoop
     EPILOG_BRANCH_REG lr
     NESTED_END
 
+#ifdef FEATURE_TIERED_COMPILATION
+
+    IMPORT OnCallCountThresholdReached
+
+    NESTED_ENTRY OnCallCountThresholdReachedStub
+        PROLOG_WITH_TRANSITION_BLOCK
+
+        add     r0, sp, #__PWTB_TransitionBlock ; TransitionBlock *
+        mov     r1, r12 ; stub-identifying token
+        bl      OnCallCountThresholdReached
+        mov     r12, r0
+
+        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+        EPILOG_BRANCH_REG r12
+    NESTED_END
+
+#endif ; FEATURE_TIERED_COMPILATION
+
 ; Must be at very end of file
     END
index 507be47..98a00f1 100644 (file)
@@ -1340,4 +1340,166 @@ inline size_t GetARMInstructionLength(PBYTE pInstr)
     return GetARMInstructionLength(*(WORD*)pInstr);
 }
 
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// Call counting
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+#define DISABLE_COPY(T) \
+    T(const T &) = delete; \
+    T &operator =(const T &) = delete
+
+typedef UINT16 CallCount;
+typedef DPTR(CallCount) PTR_CallCount;
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub
+
+class CallCountingStub;
+typedef DPTR(const CallCountingStub) PTR_CallCountingStub;
+
+class CallCountingStub
+{
+public:
+    static const SIZE_T Alignment = sizeof(void *);
+
+#ifndef DACCESS_COMPILE
+protected:
+    static const PCODE TargetForThresholdReached;
+
+    CallCountingStub() = default;
+
+public:
+    static const CallCountingStub *From(TADDR stubIdentifyingToken);
+
+    PCODE GetEntryPoint() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return PINSTRToPCODE((TADDR)this);
+    }
+#endif
+
+public:
+    PTR_CallCount GetRemainingCallCountCell() const;
+    PCODE GetTargetForMethod() const;
+
+    DISABLE_COPY(CallCountingStub);
+};
+
+////////////////////////////////////////////////////////////////
+// CallCountingStubShort
+
+class CallCountingStubShort;
+typedef DPTR(const CallCountingStubShort) PTR_CallCountingStubShort;
+
+#pragma pack(push, 1)
+class CallCountingStubShort : public CallCountingStub
+{
+private:
+    const UINT16 m_part0[16];
+    CallCount *const m_remainingCallCountCell;
+    const PCODE m_targetForMethod;
+    const PCODE m_targetForThresholdReached;
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubShort(CallCount *remainingCallCountCell, PCODE targetForMethod)
+        : m_part0{  0xb401,                 //     push {r0}
+                    0xf8df, 0xc01c,         //     ldr  r12, [pc, #(m_remainingCallCountCell)]
+                    0xf8bc, 0x0000,         //     ldrh r0, [r12]
+                    0x1e40,                 //     subs r0, r0, #1
+                    0xf8ac, 0x0000,         //     strh r0, [r12]
+                    0xbc01,                 //     pop  {r0}
+                    0xd001,                 //     beq  L0
+                    0xf8df, 0xf00c,         //     ldr  pc, [pc, #(m_targetForMethod)]
+                    0xf2af, 0x0c1c,         // L0: adr  r12, #(this)
+                                            // (r12 == stub-identifying token == this)
+                    0xf8df, 0xf008},        //     ldr  pc, [pc, #(m_targetForThresholdReached)]
+        m_remainingCallCountCell(remainingCallCountCell),
+        m_targetForMethod(targetForMethod),
+        m_targetForThresholdReached(TargetForThresholdReached)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(CallCountingStubShort) % Alignment == 0);
+        _ASSERTE(remainingCallCountCell != nullptr);
+        _ASSERTE(PCODEToPINSTR(targetForMethod) != NULL);
+    }
+
+    static bool Is(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static const CallCountingStubShort *From(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(stubIdentifyingToken));
+
+        const CallCountingStubShort *stub = (const CallCountingStubShort *)stubIdentifyingToken;
+        _ASSERTE(IS_ALIGNED(stub, Alignment));
+        return stub;
+    }
+#endif
+
+public:
+    static bool Is(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static PTR_CallCountingStubShort From(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(callCountingStub));
+
+        return dac_cast<PTR_CallCountingStubShort>(callCountingStub);
+    }
+
+    PCODE GetTargetForMethod() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return m_targetForMethod;
+    }
+
+    friend CallCountingStub;
+    DISABLE_COPY(CallCountingStubShort);
+};
+#pragma pack(pop)
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub definitions
+
+#ifndef DACCESS_COMPILE
+inline const CallCountingStub *CallCountingStub::From(TADDR stubIdentifyingToken)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(stubIdentifyingToken != NULL);
+
+    return CallCountingStubShort::From(stubIdentifyingToken);
+}
+#endif
+
+inline PTR_CallCount CallCountingStub::GetRemainingCallCountCell() const
+{
+    WRAPPER_NO_CONTRACT;
+    return PTR_CallCount(dac_cast<PTR_CallCountingStubShort>(this)->m_remainingCallCountCell);
+}
+
+inline PCODE CallCountingStub::GetTargetForMethod() const
+{
+    WRAPPER_NO_CONTRACT;
+    return CallCountingStubShort::From(PTR_CallCountingStub(this))->GetTargetForMethod();
+}
+
+////////////////////////////////////////////////////////////////
+
+#undef DISABLE_COPY
+
+#endif // FEATURE_TIERED_COMPILATION
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
 #endif // __cgencpu_h__
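
The hardcoded Thumb-2 sequence in `CallCountingStubShort` above is a tiny decrement-and-branch. In portable C++, the control flow it encodes looks like the following (illustrative only; `NextTarget` is a hypothetical name, and the real stub branches to the selected target itself via `ldr pc, ...`):

    #include <cstdint>

    typedef uint16_t CallCount;

    // Decrement the shared 16-bit cell and pick the branch target: while nonzero,
    // continue to the method's code; at zero, divert to the threshold-reached
    // helper (with the stub's own address in r12 as the stub-identifying token).
    uintptr_t NextTarget(CallCount *remainingCallCountCell,
                         uintptr_t targetForMethod,
                         uintptr_t targetForThresholdReached)
    {
        return --*remainingCallCountCell != 0 ? targetForMethod
                                              : targetForThresholdReached;
    }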
index 2ebcc6e..21b55ea 100644 (file)
@@ -1432,3 +1432,19 @@ GenerateProfileHelper ProfileLeave, PROFILE_LEAVE
 GenerateProfileHelper ProfileTailcall, PROFILE_TAILCALL
 
 #endif
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+NESTED_ENTRY OnCallCountThresholdReachedStub, _TEXT, NoHandler
+    PROLOG_WITH_TRANSITION_BLOCK
+
+    add     x0, sp, #__PWTB_TransitionBlock // TransitionBlock *
+    mov     x1, x10 // stub-identifying token
+    bl      C_FUNC(OnCallCountThresholdReached)
+    mov     x9, x0
+
+    EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+    EPILOG_BRANCH_REG x9
+NESTED_END OnCallCountThresholdReachedStub, _TEXT
+
+#endif // FEATURE_TIERED_COMPILATION
index 5648fa2..bf730eb 100644 (file)
@@ -1679,5 +1679,23 @@ __HelperNakedFuncName SETS "$helper":CC:"Naked"
 
 #endif
 
+#ifdef FEATURE_TIERED_COMPILATION
+
+    IMPORT OnCallCountThresholdReached
+
+    NESTED_ENTRY OnCallCountThresholdReachedStub
+        PROLOG_WITH_TRANSITION_BLOCK
+
+        add     x0, sp, #__PWTB_TransitionBlock ; TransitionBlock *
+        mov     x1, x10 ; stub-identifying token
+        bl      OnCallCountThresholdReached
+        mov     x9, x0
+
+        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
+        EPILOG_BRANCH_REG x9
+    NESTED_END
+
+#endif ; FEATURE_TIERED_COMPILATION
+
 ; Must be at very end of file
     END
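
Summarizing the helper implementations above: the stub-identifying token arrives in a different register on each architecture, and in each case the helper's return value (the code entry point to run) is tail-branched to:

    x64:    rax  (the leaf stub pops the caller's return address)
    arm32:  r12  (the call counting stub materializes its own address)
    arm64:  x10  (the call counting stub materializes its own address)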
index babda93..f29e045 100644 (file)
@@ -798,4 +798,165 @@ struct ThisPtrRetBufPrecode {
 };
 typedef DPTR(ThisPtrRetBufPrecode) PTR_ThisPtrRetBufPrecode;
 
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// Call counting
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+#define DISABLE_COPY(T) \
+    T(const T &) = delete; \
+    T &operator =(const T &) = delete
+
+typedef UINT16 CallCount;
+typedef DPTR(CallCount) PTR_CallCount;
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub
+
+class CallCountingStub;
+typedef DPTR(const CallCountingStub) PTR_CallCountingStub;
+
+class CallCountingStub
+{
+public:
+    static const SIZE_T Alignment = sizeof(void *);
+
+#ifndef DACCESS_COMPILE
+protected:
+    static const PCODE TargetForThresholdReached;
+
+    CallCountingStub() = default;
+
+public:
+    static const CallCountingStub *From(TADDR stubIdentifyingToken);
+
+    PCODE GetEntryPoint() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return PINSTRToPCODE((TADDR)this);
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    PTR_CallCount GetRemainingCallCountCell() const;
+    PCODE GetTargetForMethod() const;
+
+    DISABLE_COPY(CallCountingStub);
+};
+
+////////////////////////////////////////////////////////////////
+// CallCountingStubShort
+
+class CallCountingStubShort;
+typedef DPTR(const CallCountingStubShort) PTR_CallCountingStubShort;
+
+#pragma pack(push, 1)
+class CallCountingStubShort : public CallCountingStub
+{
+private:
+    const UINT32 m_part0[10];
+    CallCount *const m_remainingCallCountCell;
+    const PCODE m_targetForMethod;
+    const PCODE m_targetForThresholdReached;
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubShort(CallCount *remainingCallCountCell, PCODE targetForMethod)
+        : m_part0{  0x58000149,             //     ldr  x9, [pc, #(m_remainingCallCountCell)]
+                    0x7940012a,             //     ldrh w10, [x9]
+                    0x7100054a,             //     subs w10, w10, #1
+                    0x7900012a,             //     strh w10, [x9]
+                    0x54000060,             //     beq  L0
+                    0x580000e9,             //     ldr  x9, [pc, #(m_targetForMethod)]
+                    0xd61f0120,             //     br   x9
+                    0x10ffff2a,             // L0: adr  x10, #(this)
+                                            // (x10 == stub-identifying token == this)
+                    0x580000c9,             //     ldr  x9, [pc, #(m_targetForThresholdReached)]
+                    0xd61f0120},            //     br   x9
+        m_remainingCallCountCell(remainingCallCountCell),
+        m_targetForMethod(targetForMethod),
+        m_targetForThresholdReached(TargetForThresholdReached)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(CallCountingStubShort) % Alignment == 0);
+        _ASSERTE(remainingCallCountCell != nullptr);
+        _ASSERTE(PCODEToPINSTR(targetForMethod) != NULL);
+    }
+
+    static bool Is(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static const CallCountingStubShort *From(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(stubIdentifyingToken));
+
+        const CallCountingStubShort *stub = (const CallCountingStubShort *)stubIdentifyingToken;
+        _ASSERTE(IS_ALIGNED(stub, Alignment));
+        return stub;
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    static bool Is(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static PTR_CallCountingStubShort From(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(callCountingStub));
+
+        return dac_cast<PTR_CallCountingStubShort>(callCountingStub);
+    }
+
+    PCODE GetTargetForMethod() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return m_targetForMethod;
+    }
+
+    friend CallCountingStub;
+    DISABLE_COPY(CallCountingStubShort);
+};
+#pragma pack(pop)
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub definitions
+
+#ifndef DACCESS_COMPILE
+inline const CallCountingStub *CallCountingStub::From(TADDR stubIdentifyingToken)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(stubIdentifyingToken != NULL);
+
+    return CallCountingStubShort::From(stubIdentifyingToken);
+}
+#endif
+
+inline PTR_CallCount CallCountingStub::GetRemainingCallCountCell() const
+{
+    WRAPPER_NO_CONTRACT;
+    return PTR_CallCount(dac_cast<PTR_CallCountingStubShort>(this)->m_remainingCallCountCell);
+}
+
+inline PCODE CallCountingStub::GetTargetForMethod() const
+{
+    WRAPPER_NO_CONTRACT;
+    return CallCountingStubShort::From(PTR_CallCountingStub(this))->GetTargetForMethod();
+}
+
+////////////////////////////////////////////////////////////////
+
+#undef DISABLE_COPY
+
+#endif // FEATURE_TIERED_COMPILATION
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
 #endif // __cgencpu_h__
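
Note how the meaning of the stub-identifying token differs by architecture: on arm and arm64 the stub materializes its own address into the token register (`adr r12, #(this)` / `adr x10, #(this)`), so `From(TADDR)` is a direct cast, whereas on x64 the token is the return address pushed by the `call` inside the stub, which for the long stub lands at `m_alignmentPadding[0]` and is mapped back to the stub base with `offsetof` arithmetic.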
diff --git a/src/coreclr/src/vm/callcounter.cpp b/src/coreclr/src/vm/callcounter.cpp
deleted file mode 100644 (file)
index 9a2d76f..0000000
+++ /dev/null
@@ -1,178 +0,0 @@
-// Licensed to the .NET Foundation under one or more agreements.
-// The .NET Foundation licenses this file to you under the MIT license.
-// See the LICENSE file in the project root for more information.
-// ===========================================================================
-// File: CallCounter.CPP
-//
-// ===========================================================================
-
-
-
-#include "common.h"
-#include "excep.h"
-#include "log.h"
-#include "tieredcompilation.h"
-#include "callcounter.h"
-
-#ifdef FEATURE_TIERED_COMPILATION
-#ifndef DACCESS_COMPILE
-
-CallCounterEntry CallCounterEntry::CreateWithCallCountingDisabled(MethodDesc *m)
-{
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(m != nullptr);
-
-    CallCounterEntry entry(m, INT_MAX);
-    _ASSERTE(!entry.IsCallCountingEnabled());
-    return entry;
-}
-
-CallCounter::CallCounter()
-{
-    LIMITED_METHOD_CONTRACT;
-
-    m_lock.Init(LOCK_TYPE_DEFAULT);
-}
-
-#endif // !DACCESS_COMPILE
-
-bool CallCounter::IsCallCountingEnabled(PTR_MethodDesc pMethodDesc)
-{
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(pMethodDesc != PTR_NULL);
-    _ASSERTE(pMethodDesc->IsEligibleForTieredCompilation());
-
-#ifndef DACCESS_COMPILE
-    SpinLockHolder holder(&m_lock);
-#endif
-
-    PTR_CallCounterEntry entry =
-        (PTR_CallCounterEntry)const_cast<CallCounterEntry *>(m_methodToCallCount.LookupPtr(pMethodDesc));
-    return entry == PTR_NULL || entry->IsCallCountingEnabled();
-}
-
-#ifndef DACCESS_COMPILE
-
-void CallCounter::DisableCallCounting(MethodDesc* pMethodDesc)
-{
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(pMethodDesc != NULL);
-    _ASSERTE(pMethodDesc->IsEligibleForTieredCompilation());
-
-    // Disabling call counting will affect the tier of the MethodDesc's first native code version. Callers must ensure that this
-    // change is made deterministically and prior to or while jitting the first native code version such that the tier would not
-    // be changed after it is already jitted. At that point, the call count threshold would already be initialized and the entry
-    // would exist. To disable call counting at different points in time, it would be ok to do so if the method has not been
-    // called yet (if the entry does not yet exist in the hash table), if necessary that could be a different function like
-    // TryDisable...() that would fail to disable call counting if the method has already been called.
-
-    SpinLockHolder holder(&m_lock);
-
-    CallCounterEntry *existingEntry = const_cast<CallCounterEntry *>(m_methodToCallCount.LookupPtr(pMethodDesc));
-    if (existingEntry != nullptr)
-    {
-        existingEntry->DisableCallCounting();
-        return;
-    }
-
-    // Typically, the entry would already exist because OnMethodCalled() would have been called before this function on the same
-    // thread. With multi-core JIT, a function may be jitted before it is called, in which case the entry would not exist.
-    m_methodToCallCount.Add(CallCounterEntry::CreateWithCallCountingDisabled(pMethodDesc));
-}
-
-NOINLINE bool CallCounter::OnMethodCodeVersionCalledSubsequently(NativeCodeVersion nativeCodeVersion, bool *doPublishRef)
-{
-    STANDARD_VM_CONTRACT;
-    _ASSERTE(!nativeCodeVersion.IsNull());
-    _ASSERTE(nativeCodeVersion.GetNativeCode() != NULL);
-    _ASSERTE(doPublishRef != nullptr);
-    _ASSERTE(*doPublishRef);
-
-    MethodDesc *methodDesc = nativeCodeVersion.GetMethodDesc();
-    if (!methodDesc->IsEligibleForTieredCompilation() ||
-        nativeCodeVersion.GetOptimizationTier() != NativeCodeVersion::OptimizationTier0)
-    {
-        return false;
-    }
-
-    TieredCompilationManager *tieredCompilationManager = GetAppDomain()->GetTieredCompilationManager();
-    if (tieredCompilationManager->OnMethodCodeVersionCalledSubsequently(methodDesc))
-    {
-        return true;
-    }
-
-    if (methodDesc->GetCallCounter()->IncrementCount(methodDesc))
-    {
-        *doPublishRef = false;
-    }
-    return true;
-}
-
-// This is called by the prestub each time the method is invoked in a particular
-// AppDomain (the AppDomain for which AppDomain.GetCallCounter() == this). These
-// calls continue until we backpatch the prestub to avoid future calls. This allows
-// us to track the number of calls to each method and use it as a trigger for tiered
-// compilation.
-bool CallCounter::IncrementCount(MethodDesc* pMethodDesc)
-{
-    STANDARD_VM_CONTRACT;
-
-    _ASSERTE(pMethodDesc->IsEligibleForTieredCompilation());
-
-    if (!g_pConfig->TieredCompilation_CallCounting())
-    {
-        return false; // stop counting calls
-    }
-
-    // PERF: This as a simple to implement, but not so performant, call counter
-    // Currently this is only called until we reach a fixed call count and then
-    // disabled. Its likely we'll want to improve this at some point but
-    // its not as bad as you might expect. Allocating a counter inline in the
-    // MethodDesc or at some location computable from the MethodDesc should
-    // eliminate 1 pointer per-method (the MethodDesc* key) and the CPU
-    // overhead to acquire the lock/search the dictionary. Depending on where it
-    // is we may also be able to reduce it to 1 byte counter without wasting the
-    // following bytes for alignment. Further work to inline the OnMethodCalled
-    // callback directly into the jitted code would eliminate CPU overhead of
-    // leaving the prestub unpatched, but may not be good overall as it increases
-    // the size of the jitted code.
-
-    int callCountLimit;
-    {
-        //Be careful if you convert to something fully lock/interlocked-free that
-        //you correctly handle what happens when some N simultaneous calls don't
-        //all increment the counter. The slight drift is probably neglible for tuning
-        //but TieredCompilationManager::OnMethodCalled() doesn't expect multiple calls
-        //each claiming to be exactly the threshhold call count needed to trigger
-        //optimization.
-        SpinLockHolder holder(&m_lock);
-        CallCounterEntry* pEntry = const_cast<CallCounterEntry*>(m_methodToCallCount.LookupPtr(pMethodDesc));
-        if (pEntry == NULL)
-        {
-            callCountLimit = (int)g_pConfig->TieredCompilation_CallCountThreshold() - 1;
-            _ASSERTE(callCountLimit >= 0);
-            m_methodToCallCount.Add(CallCounterEntry(pMethodDesc, callCountLimit));
-        }
-        else if (pEntry->IsCallCountingEnabled())
-        {
-            callCountLimit = --pEntry->callCountLimit;
-        }
-        else
-        {
-            return false; // stop counting calls
-        }
-    }
-
-    if (callCountLimit > 0)
-    {
-        return true; // continue counting calls
-    }
-    if (callCountLimit == 0)
-    {
-        GetAppDomain()->GetTieredCompilationManager()->AsyncPromoteMethodToTier1(pMethodDesc);
-    }
-    return false; // stop counting calls
-}
-
-#endif // !DACCESS_COMPILE
-#endif // FEATURE_TIERED_COMPILATION
diff --git a/src/coreclr/src/vm/callcounter.h b/src/coreclr/src/vm/callcounter.h
deleted file mode 100644 (file)
index a09a3aa..0000000
+++ /dev/null
@@ -1,110 +0,0 @@
-// Licensed to the .NET Foundation under one or more agreements.
-// The .NET Foundation licenses this file to you under the MIT license.
-// See the LICENSE file in the project root for more information.
-// ===========================================================================
-// File: CallCounter.h
-//
-// ===========================================================================
-
-
-#ifndef CALL_COUNTER_H
-#define CALL_COUNTER_H
-
-#ifdef FEATURE_TIERED_COMPILATION
-
-// One entry in our dictionary mapping methods to the number of times they
-// have been invoked
-struct CallCounterEntry
-{
-    CallCounterEntry() {}
-    CallCounterEntry(PTR_MethodDesc m, const int callCountLimit)
-        : pMethod(m), callCountLimit(callCountLimit) {}
-
-    PTR_MethodDesc pMethod;
-    int callCountLimit;
-
-#ifndef DACCESS_COMPILE
-    static CallCounterEntry CreateWithCallCountingDisabled(MethodDesc *m);
-#endif
-
-    bool IsCallCountingEnabled() const
-    {
-        LIMITED_METHOD_CONTRACT;
-        return callCountLimit != INT_MAX;
-    }
-
-#ifndef DACCESS_COMPILE
-    void DisableCallCounting()
-    {
-        LIMITED_METHOD_CONTRACT;
-        callCountLimit = INT_MAX;
-    }
-#endif
-};
-
-typedef DPTR(struct CallCounterEntry) PTR_CallCounterEntry;
-
-class CallCounterHashTraits : public DefaultSHashTraits<CallCounterEntry>
-{
-public:
-    typedef typename DefaultSHashTraits<CallCounterEntry>::element_t element_t;
-    typedef typename DefaultSHashTraits<CallCounterEntry>::count_t count_t;
-
-    typedef PTR_MethodDesc key_t;
-
-    static key_t GetKey(element_t e)
-    {
-        LIMITED_METHOD_CONTRACT;
-        return e.pMethod;
-    }
-    static BOOL Equals(key_t k1, key_t k2)
-    {
-        LIMITED_METHOD_CONTRACT;
-        return k1 == k2;
-    }
-    static count_t Hash(key_t k)
-    {
-        LIMITED_METHOD_CONTRACT;
-        return (count_t)dac_cast<TADDR>(k);
-    }
-
-    static const element_t Null() { LIMITED_METHOD_CONTRACT; return element_t(PTR_NULL, 0); }
-    static const element_t Deleted() { LIMITED_METHOD_CONTRACT; return element_t((PTR_MethodDesc)-1, 0); }
-    static bool IsNull(const element_t &e) { LIMITED_METHOD_CONTRACT; return e.pMethod == PTR_NULL; }
-    static bool IsDeleted(const element_t &e) { return e.pMethod == (PTR_MethodDesc)-1; }
-};
-
-typedef SHash<NoRemoveSHashTraits<CallCounterHashTraits>> CallCounterHash;
-
-
-// This is a per-appdomain cache of call counts for all code in that AppDomain.
-// Each method invocation should trigger a call to OnMethodCalled (until it is disabled per-method)
-// and the CallCounter will forward the call to the TieredCompilationManager including the
-// current call count.
-class CallCounter
-{
-public:
-#ifdef DACCESS_COMPILE
-    CallCounter() {}
-#else
-    CallCounter();
-#endif
-
-    bool IsCallCountingEnabled(PTR_MethodDesc pMethodDesc);
-#ifndef DACCESS_COMPILE
-    void DisableCallCounting(MethodDesc* pMethodDesc);
-#endif
-
-    static bool OnMethodCodeVersionCalledSubsequently(NativeCodeVersion nativeCodeVersion, bool *doPublishRef);
-    bool IncrementCount(MethodDesc* pMethodDesc);
-
-private:
-
-    // fields protected by lock
-    SpinLock m_lock;
-    CallCounterHash m_methodToCallCount;
-};
-
-#endif // FEATURE_TIERED_COMPILATION
-
-#endif // CALL_COUNTER_H
diff --git a/src/coreclr/src/vm/callcounting.cpp b/src/coreclr/src/vm/callcounting.cpp
new file mode 100644 (file)
index 0000000..8b8749a
--- /dev/null
@@ -0,0 +1,1361 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+#include "common.h"
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+#include "callcounting.h"
+#include "threadsuspend.h"
+
+#ifndef DACCESS_COMPILE
+extern "C" void STDCALL OnCallCountThresholdReachedStub();
+#endif
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingStub
+
+#ifndef DACCESS_COMPILE
+const PCODE CallCountingStub::TargetForThresholdReached = (PCODE)GetEEFuncEntryPoint(OnCallCountThresholdReachedStub);
+#endif
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingInfo
+
+#ifndef DACCESS_COMPILE
+
+CallCountingManager::CallCountingInfo::CallCountingInfo(NativeCodeVersion codeVersion)
+    : m_codeVersion(codeVersion),
+    m_callCountingStub(nullptr),
+    m_remainingCallCount(0),
+    m_stage(Stage::Disabled)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(!codeVersion.IsNull());
+}
+
+CallCountingManager::CallCountingInfo *
+CallCountingManager::CallCountingInfo::CreateWithCallCountingDisabled(NativeCodeVersion codeVersion)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    return new CallCountingInfo(codeVersion);
+}
+
+CallCountingManager::CallCountingInfo::CallCountingInfo(NativeCodeVersion codeVersion, CallCount callCountThreshold)
+    : m_codeVersion(codeVersion),
+    m_callCountingStub(nullptr),
+    m_remainingCallCount(callCountThreshold),
+    m_stage(Stage::StubIsNotActive)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(!codeVersion.IsNull());
+    _ASSERTE(callCountThreshold != 0);
+}
+
+CallCountingManager::CallCountingInfo::~CallCountingInfo()
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_stage != Stage::Disabled);
+
+    if (m_callCountingStub == nullptr)
+    {
+        return;
+    }
+
+    if (m_stage != Stage::StubIsNotActive)
+    {
+        _ASSERTE(s_activeCallCountingStubCount != 0);
+        --s_activeCallCountingStubCount;
+    }
+    ++s_completedCallCountingStubCount;
+}
+
+#endif // !DACCESS_COMPILE
+
+CallCountingManager::PTR_CallCountingInfo CallCountingManager::CallCountingInfo::From(PTR_CallCount remainingCallCountCell)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(remainingCallCountCell != nullptr);
+
+    return PTR_CallCountingInfo(dac_cast<TADDR>(remainingCallCountCell) - offsetof(CallCountingInfo, m_remainingCallCount));
+}
+
+NativeCodeVersion CallCountingManager::CallCountingInfo::GetCodeVersion() const
+{
+    WRAPPER_NO_CONTRACT;
+    return m_codeVersion;
+}
+
+#ifndef DACCESS_COMPILE
+
+const CallCountingStub *CallCountingManager::CallCountingInfo::GetCallCountingStub() const
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_stage != Stage::Disabled);
+
+    return m_callCountingStub;
+}
+
+void CallCountingManager::CallCountingInfo::SetCallCountingStub(const CallCountingStub *callCountingStub)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(g_pConfig->TieredCompilation_UseCallCountingStubs());
+    _ASSERTE(m_stage == Stage::StubIsNotActive);
+    _ASSERTE(m_callCountingStub == nullptr);
+    _ASSERTE(callCountingStub != nullptr);
+
+    ++s_callCountingStubCount;
+    m_callCountingStub = callCountingStub;
+}
+
+void CallCountingManager::CallCountingInfo::ClearCallCountingStub()
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_stage == Stage::StubIsNotActive);
+    _ASSERTE(m_callCountingStub != nullptr);
+
+    m_callCountingStub = nullptr;
+}
+
+PTR_CallCount CallCountingManager::CallCountingInfo::GetRemainingCallCountCell()
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_stage != Stage::Disabled);
+
+    return &m_remainingCallCount;
+}
+
+#endif // !DACCESS_COMPILE
+
+CallCountingManager::CallCountingInfo::Stage CallCountingManager::CallCountingInfo::GetStage() const
+{
+    WRAPPER_NO_CONTRACT;
+    return m_stage;
+}
+
+#ifndef DACCESS_COMPILE
+FORCEINLINE void CallCountingManager::CallCountingInfo::SetStage(Stage stage)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_stage != Stage::Disabled);
+    _ASSERTE(stage != Stage::Disabled);
+    _ASSERTE(stage <= Stage::PendingCompletion);
+
+    switch (stage)
+    {
+        case Stage::StubIsNotActive:
+            _ASSERTE(m_stage == Stage::StubMayBeActive);
+            _ASSERTE(m_callCountingStub != nullptr);
+            _ASSERTE(s_activeCallCountingStubCount != 0);
+            --s_activeCallCountingStubCount;
+            break;
+
+        case Stage::StubMayBeActive:
+            _ASSERTE(m_callCountingStub != nullptr);
+            // fall through
+
+        case Stage::PendingCompletion:
+            _ASSERTE(m_stage == Stage::StubIsNotActive || m_stage == Stage::StubMayBeActive);
+            if (m_stage == Stage::StubIsNotActive && m_callCountingStub != nullptr)
+            {
+                ++s_activeCallCountingStubCount;
+            }
+            break;
+
+        default:
+            UNREACHABLE();
+    }
+
+    m_stage = stage;
+}
+#endif
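
The asserts in `SetStage()` above encode a small state machine. A standalone sketch of the transition relation they enforce (the enum mirrors the `Stage` names used above; `IsLegalTransition` is a hypothetical helper paraphrasing the switch, not runtime code):

    enum class Stage { StubIsNotActive, StubMayBeActive, PendingCompletion, Disabled };

    // True when SetStage() above would accept the transition. Disabled is never
    // entered or left through SetStage(), and PendingCompletion is terminal here.
    bool IsLegalTransition(Stage from, Stage to)
    {
        switch (to)
        {
            case Stage::StubIsNotActive:
                return from == Stage::StubMayBeActive;
            case Stage::StubMayBeActive:
            case Stage::PendingCompletion:
                return from == Stage::StubIsNotActive || from == Stage::StubMayBeActive;
            default:
                return false;
        }
    }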
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingInfo::CodeVersionHashTraits
+
+CallCountingManager::CallCountingInfo::CodeVersionHashTraits::key_t
+CallCountingManager::CallCountingInfo::CodeVersionHashTraits::GetKey(const element_t &e)
+{
+    WRAPPER_NO_CONTRACT;
+    return e->GetCodeVersion();
+}
+
+BOOL CallCountingManager::CallCountingInfo::CodeVersionHashTraits::Equals(const key_t &k1, const key_t &k2)
+{
+    WRAPPER_NO_CONTRACT;
+    return k1 == k2;
+}
+
+CallCountingManager::CallCountingInfo::CodeVersionHashTraits::count_t
+CallCountingManager::CallCountingInfo::CodeVersionHashTraits::Hash(const key_t &k)
+{
+    WRAPPER_NO_CONTRACT;
+    return (count_t)dac_cast<TADDR>(k.GetMethodDesc()) + k.GetVersionId();
+}
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingStubAllocator
+
+CallCountingManager::CallCountingStubAllocator::CallCountingStubAllocator() : m_heap(nullptr)
+{
+    WRAPPER_NO_CONTRACT;
+}
+
+CallCountingManager::CallCountingStubAllocator::~CallCountingStubAllocator()
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+#ifndef DACCESS_COMPILE
+    LoaderHeap *heap = m_heap;
+    if (heap != nullptr)
+    {
+        delete m_heap;
+    }
+#endif
+}
+
+#ifndef DACCESS_COMPILE
+
+void CallCountingManager::CallCountingStubAllocator::Reset()
+{
+    WRAPPER_NO_CONTRACT;
+
+    this->~CallCountingStubAllocator();
+    new(this) CallCountingStubAllocator();
+}
+
+const CallCountingStub *CallCountingManager::CallCountingStubAllocator::AllocateStub(
+    CallCount *remainingCallCountCell,
+    PCODE targetForMethod)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    LoaderHeap *heap = m_heap;
+    if (heap == nullptr)
+    {
+        heap = AllocateHeap();
+    }
+
+    SIZE_T sizeInBytes;
+    const CallCountingStub *stub;
+    do
+    {
+        bool forceLongStub = false;
+    #if defined(_DEBUG) && defined(_TARGET_AMD64_)
+        if (s_callCountingStubCount % 2 == 0)
+        {
+            forceLongStub = true;
+        }
+    #endif
+
+        if (!forceLongStub)
+        {
+            sizeInBytes = sizeof(CallCountingStubShort);
+            AllocMemHolder<void> allocationAddressHolder(heap->AllocAlignedMem(sizeInBytes, CallCountingStub::Alignment));
+        #ifdef _TARGET_AMD64_
+            if (CallCountingStubShort::CanUseFor(allocationAddressHolder, targetForMethod))
+        #endif
+            {
+                stub = new(allocationAddressHolder) CallCountingStubShort(remainingCallCountCell, targetForMethod);
+                allocationAddressHolder.SuppressRelease();
+                break;
+            }
+        }
+
+    #ifdef _TARGET_AMD64_
+        sizeInBytes = sizeof(CallCountingStubLong);
+        void *allocationAddress = (void *)heap->AllocAlignedMem(sizeInBytes, CallCountingStub::Alignment);
+        stub = new(allocationAddress) CallCountingStubLong(remainingCallCountCell, targetForMethod);
+    #else
+        UNREACHABLE();
+    #endif
+    } while (false);
+
+    ClrFlushInstructionCache(stub, sizeInBytes);
+    return stub;
+}
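
On x64, `CallCountingStubShort::CanUseFor()` (its body is not in this hunk) gates use of the short stub on both the allocation address and the method's code address, which suggests a displacement-range check for the stub's relative branches; the long stub with absolute targets is the fallback. A hedged sketch of such a test (`FitsInRel32` is illustrative, not a runtime function):

    #include <cstdint>

    // A rel32 displacement is measured from the end of the branch instruction;
    // it must round-trip through a signed 32-bit immediate to be encodable.
    bool FitsInRel32(uintptr_t branchInstructionEnd, uintptr_t target)
    {
        intptr_t delta = (intptr_t)target - (intptr_t)branchInstructionEnd;
        return delta == (intptr_t)(int32_t)delta;
    }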
+
+NOINLINE LoaderHeap *CallCountingManager::CallCountingStubAllocator::AllocateHeap()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(m_heap == nullptr);
+
+    LoaderHeap *heap = new LoaderHeap(0, 0, &m_heapRangeList, true /* fMakeExecutable */, true /* fUnlocked */);
+    m_heap = heap;
+    return heap;
+}
+
+#endif // !DACCESS_COMPILE
+
+bool CallCountingManager::CallCountingStubAllocator::IsStub(TADDR entryPoint)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(entryPoint != NULL);
+
+    return !!m_heapRangeList.IsInRange(entryPoint);
+}
+
+#ifdef DACCESS_COMPILE
+
+void CallCountingManager::CallCountingStubAllocator::EnumerateHeapRanges(CLRDataEnumMemoryFlags flags)
+{
+    WRAPPER_NO_CONTRACT;
+    m_heapRangeList.EnumMemoryRegions(flags);
+}
+
+#endif // DACCESS_COMPILE
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::MethodDescForwarderStubHashTraits
+
+CallCountingManager::MethodDescForwarderStubHashTraits::key_t
+CallCountingManager::MethodDescForwarderStubHashTraits::GetKey(const element_t &e)
+{
+    WRAPPER_NO_CONTRACT;
+    return e->GetMethodDesc();
+}
+
+BOOL CallCountingManager::MethodDescForwarderStubHashTraits::Equals(const key_t &k1, const key_t &k2)
+{
+    WRAPPER_NO_CONTRACT;
+    return k1 == k2;
+}
+
+CallCountingManager::MethodDescForwarderStubHashTraits::count_t
+CallCountingManager::MethodDescForwarderStubHashTraits::Hash(const key_t &k)
+{
+    WRAPPER_NO_CONTRACT;
+    return (count_t)k;
+}
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingManagerHashTraits
+
+CallCountingManager::CallCountingManagerHashTraits::key_t
+CallCountingManager::CallCountingManagerHashTraits::GetKey(const element_t &e)
+{
+    WRAPPER_NO_CONTRACT;
+    return e;
+}
+
+BOOL CallCountingManager::CallCountingManagerHashTraits::Equals(const key_t &k1, const key_t &k2)
+{
+    WRAPPER_NO_CONTRACT;
+    return k1 == k2;
+}
+
+CallCountingManager::CallCountingManagerHashTraits::count_t
+CallCountingManager::CallCountingManagerHashTraits::Hash(const key_t &k)
+{
+    WRAPPER_NO_CONTRACT;
+    return (count_t)dac_cast<TADDR>(k);
+}
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager
+
+CallCountingManager::PTR_CallCountingManagerHash CallCountingManager::s_callCountingManagers = PTR_NULL;
+COUNT_T CallCountingManager::s_callCountingStubCount = 0;
+COUNT_T CallCountingManager::s_activeCallCountingStubCount = 0;
+COUNT_T CallCountingManager::s_completedCallCountingStubCount = 0;
+
+CallCountingManager::CallCountingManager()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+#ifndef DACCESS_COMPILE
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+    s_callCountingManagers->Add(this);
+#endif
+}
+
+CallCountingManager::~CallCountingManager()
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+#ifndef DACCESS_COMPILE
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    for (auto itEnd = m_callCountingInfoByCodeVersionHash.End(), it = m_callCountingInfoByCodeVersionHash.Begin();
+        it != itEnd;
+        ++it)
+    {
+        CallCountingInfo *callCountingInfo = *it;
+        delete callCountingInfo;
+    }
+
+    s_callCountingManagers->Remove(this);
+#endif
+}
+
+#ifndef DACCESS_COMPILE
+void CallCountingManager::StaticInitialize()
+{
+    WRAPPER_NO_CONTRACT;
+    s_callCountingManagers = PTR_CallCountingManagerHash(new CallCountingManagerHash());
+}
+#endif
+
+bool CallCountingManager::IsCallCountingEnabled(NativeCodeVersion codeVersion)
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(!codeVersion.IsNull());
+    _ASSERTE(codeVersion.IsDefaultVersion());
+    _ASSERTE(codeVersion.GetMethodDesc()->IsEligibleForTieredCompilation());
+
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    PTR_CallCountingInfo callCountingInfo = m_callCountingInfoByCodeVersionHash.Lookup(codeVersion);
+    return callCountingInfo == NULL || callCountingInfo->GetStage() != CallCountingInfo::Stage::Disabled;
+}
+
+#ifndef DACCESS_COMPILE
+
+void CallCountingManager::DisableCallCounting(NativeCodeVersion codeVersion)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(!codeVersion.IsNull());
+    _ASSERTE(codeVersion.IsDefaultVersion());
+    _ASSERTE(codeVersion.GetMethodDesc()->IsEligibleForTieredCompilation());
+
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    _ASSERTE(m_callCountingInfoByCodeVersionHash.Lookup(codeVersion) == nullptr);
+    NewHolder<CallCountingInfo> callCountingInfoHolder = CallCountingInfo::CreateWithCallCountingDisabled(codeVersion);
+    m_callCountingInfoByCodeVersionHash.Add(callCountingInfoHolder);
+    callCountingInfoHolder.SuppressRelease();
+}
+
+// Returns true if the code entry point was updated to reflect the active code version, false otherwise. In normal paths, the
+// code entry point is left unchanged only when the use of call counting stubs is disabled, since in that case the method must
+// keep returning to the prestub for further call counting. On exception, the code entry point may or may not have been updated
+// and it's up to the caller to decide how to proceed.
+bool CallCountingManager::SetCodeEntryPoint(
+    NativeCodeVersion activeCodeVersion,
+    PCODE codeEntryPoint,
+    bool wasMethodCalled,
+    bool *scheduleTieringBackgroundWorkRef)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+
+        // Backpatching entry point slots requires cooperative GC mode, see MethodDescBackpatchInfoTracker::Backpatch_Locked().
+        // The code version manager's table lock is an unsafe lock that may be taken in any GC mode. The lock is taken in
+        // cooperative GC mode on other paths, so the caller must use the same ordering to prevent deadlock (switch to
+        // cooperative GC mode before taking the lock).
+        PRECONDITION(!activeCodeVersion.IsNull());
+        if (activeCodeVersion.GetMethodDesc()->MayHaveEntryPointSlotsToBackpatch())
+        {
+            MODE_COOPERATIVE;
+        }
+        else
+        {
+            MODE_ANY;
+        }
+    }
+    CONTRACTL_END;
+
+    MethodDesc *methodDesc = activeCodeVersion.GetMethodDesc();
+    _ASSERTE(!methodDesc->MayHaveEntryPointSlotsToBackpatch() || MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+    _ASSERTE(
+        activeCodeVersion ==
+        methodDesc->GetCodeVersionManager()->GetActiveILCodeVersion(methodDesc).GetActiveNativeCodeVersion(methodDesc));
+    _ASSERTE(codeEntryPoint != NULL);
+    _ASSERTE(codeEntryPoint == activeCodeVersion.GetNativeCode());
+    _ASSERTE(!wasMethodCalled || scheduleTieringBackgroundWorkRef != nullptr);
+    _ASSERTE(scheduleTieringBackgroundWorkRef == nullptr || !*scheduleTieringBackgroundWorkRef);
+
+    if (!methodDesc->IsEligibleForTieredCompilation() ||
+        (
+            // For a default code version that is not tier 0, call counting will have been disabled by this time (checked
+            // below). Avoid the redundant and not-insignificant expense of GetOptimizationTier() on a default code version.
+            !activeCodeVersion.IsDefaultVersion() &&
+            activeCodeVersion.GetOptimizationTier() != NativeCodeVersion::OptimizationTier0
+        ) ||
+        !g_pConfig->TieredCompilation_CallCounting())
+    {
+        methodDesc->SetCodeEntryPoint(codeEntryPoint);
+        return true;
+    }
+
+    const CallCountingStub *callCountingStub;
+    CallCountingManager *callCountingManager = methodDesc->GetLoaderAllocator()->GetCallCountingManager();
+    CallCountingInfoByCodeVersionHash &callCountingInfoByCodeVersionHash =
+        callCountingManager->m_callCountingInfoByCodeVersionHash;
+    CallCountingInfo *const *callCountingInfoPtr = callCountingInfoByCodeVersionHash.LookupPtr(activeCodeVersion);
+    CallCountingInfo *callCountingInfo = callCountingInfoPtr == nullptr ? nullptr : *callCountingInfoPtr;
+    do
+    {
+        if (callCountingInfo != nullptr)
+        {
+            _ASSERTE(callCountingInfo->GetCodeVersion() == activeCodeVersion);
+
+            CallCountingInfo::Stage callCountingStage = callCountingInfo->GetStage();
+            if (callCountingStage >= CallCountingInfo::Stage::PendingCompletion)
+            {
+                // The pending completion stage here would be rare, typically occurring only if there was an exception
+                // somewhere. Stop coming back to the prestub for now and let it be handled elsewhere.
+                methodDesc->SetCodeEntryPoint(codeEntryPoint);
+                return true;
+            }
+
+            _ASSERTE(activeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+
+            // If the tiering delay is active, postpone further work
+            if (GetAppDomain()
+                    ->GetTieredCompilationManager()
+                    ->TrySetCodeEntryPointAndRecordMethodForCallCounting(methodDesc, codeEntryPoint))
+            {
+                if (callCountingStage == CallCountingInfo::Stage::StubMayBeActive)
+                {
+                    callCountingInfo->SetStage(CallCountingInfo::Stage::StubIsNotActive);
+                }
+                return true;
+            }
+
+            do
+            {
+                if (!wasMethodCalled)
+                {
+                    break;
+                }
+
+                CallCount remainingCallCount = --*callCountingInfo->GetRemainingCallCountCell();
+                if (remainingCallCount != 0)
+                {
+                    break;
+                }
+
+                callCountingInfo->SetStage(CallCountingInfo::Stage::PendingCompletion);
+                if (!activeCodeVersion.GetILCodeVersion().HasAnyOptimizedNativeCodeVersion(activeCodeVersion))
+                {
+                    GetAppDomain()
+                        ->GetTieredCompilationManager()
+                        ->AsyncPromoteToTier1(activeCodeVersion, scheduleTieringBackgroundWorkRef);
+                }
+                methodDesc->SetCodeEntryPoint(codeEntryPoint);
+                callCountingManager->RemoveForwarderStub(methodDesc);
+                callCountingInfoByCodeVersionHash.RemovePtr(const_cast<CallCountingInfo **>(callCountingInfoPtr));
+                delete callCountingInfo;
+                return true;
+            } while (false);
+
+            callCountingStub = callCountingInfo->GetCallCountingStub();
+            if (callCountingStub != nullptr)
+            {
+                break;
+            }
+        }
+        else
+        {
+            _ASSERTE(activeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+
+            // If the tiering delay is active, postpone further work
+            if (GetAppDomain()
+                    ->GetTieredCompilationManager()
+                    ->TrySetCodeEntryPointAndRecordMethodForCallCounting(methodDesc, codeEntryPoint))
+            {
+                return true;
+            }
+
+            CallCount callCountThreshold = (CallCount)g_pConfig->TieredCompilation_CallCountThreshold();
+            _ASSERTE(callCountThreshold != 0);
+
+            NewHolder<CallCountingInfo> callCountingInfoHolder = new CallCountingInfo(activeCodeVersion, callCountThreshold);
+            callCountingInfoByCodeVersionHash.Add(callCountingInfoHolder);
+            callCountingInfo = callCountingInfoHolder.Extract();
+        }
+
+        if (!g_pConfig->TieredCompilation_UseCallCountingStubs())
+        {
+            // Call counting is not yet complete, so reset the code entry point, or leave it unset, such that calls keep going
+            // through the prestub for counting
+
+            if (wasMethodCalled)
+            {
+                return false;
+            }
+
+            // This path is reached after activating a code version when publishing its code entry point. The method may
+            // currently be pointing to the code entry point of a different code version, so an explicit reset is necessary.
+            methodDesc->ResetCodeEntryPoint();
+            return true;
+        }
+
+        callCountingStub =
+            callCountingManager->m_callCountingStubAllocator.AllocateStub(
+                callCountingInfo->GetRemainingCallCountCell(),
+                codeEntryPoint);
+        callCountingInfo->SetCallCountingStub(callCountingStub);
+    } while (false);
+
+    PCODE callCountingCodeEntryPoint = callCountingStub->GetEntryPoint();
+    if (methodDesc->MayHaveEntryPointSlotsToBackpatch())
+    {
+        // The call counting stub should not be the entry point that is called first in the process of a call
+        // - Stubs should be deletable. Many methods will have call counting stubs associated with them, and although the memory
+        //   involved is typically insignificant compared to the average memory overhead per method, by steady-state it would
+        //   otherwise be unnecessary memory overhead serving no purpose.
+        // - In order to be able to delete a stub, the jitted code of a method cannot be allowed to load the stub as the entry
+        //   point of a callee into a register in a GC-safe point that allows for the stub to be deleted before the register is
+        //   reused to call the stub. On some processor architectures, perhaps the JIT can guarantee that it would not load the
+        //   entry point into a register before the call, but this is not possible on arm32 or arm64. Rather, perhaps the
+        //   region containing the load and call would not be considered GC-safe. Calls are considered GC-safe points, and this
+        //   may cause many methods that are currently fully interruptible to have to be partially interruptible and record
+        //   extra GC info instead. This would be nontrivial and there would be tradeoffs.
+        // - For any method that may have an entry point slot that would be backpatched with the call counting stub's entry
+        //   point, a small forwarder stub (precode) is created. The forwarder stub has loader allocator lifetime and forwards to
+        //   the larger call counting stub. This is a simple solution for now and seems to have negligible impact.
+        // - Reusing FuncPtrStubs was considered. FuncPtrStubs are currently not used as a code entry point for a virtual or
+        //   interface method and may be bypassed. For example, a call may call through the vtable slot, or a devirtualized call
+        //   may call through a FuncPtrStub. The target of a FuncPtrStub is a code entry point and is backpatched when a
+        //   method's active code entry point changes. Mixing the current use of FuncPtrStubs with the use as a forwarder for
+        //   call counting does not seem trivial and would likely complicate its use. There may not be much gain in reusing
+        //   FuncPtrStubs, as typically, they are created for only a small percentage of virtual/interface methods.
+
+        MethodDescForwarderStubHash &methodDescForwarderStubHash = callCountingManager->m_methodDescForwarderStubHash;
+        Precode *forwarderStub = methodDescForwarderStubHash.Lookup(methodDesc);
+        if (forwarderStub == nullptr)
+        {
+            AllocMemTracker forwarderStubAllocationTracker;
+            forwarderStub =
+                Precode::Allocate(
+                    methodDesc->GetPrecodeType(),
+                    methodDesc,
+                    methodDesc->GetLoaderAllocator(),
+                    &forwarderStubAllocationTracker);
+            methodDescForwarderStubHash.Add(forwarderStub);
+            forwarderStubAllocationTracker.SuppressRelease();
+        }
+
+        forwarderStub->SetTargetInterlocked(callCountingCodeEntryPoint, false);
+        callCountingCodeEntryPoint = forwarderStub->GetEntryPoint();
+    }
+    else
+    {
+        _ASSERTE(methodDesc->IsVersionableWithPrecode());
+    }
+
+    methodDesc->SetCodeEntryPoint(callCountingCodeEntryPoint);
+    callCountingInfo->SetStage(CallCountingInfo::Stage::StubMayBeActive);
+    return true;
+}
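
For reference, the entry point chains this function establishes (derived from the two branches above):

    Precode-versionable methods:   precode -> call counting stub -> tier-0 code
    Slot-backpatched methods:      entry point slots -> forwarder precode -> call counting stub -> tier-0 code

Either way, removing the call counting stub later only requires retargeting a precode; no jitted call sites need to be patched.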
+
+extern "C" PCODE STDCALL OnCallCountThresholdReached(TransitionBlock *transitionBlock, TADDR stubIdentifyingToken)
+{
+    WRAPPER_NO_CONTRACT;
+    return CallCountingManager::OnCallCountThresholdReached(transitionBlock, stubIdentifyingToken);
+}
+
+PCODE CallCountingManager::OnCallCountThresholdReached(TransitionBlock *transitionBlock, TADDR stubIdentifyingToken)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_TRIGGERS;
+        MODE_COOPERATIVE;
+        PRECONDITION(CheckPointer(transitionBlock));
+    }
+    CONTRACTL_END;
+
+    MAKE_CURRENT_THREAD_AVAILABLE();
+
+#ifdef _DEBUG
+    Thread::ObjectRefFlush(CURRENT_THREAD);
+#endif
+
+    // Get the code version from the call counting stub/info in cooperative GC mode to synchronize with deletion. The stub/info
+    // may be deleted only when the runtime is suspended, so when we are in cooperative GC mode it is safe to read from them.
+    NativeCodeVersion codeVersion =
+        CallCountingInfo::From(CallCountingStub::From(stubIdentifyingToken)->GetRemainingCallCountCell())->GetCodeVersion();
+
+    MethodDesc *methodDesc = codeVersion.GetMethodDesc();
+    FrameWithCookie<CallCountingHelperFrame> frameWithCookie(transitionBlock, methodDesc);
+    CallCountingHelperFrame *frame = &frameWithCookie;
+    frame->Push(CURRENT_THREAD);
+
+    PCODE codeEntryPoint;
+
+    INSTALL_MANAGED_EXCEPTION_DISPATCHER;
+    INSTALL_UNWIND_AND_CONTINUE_HANDLER;
+
+    // The switch to preemptive GC mode no longer guarantees that the stub/info will be valid. Only the code version will be
+    // used going forward under appropriate locking to synchronize further with deletion.
+    GCX_PREEMP_THREAD_EXISTS(CURRENT_THREAD);
+
+    _ASSERTE(codeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+
+    codeEntryPoint = codeVersion.GetNativeCode();
+    do
+    {
+        {
+            CallCountingManager *callCountingManager = methodDesc->GetLoaderAllocator()->GetCallCountingManager();
+
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+            CallCountingInfo *callCountingInfo = callCountingManager->m_callCountingInfoByCodeVersionHash.Lookup(codeVersion);
+            if (callCountingInfo == nullptr)
+            {
+                break;
+            }
+
+            CallCountingInfo::Stage callCountingStage = callCountingInfo->GetStage();
+            if (callCountingStage >= CallCountingInfo::Stage::PendingCompletion)
+            {
+                break;
+            }
+
+            // Fully completing call counting for a method is relatively expensive. Call counting with stubs is relatively cheap.
+            // Since many methods will typically reach the call count threshold at roughly the same time (a perf spike),
+            // delegate as much of the overhead as possible to the background. This significantly decreases the degree of the
+            // perf spike.
+            callCountingManager->m_callCountingInfosPendingCompletion.Append(callCountingInfo);
+            callCountingInfo->SetStage(CallCountingInfo::Stage::PendingCompletion);
+        }
+
+        GetAppDomain()->GetTieredCompilationManager()->AsyncCompleteCallCounting();
+    } while (false);
+
+    UNINSTALL_UNWIND_AND_CONTINUE_HANDLER;
+    UNINSTALL_MANAGED_EXCEPTION_DISPATCHER;
+
+    frame->Pop(CURRENT_THREAD);
+    return codeEntryPoint;
+}
+
+COUNT_T CallCountingManager::GetCountOfCodeVersionsPendingCompletion()
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_PREEMPTIVE;
+    }
+    CONTRACTL_END;
+
+    COUNT_T count = 0;
+
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+    {
+        CallCountingManager *callCountingManager = *it;
+        count += callCountingManager->m_callCountingInfosPendingCompletion.GetCount();
+    }
+
+    return count;
+}
+
+void CallCountingManager::CompleteCallCounting()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_TRIGGERS;
+        MODE_PREEMPTIVE;
+    }
+    CONTRACTL_END;
+
+    AppDomain *appDomain = GetAppDomain();
+    TieredCompilationManager *tieredCompilationManager = appDomain->GetTieredCompilationManager();
+    bool scheduleTieringBackgroundWork = false;
+    {
+        CodeVersionManager *codeVersionManager = appDomain->GetCodeVersionManager();
+
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
+
+        // Backpatching entry point slots requires cooperative GC mode, see
+        // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
+        // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
+        // must be used here to prevent deadlock.
+        GCX_COOP();
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+        for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+        {
+            CallCountingManager *callCountingManager = *it;
+            SArray<CallCountingInfo *> &callCountingInfosPendingCompletion =
+                callCountingManager->m_callCountingInfosPendingCompletion;
+            COUNT_T callCountingInfoCount = callCountingInfosPendingCompletion.GetCount();
+            if (callCountingInfoCount == 0)
+            {
+                continue;
+            }
+
+            CallCountingInfo **callCountingInfos = callCountingInfosPendingCompletion.GetElements();
+            for (COUNT_T i = 0; i < callCountingInfoCount; ++i)
+            {
+                CallCountingInfo *callCountingInfo = callCountingInfos[i];
+                if (callCountingInfo == nullptr)
+                {
+                    continue;
+                }
+
+                CallCountingInfo::Stage callCountingStage = callCountingInfo->GetStage();
+                if (callCountingStage != CallCountingInfo::Stage::PendingCompletion)
+                {
+                    continue;
+                }
+
+                NativeCodeVersion codeVersion = callCountingInfo->GetCodeVersion();
+                MethodDesc *methodDesc = codeVersion.GetMethodDesc();
+                _ASSERTE(codeVersionManager == methodDesc->GetCodeVersionManager());
+                EX_TRY
+                {
+                    if (!codeVersion.GetILCodeVersion().HasAnyOptimizedNativeCodeVersion(codeVersion))
+                    {
+                        tieredCompilationManager->AsyncPromoteToTier1(codeVersion, &scheduleTieringBackgroundWork);
+                    }
+
+                    // The active code version may have changed externally after the call counting stub was activated, deactivating
+                    // the call counting stub without our knowledge. Check the active code version and determine what needs to be
+                    // done.
+                    NativeCodeVersion activeCodeVersion =
+                        codeVersionManager->GetActiveILCodeVersion(methodDesc).GetActiveNativeCodeVersion(methodDesc);
+                    do
+                    {
+                        if (activeCodeVersion == codeVersion)
+                        {
+                            methodDesc->SetCodeEntryPoint(activeCodeVersion.GetNativeCode());
+                            break;
+                        }
+
+                        // There is at least one case where the IL code version is changed inside the code versioning
+                        // lock, after which the lock is released and reacquired, and then the method's code entry point
+                        // is reset. If this path is reached between those locks, the method would still be pointing to
+                        // the call counting stub. Once the stub is marked as complete it may be deleted, so in all
+                        // cases, update the method's code entry point to ensure that the method is no longer pointing
+                        // to the call counting stub.
+
+                        if (!activeCodeVersion.IsNull())
+                        {
+                            PCODE activeNativeCode = activeCodeVersion.GetNativeCode();
+                            if (activeNativeCode != NULL)
+                            {
+                                methodDesc->SetCodeEntryPoint(activeNativeCode);
+                                break;
+                            }
+                        }
+
+                        methodDesc->ResetCodeEntryPoint();
+                    } while (false);
+
+                    callCountingManager->RemoveForwarderStub(methodDesc);
+                    callCountingInfos[i] = nullptr; // in case of exception on a later iteration
+                    callCountingManager->m_callCountingInfoByCodeVersionHash.Remove(codeVersion);
+                    delete callCountingInfo;
+                }
+                EX_CATCH
+                {
+                    // Avoid abandoning call counting completion for all recorded call counting infos upon an exception.
+                    // Since this is happening on a background thread, the exception would be caught, logged, and
+                    // ignored anyway (following the general policy so far), so attempt to complete call counting for
+                    // each item. Individual items that fail will result in those code versions not getting promoted
+                    // (similar to elsewhere).
+                    STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "CallCountingManager::CompleteCallCounting: "
+                        "Exception, hr=0x%x\n",
+                        GET_EXCEPTION()->GetHR());
+                }
+                EX_END_CATCH(RethrowTerminalExceptions);
+            }
+
+            callCountingInfosPendingCompletion.Clear();
+            if (callCountingInfosPendingCompletion.GetAllocation() > 64)
+            {
+                callCountingInfosPendingCompletion.Trim();
+                EX_TRY
+                {
+                    callCountingInfosPendingCompletion.Preallocate(64);
+                }
+                EX_CATCH
+                {
+                }
+                EX_END_CATCH(RethrowTerminalExceptions);
+            }
+        }
+    }
+
+    if (scheduleTieringBackgroundWork)
+    {
+        tieredCompilationManager->ScheduleBackgroundWork(); // requires GC_TRIGGERS
+    }
+}
+
+void CallCountingManager::RemoveForwarderStub(MethodDesc *methodDesc)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(methodDesc != nullptr);
+    _ASSERTE(!methodDesc->MayHaveEntryPointSlotsToBackpatch() || MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+
+    // Currently, tier 0 code is the last code version that is counted for a method, so at this point the method is
+    // typically no longer being counted. Remove the forwarder stub if one exists; a new one will be created if
+    // necessary, for example if a profiler adds an IL code version for the method.
+    Precode *const *forwarderStubPtr = m_methodDescForwarderStubHash.LookupPtr(methodDesc);
+    if (forwarderStubPtr != nullptr)
+    {
+        (*forwarderStubPtr)->ResetTargetInterlocked();
+        m_methodDescForwarderStubHash.RemovePtr(const_cast<Precode **>(forwarderStubPtr));
+    }
+}
+
+void CallCountingManager::StopAndDeleteAllCallCountingStubs()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_TRIGGERS;
+        MODE_PREEMPTIVE;
+    }
+    CONTRACTL_END;
+
+    COUNT_T deleteCallCountingStubsAfter = g_pConfig->TieredCompilation_DeleteCallCountingStubsAfter();
+    if (deleteCallCountingStubsAfter == 0)
+    {
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+        for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+        {
+            CallCountingManager *callCountingManager = *it;
+            callCountingManager->TrimCollections();
+        }
+        return;
+    }
+
+    // Once enough call counting stubs have completed, we try to delete them to reclaim some memory. Deleting involves
+    // suspending the runtime and deletes all call counting stubs, after which some call counting stubs may be
+    // recreated in the foreground. The threshold is intended to decrease the impact of both of those overheads.
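+    //
+    // As an illustrative note (values hypothetical): with TieredCompilation_DeleteCallCountingStubsAfter configured
+    // to, say, 4096, the deletion below is attempted only once at least 4096 call counting stubs have completed
+    // counting; smaller values trade more frequent runtime suspensions for earlier memory reclamation.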
+    if (s_completedCallCountingStubCount < deleteCallCountingStubsAfter)
+    {
+        return;
+    }
+
+    TieredCompilationManager *tieredCompilationManager = GetAppDomain()->GetTieredCompilationManager();
+    bool scheduleTieringBackgroundWork = false;
+    {
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
+
+        ThreadSuspend::SuspendEE(ThreadSuspend::SUSPEND_OTHER);
+        struct AutoRestartEE
+        {
+            ~AutoRestartEE()
+            {
+                WRAPPER_NO_CONTRACT;
+                ThreadSuspend::RestartEE(false, true);
+            }
+        } autoRestartEE;
+
+        // Backpatching entry point slots requires cooperative GC mode, see
+        // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
+        // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
+        // must be used here to prevent deadlock.
+        GCX_COOP();
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+        // After the following, no method's entry point would be pointing to a call counting stub
+        StopAllCallCounting(tieredCompilationManager, &scheduleTieringBackgroundWork);
+
+        // Call counting has been stopped above and call counting stubs will soon be deleted. Ensure that call counting
+        // stubs will not be used after the runtime is resumed; the flushes below guarantee that other threads will not
+        // use a stale cached entry point value that would no longer be valid. Do this here in case of an exception
+        // later.
+        MemoryBarrier(); // flush writes from this thread first to guarantee ordering
+        FlushProcessWriteBuffers();
+
+        DeleteAllCallCountingStubs();
+    }
+
+    if (scheduleTieringBackgroundWork)
+    {
+        tieredCompilationManager->ScheduleBackgroundWork(); // requires GC_TRIGGERS
+    }
+}
+
+void CallCountingManager::StopAllCallCounting(
+    TieredCompilationManager *tieredCompilationManager,
+    bool *scheduleTieringBackgroundWorkRef)
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_COOPERATIVE; // for slot backpatching
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+    _ASSERTE(tieredCompilationManager != nullptr);
+    _ASSERTE(scheduleTieringBackgroundWorkRef != nullptr);
+    _ASSERTE(!*scheduleTieringBackgroundWorkRef);
+
+    for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+    {
+        CallCountingManager *callCountingManager = *it;
+
+        // Clear call counting infos pending completion. An attempt is made to complete them below, but in case of an
+        // exception, clearing them first ensures that deleted call counting infos are not referenced during any
+        // partial work that was done.
+        SArray<CallCountingInfo *> &callCountingInfosPendingCompletion =
+            callCountingManager->m_callCountingInfosPendingCompletion;
+        if (!callCountingInfosPendingCompletion.IsEmpty())
+        {
+            callCountingInfosPendingCompletion.Clear();
+            if (callCountingInfosPendingCompletion.GetAllocation() > 64)
+            {
+                callCountingInfosPendingCompletion.Trim();
+                EX_TRY
+                {
+                    callCountingInfosPendingCompletion.Preallocate(64);
+                }
+                EX_CATCH
+                {
+                }
+                EX_END_CATCH(RethrowTerminalExceptions);
+            }
+        }
+
+        CallCountingInfoByCodeVersionHash &callCountingInfoByCodeVersionHash =
+            callCountingManager->m_callCountingInfoByCodeVersionHash;
+        for (auto itEnd = callCountingInfoByCodeVersionHash.End(), it = callCountingInfoByCodeVersionHash.Begin();
+            it != itEnd;
+            ++it)
+        {
+            CallCountingInfo *callCountingInfo = *it;
+            CallCountingInfo::Stage callCountingStage = callCountingInfo->GetStage();
+            if (callCountingStage != CallCountingInfo::Stage::StubMayBeActive &&
+                callCountingStage != CallCountingInfo::Stage::PendingCompletion)
+            {
+                continue;
+            }
+
+            NativeCodeVersion codeVersion = callCountingInfo->GetCodeVersion();
+            if (callCountingStage == CallCountingInfo::Stage::PendingCompletion &&
+                !codeVersion.GetILCodeVersion().HasAnyOptimizedNativeCodeVersion(codeVersion))
+            {
+                tieredCompilationManager->AsyncPromoteToTier1(codeVersion, scheduleTieringBackgroundWorkRef);
+            }
+
+            // The intention is that all call counting stubs will be deleted shortly, and only methods that are called again
+            // will cause stubs to be recreated, so reset the code entry point
+            codeVersion.GetMethodDesc()->ResetCodeEntryPoint();
+
+            if (callCountingStage == CallCountingInfo::Stage::StubMayBeActive)
+            {
+                callCountingInfo->SetStage(CallCountingInfo::Stage::StubIsNotActive);
+                callCountingInfo->ClearCallCountingStub();
+                continue;
+            }
+
+            _ASSERTE(callCountingStage == CallCountingInfo::Stage::PendingCompletion);
+            callCountingManager->RemoveForwarderStub(codeVersion.GetMethodDesc());
+            callCountingInfoByCodeVersionHash.Remove(it);
+            delete callCountingInfo;
+        }
+
+        // Reset forwarder stubs; they are not in use anymore
+        MethodDescForwarderStubHash &methodDescForwarderStubHash = callCountingManager->m_methodDescForwarderStubHash;
+        for (auto itEnd = methodDescForwarderStubHash.End(), it = methodDescForwarderStubHash.Begin(); it != itEnd; ++it)
+        {
+            Precode *forwarderStub = *it;
+            forwarderStub->ResetTargetInterlocked();
+        }
+
+        callCountingManager->TrimCollections();
+    }
+}
+
+void CallCountingManager::DeleteAllCallCountingStubs()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+    _ASSERTE(IsSuspendEEThread());
+    _ASSERTE(s_activeCallCountingStubCount == 0);
+
+    for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+    {
+        CallCountingManager *callCountingManager = *it;
+        _ASSERTE(callCountingManager->m_callCountingInfosPendingCompletion.IsEmpty());
+
+        // All call counting stubs are deleted, not just the completed ones. Typically, many methods are called only a
+        // few times and don't reach the call count threshold, so many of the stubs may not need to be recreated. On
+        // the other hand, some methods may still be getting called, just less frequently, in which case their call
+        // counting stubs would be recreated in the foreground; that overhead is currently managed through the
+        // conditions for deleting call counting stubs. There are potential solutions to reclaim as much memory as
+        // possible and to minimize the foreground overhead, but they appear to involve significantly higher complexity
+        // that doesn't seem worthwhile.
+        callCountingManager->m_callCountingStubAllocator.Reset();
+    }
+
+    s_callCountingStubCount = 0;
+    s_completedCallCountingStubCount = 0;
+}
+
+void CallCountingManager::TrimCollections()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+
+    // Resize the hash tables if it would save some space. The hash tables' item counts typically spike and then
+    // stabilize at a lower value after most of the repeatedly called methods are promoted and their call counting
+    // infos are deleted.
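+    //
+    // For example (hypothetical numbers): a hash table with a capacity of 256 holding 40 items satisfies
+    // 40 <= 256 / 4, so it would be reallocated with a requested capacity of 2 * 40 = 80 (the table may round that up
+    // internally), keeping a 2x growth margin while releasing most of the space from the spike.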
+
+    COUNT_T count = m_callCountingInfoByCodeVersionHash.GetCount();
+    COUNT_T capacity = m_callCountingInfoByCodeVersionHash.GetCapacity();
+    if (count == 0)
+    {
+        if (capacity != 0)
+        {
+            m_callCountingInfoByCodeVersionHash.RemoveAll();
+        }
+    }
+    else if (count <= capacity / 4)
+    {
+        EX_TRY
+        {
+            m_callCountingInfoByCodeVersionHash.Reallocate(count * 2);
+        }
+        EX_CATCH
+        {
+        }
+        EX_END_CATCH(RethrowTerminalExceptions);
+    }
+
+    count = m_methodDescForwarderStubHash.GetCount();
+    capacity = m_methodDescForwarderStubHash.GetCapacity();
+    if (count == 0)
+    {
+        if (capacity != 0)
+        {
+            m_methodDescForwarderStubHash.RemoveAll();
+        }
+    }
+    else if (count <= capacity / 4)
+    {
+        EX_TRY
+        {
+            m_methodDescForwarderStubHash.Reallocate(count * 2);
+        }
+        EX_CATCH
+        {
+        }
+        EX_END_CATCH(RethrowTerminalExceptions);
+    }
+}
+
+#endif // !DACCESS_COMPILE
+
+bool CallCountingManager::IsCallCountingStub(PCODE entryPoint)
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+        SUPPORTS_DAC;
+    }
+    CONTRACTL_END;
+
+    TADDR entryAddress = PCODEToPINSTR(entryPoint);
+    _ASSERTE(entryAddress != NULL);
+
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+    {
+        PTR_CallCountingManager callCountingManager = *it;
+        if (callCountingManager->m_callCountingStubAllocator.IsStub(entryAddress))
+        {
+            return true;
+        }
+    }
+    return false;
+}
+
+PCODE CallCountingManager::GetTargetForMethod(PCODE callCountingStubEntryPoint)
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_COOPERATIVE; // the call counting stub cannot be deleted while inspecting it
+        SUPPORTS_DAC;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(IsCallCountingStub(callCountingStubEntryPoint));
+
+    return PTR_CallCountingStub(PCODEToPINSTR(callCountingStubEntryPoint))->GetTargetForMethod();
+}
+
+#ifdef DACCESS_COMPILE
+
+void CallCountingManager::DacEnumerateCallCountingStubHeapRanges(CLRDataEnumMemoryFlags flags)
+{
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+        SUPPORTS_DAC;
+    }
+    CONTRACTL_END;
+
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+    for (auto itEnd = s_callCountingManagers->End(), it = s_callCountingManagers->Begin(); it != itEnd; ++it)
+    {
+        PTR_CallCountingManager callCountingManager = *it;
+        callCountingManager->m_callCountingStubAllocator.EnumerateHeapRanges(flags);
+    }
+}
+
+#endif // DACCESS_COMPILE
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingStubManager
+
+SPTR_IMPL(CallCountingStubManager, CallCountingStubManager, g_pManager);
+
+#ifndef DACCESS_COMPILE
+
+CallCountingStubManager::CallCountingStubManager()
+{
+    WRAPPER_NO_CONTRACT;
+}
+
+void CallCountingStubManager::Init()
+{
+    CONTRACTL
+    {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    g_pManager = new CallCountingStubManager();
+    StubManager::AddStubManager(g_pManager);
+}
+
+#endif // !DACCESS_COMPILE
+
+#ifdef _DEBUG
+const char *CallCountingStubManager::DbgGetName()
+{
+    WRAPPER_NO_CONTRACT;
+    return "CallCountingStubManager";
+}
+#endif
+
+#ifdef DACCESS_COMPILE
+LPCWSTR CallCountingStubManager::GetStubManagerName(PCODE addr)
+{
+    WRAPPER_NO_CONTRACT;
+    return W("CallCountingStub");
+}
+#endif
+
+BOOL CallCountingStubManager::CheckIsStub_Internal(PCODE entryPoint)
+{
+    WRAPPER_NO_CONTRACT;
+    SUPPORTS_DAC;
+
+    return CallCountingManager::IsCallCountingStub(entryPoint);
+}
+
+BOOL CallCountingStubManager::DoTraceStub(PCODE callCountingStubEntryPoint, TraceDestination *trace)
+{
+    WRAPPER_NO_CONTRACT;
+    SUPPORTS_DAC;
+    _ASSERTE(trace != nullptr);
+
+    trace->InitForStub(CallCountingManager::GetTargetForMethod(callCountingStubEntryPoint));
+    return true;
+}
+
+#ifdef DACCESS_COMPILE
+void CallCountingStubManager::DoEnumMemoryRegions(CLRDataEnumMemoryFlags flags)
+{
+    WRAPPER_NO_CONTRACT;
+    SUPPORTS_DAC;
+
+    DAC_ENUM_VTHIS();
+    EMEM_OUT(("MEM: %p CallCountingStubManager\n", dac_cast<TADDR>(this)));
+    CallCountingManager::DacEnumerateCallCountingStubHeapRanges(flags);
+}
+#endif
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
+#endif // FEATURE_TIERED_COMPILATION
diff --git a/src/coreclr/src/vm/callcounting.h b/src/coreclr/src/vm/callcounting.h
new file mode 100644 (file)
index 0000000..c10fc85
--- /dev/null
@@ -0,0 +1,342 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+#pragma once
+
+#include "codeversion.h"
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+/*******************************************************************************************************************************
+** Summary
+
+Outline of phases
+-----------------
+
+When starting call counting for a method (see CallCountingManager::SetCodeEntryPoint):
+- A CallCountingInfo is created (associated with the NativeCodeVersion to be counted), which initializes a remaining call count
+  with a threshold
+- A CallCountingStub is created. It contains a small amount of code that decrements the remaining call count and checks for
+  zero. When nonzero, it jumps to the code version's native code entry point. When zero, it forwards to a helper function that
+  handles tier promotion.
+- For tiered methods that don't have a precode (virtual and interface methods when slot backpatching is enabled), a
+  forwarder stub (a precode) is created, and it forwards to the call counting stub. This is so that the call counting
+  stub can be safely and easily deleted. The forwarder stubs are only used when counting calls; there is one per method
+  (not per code version), and they are not deleted.
+- The method's code entry point is set to the forwarder stub or the call counting stub to count calls to the code version
+
+When the call count threshold is reached (see CallCountingManager::OnCallCountThresholdReached):
+- The helper call enqueues completion of call counting for background processing
+- When completing call counting in the background, the code version is enqueued for promotion, and the call counting stub is
+  removed from the call chain
+
+After all work queued for promotion is completed and methods have transitioned to the optimized tier, some cleanup
+follows (see CallCountingManager::StopAndDeleteAllCallCountingStubs):
+- Some heuristics are checked, and if cleanup is to be done, the runtime is suspended
+- All call counting stubs are deleted. For code versions that have not completed counting, the method's code entry point is
+  reset such that call counting would be reestablished on the next call.
+- Completed call counting infos are deleted
+- For methods that no longer have any code versions that need to be counted, the forwarder stubs are no longer tracked. If a
+  new IL code version is added thereafter (perhaps by a profiler), a new forwarder stub may be created.
+
+Miscellaneous
+-------------
+
+- The CallCountingManager is the main class with most of the logic. Its private subclasses are just simple data structures.
+- The code versioning lock is used for the data structures used for call counting. Since installing a call counting
+  stub requires knowing what the currently active code version is, it made sense to use the same lock.
+- Call counting stubs have hardcoded code. x64 has short and long stubs; short stubs are used when possible (often) and
+  use IP-relative branches to the method's code and helper stub. Other archs have only one type of stub (a short stub).
+  - Call counting stubs pass a stub-identifying token to the threshold-reached helper function. The stub's address can be
+    determined from it. On x64, it also indicates whether the stub is a short or long stub.
+  - From a call counting stub, the call counting info can be determined using the remaining call count cell, and from the call
+    counting info the code version and method can be determined
+- Call counting is not stopped when the tiering delay is reactivated (which often happens in larger and more realistic
+  scenarios). The overhead necessary to stop and restart call counting (among other things, many methods would have to
+  go through the prestub again) is greater than the overhead of completing call counting and calling the
+  threshold-reached helper function, even for very high call count thresholds. While it may at times be desirable not
+  to count method calls during startup phases, there would be a fair bit of additional overhead to stop counting. On
+  the other hand, it may at times be beneficial to rejit some methods during startup. So for now, only methods newly
+  called during the current tiering delay are not counted; any that have already started counting will continue (their
+  delay has already expired).
+
+*******************************************************************************************************************************/
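+
+// An illustrative sketch of the control flow described above (simplified and hypothetical; the real stubs are
+// hardcoded machine code per architecture, and the pseudo-helpers below only approximate the functions in this file):
+//
+//     // call counting stub, conceptually:
+//     if (--(*remainingCallCountCell) != 0)
+//         goto targetForMethod;                // jump to the code version's native code entry point
+//     goto threshold-reached helper(stubIdentifyingToken);
+//
+//     // threshold-reached helper, conceptually (see CallCountingManager::OnCallCountThresholdReached):
+//     stub        = the call counting stub identified by stubIdentifyingToken
+//     info        = CallCountingInfo::From(stub's remaining call count cell)
+//     codeVersion = info->GetCodeVersion()
+//     mark the info as PendingCompletion and enqueue completion of call counting for background processing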
+
+#define DISABLE_COPY(T) \
+    T(const T &) = delete; \
+    T &operator =(const T &) = delete
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager
+
+class CallCountingManager;
+typedef DPTR(CallCountingManager) PTR_CallCountingManager;
+
+class CallCountingManager
+{
+    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+    // CallCountingManager::CallCountingInfo
+
+private:
+    class CallCountingInfo;
+    typedef DPTR(CallCountingInfo) PTR_CallCountingInfo;
+
+    class CallCountingInfo
+    {
+    public:
+        enum class Stage : UINT8
+        {
+            // Stub is definitely not going to be called, stub may be deleted
+            StubIsNotActive,
+
+            // Stub may be called, don't know if it's actually active (changes to code versions, etc.)
+            StubMayBeActive,
+
+            // Stub may be active, call counting complete, not yet promoted
+            PendingCompletion,
+
+            // Call counting is disabled, only used for the default code version to indicate that it is to be optimized
+            Disabled
+        };
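+
+        // Typical transitions, inferred from the flow in this file: StubMayBeActive -> PendingCompletion when the
+        // call count threshold is reached, after which the info is deleted once promotion is enqueued (see
+        // CompleteCallCounting); or StubMayBeActive -> StubIsNotActive when all call counting stubs are stopped and
+        // deleted (see StopAllCallCounting). Disabled is set only for the default code version and does not
+        // transition.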
+
+    private:
+        const NativeCodeVersion m_codeVersion;
+        const CallCountingStub *m_callCountingStub;
+        CallCount m_remainingCallCount;
+        Stage m_stage;
+
+    #ifndef DACCESS_COMPILE
+    private:
+        CallCountingInfo(NativeCodeVersion codeVersion);
+    public:
+        static CallCountingInfo *CreateWithCallCountingDisabled(NativeCodeVersion codeVersion);
+        CallCountingInfo(NativeCodeVersion codeVersion, CallCount callCountThreshold);
+        ~CallCountingInfo();
+    #endif
+
+    public:
+        static PTR_CallCountingInfo From(PTR_CallCount remainingCallCountCell);
+        NativeCodeVersion GetCodeVersion() const;
+
+    #ifndef DACCESS_COMPILE
+    public:
+        const CallCountingStub *GetCallCountingStub() const;
+        void SetCallCountingStub(const CallCountingStub *callCountingStub);
+        void ClearCallCountingStub();
+        CallCount *GetRemainingCallCountCell();
+    #endif
+
+    public:
+        Stage GetStage() const;
+    #ifndef DACCESS_COMPILE
+    public:
+        void SetStage(Stage stage);
+    #endif
+
+        ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+        // CallCountingManager::CallCountingInfo::CodeVersionHashTraits
+
+    public:
+        class CodeVersionHashTraits : public DefaultSHashTraits<PTR_CallCountingInfo>
+        {
+        private:
+            typedef DefaultSHashTraits<PTR_CallCountingInfo> Base;
+        public:
+            typedef Base::element_t element_t;
+            typedef Base::count_t count_t;
+            typedef const NativeCodeVersion key_t;
+
+        public:
+            static key_t GetKey(const element_t &e);
+            static BOOL Equals(const key_t &k1, const key_t &k2);
+            static count_t Hash(const key_t &k);
+        };
+    };
+
+    typedef SHash<CallCountingInfo::CodeVersionHashTraits> CallCountingInfoByCodeVersionHash;
+
+    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+    // CallCountingManager::CallCountingStubAllocator
+
+private:
+    class CallCountingStubAllocator
+    {
+    private:
+        // LoaderHeap cannot be constructed when DACCESS_COMPILE is defined (at the time, its destructor was private). Working
+        // around that by controlling creation/destruction using a pointer.
+        LoaderHeap *m_heap;
+        RangeList m_heapRangeList;
+
+    public:
+        CallCountingStubAllocator();
+        ~CallCountingStubAllocator();
+
+    #ifndef DACCESS_COMPILE
+    public:
+        void Reset();
+        const CallCountingStub *AllocateStub(CallCount *remainingCallCountCell, PCODE targetForMethod);
+    private:
+        LoaderHeap *AllocateHeap();
+    #endif // !DACCESS_COMPILE
+
+    public:
+        bool IsStub(TADDR entryPoint);
+
+    #ifdef DACCESS_COMPILE
+        void EnumerateHeapRanges(CLRDataEnumMemoryFlags flags);
+    #endif
+
+        DISABLE_COPY(CallCountingStubAllocator);
+    };
+
+    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+    // CallCountingManager::MethodDescForwarderStub
+
+private:
+    class MethodDescForwarderStubHashTraits : public DefaultSHashTraits<Precode *>
+    {
+    private:
+        typedef DefaultSHashTraits<Precode *> Base;
+    public:
+        typedef Base::element_t element_t;
+        typedef Base::count_t count_t;
+        typedef MethodDesc *key_t;
+
+    public:
+        static key_t GetKey(const element_t &e);
+        static BOOL Equals(const key_t &k1, const key_t &k2);
+        static count_t Hash(const key_t &k);
+    };
+
+    typedef SHash<MethodDescForwarderStubHashTraits> MethodDescForwarderStubHash;
+
+    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+    // CallCountingManager::CallCountingManagerHashTraits
+
+private:
+    class CallCountingManagerHashTraits : public DefaultSHashTraits<PTR_CallCountingManager>
+    {
+    private:
+        typedef DefaultSHashTraits<PTR_CallCountingManager> Base;
+    public:
+        typedef Base::element_t element_t;
+        typedef Base::count_t count_t;
+        typedef PTR_CallCountingManager key_t;
+
+    public:
+        static key_t GetKey(const element_t &e);
+        static BOOL Equals(const key_t &k1, const key_t &k2);
+        static count_t Hash(const key_t &k);
+    };
+
+    typedef SHash<CallCountingManagerHashTraits> CallCountingManagerHash;
+    typedef DPTR(CallCountingManagerHash) PTR_CallCountingManagerHash;
+
+    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+    // CallCountingManager members
+
+private:
+    static PTR_CallCountingManagerHash s_callCountingManagers;
+    static COUNT_T s_callCountingStubCount;
+    static COUNT_T s_activeCallCountingStubCount;
+    static COUNT_T s_completedCallCountingStubCount;
+
+private:
+    CallCountingInfoByCodeVersionHash m_callCountingInfoByCodeVersionHash;
+    CallCountingStubAllocator m_callCountingStubAllocator;
+    MethodDescForwarderStubHash m_methodDescForwarderStubHash;
+    SArray<CallCountingInfo *> m_callCountingInfosPendingCompletion;
+
+public:
+    CallCountingManager();
+    ~CallCountingManager();
+
+#ifndef DACCESS_COMPILE
+public:
+    static void StaticInitialize();
+#endif // !DACCESS_COMPILE
+
+public:
+    bool IsCallCountingEnabled(NativeCodeVersion codeVersion);
+
+#ifndef DACCESS_COMPILE
+public:
+    void DisableCallCounting(NativeCodeVersion codeVersion);
+
+public:
+    static bool SetCodeEntryPoint(
+        NativeCodeVersion activeCodeVersion,
+        PCODE codeEntryPoint,
+        bool wasMethodCalled,
+        bool *scheduleTieringBackgroundWorkRef);
+    static PCODE OnCallCountThresholdReached(TransitionBlock *transitionBlock, TADDR stubIdentifyingToken);
+    static COUNT_T GetCountOfCodeVersionsPendingCompletion();
+    static void CompleteCallCounting();
+    void RemoveForwarderStub(MethodDesc *methodDesc);
+
+public:
+    static void StopAndDeleteAllCallCountingStubs();
+private:
+    static void StopAllCallCounting(TieredCompilationManager *tieredCompilationManager, bool *scheduleTieringBackgroundWorkRef);
+    static void DeleteAllCallCountingStubs();
+    void TrimCollections();
+#endif // !DACCESS_COMPILE
+
+public:
+    static bool IsCallCountingStub(PCODE entryPoint);
+    static PCODE GetTargetForMethod(PCODE callCountingStubEntryPoint);
+#ifdef DACCESS_COMPILE
+    static void DacEnumerateCallCountingStubHeapRanges(CLRDataEnumMemoryFlags flags);
+#endif
+
+    DISABLE_COPY(CallCountingManager);
+};
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// CallCountingManager::CallCountingStubManager
+
+class CallCountingStubManager;
+typedef VPTR(CallCountingStubManager) PTR_CallCountingStubManager;
+
+class CallCountingStubManager : public StubManager
+{
+    VPTR_VTABLE_CLASS(CallCountingStubManager, StubManager);
+
+private:
+    SPTR_DECL(CallCountingStubManager, g_pManager);
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubManager();
+
+public:
+    static void Init();
+#endif
+
+#ifdef _DEBUG
+public:
+    virtual const char *DbgGetName(); // override
+#endif
+
+#ifdef DACCESS_COMPILE
+public:
+    virtual LPCWSTR GetStubManagerName(PCODE addr);
+#endif
+
+protected:
+    virtual BOOL CheckIsStub_Internal(PCODE entryPoint); // override
+    virtual BOOL DoTraceStub(PCODE callCountingStubEntryPoint, TraceDestination *trace); // override
+
+#ifdef DACCESS_COMPILE
+protected:
+    virtual void DoEnumMemoryRegions(CLRDataEnumMemoryFlags flags); // override
+#endif
+
+    DISABLE_COPY(CallCountingStubManager);
+};
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
+#undef DISABLE_COPY
+
+#endif // FEATURE_TIERED_COMPILATION
index e5128f9..5173d9f 100644 (file)
@@ -683,6 +683,9 @@ void EEStartupHelper(COINITIEE fFlags)
         InitializeStartupFlags();
 
         MethodDescBackpatchInfoTracker::StaticInitialize();
+        CodeVersionManager::StaticInitialize();
+        TieredCompilationManager::StaticInitialize();
+        CallCountingManager::StaticInitialize();
 
         InitThreadManager();
         STRESS_LOG0(LF_STARTUP, LL_ALWAYS, "Returned successfully from InitThreadManager");
index fc3c22e..9ff6416 100644 (file)
 // versioning information
 //
 
-NativeCodeVersion::NativeCodeVersion() : m_pMethodDesc(PTR_NULL) {};
-NativeCodeVersion::NativeCodeVersion(const NativeCodeVersion & rhs) : m_pMethodDesc(rhs.m_pMethodDesc) {}
 NativeCodeVersion::NativeCodeVersion(PTR_MethodDesc pMethod) : m_pMethodDesc(pMethod) {}
-BOOL NativeCodeVersion::IsNull() const { return m_pMethodDesc == NULL; }
-PTR_MethodDesc NativeCodeVersion::GetMethodDesc() const { return m_pMethodDesc; }
-NativeCodeVersionId NativeCodeVersion::GetVersionId() const { return 0; }
 BOOL NativeCodeVersion::IsDefaultVersion() const { return TRUE; }
 PCODE NativeCodeVersion::GetNativeCode() const { return m_pMethodDesc->GetNativeCode(); }
 
@@ -48,19 +43,10 @@ void NativeCodeVersion::SetGCCoverageInfo(PTR_GCCoverageInfo gcCover)
 }
 #endif
 
-bool NativeCodeVersion::operator==(const NativeCodeVersion & rhs) const { return m_pMethodDesc == rhs.m_pMethodDesc; }
-bool NativeCodeVersion::operator!=(const NativeCodeVersion & rhs) const { return !operator==(rhs); }
-
 
 #else // FEATURE_CODE_VERSIONING
 
 
-// This HRESULT is only used as a private implementation detail. If it escapes through public APIS
-// it is a bug. Corerror.xml has a comment in it reserving this value for our use but it doesn't
-// appear in the public headers.
-
-#define CORPROF_E_RUNTIME_SUSPEND_REQUIRED _HRESULT_TYPEDEF_(0x80131381L)
-
 #ifndef DACCESS_COMPILE
 NativeCodeVersionNode::NativeCodeVersionNode(
     NativeCodeVersionId id,
@@ -83,20 +69,6 @@ NativeCodeVersionNode::NativeCodeVersionNode(
 {}
 #endif
 
-#ifdef DEBUG
-BOOL NativeCodeVersionNode::LockOwnedByCurrentThread() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    return GetMethodDesc()->GetCodeVersionManager()->LockOwnedByCurrentThread();
-}
-#endif //DEBUG
-
-PTR_MethodDesc NativeCodeVersionNode::GetMethodDesc() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    return m_pMethodDesc;
-}
-
 PCODE NativeCodeVersionNode::GetNativeCode() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
@@ -115,7 +87,7 @@ ILCodeVersion NativeCodeVersionNode::GetILCodeVersion() const
 #ifdef DEBUG
     if (GetILVersionId() != 0)
     {
-        _ASSERTE(LockOwnedByCurrentThread());
+        _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     }
 #endif
     PTR_MethodDesc pMD = GetMethodDesc();
@@ -140,7 +112,7 @@ BOOL NativeCodeVersionNode::SetNativeCodeInterlocked(PCODE pCode, PCODE pExpecte
 BOOL NativeCodeVersionNode::IsActiveChildVersion() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     return (m_flags & IsActiveChildFlag) != 0;
 }
 
@@ -148,7 +120,7 @@ BOOL NativeCodeVersionNode::IsActiveChildVersion() const
 void NativeCodeVersionNode::SetActiveChildFlag(BOOL isActive)
 {
     LIMITED_METHOD_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     if (isActive)
     {
         m_flags |= IsActiveChildFlag;
@@ -199,23 +171,6 @@ void NativeCodeVersionNode::SetGCCoverageInfo(PTR_GCCoverageInfo gcCover)
 
 #endif // HAVE_GCCOVER
 
-NativeCodeVersion::NativeCodeVersion() :
-    m_storageKind(StorageKind::Unknown), m_pVersionNode(PTR_NULL)
-{}
-
-NativeCodeVersion::NativeCodeVersion(const NativeCodeVersion & rhs) :
-    m_storageKind(rhs.m_storageKind)
-{
-    if(m_storageKind == StorageKind::Explicit)
-    {
-        m_pVersionNode = rhs.m_pVersionNode;
-    }
-    else if(m_storageKind == StorageKind::Synthetic)
-    {
-        m_synthetic = rhs.m_synthetic;
-    }
-}
-
 NativeCodeVersion::NativeCodeVersion(PTR_NativeCodeVersionNode pVersionNode) :
     m_storageKind(pVersionNode != NULL ? StorageKind::Explicit : StorageKind::Unknown),
     m_pVersionNode(pVersionNode)
@@ -228,31 +183,12 @@ NativeCodeVersion::NativeCodeVersion(PTR_MethodDesc pMethod) :
     m_synthetic.m_pMethodDesc = pMethod;
 }
 
-BOOL NativeCodeVersion::IsNull() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    return m_storageKind == StorageKind::Unknown;
-}
-
 BOOL NativeCodeVersion::IsDefaultVersion() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
     return m_storageKind == StorageKind::Synthetic;
 }
 
-PTR_MethodDesc NativeCodeVersion::GetMethodDesc() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    if (m_storageKind == StorageKind::Explicit)
-    {
-        return AsNode()->GetMethodDesc();
-    }
-    else
-    {
-        return m_synthetic.m_pMethodDesc;
-    }
-}
-
 PCODE NativeCodeVersion::GetNativeCode() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
@@ -293,19 +229,6 @@ ILCodeVersion NativeCodeVersion::GetILCodeVersion() const
     }
 }
 
-NativeCodeVersionId NativeCodeVersion::GetVersionId() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    if (m_storageKind == StorageKind::Explicit)
-    {
-        return AsNode()->GetVersionId();
-    }
-    else
-    {
-        return 0;
-    }
-}
-
 #ifndef DACCESS_COMPILE
 BOOL NativeCodeVersion::SetNativeCodeInterlocked(PCODE pCode, PCODE pExpected)
 {
@@ -473,30 +396,6 @@ PTR_NativeCodeVersionNode NativeCodeVersion::AsNode()
 }
 #endif
 
-bool NativeCodeVersion::operator==(const NativeCodeVersion & rhs) const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    if (m_storageKind == StorageKind::Explicit)
-    {
-        return (rhs.m_storageKind == StorageKind::Explicit) &&
-            (rhs.AsNode() == AsNode());
-    }
-    else if (m_storageKind == StorageKind::Synthetic)
-    {
-        return (rhs.m_storageKind == StorageKind::Synthetic) &&
-            (m_synthetic.m_pMethodDesc == rhs.m_synthetic.m_pMethodDesc);
-    }
-    else
-    {
-        return rhs.m_storageKind == StorageKind::Unknown;
-    }
-}
-bool NativeCodeVersion::operator!=(const NativeCodeVersion & rhs) const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    return !operator==(rhs);
-}
-
 NativeCodeVersionCollection::NativeCodeVersionCollection(PTR_MethodDesc pMethodDescFilter, ILCodeVersion ilCodeFilter) :
     m_pMethodDescFilter(pMethodDescFilter),
     m_ilCodeFilter(ilCodeFilter)
@@ -625,14 +524,6 @@ ILCodeVersionNode::ILCodeVersionNode(Module* pModule, mdMethodDef methodDef, ReJ
 {}
 #endif
 
-#ifdef DEBUG
-BOOL ILCodeVersionNode::LockOwnedByCurrentThread() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-    return GetModule()->GetCodeVersionManager()->LockOwnedByCurrentThread();
-}
-#endif //DEBUG
-
 PTR_Module ILCodeVersionNode::GetModule() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
@@ -679,14 +570,14 @@ DWORD ILCodeVersionNode::GetJitFlags() const
 const InstrumentedILOffsetMapping* ILCodeVersionNode::GetInstrumentedILMap() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     return &m_instrumentedILMap;
 }
 
 PTR_ILCodeVersionNode ILCodeVersionNode::GetNextILVersionNode() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     return m_pNextILVersionNode;
 }
 
@@ -695,7 +586,7 @@ void ILCodeVersionNode::SetRejitState(ILCodeVersion::RejitFlags newState)
 {
     LIMITED_METHOD_CONTRACT;
     // We're doing a non thread safe modification to m_rejitState
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
 
     ILCodeVersion::RejitFlags oldNonMaskFlags =
         static_cast<ILCodeVersion::RejitFlags>(m_rejitState.Load() & ~ILCodeVersion::kStateMask);
@@ -706,7 +597,7 @@ void ILCodeVersionNode::SetEnableReJITCallback(BOOL state)
 {
     LIMITED_METHOD_CONTRACT;
     // We're doing a non thread safe modification to m_rejitState
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
 
     ILCodeVersion::RejitFlags oldFlags = m_rejitState.Load();
     if (state)
@@ -734,14 +625,14 @@ void ILCodeVersionNode::SetJitFlags(DWORD flags)
 void ILCodeVersionNode::SetInstrumentedILMap(SIZE_T cMap, COR_IL_MAP * rgMap)
 {
     LIMITED_METHOD_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     m_instrumentedILMap.SetMappingInfo(cMap, rgMap);
 }
 
 void ILCodeVersionNode::SetNextILVersionNode(ILCodeVersionNode* pNextILVersionNode)
 {
     LIMITED_METHOD_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     m_pNextILVersionNode = pNextILVersionNode;
 }
 #endif
@@ -874,6 +765,39 @@ NativeCodeVersion ILCodeVersion::GetActiveNativeCodeVersion(PTR_MethodDesc pClos
     return NativeCodeVersion();
 }
 
+#if defined(FEATURE_TIERED_COMPILATION) && !defined(DACCESS_COMPILE)
+bool ILCodeVersion::HasAnyOptimizedNativeCodeVersion(NativeCodeVersion tier0NativeCodeVersion) const
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+    _ASSERTE(!tier0NativeCodeVersion.IsNull());
+    _ASSERTE(tier0NativeCodeVersion.GetILCodeVersion() == *this);
+    _ASSERTE(tier0NativeCodeVersion.GetMethodDesc()->IsEligibleForTieredCompilation());
+    _ASSERTE(tier0NativeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+
+    NativeCodeVersionCollection nativeCodeVersions = GetNativeCodeVersions(tier0NativeCodeVersion.GetMethodDesc());
+    for (auto itEnd = nativeCodeVersions.End(), it = nativeCodeVersions.Begin(); it != itEnd; ++it)
+    {
+        NativeCodeVersion nativeCodeVersion = *it;
+
+        // The tier 0 native code version is often the default code version, and this check is much faster than the
+        // one below
+        if (nativeCodeVersion == tier0NativeCodeVersion)
+        {
+            continue;
+        }
+
+        NativeCodeVersion::OptimizationTier optimizationTier = nativeCodeVersion.GetOptimizationTier();
+        if (optimizationTier == NativeCodeVersion::OptimizationTier1 ||
+            optimizationTier == NativeCodeVersion::OptimizationTierOptimized)
+        {
+            return true;
+        }
+    }
+
+    return false;
+}
+#endif
+
 ILCodeVersion::RejitFlags ILCodeVersion::GetRejitState() const
 {
     LIMITED_METHOD_DAC_CONTRACT;
@@ -1055,7 +979,7 @@ HRESULT ILCodeVersion::GetOrCreateActiveNativeCodeVersion(MethodDesc* pClosedMet
     return S_OK;
 }
 
-HRESULT ILCodeVersion::SetActiveNativeCodeVersion(NativeCodeVersion activeNativeCodeVersion, BOOL fEESuspended)
+HRESULT ILCodeVersion::SetActiveNativeCodeVersion(NativeCodeVersion activeNativeCodeVersion)
 {
     LIMITED_METHOD_CONTRACT;
     HRESULT hr = S_OK;
@@ -1077,7 +1001,7 @@ HRESULT ILCodeVersion::SetActiveNativeCodeVersion(NativeCodeVersion activeNative
     CodeVersionManager* pCodeVersionManager = GetModule()->GetCodeVersionManager();
     if (pCodeVersionManager->GetActiveILCodeVersion(GetModule(), GetMethodDef()) == *this)
     {
-        if (FAILED(hr = pCodeVersionManager->PublishNativeCodeVersion(pMethodDesc, activeNativeCodeVersion, fEESuspended)))
+        if (FAILED(hr = pCodeVersionManager->PublishNativeCodeVersion(pMethodDesc, activeNativeCodeVersion)))
         {
             return hr;
         }
@@ -1161,8 +1085,8 @@ void ILCodeVersionIterator::Next()
     }
     if (m_stage == IterationStage::ImplicitCodeVersion)
     {
+        _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
         CodeVersionManager* pCodeVersionManager = m_pCollection->m_pModule->GetCodeVersionManager();
-        _ASSERTE(pCodeVersionManager->LockOwnedByCurrentThread());
         PTR_ILCodeVersioningState pILCodeVersioningState = pCodeVersionManager->GetILCodeVersioningState(m_pCollection->m_pModule, m_pCollection->m_methodDef);
         if (pILCodeVersioningState != NULL)
         {
@@ -1327,57 +1251,6 @@ bool CodeVersionManager::s_initialNativeCodeVersionMayNotBeTheDefaultNativeCodeV
 CodeVersionManager::CodeVersionManager()
 {}
 
-//---------------------------------------------------------------------------------------
-//
-// Called from BaseDomain::BaseDomain to do any constructor-time initialization.
-// Presently, this takes care of initializing the Crst.
-//
-
-void CodeVersionManager::PreInit()
-{
-    CONTRACTL
-    {
-        THROWS;
-        GC_TRIGGERS;
-        CAN_TAKE_LOCK;
-        MODE_ANY;
-    }
-    CONTRACTL_END;
-
-#ifndef DACCESS_COMPILE
-    m_crstTable.Init(
-        CrstReJITDomainTable,
-        CrstFlags(CRST_UNSAFE_ANYMODE | CRST_DEBUGGER_THREAD | CRST_REENTRANCY | CRST_TAKEN_DURING_SHUTDOWN));
-#endif // DACCESS_COMPILE
-}
-
-CodeVersionManager::TableLockHolder::TableLockHolder(CodeVersionManager* pCodeVersionManager) :
-    CrstHolder(&pCodeVersionManager->m_crstTable)
-{
-}
-#ifndef DACCESS_COMPILE
-void CodeVersionManager::EnterLock()
-{
-    m_crstTable.Enter();
-}
-void CodeVersionManager::LeaveLock()
-{
-    m_crstTable.Leave();
-}
-#endif
-
-#ifdef DEBUG
-BOOL CodeVersionManager::LockOwnedByCurrentThread() const
-{
-    LIMITED_METHOD_DAC_CONTRACT;
-#ifdef DACCESS_COMPILE
-    return TRUE;
-#else
-    return const_cast<CrstExplicitInit &>(m_crstTable).OwnedByCurrentThread();
-#endif
-}
-#endif
-
 PTR_ILCodeVersioningState CodeVersionManager::GetILCodeVersioningState(PTR_Module pModule, mdMethodDef methodDef) const
 {
     LIMITED_METHOD_DAC_CONTRACT;
@@ -1464,28 +1337,28 @@ DWORD CodeVersionManager::GetNonDefaultILVersionCount()
 ILCodeVersionCollection CodeVersionManager::GetILCodeVersions(PTR_MethodDesc pMethod)
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     return GetILCodeVersions(dac_cast<PTR_Module>(pMethod->GetModule()), pMethod->GetMemberDef());
 }
 
 ILCodeVersionCollection CodeVersionManager::GetILCodeVersions(PTR_Module pModule, mdMethodDef methodDef)
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     return ILCodeVersionCollection(pModule, methodDef);
 }
 
 ILCodeVersion CodeVersionManager::GetActiveILCodeVersion(PTR_MethodDesc pMethod)
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     return GetActiveILCodeVersion(dac_cast<PTR_Module>(pMethod->GetModule()), pMethod->GetMemberDef());
 }
 
 ILCodeVersion CodeVersionManager::GetActiveILCodeVersion(PTR_Module pModule, mdMethodDef methodDef)
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     ILCodeVersioningState* pILCodeVersioningState = GetILCodeVersioningState(pModule, methodDef);
     if (pILCodeVersioningState == NULL)
     {
@@ -1500,7 +1373,7 @@ ILCodeVersion CodeVersionManager::GetActiveILCodeVersion(PTR_Module pModule, mdM
 ILCodeVersion CodeVersionManager::GetILCodeVersion(PTR_MethodDesc pMethod, ReJITID rejitId)
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
 
 #ifdef FEATURE_REJIT
     ILCodeVersionCollection collection = GetILCodeVersions(pMethod);
@@ -1521,14 +1394,14 @@ ILCodeVersion CodeVersionManager::GetILCodeVersion(PTR_MethodDesc pMethod, ReJIT
 NativeCodeVersionCollection CodeVersionManager::GetNativeCodeVersions(PTR_MethodDesc pMethod) const
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     return NativeCodeVersionCollection(pMethod, ILCodeVersion());
 }
 
 NativeCodeVersion CodeVersionManager::GetNativeCodeVersion(PTR_MethodDesc pMethod, PCODE codeStartAddress) const
 {
     LIMITED_METHOD_DAC_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
 
     NativeCodeVersionCollection nativeCodeVersions = GetNativeCodeVersions(pMethod);
     for (NativeCodeVersionIterator cur = nativeCodeVersions.Begin(), end = nativeCodeVersions.End(); cur != end; cur++)
@@ -1545,7 +1418,7 @@ NativeCodeVersion CodeVersionManager::GetNativeCodeVersion(PTR_MethodDesc pMetho
 HRESULT CodeVersionManager::AddILCodeVersion(Module* pModule, mdMethodDef methodDef, ReJITID rejitId, ILCodeVersion* pILCodeVersion)
 {
     LIMITED_METHOD_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
 
     ILCodeVersioningState* pILCodeVersioningState;
     HRESULT hr = GetOrCreateILCodeVersioningState(pModule, methodDef, &pILCodeVersioningState);
@@ -1565,7 +1438,7 @@ HRESULT CodeVersionManager::AddILCodeVersion(Module* pModule, mdMethodDef method
     return S_OK;
 }
 
-HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersions, DWORD cActiveVersions, BOOL fEESuspended, CDynArray<CodePublishError> * pErrors)
+HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersions, DWORD cActiveVersions, CDynArray<CodePublishError> * pErrors)
 {
     // If the IL version is in the shared domain we need to iterate all domains
     // looking for instantiations. The domain iterator lock is bigger than
@@ -1587,7 +1460,7 @@ HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersio
         PRECONDITION(CheckPointer(pErrors, NULL_OK));
     }
     CONTRACTL_END;
-    _ASSERTE(!LockOwnedByCurrentThread());
+    _ASSERTE(!IsLockOwnedByCurrentThread());
     HRESULT hr = S_OK;
 
 #if DEBUG
@@ -1609,7 +1482,7 @@ HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersio
     // any new method instantiations added after this point will bind to
     // the correct version
     {
-        TableLockHolder(this);
+        LockHolder codeVersioningLockHolder;
         for (DWORD i = 0; i < cActiveVersions; i++)
         {
             ILCodeVersion activeVersion = pActiveVersions[i];
@@ -1654,7 +1527,7 @@ HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersio
         // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
         // must be used here to prevent deadlock.
         GCX_COOP();
-        TableLockHolder lock(this);
+        LockHolder codeVersioningLockHolder;
 
         for (DWORD i = 0; i < cActiveVersions; i++)
         {
@@ -1679,7 +1552,7 @@ HRESULT CodeVersionManager::SetActiveILCodeVersions(ILCodeVersion* pActiveVersio
 
                 // Publish that child version, because it is the active native child of the active IL version
                 // Failing to publish is non-fatal, but we do record it so the caller is aware
-                if (FAILED(hr = PublishNativeCodeVersion(methodDescs[j], activeNativeChild, fEESuspended)))
+                if (FAILED(hr = PublishNativeCodeVersion(methodDescs[j], activeNativeChild)))
                 {
                     if (FAILED(hr = AddCodePublishError(activeILVersion.GetModule(), activeILVersion.GetMethodDef(), methodDescs[j], hr, &errorRecords)))
                     {
@@ -1701,7 +1574,7 @@ HRESULT CodeVersionManager::AddNativeCodeVersion(
     NativeCodeVersion* pNativeCodeVersion)
 {
     LIMITED_METHOD_CONTRACT;
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
 
     MethodDescVersioningState* pMethodVersioningState;
     HRESULT hr = GetOrCreateMethodDescVersioningState(pClosedMethodDesc, &pMethodVersioningState);
@@ -1742,7 +1615,7 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
     bool *doFullBackpatchRef)
 {
     STANDARD_VM_CONTRACT;
-    _ASSERTE(!LockOwnedByCurrentThread());
+    _ASSERTE(!IsLockOwnedByCurrentThread());
     _ASSERTE(pMethodDesc->IsVersionable());
     _ASSERTE(doBackpatchRef != nullptr);
     _ASSERTE(*doBackpatchRef);
@@ -1771,7 +1644,7 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
             return NULL;
         }
 
-        TableLockHolder lock(this);
+        LockHolder codeVersioningLockHolder;
 
         if (SUCCEEDED(hr = GetActiveILCodeVersion(pMethodDesc).GetOrCreateActiveNativeCodeVersion(pMethodDesc, &activeVersion)))
         {
@@ -1785,14 +1658,14 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
         return pCode != NULL ? pCode : pMethodDesc->PrepareInitialCode();
     } while (false);
 
-#ifdef FEATURE_TIERED_COMPILATION
-    bool shouldCountCalls = true;
-#endif
     while (true)
     {
-        // Compile the code if needed
+        bool handleCallCountingForFirstCall = false;
+        bool handleCallCounting = false;
         bool doPublish = true;
         bool profilerMayHaveActivatedNonDefaultCodeVersion = false;
+
+        // Compile the code if needed
         if (pCode == NULL)
         {
             PrepareCodeConfigBuffer configBuffer(activeVersion);
@@ -1813,8 +1686,7 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
                 _ASSERTE(
                     !config->ShouldCountCalls() ||
                     activeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
-                if (shouldCountCalls &&
-                    config->ShouldCountCalls()) // the generated code was at a tier that is call-counted
+                if (config->ShouldCountCalls()) // the generated code was at a tier that is call-counted
                 {
                     // This is the first call to a call-counted code version of the method
                     // - It is possible that this is not the first call to the method, for example after the method is called a
@@ -1825,12 +1697,16 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
                     // - Currently, there is only one call-counted tier in the normal flow of tier transitions for a method. In
                     //   the future there may be more call-counted tiers. Those code versions should be jitted and activated in
                     //   the background and would not reach this path.
-                    if (!GetAppDomain()->GetTieredCompilationManager()->OnMethodCodeVersionCalledFirstTime(pMethodDesc))
+                    if (g_pConfig->TieredCompilation_CallCountingDelayMs() != 0)
                     {
+                        handleCallCountingForFirstCall = true;
+                    }
+                    else if (g_pConfig->TieredCompilation_CallCounting())
+                    {
+                        // The tiering delay is disabled. Avoid creating call counting stubs on the first call to every
+                        // method, as that is a slower path; instead, wait for a second call to establish call counting.
                         doPublish = false;
                     }
-
-                    shouldCountCalls = false;
                 }
             #endif
             }
@@ -1843,70 +1719,124 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
             }
         }
     #ifdef FEATURE_TIERED_COMPILATION
-        else if (shouldCountCalls && CallCounter::OnMethodCodeVersionCalledSubsequently(activeVersion, &doPublish))
+        else
         {
-            shouldCountCalls = false;
+            handleCallCounting = true;
         }
     #endif
 
-        bool mayHaveEntryPointSlotsToBackpatch = doPublish && pMethodDesc->MayHaveEntryPointSlotsToBackpatch();
-        MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder(mayHaveEntryPointSlotsToBackpatch);
-
-        // Try a faster check to see if we can avoid checking the currently active code version
-        // - For the default code version, if a profiler is attached it may be notified of JIT events and may request rejit
-        //   synchronously on the same thread. In that case, for back-compat as described below, the returned code must be for
-        //   the rejitted or newer code.
-        // - It must be ensured that there are no races that could cause an older entry point to replace a newer entry point.
-        //   For non-default code versions, it's necessary to check the currently active code version and publish under the
-        //   CodeVersionManager's lock. For default code versions, the entry point is atomically updated and only if it is
-        //   pointing to the prestub.
-        if (activeVersion.IsDefaultVersion() && !profilerMayHaveActivatedNonDefaultCodeVersion)
+        bool done = false;
+        bool scheduleTieringBackgroundWork = false;
+        NativeCodeVersion newActiveVersion;
+        do
         {
-            if (doPublish)
+            bool mayHaveEntryPointSlotsToBackpatch = doPublish && pMethodDesc->MayHaveEntryPointSlotsToBackpatch();
+            MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder(mayHaveEntryPointSlotsToBackpatch);
+
+            // Try a faster check to see if we can avoid checking the currently active code version
+            // - For the default code version, if a profiler is attached it may be notified of JIT events and may request rejit
+            //   synchronously on the same thread. In that case, for back-compat as described below, the returned code must be for
+            //   the rejitted or newer code.
+            // - It must be ensured that there are no races that could cause an older entry point to replace a newer entry point.
+            //   For non-default code versions, it's necessary to check the currently active code version and publish under the
+            //   CodeVersionManager's lock. For default code versions, the entry point is atomically updated and only if it is
+            //   pointing to the prestub.
+            if (activeVersion.IsDefaultVersion() && !handleCallCounting && !profilerMayHaveActivatedNonDefaultCodeVersion)
             {
-                pMethodDesc->TrySetInitialCodeEntryPointForVersionableMethod(pCode, mayHaveEntryPointSlotsToBackpatch);
+                if (doPublish)
+                {
+                    pMethodDesc->TrySetInitialCodeEntryPointForVersionableMethod(pCode, mayHaveEntryPointSlotsToBackpatch);
+                }
+                else
+                {
+                    *doBackpatchRef = false;
+                }
+
+                done = true;
+                break;
             }
-            else
+
+            // Backpatching entry point slots requires cooperative GC mode, see
+            // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
+            // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
+            // must be used here to prevent deadlock.
+            GCX_MAYBE_COOP(mayHaveEntryPointSlotsToBackpatch);
+            LockHolder codeVersioningLockHolder;
+
+            hr = GetActiveILCodeVersion(pMethodDesc).GetOrCreateActiveNativeCodeVersion(pMethodDesc, &newActiveVersion);
+            if (FAILED(hr))
             {
-                *doBackpatchRef = false;
+                break;
             }
-            return pCode;
-        }
 
-        // Backpatching entry point slots requires cooperative GC mode, see
-        // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
-        // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
-        // must be used here to prevent deadlock.
-        GCX_MAYBE_COOP(mayHaveEntryPointSlotsToBackpatch);
-        TableLockHolder lock(this);
+            // The common case is that newActiveVersion == activeVersion; however, we did leave the lock, so there is a
+            // possibility that the active version has changed. If it has, we need to restart the compilation
+            // and publishing process with the new active version instead.
+            //
+            // In theory it should be legitimate to break out of this loop and run the less recent active version,
+            // because ultimately this is a race between one thread that is updating the version and another thread
+            // trying to run the current version. However, for back-compat with ReJIT we need to guarantee that
+            // a versioning update at least as late as the profiler JitCompilationFinished callback wins the race.
+            if (newActiveVersion == activeVersion)
+            {
+                if (doPublish)
+                {
+                    if (!handleCallCounting)
+                    {
+                        pMethodDesc->SetCodeEntryPoint(pCode);
+                    }
+                #ifdef FEATURE_TIERED_COMPILATION
+                    else if (
+                        !CallCountingManager::SetCodeEntryPoint(activeVersion, pCode, true, &scheduleTieringBackgroundWork))
+                    {
+                        _ASSERTE(!g_pConfig->TieredCompilation_UseCallCountingStubs());
+                        _ASSERTE(!scheduleTieringBackgroundWork);
+                        *doBackpatchRef = doPublish = false;
+                    }
+                #endif
+                }
+                else
+                {
+                    *doBackpatchRef = false;
+                }
 
-        NativeCodeVersion newActiveVersion;
-        if (FAILED(hr = GetActiveILCodeVersion(pMethodDesc).GetOrCreateActiveNativeCodeVersion(pMethodDesc, &newActiveVersion)))
-        {
-            break;
-        }
+                done = true;
+            }
+        } while (false);
 
-        // The common case is that newActiveCode == activeCode, however we did leave the lock so there is
-        // possibility that the active version has changed. If it has we need to restart the compilation
-        // and publishing process with the new active version instead.
-        //
-        // In theory it should be legitimate to break out of this loop and run the less recent active version,
-        // because ultimately this is a race between one thread that is updating the version and another thread
-        // trying to run the current version. However for back-compat with ReJIT we need to guarantee that
-        // a versioning update at least as late as the profiler JitCompilationFinished callback wins the race.
-        if (newActiveVersion == activeVersion)
+        if (done)
         {
-            if (doPublish)
+            _ASSERTE(SUCCEEDED(hr));
+
+        #ifdef FEATURE_TIERED_COMPILATION
+            if (handleCallCountingForFirstCall)
             {
-                pMethodDesc->SetCodeEntryPoint(pCode);
+                _ASSERTE(doPublish);
+                _ASSERTE(!handleCallCounting);
+                _ASSERTE(!scheduleTieringBackgroundWork);
+
+                // The code entry point is set before recording the method for call counting to avoid a race. Otherwise, the
+                // tiering delay may expire and enable call counting for the method before the entry point is set here, in which
+                // case calls to the method would not be counted anymore.
+                GetAppDomain()->GetTieredCompilationManager()->HandleCallCountingForFirstCall(pMethodDesc);
             }
-            else
+            else if (scheduleTieringBackgroundWork)
             {
-                *doBackpatchRef = false;
+                _ASSERTE(doPublish);
+                _ASSERTE(handleCallCounting);
+                _ASSERTE(!handleCallCountingForFirstCall);
+                GetAppDomain()->GetTieredCompilationManager()->ScheduleBackgroundWork(); // requires GC_TRIGGERS
             }
+        #endif
+
             return pCode;
         }
 
+        if (FAILED(hr))
+        {
+            break;
+        }
+
         activeVersion = newActiveVersion;
         pCode = activeVersion.GetNativeCode();
     }
@@ -1917,11 +1847,8 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
     return pCode;
 }
 
-HRESULT CodeVersionManager::PublishNativeCodeVersion(MethodDesc* pMethod, NativeCodeVersion nativeCodeVersion, BOOL fEESuspended)
+HRESULT CodeVersionManager::PublishNativeCodeVersion(MethodDesc* pMethod, NativeCodeVersion nativeCodeVersion)
 {
-    // TODO: This function needs to make sure it does not change the precode's target if call counting is in progress. Track
-    // whether call counting is currently being done for the method, and use a lock to ensure the expected precode target.
-
     CONTRACTL
     {
         GC_NOTRIGGER;
@@ -1942,7 +1869,8 @@ HRESULT CodeVersionManager::PublishNativeCodeVersion(MethodDesc* pMethod, Native
     }
     CONTRACTL_END;
 
-    _ASSERTE(LockOwnedByCurrentThread());
+    _ASSERTE(!pMethod->MayHaveEntryPointSlotsToBackpatch() || MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     _ASSERTE(pMethod->IsVersionable());
 
     HRESULT hr = S_OK;
@@ -1957,7 +1885,12 @@ HRESULT CodeVersionManager::PublishNativeCodeVersion(MethodDesc* pMethod, Native
             }
             else
             {
+            #ifdef FEATURE_TIERED_COMPILATION
+                bool wasSet = CallCountingManager::SetCodeEntryPoint(nativeCodeVersion, pCode, false, nullptr);
+                _ASSERTE(wasSet);
+            #else
                 pMethod->SetCodeEntryPoint(pCode);
+            #endif
             }
         }
         EX_CATCH_HRESULT(hr);
@@ -2293,7 +2226,7 @@ void CodeVersionManager::ReportCodePublishError(Module* pModule, mdMethodDef met
 #ifdef FEATURE_REJIT
     BOOL isRejitted = FALSE;
     {
-        TableLockHolder(this);
+        LockHolder codeVersioningLockHolder;
         isRejitted = !GetActiveILCodeVersion(pModule, methodDef).IsDefaultVersion();
     }
 
@@ -2307,4 +2240,19 @@ void CodeVersionManager::ReportCodePublishError(Module* pModule, mdMethodDef met
 }
 #endif // DACCESS_COMPILE
 
+CrstStatic CodeVersionManager::s_lock;
+
+#ifdef _DEBUG
+bool CodeVersionManager::IsLockOwnedByCurrentThread()
+{
+    WRAPPER_NO_CONTRACT;
+
+#ifndef DACCESS_COMPILE
+    return !!s_lock.OwnedByCurrentThread();
+#else
+    return true;
+#endif
+}
+#endif // _DEBUG
+
 #endif // FEATURE_CODE_VERSIONING
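
For orientation, the restructured publish loop above condenses to the sketch below. Everything except the runtime APIs quoted from the patch is illustrative (stand-in parameters; no locking, GC modes, or error handling); it highlights the ordering this change fixes, where the entry point is published before the method is recorded for the tiering delay.

    // Condensed, illustrative control-flow sketch of the first-call handling above.
    static bool PublishSketch(bool firstCallToCountedTier, bool delayEnabled, bool countingEnabled)
    {
        bool handleCallCountingForFirstCall = firstCallToCountedTier && delayEnabled;
        // With the delay disabled, skip publishing on the first call so that call counting
        // is established on the second call instead (no stub creation on the first call).
        bool doPublish = !(firstCallToCountedTier && !delayEnabled && countingEnabled);

        if (doPublish)
        {
            // Under the code versioning lock: set the entry point, either directly or via
            // CallCountingManager::SetCodeEntryPoint(), which may install a counting stub.
        }
        if (handleCallCountingForFirstCall)
        {
            // Only after the entry point is set: record the method for the tiering delay
            // (HandleCallCountingForFirstCall), closing the race described above.
        }
        return doPublish;
    }
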
index daf0b76..43d2093 100644 (file)
@@ -31,10 +31,6 @@ typedef DPTR(class ILCodeVersioningState) PTR_ILCodeVersioningState;
 class CodeVersionManager;
 typedef DPTR(class CodeVersionManager) PTR_CodeVersionManager;
 
-// This HRESULT is only used as a private implementation detail. Corerror.xml has a comment in it
-//  reserving this value for our use but it doesn't appear in the public headers.
-#define CORPROF_E_RUNTIME_SUSPEND_REQUIRED _HRESULT_TYPEDEF_(0x80131381L)
-
 #endif
 
 #ifdef HAVE_GCCOVER
@@ -159,6 +155,9 @@ public:
     ReJITID GetVersionId() const;
     NativeCodeVersionCollection GetNativeCodeVersions(PTR_MethodDesc pClosedMethodDesc) const;
     NativeCodeVersion GetActiveNativeCodeVersion(PTR_MethodDesc pClosedMethodDesc) const;
+#if defined(FEATURE_TIERED_COMPILATION) && !defined(DACCESS_COMPILE)
+    bool HasAnyOptimizedNativeCodeVersion(NativeCodeVersion tier0NativeCodeVersion) const;
+#endif
     PTR_COR_ILMETHOD GetIL() const;
     PTR_COR_ILMETHOD GetILNoThrow() const;
     DWORD GetJitFlags() const;
@@ -170,7 +169,7 @@ public:
     void SetInstrumentedILMap(SIZE_T cMap, COR_IL_MAP * rgMap);
     HRESULT AddNativeCodeVersion(MethodDesc* pClosedMethodDesc, NativeCodeVersion::OptimizationTier optimizationTier, NativeCodeVersion* pNativeCodeVersion);
     HRESULT GetOrCreateActiveNativeCodeVersion(MethodDesc* pClosedMethodDesc, NativeCodeVersion* pNativeCodeVersion);
-    HRESULT SetActiveNativeCodeVersion(NativeCodeVersion activeNativeCodeVersion, BOOL fEESuspended);
+    HRESULT SetActiveNativeCodeVersion(NativeCodeVersion activeNativeCodeVersion);
 #endif //DACCESS_COMPILE
 
     enum RejitFlags
@@ -250,10 +249,6 @@ public:
     NativeCodeVersionNode(NativeCodeVersionId id, MethodDesc* pMethod, ReJITID parentId, NativeCodeVersion::OptimizationTier optimizationTier);
 #endif
 
-#ifdef DEBUG
-    BOOL LockOwnedByCurrentThread() const;
-#endif
-
     PTR_MethodDesc GetMethodDesc() const;
     NativeCodeVersionId GetVersionId() const;
     PCODE GetNativeCode() const;
@@ -351,9 +346,6 @@ public:
 #ifndef DACCESS_COMPILE
     ILCodeVersionNode(Module* pModule, mdMethodDef methodDef, ReJITID id);
 #endif
-#ifdef DEBUG
-    BOOL LockOwnedByCurrentThread() const;
-#endif //DEBUG
     PTR_Module GetModule() const;
     mdMethodDef GetMethodDef() const;
     ReJITID GetVersionId() const;
@@ -558,23 +550,6 @@ class CodeVersionManager
 public:
     CodeVersionManager();
 
-    void PreInit();
-
-    class TableLockHolder : public CrstHolder
-    {
-    public:
-        TableLockHolder(CodeVersionManager * pCodeVersionManager);
-    };
-    //Using the holder is preferable, but in some cases the holder can't be used
-#ifndef DACCESS_COMPILE
-    void EnterLock();
-    void LeaveLock();
-#endif
-
-#ifdef DEBUG
-    BOOL LockOwnedByCurrentThread() const;
-#endif
-
     DWORD GetNonDefaultILVersionCount();
     ILCodeVersionCollection GetILCodeVersions(PTR_MethodDesc pMethod);
     ILCodeVersionCollection GetILCodeVersions(PTR_Module pModule, mdMethodDef methodDef);
@@ -598,10 +573,10 @@ public:
     HRESULT AddILCodeVersion(Module* pModule, mdMethodDef methodDef, ReJITID rejitId, ILCodeVersion* pILCodeVersion);
     HRESULT AddNativeCodeVersion(ILCodeVersion ilCodeVersion, MethodDesc* pClosedMethodDesc, NativeCodeVersion::OptimizationTier optimizationTier, NativeCodeVersion* pNativeCodeVersion);
     PCODE PublishVersionableCodeIfNecessary(MethodDesc* pMethodDesc, bool *doBackpatchRef, bool *doFullBackpatchRef);
-    HRESULT PublishNativeCodeVersion(MethodDesc* pMethodDesc, NativeCodeVersion nativeCodeVersion, BOOL fEESuspended);
+    HRESULT PublishNativeCodeVersion(MethodDesc* pMethodDesc, NativeCodeVersion nativeCodeVersion);
     HRESULT GetOrCreateMethodDescVersioningState(MethodDesc* pMethod, MethodDescVersioningState** ppMethodDescVersioningState);
     HRESULT GetOrCreateILCodeVersioningState(Module* pModule, mdMethodDef methodDef, ILCodeVersioningState** ppILCodeVersioningState);
-    HRESULT SetActiveILCodeVersions(ILCodeVersion* pActiveVersions, DWORD cActiveVersions, BOOL fEESuspended, CDynArray<CodePublishError> * pPublishErrors);
+    HRESULT SetActiveILCodeVersions(ILCodeVersion* pActiveVersions, DWORD cActiveVersions, CDynArray<CodePublishError> * pPublishErrors);
     static HRESULT AddCodePublishError(Module* pModule, mdMethodDef methodDef, MethodDesc* pMD, HRESULT hrStatus, CDynArray<CodePublishError> * pErrors);
     static HRESULT AddCodePublishError(NativeCodeVersion nativeCodeVersion, HRESULT hrStatus, CDynArray<CodePublishError> * pErrors);
     static void OnAppDomainExit(AppDomain* pAppDomain);
@@ -647,9 +622,140 @@ private:
     //closed MethodDesc -> MethodDescVersioningState
     MethodDescVersioningStateHash m_methodDescVersioningStateMap;
 
-    CrstExplicitInit m_crstTable;
+private:
+    static CrstStatic s_lock;
+
+#ifndef DACCESS_COMPILE
+public:
+    static void StaticInitialize()
+    {
+        WRAPPER_NO_CONTRACT;
+
+        s_lock.Init(
+            CrstCodeVersioning,
+            CrstFlags(CRST_UNSAFE_ANYMODE | CRST_DEBUGGER_THREAD | CRST_REENTRANCY | CRST_TAKEN_DURING_SHUTDOWN));
+    }
+#endif
+
+#ifdef _DEBUG
+public:
+    static bool IsLockOwnedByCurrentThread();
+#endif
+
+public:
+    class LockHolder : private CrstHolderWithState
+    {
+    public:
+        LockHolder()
+        #ifndef DACCESS_COMPILE
+            : CrstHolderWithState(&s_lock)
+        #else
+            : CrstHolderWithState(nullptr)
+        #endif
+        {
+            WRAPPER_NO_CONTRACT;
+        }
+
+        LockHolder(const LockHolder &) = delete;
+        LockHolder &operator =(const LockHolder &) = delete;
+    };
 };
 
 #endif // FEATURE_CODE_VERSIONING
 
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// NativeCodeVersion definitions
+
+inline NativeCodeVersion::NativeCodeVersion()
+#ifdef FEATURE_CODE_VERSIONING
+    : m_storageKind(StorageKind::Unknown), m_pVersionNode(PTR_NULL)
+#else
+    : m_pMethodDesc(PTR_NULL)
+#endif
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+#ifdef FEATURE_CODE_VERSIONING
+    static_assert_no_msg(sizeof(m_pVersionNode) == sizeof(m_synthetic));
+#endif
+}
+
+inline NativeCodeVersion::NativeCodeVersion(const NativeCodeVersion & rhs)
+#ifdef FEATURE_CODE_VERSIONING
+    : m_storageKind(rhs.m_storageKind), m_pVersionNode(rhs.m_pVersionNode)
+#else
+    : m_pMethodDesc(rhs.m_pMethodDesc)
+#endif
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+#ifdef FEATURE_CODE_VERSIONING
+    static_assert_no_msg(sizeof(m_pVersionNode) == sizeof(m_synthetic));
+#endif
+}
+
+inline BOOL NativeCodeVersion::IsNull() const
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+
+#ifdef FEATURE_CODE_VERSIONING
+    return m_storageKind == StorageKind::Unknown;
+#else
+    return m_pMethodDesc == NULL;
+#endif
+}
+
+inline PTR_MethodDesc NativeCodeVersion::GetMethodDesc() const
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+
+#ifdef FEATURE_CODE_VERSIONING
+    return m_storageKind == StorageKind::Explicit ? m_pVersionNode->GetMethodDesc() : m_synthetic.m_pMethodDesc;
+#else
+    return m_pMethodDesc;
+#endif
+}
+
+inline NativeCodeVersionId NativeCodeVersion::GetVersionId() const
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+
+#ifdef FEATURE_CODE_VERSIONING
+    if (m_storageKind == StorageKind::Explicit)
+    {
+        return m_pVersionNode->GetVersionId();
+    }
+#endif
+    return 0;
+}
+
+inline bool NativeCodeVersion::operator==(const NativeCodeVersion & rhs) const
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+
+#ifdef FEATURE_CODE_VERSIONING
+    static_assert_no_msg(sizeof(m_pVersionNode) == sizeof(m_synthetic));
+    return m_storageKind == rhs.m_storageKind && m_pVersionNode == rhs.m_pVersionNode;
+#else
+    return m_pMethodDesc == rhs.m_pMethodDesc;
+#endif
+}
+
+inline bool NativeCodeVersion::operator!=(const NativeCodeVersion & rhs) const
+{
+    WRAPPER_NO_CONTRACT;
+    return !operator==(rhs);
+}
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// NativeCodeVersionNode definitions
+
+#ifdef FEATURE_CODE_VERSIONING
+
+inline PTR_MethodDesc NativeCodeVersionNode::GetMethodDesc() const
+{
+    LIMITED_METHOD_DAC_CONTRACT;
+    return m_pMethodDesc;
+}
+
+#endif // FEATURE_CODE_VERSIONING
+
 #endif // CODE_VERSION_H
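
The call sites converted from TableLockHolder throughout the rest of this patch follow the pattern below; a usage sketch of the new static lock (the surrounding variables are illustrative):

    {
        CodeVersionManager::LockHolder codeVersioningLockHolder; // acquires the static s_lock
        _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
        ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pMD);
        // ... read or update versioning state ...
    } // released when the holder goes out of scope; copying the holder is deleted above
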
index a942add..909e9c6 100644 (file)
@@ -335,8 +335,10 @@ HRESULT EEConfig::Init()
     fTieredCompilation_QuickJit = false;
     fTieredCompilation_QuickJitForLoops = false;
     fTieredCompilation_CallCounting = false;
+    fTieredCompilation_UseCallCountingStubs = false;
     tieredCompilation_CallCountThreshold = 1;
     tieredCompilation_CallCountingDelayMs = 0;
+    tieredCompilation_DeleteCallCountingStubsAfter = 0;
 #endif
 
 #ifndef CROSSGEN_COMPILE
@@ -1203,14 +1205,19 @@ fTrackDynamicMethodDebugInfo = CLRConfig::GetConfigValue(CLRConfig::UNSUPPORTED_
 
         fTieredCompilation_CallCounting = CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_CallCounting) != 0;
 
-        tieredCompilation_CallCountThreshold = CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_CallCountThreshold);
-        if (tieredCompilation_CallCountThreshold < 1)
+        DWORD tieredCompilation_ConfiguredCallCountThreshold =
+            CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_CallCountThreshold);
+        if (tieredCompilation_ConfiguredCallCountThreshold == 0)
         {
             tieredCompilation_CallCountThreshold = 1;
         }
-        else if (tieredCompilation_CallCountThreshold > INT_MAX) // CallCounter uses 'int'
+        else if (tieredCompilation_ConfiguredCallCountThreshold > UINT16_MAX)
         {
-            tieredCompilation_CallCountThreshold = INT_MAX;
+            tieredCompilation_CallCountThreshold = UINT16_MAX;
+        }
+        else
+        {
+            tieredCompilation_CallCountThreshold = (UINT16)tieredCompilation_ConfiguredCallCountThreshold;
         }
 
         tieredCompilation_CallCountingDelayMs = CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_CallCountingDelayMs);
@@ -1233,6 +1240,28 @@ fTrackDynamicMethodDebugInfo = CLRConfig::GetConfigValue(CLRConfig::UNSUPPORTED_
             }
         }
 
+        if (fTieredCompilation_CallCounting)
+        {
+            fTieredCompilation_UseCallCountingStubs =
+                CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_UseCallCountingStubs) != 0;
+            if (fTieredCompilation_UseCallCountingStubs)
+            {
+                tieredCompilation_DeleteCallCountingStubsAfter =
+                    CLRConfig::GetConfigValue(CLRConfig::INTERNAL_TC_DeleteCallCountingStubsAfter);
+            }
+        }
+
+        if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_TC_AggressiveTiering) != 0)
+        {
+            // TC_AggressiveTiering may be used in some benchmarks to have methods be tiered up more quickly, for example when
+            // the measurement is sensitive to GC allocations or activity. Methods tiered up more quickly may have different
+            // performance characteristics, as the timing of the rejit may play a role. If there are multiple tiers before the
+            // final tier, the expectation is that the method progresses through all tiers as quickly as possible, ideally
+            // running the code for each tier at least once before progressing to the next tier.
+            tieredCompilation_CallCountThreshold = 1;
+            tieredCompilation_CallCountingDelayMs = 0;
+        }
+
         if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
         {
             ETW::CompilationLog::TieredCompilation::Runtime::SendSettings();
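
The threshold is stored as a UINT16 because the call counting stub decrements a 16-bit counter cell (see the CallCount typedef later in this patch). A self-contained sketch of the saturating conversion above, with example values:

    #include <cstdint>

    // Mirrors the clamping above: 0 falls back to 1, and values above UINT16_MAX saturate
    // so that the threshold fits the 16-bit cell decremented by the call counting stub.
    static uint16_t ClampCallCountThreshold(uint32_t configured)
    {
        if (configured == 0)
            return 1;
        if (configured > UINT16_MAX)
            return UINT16_MAX;
        return (uint16_t)configured;
    }
    // ClampCallCountThreshold(0) == 1, ClampCallCountThreshold(30) == 30,
    // ClampCallCountThreshold(100000) == 65535
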
index ceecfbf..e3772cb 100644 (file)
@@ -284,8 +284,10 @@ public:
     bool          TieredCompilation_QuickJit() const { LIMITED_METHOD_CONTRACT; return fTieredCompilation_QuickJit; }
     bool          TieredCompilation_QuickJitForLoops() const { LIMITED_METHOD_CONTRACT; return fTieredCompilation_QuickJitForLoops; }
     bool          TieredCompilation_CallCounting()  const { LIMITED_METHOD_CONTRACT; return fTieredCompilation_CallCounting; }
-    DWORD         TieredCompilation_CallCountThreshold() const { LIMITED_METHOD_CONTRACT; return tieredCompilation_CallCountThreshold; }
+    UINT16        TieredCompilation_CallCountThreshold() const { LIMITED_METHOD_CONTRACT; return tieredCompilation_CallCountThreshold; }
     DWORD         TieredCompilation_CallCountingDelayMs() const { LIMITED_METHOD_CONTRACT; return tieredCompilation_CallCountingDelayMs; }
+    bool          TieredCompilation_UseCallCountingStubs() const { LIMITED_METHOD_CONTRACT; return fTieredCompilation_UseCallCountingStubs; }
+    DWORD         TieredCompilation_DeleteCallCountingStubsAfter() const { LIMITED_METHOD_CONTRACT; return tieredCompilation_DeleteCallCountingStubsAfter; }
 #endif
 
 #ifndef CROSSGEN_COMPILE
@@ -1019,8 +1021,10 @@ private: //----------------------------------------------------------------
     bool fTieredCompilation_QuickJit;
     bool fTieredCompilation_QuickJitForLoops;
     bool fTieredCompilation_CallCounting;
-    DWORD tieredCompilation_CallCountThreshold;
+    bool fTieredCompilation_UseCallCountingStubs;
+    UINT16 tieredCompilation_CallCountThreshold;
     DWORD tieredCompilation_CallCountingDelayMs;
+    DWORD tieredCompilation_DeleteCallCountingStubsAfter;
 #endif
 
 #ifndef CROSSGEN_COMPILE
index 2d99e51..c794dac 100644 (file)
@@ -7003,9 +7003,8 @@ VOID ETW::MethodLog::SendEventsForJitMethodsHelper(LoaderAllocator *pLoaderAlloc
 #ifdef FEATURE_CODE_VERSIONING
         if (fGetCodeIds && pMD->IsVersionable())
         {
-            CodeVersionManager *pCodeVersionManager = pMD->GetCodeVersionManager();
-            _ASSERTE(pCodeVersionManager->LockOwnedByCurrentThread());
-            nativeCodeVersion = pCodeVersionManager->GetNativeCodeVersion(pMD, codeStart);
+            _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+            nativeCodeVersion = pMD->GetCodeVersionManager()->GetNativeCodeVersion(pMD, codeStart);
             if (nativeCodeVersion.IsNull())
             {
                 // The code version manager hasn't been updated with the jitted code
@@ -7142,7 +7141,7 @@ VOID ETW::MethodLog::SendEventsForJitMethods(BaseDomain *pDomainFilter, LoaderAl
 #ifdef FEATURE_CODE_VERSIONING
         if (pDomainFilter)
         {
-            CodeVersionManager::TableLockHolder lkRejitMgrModule(pDomainFilter->GetCodeVersionManager());
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
             SendEventsForJitMethodsHelper(
                 pLoaderAllocatorFilter,
                 dwEventOptions,
index 9bf2c08..18fd87d 100644 (file)
@@ -155,7 +155,7 @@ PCODE FuncPtrStubs::GetFuncPtrStub(MethodDesc * pMD, PrecodeType type)
         _ASSERTE(pMD->IsVersionableWithVtableSlotBackpatch());
 
         PCODE temporaryEntryPoint = pMD->GetTemporaryEntryPoint();
-        MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder;
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
 
         // Set the funcptr stub's entry point to the current entry point inside the lock and after the funcptr stub is exposed,
         // to synchronize with backpatching in MethodDesc::BackpatchEntryPointSlots()
index d0eb088..46a691c 100644 (file)
@@ -736,6 +736,32 @@ Frame::Interception StubDispatchFrame::GetInterception()
 }
 
 #ifndef DACCESS_COMPILE
+CallCountingHelperFrame::CallCountingHelperFrame(TransitionBlock *pTransitionBlock, MethodDesc *pMD)
+    : FramedMethodFrame(pTransitionBlock, pMD)
+{
+    WRAPPER_NO_CONTRACT;
+}
+#endif
+
+void CallCountingHelperFrame::GcScanRoots(promote_func *fn, ScanContext *sc)
+{
+    WRAPPER_NO_CONTRACT;
+
+    FramedMethodFrame::GcScanRoots(fn, sc);
+    PromoteCallerStack(fn, sc);
+}
+
+BOOL CallCountingHelperFrame::TraceFrame(Thread *thread, BOOL fromPatch, TraceDestination *trace, REGDISPLAY *regs)
+{
+    WRAPPER_NO_CONTRACT;
+
+    // OnCallCountThresholdReached never directly calls managed code. Returning false instructs the debugger to step out of
+    // the call that erected this frame and continue trying to trace execution from there.
+    LOG((LF_CORDB, LL_INFO1000, "CallCountingHelperFrame::TraceFrame: return FALSE\n"));
+    return FALSE;
+}
+
+#ifndef DACCESS_COMPILE
 ExternalMethodFrame::ExternalMethodFrame(TransitionBlock * pTransitionBlock)
     : FramedMethodFrame(pTransitionBlock, NULL)
 {
index 5b5dcd0..e4c5372 100644 (file)
@@ -84,6 +84,8 @@
 //    |   |
 //    |   +-StubDispatchFrame   - represents a call into the virtual call stub manager
 //    |   |
+//    |   +-CallCountingHelperFrame - represents a call into the call counting helper when the
+//    |   |                           call count threshold is reached
 //    |   |
 //    |   +-ExternalMethodFrame  - represents a call from an ExternalMethodThunk
 //    |   |
@@ -221,6 +223,7 @@ FRAME_TYPE_NAME(PInvokeCalliFrame)
 FRAME_TYPE_NAME(HijackFrame)
 #endif // FEATURE_HIJACK
 FRAME_TYPE_NAME(PrestubMethodFrame)
+FRAME_TYPE_NAME(CallCountingHelperFrame)
 FRAME_TYPE_NAME(StubDispatchFrame)
 FRAME_TYPE_NAME(ExternalMethodFrame)
 #ifdef FEATURE_READYTORUN
@@ -2231,6 +2234,31 @@ private:
 
 typedef VPTR(class StubDispatchFrame) PTR_StubDispatchFrame;
 
+class CallCountingHelperFrame : public FramedMethodFrame
+{
+    VPTR_VTABLE_CLASS(CallCountingHelperFrame, FramedMethodFrame);
+
+public:
+    CallCountingHelperFrame(TransitionBlock *pTransitionBlock, MethodDesc *pMD);
+
+    virtual void GcScanRoots(promote_func *fn, ScanContext *sc); // override
+    virtual BOOL TraceFrame(Thread *thread, BOOL fromPatch, TraceDestination *trace, REGDISPLAY *regs); // override
+
+    virtual int GetFrameType() // override
+    {
+        LIMITED_METHOD_DAC_CONTRACT;
+        return TYPE_CALL;
+    }
+
+    virtual Interception GetInterception() // override
+    {
+        LIMITED_METHOD_DAC_CONTRACT;
+        return INTERCEPTION_NONE;
+    }
+
+    // Keep as last entry in class
+    DEFINE_VTABLE_GETTER_AND_CTOR_AND_DTOR(CallCountingHelperFrame)
+};
 
 //------------------------------------------------------------------------
 // This represents a call from an ExternalMethodThunk or a VirtualImportThunk
index 1e22437..509f4e6 100644 (file)
@@ -1401,3 +1401,38 @@ NESTED_END ProfileTailcallNaked, _TEXT
 NESTED_ENTRY JIT_ProfilerEnterLeaveTailcallStub, _TEXT, NoHandler
     ret
 NESTED_END JIT_ProfilerEnterLeaveTailcallStub, _TEXT
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+LEAF_ENTRY OnCallCountThresholdReachedStub, _TEXT
+    // Pop the return address (the stub-identifying token) into a non-argument volatile register that can be trashed
+    pop     eax
+    jmp     C_FUNC(OnCallCountThresholdReachedStub2)
+LEAF_END OnCallCountThresholdReachedStub, _TEXT
+
+NESTED_ENTRY OnCallCountThresholdReachedStub2, _TEXT, NoHandler
+    STUB_PROLOG
+
+    mov     esi, esp
+
+    // Align the stack for the call
+    lea     ebx, [esp - 8]
+    and     ebx, 0fh
+    sub     esp, ebx
+
+    push    eax // stub-identifying token, see OnCallCountThresholdReachedStub
+    push    esi // TransitionBlock *
+    CHECK_STACK_ALIGNMENT
+    call    C_FUNC(OnCallCountThresholdReached)
+
+    mov     esp, esi
+
+    STUB_EPILOG
+    jmp     eax
+
+    // This will never be executed. It is just to help out stack-walking logic
+    // which disassembles the epilog to unwind the stack.
+    ret
+NESTED_END OnCallCountThresholdReachedStub2, _TEXT
+
+#endif // FEATURE_TIERED_COMPILATION
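
The contract between this thunk and the C++ helper, per the pushes above: the popped return address is the stub-identifying token, the helper receives the TransitionBlock pointer and the token, and the thunk tail-jumps (jmp eax) to whatever entry point the helper returns. A sketch of the implied declaration (parameter names are illustrative; the @8 decoration in the Windows version below confirms two 4-byte stdcall arguments):

    // Implied by the asm above: the last value pushed is the first argument.
    extern "C" PCODE STDCALL OnCallCountThresholdReached(
        TransitionBlock *transitionBlock,  // pushed second (esi)
        TADDR stubIdentifyingToken);       // pushed first (eax)
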
index 8969ea7..51cf4f7 100644 (file)
@@ -1768,4 +1768,33 @@ DYNAMICHELPER <DynamicHelperFrameFlags_ObjectArg OR DynamicHelperFrameFlags_Obje
 
 endif ; FEATURE_READYTORUN
 
+ifdef FEATURE_TIERED_COMPILATION
+
+EXTERN _OnCallCountThresholdReached@8:proc
+
+_OnCallCountThresholdReachedStub@0 proc public
+    ; Pop the return address (the stub-identifying token) into a non-argument volatile register that can be trashed
+    pop     eax
+    jmp     _OnCallCountThresholdReachedStub2@0
+_OnCallCountThresholdReachedStub@0 endp
+
+_OnCallCountThresholdReachedStub2@0 proc public
+    STUB_PROLOG
+
+    mov     esi, esp
+
+    push    eax ; stub-identifying token, see OnCallCountThresholdReachedStub
+    push    esi ; TransitionBlock *
+    call    _OnCallCountThresholdReached@8
+
+    STUB_EPILOG
+    jmp     eax
+
+    ; This will never be executed. It is just to help out stack-walking logic
+    ; which disassembles the epilog to unwind the stack.
+    ret
+_OnCallCountThresholdReachedStub2@0 endp
+
+endif ; FEATURE_TIERED_COMPILATION
+
     end
index 3ea0b4f..7e49a78 100644 (file)
@@ -550,4 +550,210 @@ inline BOOL ClrFlushInstructionCache(LPCVOID pCodeAddr, size_t sizeOfCode)
 #define JIT_Stelem_Ref              JIT_Stelem_Ref
 #endif // FEATURE_PAL
 
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+// Call counting
+
+#ifdef FEATURE_TIERED_COMPILATION
+
+#define DISABLE_COPY(T) \
+    T(const T &) = delete; \
+    T &operator =(const T &) = delete
+
+typedef UINT16 CallCount;
+typedef DPTR(CallCount) PTR_CallCount;
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub
+
+class CallCountingStub;
+typedef DPTR(const CallCountingStub) PTR_CallCountingStub;
+
+class CallCountingStub
+{
+public:
+    static const SIZE_T Alignment = sizeof(void *);
+
+#ifndef DACCESS_COMPILE
+protected:
+    static const PCODE TargetForThresholdReached;
+
+    CallCountingStub() = default;
+
+public:
+    static const CallCountingStub *From(TADDR stubIdentifyingToken);
+
+    PCODE GetEntryPoint() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return PINSTRToPCODE((TADDR)this);
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    PTR_CallCount GetRemainingCallCountCell() const;
+    PCODE GetTargetForMethod() const;
+
+#ifndef DACCESS_COMPILE
+protected:
+    template<class T> static INT_PTR GetRelativeOffset(const T *relRef, PCODE target)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(T) != 0);
+        static_assert_no_msg(sizeof(T) <= sizeof(void *));
+        static_assert_no_msg((sizeof(T) & (sizeof(T) - 1)) == 0); // is a power of 2
+        _ASSERTE(relRef != nullptr);
+
+        TADDR targetAddress = PCODEToPINSTR(target);
+        _ASSERTE(targetAddress != NULL);
+        return (INT_PTR)targetAddress - (INT_PTR)(relRef + 1);
+    }
+#endif
+
+protected:
+    template<class T> static PCODE GetTarget(const T *relRef)
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);
+        _ASSERTE(relRef != nullptr);
+
+        return PINSTRToPCODE((INT_PTR)(relRef + 1) + *relRef);
+    }
+
+    DISABLE_COPY(CallCountingStub);
+};
+
+////////////////////////////////////////////////////////////////
+// CallCountingStubShort
+
+class CallCountingStubShort;
+typedef DPTR(const CallCountingStubShort) PTR_CallCountingStubShort;
+
+#pragma pack(push, 1)
+class CallCountingStubShort : public CallCountingStub
+{
+private:
+    const UINT8 m_part0[1];
+    CallCount *const m_remainingCallCountCell;
+    const UINT8 m_part1[5];
+    const INT32 m_rel32TargetForMethod;
+    const UINT8 m_part2[1];
+    const INT32 m_rel32TargetForThresholdReached;
+    const UINT8 m_alignmentPadding[1];
+
+#ifndef DACCESS_COMPILE
+public:
+    CallCountingStubShort(CallCount *remainingCallCountCell, PCODE targetForMethod)
+        : m_part0{                                              0xb8},                  //     mov  eax,
+        m_remainingCallCountCell(remainingCallCountCell),                               //               <imm32>
+        m_part1{                                                0x66, 0xff, 0x08,       //     dec  word ptr [eax]
+                                                                0x0f, 0x85},            //     jnz  
+        m_rel32TargetForMethod(                                                         //          <rel32>
+            GetRelative32BitOffset(
+                &m_rel32TargetForMethod,
+                targetForMethod)),
+        m_part2{                                                0xe8},                  //     call
+        m_rel32TargetForThresholdReached(                                               //          <rel32>
+            GetRelative32BitOffset(
+                &m_rel32TargetForThresholdReached,
+                TargetForThresholdReached)),
+                                                                                        // (eip == stub-identifying token)
+        m_alignmentPadding{                                     0xcc}                   //     int  3
+    {
+        WRAPPER_NO_CONTRACT;
+        static_assert_no_msg(sizeof(CallCountingStubShort) % Alignment == 0);
+        _ASSERTE(remainingCallCountCell != nullptr);
+        _ASSERTE(PCODEToPINSTR(targetForMethod) != NULL);
+    }
+
+public:
+    static bool Is(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static const CallCountingStubShort *From(TADDR stubIdentifyingToken)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(stubIdentifyingToken));
+        _ASSERTE(stubIdentifyingToken % Alignment == offsetof(CallCountingStubShort, m_alignmentPadding[0]) % Alignment);
+
+        const CallCountingStubShort *stub =
+            (const CallCountingStubShort *)(stubIdentifyingToken - offsetof(CallCountingStubShort, m_alignmentPadding[0]));
+        _ASSERTE(IS_ALIGNED(stub, Alignment));
+        return stub;
+    }
+#endif // !DACCESS_COMPILE
+
+public:
+    static bool Is(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        return true;
+    }
+
+    static PTR_CallCountingStubShort From(PTR_CallCountingStub callCountingStub)
+    {
+        WRAPPER_NO_CONTRACT;
+        _ASSERTE(Is(callCountingStub));
+
+        return dac_cast<PTR_CallCountingStubShort>(callCountingStub);
+    }
+
+    PCODE GetTargetForMethod() const
+    {
+        WRAPPER_NO_CONTRACT;
+        return GetTarget(&m_rel32TargetForMethod);
+    }
+
+#ifndef DACCESS_COMPILE
+private:
+    static INT32 GetRelative32BitOffset(const INT32 *rel32Ref, PCODE target)
+    {
+        WRAPPER_NO_CONTRACT;
+
+        INT_PTR relativeOffset = GetRelativeOffset(rel32Ref, target);
+        _ASSERTE((INT32)relativeOffset == relativeOffset);
+        return (INT32)relativeOffset;
+    }
+#endif
+
+    friend CallCountingStub;
+    DISABLE_COPY(CallCountingStubShort);
+};
+#pragma pack(pop)
+
+////////////////////////////////////////////////////////////////
+// CallCountingStub definitions
+
+#ifndef DACCESS_COMPILE
+inline const CallCountingStub *CallCountingStub::From(TADDR stubIdentifyingToken)
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(stubIdentifyingToken != NULL);
+
+    return CallCountingStubShort::From(stubIdentifyingToken);
+}
+#endif
+
+inline PTR_CallCount CallCountingStub::GetRemainingCallCountCell() const
+{
+    WRAPPER_NO_CONTRACT;
+    return PTR_CallCount(dac_cast<PTR_CallCountingStubShort>(this)->m_remainingCallCountCell);
+}
+
+inline PCODE CallCountingStub::GetTargetForMethod() const
+{
+    WRAPPER_NO_CONTRACT;
+    return CallCountingStubShort::From(PTR_CallCountingStub(this))->GetTargetForMethod();
+}
+
+////////////////////////////////////////////////////////////////
+
+#undef DISABLE_COPY
+
+#endif // FEATURE_TIERED_COMPILATION
+
+////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+
 #endif // __cgenx86_h__
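
Taken together, CallCountingStubShort assembles a 20-byte x86 sequence. Below is a standalone sketch of the same rel32 arithmetic and runtime behavior; all names are illustrative, and the real stub is raw machine code rather than C++:

    #include <cassert>
    #include <cstdint>

    // rel32 operands are encoded against the end of the field that holds them (relRef + 1),
    // mirroring GetRelativeOffset()/GetTarget() above.
    static int32_t EncodeRel32(const int32_t *rel32Field, uintptr_t target)
    {
        intptr_t offset = (intptr_t)target - (intptr_t)(rel32Field + 1);
        assert((int32_t)offset == offset); // same check as GetRelative32BitOffset()
        return (int32_t)offset;
    }

    static uintptr_t DecodeRel32(const int32_t *rel32Field)
    {
        return (uintptr_t)(rel32Field + 1) + *rel32Field;
    }

    // Behavioral stand-in for the machine code built by the constructor:
    //   b8 <imm32>      mov  eax, <address of remaining-call-count cell>
    //   66 ff 08        dec  word ptr [eax]
    //   0f 85 <rel32>   jnz  <method's native code>
    //   e8 <rel32>      call <OnCallCountThresholdReachedStub>  ; return address = token
    //   cc              int 3                                   ; alignment padding
    static void StubSemantics(uint16_t *remainingCallCountCell,
                              void (*targetForMethod)(),
                              void (*targetForThresholdReached)())
    {
        if (--*remainingCallCountCell != 0)
            targetForMethod();           // the real stub jumps, preserving the arguments
        else
            targetForThresholdReached(); // the call's return address lands on the padding
                                         // byte; From(TADDR) subtracts that field's offset
                                         // to recover the stub base
    }
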
index 780972f..0a32499 100644 (file)
@@ -8093,7 +8093,7 @@ CorInfoInline CEEInfo::canInline (CORINFO_METHOD_HANDLE hCaller,
         if (CORProfilerEnableRejit())
         {
             CodeVersionManager* pCodeVersionManager = pCallee->GetCodeVersionManager();
-            CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
             ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pCallee);
             if (ilVersion.GetRejitState() != ILCodeVersion::kStateActive || !ilVersion.HasDefaultIL())
             {
@@ -8306,7 +8306,7 @@ void CEEInfo::reportInliningDecision (CORINFO_METHOD_HANDLE inlinerHnd,
             // If we end up reporting an inlining on a method with non-default IL it means the race
             // happened and we need to manually request ReJIT for it since it was missed.
             CodeVersionManager* pCodeVersionManager = pCallee->GetCodeVersionManager();
-            CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
             ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pCallee);
             if (ilVersion.GetRejitState() != ILCodeVersion::kStateActive || !ilVersion.HasDefaultIL())
             {
@@ -14281,7 +14281,7 @@ NativeCodeVersion EECodeInfo::GetNativeCodeVersion()
     if (pMD->IsVersionable())
     {
         CodeVersionManager *pCodeVersionManager = pMD->GetCodeVersionManager();
-        CodeVersionManager::TableLockHolder lockHolder(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         return pCodeVersionManager->GetNativeCodeVersion(pMD, PINSTRToPCODE(GetStartAddress()));
     }
 #endif
index 6a3ee28..a576c1a 100644 (file)
@@ -54,6 +54,10 @@ LoaderAllocator::LoaderAllocator()
     m_pVirtualCallStubManager = NULL;
 #endif
 
+#ifdef FEATURE_TIERED_COMPILATION
+    m_callCountingManager = NULL;
+#endif
+
     m_fGCPressure = false;
     m_fTerminated = false;
     m_fUnloaded = false;
@@ -1211,6 +1215,13 @@ void LoaderAllocator::Init(BaseDomain *pDomain, BYTE *pExecutableHeapMemory)
         m_interopDataHash.Init(0, NULL, false, &lock);
     }
 #endif // FEATURE_COMINTEROP
+
+#ifdef FEATURE_TIERED_COMPILATION
+    if (g_pConfig->TieredCompilation())
+    {
+        m_callCountingManager = new CallCountingManager();
+    }
+#endif
 }
 
 
@@ -1324,6 +1335,14 @@ void LoaderAllocator::Terminate()
 #endif
     m_LoaderAllocatorReferences.RemoveAll();
 
+#ifdef FEATURE_TIERED_COMPILATION
+    if (m_callCountingManager != NULL)
+    {
+        delete m_callCountingManager;
+        m_callCountingManager = NULL;
+    }
+#endif
+
     // In collectible types we merge the low frequency and high frequency heaps
     // So don't destroy them twice.
     if ((m_pLowFrequencyHeap != NULL) && (m_pLowFrequencyHeap != m_pHighFrequencyHeap))
index 178482b..dce440a 100644 (file)
@@ -20,7 +20,7 @@ class FuncPtrStubs;
 #include "qcall.h"
 #include "ilstubcache.h"
 
-#include "callcounter.h"
+#include "callcounting.h"
 #include "methoddescbackpatchinfo.h"
 #include "crossloaderallocatorhash.h"
 
@@ -274,7 +274,7 @@ private:
     EEMarshalingData* m_pMarshalingData;
 
 #ifdef FEATURE_TIERED_COMPILATION
-    CallCounter m_callCounter;
+    PTR_CallCountingManager m_callCountingManager;
 #endif
 
 #ifndef CROSSGEN_COMPILE
@@ -594,10 +594,10 @@ public:
 
 #ifdef FEATURE_TIERED_COMPILATION
 public:
-    CallCounter* GetCallCounter()
+    PTR_CallCountingManager GetCallCountingManager()
     {
         LIMITED_METHOD_CONTRACT;
-        return &m_callCounter;
+        return m_callCountingManager;
     }
 #endif // FEATURE_TIERED_COMPILATION
 
index b36c78c..a65c057 100644 (file)
@@ -4885,7 +4885,7 @@ void MethodDesc::RecordAndBackpatchEntryPointSlot(
     WRAPPER_NO_CONTRACT;
 
     LoaderAllocator *mdLoaderAllocator = GetLoaderAllocator();
-    MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder;
+    MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
 
     RecordAndBackpatchEntryPointSlot_Locked(
         mdLoaderAllocator,
@@ -4906,7 +4906,7 @@ void MethodDesc::RecordAndBackpatchEntryPointSlot_Locked(
     PCODE currentEntryPoint)
 {
     WRAPPER_NO_CONTRACT;
-    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
     _ASSERTE(mdLoaderAllocator != nullptr);
     _ASSERTE(mdLoaderAllocator == GetLoaderAllocator());
     _ASSERTE(slotLoaderAllocator != nullptr);
@@ -4935,7 +4935,7 @@ FORCEINLINE bool MethodDesc::TryBackpatchEntryPointSlots(
     _ASSERTE(entryPoint != NULL);
     _ASSERTE(isPrestubEntryPoint == (entryPoint == GetPrestubEntryPointToBackpatch()));
     _ASSERTE(!isPrestubEntryPoint || !onlyFromPrestubEntryPoint);
-    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
 
     LoaderAllocator *mdLoaderAllocator = GetLoaderAllocator();
     MethodDescBackpatchInfoTracker *backpatchInfoTracker = mdLoaderAllocator->GetMethodDescBackpatchInfoTracker();
@@ -5097,7 +5097,7 @@ void MethodDesc::SetMethodEntryPoint(PCODE addr)
 
     // Similarly to GetMethodEntryPoint(), it is up to the caller to ensure that calls to this function are appropriately
     // synchronized. Currently, the only caller synchronizes with the following lock.
-    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
 
     TADDR pSlot = GetAddrOfSlot();
 
index 81f6f0d..72d648f 100644 (file)
@@ -44,7 +44,6 @@ class DynamicMethodDesc;
 class ReJitManager;
 class CodeVersionManager;
 class PrepareCodeConfig;
-class CallCounter;
 
 typedef DPTR(FCallMethodDesc)        PTR_FCallMethodDesc;
 typedef DPTR(ArrayMethodDesc)        PTR_ArrayMethodDesc;
@@ -510,9 +509,6 @@ public:
 #ifdef FEATURE_CODE_VERSIONING
     CodeVersionManager* GetCodeVersionManager();
 #endif
-#ifdef FEATURE_TIERED_COMPILATION
-    CallCounter* GetCallCounter();
-#endif
 
 #ifndef CROSSGEN_COMPILE
     MethodDescBackpatchInfoTracker* GetBackpatchInfoTracker();
@@ -1346,7 +1342,7 @@ private:
     PCODE GetEntryPointToBackpatch_Locked()
     {
         WRAPPER_NO_CONTRACT;
-        _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+        _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
         _ASSERTE(MayHaveEntryPointSlotsToBackpatch());
 
         // At the moment this is the only case, see MayHaveEntryPointSlotsToBackpatch()
@@ -1359,7 +1355,7 @@ private:
     void SetEntryPointToBackpatch_Locked(PCODE entryPoint)
     {
         WRAPPER_NO_CONTRACT;
-        _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+        _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
         _ASSERTE(entryPoint != NULL);
         _ASSERTE(MayHaveEntryPointSlotsToBackpatch());
 
@@ -2199,7 +2195,6 @@ public:
     VersionedPrepareCodeConfig(NativeCodeVersion codeVersion);
     HRESULT FinishConfiguration();
     virtual PCODE IsJitCancellationRequested();
-    virtual BOOL SetNativeCode(PCODE pCode, PCODE * ppAlternateCodeToUse);
     virtual COR_ILMETHOD* GetILHeader();
     virtual CORJIT_FLAGS GetJitCompilationFlags();
 private:
index 800e5b2..b0a3502 100644 (file)
@@ -175,14 +175,6 @@ inline CodeVersionManager * MethodDesc::GetCodeVersionManager()
 }
 #endif
 
-#ifdef FEATURE_TIERED_COMPILATION
-inline CallCounter * MethodDesc::GetCallCounter()
-{
-    LIMITED_METHOD_CONTRACT;
-    return GetLoaderAllocator()->GetCallCounter();
-}
-#endif
-
 #ifndef CROSSGEN_COMPILE
 inline MethodDescBackpatchInfoTracker * MethodDesc::GetBackpatchInfoTracker()
 {
index 571007c..b33711c 100644 (file)
@@ -21,7 +21,7 @@ void EntryPointSlots::Backpatch_Locked(TADDR slot, SlotType slotType, PCODE entr
 {
     WRAPPER_NO_CONTRACT;
     static_assert_no_msg(SlotType_Count <= sizeof(INT32));
-    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockedByCurrentThread());
+    _ASSERTE(MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread());
     _ASSERTE(slot != NULL);
     _ASSERTE(!(slot & SlotType_Mask));
     _ASSERTE(slotType >= SlotType_Normal);
@@ -72,7 +72,7 @@ CrstStatic MethodDescBackpatchInfoTracker::s_lock;
 void MethodDescBackpatchInfoTracker::Backpatch_Locked(MethodDesc *pMethodDesc, PCODE entryPoint)
 {
     WRAPPER_NO_CONTRACT;
-    _ASSERTE(IsLockedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     _ASSERTE(pMethodDesc != nullptr);
 
     GCX_COOP();
@@ -95,7 +95,7 @@ void MethodDescBackpatchInfoTracker::Backpatch_Locked(MethodDesc *pMethodDesc, P
 void MethodDescBackpatchInfoTracker::AddSlotAndPatch_Locked(MethodDesc *pMethodDesc, LoaderAllocator *pLoaderAllocatorOfSlot, TADDR slot, EntryPointSlots::SlotType slotType, PCODE currentEntryPoint)
 {
     WRAPPER_NO_CONTRACT;
-    _ASSERTE(IsLockedByCurrentThread());
+    _ASSERTE(IsLockOwnedByCurrentThread());
     _ASSERTE(pMethodDesc != nullptr);
     _ASSERTE(pMethodDesc->MayHaveEntryPointSlotsToBackpatch());
 
@@ -108,17 +108,11 @@ void MethodDescBackpatchInfoTracker::AddSlotAndPatch_Locked(MethodDesc *pMethodD
     EntryPointSlots::Backpatch_Locked(slot, slotType, currentEntryPoint);
 }
 
-void MethodDescBackpatchInfoTracker::StaticInitialize()
-{
-    WRAPPER_NO_CONTRACT;
-    s_lock.Init(CrstMethodDescBackpatchInfoTracker);
-}
-
 #endif // DACCESS_COMPILE
 
 #ifdef _DEBUG
 
-bool MethodDescBackpatchInfoTracker::IsLockedByCurrentThread()
+bool MethodDescBackpatchInfoTracker::IsLockOwnedByCurrentThread()
 {
     WRAPPER_NO_CONTRACT;
 
index c5d92a2..d317949 100644 (file)
@@ -79,7 +79,11 @@ private:
 
 #ifndef DACCESS_COMPILE
 public:
-    static void StaticInitialize();
+    static void StaticInitialize()
+    {
+        WRAPPER_NO_CONTRACT;
+        s_lock.Init(CrstMethodDescBackpatchInfoTracker);
+    }
 #endif
 
     void Initialize(LoaderAllocator *pLoaderAllocator)
@@ -90,17 +94,17 @@ public:
 
 #ifdef _DEBUG
 public:
-    static bool IsLockedByCurrentThread();
+    static bool IsLockOwnedByCurrentThread();
 #endif
 
 public:
-    class ConditionalLockHolder : CrstHolderWithState
+    class ConditionalLockHolder : private CrstHolderWithState
     {
     public:
         ConditionalLockHolder(bool acquireLock = true)
             : CrstHolderWithState(
 #ifndef DACCESS_COMPILE
-                acquireLock ? &MethodDescBackpatchInfoTracker::s_lock : nullptr
+                acquireLock ? &s_lock : nullptr
 #else
                 nullptr
 #endif
@@ -108,6 +112,9 @@ public:
         {
             LIMITED_METHOD_CONTRACT;
         }
+
+        ConditionalLockHolder(const ConditionalLockHolder &) = delete;
+        ConditionalLockHolder &operator =(const ConditionalLockHolder &) = delete;
     };
 
 public:
@@ -128,8 +135,6 @@ public:
 public:
 #endif
 
-    friend class ConditionalLockHolder;
-
     DISABLE_COPY(MethodDescBackpatchInfoTracker);
 };
 
index 15acc84..ca7e73d 100644 (file)
@@ -44,7 +44,6 @@
 #include "perfmap.h"
 #endif
 
-#include "callcounter.h"
 #include "methoddescbackpatchinfo.h"
 
 #if defined(FEATURE_GDBJIT)
@@ -97,7 +96,7 @@ PCODE MethodDesc::DoBackpatch(MethodTable * pMT, MethodTable *pDispatchingMT, BO
 
     // Only take the lock if the method is versionable with vtable slot backpatch, for recording slots and synchronizing with
     // backpatching slots
-    MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder(isVersionableWithVtableSlotBackpatch);
+    MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder(isVersionableWithVtableSlotBackpatch);
 
     // Get the method entry point inside the lock above to synchronize with backpatching in
     // MethodDesc::BackpatchEntryPointSlots()
@@ -1037,16 +1036,42 @@ PCODE MethodDesc::JitCompileCodeLocked(PrepareCodeConfig* pConfig, JitListLockEn
         }
 
         SetupGcCoverage(pConfig->GetCodeVersion(), (BYTE*)pCode);
+    }
+#endif // HAVE_GCCOVER
 
-        // This thread should always win the publishing race
-        // since we're under a lock.
-        if (!pConfig->SetNativeCode(pCode, &pOtherCode))
+#ifdef FEATURE_TIERED_COMPILATION
+    // Update the optimization tier if necessary before SetNativeCode() is called. As soon as SetNativeCode() is called,
+    // another thread may get the native code and the optimization tier for that code version, so the tier must already be
+    // finalized by that point.
+    bool shouldCountCalls = false;
+    if (pFlags->IsSet(CORJIT_FLAGS::CORJIT_FLAG_TIER0))
+    {
+        _ASSERTE(pConfig->GetCodeVersion().GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+        _ASSERTE(pConfig->GetMethodDesc()->IsEligibleForTieredCompilation());
+        _ASSERTE(
+            pConfig
+                ->GetMethodDesc()
+                ->GetLoaderAllocator()
+                ->GetCallCountingManager()
+                ->IsCallCountingEnabled(pConfig->GetCodeVersion()));
+
+        if (pConfig->JitSwitchedToOptimized())
+        {
+            // Update the tier in the code version. The JIT may have decided to switch from tier 0 to optimized, in which case
+            // call counting would have to be disabled for the method.
+            NativeCodeVersion codeVersion = pConfig->GetCodeVersion();
+            if (codeVersion.IsDefaultVersion())
+            {
+                pConfig->GetMethodDesc()->GetLoaderAllocator()->GetCallCountingManager()->DisableCallCounting(codeVersion);
+            }
+            codeVersion.SetOptimizationTier(NativeCodeVersion::OptimizationTierOptimized);
+        }
+        else
         {
-            _ASSERTE(!"GC Cover native code publish failed");
+            shouldCountCalls = true;
         }
     }
-    else
-#endif // HAVE_GCCOVER
+#endif
 
     // Aside from rejit, performing a SetNativeCodeInterlocked at this point
     // generally ensures that there is only one winning version of the native
@@ -1055,6 +1080,12 @@ PCODE MethodDesc::JitCompileCodeLocked(PrepareCodeConfig* pConfig, JitListLockEn
     // JITCachedFunctionSearchStarted)
     if (!pConfig->SetNativeCode(pCode, &pOtherCode))
     {
+#ifdef HAVE_GCCOVER
+        // When GCStress is enabled, this thread should always win the publishing race
+        // since we're under a lock.
+        _ASSERTE(!GCStress<cfg_instr_jit>::IsEnabled() || !"GC Cover native code publish failed");
+#endif
+
         // Another thread beat us to publishing its copy of the JITted code.
         return pOtherCode;
     }
@@ -1062,26 +1093,10 @@ PCODE MethodDesc::JitCompileCodeLocked(PrepareCodeConfig* pConfig, JitListLockEn
 #ifdef FEATURE_CODE_VERSIONING
     pConfig->SetGeneratedOrLoadedNewCode();
 #endif
-
 #ifdef FEATURE_TIERED_COMPILATION
-    if (pFlags->IsSet(CORJIT_FLAGS::CORJIT_FLAG_TIER0))
+    if (shouldCountCalls)
     {
-        _ASSERTE(pConfig->GetCodeVersion().GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
-        _ASSERTE(pConfig->GetMethodDesc()->IsEligibleForTieredCompilation());
-        _ASSERTE(pConfig->GetMethodDesc()->GetCallCounter()->IsCallCountingEnabled(pConfig->GetMethodDesc()));
-
-        if (pConfig->JitSwitchedToOptimized())
-        {
-            // Update the tier in the code version. The JIT may have decided to switch from tier 0 to optimized, in which case
-            // call counting would have to be disabled for the method.
-            MethodDesc *methodDesc = pConfig->GetMethodDesc();
-            methodDesc->GetCallCounter()->DisableCallCounting(methodDesc);
-            pConfig->GetCodeVersion().SetOptimizationTier(NativeCodeVersion::OptimizationTierOptimized);
-        }
-        else
-        {
-            pConfig->SetShouldCountCalls();
-        }
+        pConfig->SetShouldCountCalls();
     }
 #endif
 
@@ -1172,12 +1187,12 @@ BOOL PrepareCodeConfig::SetNativeCode(PCODE pCode, PCODE * ppAlternateCodeToUse)
 {
     LIMITED_METHOD_CONTRACT;
 
-    if (m_pMethodDesc->SetNativeCodeInterlocked(pCode, NULL))
+    if (m_nativeCodeVersion.SetNativeCodeInterlocked(pCode, NULL))
     {
         return TRUE;
     }
 
-    *ppAlternateCodeToUse = m_pMethodDesc->GetNativeCode();
+    *ppAlternateCodeToUse = m_nativeCodeVersion.GetNativeCode();
     return FALSE;
 }
 
@@ -1279,7 +1294,7 @@ VersionedPrepareCodeConfig::VersionedPrepareCodeConfig(NativeCodeVersion codeVer
     LIMITED_METHOD_CONTRACT;
 
     _ASSERTE(!m_nativeCodeVersion.IsDefaultVersion());
-    _ASSERTE(m_pMethodDesc->GetCodeVersionManager()->LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     m_ilCodeVersion = m_nativeCodeVersion.GetILCodeVersion();
 }
 
@@ -1287,7 +1302,7 @@ HRESULT VersionedPrepareCodeConfig::FinishConfiguration()
 {
     STANDARD_VM_CONTRACT;
 
-    _ASSERTE(!GetMethodDesc()->GetCodeVersionManager()->LockOwnedByCurrentThread());
+    _ASSERTE(!CodeVersionManager::IsLockOwnedByCurrentThread());
 
     // Any code build stages that do just in time configuration should
     // be configured now
@@ -1308,23 +1323,6 @@ PCODE VersionedPrepareCodeConfig::IsJitCancellationRequested()
     return m_nativeCodeVersion.GetNativeCode();
 }
 
-BOOL VersionedPrepareCodeConfig::SetNativeCode(PCODE pCode, PCODE * ppAlternateCodeToUse)
-{
-    LIMITED_METHOD_CONTRACT;
-
-    //This isn't the default version so jumpstamp is never needed
-    _ASSERTE(!m_nativeCodeVersion.IsDefaultVersion());
-    if (m_nativeCodeVersion.SetNativeCodeInterlocked(pCode, NULL))
-    {
-        return TRUE;
-    }
-    else
-    {
-        *ppAlternateCodeToUse = m_nativeCodeVersion.GetNativeCode();
-        return FALSE;
-    }
-}
-
 COR_ILMETHOD* VersionedPrepareCodeConfig::GetILHeader()
 {
     STANDARD_VM_CONTRACT;
@@ -1362,7 +1360,7 @@ PrepareCodeConfigBuffer::PrepareCodeConfigBuffer(NativeCodeVersion codeVersion)
     // a bit slower path (+1 usec?)
     VersionedPrepareCodeConfig *config;
     {
-        CodeVersionManager::TableLockHolder lock(codeVersion.GetMethodDesc()->GetCodeVersionManager());
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         config = new(m_buffer) VersionedPrepareCodeConfig(codeVersion);
     }
     config->FinishConfiguration();
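
The net effect of the JitCompileCodeLocked reordering above: the code version's optimization tier is finalized first, the code is published second, and call counting is armed last, so a thread that observes the published code never sees a stale tier. A condensed, illustrative sketch (the helper name FinalizeTierAsOptimized is a stand-in, not a runtime API):

    // Order matters: once SetNativeCode() succeeds, other threads may read the tier.
    if (jittedAtTier0)
    {
        if (pConfig->JitSwitchedToOptimized())
            FinalizeTierAsOptimized(codeVersion); // stand-in: disable call counting for the
                                                  // default version, then set the tier
        else
            shouldCountCalls = true;
    }
    if (!pConfig->SetNativeCode(pCode, &pOtherCode))
        return pOtherCode;                        // another thread won the publishing race
    if (shouldCountCalls)
        pConfig->SetShouldCountCalls();
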
index eec1be1..b38e038 100644 (file)
@@ -2503,7 +2503,7 @@ HRESULT ProfToEEInterfaceImpl::GetCodeInfo3(FunctionID functionId,
             PCODE pCodeStart = NULL;
             CodeVersionManager* pCodeVersionManager = pMethodDesc->GetCodeVersionManager();
             {
-                CodeVersionManager::TableLockHolder lockHolder(pCodeVersionManager);
+                CodeVersionManager::LockHolder codeVersioningLockHolder;
 
                 ILCodeVersion ilCodeVersion = pCodeVersionManager->GetILCodeVersion(pMethodDesc, reJitId);
 
@@ -4978,7 +4978,7 @@ HRESULT ProfToEEInterfaceImpl::GetILToNativeMapping2(FunctionID functionId,
             CodeVersionManager *pCodeVersionManager = pMD->GetCodeVersionManager();
             ILCodeVersion ilCodeVersion = NULL;
             {
-                CodeVersionManager::TableLockHolder lockHolder(pCodeVersionManager);
+                CodeVersionManager::LockHolder codeVersioningLockHolder;
 
                 pCodeVersionManager->GetILCodeVersion(pMD, reJitId);
 
@@ -6517,7 +6517,7 @@ HRESULT ProfToEEInterfaceImpl::GetNativeCodeStartAddresses(FunctionID functionID
 
         ILCodeVersion ilCodeVersion = NULL;
         {
-            CodeVersionManager::TableLockHolder lockHolder(pCodeVersionManager);
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
 
             ilCodeVersion = pCodeVersionManager->GetILCodeVersion(pMD, reJitId);
 
index 278c6f9..2f46bd3 100644 (file)
 #include "../debug/ee/controller.h"
 #include "codeversion.h"
 
-// This HRESULT is only used as a private implementation detail. Corerror.xml has a comment in it
-//  reserving this value for our use but it doesn't appear in the public headers.
-#define CORPROF_E_RUNTIME_SUSPEND_REQUIRED _HRESULT_TYPEDEF_(0x80131381L)
-
 // This is just used as a unique id. Overflow is OK. If we happen to have more than 4+Billion rejits
 // and somehow manage to not run out of memory, we'll just have to redefine ReJITID as size_t.
 /* static */
@@ -637,14 +633,13 @@ HRESULT ReJitManager::UpdateActiveILVersions(
         }
     }   // for (ULONG i = 0; i < cFunctions; i++)
 
-    // For each code versioning mgr, if there's work to do, suspend EE if needed,
+    // For each code versioning mgr, if there's work to do,
     // enter the code versioning mgr's crst, and do the batched work.
-    BOOL fEESuspended = FALSE;
     SHash<CodeActivationBatchTraits>::Iterator beginIter = mgrToCodeActivationBatch.Begin();
     SHash<CodeActivationBatchTraits>::Iterator endIter = mgrToCodeActivationBatch.End();
 
     {
-        MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder;
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
 
         for (SHash<CodeActivationBatchTraits>::Iterator iter = beginIter; iter != endIter; iter++)
         {
@@ -662,24 +657,12 @@ HRESULT ReJitManager::UpdateActiveILVersions(
                 // ThreadStore crsts
                 SystemDomain::LockHolder lh;
 
-                if(!fEESuspended)
-                {
-                    // As a potential future optimization we could speculatively try to update the jump stamps without
-                    // suspending the runtime. That needs to be plumbed through BatchUpdateJumpStamps though.
-                    ThreadSuspend::SuspendEE(ThreadSuspend::SUSPEND_FOR_REJIT);
-                    fEESuspended = TRUE;
-                }
-
                 _ASSERTE(ThreadStore::HoldingThreadStore());
-                hr = pCodeVersionManager->SetActiveILCodeVersions(pCodeActivationBatch->m_methodsToActivate.Ptr(), pCodeActivationBatch->m_methodsToActivate.Count(), fEESuspended, &errorRecords);
+                hr = pCodeVersionManager->SetActiveILCodeVersions(pCodeActivationBatch->m_methodsToActivate.Ptr(), pCodeActivationBatch->m_methodsToActivate.Count(), &errorRecords);
                 if (FAILED(hr))
                     break;
             }
         }
-        if (fEESuspended)
-        {
-            ThreadSuspend::RestartEE(FALSE, TRUE);
-        }
     }
 
     if (FAILED(hr))
@@ -765,7 +748,7 @@ HRESULT ReJitManager::UpdateActiveILVersion(
     }
 
     {
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
 
         // Bind the il code version
         ILCodeVersion* pILCodeVersion = pCodeActivationBatch->m_methodsToActivate.Append();
@@ -839,7 +822,7 @@ HRESULT ReJitManager::UpdateNativeInlinerActiveILVersions(
                     pInliner = inlinerIter.GetMethodDesc();
                     {
                         CodeVersionManager *pCodeVersionManager = pCurModule->GetCodeVersionManager();
-                        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+                        CodeVersionManager::LockHolder codeVersioningLockHolder;
                         ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pInliner);
                         if (!ilVersion.HasDefaultIL())
                         {
@@ -958,7 +941,7 @@ HRESULT ReJitManager::BindILVersion(
     }
     CONTRACTL_END;
 
-    _ASSERTE(pCodeVersionManager->LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
     _ASSERTE((pModule != NULL) && (methodDef != mdTokenNil));
 
     // Check if there was there a previous rejit request for this method that hasn't been exposed back
@@ -1043,8 +1026,7 @@ HRESULT ReJitManager::ConfigureILCodeVersion(ILCodeVersion ilCodeVersion)
 {
     STANDARD_VM_CONTRACT;
 
-    CodeVersionManager* pCodeVersionManager = ilCodeVersion.GetModule()->GetCodeVersionManager();
-    _ASSERTE(!pCodeVersionManager->LockOwnedByCurrentThread());
+    _ASSERTE(!CodeVersionManager::IsLockOwnedByCurrentThread());
 
 
     HRESULT hr = S_OK;
@@ -1055,7 +1037,7 @@ HRESULT ReJitManager::ConfigureILCodeVersion(ILCodeVersion ilCodeVersion)
 
     {
         // Serialize access to the rejit state
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
         switch (ilCodeVersion.GetRejitState())
         {
         case ILCodeVersion::kStateRequested:
@@ -1118,7 +1100,7 @@ HRESULT ReJitManager::ConfigureILCodeVersion(ILCodeVersion ilCodeVersion)
                 //
                 // This code path also happens if the GetReJITParameters callback was suppressed due to
                 // the method being ReJITted as an inliner by the runtime (instead of by the user).
-                CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+                CodeVersionManager::LockHolder codeVersioningLockHolder;
                 if (ilCodeVersion.GetRejitState() == ILCodeVersion::kStateGettingReJITParameters)
                 {
                     ilCodeVersion.SetRejitState(ILCodeVersion::kStateActive);
@@ -1137,7 +1119,7 @@ HRESULT ReJitManager::ConfigureILCodeVersion(ILCodeVersion ilCodeVersion)
         {
             _ASSERTE(pFuncControl != NULL);
 
-            CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+            CodeVersionManager::LockHolder codeVersioningLockHolder;
             if (ilCodeVersion.GetRejitState() == ILCodeVersion::kStateGettingReJITParameters)
             {
                 // Inside the above call to ICorProfilerCallback4::GetReJITParameters, the profiler
@@ -1178,7 +1160,7 @@ HRESULT ReJitManager::ConfigureILCodeVersion(ILCodeVersion ilCodeVersion)
         while (true)
         {
             {
-                CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+                CodeVersionManager::LockHolder codeVersioningLockHolder;
                 if (ilCodeVersion.GetRejitState() == ILCodeVersion::kStateActive)
                 {
                     break; // the other thread got the parameters successfully, go race to rejit
@@ -1231,7 +1213,7 @@ ReJITID ReJitManager::GetReJitId(PTR_MethodDesc pMD, PCODE pCodeStart)
         return 0;
     }
 
-    CodeVersionManager::TableLockHolder ch(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
     return ReJitManager::GetReJitIdNoLock(pMD, pCodeStart);
 }
 
@@ -1259,10 +1241,9 @@ ReJITID ReJitManager::GetReJitIdNoLock(PTR_MethodDesc pMD, PCODE pCodeStart)
     CONTRACTL_END;
 
     // Caller must ensure this lock is taken!
-    CodeVersionManager* pCodeVersionManager = pMD->GetCodeVersionManager();
-    _ASSERTE(pCodeVersionManager->LockOwnedByCurrentThread());
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
 
-    NativeCodeVersion nativeCodeVersion = pCodeVersionManager->GetNativeCodeVersion(pMD, pCodeStart);
+    NativeCodeVersion nativeCodeVersion = pMD->GetCodeVersionManager()->GetNativeCodeVersion(pMD, pCodeStart);
     if (nativeCodeVersion.IsNull())
     {
         return 0;
@@ -1303,7 +1284,7 @@ HRESULT ReJitManager::GetReJITIDs(PTR_MethodDesc pMD, ULONG cReJitIds, ULONG * p
     CONTRACTL_END;
 
     CodeVersionManager* pCodeVersionManager = pMD->GetCodeVersionManager();
-    CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+    CodeVersionManager::LockHolder codeVersioningLockHolder;
 
     ULONG cnt = 0;
 
index c913132..c78c2e5 100644 (file)
 // # Important entrypoints in this code:
 //
 //
-// a) .ctor and Init(...) -  called once during AppDomain initialization
-// b) OnMethodCalled(...) -  called when a method is being invoked. When a method
-//                           has been called enough times this is currently the only
-//                           trigger that initiates re-compilation.
-// c) Shutdown() -           called during AppDomain::Exit() to begin the process
-//                           of stopping tiered compilation. After this point no more
-//                           background optimization work will be initiated but in-progress
-//                           work still needs to complete.
-// d) ShutdownAllDomains() - Called from EEShutdownHelper to block until all async work is
-//                           complete. We must do this before we shutdown the JIT.
+// a) .ctor -                called once during AppDomain initialization
+// b) HandleCallCountingForFirstCall(...) - called when a method's code version is being
+//                           invoked for the first time.
 //
 // # Overall workflow
 //
-// Methods initially call into OnMethodCalled() and once the call count exceeds
+// Methods initially call into HandleCallCountingForFirstCall() and once the call count exceeds
 // a fixed limit we queue work on to our internal list of methods needing to
 // be recompiled (m_methodsToOptimize). If there is currently no thread
 // servicing our queue asynchronously then we use the runtime threadpool
@@ -44,7 +37,7 @@
 // item we handle as many methods as possible in a fixed period of time, then
 // queue another threadpool work item if m_methodsToOptimize hasn't been drained.
 //
-// The background thread enters at StaticOptimizeMethodsCallback(), enters the
+// The background thread enters at StaticBackgroundWorkCallback(), enters the
 // appdomain, and then begins calling OptimizeMethod on each method in the
 // queue. For each method we jit it, then update the precode so that future
 // entrypoint callers will run the new code.
 
 #if defined(FEATURE_TIERED_COMPILATION) && !defined(DACCESS_COMPILE)
 
+class TieredCompilationManager::AutoResetIsBackgroundWorkScheduled
+{
+private:
+    TieredCompilationManager *m_tieredCompilationManager;
+
+public:
+    AutoResetIsBackgroundWorkScheduled(TieredCompilationManager *tieredCompilationManager)
+        : m_tieredCompilationManager(tieredCompilationManager)
+    {
+        LIMITED_METHOD_CONTRACT;
+        _ASSERTE(tieredCompilationManager == nullptr || tieredCompilationManager->m_isBackgroundWorkScheduled);
+    }
+
+    ~AutoResetIsBackgroundWorkScheduled()
+    {
+        WRAPPER_NO_CONTRACT;
+
+        if (m_tieredCompilationManager == nullptr)
+        {
+            return;
+        }
+
+        LockHolder tieredCompilationLockHolder;
+
+        _ASSERTE(m_tieredCompilationManager->m_isBackgroundWorkScheduled);
+        m_tieredCompilationManager->m_isBackgroundWorkScheduled = false;
+    }
+
+    void Cancel()
+    {
+        LIMITED_METHOD_CONTRACT;
+        m_tieredCompilationManager = nullptr;
+    }
+};
+
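
The guard above exists so that m_isBackgroundWorkScheduled, which is set optimistically under the lock, is reset on any failure path unless Cancel() marks the hand-off as complete. A standalone sketch of how this shape is typically used, with hypothetical names and std::mutex standing in for the runtime lock:

    #include <mutex>

    std::mutex g_lock;
    bool g_isWorkScheduled = false;

    void QueueWorkItem() { /* hand off to a thread pool; may throw (OOM); stubbed for the sketch */ }

    class AutoResetScheduledFlag
    {
        bool m_armed = true;

    public:
        ~AutoResetScheduledFlag()
        {
            if (!m_armed)
            {
                return;
            }
            std::lock_guard<std::mutex> hold(g_lock);
            g_isWorkScheduled = false; // failure path: undo the optimistic set
        }

        void Cancel() { m_armed = false; } // success: the worker will reset the flag
    };

    void ScheduleWork()
    {
        {
            std::lock_guard<std::mutex> hold(g_lock);
            if (g_isWorkScheduled)
            {
                return; // already scheduled
            }
            g_isWorkScheduled = true;
        }

        AutoResetScheduledFlag autoReset; // resets the flag if the call below throws
        QueueWorkItem();
        autoReset.Cancel();
    }
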
 // Called at AppDomain construction
 TieredCompilationManager::TieredCompilationManager() :
-    m_lock(CrstTieredCompilation),
     m_countOfMethodsToOptimize(0),
-    m_isAppDomainShuttingDown(FALSE),
-    m_countOptimizationThreadsRunning(0),
     m_countOfNewMethodsCalledDuringDelay(0),
     m_methodsPendingCountingForTier1(nullptr),
     m_tieringDelayTimerHandle(nullptr),
-    m_tier1CallCountingCandidateMethodRecentlyRecorded(false)
+    m_isBackgroundWorkScheduled(false),
+    m_tier1CallCountingCandidateMethodRecentlyRecorded(false),
+    m_isPendingCallCountingCompletion(false),
+    m_recentlyRequestedCallCountingCompletionAgain(false)
 {
     WRAPPER_NO_CONTRACT;
     // On Unix, we can reach here before EEConfig is initialized, so defer config-based initialization to Init()
@@ -107,347 +135,256 @@ NativeCodeVersion::OptimizationTier TieredCompilationManager::GetInitialOptimiza
         return NativeCodeVersion::OptimizationTier1;
     }
 
-    if (!pMethodDesc->GetCallCounter()->IsCallCountingEnabled(pMethodDesc))
+    if (!g_pConfig->TieredCompilation_QuickJit() ||
+        !pMethodDesc->GetLoaderAllocator()->GetCallCountingManager()->IsCallCountingEnabled(NativeCodeVersion(pMethodDesc)))
     {
         // Tier 0 call counting may have been disabled for several reasons; the intention is to start with and stay at an
         // optimized tier
         return NativeCodeVersion::OptimizationTierOptimized;
     }
-#endif
 
     return NativeCodeVersion::OptimizationTier0;
+#else
+    return NativeCodeVersion::OptimizationTierOptimized;
+#endif
 }
 
 #if defined(FEATURE_TIERED_COMPILATION) && !defined(DACCESS_COMPILE)
 
-bool TieredCompilationManager::OnMethodCodeVersionCalledFirstTime(MethodDesc* pMethodDesc)
+void TieredCompilationManager::HandleCallCountingForFirstCall(MethodDesc* pMethodDesc)
 {
     WRAPPER_NO_CONTRACT;
     _ASSERTE(pMethodDesc != nullptr);
     _ASSERTE(pMethodDesc->IsEligibleForTieredCompilation());
-    _ASSERTE(pMethodDesc->GetCallCounter()->IsCallCountingEnabled(pMethodDesc));
+    _ASSERTE(g_pConfig->TieredCompilation_CallCountingDelayMs() != 0);
 
-    if (g_pConfig->TieredCompilation_CallCountingDelayMs() == 0)
+    // An exception here (OOM) would mean that the method's calls would not be counted and it would not be promoted. A
+    // consideration is that an attempt can be made to reset the code entry point on exception (which can also OOM). That
+    // doesn't seem worth it; the exception is propagated, and there are other cases where a method may not be promoted due to OOM.
     {
-        return false;
-    }
+        LockHolder tieredCompilationLockHolder;
 
-    while (true)
-    {
-        bool attemptedToInitiateDelay = false;
-        if (!IsTieringDelayActive())
+        SArray<MethodDesc *> *methodsPendingCounting = m_methodsPendingCountingForTier1;
+        _ASSERTE((methodsPendingCounting != nullptr) == IsTieringDelayActive());
+        if (methodsPendingCounting != nullptr)
         {
-            if (!TryInitiateTieringDelay())
+            methodsPendingCounting->Append(pMethodDesc);
+            ++m_countOfNewMethodsCalledDuringDelay;
+
+            if (!m_tier1CallCountingCandidateMethodRecentlyRecorded)
             {
-                return false;
+                // Further delay call counting for currently recorded methods
+                m_tier1CallCountingCandidateMethodRecentlyRecorded = true;
             }
-            attemptedToInitiateDelay = true;
+            return;
         }
 
-        CrstHolder holder(&m_lock);
+        NewHolder<SArray<MethodDesc *>> methodsPendingCountingHolder = new SArray<MethodDesc *>();
+        methodsPendingCountingHolder->Preallocate(64);
 
-        SArray<MethodDesc*>* methodsPendingCountingForTier1 = m_methodsPendingCountingForTier1;
-        if (methodsPendingCountingForTier1 == nullptr)
-        {
-            // Timer tick callback race, try again
-            continue;
-        }
+        methodsPendingCountingHolder->Append(pMethodDesc);
+        ++m_countOfNewMethodsCalledDuringDelay;
 
-        // Record the method to resume counting later (see TieringDelayTimerCallback)
-        bool success = false;
-        EX_TRY
-        {
-            methodsPendingCountingForTier1->Append(pMethodDesc);
-            success = true;
-        }
-        EX_CATCH
+        m_methodsPendingCountingForTier1 = methodsPendingCountingHolder.Extract();
+        _ASSERTE(!m_tier1CallCountingCandidateMethodRecentlyRecorded);
+        _ASSERTE(IsTieringDelayActive());
+    }
+
+    // Elsewhere, the tiered compilation lock is taken inside the code versioning lock. The code versioning lock is an unsafe
+    // any-GC-mode lock, so the tiering lock is also that type of lock. Inside that type of lock, there is an implicit
+    // GC_NOTRIGGER contract. So, the timer cannot be created inside the tiering lock since it may GC_TRIGGERS. At this point,
+    // this is the only thread that may attempt creating the timer. If creating the timer fails, let the exception propagate,
+    // but because the tiering lock was released above, first reset any recorded methods' code entry points and deactivate the
+    // tiering delay so that timer creation may be attempted again.
+    EX_TRY
+    {
+        NewHolder<ThreadpoolMgr::TimerInfoContext> timerContextHolder = new ThreadpoolMgr::TimerInfoContext();
+        timerContextHolder->TimerId = 0;
+
+        _ASSERTE(m_tieringDelayTimerHandle == nullptr);
+        if (!ThreadpoolMgr::CreateTimerQueueTimer(
+                &m_tieringDelayTimerHandle,
+                TieringDelayTimerCallback,
+                timerContextHolder,
+                g_pConfig->TieredCompilation_CallCountingDelayMs(),
+                (DWORD)-1 /* Period, non-repeating */,
+                0 /* flags */))
         {
+            _ASSERTE(m_tieringDelayTimerHandle == nullptr);
+            ThrowOutOfMemory();
         }
-        EX_END_CATCH(RethrowTerminalExceptions);
-        if (!success)
+
+        timerContextHolder.SuppressRelease(); // the timer context is automatically deleted by the timer infrastructure
+    }
+    EX_CATCH
+    {
+        // Since the tiering lock was released and reacquired, other methods may have been recorded in-between. Just deactivate
+        // the tiering delay. Any methods that have been recorded would not have their calls counted and would not be
+        // promoted (due to the small window, there shouldn't be many of those). See consideration above in a similar exception
+        // case.
         {
-            return false;
+            LockHolder tieredCompilationLockHolder;
+
+            _ASSERTE(IsTieringDelayActive());
+            m_tier1CallCountingCandidateMethodRecentlyRecorded = false;
+            _ASSERTE(m_methodsPendingCountingForTier1 != nullptr);
+            delete m_methodsPendingCountingForTier1;
+            m_methodsPendingCountingForTier1 = nullptr;
+            _ASSERTE(!IsTieringDelayActive());
         }
 
-        ++m_countOfNewMethodsCalledDuringDelay;
+        EX_RETHROW;
+    }
+    EX_END_CATCH(RethrowTerminalExceptions);
 
-        if (!attemptedToInitiateDelay)
-        {
-            // Delay call counting for currently recoded methods further
-            m_tier1CallCountingCandidateMethodRecentlyRecorded = true;
-        }
-        return true;
+    if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
+    {
+        ETW::CompilationLog::TieredCompilation::Runtime::SendPause();
     }
 }
 
-bool TieredCompilationManager::OnMethodCodeVersionCalledSubsequently(MethodDesc* pMethodDesc)
+bool TieredCompilationManager::TrySetCodeEntryPointAndRecordMethodForCallCounting(MethodDesc* pMethodDesc, PCODE codeEntryPoint)
 {
     WRAPPER_NO_CONTRACT;
     _ASSERTE(pMethodDesc != nullptr);
-    _ASSERTE(pMethodDesc->GetCallCounter()->IsCallCountingEnabled(pMethodDesc));
+    _ASSERTE(pMethodDesc->IsEligibleForTieredCompilation());
+    _ASSERTE(codeEntryPoint != NULL);
 
-    if (!IsTieringDelayActive() || g_pConfig->TieredCompilation_CallCountingDelayMs() == 0)
+    if (!IsTieringDelayActive())
     {
         return false;
     }
 
-    CrstHolder holder(&m_lock);
+    LockHolder tieredCompilationLockHolder;
 
-    if (!m_tier1CallCountingCandidateMethodRecentlyRecorded)
+    if (!IsTieringDelayActive())
     {
-        // This is to prevent a race where the method get recorded below, the delay timer callback resets the method's entry
-        // point to begin call counting, and then the method's entry point is set by this thread to the tier 0 entry point. In
-        // that case the method would not be counted or tiered-up anymore. So, stop call counting only when the delay timer will
-        // be extended, the extra delay makes the issue near-impossible to occur. This is not a great solution and is temporary,
-        // once the call counting scheme is changed this code and issue will disappear.
         return false;
     }
 
-    SArray<MethodDesc*>* methodsPendingCountingForTier1 = m_methodsPendingCountingForTier1;
-    if (methodsPendingCountingForTier1 == nullptr)
-    {
-        // Timer tick callback race
-        _ASSERTE(!IsTieringDelayActive());
-        return false;
-    }
+    // Set the code entry point before recording the method for call counting to avoid a race. Otherwise, the tiering delay may
+    // expire and enable call counting for the method before the entry point is set here, in which case calls to the method
+    // would not be counted anymore.
+    pMethodDesc->SetCodeEntryPoint(codeEntryPoint);
+    _ASSERTE(m_methodsPendingCountingForTier1 != nullptr);
+    m_methodsPendingCountingForTier1->Append(pMethodDesc);
+    return true;
+}
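
Note the double check of IsTieringDelayActive() above: the cheap unlocked read filters the common case, and the check is repeated under the lock because the delay timer may fire in between. A standalone sketch of this pattern with hypothetical names (std::atomic makes the unlocked fast-path read well defined):

    #include <atomic>
    #include <mutex>
    #include <vector>

    std::mutex g_lock;
    std::atomic<std::vector<int> *> g_pendingMethods{nullptr}; // non-null while the delay is active

    bool TryRecordMethod(int methodId)
    {
        if (g_pendingMethods.load(std::memory_order_relaxed) == nullptr)
        {
            return false; // fast path: delay is not active
        }

        std::lock_guard<std::mutex> hold(g_lock);
        std::vector<int> *pending = g_pendingMethods.load(std::memory_order_relaxed);
        if (pending == nullptr)
        {
            return false; // recheck: the timer fired before we took the lock
        }

        pending->push_back(methodId); // safe: the delay stays active while the lock is held
        return true;
    }
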
 
-    // Record the method to resume counting later (see TieringDelayTimerCallback)
-    bool success = false;
-    EX_TRY
-    {
-        methodsPendingCountingForTier1->Append(pMethodDesc);
-        success = true;
-    }
-    EX_CATCH
+void TieredCompilationManager::AsyncPromoteToTier1(
+    NativeCodeVersion tier0NativeCodeVersion,
+    bool *scheduleTieringBackgroundWorkRef)
+{
+    CONTRACTL
     {
+        THROWS;
+        GC_NOTRIGGER;
+        MODE_ANY;
     }
-    EX_END_CATCH(RethrowTerminalExceptions);
-    return success;
-}
+    CONTRACTL_END;
 
-void TieredCompilationManager::AsyncPromoteMethodToTier1(MethodDesc* pMethodDesc)
-{
-    STANDARD_VM_CONTRACT;
+    _ASSERTE(CodeVersionManager::IsLockOwnedByCurrentThread());
+    _ASSERTE(!tier0NativeCodeVersion.IsNull());
+    _ASSERTE(tier0NativeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
+    _ASSERTE(scheduleTieringBackgroundWorkRef != nullptr);
 
     NativeCodeVersion t1NativeCodeVersion;
+    HRESULT hr;
 
     // Add an inactive native code entry in the versioning table to track the tier1
     // compilation we are going to create. This entry binds the compilation to a
     // particular version of the IL code regardless of any changes that may
     // occur between now and when jitting completes. If the IL does change in that
     // interval the new code entry won't be activated.
+    MethodDesc *pMethodDesc = tier0NativeCodeVersion.GetMethodDesc();
+    ILCodeVersion ilCodeVersion = tier0NativeCodeVersion.GetILCodeVersion();
+    _ASSERTE(!ilCodeVersion.HasAnyOptimizedNativeCodeVersion(tier0NativeCodeVersion));
+    hr = ilCodeVersion.AddNativeCodeVersion(pMethodDesc, NativeCodeVersion::OptimizationTier1, &t1NativeCodeVersion);
+    if (FAILED(hr))
     {
-        CodeVersionManager* pCodeVersionManager = pMethodDesc->GetCodeVersionManager();
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
-        ILCodeVersion ilVersion = pCodeVersionManager->GetActiveILCodeVersion(pMethodDesc);
-        NativeCodeVersionCollection nativeVersions = ilVersion.GetNativeCodeVersions(pMethodDesc);
-        for (NativeCodeVersionIterator cur = nativeVersions.Begin(), end = nativeVersions.End(); cur != end; cur++)
-        {
-            NativeCodeVersion::OptimizationTier optimizationTier = cur->GetOptimizationTier();
-            if (optimizationTier == NativeCodeVersion::OptimizationTier1 ||
-                optimizationTier == NativeCodeVersion::OptimizationTierOptimized)
-            {
-                // we've already promoted
-                LOG((LF_TIEREDCOMPILATION, LL_INFO100000, "TieredCompilationManager::AsyncPromoteMethodToTier1 Method=0x%pM (%s::%s) ignoring already promoted method\n",
-                    pMethodDesc, pMethodDesc->m_pszDebugClassName, pMethodDesc->m_pszDebugMethodName));
-                return;
-            }
-        }
-
-        HRESULT hr = S_OK;
-        if (FAILED(hr = ilVersion.AddNativeCodeVersion(pMethodDesc, NativeCodeVersion::OptimizationTier1, &t1NativeCodeVersion)))
-        {
-            // optimization didn't work for some reason (presumably OOM)
-            // just give up and continue on
-            STRESS_LOG2(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::AsyncPromoteMethodToTier1: "
-                "AddNativeCodeVersion failed hr=0x%x, method=%pM\n",
-                hr, pMethodDesc);
-            return;
-        }
+        ThrowHR(hr);
     }
 
     // Insert the method into the optimization queue and trigger a thread to service
     // the queue if needed.
     //
     // Note an error here could affect concurrent threads running this
-    // code. Those threads will observe m_countOptimizationThreadsRunning > 0 and return,
-    // then QueueUserWorkItem fails on this thread lowering the count and leaves them
+    // code. Those threads will observe m_isBackgroundWorkScheduled == true and return,
+    // then if QueueUserWorkItem fails on this thread, the field is reset to false and they are left
     // unserviced. Synchronous retries appear unlikely to offer any material improvement
     // and complicating the code to narrow an already rare error case isn't desirable.
+    SListElem<NativeCodeVersion>* pMethodListItem = new SListElem<NativeCodeVersion>(t1NativeCodeVersion);
     {
-        SListElem<NativeCodeVersion>* pMethodListItem = new (nothrow) SListElem<NativeCodeVersion>(t1NativeCodeVersion);
-        CrstHolder holder(&m_lock);
-        if (pMethodListItem != NULL)
-        {
-            m_methodsToOptimize.InsertTail(pMethodListItem);
-            ++m_countOfMethodsToOptimize;
-        }
+        LockHolder tieredCompilationLockHolder;
+
+        m_methodsToOptimize.InsertTail(pMethodListItem);
+        ++m_countOfMethodsToOptimize;
 
-        LOG((LF_TIEREDCOMPILATION, LL_INFO10000, "TieredCompilationManager::AsyncPromoteMethodToTier1 Method=0x%pM (%s::%s), code version id=0x%x queued\n",
+        LOG((LF_TIEREDCOMPILATION, LL_INFO10000, "TieredCompilationManager::AsyncPromoteToTier1 Method=0x%pM (%s::%s), code version id=0x%x queued\n",
             pMethodDesc, pMethodDesc->m_pszDebugClassName, pMethodDesc->m_pszDebugMethodName,
             t1NativeCodeVersion.GetVersionId()));
 
-        if (!IncrementWorkerThreadCountIfNeeded())
+        if (m_isBackgroundWorkScheduled || IsTieringDelayActive())
         {
             return;
         }
     }
 
-    if (!TryAsyncOptimizeMethods())
+    // This function is called from a GC_NOTRIGGER scope, and scheduling background work (creating a thread) may trigger a
+    // GC. The caller needs to schedule background work after leaving the GC_NOTRIGGER scope. The contract is that the caller
+    // must make an attempt to schedule background work in any normal path. In the event of an atypical exception (e.g. OOM),
+    // background work may not be scheduled and would have to be tried again the next time some background work is queued.
+    if (!*scheduleTieringBackgroundWorkRef)
     {
-        CrstHolder holder(&m_lock);
-        DecrementWorkerThreadCount();
+        *scheduleTieringBackgroundWorkRef = true;
     }
 }
 
-void TieredCompilationManager::Shutdown()
-{
-    STANDARD_VM_CONTRACT;
-
-    CrstHolder holder(&m_lock);
-    m_isAppDomainShuttingDown = TRUE;
-}
-
 bool TieredCompilationManager::IsTieringDelayActive()
 {
     LIMITED_METHOD_CONTRACT;
     return m_methodsPendingCountingForTier1 != nullptr;
 }
 
-bool TieredCompilationManager::TryInitiateTieringDelay()
+void WINAPI TieredCompilationManager::TieringDelayTimerCallback(PVOID parameter, BOOLEAN timerFired)
 {
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(g_pConfig->TieredCompilation());
-    _ASSERTE(g_pConfig->TieredCompilation_CallCountingDelayMs() != 0);
-
-    NewHolder<SArray<MethodDesc*>> methodsPendingCountingHolder = new(nothrow) SArray<MethodDesc*>();
-    if (methodsPendingCountingHolder == nullptr)
-    {
-        return false;
-    }
-
-    bool success = false;
-    EX_TRY
-    {
-        methodsPendingCountingHolder->Preallocate(64);
-        success = true;
-    }
-    EX_CATCH
-    {
-    }
-    EX_END_CATCH(RethrowTerminalExceptions);
-    if (!success)
-    {
-        return false;
-    }
-
-    NewHolder<ThreadpoolMgr::TimerInfoContext> timerContextHolder = new(nothrow) ThreadpoolMgr::TimerInfoContext();
-    if (timerContextHolder == nullptr)
+    CONTRACTL
     {
-        return false;
+        THROWS;
+        GC_TRIGGERS;
+        MODE_PREEMPTIVE;
     }
-    timerContextHolder->TimerId = 0;
-
-    {
-        CrstHolder holder(&m_lock);
-
-        if (IsTieringDelayActive())
-        {
-            return true;
-        }
-
-        // The timer is created inside the lock to avoid some unnecessary additional complexity that would otherwise arise from
-        // there being a failure point after the timer is successfully created. For instance, if the timer is created outside
-        // the lock and then inside the lock it is found that another thread beat us to it, there would be two active timers
-        // that may tick before the extra timer is deleted, along with additional concurrency issues.
-        _ASSERTE(m_tieringDelayTimerHandle == nullptr);
-        success = false;
-        EX_TRY
-        {
-            if (ThreadpoolMgr::CreateTimerQueueTimer(
-                    &m_tieringDelayTimerHandle,
-                    TieringDelayTimerCallback,
-                    timerContextHolder,
-                    g_pConfig->TieredCompilation_CallCountingDelayMs(),
-                    (DWORD)-1 /* Period, non-repeating */,
-                    0 /* flags */))
-            {
-                success = true;
-            }
-        }
-        EX_CATCH
-        {
-        }
-        EX_END_CATCH(RethrowTerminalExceptions);
-        if (!success)
-        {
-            _ASSERTE(m_tieringDelayTimerHandle == nullptr);
-            return false;
-        }
+    CONTRACTL_END;
 
-        m_methodsPendingCountingForTier1 = methodsPendingCountingHolder.Extract();
-        _ASSERTE(!m_tier1CallCountingCandidateMethodRecentlyRecorded);
-        _ASSERTE(IsTieringDelayActive());
-    }
+    _ASSERTE(timerFired);
 
-    timerContextHolder.SuppressRelease(); // the timer context is automatically deleted by the timer infrastructure
-    if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
-    {
-        ETW::CompilationLog::TieredCompilation::Runtime::SendPause();
-    }
-    return true;
+    GetAppDomain()->GetTieredCompilationManager()->DeactivateTieringDelay();
 }
 
-void WINAPI TieredCompilationManager::TieringDelayTimerCallback(PVOID parameter, BOOLEAN timerFired)
+void TieredCompilationManager::DeactivateTieringDelay()
 {
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(timerFired);
-
-    ThreadpoolMgr::TimerInfoContext* timerContext = (ThreadpoolMgr::TimerInfoContext*)parameter;
-    EX_TRY
-    {
-        GCX_COOP();
-        ManagedThreadBase::ThreadPool(TieringDelayTimerCallbackInAppDomain, nullptr);
-    }
-    EX_CATCH
+    CONTRACTL
     {
-        STRESS_LOG1(LF_TIEREDCOMPILATION, LL_ERROR, "TieredCompilationManager::TieringDelayTimerCallback: "
-            "Unhandled exception, hr=0x%x\n",
-            GET_EXCEPTION()->GetHR());
+        THROWS;
+        GC_TRIGGERS;
+        MODE_PREEMPTIVE;
     }
-    EX_END_CATCH(RethrowTerminalExceptions);
-}
-
-void TieredCompilationManager::TieringDelayTimerCallbackInAppDomain(LPVOID parameter)
-{
-    WRAPPER_NO_CONTRACT;
-
-    GCX_PREEMP();
-    GetAppDomain()->GetTieredCompilationManager()->TieringDelayTimerCallbackWorker();
-}
-
-void TieredCompilationManager::TieringDelayTimerCallbackWorker()
-{
-    WRAPPER_NO_CONTRACT;
+    CONTRACTL_END;
 
-    HANDLE tieringDelayTimerHandle;
-    SArray<MethodDesc*>* methodsPendingCountingForTier1;
-    UINT32 countOfNewMethodsCalledDuringDelay;
-    bool optimizeMethods;
+    HANDLE tieringDelayTimerHandle = nullptr;
+    SArray<MethodDesc *> *methodsPendingCounting = nullptr;
+    UINT32 countOfNewMethodsCalledDuringDelay = 0;
+    bool doBackgroundWork = false;
     while (true)
     {
-        bool tier1CallCountingCandidateMethodRecentlyRecorded;
         {
             // It's possible for the timer to tick before it is recorded that the delay is in effect. This lock guarantees that
             // the delay is in effect.
-            CrstHolder holder(&m_lock);
+            LockHolder tieredCompilationLockHolder;
             _ASSERTE(IsTieringDelayActive());
 
             tieringDelayTimerHandle = m_tieringDelayTimerHandle;
-            _ASSERTE(tieringDelayTimerHandle != nullptr);
-
-            tier1CallCountingCandidateMethodRecentlyRecorded = m_tier1CallCountingCandidateMethodRecentlyRecorded;
-            if (tier1CallCountingCandidateMethodRecentlyRecorded)
+            if (m_tier1CallCountingCandidateMethodRecentlyRecorded)
             {
                 m_tier1CallCountingCandidateMethodRecentlyRecorded = false;
             }
@@ -455,8 +392,8 @@ void TieredCompilationManager::TieringDelayTimerCallbackWorker()
             {
                 // Exchange information into locals inside the lock
 
-                methodsPendingCountingForTier1 = m_methodsPendingCountingForTier1;
-                _ASSERTE(methodsPendingCountingForTier1 != nullptr);
+                methodsPendingCounting = m_methodsPendingCountingForTier1;
+                _ASSERTE(methodsPendingCounting != nullptr);
                 m_methodsPendingCountingForTier1 = nullptr;
 
                 _ASSERTE(tieringDelayTimerHandle == m_tieringDelayTimerHandle);
@@ -466,154 +403,197 @@ void TieredCompilationManager::TieringDelayTimerCallbackWorker()
                 m_countOfNewMethodsCalledDuringDelay = 0;
 
                 _ASSERTE(!IsTieringDelayActive());
-                optimizeMethods = IncrementWorkerThreadCountIfNeeded();
+
+                if (!m_isBackgroundWorkScheduled && (m_isPendingCallCountingCompletion || m_countOfMethodsToOptimize != 0))
+                {
+                    m_isBackgroundWorkScheduled = true;
+                    doBackgroundWork = true;
+                }
 
                 break;
             }
         }
 
-        // Reschedule the timer if there has been recent tier 0 activity (when a new eligible method is called the first time) to
-        // further delay call counting
-        if (tier1CallCountingCandidateMethodRecentlyRecorded)
+        // Reschedule the timer if there has been recent tier 0 activity (when a new eligible method is called the first
+        // time) to further delay call counting
+        bool success = false;
+        EX_TRY
         {
-            bool success = false;
-            EX_TRY
-            {
-                if (ThreadpoolMgr::ChangeTimerQueueTimer(
-                        tieringDelayTimerHandle,
-                        g_pConfig->TieredCompilation_CallCountingDelayMs(),
-                        (DWORD)-1 /* Period, non-repeating */))
-                {
-                    success = true;
-                }
-            }
-            EX_CATCH
-            {
-            }
-            EX_END_CATCH(RethrowTerminalExceptions);
-            if (success)
+            if (ThreadpoolMgr::ChangeTimerQueueTimer(
+                    tieringDelayTimerHandle,
+                    g_pConfig->TieredCompilation_CallCountingDelayMs(),
+                    (DWORD)-1 /* Period, non-repeating */))
             {
-                return;
+                success = true;
             }
         }
+        EX_CATCH
+        {
+        }
+        EX_END_CATCH(RethrowTerminalExceptions);
+        if (success)
+        {
+            return;
+        }
     }
 
+    AutoResetIsBackgroundWorkScheduled autoResetIsBackgroundWorkScheduled(doBackgroundWork ? this : nullptr);
+
     if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
     {
         ETW::CompilationLog::TieredCompilation::Runtime::SendResume(countOfNewMethodsCalledDuringDelay);
     }
 
     // Install call counters
-    MethodDesc** methods = methodsPendingCountingForTier1->GetElements();
-    COUNT_T methodCount = methodsPendingCountingForTier1->GetCount();
-    for (COUNT_T i = 0; i < methodCount; ++i)
     {
-        MethodDesc *methodDesc = methods[i];
-        MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder(methodDesc->MayHaveEntryPointSlotsToBackpatch());
+        MethodDesc** methods = methodsPendingCounting->GetElements();
+        COUNT_T methodCount = methodsPendingCounting->GetCount();
+        CodeVersionManager *codeVersionManager = GetAppDomain()->GetCodeVersionManager();
 
-        EX_TRY
-        {
-            methodDesc->ResetCodeEntryPoint();
-        }
-        EX_CATCH
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder;
+
+        // Backpatching entry point slots requires cooperative GC mode, see
+        // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
+        // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
+        // must be used here to prevent deadlock.
+        GCX_COOP();
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
+
+        for (COUNT_T i = 0; i < methodCount; ++i)
         {
+            MethodDesc *methodDesc = methods[i];
+            _ASSERTE(codeVersionManager == methodDesc->GetCodeVersionManager());
+            NativeCodeVersion activeCodeVersion =
+                codeVersionManager->GetActiveILCodeVersion(methodDesc).GetActiveNativeCodeVersion(methodDesc);
+            if (activeCodeVersion.IsNull())
+            {
+                continue;
+            }
+
+            EX_TRY
+            {
+                bool wasSet =
+                    CallCountingManager::SetCodeEntryPoint(activeCodeVersion, activeCodeVersion.GetNativeCode(), false, nullptr);
+                _ASSERTE(wasSet);
+            }
+            EX_CATCH
+            {
+                STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::DeactivateTieringDelay: "
+                    "Exception in CallCountingManager::SetCodeEntryPoint, hr=0x%x\n",
+                    GET_EXCEPTION()->GetHR());
+            }
+            EX_END_CATCH(RethrowTerminalExceptions);
         }
-        EX_END_CATCH(RethrowTerminalExceptions);
     }
-    delete methodsPendingCountingForTier1;
 
+    delete methodsPendingCounting;
     ThreadpoolMgr::DeleteTimerQueueTimer(tieringDelayTimerHandle, nullptr);
 
-    if (optimizeMethods)
+    if (doBackgroundWork)
     {
-        OptimizeMethods();
+        autoResetIsBackgroundWorkScheduled.Cancel(); // the call below will take care of it
+        DoBackgroundWork();
     }
 }
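
DeactivateTieringDelay uses a common shape: move the shared state (pending list, timer handle) into locals while holding the lock, then do the slow work of installing call counters and deleting the timer after releasing it. A standalone sketch of that exchange-under-lock step with hypothetical names:

    #include <mutex>
    #include <vector>

    std::mutex g_lock;
    std::vector<int> *g_pendingMethods = nullptr; // non-null while the delay is active

    void DeactivateDelay()
    {
        std::vector<int> *pending = nullptr;
        {
            std::lock_guard<std::mutex> hold(g_lock);
            pending = g_pendingMethods; // take ownership of the pending list
            g_pendingMethods = nullptr; // the delay is now inactive
        }

        if (pending == nullptr)
        {
            return; // the delay was not active
        }
        for (int methodId : *pending)
        {
            (void)methodId; // install a call counter for each method (elided in this sketch)
        }
        delete pending;
    }
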
 
-bool TieredCompilationManager::TryAsyncOptimizeMethods()
+void TieredCompilationManager::AsyncCompleteCallCounting()
 {
-    WRAPPER_NO_CONTRACT;
-    _ASSERTE(DebugGetWorkerThreadCount() != 0);
-
-    // Terminal exceptions escape as exceptions, but all other errors should gracefully
-    // return to the caller. Non-terminal error conditions should be rare (ie OOM,
-    // OS failure to create thread) and we consider it reasonable for some methods
-    // to go unoptimized or have their optimization arbitrarily delayed under these
-    // circumstances.
-    bool success = false;
-    EX_TRY
+    CONTRACTL
     {
-        if (ThreadpoolMgr::QueueUserWorkItem(StaticOptimizeMethodsCallback, this, QUEUE_ONLY, TRUE))
+        THROWS;
+        GC_TRIGGERS;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    {
+        LockHolder tieredCompilationLockHolder;
+
+        if (m_recentlyRequestedCallCountingCompletionAgain)
+        {
+            _ASSERTE(m_isPendingCallCountingCompletion);
+        }
+        else if (m_isPendingCallCountingCompletion)
         {
-            success = true;
+            // A potentially large number of methods may reach the call count threshold at about the same time or in bursts.
+            // This field is used to coalesce a burst of pending completions; see the background work.
+            m_recentlyRequestedCallCountingCompletionAgain = true;
         }
         else
         {
-            STRESS_LOG0(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::OnMethodCalled: "
-                "ThreadpoolMgr::QueueUserWorkItem returned FALSE (no thread will run)\n");
+            m_isPendingCallCountingCompletion = true;
         }
-    }
-    EX_CATCH
-    {
-        STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::OnMethodCalled: "
-            "Exception queuing work item to threadpool, hr=0x%x\n",
-            GET_EXCEPTION()->GetHR());
-    }
-    EX_END_CATCH(RethrowTerminalExceptions);
-    return success;
-}
-
-// This is the initial entrypoint for the background thread, called by
-// the threadpool.
-DWORD WINAPI TieredCompilationManager::StaticOptimizeMethodsCallback(void *args)
-{
-    STANDARD_VM_CONTRACT;
 
-    TieredCompilationManager * pTieredCompilationManager = (TieredCompilationManager *)args;
-    pTieredCompilationManager->OptimizeMethodsCallback();
+        if (m_isBackgroundWorkScheduled || IsTieringDelayActive())
+        {
+            return;
+        }
+        m_isBackgroundWorkScheduled = true;
+    }
 
-    return 0;
+    AutoResetIsBackgroundWorkScheduled autoResetIsBackgroundWorkScheduled(this);
+    RequestBackgroundWork();
+    autoResetIsBackgroundWorkScheduled.Cancel();
 }
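
The pair of flags implements a simple burst coalescer: the first request marks completion as pending, later requests during the burst set the "again" flag, and the background worker defers completion one round whenever "again" was set since it last looked. A standalone sketch of that state machine with hypothetical names:

    #include <mutex>

    std::mutex g_lock;
    bool g_isPendingCompletion = false;
    bool g_requestedAgain = false;

    // Producer side: called each time a method reaches its call count threshold.
    void OnThresholdReached()
    {
        std::lock_guard<std::mutex> hold(g_lock);
        if (g_isPendingCompletion)
        {
            g_requestedAgain = true; // burst in progress, coalesce this request
        }
        else
        {
            g_isPendingCompletion = true; // first request of a burst
        }
    }

    // Consumer side: polled by the background worker each round.
    bool ShouldCompleteCallCountingNow()
    {
        std::lock_guard<std::mutex> hold(g_lock);
        if (!g_isPendingCompletion)
        {
            return false; // nothing pending
        }
        if (g_requestedAgain)
        {
            g_requestedAgain = false; // still bursting, defer one more round
            return false;
        }
        g_isPendingCompletion = false; // burst has quiesced, complete now
        return true;
    }
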
 
-void TieredCompilationManager::OptimizeMethodsCallback()
+void TieredCompilationManager::ScheduleBackgroundWork()
 {
-    STANDARD_VM_CONTRACT;
-    _ASSERTE(DebugGetWorkerThreadCount() != 0);
+    CONTRACTL
+    {
+        THROWS;
+        GC_TRIGGERS;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
 
-    // This app domain shutdown check isn't required for correctness
-    // but it should reduce some unneeded exceptions trying
-    // to enter a closed AppDomain
     {
-        CrstHolder holder(&m_lock);
-        if (m_isAppDomainShuttingDown)
+        LockHolder tieredCompilationLockHolder;
+
+        if (m_isBackgroundWorkScheduled ||
+            (!m_isPendingCallCountingCompletion && m_countOfMethodsToOptimize == 0) ||
+            IsTieringDelayActive())
         {
-            DecrementWorkerThreadCount();
             return;
         }
+        m_isBackgroundWorkScheduled = true;
     }
 
-    EX_TRY
-    {
-        GCX_COOP();
-        OptimizeMethods();
-    }
-    EX_CATCH
+    AutoResetIsBackgroundWorkScheduled autoResetIsBackgroundWorkScheduled(this);
+    RequestBackgroundWork();
+    autoResetIsBackgroundWorkScheduled.Cancel();
+}
+
+void TieredCompilationManager::RequestBackgroundWork()
+{
+    WRAPPER_NO_CONTRACT;
+    _ASSERTE(m_isBackgroundWorkScheduled);
+
+    if (!ThreadpoolMgr::QueueUserWorkItem(StaticBackgroundWorkCallback, this, QUEUE_ONLY, TRUE))
     {
-        STRESS_LOG1(LF_TIEREDCOMPILATION, LL_ERROR, "TieredCompilationManager::OptimizeMethodsCallback: "
-            "Unhandled exception on domain transition, hr=0x%x\n",
-            GET_EXCEPTION()->GetHR());
+        ThrowOutOfMemory();
     }
-    EX_END_CATCH(RethrowTerminalExceptions);
+}
+
+// This is the initial entrypoint for the background thread, called by
+// the threadpool.
+DWORD WINAPI TieredCompilationManager::StaticBackgroundWorkCallback(void *args)
+{
+    STANDARD_VM_CONTRACT;
+
+    TieredCompilationManager * pTieredCompilationManager = (TieredCompilationManager *)args;
+    pTieredCompilationManager->DoBackgroundWork();
+    return 0;
 }
 
 // This method will process one or more methods from the optimization queue
 // on a background thread. Each such method will be jitted with code
 // optimizations enabled and then installed as the active implementation
 // of the method entrypoint.
-void TieredCompilationManager::OptimizeMethods()
+void TieredCompilationManager::DoBackgroundWork()
 {
     WRAPPER_NO_CONTRACT;
-    _ASSERTE(DebugGetWorkerThreadCount() != 0);
+
+    AutoResetIsBackgroundWorkScheduled autoResetIsBackgroundWorkScheduled(this);
 
     // We need to be careful not to work for too long in a single invocation of this method or we could starve the thread pool
     // and force it to create unnecessary additional threads. We will JIT for a minimum of this quantum, then schedule another
@@ -622,67 +602,148 @@ void TieredCompilationManager::OptimizeMethods()
 
     if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
     {
-        ETW::CompilationLog::TieredCompilation::Runtime::SendBackgroundJitStart(m_countOfMethodsToOptimize);
+        UINT32 countOfMethodsToOptimize = m_countOfMethodsToOptimize;
+        if (m_isPendingCallCountingCompletion)
+        {
+            countOfMethodsToOptimize += CallCountingManager::GetCountOfCodeVersionsPendingCompletion();
+        }
+        ETW::CompilationLog::TieredCompilation::Runtime::SendBackgroundJitStart(countOfMethodsToOptimize);
     }
 
+    bool allMethodsJitted = false;
     UINT32 jittedMethodCount = 0;
     DWORD startTickCount = GetTickCount();
-    NativeCodeVersion nativeCodeVersion;
-    EX_TRY
+    while (true)
     {
-        GCX_PREEMP();
-        while (true)
+        bool completeCallCounting = false;
+        NativeCodeVersion nativeCodeVersionToOptimize;
         {
+            LockHolder tieredCompilationLockHolder;
+
+            if (IsTieringDelayActive())
             {
-                CrstHolder holder(&m_lock);
+                m_isBackgroundWorkScheduled = false;
+                autoResetIsBackgroundWorkScheduled.Cancel();
+                break;
+            }
 
-                if (IsTieringDelayActive() || m_isAppDomainShuttingDown)
+            bool wasPendingCallCountingCompletion = m_isPendingCallCountingCompletion;
+            if (wasPendingCallCountingCompletion)
+            {
+                if (m_recentlyRequestedCallCountingCompletionAgain)
+                {
+                    // A potentially large number of methods may reach the call count threshold at about the same time or in
+                    // bursts. To coalesce a burst of pending completions a bit, if another method has reached the call count
+                    // threshold since the last time it was checked here, don't complete call counting yet. Coalescing
+                    // call counting completions helps to avoid blocking foreground threads with lock contention while
+                    // methods are continuing to reach the call count threshold.
+                    m_recentlyRequestedCallCountingCompletionAgain = false;
+                }
+                else
                 {
-                    DecrementWorkerThreadCount();
-                    break;
+                    m_isPendingCallCountingCompletion = false;
+                    completeCallCounting = true;
                 }
+            }
 
-                nativeCodeVersion = GetNextMethodToOptimize();
-                if (nativeCodeVersion.IsNull())
+            if (!completeCallCounting)
+            {
+                nativeCodeVersionToOptimize = GetNextMethodToOptimize();
+                if (nativeCodeVersionToOptimize.IsNull())
                 {
-                    DecrementWorkerThreadCount();
-                    break;
+                    // Ran out of methods to JIT
+                    if (wasPendingCallCountingCompletion)
+                    {
+                        // If call counting completions are pending and delayed above for coalescing, complete call counting
+                        // now, as that will add more methods to be rejitted
+                        m_isPendingCallCountingCompletion = false;
+                        _ASSERTE(!m_recentlyRequestedCallCountingCompletionAgain);
+                        completeCallCounting = true;
+                    }
+                    else
+                    {
+                        m_isBackgroundWorkScheduled = false;
+                        autoResetIsBackgroundWorkScheduled.Cancel();
+                        allMethodsJitted = true;
+                        break;
+                    }
                 }
             }
+        }
 
-            OptimizeMethod(nativeCodeVersion);
+        _ASSERTE(completeCallCounting == !!nativeCodeVersionToOptimize.IsNull());
+        if (completeCallCounting)
+        {
+            EX_TRY
+            {
+                CallCountingManager::CompleteCallCounting();
+            }
+            EX_CATCH
+            {
+                STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::DoBackgroundWork: "
+                    "Exception in CallCountingManager::CompleteCallCounting, hr=0x%x\n",
+                    GET_EXCEPTION()->GetHR());
+            }
+            EX_END_CATCH(RethrowTerminalExceptions);
+        }
+        else
+        {
+            OptimizeMethod(nativeCodeVersionToOptimize);
             ++jittedMethodCount;
+        }
 
-            // If we have been running for too long return the thread to the threadpool and queue another event
-            // This gives the threadpool a chance to service other requests on this thread before returning to
-            // this work.
-            DWORD currentTickCount = GetTickCount();
-            if (currentTickCount - startTickCount >= OptimizationQuantumMs)
+        // If we have been running for too long return the thread to the threadpool and queue another event
+        // This gives the threadpool a chance to service other requests on this thread before returning to
+        // this work.
+        DWORD currentTickCount = GetTickCount();
+        if (currentTickCount - startTickCount >= OptimizationQuantumMs)
+        {
+            bool success = false;
+            EX_TRY
             {
-                if (!TryAsyncOptimizeMethods())
-                {
-                    CrstHolder holder(&m_lock);
-                    DecrementWorkerThreadCount();
-                }
+                RequestBackgroundWork();
+                success = true;
+            }
+            EX_CATCH
+            {
+                STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::DoBackgroundWork: "
+                    "Exception in RequestBackgroundWork, hr=0x%x\n",
+                    GET_EXCEPTION()->GetHR());
+            }
+            EX_END_CATCH(RethrowTerminalExceptions);
+            if (success)
+            {
+                autoResetIsBackgroundWorkScheduled.Cancel();
                 break;
             }
+
+            startTickCount = currentTickCount;
         }
     }
-    EX_CATCH
+
+    if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
     {
+        UINT32 countOfMethodsToOptimize = m_countOfMethodsToOptimize;
+        if (m_isPendingCallCountingCompletion)
         {
-            CrstHolder holder(&m_lock);
-            DecrementWorkerThreadCount();
+            countOfMethodsToOptimize += CallCountingManager::GetCountOfCodeVersionsPendingCompletion();
         }
-        STRESS_LOG2(LF_TIEREDCOMPILATION, LL_ERROR, "TieredCompilationManager::OptimizeMethods: "
-            "Unhandled exception during method optimization, hr=0x%x, last method=%p\n",
-            GET_EXCEPTION()->GetHR(), nativeCodeVersion.GetMethodDesc());
+        ETW::CompilationLog::TieredCompilation::Runtime::SendBackgroundJitStop(countOfMethodsToOptimize, jittedMethodCount);
     }
-    EX_END_CATCH(RethrowTerminalExceptions);
 
-    if (ETW::CompilationLog::TieredCompilation::Runtime::IsEnabled())
+    if (allMethodsJitted)
     {
-        ETW::CompilationLog::TieredCompilation::Runtime::SendBackgroundJitStop(m_countOfMethodsToOptimize, jittedMethodCount);
+        EX_TRY
+        {
+            CallCountingManager::StopAndDeleteAllCallCountingStubs();
+        }
+        EX_CATCH
+        {
+            STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING, "TieredCompilationManager::DoBackgroundWork: "
+                "Exception in CallCountingManager::StopAndDeleteAllCallCountingStubs, hr=0x%x\n",
+                GET_EXCEPTION()->GetHR());
+        }
+        EX_END_CATCH(RethrowTerminalExceptions);
     }
 }
 
@@ -746,120 +807,66 @@ void TieredCompilationManager::ActivateCodeVersion(NativeCodeVersion nativeCodeV
     STANDARD_VM_CONTRACT;
 
     MethodDesc* pMethod = nativeCodeVersion.GetMethodDesc();
-    CodeVersionManager* pCodeVersionManager = pMethod->GetCodeVersionManager();
 
     // If the ilParent version is active this will activate the native code version now.
     // Otherwise if the ilParent version becomes active again in the future the native
     // code version will activate then.
     ILCodeVersion ilParent;
     HRESULT hr = S_OK;
-    bool mayHaveEntryPointSlotsToBackpatch = pMethod->MayHaveEntryPointSlotsToBackpatch();
-    MethodDescBackpatchInfoTracker::ConditionalLockHolder lockHolder(mayHaveEntryPointSlotsToBackpatch);
-
     {
+        bool mayHaveEntryPointSlotsToBackpatch = pMethod->MayHaveEntryPointSlotsToBackpatch();
+        MethodDescBackpatchInfoTracker::ConditionalLockHolder slotBackpatchLockHolder(mayHaveEntryPointSlotsToBackpatch);
+
         // Backpatching entry point slots requires cooperative GC mode, see
         // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
         // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
         // must be used here to prevent deadlock.
         GCX_MAYBE_COOP(mayHaveEntryPointSlotsToBackpatch);
-        CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
+        CodeVersionManager::LockHolder codeVersioningLockHolder;
 
         // As long as we are exclusively using any non-JumpStamp publishing for tiered compilation
         // methods this first attempt should succeed
         ilParent = nativeCodeVersion.GetILCodeVersion();
-        hr = ilParent.SetActiveNativeCodeVersion(nativeCodeVersion, FALSE);
+        hr = ilParent.SetActiveNativeCodeVersion(nativeCodeVersion);
         LOG((LF_TIEREDCOMPILATION, LL_INFO10000, "TieredCompilationManager::ActivateCodeVersion Method=0x%pM (%s::%s), code version id=0x%x. SetActiveNativeCodeVersion ret=0x%x\n",
             pMethod, pMethod->m_pszDebugClassName, pMethod->m_pszDebugMethodName,
             nativeCodeVersion.GetVersionId(),
             hr));
     }
-    if (hr == CORPROF_E_RUNTIME_SUSPEND_REQUIRED)
-    {
-        // if we start using jump-stamp publishing for tiered compilation, the first attempt
-        // without the runtime suspended will fail and then this second attempt will
-        // succeed.
-        // Even though this works performance is likely to be quite bad. Realistically
-        // we are going to need batched updates to makes tiered-compilation + jump-stamp
-        // viable. This fallback path is just here as a proof-of-concept.
-        ThreadSuspend::SuspendEE(ThreadSuspend::SUSPEND_FOR_REJIT);
-        {
-            // Backpatching entry point slots requires cooperative GC mode, see
-            // MethodDescBackpatchInfoTracker::Backpatch_Locked(). The code version manager's table lock is an unsafe lock that
-            // may be taken in any GC mode. The lock is taken in cooperative GC mode on some other paths, so the same ordering
-            // must be used here to prevent deadlock.
-            GCX_MAYBE_COOP(mayHaveEntryPointSlotsToBackpatch);
-            CodeVersionManager::TableLockHolder lock(pCodeVersionManager);
-
-            hr = ilParent.SetActiveNativeCodeVersion(nativeCodeVersion, TRUE);
-            LOG((LF_TIEREDCOMPILATION, LL_INFO10000, "TieredCompilationManager::ActivateCodeVersion Method=0x%pM (%s::%s), code version id=0x%x. [Suspended] SetActiveNativeCodeVersion ret=0x%x\n",
-                pMethod, pMethod->m_pszDebugClassName, pMethod->m_pszDebugMethodName,
-                nativeCodeVersion.GetVersionId(),
-                hr));
-        }
-        ThreadSuspend::RestartEE(FALSE, TRUE);
-    }
     if (FAILED(hr))
     {
-        STRESS_LOG2(LF_TIEREDCOMPILATION, LL_INFO10, "TieredCompilationManager::ActivateCodeVersion: Method %pM failed to publish native code for native code version %d\n",
+        STRESS_LOG2(LF_TIEREDCOMPILATION, LL_INFO10, "TieredCompilationManager::ActivateCodeVersion: "
+            "Method %pM failed to publish native code for native code version %d\n",
             pMethod, nativeCodeVersion.GetVersionId());
     }
 }
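
The comment retained above encodes the ordering rule for the code versioning lock: enter the required GC mode first, then take the lock, so that every path acquires it in a consistent mode. A compact restatement of the pattern used in this function (names are from the diff; the body is elided):

    // Ordering pattern for publishing a code version (restated from above).
    {
        GCX_MAYBE_COOP(mayHaveEntryPointSlotsToBackpatch);       // GC mode first, when slots may be backpatched
        CodeVersionManager::LockHolder codeVersioningLockHolder; // then the code versioning lock
        // ... SetActiveNativeCodeVersion(...) and entry point publishing ...
    }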
 
 // Dequeues the next method from the optimization queue.
-// This should be called with m_lock already held and runs
-// on the background thread.
+// This runs on the background thread.
 NativeCodeVersion TieredCompilationManager::GetNextMethodToOptimize()
 {
-    STANDARD_VM_CONTRACT;
+    CONTRACTL
+    {
+        NOTHROW;
+        GC_NOTRIGGER;
+        MODE_ANY;
+    }
+    CONTRACTL_END;
+
+    _ASSERTE(IsLockOwnedByCurrentThread());
 
     SListElem<NativeCodeVersion>* pElem = m_methodsToOptimize.RemoveHead();
     if (pElem != NULL)
     {
         NativeCodeVersion nativeCodeVersion = pElem->GetValue();
         delete pElem;
+        _ASSERTE(m_countOfMethodsToOptimize != 0);
         --m_countOfMethodsToOptimize;
         return nativeCodeVersion;
     }
     return NativeCodeVersion();
 }
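
The contract change above (NOTHROW, GC_NOTRIGGER, and the lock-ownership assert) reflects that callers now dequeue while holding the static tiering lock rather than expecting the callee to synchronize. A minimal sketch of the intended caller pattern, assuming a hypothetical worker loop (not the literal code in this change):

    // Hypothetical worker loop: dequeue under the lock, optimize outside it.
    while (true)
    {
        NativeCodeVersion nativeCodeVersion;
        {
            LockHolder tieredCompilationLockHolder; // s_lock guards m_methodsToOptimize
            nativeCodeVersion = GetNextMethodToOptimize();
        }
        if (nativeCodeVersion.IsNull())
        {
            break; // queue drained
        }
        OptimizeMethod(nativeCodeVersion); // JITs tier 1 code; must not hold the lock
    }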
 
-bool TieredCompilationManager::IncrementWorkerThreadCountIfNeeded()
-{
-    WRAPPER_NO_CONTRACT;
-    // m_lock should be held
-
-    if (0 == m_countOptimizationThreadsRunning &&
-        !m_isAppDomainShuttingDown &&
-        !m_methodsToOptimize.IsEmpty() &&
-        !IsTieringDelayActive())
-    {
-        // Our current policy throttles at 1 thread, but in the future we
-        // could experiment with more parallelism.
-        m_countOptimizationThreadsRunning++;
-        return true;
-    }
-    return false;
-}
-
-void TieredCompilationManager::DecrementWorkerThreadCount()
-{
-    STANDARD_VM_CONTRACT;
-    // m_lock should be held
-    _ASSERTE(m_countOptimizationThreadsRunning != 0);
-
-    m_countOptimizationThreadsRunning--;
-}
-
-#ifdef _DEBUG
-DWORD TieredCompilationManager::DebugGetWorkerThreadCount()
-{
-    WRAPPER_NO_CONTRACT;
-
-    CrstHolder holder(&m_lock);
-    return m_countOptimizationThreadsRunning;
-}
-#endif
-
 //static
 CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeVersion)
 {
@@ -875,8 +882,12 @@ CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeV
         return flags;
     }
 
-    if (nativeCodeVersion.IsDefaultVersion()) // slightly faster common path during startup compared to below
+    // Determine the optimization tier for the default code version (a slightly faster common path during startup compared to
+    // the switch below). If the tier will not be tier 0, also disable call counting and record the optimization tier; other
+    // code paths consult this for the default code version to avoid the extra expense of GetOptimizationTier().
+    if (nativeCodeVersion.IsDefaultVersion())
     {
+        NativeCodeVersion::OptimizationTier newOptimizationTier;
         if (!methodDesc->RequestedAggressiveOptimization())
         {
             if (g_pConfig->TieredCompilation_QuickJit())
@@ -884,12 +895,17 @@ CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeV
                 flags.Set(CORJIT_FLAGS::CORJIT_FLAG_TIER0);
                 return flags;
             }
+
+            newOptimizationTier = NativeCodeVersion::OptimizationTierOptimized;
         }
         else
         {
+            newOptimizationTier = NativeCodeVersion::OptimizationTier1;
             flags.Set(CORJIT_FLAGS::CORJIT_FLAG_TIER1);
         }
 
+        methodDesc->GetLoaderAllocator()->GetCallCountingManager()->DisableCallCounting(nativeCodeVersion);
+        nativeCodeVersion.SetOptimizationTier(newOptimizationTier);
     #ifdef FEATURE_INTERPRETER
         flags.Set(CORJIT_FLAGS::CORJIT_FLAG_MAKEFINALCODE);
     #endif
@@ -899,10 +915,7 @@ CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeV
     switch (nativeCodeVersion.GetOptimizationTier())
     {
         case NativeCodeVersion::OptimizationTier0:
-            if (!g_pConfig->TieredCompilation_QuickJit())
-            {
-                goto OptTierOptimized;
-            }
+            _ASSERTE(g_pConfig->TieredCompilation_QuickJit());
             flags.Set(CORJIT_FLAGS::CORJIT_FLAG_TIER0);
             break;
 
@@ -911,7 +924,6 @@ CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeV
             // fall through
 
         case NativeCodeVersion::OptimizationTierOptimized:
-        OptTierOptimized:
 #ifdef FEATURE_INTERPRETER
             flags.Set(CORJIT_FLAGS::CORJIT_FLAG_MAKEFINALCODE);
 #endif
@@ -923,4 +935,14 @@ CORJIT_FLAGS TieredCompilationManager::GetJitFlags(NativeCodeVersion nativeCodeV
     return flags;
 }
 
+CrstStatic TieredCompilationManager::s_lock;
+
+#ifdef _DEBUG
+bool TieredCompilationManager::IsLockOwnedByCurrentThread()
+{
+    WRAPPER_NO_CONTRACT;
+    return !!s_lock.OwnedByCurrentThread();
+}
+#endif // _DEBUG
+
 #endif // FEATURE_TIERED_COMPILATION && !DACCESS_COMPILE
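
Condensing the GetJitFlags() changes above: for the default code version, a normal method gets tier 0 when QuickJit is enabled, otherwise it is jitted optimized immediately, and methods requesting aggressive optimization go straight to tier 1 with call counting disabled. A hypothetical helper restating just the tier decision (illustrative, not part of the change):

    // Hypothetical summary of the default-code-version tier selection above.
    static NativeCodeVersion::OptimizationTier GetDefaultVersionTier(MethodDesc *methodDesc)
    {
        if (methodDesc->RequestedAggressiveOptimization())
            return NativeCodeVersion::OptimizationTier1;      // optimized immediately, counting disabled
        if (g_pConfig->TieredCompilation_QuickJit())
            return NativeCodeVersion::OptimizationTier0;      // quick-jitted, call counted, promotable
        return NativeCodeVersion::OptimizationTierOptimized;  // QuickJit off: optimized immediately, counting disabled
    }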
index 1f4f693..ee7bac7 100644
@@ -6,6 +6,13 @@
 //
 // ===========================================================================
 
+// Exceptions (OOM)
+// - On foreground threads, exceptions are propagated unless they can be handled without any compromise
+// - On background threads, exceptions are caught and logged. The scope of an exception is limited to one method or code
+//   version, such that in a loop over many, an exception aborts only the current iteration rather than the entire loop (see
+//   the sketch after this hunk).
+// - Exceptions may cause one or more methods to not be promoted for the time being and, perhaps, though rarely, to lose any
+//   further chance of promotion
 
 #ifndef TIERED_COMPILATION_H
 #define TIERED_COMPILATION_H
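
The per-iteration exception scoping described in the header comment above typically uses CoreCLR's EX_TRY/EX_CATCH pattern. A minimal hypothetical sketch; DequeueNextCodeVersion() and PromoteCodeVersion() are illustrative names, not APIs from this change:

    // An OOM while promoting one code version is caught and logged so that the
    // remaining iterations still run.
    NativeCodeVersion nativeCodeVersion;
    while (!(nativeCodeVersion = DequeueNextCodeVersion()).IsNull())
    {
        EX_TRY
        {
            PromoteCodeVersion(nativeCodeVersion); // may throw, e.g. on OOM
        }
        EX_CATCH
        {
            // Only this code version misses or delays promotion; the loop continues
            STRESS_LOG1(LF_TIEREDCOMPILATION, LL_WARNING,
                "Background promotion failed for code version %d\n",
                nativeCodeVersion.GetVersionId());
        }
        EX_END_CATCH(SwallowAllExceptions);
    }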
@@ -34,43 +41,79 @@ public:
 #ifdef FEATURE_TIERED_COMPILATION
 
 public:
-    bool OnMethodCodeVersionCalledFirstTime(MethodDesc* pMethodDesc);
-    bool OnMethodCodeVersionCalledSubsequently(MethodDesc* pMethodDesc);
-    void AsyncPromoteMethodToTier1(MethodDesc* pMethodDesc);
-    void Shutdown();
+    void HandleCallCountingForFirstCall(MethodDesc* pMethodDesc);
+    bool TrySetCodeEntryPointAndRecordMethodForCallCounting(MethodDesc* pMethodDesc, PCODE codeEntryPoint);
+    void AsyncPromoteToTier1(NativeCodeVersion tier0NativeCodeVersion, bool *scheduleTieringBackgroundWorkRef);
     static CORJIT_FLAGS GetJitFlags(NativeCodeVersion nativeCodeVersion);
 
 private:
     bool IsTieringDelayActive();
-    bool TryInitiateTieringDelay();
     static void WINAPI TieringDelayTimerCallback(PVOID parameter, BOOLEAN timerFired);
-    static void TieringDelayTimerCallbackInAppDomain(LPVOID parameter);
-    void TieringDelayTimerCallbackWorker();
+    void DeactivateTieringDelay();
 
-    bool TryAsyncOptimizeMethods();
-    static DWORD StaticOptimizeMethodsCallback(void* args);
-    void OptimizeMethodsCallback();
-    void OptimizeMethods();
+public:
+    void AsyncCompleteCallCounting();
+
+public:
+    void ScheduleBackgroundWork();
+private:
+    void RequestBackgroundWork();
+    static DWORD StaticBackgroundWorkCallback(void* args);
+    void DoBackgroundWork();
+
+private:
     void OptimizeMethod(NativeCodeVersion nativeCodeVersion);
     NativeCodeVersion GetNextMethodToOptimize();
     BOOL CompileCodeVersion(NativeCodeVersion nativeCodeVersion);
     void ActivateCodeVersion(NativeCodeVersion nativeCodeVersion);
 
-    bool IncrementWorkerThreadCountIfNeeded();
-    void DecrementWorkerThreadCount();
+#ifndef DACCESS_COMPILE
+private:
+    static CrstStatic s_lock;
+
+public:
+    static void StaticInitialize()
+    {
+        WRAPPER_NO_CONTRACT;
+
+        // CodeVersionManager's lock is also CRST_UNSAFE_ANYMODE. In the few cases where both locks must be held, this lock is
+        // taken after the CodeVersionManager's lock, to avoid having to hold this lock over larger sections of code merely to
+        // satisfy the opposite ordering. It must therefore also be CRST_UNSAFE_ANYMODE.
+        s_lock.Init(CrstTieredCompilation, CrstFlags(CRST_UNSAFE_ANYMODE));
+    }
+
 #ifdef _DEBUG
-    DWORD DebugGetWorkerThreadCount();
+public:
+    static bool IsLockOwnedByCurrentThread();
 #endif
 
-    Crst m_lock;
+public:
+    class LockHolder : private CrstHolder
+    {
+    public:
+        LockHolder() : CrstHolder(&s_lock)
+        {
+            WRAPPER_NO_CONTRACT;
+        }
+
+        LockHolder(const LockHolder &) = delete;
+        LockHolder &operator =(const LockHolder &) = delete;
+    };
+
+private:
+    class AutoResetIsBackgroundWorkScheduled;
+#endif // !DACCESS_COMPILE
+
+private:
     SList<SListElem<NativeCodeVersion>> m_methodsToOptimize;
     UINT32 m_countOfMethodsToOptimize;
-    BOOL m_isAppDomainShuttingDown;
-    DWORD m_countOptimizationThreadsRunning;
     UINT32 m_countOfNewMethodsCalledDuringDelay;
     SArray<MethodDesc*>* m_methodsPendingCountingForTier1;
     HANDLE m_tieringDelayTimerHandle;
+    bool m_isBackgroundWorkScheduled;
     bool m_tier1CallCountingCandidateMethodRecentlyRecorded;
+    bool m_isPendingCallCountingCompletion;
+    bool m_recentlyRequestedCallCountingCompletionAgain;
 
     CLREvent m_asyncWorkDoneEvent;
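
Per the ordering rule documented in StaticInitialize() above, the few paths that need both the code versioning lock and the tiering lock take them in a fixed order. A compact illustration; the block body is hypothetical:

    // Fixed acquisition order when both locks are needed (illustrative).
    {
        CodeVersionManager::LockHolder codeVersioningLockHolder;          // first: code versioning lock
        TieredCompilationManager::LockHolder tieredCompilationLockHolder; // then: tiering lock
        // ... read the active code version and update tiering state together ...
    }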
 
index 6cc980b..b4bc623 100644
@@ -1502,28 +1502,13 @@ WorkRequest* ThreadpoolMgr::DequeueWorkRequest()
     RETURN entry;
 }
 
-DWORD WINAPI ThreadpoolMgr::ExecuteHostRequest(PVOID pArg)
-{
-    CONTRACTL
-    {
-        THROWS;
-        GC_TRIGGERS;
-        MODE_ANY;
-    }
-    CONTRACTL_END;
-
-    bool foundWork, wasNotRecalled;
-    ExecuteWorkRequest(&foundWork, &wasNotRecalled);
-    return ERROR_SUCCESS;
-}
-
 void ThreadpoolMgr::ExecuteWorkRequest(bool* foundWork, bool* wasNotRecalled)
 {
     CONTRACTL
     {
         THROWS;     // QueueUserWorkItem can throw
         GC_TRIGGERS;
-        MODE_ANY;
+        MODE_PREEMPTIVE;
     }
     CONTRACTL_END;
 
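
With the contract tightened to MODE_PREEMPTIVE above, ExecuteWorkRequest() must be entered in preemptive GC mode. An illustrative, hypothetical call site for a thread that might be in cooperative mode:

    // Hypothetical call site: switch to preemptive GC mode around the dispatch.
    bool foundWork, wasNotRecalled;
    {
        GCX_PREEMP();
        ThreadpoolMgr::ExecuteWorkRequest(&foundWork, &wasNotRecalled);
    }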
index 8087666..a9f489c 100644
@@ -774,8 +774,6 @@ public:
 
     static void ExecuteWorkRequest(bool* foundWork, bool* wasNotRecalled);
 
-    static DWORD WINAPI ExecuteHostRequest(PVOID pArg);
-
 #ifndef DACCESS_COMPILE
 
     inline static void AppendWorkRequest(WorkRequest* entry)