More efficient executable allocator (#83632)
* Add multi-element caching scheme to ExecutableAllocator
* Do not cache the RW mapping created for CodePageGenerator-generated code in LoaderHeaps: it is never accessed again, and caching it evicts useful data from the cache
* Fix build breaks introduced by executable mapping cache changes
* Fix build breaks caused by introducing executable pages that are never cached
* Move VSD executable heaps from being LoaderHeaps to being CodeFragmentHeaps
- Should reduce the amount of contention on the ExecutableAllocator cache
- Will improve the performance of identifying what type of stub is in use by avoiding the RangeList structure
- Note, this will only apply to stubs which are used in somewhat larger applications
* Add statistics gathering features to ExecutableAllocator
* In progress
* Fix DAC API failing when called early in process startup
* Implement interleaved stubs as 16KB pages instead of 4KB pages
* Remove API that was incorrectly added
* Adjust cache size down to 3, and leave a breadcrumb for enabling more cache size exploration
* Fix x86 build
* Tweaks to make it all build and fix some bugs
- Notably, arm32 still uses only 4KB pages, as before, because it can't generate the required immediate.
* Add statistics for linked list walk lengths
* Reorder linked list on access
* Fix some more asserts and build breaks
* Fix Arm build for real this time, and fix Unix arm64 miscalculation of which stubs to use
* Update based on code review comments
* More code review feedback
* Fix oops
* Attempt to fix Unix Arm64 build
* Try tweaking the number of cached mappings to see if the illegal instruction signal will go away in our testing
* Revert "Try tweaking the number of cached mappings to see if the illegal instruction signal will go away in our testing"
This reverts commit 983846083fea60eab9df53cf07a2fb5230c6c818.
* Fix last code review comment
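
For readers unfamiliar with the caching idea behind "Add multi-element caching scheme", "Reorder linked list on access", and "Add statistics for linked list walk lengths", the following is a minimal illustrative sketch only. It is not the code from this change: `CachedMapping`, `MappingCache`, and all member names are hypothetical, and it uses `std::list` rather than whatever list structure ExecutableAllocator actually uses. The capacity of 3 in the usage note mirrors the cache size mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical stand-in for one cached RW mapping of an executable range.
struct CachedMapping {
    std::uintptr_t rxBase;  // start of the executable (RX) range
    std::size_t    size;    // length of the range in bytes
    void*          rwView;  // writable (RW) view of the same memory
};

// Small fixed-capacity cache: a hit moves the entry to the front of the
// list, and inserting into a full cache evicts the entry at the tail.
class MappingCache {
public:
    explicit MappingCache(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);
    }

    // Returns the RW address matching rx if [rx, rx + size) lies inside a
    // cached range, or nullptr on a miss. Records how many entries were
    // visited so that walk lengths can be reported as a statistic.
    void* Find(std::uintptr_t rx, std::size_t size) {
        std::size_t steps = 0;
        for (auto it = m_entries.begin(); it != m_entries.end(); ++it) {
            ++steps;
            if (rx >= it->rxBase && rx + size <= it->rxBase + it->size) {
                m_lastWalkLength = steps;
                // Reorder on access: splice relinks the node in place,
                // and 'it' stays valid afterwards.
                m_entries.splice(m_entries.begin(), m_entries, it);
                return static_cast<char*>(it->rwView) + (rx - it->rxBase);
            }
        }
        m_lastWalkLength = steps;
        return nullptr;
    }

    // Adds a mapping at the front, evicting the tail entry when full.
    void Insert(const CachedMapping& mapping) {
        if (m_entries.size() == m_capacity) {
            m_entries.pop_back();
        }
        m_entries.push_front(mapping);
    }

    std::size_t LastWalkLength() const { return m_lastWalkLength; }
    std::size_t Size() const { return m_entries.size(); }

private:
    std::size_t m_capacity;
    std::size_t m_lastWalkLength = 0;
    std::list<CachedMapping> m_entries;
};
```

With `MappingCache cache(3);`, a repeated lookup of the same range finds it at the head of the list on the second call, and inserting a fourth mapping drops whichever entry was used least recently. A real allocator would also need locking and would unmap the evicted RW view; both are omitted here.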