Improve ForEach micro-benchmark for ImmutableArray (#1183)
* Improve Foreach benchmark in ImmutableArray<Int32>
Comparing the generated assembly with that of the same benchmark for Array
shows that the GetEnumerator call is not being inlined (the loop itself is).
In the case of the ValueType ImmutableArray.GetEnumerator method,
there's a call to ThrowNullRefIfNotInitialized for validation.
Adding MethodImplAttribute(MethodImplOptions.AggressiveInlining)
to both methods gets the JIT to inline the call, and the benchmark results
become similar to those for Array.
The hardware counters collected in the benchmark show fewer
CacheMisses and BranchMispredictions/Op when the inlining happens.
Unfortunately, the same fix didn't seem to work for the other overloads of
GetEnumerator (the explicit generic implementation). That still needs
more investigation.
* AggressiveInlining on ThrowNullRefIfNotInitialized isn't needed to inline GetEnumerator
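
For illustration, the shape of the change can be sketched as below. This is a minimal stand-in, not the actual ImmutableArray source: ArrayWrapper<T>, its Enumerator, and the body of ThrowNullRefIfNotInitialized are assumptions made up for the example. Only the attribute on GetEnumerator reflects what the commits above describe, and the validation helper is left unattributed, as in the final state.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for ImmutableArray<T>; not the real implementation.
public readonly struct ArrayWrapper<T>
{
    private readonly T[] _array;

    public ArrayWrapper(T[] array) => _array = array;

    // Ask the JIT to inline this call at the foreach site.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public Enumerator GetEnumerator()
    {
        ThrowNullRefIfNotInitialized();
        return new Enumerator(_array);
    }

    // Assumed body: dereferencing the field throws NullReferenceException
    // when the struct is default-initialized. No attribute is applied here.
    private void ThrowNullRefIfNotInitialized()
    {
        _ = _array.Length;
    }

    // Struct enumerator, so foreach does not allocate.
    public struct Enumerator
    {
        private readonly T[] _array;
        private int _index;

        internal Enumerator(T[] array)
        {
            _array = array;
            _index = -1;
        }

        public T Current => _array[_index];

        public bool MoveNext() => ++_index < _array.Length;
    }
}

public static class Program
{
    public static void Main()
    {
        var wrapper = new ArrayWrapper<int>(new[] { 1, 2, 3 });
        int sum = 0;
        foreach (int value in wrapper)
        {
            sum += value;
        }
        Console.WriteLine(sum); // prints 6
    }
}
```

AggressiveInlining is a request to the JIT rather than a guarantee, so whether the call is actually inlined is best confirmed by inspecting the generated assembly, as was done for this change.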