Related to the issue mentioned in https://github.com/dotnet/coreclr/issues/13388
- Multiplying the YieldProcessor count by the processor count can cause excessive, unproductive delays on machines with a large number of procs. Even on a 12-proc (6-core) machine, the existing heuristics seem to perform much better without the multiply.
- The issue above also mentions that PAUSE has a significantly larger delay on Intel Skylake+ processors (140 cycles vs. 10 cycles). Simulating that by multiplying the YieldProcessor count by 14 shows that both tests that were run begin crawling at low thread counts.
- I did most of the testing on ManualResetEventSlim; since Task uses the same spin heuristics, I applied the same change there as well.
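
To make the effect of the multiplier concrete, here is a rough sketch (not part of the change itself) that prints the iteration count passed to `Thread.SpinWait` at each backoff step, with and without the processor-count factor. The class name `SpinBackoffSketch`, the helper `SpinIterations`, the hard-coded processor count of 12, and the range of `i` are assumptions for illustration only; the expression `procCount * (4 << i)` is taken from the diff below.

```csharp
using System;

static class SpinBackoffSketch
{
    // Iteration count that would be passed to Thread.SpinWait at backoff step i.
    // procCount == 1 models the new behavior (no processor-count multiplier).
    public static long SpinIterations(int procCount, int i) => (long)procCount * (4 << i);

    public static void Main()
    {
        const int procCount = 12; // assumption: the 12-proc machine mentioned above

        // The upper bound of 10 steps is arbitrary and chosen only for display.
        for (int i = 0; i < 10; i++)
        {
            long before = SpinIterations(procCount, i); // old: ProcessorCount * (4 << i)
            long after = SpinIterations(1, i);          // new: 4 << i
            Console.WriteLine($"i={i}: before={before}, after={after}");
        }
    }
}
```

The old count is always `procCount` times the new one, so the gap grows linearly with the number of processors, and any increase in the per-iteration PAUSE latency is scaled by that same factor.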
}
else
{
- Thread.SpinWait(PlatformHelper.ProcessorCount * (4 << i));
+ Thread.SpinWait(4 << i);
}
}
else if (i % HOW_MANY_YIELD_EVERY_SLEEP_1 == 0)
}
else
{
- Thread.SpinWait(PlatformHelper.ProcessorCount * (4 << i));
+ Thread.SpinWait(4 << i);
}
}