Optimize tracking of peak costs

Use a separate container for the potentially frequent write
operations when updating the peak memory consumption of all
allocation locations. This allows us to only update what is
actually required and thus makes better use of our CPU caches.
In one of my example files, this decreases the processing time
from about 2.2s to 1.8s, i.e. it is about 22% faster. Compared
to the original approach with the different "peak" heuristic,
which took ~1.2s, we are still about 33% slower, though.
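
For illustration only, here is a minimal sketch of the general
idea of splitting hot and cold data. It is not the code from this
change: PeakTracker, LocationInfo and all member names are
hypothetical, and it assumes that the per-location peak is a
snapshot of the live bytes taken whenever the total reaches a new
maximum.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cold data: written rarely, not touched on a new peak.
struct LocationInfo {
    std::string label;
    std::int64_t allocations;
};

// Hypothetical tracker: the frequently written counters live in
// their own dense vectors, indexed like `infos`.
class PeakTracker {
public:
    std::size_t addLocation(std::string label) {
        infos.push_back(LocationInfo{std::move(label), 0});
        live.push_back(0);
        atPeak.push_back(0);
        return infos.size() - 1;
    }

    void allocate(std::size_t loc, std::int64_t bytes) {
        ++infos[loc].allocations;
        live[loc] += bytes;
        totalLive += bytes;
        if (totalLive > totalPeak) {
            totalPeak = totalLive;
            // Assumption: a new total peak snapshots every location.
            // Only the dense counter vector is copied here, the
            // larger LocationInfo entries stay untouched.
            atPeak = live;
        }
    }

    void deallocate(std::size_t loc, std::int64_t bytes) {
        live[loc] -= bytes;
        totalLive -= bytes;
    }

    std::int64_t peakTotal() const { return totalPeak; }
    std::int64_t peakAt(std::size_t loc) const { return atPeak[loc]; }

private:
    std::vector<LocationInfo> infos;   // cold data
    std::vector<std::int64_t> live;    // hot: currently live bytes
    std::vector<std::int64_t> atPeak;  // hot: live bytes at total peak
    std::int64_t totalLive = 0;
    std::int64_t totalPeak = 0;
};
```

The snapshot loop then walks a contiguous array of 8-byte
counters instead of striding over full per-location records, so
fewer cache lines are written per peak update.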