nir/phi_builder: Use per-value hash table to store [block] -> def mapping
Replace the old array embedded in each value with a per-value hash
table for the [block] -> def mapping.
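The idea, in rough outline: each value carries its own small table
keyed on the block pointer. Presumably the peak-memory savings below
come from the table only needing entries for blocks that actually get a
def, instead of the old array's one slot per block. What follows is a
minimal sketch of that shape, not the actual nir_phi_builder code; the
pb_* names are made up for illustration and growth/rehash is elided.

    #include <stddef.h>
    #include <stdint.h>

    struct pb_block;                  /* stands in for nir_block */
    struct pb_def;                    /* stands in for nir_ssa_def */

    struct pb_entry {
       const struct pb_block *block;  /* key; NULL marks an empty slot */
       struct pb_def *def;            /* value */
    };

    /* One open-addressed table per value, linear probing. */
    struct pb_value {
       struct pb_entry *entries;
       uint32_t size;                 /* power of two */
    };

    static uint32_t
    pb_hash_block(const struct pb_block *block)
    {
       /* Cheap pointer hash: drop the alignment bits. */
       return (uint32_t)((uintptr_t)block >> 4);
    }

    /* Record the def for 'block', overwriting any previous entry.
     * Assumes the table still has a free slot; growth is not shown.
     */
    static void
    pb_value_set_def(struct pb_value *val, const struct pb_block *block,
                     struct pb_def *def)
    {
       const uint32_t mask = val->size - 1;
       for (uint32_t i = pb_hash_block(block) & mask; ;
            i = (i + 1) & mask) {
          if (val->entries[i].block == NULL ||
              val->entries[i].block == block) {
             val->entries[i].block = block;
             val->entries[i].def = def;
             return;
          }
       }
    }

    /* Look up the def recorded for 'block', or NULL if there is none. */
    static struct pb_def *
    pb_value_get_def(const struct pb_value *val,
                     const struct pb_block *block)
    {
       const uint32_t mask = val->size - 1;
       for (uint32_t i = pb_hash_block(block) & mask; ;
            i = (i + 1) & mask) {
          if (val->entries[i].block == block)
             return val->entries[i].def;
          if (val->entries[i].block == NULL)
             return NULL;
       }
    }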
Changes in peak memory usage according to Valgrind massif:
mean soft fp64 using uint64: 5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11: 63,619,971 => 63,619,971
deus ex mankind divided 148: 62,887,728 => 62,887,728
deus ex mankind divided 2890: 72,402,222 => 72,399,750
dirt showdown 676: 74,466,431 => 69,464,023
dolphin ubershaders 210: 109,630,376 => 78,359,728
Run-time change for a full run of shader-db on my Haswell desktop (with
-march=native) is 1.22245% +/- 0.463879% (n=11). This is about +2.9
seconds on a 237-second run. The first time I sent this version of the
patch out, the run-time data was quite different: I had misconfigured
the script that ran the test, so none of the tests from higher GLSL
versions were run. Those shaders are generally more complex, and they
are more affected by this change.
The previous version of this patch used a single hash table for the
whole phi builder. The mapping was from [value, block] -> def, so a
separate allocation was needed for each [value, block] tuple. There was
quite a bit of per-allocation overhead (due to ralloc), so that patch
was followed by one that added the use of the slab allocator. The
results of those two patches were not quite as good (a sketch of the
difference between the two keying schemes follows the table below):
mean soft fp64 using uint64: 5,499,875,082 => 1,343,991,403
gfxbench5 aztec ruins high 11: 63,619,971 => 63,619,971
deus ex mankind divided 148: 62,887,728 => 62,887,728
deus ex mankind divided 2890: 72,402,222 => 72,402,222 *
dirt showdown 676: 74,466,431 => 72,443,591 *
dolphin ubershaders 210: 109,630,376 => 81,034,320 *
The * marks the tests that do better with this version. For the tests
that are the same in both patches, the "after" peak memory usage
occurred at a different location. I did not check the local peaks.
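To make the allocation difference concrete, here is another hedged
sketch (hypothetical pb_* names again, not the actual code) of why the
single-table scheme pays for a key allocation on every insert while the
per-value scheme does not:

    #include <stdlib.h>

    struct pb_value;
    struct pb_block;

    /* Old scheme: one builder-wide table keyed on a [value, block]
     * tuple.  The key is a two-pointer struct, so recording a def means
     * allocating a key object first; with ralloc each of those
     * allocations also carries a header, which is the per-allocation
     * overhead the follow-up slab-allocator patch was meant to reduce.
     */
    struct pb_tuple_key {
       const struct pb_value *value;
       const struct pb_block *block;
    };

    static struct pb_tuple_key *
    pb_make_tuple_key(const struct pb_value *value,
                      const struct pb_block *block)
    {
       struct pb_tuple_key *key = malloc(sizeof(*key));
       if (key) {
          key->value = value;
          key->block = block;
       }
       return key;   /* one allocation per [value, block] pair */
    }

    /* New scheme: the table already lives inside the value and the key
     * is just the block pointer, so pb_value_set_def() above needs no
     * key object at all.
     */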
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>