i965/vec4: Use the sampler for pull constant loads on Broadwell.
author Kenneth Graunke <kenneth@whitecape.org>
Sat, 14 Jun 2014 19:58:03 +0000 (12:58 -0700)
committer Kenneth Graunke <kenneth@whitecape.org>
Sun, 15 Jun 2014 23:51:05 +0000 (16:51 -0700)
We've used the LD sampler message for pull constant loads on earlier
hardware for some time, and were already using it for the FS on
Broadwell.  This patch makes the Broadwell VS/GS use it as well.

I believe that when I wrote this code in 2012, we still used the data
port in some cases, and I somehow neglected to convert it while
rebasing.

Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
(n = 17).  Many other applications should benefit similarly: this speeds
up uniform array access in the VS, which is commonly used for skinning
shaders, among other things.
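
To illustrate the access pattern mentioned above, here is a minimal CPU-side
sketch of two-bone skinning.  It is not Mesa code: Vec4, Mat4, transform() and
skin_vertex() are hypothetical names chosen for this example.  The point is
only that the bone indices are per-vertex data, so each bones[index] read uses
an index that is not known at compile time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; none of these names come from Mesa.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;   // four rows, row-major

// Multiply a row-major 4x4 matrix by a column vector.
Vec4 transform(const Mat4 &m, const Vec4 &v)
{
   Vec4 out{};
   for (std::size_t row = 0; row < 4; ++row)
      for (std::size_t col = 0; col < 4; ++col)
         out[row] += m[row][col] * v[col];
   return out;
}

// Blend a position between two bones.  index0 and index1 arrive with each
// vertex, so the reads of bones[] below are variable-index array accesses,
// analogous to reading a uniform array in a skinning vertex shader.
Vec4 skin_vertex(const Mat4 *bones, std::size_t bone_count,
                 const Vec4 &position,
                 std::size_t index0, std::size_t index1, float weight0)
{
   assert(index0 < bone_count && index1 < bone_count);

   const Vec4 a = transform(bones[index0], position);
   const Vec4 b = transform(bones[index1], position);
   const float weight1 = 1.0f - weight0;

   Vec4 out{};
   for (std::size_t i = 0; i < 4; ++i)
      out[i] = weight0 * a[i] + weight1 * b[i];
   return out;
}
```

Calling skin_vertex() with an identity bone and a translation bone blends the
untouched and the translated position by the given weight.  Each vertex does
two such array reads, which is why the cost of uniform array access matters
for this kind of shader.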

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp

index 14070cd..82ea45a 100644
@@ -444,14 +444,14 @@ gen8_vec4_generator::generate_pull_constant_load(vec4_instruction *inst,
    gen8_instruction *send = next_inst(BRW_OPCODE_SEND);
    gen8_set_dst(brw, send, dst);
    gen8_set_src0(brw, send, offset);
-   gen8_set_dp_message(brw, send, GEN7_SFID_DATAPORT_DATA_CACHE,
-                       surf_index,
-                       GEN6_DATAPORT_READ_MESSAGE_OWORD_DUAL_BLOCK_READ,
-                       0,      /* message control */
-                       1,      /* mlen */
-                       1,      /* rlen */
-                       false,  /* no header */
-                       false); /* EOT */
+   gen8_set_sampler_message(brw, send,
+                            surf_index,
+                            0, /* The LD message ignores the sampler unit. */
+                            GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
+                            1,      /* rlen */
+                            1,      /* mlen */
+                            false,  /* no header */
+                            BRW_SAMPLER_SIMD_MODE_SIMD4X2);
 
    brw_mark_surface_used(&prog_data->base, surf_index);
 }