intel/fs: New shuffle_for_32bit_write and shuffle_from_32bit_read
authorJose Maria Casanova Crespo <jmcasanova@igalia.com>
Sat, 9 Jun 2018 09:45:22 +0000 (11:45 +0200)
committerJose Maria Casanova Crespo <jmcasanova@igalia.com>
Sat, 16 Jun 2018 20:39:08 +0000 (22:39 +0200)
These new shuffle functions deal with the shuffle/unshuffle operations
needed for read/write operations using 32-bit components when the
read/written components have a different bit-size (8, 16, 64-bits).
Shuffle from 32-bit to 32-bit becomes a simple MOV.

shuffle_src_to_dst takes care of doing a shuffle when source type is
smaller than destination type and an unshuffle when source type is
bigger than destination. So this new read/write functions just need
to call shuffle_src_to_dst assuming that writes use a 32-bit
destination and reads use a 32-bit source.

As shuffle_for_32bit_write/from_32bit_read components take components
in unit of source/destination types and shuffle_src_to_dst takes units
of the smallest type component, we adjust components and first_component
parameters.

To enable this new functions it is needed than there is no
source/destination overlap in the case of shuffle_from_32bit_read.
That never happens on shuffle_for_32bit_write as it allocates a new
destination register as it was at shuffle_64bit_data_for_32bit_write.

v2: Reword commit log and add comments to explain why first_component
    and components parameters are adjusted. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
src/intel/compiler/brw_fs.h
src/intel/compiler/brw_fs_nir.cpp

index faf5156..779170e 100644 (file)
@@ -519,6 +519,17 @@ void shuffle_16bit_data_for_32bit_write(const brw::fs_builder &bld,
                                         const fs_reg &src,
                                         uint32_t components);
 
+void shuffle_from_32bit_read(const brw::fs_builder &bld,
+                             const fs_reg &dst,
+                             const fs_reg &src,
+                             uint32_t first_component,
+                             uint32_t components);
+
+fs_reg shuffle_for_32bit_write(const brw::fs_builder &bld,
+                               const fs_reg &src,
+                               uint32_t first_component,
+                               uint32_t components);
+
 fs_reg setup_imm_df(const brw::fs_builder &bld,
                     double v);
 
index d91faf1..9a0de3a 100644 (file)
@@ -5463,6 +5463,49 @@ shuffle_src_to_dst(const fs_builder &bld,
    }
 }
 
+void
+shuffle_from_32bit_read(const fs_builder &bld,
+                        const fs_reg &dst,
+                        const fs_reg &src,
+                        uint32_t first_component,
+                        uint32_t components)
+{
+   assert(type_sz(src.type) == 4);
+
+   /* This function takes components in units of the destination type while
+    * shuffle_src_to_dst takes components in units of the smallest type
+    */
+   if (type_sz(dst.type) > 4) {
+      assert(type_sz(dst.type) == 8);
+      first_component *= 2;
+      components *= 2;
+   }
+
+   shuffle_src_to_dst(bld, dst, src, first_component, components);
+}
+
+fs_reg
+shuffle_for_32bit_write(const fs_builder &bld,
+                        const fs_reg &src,
+                        uint32_t first_component,
+                        uint32_t components)
+{
+   fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_D,
+                         DIV_ROUND_UP (components * type_sz(src.type), 4));
+   /* This function takes components in units of the source type while
+    * shuffle_src_to_dst takes components in units of the smallest type
+    */
+   if (type_sz(src.type) > 4) {
+      assert(type_sz(src.type) == 8);
+      first_component *= 2;
+      components *= 2;
+   }
+
+   shuffle_src_to_dst(bld, dst, src, first_component, components);
+
+   return dst;
+}
+
 fs_reg
 setup_imm_df(const fs_builder &bld, double v)
 {