i965/fs: add shuffle_64bit_data_for_32bit_write helper
authorIago Toral Quiroga <itoral@igalia.com>
Mon, 25 Jan 2016 12:37:50 +0000 (13:37 +0100)
committerSamuel Iglesias Gonsálvez <siglesias@igalia.com>
Mon, 16 May 2016 07:55:33 +0000 (09:55 +0200)
This does the inverse operation of shuffle_32bit_load_result_to_64bit_data
and we will use it when we need to write 64-bit data in the layout expected
by untyped write messages.

v2 (curro):
- Use subscript() instead of stride()
- Assert on the input types rather than silently retyping.
- Use offset() instead of horiz_offset(), drop the multiplier definition.
- Drop the temporary vgrf and force_writemask_all.
- Make component_i const.
- Move to brw_fs_nir.cpp

v3 (curro):
- Pass dst and src by reference.
- Simplify allocation of tmp register.
- Move to brw_fs_nir.cpp.
- Get rid of the temporary.

v3 (Iago):
- Check that the src and dst regions do not overlap, since that would
  typically be a bug in the caller.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
src/mesa/drivers/dri/i965/brw_fs.h
src/mesa/drivers/dri/i965/brw_fs_nir.cpp

index 286e718..f9e6792 100644 (file)
@@ -540,3 +540,8 @@ void shuffle_32bit_load_result_to_64bit_data(const brw::fs_builder &bld,
                                              const fs_reg &dst,
                                              const fs_reg &src,
                                              uint32_t components);
+
+void shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld,
+                                        const fs_reg &dst,
+                                        const fs_reg &src,
+                                        uint32_t components);
index 0d9fcda..143d5f3 100644 (file)
@@ -4128,3 +4128,35 @@ shuffle_32bit_load_result_to_64bit_data(const fs_builder &bld,
       bld.MOV(offset(dst, bld, i), tmp);
    }
 }
+
+/**
+ * This helper does the inverse operation of
+ * SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA.
+ *
+ * We need to do this when we are going to use untyped write messsages that
+ * operate with 32-bit components in order to arrange our 64-bit data to be
+ * in the expected layout.
+ *
+ * Notice that callers of this function, unlike in the case of the inverse
+ * operation, would typically need to call this with dst and src being
+ * different registers, since they would otherwise corrupt the original
+ * 64-bit data they are about to write. Because of this the function checks
+ * that the src and dst regions involved in the operation do not overlap.
+ */
+void
+shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
+                                   const fs_reg &dst,
+                                   const fs_reg &src,
+                                   uint32_t components)
+{
+   assert(type_sz(src.type) == 8);
+   assert(type_sz(dst.type) == 4);
+
+   assert(!src.in_range(dst, 2 * components * bld.dispatch_width() / 8));
+
+   for (unsigned i = 0; i < components; i++) {
+      const fs_reg component_i = offset(src, bld, i);
+      bld.MOV(offset(dst, bld, 2 * i), subscript(component_i, dst.type, 0));
+      bld.MOV(offset(dst, bld, 2 * i + 1), subscript(component_i, dst.type, 1));
+   }
+}