For MTL (verx10 == 125), float64 is supported, but int64 is not.
Therefore we need to lower cluster broadcast using 32-bit int ops.
For gfx12.5+ platforms that support int64, the register regions
used by cluster broadcast aren't supported by the 64-bit pipeline.
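
As a rough illustration of the rule described above, here is a minimal
standalone sketch. It is not the driver code: DeviceInfoSketch, ExecType
and broadcast_exec_type are hypothetical names invented for this example,
and the boolean fields merely stand in for the real device-info checks.

```cpp
#include <cassert>

// Hypothetical stand-in for the device-info fields consulted by the check.
struct DeviceInfoSketch {
    int  verx10;     // 125 on MTL, per the text above
    bool has_64bit;  // assumed flag: platform supports some 64-bit type
    bool is_chv;     // stands in for platform == INTEL_PLATFORM_CHV
    bool is_9lp;     // stands in for intel_device_info_is_9lp(devinfo)
};

// UD = lower to 32-bit unsigned ops; Native = keep the original type.
enum class ExecType { UD, Native };

// Sketch of the selection rule: values wider than 4 bytes are handled with
// 32-bit ops when the platform cannot (or should not) use 64-bit execution.
inline ExecType broadcast_exec_type(const DeviceInfoSketch &d,
                                    unsigned type_size_bytes)
{
    const bool avoid_64bit = !d.has_64bit || d.verx10 >= 125 ||
                             d.is_chv || d.is_9lp;
    return (avoid_64bit && type_size_bytes > 4) ? ExecType::UD
                                                : ExecType::Native;
}

// Small self-check exercising the cases discussed in the text.
inline void check_examples()
{
    const DeviceInfoSketch mtl   = {125, true, false, false};
    const DeviceInfoSketch older = {120, true, false, false};
    assert(broadcast_exec_type(mtl, 8) == ExecType::UD);        // double on MTL
    assert(broadcast_exec_type(mtl, 4) == ExecType::Native);    // 32-bit unchanged
    assert(broadcast_exec_type(older, 8) == ExecType::Native);  // pre-12.5 with 64-bit
}
```

The verx10 >= 125 term is the only new condition; the other terms mirror
checks that were already present.
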
On MTL, dEQP-VK.subgroups.clustered.*_double* and
dEQP-VK.subgroups.clustered.*_dvec* failed validation of the
compiled shader in debug mode, and reportedly hung the GPU in release
mode.
With this change, dEQP-VK.subgroups.clustered.*_double* passes all 48
tests and dEQP-VK.subgroups.clustered.*_dvec* passes all 140 tests on
MTL.
Rework:
* Move from generator to brw_fs_lower_regioning.cpp. (Suggested by
Francisco)
* Apply to verx10 >= 125. (Suggested by Francisco)
Cc: 23.1 <mesa-stable>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22569>
* integer DWord multiply, indirect addressing must not be
* used."
*
+ * For MTL (verx10 == 125), float64 is supported, but int64 is not.
+ * Therefore we need to lower cluster broadcast using 32-bit int ops.
+ *
+ * For gfx12.5+ platforms that support int64, the register regions
+ * used by cluster broadcast aren't supported by the 64-bit pipeline.
+ *
* Work around the above and handle platforms that don't
* support 64-bit types at all.
*/
- if ((!has_64bit || devinfo->platform == INTEL_PLATFORM_CHV ||
+ if ((!has_64bit || devinfo->verx10 >= 125 ||
+ devinfo->platform == INTEL_PLATFORM_CHV ||
intel_device_info_is_9lp(devinfo)) && type_sz(t) > 4)
return BRW_REGISTER_TYPE_UD;
else