From: Jason Ekstrand
Date: Wed, 27 Oct 2021 06:40:36 +0000 (-0500)
Subject: anv,iris: Advertise a max 3D workgroup size of 1024^3
X-Git-Tag: upstream/22.3.5~15893
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=419b02c90c822e6ea89c7fae951fd19d7db20181;p=platform%2Fupstream%2Fmesa.git

anv,iris: Advertise a max 3D workgroup size of 1024^3

On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on the
size of the HW packet. On older HW, we can technically go a bit bigger
but there's not much point. Technically, some hardware can support a
scalar workgroup size up to 2048 but most apps don't go any bigger than
1024.

As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024 based
on the OpenGL minimum limit, so it might not gain much by advertising
more than 1024.

Reworks:
 * Jordan: Use MIN2 and limit total invocations as well.
 * Jordan: Add second paragraph to commit message based on merge
   request discussion.

Reviewed-by: Tapani Pälli
Reviewed-by: Lionel Landwerlin
Reviewed-by: Jordan Justen
Signed-off-by: Jordan Justen
Part-of:
---

diff --git a/src/gallium/drivers/iris/iris_screen.c b/src/gallium/drivers/iris/iris_screen.c
index 44113d9..5e7e5dd 100644
--- a/src/gallium/drivers/iris/iris_screen.c
+++ b/src/gallium/drivers/iris/iris_screen.c
@@ -527,7 +527,8 @@ iris_get_compute_param(struct pipe_screen *pscreen,
    struct iris_screen *screen = (struct iris_screen *)pscreen;
    const struct intel_device_info *devinfo = &screen->devinfo;
 
-   const uint32_t max_invocations = 32 * devinfo->max_cs_workgroup_threads;
+   const uint32_t max_invocations =
+      MIN2(1024, 32 * devinfo->max_cs_workgroup_threads);
 
 #define RET(x) do { \
    if (ret) \
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 0c903de..af6e261 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1817,7 +1817,8 @@ void anv_GetPhysicalDeviceProperties(
       pdevice->has_bindless_images && pdevice->has_a64_buffer_access
       ? UINT32_MAX : MAX_BINDING_TABLE_SIZE - MAX_RTS - 1;
 
-   const uint32_t max_workgroup_size = 32 * devinfo->max_cs_workgroup_threads;
+   const uint32_t max_workgroup_size =
+      MIN2(1024, 32 * devinfo->max_cs_workgroup_threads);
 
    VkSampleCountFlags sample_counts =
       isl_device_get_sample_counts(&pdevice->isl_dev);
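
Reviewer note: the arithmetic behind both hunks can be checked in isolation. The sketch below is not Mesa code. MIN2 is redefined locally so the snippet stands alone, and struct fake_devinfo and max_workgroup_invocations() are hypothetical names introduced only for illustration. It assumes, as the commit message does, SIMD32 (32 invocations per hardware thread), and applies the same clamp to 1024 that the patch adds.

```c
#include <stdint.h>

/* Local stand-in for Mesa's MIN2 so this sketch compiles on its own. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for the single intel_device_info field the patch reads. */
struct fake_devinfo {
   uint32_t max_cs_workgroup_threads;
};

/* Mirrors the patched expression: 32 invocations per thread (SIMD32),
 * capped at 1024 total invocations per workgroup. */
uint32_t
max_workgroup_invocations(const struct fake_devinfo *devinfo)
{
   return MIN2(1024, 32 * devinfo->max_cs_workgroup_threads);
}
```

With an illustrative thread count of 16 the product is 512, which is below the cap and is returned unchanged. With 64 threads the product is 2048, which is clamped to 1024. Neither thread count is taken from a real hardware table.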