Do not ifdef __launch_bounds__ out for ROCm. (#15228)
author Johannes M Dieterich <johannes.dieterich@amd.com>
Fri, 14 Dec 2018 22:45:11 +0000 (14:45 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Fri, 14 Dec 2018 22:47:32 +0000 (14:47 -0800)
commit bd368b867dc09461bbc0710322b24ef10957af1a
tree c28273b32e634c8c445ee08404d84114d27212d0
parent dcd1685282510b38afca5e522237bf7d8f69561a

Summary:
The ROCm compiler understands __launch_bounds__ and benefits from it:
knowing the bound keeps it from using too many VGPRs, because it
otherwise assumes a default workgroup size of 256.
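
To illustrate why the bound matters, here is a rough arithmetic sketch
of the per-thread VGPR budget at different workgroup sizes. It is not
code from this change. The hardware constants (64-lane wavefronts, 4
SIMDs per compute unit, 256 VGPRs per SIMD lane) are assumptions about
GCN-class GPUs such as gfx906, and the helper names are hypothetical.

```cpp
// Illustrative arithmetic only. The constants below are assumptions about
// GCN-class hardware and are not taken from this commit.
constexpr int kWavefrontSize = 64;  // threads (lanes) per wavefront
constexpr int kSimdsPerCu = 4;      // SIMD units per compute unit
constexpr int kVgprsPerLane = 256;  // VGPRs one SIMD lane can hand out in total

// Integer division rounding up; expects a >= 0 and b > 0.
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Wavefronts that each SIMD must hold so that a whole workgroup of
// `workgroup_size` threads (>= 1) is resident on one compute unit.
constexpr int waves_per_simd(int workgroup_size) {
  return ceil_div(ceil_div(workgroup_size, kWavefrontSize), kSimdsPerCu);
}

// Largest per-thread VGPR count that still lets the workgroup launch.
// Register allocation granularity is ignored for simplicity.
constexpr int max_vgprs_per_thread(int workgroup_size) {
  return kVgprsPerLane / waves_per_simd(workgroup_size);
}

// The assumed default of 256 threads allows the full register budget ...
static_assert(max_vgprs_per_thread(256) == 256, "one wavefront per SIMD");
// ... while a 1024-thread workgroup can afford only a quarter of it.
static_assert(max_vgprs_per_thread(1024) == 64, "four wavefronts per SIMD");
```

Under these assumptions, a kernel compiled for the default of 256
threads may use up to four times as many VGPRs per thread as a
1024-thread launch can accommodate, so the compiler needs the real
bound to stay within budget.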

Fixes a problem encountered during bring-up of ROCm 2.0 on gfx906.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15228

Differential Revision: D13470950

Pulled By: bddppq

fbshipit-source-id: f9aa44c7c95299a099c0ea9317b9044cc056acc5
aten/src/ATen/cuda/CUDAApplyUtils.cuh
aten/src/ATen/native/cuda/Dropout.cu
aten/src/ATen/native/cuda/RNN.cu
aten/src/ATen/native/cuda/TensorTransformations.cu
aten/src/THC/THCReduce.cuh
aten/src/THCUNN/SpatialCrossMapLRN.cu