[BF16] Add a missing thread local specifier to autocast_gpu_dtype (#63416)
author Yusuo Hu <yusuo@fb.com>
Thu, 19 Aug 2021 19:37:58 +0000 (12:37 -0700)
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
Thu, 19 Aug 2021 19:39:27 +0000 (12:39 -0700)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63416

Add the thread_local specifier that autocast_gpu_dtype was missing. The omission was introduced by a recent PR:

https://github.com/pytorch/pytorch/pull/61002

Test Plan: Unit Tests

Reviewed By: ngimel

Differential Revision: D30376154

fbshipit-source-id: c70d37ec85c3eba88eb87f766f1c4e7aeff8eaf9
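
To illustrate why the specifier matters, here is a minimal standalone sketch, not the ATen code itself. The ScalarType enum, the variable names, and the helper functions are hypothetical stand-ins; only the contrast between a plain global and a thread_local global is taken from the diff. A write made on a worker thread leaks into the calling thread through the plain global, while the thread_local copy on the calling thread keeps its initializer.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for at::ScalarType; not the real ATen enum.
enum class ScalarType { Half, BFloat16 };

namespace {
// Plain global: one object shared by every thread (the state before the fix).
ScalarType shared_dtype = ScalarType::Half;
// thread_local global: each thread gets its own copy (the state after the fix).
thread_local ScalarType per_thread_dtype = ScalarType::Half;
}  // namespace

struct Observed {
  ScalarType shared;
  ScalarType per_thread;
};

// Writes both variables on a worker thread, then reports what the caller sees.
Observed set_on_worker_and_read_on_caller() {
  std::thread worker([] {
    shared_dtype = ScalarType::BFloat16;
    per_thread_dtype = ScalarType::BFloat16;  // touches the worker's copy only
  });
  worker.join();  // join() synchronizes, so the read of shared_dtype is race-free
  return {shared_dtype, per_thread_dtype};
}

// Asserts the expected isolation: only the shared variable changed for the caller.
void check_isolation() {
  const Observed seen = set_on_worker_and_read_on_caller();
  assert(seen.shared == ScalarType::BFloat16);
  assert(seen.per_thread == ScalarType::Half);
}
```

Calling check_isolation() from a main function (built with thread support, for example -pthread on GCC or Clang) exercises both cases. By analogy, without the specifier an autocast dtype set on one thread could be observed by every other thread, unlike autocast_cpu_dtype, which was already declared thread_local.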

aten/src/ATen/autocast_mode.cpp

index 97ec9ec69dbebadad3f17514fc4fde55097d4593..1ac5ad1c88ba67b9b39a8de9b288c96d82952b26 100644
@@ -59,7 +59,7 @@ thread_local int nesting = 0;
 thread_local at::ScalarType autocast_cpu_dtype = at::kBFloat16;
 
 // autocast_gpu_dtype is the lower_precision_fp used by AutocastGPU.
-at::ScalarType autocast_gpu_dtype = at::kHalf;
+thread_local at::ScalarType autocast_gpu_dtype = at::kHalf;
 }
 
 void clear_cache() {