review.tizen.org Git - platform/upstream/pytorch.git/commit

Fix cuda multiprocessing cached memory (#14736)

Summary:
This PR fixes #11422

In the old world of CUDA IPC, when we wanted to share a tensor T from process A to process B, we had to share the whole CUDA memory allocation in which T's storage sits, and we cast that allocation to the same storage type as T's.

This causes a problem when storages of two different types are allocated in the same CUDA memory block: when we try to reconstruct the second tensor, it complains about the wrong storage type.

In this PR we reconstruct only the storage (not the entire memory block). However, because CUDA allows a given memHandle to be opened only once per process, we have to save the device pointer in a global cache so that we can reconstruct tensors as they arrive.
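
The caching idea above can be sketched in plain Python. This is only an illustrative model under stated assumptions, not the code in this commit: `FakeDriver`, `HandleCache`, `SharedStorage`, and `rebuild_storage` are hypothetical names, and `FakeDriver.open_mem_handle` merely stands in for opening a CUDA IPC memory handle, enforcing the once-per-process restriction described in the message.

```python
from dataclasses import dataclass
from typing import Dict, Set


class FakeDriver:
    """Hypothetical stand-in for the CUDA IPC layer (not a real API)."""

    def __init__(self, blocks: Dict[bytes, int]) -> None:
        self._blocks = blocks          # handle -> base device pointer
        self._opened: Set[bytes] = set()

    def open_mem_handle(self, handle: bytes) -> int:
        # Model the restriction from the commit message: a handle may be
        # opened only once per process.
        if handle in self._opened:
            raise RuntimeError("mem handle already opened in this process")
        self._opened.add(handle)
        return self._blocks[handle]


@dataclass(frozen=True)
class SharedStorage:
    """A view of part of a shared block, with its own type."""
    data_ptr: int
    nbytes: int
    dtype: str


class HandleCache:
    """Process-wide cache of base device pointers, keyed by handle."""

    def __init__(self, driver: FakeDriver) -> None:
        self._driver = driver
        self._base_ptrs: Dict[bytes, int] = {}

    def rebuild_storage(self, handle: bytes, offset_bytes: int,
                        nbytes: int, dtype: str) -> SharedStorage:
        base = self._base_ptrs.get(handle)
        if base is None:
            # First tensor from this block: open the handle and remember it.
            base = self._driver.open_mem_handle(handle)
            self._base_ptrs[handle] = base
        # Later tensors reuse the cached pointer; only the storage's own
        # byte range is reconstructed, never the whole block.
        return SharedStorage(base + offset_bytes, nbytes, dtype)


driver = FakeDriver({b"block-0": 0x1000})
cache = HandleCache(driver)
floats = cache.rebuild_storage(b"block-0", 0, 16, "float32")
bytes_ = cache.rebuild_storage(b"block-0", 16, 8, "uint8")
```

In this model, two storages of different types that live in the same block are rebuilt independently, and the second call succeeds because the handle is not opened again.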

Thanks a ton to ezyang who helped design the solution and debugged the issue!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14736

Differential Revision: D13335899

Pulled By: ailzhang

fbshipit-source-id: cad69db392ed6f8fdc2b93a9dc2899f6d378c371

author	Ailing Zhang <ailzhang@fb.com>
	Wed, 5 Dec 2018 18:52:39 +0000 (10:52 -0800)
committer	Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
	Wed, 5 Dec 2018 18:55:43 +0000 (10:55 -0800)
commit	be47470c91e467d8bddbc0e07f099cf2a200e5c0
tree	ee47275724b435760d27139db5356169746df89e
parent	3ae721d3501c56cdf6819fe40a54dc6b77900fdd

aten/src/THC/THCAllocator.cpp
aten/src/THC/THCAllocator.h
aten/src/THC/THCCachingAllocator.cpp
aten/src/THC/THCCachingAllocator.h
test/test_multiprocessing.py
torch/csrc/generic/StorageSharing.cpp
torch/multiprocessing/reductions.py