caffe2: fix PinnedCPUAllocator cudaHostRegister() leak (#16340)
author Brian W. Hart <hartb@us.ibm.com>
Fri, 15 Feb 2019 14:51:23 +0000 (06:51 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Fri, 15 Feb 2019 15:02:33 +0000 (07:02 -0800)
commit fbd690c1fec0651ee9e6cc07ddcb12217ffb31bc
tree e31ef09084f7f1753436e4d20c157b2a94085678
parent 07b5782ff71692c2a0ef34b2d3d53185133a4d71

Summary:
In the NUMA case, PinnedCPUAllocator's allocate() returned a DataPtr
constructed by DefaultCPUAllocator. That DataPtr referenced the
Default allocator's Delete() rather than the Pinned allocator's
Delete(), so the Pinned Delete() never ran and cudaHostUnregister()
was never called when regions were freed.

See: https://github.com/pytorch/pytorch/issues/16280

This change adds a 'naked_allocate()' method to the Default allocator
that returns a bare pointer to the allocated memory instead of
wrapping it in a DataPtr. The Pinned allocator calls that method and
then constructs a DataPtr that references its own Delete().
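
The deleter mismatch and the fix can be sketched in plain C++. This is a
simplified illustration, not the caffe2 code: DataPtr is approximated by
std::unique_ptr with a function-pointer deleter, the two counters are
hypothetical stand-ins for cudaHostRegister() and cudaHostUnregister(), and
the names allocate_leaky, allocate_fixed and leaked_registrations are invented
for this example. Only naked_allocate() and Delete() mirror names from the
description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical counters standing in for cudaHostRegister()/cudaHostUnregister().
static int g_registered = 0;
static int g_unregistered = 0;

// Simplified stand-in for DataPtr: a pointer bundled with the deleter to run on free.
using DataPtr = std::unique_ptr<void, void (*)(void*)>;

struct DefaultAllocator {
  static void Delete(void* p) { std::free(p); }

  // Returns the raw pointer only, with no deleter attached.
  static void* naked_allocate(std::size_t n) { return std::malloc(n); }

  static DataPtr allocate(std::size_t n) {
    return DataPtr(naked_allocate(n), &Delete);
  }
};

struct PinnedAllocator {
  static void Delete(void* p) {
    ++g_unregistered;  // where cudaHostUnregister() would be called
    std::free(p);
  }

  // Pre-fix shape: the region is registered, but the returned DataPtr
  // still carries DefaultAllocator::Delete, so the pinned Delete never runs.
  static DataPtr allocate_leaky(std::size_t n) {
    DataPtr d = DefaultAllocator::allocate(n);
    ++g_registered;  // where cudaHostRegister() would be called
    return d;
  }

  // Post-fix shape: take the bare pointer and attach the pinned Delete.
  static DataPtr allocate_fixed(std::size_t n) {
    void* p = DefaultAllocator::naked_allocate(n);
    ++g_registered;  // where cudaHostRegister() would be called
    return DataPtr(p, &Delete);
  }
};

// Runs one allocate/free cycle and returns registrations left outstanding.
int leaked_registrations(DataPtr (*alloc)(std::size_t)) {
  g_registered = 0;
  g_unregistered = 0;
  {
    DataPtr d = alloc(64);
  }  // d goes out of scope here and its deleter runs
  return g_registered - g_unregistered;
}
```

Calling leaked_registrations(&PinnedAllocator::allocate_leaky) leaves one
registration outstanding, while the allocate_fixed variant leaves none, because
the deleter stored in the returned pointer decides which Delete() runs.
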
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16340

Reviewed By: dzhulgakov

Differential Revision: D13843206

Pulled By: ezyang

fbshipit-source-id: 9efb572e5a01b49ef2a4aceeccc13cd0b1066528
caffe2/core/context_gpu.cu