[SE] RegisteredHostMemory for async device copies
author Jason Henline <jhen@google.com>
Mon, 12 Sep 2016 16:09:41 +0000 (16:09 +0000)
committer Jason Henline <jhen@google.com>
Mon, 12 Sep 2016 16:09:41 +0000 (16:09 +0000)
commit 57ea481945ffff7515b9bf3fe206f6c53ee8fd4a
tree 13dbb437d9b7e90523dedea8ee66c4d50c0aaa09
parent b678219aa6263c88064cb1035d8f03b782c5885c

Summary:
Improve the error-prone interface that lets users pass unregistered
host pointers to the asynchronous copy methods. In CUDA this mistake is
extremely easy to make, and instead of failing at runtime the call
succeeds and gives the right answer by turning the async copy into a
sync copy. Misusing the old interface therefore causes a large
performance degradation with no visible error. The new
RegisteredHostMemory interface should prevent that.
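
The idea in the summary can be illustrated with a minimal sketch, which
is not the code added by this commit. It assumes a hypothetical
FakePlatform that only counts registrations (for CUDA the real calls
would be along the lines of cudaHostRegister and cudaHostUnregister), a
hypothetical RAII handle RegisteredHostSpan, and a hypothetical
enqueueCopyToDevice that copies into a std::vector standing in for
device memory. None of these names come from StreamExecutor. The point
is only that the async copy takes the handle type rather than a raw
pointer, so passing unregistered memory becomes a compile error instead
of a silent slowdown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a platform's host-memory registration calls.
// It only counts live registrations so the sketch can run without a GPU.
struct FakePlatform {
  int ActiveRegistrations = 0;
  void registerRange(void *, std::size_t) { ++ActiveRegistrations; }
  void unregisterRange(void *) { --ActiveRegistrations; }
};

// Hypothetical RAII handle: holding one is proof that the host range was
// registered, and the range is unregistered when the handle is destroyed.
template <typename T> class RegisteredHostSpan {
public:
  RegisteredHostSpan(FakePlatform &P, T *Data, std::size_t Count)
      : Platform(&P), Data(Data), Count(Count) {
    Platform->registerRange(Data, Count * sizeof(T));
  }
  ~RegisteredHostSpan() {
    if (Platform)
      Platform->unregisterRange(Data);
  }

  // Non-copyable so a range is never unregistered twice.
  RegisteredHostSpan(const RegisteredHostSpan &) = delete;
  RegisteredHostSpan &operator=(const RegisteredHostSpan &) = delete;

  // Movable: the moved-from handle gives up ownership of the registration.
  RegisteredHostSpan(RegisteredHostSpan &&Other) noexcept
      : Platform(Other.Platform), Data(Other.Data), Count(Other.Count) {
    Other.Platform = nullptr;
  }

  T *data() const { return Data; }
  std::size_t size() const { return Count; }

private:
  FakePlatform *Platform;
  T *Data;
  std::size_t Count;
};

// Hypothetical async-copy entry point. It accepts only the registered
// handle, so calling it with a raw T* does not compile. The "device" here
// is a std::vector, and the copy itself is an ordinary synchronous copy.
template <typename T>
bool enqueueCopyToDevice(const RegisteredHostSpan<T> &Src,
                         std::vector<T> &FakeDevice) {
  FakeDevice.assign(Src.data(), Src.data() + Src.size());
  return true;
}
```

A caller would construct a RegisteredHostSpan over its host buffer and
pass that handle to enqueueCopyToDevice. Passing the buffer's raw
pointer is rejected at compile time, which is the property the summary
is after.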

Reviewers: jlebar

Subscribers: jprice, beanz, parallel_libs-commits

Differential Revision: https://reviews.llvm.org/D24353

llvm-svn: 281225
parallel-libs/streamexecutor/examples/CUDASaxpy.cpp
parallel-libs/streamexecutor/include/streamexecutor/Device.h
parallel-libs/streamexecutor/include/streamexecutor/DeviceMemory.h
parallel-libs/streamexecutor/include/streamexecutor/HostMemory.h [new file with mode: 0644]
parallel-libs/streamexecutor/include/streamexecutor/PlatformDevice.h
parallel-libs/streamexecutor/include/streamexecutor/Stream.h
parallel-libs/streamexecutor/lib/CMakeLists.txt
parallel-libs/streamexecutor/lib/HostMemory.cpp [new file with mode: 0644]
parallel-libs/streamexecutor/unittests/CoreTests/DeviceTest.cpp
parallel-libs/streamexecutor/unittests/CoreTests/SimpleHostPlatformDevice.h
parallel-libs/streamexecutor/unittests/CoreTests/StreamTest.cpp