Before this change, a Tensor contained a device pointer, and a TensorInfoManager data structure maintained a mapping from device pointer to XlaTensorInfo object. This TensorInfoManager also needed to be an Allocator so that it could be informed when a Tensor was released.
After this change, a Tensor on an XlaDevice contains an XlaTensor object. The XlaTensor object is the equivalent of the old XlaTensorInfo object.
This has advantages and drawbacks:
+ We don't need yet another allocator wrapper, as there is no side-band data to manage.
+ No hashtable lookups are required.
- As XlaLocalLaunchOp can run either on an XlaDevice or on a classic TF device, we need some way to distinguish whether a Tensor holds a plain device pointer (classic TF) or an XlaTensor; we use a tagged pointer for this.
As part of this, allocate ShapedBuffers using the XLA backend's allocator directly, instead of a roundabout route where we:
1. Wrapped the XLA allocator in an XlaDeviceAllocator
2. Then wrapped the XlaDeviceAllocator in an XlaAllocator
This leaves less to go wrong. Ideally we'd use StreamExecutor's allocator here, but it is less useful than XLA's because it doesn't provide helpful OOM messages (it just returns nullptr).
PiperOrigin-RevId: 191048184