Improve caffe2 operator wrapping (#17743)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17743
- caffe2::Operator::SetOutputTensor() can now be used in operators that are called from c10/PyTorch.
- If the operator uses SetOutputTensor() instead of XOutput(), the wrapper doesn't preallocate an empty tensor for the operator anymore. Only outputs accessed in XOutput() will get an output tensor preallocated.
- Remove the copying of the vector with output tensors into a vector with pointer to output tensors.
- Preallocated outputs are now passed in as one TensorList argument on the stack. This TensorList argument has a well-defined name so other wrappers (i.e. the wrapper calling from c2 into c10) can recognize and use it).
- Macros for exporting caffe2 operators to c10 are simplified. Instead of having `c10_op_handle_for_c2_op`, we now pass in the operator handle as a template argument.
- `SetOutputTensor` and `OutputTensorOrUndefined` now work with operators exported to c10
Reviewed By: ezyang
Differential Revision:
D14362434
fbshipit-source-id:
44a5e717204f21ea8e9728437429d9b84906f9f5