Minor eager performance improvements
- remove linear regression dependence on global step.
This speeds things up a lot for the benchmark (since it removes a bunch of
unnecessary code), but is obviously not a fair comparison.
I think it's worth doing, since I don't see any reason to have a global step
in eager.
- nn_ops dropout had an unnecessary convert_to_tensor followed by a conversion
back to numpy (with a GPU this would copy out, then copy back).
- cudnn_recurrent reshape would always fall back to the slow path, so I just
converted it to use the fast path. This will be low impact.
- tensor_shape should not generate a new object every time
- remove unnecessary list creation and searching in some dtypes functions
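
The last two items follow a common pattern, sketched below for illustration
only. This is not the TensorFlow code: Shape, scalar_slow/scalar_fast and
is_floating_slow/is_floating_fast are hypothetical names, and the sketch
assumes the shared object is treated as immutable by callers.

```python
# Hypothetical illustration only; not the TensorFlow implementation.

class Shape:
    """Minimal stand-in for a shape object; dims are stored as a tuple."""

    def __init__(self, dims):
        self.dims = tuple(dims)


# Before: a fresh object is allocated on every call.
def scalar_slow():
    return Shape([])


# After: one shared instance, built once at import time.
# Safe only if callers never mutate the returned object.
_SCALAR = Shape([])


def scalar_fast():
    return _SCALAR


_FLOATING_NAMES = ("float16", "float32", "float64")


# Before: copies the names into a new list and scans it on every call.
def is_floating_slow(name):
    candidates = list(_FLOATING_NAMES)
    return name in candidates


# After: a module-level frozenset, built once, with hash-based lookup.
_FLOATING_SET = frozenset(_FLOATING_NAMES)


def is_floating_fast(name):
    return name in _FLOATING_SET
```

Both variants return the same answers; the fast ones just skip the per-call
allocation and the linear scan.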
PiperOrigin-RevId: 198127757