authorBixia Zheng <bixia@google.com>
Tue, 13 Feb 2018 00:56:28 +0000 (16:56 -0800)
committerTensorFlower Gardener <gardener@tensorflow.org>
Tue, 13 Feb 2018 00:59:58 +0000 (16:59 -0800)
commit929e3ee91ecf7f9685b50fa1681f39d9b25e568b
treef504d222b6255c9d48d67341282d27374ee26c5f
parent96564330fb0508a50a0515be11c9202c64b0f5b7
[XLA:GPU] Extend the CustomCall for cudnn convolutions to represent
tensor_ops_enabled.

The convolution algorithms returned from the stream executor have a flag
indicating whether tensor_ops is enabled. This flag is used when running each
algorithm during auto-tuning. However, the flag is not currently represented
in the CustomCall that records the auto-tune result. As a result, the algorithm
may run differently after auto-tuning.

This change adds a constant to the CustomCall for the cudnn convolution
algorithm selected by auto-tuning, recording whether tensor_ops was enabled
during auto-tuning. The convolution thunk uses this information to ensure that
the algorithm runs with the same flag after auto-tuning.

PiperOrigin-RevId: 185458497
tensorflow/compiler/xla/service/gpu/convolution_thunk.cc
tensorflow/compiler/xla/service/gpu/convolution_thunk.h
tensorflow/compiler/xla/service/gpu/cudnn_convolution_algorithm_picker.cc
tensorflow/compiler/xla/service/gpu/cudnn_convolution_algorithm_picker.h
tensorflow/compiler/xla/service/gpu/cudnn_convolution_runner.cc
tensorflow/compiler/xla/service/gpu/gpu_copy_insertion.cc
tensorflow/compiler/xla/service/gpu/ir_emission_utils.h
tensorflow/compiler/xla/service/gpu/ir_emitter_unnested.cc