Add ScopedAllocatorOptimizer in support of CollectiveReduce.
author A. Unique TensorFlower <gardener@tensorflow.org>
Fri, 25 May 2018 19:54:49 +0000 (12:54 -0700)
committer TensorFlower Gardener <gardener@tensorflow.org>
Fri, 25 May 2018 19:57:18 +0000 (12:57 -0700)
commit 0b522fd22b986704d1056254961cc7988ae182eb
tree 472c18f77c5e6b2c1dae0f1aacd6234f5e53436b
parent ae0eb1b7f81f6d98e0503b9568c72feaa805e655

The efficiency of CollectiveReduce is greatly improved by merging
multiple parallel reductions over smaller tensors into a single
reduction over a larger tensor that is the concatenation of the
smaller tensors.  Because CollectiveReduce is essentially an
element-wise array operation that operates on a 1-D reshape of
the input tensor, it is eligible for a ScopedAllocation optimization.
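
The equivalence that makes the merge safe can be sketched in a few lines
of Python.  This is only an illustration, not the TensorFlow
implementation: all_reduce_sum is a hypothetical stand-in for a sum
reduction across workers, and the tensor values are made up.

```python
def all_reduce_sum(per_worker):
    """Toy stand-in for a collective: element-wise sum of one flat list per worker."""
    return [sum(vals) for vals in zip(*per_worker)]

# Two small tensors, already flattened to 1-D, on each of two workers.
grads_a = [[1.0, 2.0], [10.0, 20.0]]              # worker 0, worker 1
grads_b = [[3.0, 4.0, 5.0], [30.0, 40.0, 50.0]]   # worker 0, worker 1

# Separate reductions: one collective per tensor.
separate = (all_reduce_sum(grads_a), all_reduce_sum(grads_b))

# Merged reduction: concatenate on each worker, reduce once, slice the result.
merged_in = [a + b for a, b in zip(grads_a, grads_b)]
merged_out = all_reduce_sum(merged_in)
split = len(grads_a[0])
merged = (merged_out[:split], merged_out[split:])

# Element-wise reduction commutes with concatenation.
assert merged == separate
```

Because the reduction is element-wise, where each element sits in the
flat buffer does not affect its result; only the number of collective
launches changes.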

The optimization works by looking for serially independent instances
of CollectiveReduce that lie within the same name-scope tier and
have the same control-flow (e.g., loop) embedding structure.  Where
two or more such nodes are found, the upstream nodes that generate
their inputs are modified to write their outputs into consecutive
regions of a single tensor buffer maintained by a ScopedAllocator.
The multiple CollectiveReduce nodes are then replaced by a single
CollectiveReduce that operates in-place on the backing buffer.
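
The grouping and buffer-layout step described above can be sketched
roughly as follows.  This is a simplified Python model under stated
assumptions, not the optimizer's actual code: the node records, the
scope_of and plan_buffers helpers, and the node names are all
hypothetical, the serial-independence check is omitted, and any
alignment padding a real allocator might need is ignored.

```python
from collections import defaultdict

# Hypothetical node records: (node name, control-flow frame, element count).
nodes = [
    ("tower/layer1/reduce_w", "", 6),
    ("tower/layer1/reduce_b", "", 2),
    ("tower/layer2/reduce_w", "", 4),
    ("tower/layer1/reduce_in_loop", "while_1", 3),
]

def scope_of(name):
    """Name scope is everything before the last '/' (empty for top-level nodes)."""
    return name.rsplit("/", 1)[0] if "/" in name else ""

def plan_buffers(nodes):
    """Group nodes by (name scope, frame) and lay out consecutive buffer slices."""
    groups = defaultdict(list)
    for name, frame, size in nodes:
        groups[(scope_of(name), frame)].append((name, size))

    plans = {}
    for key, members in groups.items():
        if len(members) < 2:
            continue  # nothing to merge for a lone node
        offset = 0
        slices = {}
        for name, size in members:
            slices[name] = (offset, size)  # (start offset, length) in the buffer
            offset += size
        plans[key] = {"total": offset, "slices": slices}
    return plans

plans = plan_buffers(nodes)
```

In this toy graph only the two nodes under tower/layer1 outside any loop
share both scope and frame, so they are the only ones assigned slices of
a common 8-element backing buffer.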

The effectiveness of the optimization depends on there being candidate
CollectiveReduce nodes with these characteristics that become eligible
for execution at close to the same time.  If the name scope is too
large and includes nodes that become eligible for execution at very
different times, this graph rewrite could result in a slowdown.
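
The timing trade-off can be illustrated with a toy cost model.  Every
number and function below is an assumption made up for illustration
(a fixed per-launch overhead plus a per-element cost, with separate
reductions sharing one link); none of it is a measurement of
CollectiveReduce.

```python
def run_separate(ready, sizes, overhead, per_elem):
    """Finish times when each reduction runs on its own, in order of readiness."""
    t = 0
    finish = []
    for r, s in sorted(zip(ready, sizes)):
        # Start when the input is ready and the shared link is free.
        t = max(t, r) + overhead + per_elem * s
        finish.append(t)
    return finish

def run_merged(ready, sizes, overhead, per_elem):
    """Finish time of one merged reduction, which must wait for the last input."""
    return max(ready) + overhead + per_elem * sum(sizes)

sizes, overhead, per_elem = [4, 4], 5, 1

# Inputs ready at nearly the same time: merging saves one launch overhead.
close_sep = run_separate([10, 11], sizes, overhead, per_elem)
close_mrg = run_merged([10, 11], sizes, overhead, per_elem)

# Inputs ready far apart: the early tensor is held back by the late one.
far_sep = run_separate([10, 100], sizes, overhead, per_elem)
far_mrg = run_merged([10, 100], sizes, overhead, per_elem)

assert close_mrg < max(close_sep)
assert far_mrg > max(far_sep)
```

In this model the merged reduction wins when the inputs arrive together,
but when they arrive far apart every result is delayed until the last
input is ready, which is the slowdown case described above.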

Note that this optimization is experimental: it is not guaranteed to
work, especially for ops other than CollectiveReduce.

PiperOrigin-RevId: 198089642
12 files changed:
tensorflow/core/common_runtime/scoped_allocator_mgr.cc
tensorflow/core/common_runtime/scoped_allocator_mgr.h
tensorflow/core/grappler/op_types.cc
tensorflow/core/grappler/op_types.h
tensorflow/core/grappler/optimizers/BUILD
tensorflow/core/grappler/optimizers/meta_optimizer.cc
tensorflow/core/grappler/optimizers/scoped_allocator_optimizer.cc [new file with mode: 0644]
tensorflow/core/grappler/optimizers/scoped_allocator_optimizer.h [new file with mode: 0644]
tensorflow/core/grappler/optimizers/scoped_allocator_optimizer_test.cc [new file with mode: 0644]
tensorflow/core/kernels/scoped_allocator_ops_test.cc
tensorflow/core/ops/scoped_allocator_ops.cc
tensorflow/core/protobuf/rewriter_config.proto