author Derek Murray <mrry@google.com>
Tue, 20 Feb 2018 22:11:35 +0000 (14:11 -0800)
committer TensorFlower Gardener <gardener@tensorflow.org>
Tue, 20 Feb 2018 22:18:09 +0000 (14:18 -0800)
commit be862d5b91e9b9044f4e028dcdae0b6ad283e8b4
tree 9f1a2c5d2569fdfbd68601701b3985245e2e8dc1
parent 249065a49ed007bf631de453506bfbf22accbb39
[tf.data] Fix memory leak when not all elements of a `Dataset.from_generator()` are consumed.

This change introduces a new C++ Dataset implementation
(`GeneratorDataset`) that takes three functions:

1. An initialization function that is called before the first use.
2. A "get next" function that is called to produce the elements, until a call
   raises the OutOfRange error.
3. A finalization function that is called before the iterator is destroyed.
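
The three-function contract listed above can be modelled in plain Python. This is only an illustrative sketch, not the C++ kernel: the class name `ThreeFunctionIterator` and the helpers `init_fn`, `next_fn`, and `finalize_fn` are hypothetical, `StopIteration` stands in for the OutOfRange error, and calling the finalizer only when initialization has run is an assumption of the sketch.

```python
class ThreeFunctionIterator:
    """Hypothetical pure-Python model of the init / get-next / finalize contract."""

    def __init__(self, init_fn, next_fn, finalize_fn):
        self._init_fn = init_fn
        self._next_fn = next_fn
        self._finalize_fn = finalize_fn
        self._state = None
        self._initialized = False
        self._finalized = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._finalized:
            raise StopIteration
        if not self._initialized:
            # Initialization is deferred until the first element is requested.
            self._state = self._init_fn()
            self._initialized = True
        # next_fn signals exhaustion by raising StopIteration
        # (standing in for OutOfRange).
        return self._next_fn(self._state)

    def close(self):
        # Assumption of this sketch: finalize runs at most once, and only
        # if initialization actually happened.
        if self._initialized and not self._finalized:
            self._finalize_fn(self._state)
        self._finalized = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False


events = []


def init_fn():
    events.append("init")
    return iter(range(5))


def next_fn(state):
    return next(state)


def finalize_fn(state):
    events.append("finalize")


# Consume only two of the five elements, then dispose of the iterator.
with ThreeFunctionIterator(init_fn, next_fn, finalize_fn) as it:
    first_two = [next(it), next(it)]
```

Here the finalizer runs on exit from the `with` block even though three elements were never requested, which is the property the finalization function is meant to provide.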

Previously, the generator state would only be cleaned up if the caller
consumed *every* element of the generator. In the new version, the
finalization function ensures that the Python-side state of the
generator is released regardless of how the iterator is disposed.
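
The difference between the two behaviours can be sketched with a toy registry of live generators. The `GeneratorRegistry` class and its methods are hypothetical stand-ins for the Python-side state, written for illustration only and not taken from `dataset_ops.py`.

```python
class GeneratorRegistry:
    """Hypothetical stand-in for the Python-side table of live generators."""

    def __init__(self):
        self._iterators = {}
        self._next_id = 0

    def start(self, generator_fn):
        # Register a fresh Python iterator and hand back its id.
        iterator_id = self._next_id
        self._next_id += 1
        self._iterators[iterator_id] = iter(generator_fn())
        return iterator_id

    def get_next(self, iterator_id):
        try:
            return next(self._iterators[iterator_id])
        except StopIteration:
            # Models the old cleanup path: state is dropped only once the
            # generator has been read to the end.
            del self._iterators[iterator_id]
            raise

    def finalize(self, iterator_id):
        # Models the new cleanup path: drop the state unconditionally.
        self._iterators.pop(iterator_id, None)

    def live_count(self):
        return len(self._iterators)


def count_to_three():
    yield from (1, 2, 3)


registry = GeneratorRegistry()
iterator_id = registry.start(count_to_three)

# Read one element and stop early: the exhaustion-only cleanup never fires.
registry.get_next(iterator_id)
live_after_partial_read = registry.live_count()

# An explicit finalization step releases the entry regardless.
registry.finalize(iterator_id)
live_after_finalize = registry.live_count()
```

In this model the entry stays registered after the partial read and is gone after `finalize`, mirroring the leak and its fix as described in the message.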

Fixes #16163.

PiperOrigin-RevId: 186360401
tensorflow/core/api_def/base_api/api_def_GeneratorDataset.pbtxt [new file with mode: 0644]
tensorflow/core/kernels/data/BUILD
tensorflow/core/kernels/data/captured_function.cc
tensorflow/core/kernels/data/captured_function.h
tensorflow/core/kernels/data/generator_dataset_op.cc [new file with mode: 0644]
tensorflow/core/ops/dataset_ops.cc
tensorflow/python/data/kernel_tests/dataset_from_generator_op_test.py
tensorflow/python/data/ops/dataset_ops.py