Follow up on earlier change, which tried to avoid reading the input file twice for...
author A. Unique TensorFlower <gardener@tensorflow.org>
Fri, 6 Apr 2018 22:51:49 +0000 (15:51 -0700)
committer TensorFlower Gardener <gardener@tensorflow.org>
Fri, 6 Apr 2018 22:54:27 +0000 (15:54 -0700)
It turns out that every file ends at some point, so an OutOfRange status is encountered on every successful read. The old code would then compare next_id_ to total_size() to decide whether to return an error, but calling total_size() is exactly what the earlier change tried to prevent. Instead, compare against vocab_size_ if it was initialized; otherwise, don't return an error.

PiperOrigin-RevId: 191952441

tensorflow/core/kernels/lookup_util.cc

index 27031d9..77386a1 100644
@@ -101,9 +101,10 @@ class TextFileLineIterator
     string line;
     status_ = input_buffer_->ReadLine(&line);
     if (!status_.ok()) {
-      if (errors::IsOutOfRange(status_) && next_id_ != total_size()) {
+      if (errors::IsOutOfRange(status_) && vocab_size_ != -1 &&
+          next_id_ != vocab_size_) {
         status_ = errors::InvalidArgument("Invalid vocab_size in ", filename_,
-                                          ": expected ", total_size(),
+                                          ": expected ", vocab_size_,
                                           " but got ", next_id_);
       }
       valid_ = false;
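
The end-of-file check introduced above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in lookup_util.cc: the helper name CheckVocabSizeAtEof and the constant kUnknownVocabSize are hypothetical, the error is returned as a plain string rather than a Status, and -1 is taken to mean "vocab_size_ was not initialized", as the condition in the diff suggests. It assumes C++17 for std::optional.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical sentinel mirroring the `vocab_size_ != -1` test in the diff:
// -1 is assumed to mean that the caller did not supply a vocabulary size.
constexpr int64_t kUnknownVocabSize = -1;

// Hypothetical standalone helper, intended to run once the reader has
// reported end of file (OutOfRange). It returns an error message only when a
// vocabulary size was supplied and the number of lines read differs from it.
// No file is re-read to obtain a size.
std::optional<std::string> CheckVocabSizeAtEof(const std::string& filename,
                                               int64_t vocab_size,
                                               int64_t next_id) {
  if (vocab_size != kUnknownVocabSize && next_id != vocab_size) {
    return "Invalid vocab_size in " + filename + ": expected " +
           std::to_string(vocab_size) + " but got " + std::to_string(next_id);
  }
  // Either the size is unknown or it matches: end of file is not an error.
  return std::nullopt;
}
```

With an unknown size the helper never reports an error, which corresponds to the "or don't return an error" branch of the commit message; with a known size it reports a mismatch using only values already held in memory.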