It turns out that every file ends at some point, so an OutOfRange status is encountered on every successful read. The old code then compared next_id_ to total_size() to decide whether to return an error, but that comparison is exactly what we were trying to avoid. Instead, compare against vocab_size_ if it was initialized; otherwise, don't return an error.
PiperOrigin-RevId: 191952441
string line;
status_ = input_buffer_->ReadLine(&line);
if (!status_.ok()) {
- if (errors::IsOutOfRange(status_) && next_id_ != total_size()) {
+ if (errors::IsOutOfRange(status_) && vocab_size_ != -1 &&
+ next_id_ != vocab_size_) {
status_ = errors::InvalidArgument("Invalid vocab_size in ", filename_,
- ": expected ", total_size(),
+ ": expected ", vocab_size_,
" but got ", next_id_);
}
valid_ = false;
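
The end-of-file rule described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the actual TensorFlow code: `CheckEndOfFile` is a hypothetical free function, it returns a plain string instead of a `Status`, and it assumes (as the diff suggests) that -1 is the sentinel for an unspecified vocab_size.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of TensorFlow. Called once the reader has
// hit end of file. Returns an empty string when the end of file is
// acceptable, or an error message when the number of lines read does not
// match an explicitly configured vocab_size.
std::string CheckEndOfFile(int64_t vocab_size, int64_t next_id,
                           const std::string& filename) {
  // Assumption: -1 means the caller did not specify a vocab_size, so there
  // is nothing to validate against and end of file is simply the end.
  if (vocab_size != -1 && next_id != vocab_size) {
    return "Invalid vocab_size in " + filename + ": expected " +
           std::to_string(vocab_size) + " but got " + std::to_string(next_id);
  }
  return "";
}
```

Under these assumptions, `CheckEndOfFile(-1, 3, "vocab.txt")` and `CheckEndOfFile(3, 3, "vocab.txt")` both return an empty string, while `CheckEndOfFile(5, 3, "vocab.txt")` returns the mismatch message.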