Two bugs in C++ reader #9180

@JiayiFeng

Description

There are two bugs in our current code:

  1. In the double buffer reader:
    The double buffer reader creates a prefetching thread that reads data from files and pushes it into a member variable, buffer_. However, if the double buffer reader itself is destroyed before the prefetching thread has finished, a segmentation fault occurs because the thread is still trying to write data to a destroyed address.

  2. In the shuffle reader:

    void ReadIntoBuffers() {
      buffer_.clear();
      buffer_.reserve(buffer_size_);
      iteration_pos_ = 0;
      PADDLE_ENFORCE(reader_->HasNext());
      for (size_t i = 0; i < buffer_size_; ++i) {
        if (!reader_->HasNext()) {
          break;
        }
        buffer_.emplace_back();
        reader_->ReadNext(&buffer_.back());
      }
      std::mt19937 g(seed_);
      std::shuffle(buffer_.begin(), buffer_.end(), g);
      seed_ = g();  // update seed_;
      VLOG(10) << "random buffer size = " << buffer_.size();
    }

    ReadIntoBuffers() is invoked by ShuffleReader's constructor, so if the underlying reader holds an empty file, the PADDLE_ENFORCE(reader_->HasNext()) check (line 53) throws an exception. That is unexpected: the 'no next data' exception should be thrown in ReadNext(), not here.
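
For the first bug (the double buffer reader), one possible direction is to have the destructor signal the prefetching thread and join it before any member is destroyed. The following is only a minimal sketch of that idea, not the actual reader: the class name PrefetchingReader, the int payload, the capacity of 2, and the stop_ flag are all assumptions made for illustration.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the double buffer reader (not the real class).
// It "prefetches" consecutive integers instead of reading from files.
class PrefetchingReader {
 public:
  PrefetchingReader() : prefetcher_([this] { PrefetchLoop(); }) {}

  ~PrefetchingReader() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;  // ask the prefetching thread to exit
    }
    cv_.notify_all();
    // Join BEFORE members are destroyed, so the thread can never
    // write into a buffer_ that no longer exists.
    if (prefetcher_.joinable()) prefetcher_.join();
  }

  // Blocks until an item is available; returns false only when stopping.
  bool ReadNext(int* out) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !buffer_.empty() || stop_; });
    if (buffer_.empty()) return false;
    *out = buffer_.front();
    buffer_.pop();
    cv_.notify_all();  // wake the prefetcher: there is free space again
    return true;
  }

 private:
  void PrefetchLoop() {
    int next = 0;
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      // Sleep while the buffer is full, unless asked to stop.
      cv_.wait(lock, [this] { return buffer_.size() < kCapacity || stop_; });
      if (stop_) return;
      buffer_.push(next++);
      cv_.notify_all();
    }
  }

  static constexpr std::size_t kCapacity = 2;  // assumed buffer depth
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<int> buffer_;
  bool stop_ = false;
  // Declared last so every other member is fully constructed
  // before the thread starts running.
  std::thread prefetcher_;
};
```

With this layout, letting a PrefetchingReader go out of scope while the thread is blocked on a full buffer wakes the thread, lets it return, and only then releases buffer_.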

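For the second bug (the shuffle reader), one way to get the behavior described above is to let ReadIntoBuffers() tolerate an empty underlying reader and raise the 'no next data' error only from ReadNext(). The sketch below assumes a simplified reader interface: IntReader, VectorReader, ShuffleReaderSketch, the int payload, and the use of std::out_of_range in place of PADDLE_ENFORCE are all hypothetical names and choices, not the project's real API.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical minimal reader interface, used only for this sketch.
class IntReader {
 public:
  virtual ~IntReader() = default;
  virtual bool HasNext() const = 0;
  virtual void ReadNext(int* out) = 0;
};

// Hypothetical underlying reader backed by a vector (an empty vector
// plays the role of an empty file).
class VectorReader : public IntReader {
 public:
  explicit VectorReader(std::vector<int> data) : data_(std::move(data)) {}
  bool HasNext() const override { return pos_ < data_.size(); }
  void ReadNext(int* out) override {
    if (!HasNext()) throw std::out_of_range("no next data");
    *out = data_[pos_++];
  }

 private:
  std::vector<int> data_;
  std::size_t pos_ = 0;
};

class ShuffleReaderSketch {
 public:
  ShuffleReaderSketch(std::unique_ptr<IntReader> reader,
                      std::size_t buffer_size,
                      std::mt19937::result_type seed)
      : reader_(std::move(reader)), buffer_size_(buffer_size), seed_(seed) {
    ReadIntoBuffers();  // safe even if the underlying reader is empty
  }

  bool HasNext() const { return iteration_pos_ < buffer_.size(); }

  void ReadNext(int* out) {
    // The 'no next data' error is reported here, not in the constructor.
    if (!HasNext()) throw std::out_of_range("no next data");
    *out = buffer_[iteration_pos_++];
    // Refill once the current shuffled batch is exhausted.
    if (iteration_pos_ == buffer_.size() && reader_->HasNext()) {
      ReadIntoBuffers();
    }
  }

 private:
  void ReadIntoBuffers() {
    buffer_.clear();
    buffer_.reserve(buffer_size_);
    iteration_pos_ = 0;
    // No unconditional HasNext() check: an empty reader yields an empty buffer.
    for (std::size_t i = 0; i < buffer_size_ && reader_->HasNext(); ++i) {
      buffer_.emplace_back();
      reader_->ReadNext(&buffer_.back());
    }
    std::mt19937 g(seed_);
    std::shuffle(buffer_.begin(), buffer_.end(), g);
    seed_ = g();  // update seed_ for the next batch
  }

  std::unique_ptr<IntReader> reader_;
  std::size_t buffer_size_;
  std::mt19937::result_type seed_;
  std::vector<int> buffer_;
  std::size_t iteration_pos_ = 0;
};
```

Constructing ShuffleReaderSketch over an empty VectorReader succeeds here, and the exception appears only when a caller actually asks for data through ReadNext().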