Closed
Description
There are two bugs in our current code:
- In the double buffer reader:
  The double buffer reader creates a prefetching thread that reads data from files and yields it into the member variable `buffer_`. If the double buffer reader itself is destroyed before the prefetching thread finishes, a segmentation fault occurs, because the thread is still trying to yield data to memory that has already been freed.
- In the shuffle reader:
Paddle/paddle/fluid/operators/reader/create_shuffle_reader_op.cc
Lines 49 to 65 in d7d0c1e
    void ReadIntoBuffers() {
      buffer_.clear();
      buffer_.reserve(buffer_size_);
      iteration_pos_ = 0;
      PADDLE_ENFORCE(reader_->HasNext());
      for (size_t i = 0; i < buffer_size_; ++i) {
        if (!reader_->HasNext()) {
          break;
        }
        buffer_.emplace_back();
        reader_->ReadNext(&buffer_.back());
      }
      std::mt19937 g(seed_);
      std::shuffle(buffer_.begin(), buffer_.end(), g);
      seed_ = g();  // update seed_;
      VLOG(10) << "random buffer size = " << buffer_.size();
    }
`ReadIntoBuffers()` is invoked by `ShuffleReader`'s constructor. So if the underlying reader holds an empty file, the `PADDLE_ENFORCE(reader_->HasNext())` check on line 53 throws an exception during construction. That is unexpected: the 'no next data' exception should be thrown in `ReadNext()`, not here.
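For the first bug, one possible direction is to have the reader's destructor signal the prefetching thread and join it before any member is destroyed. The following is only a minimal sketch of that idea, not Paddle's actual implementation: `PrefetchingReader`, `PrefetchLoop`, `kCapacity`, `closed_` and `done_` are hypothetical names, and the "data" is just the integers `0..total-1` rather than file contents.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the double buffer reader (not Paddle's class).
// It "reads" the integers 0..total-1 instead of real file data.
class PrefetchingReader {
 public:
  explicit PrefetchingReader(int total) : total_(total) {
    assert(total >= 0);
    // Started in the constructor body, after all members are initialized.
    prefetcher_ = std::thread([this] { PrefetchLoop(); });
  }

  ~PrefetchingReader() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      closed_ = true;  // ask the prefetching thread to stop
    }
    cv_.notify_all();
    // Join before any member is destroyed, so the thread can never
    // yield into a buffer_ that no longer exists.
    if (prefetcher_.joinable()) prefetcher_.join();
  }

  // Blocks until an item is available. Returns false once the source is
  // exhausted and the buffer has been drained.
  bool ReadNext(int* out) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !buffer_.empty() || done_; });
    if (buffer_.empty()) return false;
    *out = buffer_.front();
    buffer_.pop();
    cv_.notify_all();  // wake the prefetcher: there is room again
    return true;
  }

 private:
  void PrefetchLoop() {
    for (int i = 0; i < total_; ++i) {
      std::unique_lock<std::mutex> lock(mu_);
      // Wait for free space, or for the destructor's stop request.
      cv_.wait(lock, [this] { return buffer_.size() < kCapacity || closed_; });
      if (closed_) break;
      buffer_.push(i);
      cv_.notify_all();
    }
    std::lock_guard<std::mutex> lock(mu_);
    done_ = true;
    cv_.notify_all();
  }

  static constexpr std::size_t kCapacity = 2;  // "double" buffer
  const int total_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<int> buffer_;
  bool closed_ = false;  // set by the destructor
  bool done_ = false;    // set by the prefetching thread on exit
  std::thread prefetcher_;
};
```

With this shape, destroying a `PrefetchingReader` while its thread is blocked on a full buffer wakes the thread through `closed_` and waits for it to exit, instead of leaving it running against freed memory.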
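For the second bug, one way to get the behavior described above is to keep the constructor free of reads and fill the buffer lazily, so that an empty source only produces an error when `ReadNext()` is actually called. This is a hedged sketch rather than the real fix: `VectorSource` and `LazyShuffleReader` are hypothetical stand-ins, `std::out_of_range` replaces `PADDLE_ENFORCE`, and the element type is simplified to `int`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical underlying reader backed by an in-memory vector.
class VectorSource {
 public:
  explicit VectorSource(std::vector<int> data) : data_(std::move(data)) {}
  bool HasNext() const { return pos_ < data_.size(); }
  int Next() { return data_[pos_++]; }

 private:
  std::vector<int> data_;
  std::size_t pos_ = 0;
};

// Hypothetical shuffle reader that defers the first read.
class LazyShuffleReader {
 public:
  // The constructor only stores its arguments; it never touches the source,
  // so an empty source cannot make construction fail.
  LazyShuffleReader(VectorSource* source, std::size_t buffer_size,
                    unsigned seed)
      : source_(source), buffer_size_(buffer_size), seed_(seed) {
    assert(source != nullptr);
  }

  bool HasNext() {
    if (iteration_pos_ >= buffer_.size()) ReadIntoBuffers();
    return iteration_pos_ < buffer_.size();
  }

  // The 'no next data' error is raised here, not in the constructor.
  int ReadNext() {
    if (!HasNext()) throw std::out_of_range("no next data");
    return buffer_[iteration_pos_++];
  }

 private:
  void ReadIntoBuffers() {
    buffer_.clear();
    buffer_.reserve(buffer_size_);
    iteration_pos_ = 0;
    // No unconditional check: an exhausted source just yields an empty buffer.
    for (std::size_t i = 0; i < buffer_size_ && source_->HasNext(); ++i) {
      buffer_.push_back(source_->Next());
    }
    std::mt19937 g(seed_);
    std::shuffle(buffer_.begin(), buffer_.end(), g);
    seed_ = static_cast<unsigned>(g());  // update seed_ for the next refill
  }

  VectorSource* source_;
  std::size_t buffer_size_;
  unsigned seed_;
  std::vector<int> buffer_;
  std::size_t iteration_pos_ = 0;
};
```

Constructing a `LazyShuffleReader` over an empty `VectorSource` succeeds, `HasNext()` returns `false`, and only a call to `ReadNext()` throws.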