- Notifications
You must be signed in to change notification settings - Fork 10.6k
Weaken memory fences for objects we do not care about #70637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f080786 to eeed690 Compare | @swift-ci please smoke test macOS |
| @AnthonyLatsis It passed. We should try the other tests and benchmarks |
In the case where we discard the value of expected after the compare and exchange, we can make the failure memory order for them std::memory_order_relaxed, as the value no longer matters in these cases. Furthermore, memory_order_acq_rel is not necessary for most of these operations anyway, as the value being assigned to the atomic variable does not depend on the current exact value of said atomic variable.
| I am not familiar with these code paths and have not taken time to review your changes. I would rather wait for a code owner to decide whether they look reasonable enough to continue testing them. I will trigger benchmarks, but please note that we might very well be missing an appropriate benchmark test for this. |
| @swift-ci please benchmark |
rokhinip left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this PR is actually changing/improving things - we need more justification for each call site that is being changed for why we do not need all the of the acquire/acq_rel barriers stated.
| | ||
| auto waitingTask = waitQueue.load(std::memory_order_acquire); | ||
| if (!waitQueue.compare_exchange_strong(waitingTask, nullptr)) { | ||
| if (!waitQueue.compare_exchange_strong(waitingTask, nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a store release here? What other memory write are we trying to make visible before we publish this null ptr? I don't see any other memory write here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is seq_cst by default so are you saying we can make this relaxed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a store release here? What other memory write are we trying to make visible before we publish this null ptr? I don't see any other memory write here?
The loads of other threads need to see the write to nullptr.
| // As another offer gets to run, it will have either a different waiting task, or no waiting task at all. | ||
| auto waitingTask = waitQueue.load(std::memory_order_acquire); | ||
| if (!waitQueue.compare_exchange_strong(waitingTask, nullptr)) { | ||
| if (!waitQueue.compare_exchange_strong(waitingTask, nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above.
| assumedStatus, newStatus.completingPendingReadyWaiting(this).status, | ||
| /*success*/ std::memory_order_release, | ||
| /*failure*/ std::memory_order_acquire)) { | ||
| /*failure*/ std::memory_order_relaxed)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct? I think the acquire here is needed to make sure that we see the write by the child task whose results we are ready to consume.
| } | ||
| // Put the waiting task at the beginning of the wait queue. | ||
| if (waitQueue.compare_exchange_strong( | ||
| if (waitQueue.compare_exchange_weak( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| expectedType, hashableBaseType ? hashableBaseType | ||
| : reinterpret_cast<const Metadata *>(1), | ||
| std::memory_order_acq_rel); | ||
| std::memory_order_relaxed, std::memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know this part of the runtime logic but I would be dubious of the fact that this acq_rel or the other acquire barriers in this file are spurious.
Furthermore, memory_order_acq_rel is not necessary for most of these operations anyway, as the value being assigned to the atomic variable does not depend on the current exact value of said atomic variable.
That is not the right way to think about this - you should be reasoning about whether there are other memory updates apart from the atomic, that need to be made visible to the thread in this operation.
| FWIW, I also don't believe any of the CI tests or benchmarks will catch what is being changed here cause (a) they all likely run on x86_64 which has TSO. And most of these memory barriers issues only show up after deep stress tests on ARM platforms. The only way to be sure is to intellectually convince ourselves that the changes here (a) are correct (b) actually not regressing anything. |
| @AtariDreams perhaps the way forward is to justify each of these changes one by one, in separate PRs. Twenty at once is a lot. |
al45tair left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you genuinely think that any of these are justifiable changes, please break this up into separate, smaller, PRs. Thinking each of these through is non-trivial effort, and I doubt anyone is especially keen to go through that for all of the places you're changing here in one go.
To re-iterate what others have said: testing alone is insufficient to tell whether your changes are right — even on a non-TSO platform it might require hours of stress testing to trigger a problem. The only way to be sure is to carefully reason through the program's behaviour, and since you are proposing weakening the memory ordering constraints there's a real chance you're adding subtle bugs here.
In the case where we discard the value of expected after the compare and exchange, we can make the failure memory order for them std::memory_order_relaxed, as the value no longer matters in these cases. Furthermore, memory_order_acq_rel is not necessary for most of these operations anyway, as the value being assigned to the atomic variable does not depend on the current exact value of said atomic variable.