- Notifications
You must be signed in to change notification settings - Fork 10.6k
Fix division by zero in ColdBlockInfo::inferFromEdgeProfile #84432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix division by zero in ColdBlockInfo::inferFromEdgeProfile #84432
Conversation
When profile data shows that a function was instrumented but never executed (function count = 0), the current code causes a division by zero error in ColdBlockInfo::inferFromEdgeProfile when calculating taken probabilities. This fix adds a check for totalCount < 1 before the division operation, following the same pattern used elsewhere in the Swift compiler for handling zero execution counts. When encountered, the function conservatively returns false to skip inference and let other heuristics handle the block. The fix includes comprehensive test cases covering: - Blocks with zero execution counts on all branches - Mixed zero/non-zero execution counts - Inlining scenarios with zero-count blocks This resolves compiler crashes during SIL optimization passes when using profile-guided optimization with functions that were never executed.
kavon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Overall this looks great. Do you think you could squash this into one commit?
| | ||
| // Handle the case where the block was instrumented but never executed | ||
| // This aligns with LLVM's handling of zero-count blocks | ||
| if (totalCount.getValue() < 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A ProfileCounter is effectively an Optional<UInt> and there's some nuance here between .none and .some(0). I believe .none was meant to convey "missing" data, perhaps due to sampling misses, and the .some(0) was meant to convey "definitely no execution happened".
So, I think some of this ZeroCountStrategy stuff needs to also be added a few lines up, to decide whether to interpret a .none value as 0. I think this interpretation should only happen in the optimistic strategy if there's at least one ProfileCounter in the block's successors list with a .some, regardless of what the count of it is. Otherwise the optimistic strategy may mark every block cold.
…k analysis Address Kavon's feedback on PR swiftlang#84432 by implementing a refined approach to handling profile counter data in inferFromEdgeProfile: - Distinguish between ProfileCounter without value (.none) and ProfileCounter(0) - Apply optimistic strategy only when there's evidence of profiling - Treat missing data as zero only if at least one successor has non-zero count - Maintain conservative behavior for completely unprofiled code This enables better optimization decisions when profile data is partial, while preventing unprofiled code from being incorrectly marked as entirely cold. Key changes: - Two-pass algorithm: first detect profiling evidence, then build counts - Missing data treated as zero only with >=1 non-zero successor count - Updated comments to reflect the new optimistic strategy logic 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
| TODO: Implement full SamplePGO profile count propagation algorithm The current implementation handles missing profile data with a simple
Future implementation should:
The paper reports achieving up to 98% of instrumentation-based PGO performance Reference: Diego Novillo. "SamplePGO - The Power of Profile Guided |
How about we keep this PR focused on the bug fix and save the new algorithm for a follow-up PR? |
kavon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ping me when you think this is ready for another round of review! Thanks.
| // future propagation algorithms, which need to know where data is missing. | ||
| bool hasAnyNonZeroCount = false; | ||
| bool hasAnyMissingData = false; | ||
| SmallVector<std::optional<ProfileCounter>, 2> counters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: std::optional<ProfileCounter> is redundant here. ProfileCounter already has a representation of "missing"
| // Second pass: build the count vector, treating missing as zero if optimistic | ||
| for (size_t i = 0; i < counters.size(); i++) { | ||
| ProfileCounter count; | ||
| if (counters[i].has_value()) { | ||
| count = counters[i].value(); | ||
| } else { | ||
| // We're being optimistic - treat missing as zero | ||
| count = ProfileCounter(0); | ||
| LLVM_DEBUG(llvm::dbgs() | ||
| << "ColdBlockInfo: treating missing profile data as zero for " | ||
| << toString(BB->getSuccessors()[i]) | ||
| << " (optimistic strategy - found non-zero counts on other edges)\n"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is applying the optimistic strategy unconditionally. I would expect it to be aligned with the flag.
| // especially since we only have two temperatures. | ||
| // especially since we only have two temperatures (cold/warm). | ||
| // TODO: With propagation algorithm, this limitation can be removed to support | ||
| // multi-way branches (switch statements) and arbitrary successor counts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inaccurate addition to this comment
There already is propagation. The reason for this limitation is hinted at in the original comment: there's only two temperatures being set and propagated for blocks. We set a block cold only if it's taken with very low probability, currently under 3%, otherwise it's just "warm". If a block has 100 successors and they are all taken with equal probability of 1%, then we'd mistakenly mark all blocks cold.
Definitely could use a better strategy here, but I think for this PR let's stay focused on fixing the divide-by-zero and having the "assume cold for missing data".
Fix handling of profile data in inferFromEdgeProfile to distinguish between missing data and explicit zero counts, enabling better optimization decisions. The bug: Code treated both missing profile data and zero counts identically, causing missed optimization opportunities when profile data showed edges were definitively never taken. Solution: Apply optimistic strategy when there's evidence of profiling: - If any successor has non-zero count, treat missing data as zero - If all successors are missing or zero, remain conservative Implementation: - Two-pass algorithm: detect evidence of profiling, then build counts - Use std::optional<ProfileCounter> to track missing vs zero distinction - Add ZeroCountStrategy flag for handling all-zero cases Testing: Added comprehensive test cases for all missing/zero/non-zero combinations in cold_block_zero_count.sil Addresses feedback from @kavon on swiftlang#84432
da07093 to 8bdd378 Compare | return false; | ||
| } | ||
| // Continue to build count vector, treating missing as zero | ||
| LLVM_DEBUG(llvm::dbgs() << "ColdBlockInfo: applying optimistic strategy for " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we consider missing data to be cold only when data is partial i.e. on of the branches was executed at least once
| // CHECK-CONSERVATIVE-LABEL: sil @test_zero_count_block | ||
| // CHECK-OPTIMISTIC-LABEL: sil @test_zero_count_block | ||
| sil @test_zero_count_block : $@convention(thin) () -> () !function_entry_count(100) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only CHECK-* lines in this test are to ensure the existence of the function after the performance inliner runs. There's nothing verifying that blocks got marked cold or not with the different zero-count strategies. As an example of how to do that, see test/SILOptimizer/cold_block_info.swift where I use the debug output from this analysis check that it's working as expected.
Address review feedback on PR swiftlang#84432 by adding debug output verification to pgo_si_reduce.swift and pgo_si_inlinelarge.swift. These tests now verify that blocks are correctly marked as cold or warm based on profile data, using the cold-block-info debug output similar to cold_block_info.swift. Previously, these tests only checked for function existence after the performance inliner runs. Now they also verify the cold block analysis works correctly with different profile count scenarios. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Address review feedback on PR swiftlang#84432 by adding debug output verification to pgo_si_reduce.swift and pgo_si_inlinelarge.swift. These tests now verify that blocks are correctly marked as cold or warm based on profile data, using the cold-block-info debug output similar to cold_block_info.swift. The tests verify different zero-count scenarios: - pgo_si_reduce: Tests with blocks that have zero execution counts (e.g., x==0, x==1 in bar() are never hit when called with even numbers) - pgo_si_inlinelarge: Tests with many conditional blocks, most with zero counts, to verify cold block detection with more complex control flow Both tests verify that: 1. Blocks with zero execution counts are correctly identified as cold 2. Blocks with high execution counts are correctly identified as warm 3. The cold block analysis integrates properly with profile-guided optimization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Summary
This PR fixes a division by zero crash in the Swift compiler's profile-guided optimization implementation and provides user-configurable strategies for handling zero-count blocks, inspired by similar improvements in LLVM.
Background
LLVM recently addressed similar issues with zero-count blocks in profile-guided optimization (see LLVM PR #154437), which added configurable strategies for handling blocks that were instrumented but never executed during profiling. This change brings similar capabilities to Swift.
Problem
The issue manifests in
ColdBlockInfo::inferFromEdgeProfilewhen calculating taken probabilities:When
totalCountis 0 (block was instrumented but never executed), this causes a division by zero crash during SIL optimization passes.Solution
Added a check for
totalCount.getValue() < 1before the division operation, with user-configurable strategies for handling zero-count blocks:Command-Line Option
Available Strategies
conservative(default): Skip inference, let other heuristics decideoptimistic: Assume zero-count blocks are coldChanges
ColdBlockInfo.cpp:
cold_block_zero_count.sil:
Design Rationale
The two-strategy approach is based on logical reasoning about zero-count scenarios:
No "aggressive/warm" strategy is included because warm/important code would logically have execution counts during profiling.
Testing
The test reproduces the division by zero condition and verifies both strategies handle it gracefully:
Profile data with zero counts is common for error handling paths, platform-specific code, and rarely executed functions. This fix ensures the compiler handles such cases without crashing while providing optimization flexibility.
Compatibility