Fix division by zero in ColdBlockInfo::inferFromEdgeProfile #84432

karim-alweheshy · 2025-09-22T12:41:49Z

Summary

This PR fixes a division by zero crash in the Swift compiler's profile-guided optimization implementation and provides user-configurable strategies for handling zero-count blocks, inspired by similar improvements in LLVM.

Background

LLVM recently addressed similar issues with zero-count blocks in profile-guided optimization (see LLVM PR #154437), which added configurable strategies for handling blocks that were instrumented but never executed during profiling. This change brings similar capabilities to Swift.

Problem

The issue manifests in ColdBlockInfo::inferFromEdgeProfile when calculating taken probabilities:

double takenProbability = succCount[i].getValue() / (double)totalCount.getValue();

When totalCount is 0 (block was instrumented but never executed), this causes a division by zero crash during SIL optimization passes.
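
To make the failure mode concrete, here is a minimal sketch of a guarded ratio. It is not the code in ColdBlockInfo.cpp: the helper name `takenProbability` and the use of plain `uint64_t` values in place of ProfileCounter are assumptions made for illustration. For an unsigned count, the `< 1` check described in this PR is equivalent to `== 0`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not the code in ColdBlockInfo.cpp: returns the
// probability that an edge is taken, or std::nullopt when the block's
// total count is zero and no ratio can be formed.
std::optional<double> takenProbability(uint64_t succCount,
                                       uint64_t totalCount) {
  if (totalCount == 0)
    return std::nullopt; // nothing executed; the caller decides what to do
  assert(succCount <= totalCount && "edge count exceeds block total");
  return static_cast<double>(succCount) / static_cast<double>(totalCount);
}
```

Returning an empty optional pushes the "what does zero mean" decision to the caller, which is where a strategy flag can be consulted.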

Solution

Added a check for totalCount.getValue() < 1 before the division operation, with user-configurable strategies for handling zero-count blocks:

Command-Line Option

-sil-zero-count-strategy=<strategy>

Available Strategies

conservative (default): Skip inference, let other heuristics decide
- Safe default behavior that prevents crashes
- Makes no assumptions about missing execution data
- Follows the same pattern as other Swift compiler components
optimistic: Assume zero-count blocks are cold
- Logical assumption: zero execution counts = never executed = cold code
- Useful for aggressive size optimization when profile data is trusted
- Aligns with the reasoning that warm/important code would have execution counts
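
The two strategies could be modeled roughly as follows. This is a sketch under assumptions: `parseStrategy`, `inferZeroCountBlock`, and the `Inference` enum are invented names, and the real option is presumably wired through the compiler's existing command-line machinery rather than parsed by hand.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Illustrative names only; the PR's actual enum and option plumbing may differ.
enum class ZeroCountStrategy { Conservative, Optimistic };

// Outcome for a block whose total count is zero (hypothetical).
enum class Inference { Skip, Cold };

// Maps the textual flag value to a strategy; unknown text yields nullopt.
std::optional<ZeroCountStrategy> parseStrategy(std::string_view text) {
  if (text == "conservative")
    return ZeroCountStrategy::Conservative;
  if (text == "optimistic")
    return ZeroCountStrategy::Optimistic;
  return std::nullopt;
}

// Conservative skips inference; optimistic treats the block as cold.
Inference inferZeroCountBlock(ZeroCountStrategy strategy) {
  switch (strategy) {
  case ZeroCountStrategy::Conservative:
    return Inference::Skip;
  case ZeroCountStrategy::Optimistic:
    return Inference::Cold;
  }
  assert(false && "unknown strategy");
  return Inference::Skip;
}
```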

Changes

ColdBlockInfo.cpp:
- Added zero-count check with configurable strategies
- Command-line option with enum-based strategy selection
- Debug logging for strategy decisions
cold_block_zero_count.sil:
- Comprehensive test cases covering both strategies
- Tests blocks with zero execution counts on all branches
- Tests mixed zero/non-zero execution counts
- Tests inlining scenarios with zero-count blocks

Design Rationale

The two-strategy approach is based on logical reasoning about zero-count scenarios:

Zero counts indicate: either "truly not executed" or "missing profile data"
Conservative approach: When uncertain, make no assumptions (safe default)
Optimistic approach: Assume zero counts mean cold code (aggressive optimization)

No "aggressive/warm" strategy is included because warm/important code would logically have execution counts during profiling.

Testing

The test reproduces the division by zero condition and verifies both strategies handle it gracefully:

# Conservative strategy (default)
%target-sil-opt %s -performance-inline
# Optimistic strategy
%target-sil-opt %s -performance-inline -sil-zero-count-strategy=optimistic

Profile data with zero counts is common for error handling paths, platform-specific code, and rarely executed functions. This fix ensures the compiler handles such cases without crashing while providing optimization flexibility.

Compatibility

Backward compatible: conservative strategy is the default
Follows existing Swift patterns for command-line options
Aligns with LLVM's approach to profile-guided optimization control

When profile data shows that a function was instrumented but never executed (function count = 0), the current code causes a division by zero error in ColdBlockInfo::inferFromEdgeProfile when calculating taken probabilities.

This fix adds a check for totalCount < 1 before the division operation, following the same pattern used elsewhere in the Swift compiler for handling zero execution counts. When encountered, the function conservatively returns false to skip inference and let other heuristics handle the block.

The fix includes comprehensive test cases covering:
- Blocks with zero execution counts on all branches
- Mixed zero/non-zero execution counts
- Inlining scenarios with zero-count blocks

This resolves compiler crashes during SIL optimization passes when using profile-guided optimization with functions that were never executed.

kavon

Thanks! Overall this looks great. Do you think you could squash this into one commit?

kavon · 2025-09-25T19:26:35Z

lib/SILOptimizer/Analysis/ColdBlockInfo.cpp


+ // Handle the case where the block was instrumented but never executed
+ // This aligns with LLVM's handling of zero-count blocks
+ if (totalCount.getValue() < 1) {


A ProfileCounter is effectively an Optional<UInt> and there's some nuance here between .none and .some(0). I believe .none was meant to convey "missing" data, perhaps due to sampling misses, and the .some(0) was meant to convey "definitely no execution happened".

So, I think some of this ZeroCountStrategy stuff needs to also be added a few lines up, to decide whether to interpret a .none value as 0. I think this interpretation should only happen in the optimistic strategy if there's at least one ProfileCounter in the block's successors list with a .some, regardless of what the count of it is. Otherwise the optimistic strategy may mark every block cold.
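
One way to read the rule proposed in this comment, sketched with `std::optional<uint64_t>` standing in for ProfileCounter (`nullopt` for .none, a value for .some). The function name `resolveCounts` is hypothetical, and the condition follows the comment as written: any recorded count on a successor, zero or not, counts as evidence of profiling.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

enum class ZeroCountStrategy { Conservative, Optimistic };

// std::optional<uint64_t> stands in for ProfileCounter: nullopt is
// "missing", a value (including 0) is a recorded count.
using Count = std::optional<uint64_t>;

// Returns per-successor counts with missing entries filled in as 0, or
// nullopt when the counts should not be interpreted at all.
std::optional<std::vector<uint64_t>>
resolveCounts(const std::vector<Count> &succ, ZeroCountStrategy strategy) {
  bool anyPresent = false, anyMissing = false;
  for (const Count &c : succ) {
    if (c)
      anyPresent = true;
    else
      anyMissing = true;
  }
  // Missing data is only read as zero under the optimistic strategy, and
  // only when some successor carries a recorded count.
  if (anyMissing && (strategy != ZeroCountStrategy::Optimistic || !anyPresent))
    return std::nullopt;

  std::vector<uint64_t> resolved;
  resolved.reserve(succ.size());
  for (const Count &c : succ)
    resolved.push_back(c.value_or(0));
  return resolved;
}
```

With this shape, a block whose successors are all .none is never interpreted, so the optimistic strategy cannot mark wholly unprofiled code cold.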

…k analysis

Address Kavon's feedback on PR swiftlang#84432 by implementing a refined approach to handling profile counter data in inferFromEdgeProfile:
- Distinguish between ProfileCounter without value (.none) and ProfileCounter(0)
- Apply optimistic strategy only when there's evidence of profiling
- Treat missing data as zero only if at least one successor has non-zero count
- Maintain conservative behavior for completely unprofiled code

This enables better optimization decisions when profile data is partial, while preventing unprofiled code from being incorrectly marked as entirely cold.

Key changes:
- Two-pass algorithm: first detect profiling evidence, then build counts
- Missing data treated as zero only with >=1 non-zero successor count
- Updated comments to reflect the new optimistic strategy logic

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

karim-alweheshy · 2025-10-15T16:22:27Z

TODO: Implement full SamplePGO profile count propagation algorithm

The current implementation handles missing profile data with a simple
optimistic strategy (treat missing as zero when evidence of profiling exists).
The SamplePGO paper by Diego Novillo describes more sophisticated techniques
for inferring missing counts through iterative propagation across the CFG:

Equivalence Classes: Compute sets of blocks guaranteed to execute the
same number of times using dominance, post-dominance, and loop nesting:
- B1 dominates B2
- B2 post-dominates B1
- B1 and B2 in same loop nest
  All blocks in same class get same weight.
Iterative Propagation: Use flow conservation to infer unknown weights:
Weight(Block) = Sum(Incoming Edges) = Sum(Outgoing Edges)
Algorithm:
- If all edges known → compute block weight
- If block known and one edge unknown → compute that edge
- Iterate until convergence or max iterations
Multi-Successor Support: Extend beyond 2-successor limitation to handle
switch statements, multi-way branches, and complex control flow.

Future implementation should:

Add propagateProfileWeights() method as a third pass after this collection
Use the existing 'counters' vector as input (already distinguishes missing)
Maintain backward compatibility with ZeroCountStrategy flag
Support incremental adoption (propagation as optional enhancement)
Add new command-line flag: -sil-profile-propagation=[none|basic|full]

The paper reports achieving up to 98% of instrumentation-based PGO performance
gains with this approach, making it highly valuable for handling incomplete
sampling profiles in real-world scenarios.

Reference: Diego Novillo. "SamplePGO - The Power of Profile Guided
Optimizations without the Usability Burden."
LLVM-HPC 2014. DOI: 10.1109/LLVM-HPC.2014.8
PDF: https://storage.googleapis.com/gweb-research2023-media/pubtools/pdf/45290.pdf
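
For reference, the flow-conservation step in the TODO above can be sketched on a toy graph. This is only the "Iterative Propagation" rule, not the SamplePGO algorithm: equivalence classes are omitted, and the `Graph`, `Edge`, `balanceSide`, and `propagateWeights` names are invented for this example rather than taken from the Swift or LLVM sources.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using Weight = std::optional<uint64_t>; // nullopt means "unknown"

struct Edge { int from; int to; Weight weight; };

struct Graph {
  std::vector<Weight> blocks; // indexed by block id
  std::vector<Edge> edges;
};

// Applies flow conservation on one side of a block (its incoming or its
// outgoing edges). Returns true if a new weight was inferred.
static bool balanceSide(Graph &g, int block, bool incoming) {
  assert(block >= 0 && block < static_cast<int>(g.blocks.size()));
  uint64_t knownSum = 0;
  int total = 0, unknown = 0;
  Edge *lastUnknown = nullptr;
  for (Edge &e : g.edges) {
    if ((incoming ? e.to : e.from) != block)
      continue;
    ++total;
    if (e.weight) {
      knownSum += *e.weight;
    } else {
      ++unknown;
      lastUnknown = &e;
    }
  }
  if (total == 0)
    return false; // entry or exit side: nothing to balance
  Weight &w = g.blocks[block];
  if (!w && unknown == 0) { // all edges known: compute the block weight
    w = knownSum;
    return true;
  }
  if (w && unknown == 1 && *w >= knownSum) { // one edge unknown: solve it
    lastUnknown->weight = *w - knownSum;
    return true;
  }
  return false;
}

// Repeats the two rules until nothing changes or maxIterations is reached.
void propagateWeights(Graph &g, int maxIterations = 100) {
  for (int iter = 0; iter < maxIterations; ++iter) {
    bool changed = false;
    for (int b = 0; b < static_cast<int>(g.blocks.size()); ++b) {
      changed |= balanceSide(g, b, /*incoming=*/true);
      changed |= balanceSide(g, b, /*incoming=*/false);
    }
    if (!changed)
      return;
  }
}
```

On a diamond-shaped graph where only the entry block weight and one outgoing edge are known, a single sweep is enough to fill in the remaining edges and blocks.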

kavon · 2025-10-15T22:48:50Z

TODO: Implement full SamplePGO profile count propagation algorithm

How about we keep this PR focused on the bug fix and save the new algorithm for a follow-up PR?

kavon

Please ping me when you think this is ready for another round of review! Thanks.

kavon · 2025-10-15T22:59:40Z

lib/SILOptimizer/Analysis/ColdBlockInfo.cpp

+ // future propagation algorithms, which need to know where data is missing.
+ bool hasAnyNonZeroCount = false;
+ bool hasAnyMissingData = false;
+ SmallVector<std::optional<ProfileCounter>, 2> counters;


nit: std::optional<ProfileCounter> is redundant here. ProfileCounter already has a representation of "missing"
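
To illustrate why the wrapper adds nothing, here is a stand-in counter type that reserves one value as a "missing" sentinel. It is written for this example and is not the real ProfileCounter; the class name `Counter` and the choice of sentinel are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in written for illustration; it is not the real ProfileCounter.
class Counter {
  // The largest value is reserved to mean "no data recorded".
  static constexpr uint64_t Missing = std::numeric_limits<uint64_t>::max();
  uint64_t count = Missing;

public:
  constexpr Counter() = default;                 // missing
  constexpr Counter(uint64_t c) : count(c) {}    // recorded, possibly zero
  constexpr bool hasValue() const { return count != Missing; }
  uint64_t getValue() const {
    assert(hasValue() && "reading a missing count");
    return count;
  }
};
```

A default-constructed `Counter` already answers "is data missing?", so wrapping it in `std::optional` would create two separate ways to say the same thing.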

kavon · 2025-10-15T23:06:24Z

lib/SILOptimizer/Analysis/ColdBlockInfo.cpp

+ // Second pass: build the count vector, treating missing as zero if optimistic
+ for (size_t i = 0; i < counters.size(); i++) {
+ ProfileCounter count;
+ if (counters[i].has_value()) {
+ count = counters[i].value();
+ } else {
+ // We're being optimistic - treat missing as zero
+ count = ProfileCounter(0);
+ LLVM_DEBUG(llvm::dbgs()
+ << "ColdBlockInfo: treating missing profile data as zero for "
+ << toString(BB->getSuccessors()[i])
+ << " (optimistic strategy - found non-zero counts on other edges)\n");
+ }


This is applying the optimistic strategy unconditionally. I would expect it to be aligned with the flag.

kavon · 2025-10-15T23:17:08Z

lib/SILOptimizer/Analysis/ColdBlockInfo.cpp

- // especially since we only have two temperatures.
+ // especially since we only have two temperatures (cold/warm).
+ // TODO: With propagation algorithm, this limitation can be removed to support
+ // multi-way branches (switch statements) and arbitrary successor counts.


nit: inaccurate addition to this comment

There already is propagation. The reason for this limitation is hinted at in the original comment: there's only two temperatures being set and propagated for blocks. We set a block cold only if it's taken with very low probability, currently under 3%, otherwise it's just "warm". If a block has 100 successors and they are all taken with equal probability of 1%, then we'd mistakenly mark all blocks cold.

Definitely could use a better strategy here, but I think for this PR let's stay focused on fixing the divide-by-zero and having the "assume cold for missing data".
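
The 100-successor case described above can be checked with a few lines of arithmetic. This sketch judges each edge only by its own taken probability against the 3% figure quoted in the comment; the constant and function names are made up, and the real analysis involves more than this single comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Threshold taken from the review comment ("currently under 3%"); the
// constant name is made up for this sketch.
constexpr double kColdThreshold = 0.03;

// Counts how many successors would be classified cold when each edge is
// judged only by its own taken probability.
int countColdSuccessors(const std::vector<uint64_t> &succCounts) {
  uint64_t total = 0;
  for (uint64_t c : succCounts)
    total += c;
  if (total == 0)
    return 0; // no data; nothing is inferred in this sketch
  int cold = 0;
  for (uint64_t c : succCounts) {
    double p = static_cast<double>(c) / static_cast<double>(total);
    if (p < kColdThreshold)
      ++cold;
  }
  return cold;
}
```

With 100 equally likely successors each edge has probability 0.01, which is below 0.03, so every successor is counted as cold even though one of them is always taken. With two successors the same threshold behaves sensibly, which is the reason for the two-successor limitation.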

@kavon

Fix handling of profile data in inferFromEdgeProfile to distinguish between missing data and explicit zero counts, enabling better optimization decisions.

The bug: Code treated both missing profile data and zero counts identically, causing missed optimization opportunities when profile data showed edges were definitively never taken.

Solution: Apply optimistic strategy when there's evidence of profiling:
- If any successor has non-zero count, treat missing data as zero
- If all successors are missing or zero, remain conservative

Implementation:
- Two-pass algorithm: detect evidence of profiling, then build counts
- Use std::optional<ProfileCounter> to track missing vs zero distinction
- Add ZeroCountStrategy flag for handling all-zero cases

Testing: Added comprehensive test cases for all missing/zero/non-zero combinations in cold_block_zero_count.sil

Addresses feedback from @kavon on swiftlang#84432

karim-alweheshy · 2025-10-20T08:55:20Z

lib/SILOptimizer/Analysis/ColdBlockInfo.cpp

+ return false;
+ }
+ // Continue to build count vector, treating missing as zero
+ LLVM_DEBUG(llvm::dbgs() << "ColdBlockInfo: applying optimistic strategy for "


We consider missing data to be cold only when the data is partial, i.e. one of the branches was executed at least once.

kavon · 2025-10-22T20:31:45Z

test/SILOptimizer/cold_block_zero_count.sil

+// CHECK-CONSERVATIVE-LABEL: sil @test_zero_count_block
+// CHECK-OPTIMISTIC-LABEL: sil @test_zero_count_block
+sil @test_zero_count_block : $@convention(thin) () -> () !function_entry_count(100) {


The only CHECK-* lines in this test are to ensure the existence of the function after the performance inliner runs. There's nothing verifying that blocks got marked cold or not with the different zero-count strategies. As an example of how to do that, see test/SILOptimizer/cold_block_info.swift, where I use the debug output from this analysis to check that it's working as expected.

Address review feedback on PR swiftlang#84432 by adding debug output verification to pgo_si_reduce.swift and pgo_si_inlinelarge.swift.

These tests now verify that blocks are correctly marked as cold or warm based on profile data, using the cold-block-info debug output similar to cold_block_info.swift.

Previously, these tests only checked for function existence after the performance inliner runs. Now they also verify the cold block analysis works correctly with different profile count scenarios.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Address review feedback on PR swiftlang#84432 by adding debug output verification to pgo_si_reduce.swift and pgo_si_inlinelarge.swift.

These tests now verify that blocks are correctly marked as cold or warm based on profile data, using the cold-block-info debug output similar to cold_block_info.swift.

The tests verify different zero-count scenarios:
- pgo_si_reduce: Tests with blocks that have zero execution counts (e.g., x==0, x==1 in bar() are never hit when called with even numbers)
- pgo_si_inlinelarge: Tests with many conditional blocks, most with zero counts, to verify cold block detection with more complex control flow

Both tests verify that:
1. Blocks with zero execution counts are correctly identified as cold
2. Blocks with high execution counts are correctly identified as warm
3. The cold block analysis integrates properly with profile-guided optimization

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

karim-alweheshy marked this pull request as ready for review September 22, 2025 13:01

karim-alweheshy requested a review from eeckstein as a code owner September 22, 2025 13:01

eeckstein requested a review from kavon September 22, 2025 16:12

kavon requested changes Sep 25, 2025

karim-alweheshy requested a review from kavon September 29, 2025 17:07

kavon requested changes Oct 15, 2025

karim-alweheshy force-pushed the fix-coldblockinfo-division-by-zero branch from da07093 to 8bdd378 Compare October 17, 2025 11:00

karim-alweheshy requested a review from kavon October 17, 2025 11:00

karim-alweheshy commented Oct 20, 2025

kavon reviewed Oct 22, 2025
