Conversation

@ooples (Owner) commented Nov 11, 2025

User Story / Context

  • Reference: [US-XXX] (if applicable)
  • Base branch: merge-dev2-to-master

Summary

  • What changed and why (scoped strictly to the user story / PR intent)

Verification

  • Builds succeed (scoped to changed projects)
  • Unit tests pass locally
  • Code coverage >= 90% for touched code
  • Codecov upload succeeded (if token configured)
  • TFM verification (net46, net6.0, net8.0) passes (if packaging)
  • No unresolved Copilot comments on HEAD

Copilot Review Loop (Outcome-Based)

Record counts before/after your last push:

  • Comments on HEAD BEFORE: [N]
  • Comments on HEAD AFTER (60s): [M]
  • Final HEAD SHA: [sha]

Files Modified

  • List files changed (must align with scope)

Notes

  • Any follow-ups, caveats, or migration details
Copilot AI review requested due to automatic review settings November 11, 2025 21:16
coderabbitai bot (Contributor) commented Nov 11, 2025

Warning

Rate limit exceeded

@ooples has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 3 minutes and 19 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e713a2b and 81c56f6.

📒 Files selected for processing (9)
  • scripts/add-half-conditional-v2.py (0 hunks)
  • scripts/add-half-conditional.py (0 hunks)
  • scripts/add-half-conditionals.py (0 hunks)
  • scripts/add-half-ifdef.sh (0 hunks)
  • scripts/check-encoding.sh (0 hunks)
  • scripts/fix-encoding.py (0 hunks)
  • scripts/fix-half-conditionals.py (0 hunks)
  • src/Interfaces/IPredictionModelBuilder.cs (1 hunks)
  • src/PredictionModelBuilder.cs (1 hunks)

Summary by CodeRabbit

  • New Features

    • Added mixed-precision training support with configurable loss scaling and dynamic overflow detection
    • Introduced precision modes (FP32, FP16, BF16, FP64) for optimized neural network training
    • Added tensor type casting functionality for precision conversion
    • Enabled Half-precision (FP16) numeric operations for improved training efficiency
  • Tests

    • Added comprehensive test coverage for mixed-precision training workflows
    • Added encoding validation tests to ensure source file integrity

Walkthrough

This pull request introduces comprehensive mixed-precision neural network training support to AiDotNet. Changes include a new PrecisionMode enum, enhanced numeric operation classes with precision-aware conversions, new mixed-precision infrastructure (LossScaler, MixedPrecisionConfig, MixedPrecisionContext, MixedPrecisionTrainingLoop), a Tensor.Cast<TOut>() method for type conversion, and integration of mixed-precision support into core neural network components via fluent configuration.

Changes

Cohort / File(s) / Summary:

  • Enums (src/Enums/PrecisionMode.cs): New public enum with five precision modes (FP32, FP16, Mixed, BF16, FP64), providing a strongly-typed way to specify numeric precision in training (sketch below).
  • Interface Expansions (src/Interfaces/INumericOperations.cs, src/Interfaces/IPredictionModelBuilder.cs): Added a PrecisionBits property and numeric conversion methods (ToFloat, FromFloat, ToDouble, and conditional ToHalf/FromHalf for NET5.0+) to INumericOperations<T>; added a ConfigureMixedPrecision method to IPredictionModelBuilder (sketch below).
  • Numeric Operations (All Types) (src/NumericOperations/FloatOperations.cs, src/NumericOperations/DoubleOperations.cs, src/NumericOperations/ByteOperations.cs, src/NumericOperations/SByteOperations.cs, src/NumericOperations/ShortOperations.cs, src/NumericOperations/Int32Operations.cs, src/NumericOperations/Int64Operations.cs, src/NumericOperations/UInt16Operations.cs, src/NumericOperations/UInt32Operations.cs, src/NumericOperations/UInt64Operations.cs, src/NumericOperations/UIntOperations.cs, src/NumericOperations/DecimalOperations.cs, src/NumericOperations/ComplexOperations.cs): Consistently added the PrecisionBits property and conversion methods (ToFloat, FromFloat, ToDouble; conditionally ToHalf/FromHalf for NET5.0+) across all numeric type operations. Documentation symbols updated for consistency.
  • New Half Operations (src/NumericOperations/HalfOperations.cs): New public class implementing INumericOperations<Half> with full arithmetic, comparison, and conversion support for FP16; conditionally compiled for NET5.0+ (sketch below).
  • Mixed-Precision Infrastructure (src/MixedPrecision/LossScaler.cs): New generic class providing dynamic loss scaling, overflow detection, and gradient unscaling, with state tracking (TotalUpdates, SkippedUpdates, OverflowRate) and configurable growth/backoff behavior (sketch below).
  • Mixed-Precision Configuration (src/MixedPrecision/MixedPrecisionConfig.cs): New configuration class encapsulating mixed-precision settings (loss scale bounds, growth parameters, FP32 compute flags) with factory methods Conservative(), Aggressive(), and NoScaling() (sketch below).
  • Mixed-Precision Context (src/MixedPrecision/MixedPrecisionContext.cs): New class managing FP32 master weights, FP16 working weights, and gradient preparation with unscaling and overflow checking; integrates LossScaler for dynamic scaling (sketch below).
  • Mixed-Precision Training Loop (src/MixedPrecision/MixedPrecisionTrainingLoop.cs): New generic class orchestrating complete mixed-precision training steps: forward pass, FP32 loss computation, loss scaling, backward pass, gradient unscaling with overflow detection, and conditional parameter updates. See the sequence diagram below and the step sketch after it.
  • Core Library Updates (src/LinearAlgebra/Tensor.cs): Added a public Cast<TOut>() method for element-wise type conversion that preserves tensor shape; documentation symbol corrections (× notation) (sketch below).
  • Math Helper & Helpers (src/Helpers/MathHelper.cs): Added support for the Half type in the GetNumericOperations<T>() branching; documentation updates for mathematical notation consistency.
  • Neural Network Integration (src/NeuralNetworks/NeuralNetworkBase.cs): Added a protected _mixedPrecisionContext field and a public IsMixedPrecisionEnabled property; internal methods for lifecycle management.
  • Optimizer Integration (src/Optimizers/GradientBasedOptimizerBase.cs): Added a mixed-precision context field, an IsMixedPrecisionEnabled property, and EnableMixedPrecision/DisableMixedPrecision lifecycle methods; new ApplyGradientsWithMixedPrecision pathway.
  • Builder Integration (src/PredictionModelBuilder.cs): Added a public ConfigureMixedPrecision method and internal wiring to apply the mixed-precision configuration during build; stores MixedPrecisionConfig for propagation to networks and optimizers.
  • Tests – Encoding Verification (tests/AiDotNet.Tests/UnitTests/Encoding/Utf8EncodingTests.cs): New test class verifying UTF-8 encoding integrity across source files, detecting replacement characters (U+FFFD) and flagging encoding corruption (sketch below).
  • Tests – Mixed-Precision Loss Scaler (tests/AiDotNet.Tests/UnitTests/MixedPrecision/LossScalerTests.cs): Comprehensive test suite covering constructor behavior, loss scaling, gradient unscaling, overflow detection, dynamic scaling, boundary enforcement, and state reset (sketch below).
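
The hedged sketches referenced in the table above follow here. First, the precision enum; only the five member names come from the summary, while the ordering and doc comments are assumptions:

```csharp
namespace AiDotNet.Enums;

/// <summary>Numeric precision used for neural network training (sketch).</summary>
public enum PrecisionMode
{
    /// <summary>Single-precision (32-bit) floating point.</summary>
    FP32,
    /// <summary>Half-precision (16-bit) floating point.</summary>
    FP16,
    /// <summary>FP16 compute with FP32 master weights and loss scaling.</summary>
    Mixed,
    /// <summary>bfloat16: 16-bit format with an FP32-sized exponent range.</summary>
    BF16,
    /// <summary>Double-precision (64-bit) floating point.</summary>
    FP64
}
```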
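
The interface expansion reads more easily as code. A sketch of the members reportedly added to INumericOperations<T>, with signatures inferred (not copied) from the summary:

```csharp
public interface INumericOperations<T>
{
    // ... existing arithmetic and comparison members omitted ...

    /// <summary>Bit width of the underlying numeric type (e.g. 32 for float, 16 for Half).</summary>
    int PrecisionBits { get; }

    float ToFloat(T value);
    T FromFloat(float value);
    double ToDouble(T value);

#if NET5_0_OR_GREATER
    Half ToHalf(T value);
    T FromHalf(Half value);
#endif
}
```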
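
For the new HalfOperations class, an excerpt-style sketch of FP16 operations that round-trip through float; the real class implements the full INumericOperations<Half> surface, and everything beyond the member names listed in the summary is assumed:

```csharp
#if NET5_0_OR_GREATER
using System;

public sealed class HalfOperationsSketch
{
    public int PrecisionBits => 16;

    // Arithmetic is performed in float and rounded back to Half, accepting FP16 precision loss.
    public Half Add(Half a, Half b) => (Half)((float)a + (float)b);
    public Half Multiply(Half a, Half b) => (Half)((float)a * (float)b);
    public Half Exp(Half value) => (Half)MathF.Exp((float)value);
    public Half Log(Half value) => (Half)MathF.Log((float)value);

    // Conversions required by the expanded INumericOperations<T> surface.
    public float ToFloat(Half value) => (float)value;
    public Half FromFloat(float value) => (Half)value;
    public double ToDouble(Half value) => (double)value;
    public Half ToHalf(Half value) => value;
    public Half FromHalf(Half value) => value;
}
#endif
```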
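
The LossScaler cohort packs the most behavior. Below is an illustrative sketch of dynamic loss scaling with overflow-driven backoff over a plain float[] buffer; TotalUpdates, SkippedUpdates, OverflowRate, ScaleLoss and UnscaleGradients come from the summary, while the default constants and everything else are assumptions:

```csharp
using System;

public sealed class LossScalerSketch
{
    private readonly double _growthFactor;
    private readonly double _backoffFactor;
    private readonly int _growthInterval;
    private readonly double _minScale;
    private readonly double _maxScale;
    private int _stepsSinceOverflow;

    public double Scale { get; private set; }
    public int TotalUpdates { get; private set; }
    public int SkippedUpdates { get; private set; }
    public double OverflowRate => TotalUpdates == 0 ? 0.0 : (double)SkippedUpdates / TotalUpdates;

    public LossScalerSketch(double initialScale = 65536.0, double growthFactor = 2.0,
                            double backoffFactor = 0.5, int growthInterval = 2000,
                            double minScale = 1.0, double maxScale = 16777216.0)
    {
        Scale = initialScale;
        _growthFactor = growthFactor;
        _backoffFactor = backoffFactor;
        _growthInterval = growthInterval;
        _minScale = minScale;
        _maxScale = maxScale;
    }

    /// <summary>Scales the FP32 loss before backward so small FP16 gradients do not underflow.</summary>
    public float ScaleLoss(float loss) => (float)(loss * Scale);

    /// <summary>Unscales gradients in place; returns false (and backs the scale off) if any value is NaN/Inf.</summary>
    public bool UnscaleGradients(float[] gradients)
    {
        TotalUpdates++;
        bool overflow = false;

        for (int i = 0; i < gradients.Length && !overflow; i++)
        {
            float unscaled = (float)(gradients[i] / Scale);
            if (float.IsNaN(unscaled) || float.IsInfinity(unscaled)) overflow = true;
            else gradients[i] = unscaled;
        }

        if (overflow)
        {
            SkippedUpdates++;
            _stepsSinceOverflow = 0;
            Scale = Math.Max(_minScale, Scale * _backoffFactor);   // back off after overflow
            return false;
        }

        if (++_stepsSinceOverflow >= _growthInterval)
        {
            Scale = Math.Min(_maxScale, Scale * _growthFactor);    // grow after a stable stretch
            _stepsSinceOverflow = 0;
        }
        return true;
    }
}
```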
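
The configuration class can be pictured roughly as follows; only the factory method names Conservative(), Aggressive() and NoScaling() are taken from the summary, and every property name and default value here is an assumption:

```csharp
public sealed class MixedPrecisionConfigSketch
{
    public double InitialLossScale { get; set; } = 65536.0;
    public double MinLossScale { get; set; } = 1.0;
    public double MaxLossScale { get; set; } = 16777216.0;
    public double GrowthFactor { get; set; } = 2.0;
    public double BackoffFactor { get; set; } = 0.5;
    public int GrowthInterval { get; set; } = 2000;
    public bool UseFp32Accumulation { get; set; } = true;

    // Cautious defaults: lower starting scale, slower growth.
    public static MixedPrecisionConfigSketch Conservative() =>
        new() { InitialLossScale = 1024.0, GrowthInterval = 4000 };

    // Higher starting scale and faster growth, at the cost of more skipped steps.
    public static MixedPrecisionConfigSketch Aggressive() =>
        new() { InitialLossScale = 65536.0, GrowthInterval = 500 };

    // Disables dynamic scaling entirely (scale fixed at 1).
    public static MixedPrecisionConfigSketch NoScaling() =>
        new() { InitialLossScale = 1.0, GrowthFactor = 1.0, BackoffFactor = 1.0 };
}
```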
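
A stripped-down sketch of the master/working-weight handling that the context row describes, using flat float[]/Half[] arrays instead of the library's Tensor and Vector types; member names not shown in the summary or the sequence diagram below are assumed:

```csharp
#if NET5_0_OR_GREATER
public sealed class MixedPrecisionContextSketch
{
    private readonly float[] _masterWeights;   // FP32 master copy, used for updates
    private readonly Half[] _workingWeights;   // FP16 copy, used for forward/backward

    public MixedPrecisionContextSketch(float[] initialWeights)
    {
        _masterWeights = (float[])initialWeights.Clone();
        _workingWeights = new Half[initialWeights.Length];
        SyncWorkingWeights();
    }

    /// <summary>Casts the FP32 master weights down to the FP16 working copy.</summary>
    public void SyncWorkingWeights()
    {
        for (int i = 0; i < _masterWeights.Length; i++)
            _workingWeights[i] = (Half)_masterWeights[i];
    }

    /// <summary>Applies already-unscaled FP32 gradients to the master weights, then refreshes the FP16 copy.</summary>
    public void UpdateMasterWeights(float[] gradients, float learningRate)
    {
        for (int i = 0; i < _masterWeights.Length; i++)
            _masterWeights[i] -= learningRate * gradients[i];
        SyncWorkingWeights();
    }

    public Half[] WorkingWeights => _workingWeights;
    public float[] MasterWeights => _masterWeights;
}
#endif
```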
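
Tensor.Cast<TOut>() is essentially an element-wise conversion that preserves shape. A free-standing sketch over a flat array (not the library's actual Tensor<T> code):

```csharp
using System;

public static class TensorCastSketch
{
    /// <summary>Element-wise conversion that keeps the source layout, mirroring a shape-preserving Cast<TOut>().</summary>
    public static TOut[] CastElements<TIn, TOut>(TIn[] source, Func<TIn, TOut> convert)
    {
        var result = new TOut[source.Length];
        for (int i = 0; i < source.Length; i++)
            result[i] = convert(source[i]);
        return result;
    }
}

// Usage (requires .NET 5+ for Half):
//   Half[] fp16 = TensorCastSketch.CastElements(fp32Weights, v => (Half)v);
```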
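
The encoding-verification test is conceptually simple. A minimal xUnit-style sketch that scans source files for the U+FFFD replacement character; the relative path and test names are assumptions:

```csharp
using System.IO;
using Xunit;

public class Utf8EncodingSketchTests
{
    [Fact]
    public void SourceFiles_ContainNoReplacementCharacters()
    {
        // Assumed relative path from the test output directory to the repository's src folder.
        string srcRoot = Path.Combine("..", "..", "..", "..", "..", "src");

        foreach (string file in Directory.EnumerateFiles(srcRoot, "*.cs", SearchOption.AllDirectories))
        {
            string text = File.ReadAllText(file);   // decoded as UTF-8 by default
            Assert.True(text.IndexOf('\uFFFD') < 0, $"Replacement character found in {file}");
        }
    }
}
```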
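
Finally, a couple of xUnit-style cases against the LossScalerSketch above illustrate the behaviors the loss-scaler test row lists (scaling, overflow detection, backoff); the real LossScalerTests target the library's generic LossScaler and will differ in names and setup:

```csharp
using Xunit;

public class LossScalerSketchTests
{
    [Fact]
    public void ScaleLoss_MultipliesByCurrentScale()
    {
        var scaler = new LossScalerSketch(initialScale: 1024.0);
        Assert.Equal(2048f, scaler.ScaleLoss(2f));   // exact: 2 * 1024
    }

    [Fact]
    public void UnscaleGradients_OnOverflow_SkipsUpdateAndBacksOff()
    {
        var scaler = new LossScalerSketch(initialScale: 1024.0, backoffFactor: 0.5);

        bool applied = scaler.UnscaleGradients(new[] { float.PositiveInfinity });

        Assert.False(applied);                 // step is skipped
        Assert.Equal(1, scaler.SkippedUpdates);
        Assert.Equal(512.0, scaler.Scale);     // scale halved after overflow
    }
}
```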

Sequence Diagram

    sequenceDiagram
        participant Client
        participant TrainLoop as MixedPrecisionTrainingLoop
        participant Network as NeuralNetworkBase
        participant LossFunc as ILossFunction
        participant Optimizer as IGradientBasedOptimizer
        participant MPContext as MixedPrecisionContext
        participant LossScaler

        Client->>TrainLoop: TrainStep(input, target)
        TrainLoop->>MPContext: CastWeightsToFP16()
        Note over MPContext: Master Weights (FP32) → Working Weights (FP16)
        TrainLoop->>Network: Forward(inputFP16)
        Network-->>TrainLoop: outputFP16
        TrainLoop->>LossFunc: Calculate(outputFP16, targetFP32)
        LossFunc-->>TrainLoop: lossFP32
        TrainLoop->>LossScaler: ScaleLoss(lossFP32)
        LossScaler-->>TrainLoop: scaledLoss
        TrainLoop->>Network: Backward(scaledLoss)
        Network-->>TrainLoop: gradientsFP16
        TrainLoop->>MPContext: PrepareGradientsForUpdate(gradientsFP16)
        MPContext->>LossScaler: UnscaleGradients(gradientsFP16)
        LossScaler->>LossScaler: DetectOverflow()
        alt Overflow Detected
            LossScaler->>LossScaler: SkippedUpdates++, ReduceScale()
            LossScaler-->>MPContext: false
            MPContext-->>TrainLoop: false
            TrainLoop-->>Client: false (step skipped)
        else No Overflow
            LossScaler-->>MPContext: true, gradientsFP32
            MPContext->>Optimizer: ApplyGradientsWithMixedPrecision()
            Optimizer->>MPContext: UpdateMasterWeights(gradients, learningRate)
            MPContext-->>Optimizer: ✓
            Optimizer-->>TrainLoop: ✓
            LossScaler->>LossScaler: CheckGrowthInterval(), PotentiallyGrowScale()
            TrainLoop-->>Client: true (step applied)
        end
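
Reading the diagram as code may help. The following pseudocode-style C# sketch mirrors the flow above and reuses the LossScalerSketch and MixedPrecisionContextSketch classes from the earlier sketches; the delegate parameters stand in for the real network, loss-function and optimizer interfaces, and only the overall control flow is taken from the diagram:

```csharp
#if NET5_0_OR_GREATER
using System;

public static class MixedPrecisionTrainingLoopSketch
{
    /// <summary>Runs one mixed-precision step; returns false when an overflow forces the step to be skipped.</summary>
    public static bool TrainStep(
        Func<Half[], Half[]> forward,            // FP16 forward pass over the working weights
        Func<Half[], float> computeLossFp32,     // loss computed/accumulated in FP32
        Func<float, float[]> backwardScaled,     // backward pass driven by the scaled loss
        LossScalerSketch scaler,
        MixedPrecisionContextSketch context,
        float learningRate)
    {
        context.SyncWorkingWeights();                          // FP32 master -> FP16 working
        Half[] output = forward(context.WorkingWeights);       // forward in FP16
        float loss = computeLossFp32(output);                  // FP32 loss
        float scaledLoss = scaler.ScaleLoss(loss);             // scale before backward
        float[] gradients = backwardScaled(scaledLoss);        // gradients for the scaled loss

        if (!scaler.UnscaleGradients(gradients))               // overflow: scale backed off, step skipped
            return false;

        context.UpdateMasterWeights(gradients, learningRate);  // FP32 update, then re-sync FP16 copy
        return true;                                           // step applied; scaler may grow after a stable run
    }
}
#endif
```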

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Specific areas requiring extra attention:
    • src/MixedPrecision/LossScaler.cs: Complex overflow detection logic, scale growth/backoff state machine, and gradient unscaling across tensors/vectors; verify boundary conditions (MinScale/MaxScale clamping).
    • src/MixedPrecision/MixedPrecisionContext.cs: Weight casting, gradient preparation flow, and integration with loss scaler; ensure FP32/FP16 conversions preserve correctness and handle edge cases.
    • src/MixedPrecision/MixedPrecisionTrainingLoop.cs: High-level training orchestration; verify T-is-float constraint enforcement and that overflow detection properly halts updates.
    • src/Optimizers/GradientBasedOptimizerBase.cs and src/NeuralNetworks/NeuralNetworkBase.cs: Mixed-precision context lifecycle; ensure enable/disable logic prevents double-initialization and properly cleans up resources.
    • src/NumericOperations/HalfOperations.cs: New FP16 arithmetic implementation; verify precision loss expectations in Exp/Log operations and conversion rounding behavior.
    • All numeric operation files: Verify consistency of conversion implementations (rounding, clamping) across 13 types to avoid cross-type precision inconsistencies.

Possibly related PRs

  • Implement Mixed-Precision Training Architecture #475: Implements the identical mixed-precision training architecture (PrecisionMode enum, HalfOperations, INumericOperations extensions, LossScaler, MixedPrecisionConfig/Context, Tensor.Cast, NeuralNetwork/optimizer builder changes, and tests)—strongly overlapping implementation.
  • Implement production-ready placeholder methods #474: Modifies NeuralNetworkBase.cs by adding new public members (autodiff/feature-extraction); both PRs touch the same base class and may encounter merge conflicts.

Poem

🐰 Half-precision hops with grace,
Loss scales dance through training space,
FP16 weights in working glow,
Master FP32's steady flow,
Mixed precision, swift and true—
Neural nets now lighter too!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
  • Description check: ⚠️ Warning. The description is a template with placeholders and provides no actual information about what changed or why, making it completely unrelated to the changeset. Resolution: fill in the template with specific details about the mixed-precision training architecture changes, verification status, and any relevant notes about the implementation.
  • Title check: ❓ Inconclusive. The title is vague and appears to contain auto-generated identifiers, making it unclear what the primary change is without reading the detailed summary. Resolution: replace it with a clear, descriptive title that summarizes the main change, e.g. "Add mixed-precision training architecture support" or "Implement mixed-precision training with FP16/FP32 support".
✅ Passed checks (1 passed)
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot finished reviewing on behalf of ooples November 11, 2025 21:19
@ooples ooples force-pushed the claude/mixed-precision-training-architecture-011CV13wmJSAx6sGj6Ryu37k branch from 0501169 to e8b3fb5 on November 11, 2025 21:22
Enhanced ConfigureMixedPrecision() documentation in both PredictionModelBuilder.cs and IPredictionModelBuilder.cs to clearly explain the technical constraints:

  1. Type constraint: float only. Mixed-precision converts between FP32 (float) and FP16 (Half); it cannot be used with double, decimal, or integer types.
  2. Gradient-based optimizers only. Requires gradient computation for loss scaling, master weights, and gradient accumulation; does NOT work with non-gradient methods (genetic algorithms, random search, Bayesian optimization).
  3. Neural networks (recommended). Best suited for networks with large parameter counts; requires a GPU with Tensor Core support for the 2-3x speedup; provides 50% memory reduction for massive models.

Also removed temporary development scripts from the scripts/ directory:

  • add-half-conditional*.py (conditional compilation helpers)
  • add-half-ifdef.sh (development utility)
  • check-encoding.sh (encoding validation)
  • fix-encoding.py (encoding repair)
  • fix-half-conditionals.py (development utility)

These were accidentally committed during the Half type conditional compilation work. The launch-distributed-training scripts were kept as they are part of the public API.
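
Given those constraints (float-only, gradient-based optimizer, neural network model), a builder call would look roughly like the fragment below. This is a hypothetical illustration: apart from ConfigureMixedPrecision and the MixedPrecisionConfig factory methods named in this PR, the generic signature, the other Configure* methods, and the placeholder variables are assumptions, not verified API.

```csharp
// Illustrative fragment only: ConfigureMixedPrecision is the method named in this PR;
// the builder's generic signature, the other Configure* calls, and the variables
// `network` and `gradientOptimizer` are hypothetical placeholders.
var builder = new PredictionModelBuilder<float>();            // float only: converts FP32 <-> FP16 (Half)

builder
    .ConfigureModel(network)                                   // neural networks benefit most (large parameter counts)
    .ConfigureOptimizer(gradientOptimizer)                      // must be gradient-based; GA/random search/Bayesian won't work
    .ConfigureMixedPrecision(MixedPrecisionConfig.Conservative());
```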
@ooples ooples force-pushed the claude/mixed-precision-training-architecture-011CV13wmJSAx6sGj6Ryu37k branch from e8b3fb5 to 81c56f6 on November 11, 2025 21:29
Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@ooples ooples merged commit f1af25b into master Nov 11, 2025
5 checks passed
@ooples ooples deleted the claude/mixed-precision-training-architecture-011CV13wmJSAx6sGj6Ryu37k branch November 11, 2025 21:32
@coderabbitai coderabbitai bot mentioned this pull request Nov 13, 2025
7 tasks

Labels

None yet

3 participants