Goal: Evaluate SuperNet Architecture for Better Neural Network Integration
Investigate whether SuperNet (DARTS-based Neural Architecture Search) should inherit from NeuralNetworkBase to align with the library's neural network architecture patterns.
Related: PR #393
Current State
SuperNet is located in src/NeuralNetworks/ but implements IFullModel<T, Tensor<T>, Tensor<T>> directly:
Current Implementation:
- ✅ Implements IFullModel interface
- ✅ Provides NumOps via MathHelper.GetNumericOperations()
- ✅ Returns ModelType.NeuralNetwork
- ✅ Located in AiDotNet.NeuralNetworks namespace
- ❌ Does NOT inherit from NeuralNetworkBase
- ❌ Does NOT use layer-based architecture (List<ILayer>)
- ❌ Does NOT use NeuralNetworkArchitecture
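For reference, the class's current shape can be condensed into a sketch like the following. The namespace, interface, NumOps source, ModelType, and SearchSpace are taken from the points above; the field and method names (and INumericOperations) are illustrative assumptions, not the real code:

```csharp
using System;
using System.Collections.Generic;

namespace AiDotNet.NeuralNetworks;

// Condensed sketch of the current design. Only the traits listed above are
// confirmed; member names and signatures here are illustrative assumptions.
public class SuperNet<T> : IFullModel<T, Tensor<T>, Tensor<T>>
{
    // Duplicated initialization that a base class would otherwise supply
    // (generic form of MathHelper.GetNumericOperations assumed).
    private readonly INumericOperations<T> _numOps = MathHelper.GetNumericOperations<T>();

    // DARTS-specific state: no List<ILayer>, no NeuralNetworkArchitecture.
    private SearchSpace _searchSpace;                        // stands in for NeuralNetworkArchitecture
    private Tensor<T> _alphas;                               // architecture parameters (alpha matrices)
    private Dictionary<string, Tensor<T>> _operationWeights; // separate per-operation weight dictionaries

    public ModelType GetModelType() => ModelType.NeuralNetwork; // member name assumed

    public Tensor<T> Predict(Tensor<T> input)
        => throw new NotImplementedException(); // mixed-operation forward pass in the real class

    // ...remaining IFullModel members elided...
}
```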
Why Different: SuperNet is fundamentally an architecture search framework (DARTS), not a traditional neural network:
- Maintains architecture parameters (alpha matrices) that weight different operations
- Contains multiple operation types with separate weight dictionaries
- Uses SearchSpace instead of NeuralNetworkArchitecture
- Learns WHICH architecture to use (meta-learning), not just network weights
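Concretely, DARTS relaxes the discrete choice of operation on each edge into a softmax(alpha)-weighted mixture, so the alphas can be learned by gradient descent. A minimal, self-contained illustration of that core computation (scalar stand-ins, not the library's API):

```csharp
using System;
using System.Linq;

static class DartsMixedOpDemo
{
    // Candidate operations on one edge; scalar stand-ins for conv/pool/skip ops.
    static readonly Func<double, double>[] Ops =
    {
        x => x,        // identity / skip-connect
        x => 2.0 * x,  // stand-in for a parameterized operation
        x => 0.0,      // "zero" op, which effectively prunes the edge
    };

    // Mixed edge output: o_bar(x) = sum_k softmax(alpha)_k * op_k(x).
    static double MixedForward(double x, double[] alphas)
    {
        double[] exp = alphas.Select(Math.Exp).ToArray();
        double sum = exp.Sum();
        return Ops.Select((op, k) => exp[k] / sum * op(x)).Sum();
    }

    static void Main()
    {
        // Searching means optimizing the alphas on validation loss; after the
        // search, each edge keeps only its argmax operation.
        double[] alphas = { 0.5, 1.5, -1.0 };
        Console.WriteLine(MixedForward(3.0, alphas)); // weighted mix of the three ops
    }
}
```

Keeping the alphas and the per-operation weights as two distinct trainable sets is precisely what does not map onto a flat List<ILayer>.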
Architectural Question
Should SuperNet inherit from NeuralNetworkBase? Two approaches:
Option 1: Keep Current Design (Recommended for Now)
Rationale:
- SuperNet serves a different purpose than traditional networks
- DARTS requires unique data structures (alpha parameters, operation dictionaries)
- Forcing layer-based architecture would complicate the search algorithm
- SuperNet is a meta-model that searches for architectures, not a trainable network itself
Trade-offs:
- Pro: Clean separation of concerns
- Pro: No forced abstraction violations
- Con: Doesn't leverage NeuralNetworkBase utilities (if any)
- Con: Some code duplication (NumOps initialization, etc.)
Option 2: Inherit from NeuralNetworkBase (Requires Refactoring)
Rationale:
- Consistency with other neural network implementations
- Could leverage shared base class functionality
- Better type hierarchy for neural network models
Requirements:
- Adapt SuperNet to use layer-based architecture
- Map DARTS operations to ILayer implementations (see the sketch after this list)
- Convert SearchSpace to NeuralNetworkArchitecture
- Ensure alpha parameters integrate with layer weights
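One hypothetical shape for that mapping. Nothing here is existing library API: ILayer's real contract (generic arity, member names) may differ from what is assumed below:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical adapter wrapping one DARTS mixed operation as a layer.
// ILayer's generic arity and members are assumptions, not the library's contract.
public class MixedOperationLayer<T> : ILayer<T>
{
    private readonly IReadOnlyList<ILayer<T>> _candidates; // one layer per candidate operation
    private readonly Tensor<T> _alphas;                    // architecture parameters for this edge

    public MixedOperationLayer(IReadOnlyList<ILayer<T>> candidates, Tensor<T> alphas)
    {
        _candidates = candidates;
        _alphas = alphas;
    }

    public Tensor<T> Forward(Tensor<T> input)
    {
        // Would compute the softmax(alpha)-weighted sum of candidate outputs.
        // The alphas must then be exposed to the trainer alongside ordinary layer
        // weights -- exactly the abstraction-leak risk flagged below.
        throw new NotImplementedException();
    }
}
```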
Trade-offs:
- Pro: Consistent architecture across all neural networks
- Pro: Potential code reuse from base class
- Con: Significant refactoring required
- Con: Might force unnatural abstractions
- Con: DARTS algorithm complexity increases
Proposed Investigation Steps
1. Audit NeuralNetworkBase Functionality:
- What shared utilities does it provide?
- Do any apply to SuperNet's use case?
- Are there benefits beyond consistency?
2. Evaluate Refactoring Complexity:
- Estimate effort to map DARTS to layer-based architecture
- Identify potential abstraction leaks
- Assess impact on DARTS algorithm clarity
3. Benchmark Current vs. Proposed:
- Compare memory usage
- Compare training performance
- Compare code maintainability
4. Decision Criteria:
- If NeuralNetworkBase provides significant reusable functionality → Refactor
- If primarily cosmetic consistency → Keep current design
- If architecture search benefits from separation → Keep current design
Questions to Answer
- What functionality does NeuralNetworkBase provide that SuperNet could benefit from?
- Can DARTS operations be cleanly mapped to ILayer without violating the single-responsibility principle (SRP)?
- Does inheriting from NeuralNetworkBase improve or hinder SuperNet's maintainability?
- Are there other architecture search methods (ENAS, NASNet) that would benefit from a shared base?
- Should there be a separate NeuralArchitectureSearchBase<T> instead?
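On that last question, a speculative outline of what such a base might own. Nothing below exists in the library today, and every member name is an assumption:

```csharp
// Speculative sketch of a shared NAS base class (does not exist in AiDotNet).
// It would own what DARTS/ENAS-style searchers have in common -- architecture
// parameters and a derivation step -- without forcing a layer-based design.
public abstract class NeuralArchitectureSearchBase<T>
{
    // Would implement IFullModel<T, Tensor<T>, Tensor<T>>, as SuperNet does today.

    // Shared NumOps initialization, removing the duplication noted under Option 1
    // (generic form of MathHelper.GetNumericOperations assumed).
    protected readonly INumericOperations<T> NumOps = MathHelper.GetNumericOperations<T>();

    // Architecture parameters (e.g., DARTS alpha matrices) learned during search.
    public abstract Tensor<T> GetArchitectureParameters();

    // Collapse the search result into a concrete, trainable network
    // (NeuralNetworkBase's generic arity assumed).
    public abstract NeuralNetworkBase<T> DeriveArchitecture();
}
```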
Recommended Action
Short-term: Keep SuperNet's current design (IFullModel implementation)
- No immediate changes required
- Focus on correctness and functionality
- Document architectural decision
Long-term: After the library matures, revisit based on:
- User feedback on consistency expectations
- Discovery of shared functionality opportunities
- Addition of more NAS methods (if they share patterns)
References
- DARTS Paper: https://arxiv.org/abs/1806.09055
- SuperNet Location: src/NeuralNetworks/SuperNet.cs
- NeuralNetworkBase: src/NeuralNetworks/NeuralNetworkBase.cs
- Related Discussion: review comments on PR #393 ("Work on issue 309 and gather info")