DEV Community

mahmoudabbasi

🧠 Analyzing SOLID Principles in an Epsilon-Greedy Recommender (Java)

In this post, we’ll take a simple implementation of an Epsilon-Greedy Recommender in Java and check whether it follows the SOLID principles. Then, we’ll see how to refactor it for better maintainability, extensibility, and testability.

*The Example Code*

    import java.util.Random;

    public class EpsilonGreedyRecommender {
        private int nItems;
        private double epsilon;
        private int[] counts;     // how many times each item has been updated
        private double[] values;  // running mean reward per item
        private Random random;

        public EpsilonGreedyRecommender(int nItems, double epsilon) {
            this.nItems = nItems;
            this.epsilon = epsilon;
            this.counts = new int[nItems];
            this.values = new double[nItems];
            this.random = new Random();
        }

        public int recommend() {
            // Explore: with probability epsilon, pick a random item.
            if (random.nextDouble() < epsilon) {
                return random.nextInt(nItems);
            }
            // Exploit: otherwise pick the item with the highest estimated value.
            int bestIndex = 0;
            for (int i = 1; i < nItems; i++) {
                if (values[i] > values[bestIndex]) {
                    bestIndex = i;
                }
            }
            return bestIndex;
        }

        public void update(int item, double reward) {
            counts[item]++;
            // Incremental mean: new = old + (reward - old) / n
            values[item] += (reward - values[item]) / counts[item];
        }

        public double[] getValues() { return values; }

        public int[] getCounts() { return counts; }
    }

  1. SRP – Single Responsibility Principle

📖 Definition:
A class should have only one reason to change – it should have a single responsibility.

🔍 Analysis:
This class is doing multiple things:

  • Storing the bandit state (counts, values)
  • Implementing the selection policy (recommend())
  • Updating statistics (update())

This means any change to the policy logic, or to how state is stored, requires modifying the same class.

❌ Verdict: SRP is violated – state, selection policy, and update logic all live in one class.

  2. OCP – Open/Closed Principle

📖 Definition:
Classes should be open for extension but closed for modification.

🔍 Analysis:
If we want to switch to a different policy (e.g., Softmax, UCB), we would have to edit the recommend() method directly.
Better design: define a SelectionPolicy interface and plug in different implementations.

❌ Verdict: OCP is violated – adding new policies requires modifying the class.

  3. LSP – Liskov Substitution Principle

📖 Definition:
Subtypes must be substitutable for their base types without changing program correctness.

🔍 Analysis:
The class neither extends a base class nor is extended by one, so there is no subtype relationship that could break substitutability.

✅ Verdict: LSP is not violated (it is trivially satisfied).

  4. ISP – Interface Segregation Principle

📖 Definition:
Clients should not be forced to depend on interfaces they do not use.

🔍 Analysis:
There are no interfaces here at all, so no client is forced to implement or depend on methods it does not need.

✅ Verdict: ISP is not violated.

  5. DIP – Dependency Inversion Principle

📖 Definition:
Depend on abstractions, not on concrete implementations.

🔍 Analysis:
The class creates its own Random instance. This is a direct dependency on a concrete class, which makes testing harder (no way to inject a predictable RNG).

Better design: inject Random as a dependency via the constructor (or use an interface).

❌ Verdict: DIP is violated – we depend on a concrete Random implementation.
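To make the constructor-injection idea concrete, here is a minimal sketch. The names `SeededDemo`, `InjectableRecommender`, and `firstPicks` are illustrative assumptions, not part of the original class, and the recommender is trimmed down to the selection logic only. It shows that once the caller supplies the `Random`, a fixed seed makes the sequence of recommendations reproducible.

```java
import java.util.Arrays;
import java.util.Random;

class SeededDemo {
    // Hypothetical, trimmed-down recommender that receives its RNG from the
    // caller instead of creating one internally.
    static class InjectableRecommender {
        private final double epsilon;
        private final double[] values;
        private final Random random;

        InjectableRecommender(int nItems, double epsilon, Random random) {
            this.epsilon = epsilon;
            this.values = new double[nItems];
            this.random = random; // injected dependency
        }

        int recommend() {
            if (random.nextDouble() < epsilon) {
                return random.nextInt(values.length); // explore
            }
            int best = 0;
            for (int i = 1; i < values.length; i++) {
                if (values[i] > values[best]) {
                    best = i;
                }
            }
            return best; // exploit
        }
    }

    // Collects the first n recommendations produced with the given seed.
    static int[] firstPicks(long seed, int n) {
        InjectableRecommender recommender =
                new InjectableRecommender(5, 0.5, new Random(seed));
        int[] picks = new int[n];
        for (int i = 0; i < n; i++) {
            picks[i] = recommender.recommend();
        }
        return picks;
    }

    public static void main(String[] args) {
        // Same seed, same sequence: a test can now assert on exact picks.
        System.out.println(Arrays.equals(firstPicks(42L, 20), firstPicks(42L, 20)));
    }
}
```

With the original class, the same check would be impossible to write, because each instance builds its own unseeded `Random`.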

Summary Table
| Principle | Status | Notes |
|-----------|--------|-------|
| SRP | ❌ | Multiple responsibilities (state + policy + update logic) |
| OCP | ❌ | Cannot add new policies without modifying code |
| LSP | ✅ | No inheritance, no violation |
| ISP | ✅ | No large interfaces, no violation |
| DIP | ❌ | Direct dependency on Random, hard to test |

Refactored Design

Let’s refactor the code to follow **SOLID**:

  • Introduce a SelectionPolicy interface (Strategy Pattern)
  • Inject Random from outside to improve testability

*Step 1: Define the Policy Interface*

    public interface SelectionPolicy {
        // Returns the index of the item to recommend, given the current value estimates.
        int select(double[] values);
    }

*Step 2: Implement Epsilon-Greedy Policy*

    import java.util.Random;

    public class EpsilonGreedyPolicy implements SelectionPolicy {
        private final double epsilon;
        private final Random random;

        public EpsilonGreedyPolicy(double epsilon, Random random) {
            this.epsilon = epsilon;
            this.random = random; // injected, so tests can supply a seeded RNG
        }

        @Override
        public int select(double[] values) {
            int nItems = values.length;
            // Explore: with probability epsilon, pick a random item.
            if (random.nextDouble() < epsilon) {
                return random.nextInt(nItems);
            }
            // Exploit: otherwise pick the item with the highest estimated value.
            int bestIndex = 0;
            for (int i = 1; i < nItems; i++) {
                if (values[i] > values[bestIndex]) {
                    bestIndex = i;
                }
            }
            return bestIndex;
        }
    }

*Step 3: Make the Bandit Class Focus on State*

    public class Bandit {
        private final int[] counts;     // how many times each item has been updated
        private final double[] values;  // running mean reward per item
        private final SelectionPolicy policy;

        public Bandit(int nItems, SelectionPolicy policy) {
            this.counts = new int[nItems];
            this.values = new double[nItems];
            this.policy = policy;
        }

        public int recommend() {
            // Selection is delegated entirely to the injected policy.
            return policy.select(values);
        }

        public void update(int item, double reward) {
            counts[item]++;
            // Incremental mean: new = old + (reward - old) / n
            values[item] += (reward - values[item]) / counts[item];
        }
    }
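To see how the three pieces fit together, here is a hedged, self-contained sketch that wires them up and runs a simulated session. The wrapper class `BanditDemo`, the `simulate` helper, the `valueOf` accessor, the click probabilities, and the seeds are all assumptions made for this illustration. The nested types are compact copies of the classes from the steps above so the example compiles on its own.

```java
import java.util.Random;

class BanditDemo {
    interface SelectionPolicy {
        int select(double[] values);
    }

    static class EpsilonGreedyPolicy implements SelectionPolicy {
        private final double epsilon;
        private final Random random;

        EpsilonGreedyPolicy(double epsilon, Random random) {
            this.epsilon = epsilon;
            this.random = random;
        }

        @Override
        public int select(double[] values) {
            if (random.nextDouble() < epsilon) {
                return random.nextInt(values.length);
            }
            int best = 0;
            for (int i = 1; i < values.length; i++) {
                if (values[i] > values[best]) {
                    best = i;
                }
            }
            return best;
        }
    }

    static class Bandit {
        private final int[] counts;
        private final double[] values;
        private final SelectionPolicy policy;

        Bandit(int nItems, SelectionPolicy policy) {
            this.counts = new int[nItems];
            this.values = new double[nItems];
            this.policy = policy;
        }

        int recommend() {
            return policy.select(values);
        }

        void update(int item, double reward) {
            counts[item]++;
            values[item] += (reward - values[item]) / counts[item];
        }

        // Hypothetical accessor, added only so the demo can inspect estimates.
        double valueOf(int item) {
            return values[item];
        }
    }

    // Simulates `rounds` recommendations against made-up click probabilities.
    static Bandit simulate(double[] clickProbabilities, int rounds, long seed) {
        Random policyRng = new Random(seed);
        Random environmentRng = new Random(seed + 1);
        Bandit bandit = new Bandit(clickProbabilities.length,
                new EpsilonGreedyPolicy(0.1, policyRng));
        for (int t = 0; t < rounds; t++) {
            int item = bandit.recommend();
            // Reward is 1.0 for a simulated click, 0.0 otherwise.
            double reward = environmentRng.nextDouble() < clickProbabilities[item] ? 1.0 : 0.0;
            bandit.update(item, reward);
        }
        return bandit;
    }

    public static void main(String[] args) {
        Bandit bandit = simulate(new double[] {0.2, 0.5, 0.8}, 5000, 7L);
        for (int i = 0; i < 3; i++) {
            System.out.printf("item %d: estimated value %.3f%n", i, bandit.valueOf(i));
        }
    }
}
```

Because both random sources are seeded, repeated runs produce the same estimates, which is what makes the refactored design convenient to test.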

✅ Now:

  • SRP is respected → Bandit only manages state, EpsilonGreedyPolicy only handles selection.
  • OCP is respected → We can add new policies without touching Bandit.
  • DIP is respected → Bandit depends on the SelectionPolicy abstraction, and Random is injected, so tests can pass a seeded or stubbed RNG.
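As an illustration of the OCP point, a new policy can be added as a separate class. The sketch below is one possible Softmax policy. The class names `SoftmaxDemo` and `SoftmaxPolicy` and the `temperature` parameter are assumptions for this example, not code from the post. The `SelectionPolicy` interface is repeated so the block compiles on its own.

```java
import java.util.Random;

class SoftmaxDemo {
    interface SelectionPolicy {
        int select(double[] values);
    }

    // Hypothetical policy: samples items with probability proportional to
    // exp(value / temperature). Lower temperature means greedier choices.
    static class SoftmaxPolicy implements SelectionPolicy {
        private final double temperature;
        private final Random random;

        SoftmaxPolicy(double temperature, Random random) {
            if (temperature <= 0) {
                throw new IllegalArgumentException("temperature must be positive");
            }
            this.temperature = temperature;
            this.random = random;
        }

        @Override
        public int select(double[] values) {
            // Subtract the maximum before exponentiating to avoid overflow.
            double max = values[0];
            for (double v : values) {
                max = Math.max(max, v);
            }
            double[] weights = new double[values.length];
            double total = 0.0;
            for (int i = 0; i < values.length; i++) {
                weights[i] = Math.exp((values[i] - max) / temperature);
                total += weights[i];
            }
            // Sample an index by walking the cumulative weights.
            double threshold = random.nextDouble() * total;
            double cumulative = 0.0;
            for (int i = 0; i < values.length; i++) {
                cumulative += weights[i];
                if (threshold < cumulative) {
                    return i;
                }
            }
            return values.length - 1; // guard against floating-point rounding
        }
    }

    public static void main(String[] args) {
        SelectionPolicy policy = new SoftmaxPolicy(0.1, new Random(3L));
        double[] estimates = {0.1, 0.9, 0.3};
        int[] picks = new int[estimates.length];
        for (int t = 0; t < 1000; t++) {
            picks[policy.select(estimates)]++;
        }
        for (int i = 0; i < picks.length; i++) {
            System.out.println("item " + i + " picked " + picks[i] + " times");
        }
    }
}
```

Switching the recommender to this policy would only change the constructor call, for example `new Bandit(nItems, new SoftmaxPolicy(0.1, new Random()))`. The Bandit class itself stays untouched.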

Key Takeaways

  • Applying SOLID makes your code easier to extend and maintain.
  • Using interfaces and dependency injection helps make your code testable and more robust.
  • Even small classes can benefit from SOLID – especially if you expect the algorithm to evolve over time.

💡 What do you think? Would you keep the state and policy together for small projects, or always split them like this?
