Hybrid reinforcement learning and minimax agent for the game Tablut. Combines PPO-trained value networks with alpha-beta search for competitive play.
reinforcement-learning minimax alpha-beta-pruning game-ai game-playing-agent proximal-policy-optimization ppo value-network tablut self-play stable-baselines3
Updated Nov 23, 2025 - HTML
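The description pairs a learned value network with alpha-beta search. As a rough illustration of that idea, and not the repository's actual code, the sketch below runs depth-limited alpha-beta over a toy game tree and calls a pluggable `value_fn` at the search frontier, which is where a PPO-trained value network would plug in. The names `alpha_beta`, `toy_value`, and the nested-list tree are hypothetical, chosen only for this example; no Tablut rules or Stable-Baselines3 calls are modeled.

```python
import math
from typing import Callable, List, Union

# Toy game tree (assumption for illustration): a list is an internal node,
# a number is a terminal position scored from the maximizing player's view.
Node = Union[float, List["Node"]]


def alpha_beta(
    state: Node,
    depth: int,
    alpha: float,
    beta: float,
    maximizing: bool,
    value_fn: Callable[[Node], float],
) -> float:
    """Depth-limited minimax with alpha-beta pruning.

    value_fn is called on terminal states and on states at the depth cutoff;
    in a hybrid agent this is where a learned value estimate would be used.
    """
    if depth == 0 or not isinstance(state, list):
        return value_fn(state)

    if maximizing:
        best = -math.inf
        for child in state:
            best = max(best, alpha_beta(child, depth - 1, alpha, beta, False, value_fn))
            alpha = max(alpha, best)
            if alpha >= beta:  # opponent already has a better option elsewhere
                break
        return best

    best = math.inf
    for child in state:
        best = min(best, alpha_beta(child, depth - 1, alpha, beta, True, value_fn))
        beta = min(beta, best)
        if alpha >= beta:  # maximizer already has a better option elsewhere
            break
    return best


evaluated: List[Node] = []  # records which states the evaluator was asked about


def toy_value(state: Node) -> float:
    """Hypothetical stand-in for a trained value network."""
    evaluated.append(state)
    if isinstance(state, list):
        return 0.0  # neutral guess for unfinished positions at the cutoff
    return float(state)


tree: Node = [[3, 5], [2, 9]]
best = alpha_beta(tree, 2, -math.inf, math.inf, True, toy_value)
print(best, evaluated)
```

In this toy tree the leaf `9` is never evaluated: once the second branch yields `2`, it cannot beat the `3` already guaranteed by the first branch, so the search prunes it. Fewer evaluator calls matter more when each call is a neural network forward pass.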