DEV Community

⚡ Hogwarts Spell Caster: Real-Time Voice Magic with AssemblyAI Universal-Streaming

This is a submission for the AssemblyAI Voice Agents Challenge

What I Built

I created a real-time voice-controlled spell casting system that transforms spoken Harry Potter spells into instant keyboard commands. This project addresses the Real-Time Performance category by achieving ultra-low latency voice recognition for gaming applications where every millisecond matters.

The system recognizes over 30 different spells (like "Lumos", "Wingardium Leviosa", "Stupefy") and instantly triggers the corresponding game actions through keyboard shortcuts. It uses fuzzy matching to handle pronunciation variations and processes partial transcripts so it can respond before the speaker has finished, which suits immersive gaming experiences.

Demo

🎥 YouTube Demo Video - Watch the spell casting in action!

Key features demonstrated:

  • ⚡ Sub-300ms response time from speech to action
  • 🎯 Accurate recognition of complex spell names
  • 🔄 Handles pronunciation variations and partial words
  • 🛡️ Smart spam prevention for rapid casting
  • 🎮 Seamless integration with game controls

GitHub Repository

⚡ Hogwarts Spell Caster

A real-time voice-controlled spell casting system that transforms spoken Harry Potter spells into instant keyboard commands using AssemblyAI's Ultra-Fast Universal-Streaming technology. Cast spells with your voice and watch them trigger game actions in under 300ms!

🎯 Features

  • Ultra-Low Latency: Sub-300ms response time from speech to action
  • 🎭 30+ Harry Potter Spells: Complete spell repertoire from the wizarding world
  • 🧠 Intelligent Recognition: Advanced fuzzy matching handles pronunciation variations
  • 🚀 Partial Processing: Acts on incomplete words for instant response
  • 🛡️ Spam Prevention: Smart cooldowns prevent accidental rapid-fire casting
  • 🎮 Gaming Ready: Direct keyboard integration for seamless game control
  • 🔧 Optimized Performance: Pre-computed variations and early-exit logic

🎬 Demo

Hogwarts Spell Caster Demo

Click to watch the magic in action!

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • Microphone access
  • AssemblyAI API key (free tier includes $50 in credits)

Installation

  1. Clone the

Technical Implementation & AssemblyAI Integration

Core Architecture

The system leverages AssemblyAI's Universal-Streaming technology with aggressive optimization for minimal latency:

    # Optimized streaming parameters for ultra-low latency
    client.connect(
        StreamingParameters(
            sample_rate=16000,
            format_turns=True,
            # Aggressive turn detection for faster response
            end_of_turn_confidence_threshold=0.5,  # Lower threshold for faster detection
            min_end_of_turn_silence_when_confident=100,  # Reduced from 160ms
            max_turn_silence=1500,  # Reduced from 2400ms
        )
    )

Real-Time Processing Innovation

The key innovation is dual-layer processing that handles both partial and complete transcripts:

    def on_turn(self: Type[StreamingClient], event: TurnEvent):
        transcript = event.transcript
        is_partial = not event.end_of_turn

        if is_partial:
            # Process partial transcript for immediate response (min 4 characters)
            if len(transcript) >= 4:
                print(f"👂 Partial: {transcript}")
                if process_transcript(transcript, confidence_threshold=0.8, is_partial=True):
                    print("✨ Spell cast from partial transcript!")
        else:
            # Process complete transcript with lower threshold
            print(f"🗣️ Complete: {transcript}")
            process_transcript(transcript, confidence_threshold=0.6, is_partial=False)
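
The handler above hands every transcript to `process_transcript`, which the post does not show. Here is a minimal sketch of what such a function might look like. The name `process_transcript_sketch`, the `SPELL_KEYS` bindings, and the injected `press_key` callback are all assumptions for illustration, and `difflib.get_close_matches` stands in for whatever scoring the project actually uses.

```python
from difflib import get_close_matches
from typing import Callable, Dict

# Hypothetical spell-to-key bindings; the project's real mapping is not shown in the post.
SPELL_KEYS: Dict[str, str] = {
    "lumos": "1",
    "stupefy": "2",
    "wingardium leviosa": "3",
}


def process_transcript_sketch(
    transcript: str,
    confidence_threshold: float,
    is_partial: bool,  # unused here; kept only to mirror the call sites above
    press_key: Callable[[str], None],
) -> bool:
    """Press the key bound to the closest spell if it scores at least the threshold."""
    text = transcript.lower().strip(" .,!?")
    # get_close_matches returns at most n candidates whose similarity ratio >= cutoff.
    matches = get_close_matches(text, list(SPELL_KEYS), n=1, cutoff=confidence_threshold)
    if not matches:
        return False
    press_key(SPELL_KEYS[matches[0]])
    return True
```

With these bindings, the half-spoken "wingardium" scores about 0.71 against "wingardium leviosa", so it is rejected at the stricter partial threshold of 0.8 but would be accepted at the final-transcript threshold of 0.6. That is one plausible reason to keep partials on a higher bar: they fire early only when the match is already close.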

Intelligent Spell Matching

I implemented an optimized fuzzy matching system that prioritizes speed:

    def optimized_fuzzy_match(text, spell_list, threshold=0.6):
        text = text.lower().strip()

        # First, try exact matches in pre-computed variations
        if text in SPELL_VARIATIONS:
            return SPELL_VARIATIONS[text]

        # Quick substring check for common patterns
        for spell in spell_list:
            if spell in text or text in spell:
                if len(text) >= len(spell) * 0.7:  # At least 70% of spell length
                    return spell

        # Fallback to SequenceMatcher only when needed
        # ... fuzzy matching logic
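
The snippet above elides the final `SequenceMatcher` step. As a rough illustration of what that fallback could look like (not the project's actual code), here is a sketch that scores the text against every spell and keeps the best candidate only if it clears the threshold. The function name `fuzzy_fallback` is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def fuzzy_fallback(text: str, spell_list: Iterable[str], threshold: float = 0.6) -> Optional[str]:
    """Return the spell most similar to `text`, or None if nothing reaches `threshold`."""
    text = text.lower().strip()
    best_spell, best_ratio = None, 0.0
    for spell in spell_list:
        # ratio() is 2*M/T: matched characters over the combined length of both strings.
        ratio = SequenceMatcher(None, text, spell).ratio()
        if ratio > best_ratio:
            best_spell, best_ratio = spell, ratio
    return best_spell if best_ratio >= threshold else None
```

For example, `fuzzy_fallback("stupify", ["lumos", "stupefy", "accio"])` resolves to `"stupefy"`, while an unrelated phrase such as "open the door" yields `None`. Because this loop compares against every spell, it makes sense to run it only after the cheaper dictionary and substring checks have failed.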

Performance Optimizations

  1. Pre-computed Spell Variations: Common spell variations are cached for instant lookup
  2. Spam Prevention: Prevents accidental rapid-fire casting with time-based cooldowns
  3. Early Exit Logic: Avoids expensive fuzzy matching when exact matches are found
  4. Partial Processing: Acts on partial transcripts for sub-300ms response times
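
The first two items in the list above can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's implementation: the `MISHEARINGS` table, the `build_variations` helper, the `Cooldown` class, and the one-second default are all hypothetical.

```python
import time
from typing import Callable, Dict, List

# Hypothetical mishearings; the project's actual variation table is not shown in the post.
MISHEARINGS: Dict[str, List[str]] = {
    "lumos": ["loomos", "lumus"],
    "stupefy": ["stupify", "stupid fly"],
}


def build_variations(table: Dict[str, List[str]]) -> Dict[str, str]:
    """Flatten {spell: [variants]} into {variant: spell} so lookups are a single dict hit."""
    lookup: Dict[str, str] = {}
    for spell, variants in table.items():
        lookup[spell] = spell
        for variant in variants:
            lookup[variant] = spell
    return lookup


# Built once at import time, so no per-transcript work is needed for known variants.
SPELL_VARIATIONS = build_variations(MISHEARINGS)


class Cooldown:
    """Allow each spell at most once per `seconds` (assumed value, not from the post)."""

    def __init__(self, seconds: float = 1.0, clock: Callable[[], float] = time.monotonic):
        self.seconds = seconds
        self.clock = clock
        self._last_cast: Dict[str, float] = {}

    def ready(self, spell: str) -> bool:
        now = self.clock()
        last = self._last_cast.get(spell)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down; do not reset the timer
        self._last_cast[spell] = now
        return True
```

A per-spell cooldown like this also keeps a spell that already fired from a partial transcript from firing again when the complete transcript for the same utterance arrives a moment later. The clock is injectable so the behaviour can be checked without real waiting.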

AssemblyAI Features Utilized

  • Universal-Streaming: Core real-time transcription with 300ms latency
  • Turn Detection: Intelligent endpointing for natural speech flow
  • High Accuracy: Handles complex fantasy terminology and pronunciation variations
  • Partial Transcripts: Enables immediate response without waiting for complete utterances

Results

The system consistently achieves sub-300ms latency from speech input to game action, making spell casting feel truly magical and responsive. The combination of AssemblyAI's ultra-fast streaming with optimized processing creates an immersive gaming experience where voice commands feel as natural as pressing keys.

Perfect for Harry Potter games, VR experiences, or any application requiring instant voice command recognition! 🧙‍♂️✨

Top comments (30)

Divya

Incredible demo!

Nikoloz Turazashvili (@axrisi)

thaaaanks <3

happy to see so many positive comments

Divya

It was an unexpectedly, refreshing awesome HP project with a captivating demo, and coz i am a potterhead ig. Hence, i like it 😁

Nikoloz Turazashvili (@axrisi)

I played HP on Nintendo.
Had to buy game in Steam to showcase this tool I made :D
But I made Slytherin character on PC now. I heard you get avada kedavra sooner when playing Slytherin :DD

Divya

Slytherins are the coolest imo, followed closely by the Ravens.

Nikoloz Turazashvili (@axrisi)

didn't know) will try it out.

Do you follow the HP 2 release?
many say gonna be only in 2027 :((((

Divya

It has been lovely childhood memories for me. I saw the new cast, and also, it would be too late ig.

I won't be following it 😅

What will you try out btw?

Nikoloz Turazashvili (@axrisi)

the Ravens in Hogwarts Legacy game.
So far played with Gryf and Slyth.
But for every house they have different quests from what I know

Divya

Sounds really interesting.
Never played it myself though 😅, so I have no idea about it.

Pravesh Sudha

Man the Thumbnail looks DOPE!!!!

Nikoloz Turazashvili (@axrisi)

It feels crazy!!

sahra 💫

Gotta agree here🔥. What tool was used to create it?

Nikoloz Turazashvili (@axrisi)

I mean, no actual tool. Just python and assemblyAi))

sahra 💫

Oh no, I was referring to the thumbnail😅

Nikoloz Turazashvili (@axrisi)

oh sorry, I had like 4 hour sleep in the last 48 hours. haha.
Didn't pay enough attention.

I actually have a telegram bot that creates thumbnails. but in short the IP here is just good prompting for gpt-image model :)

sahra 💫

Oh that's awesome✨✨. Nice one.

Nikoloz Turazashvili (@axrisi)

sorry man. I was replying when i had like 0 sleep in some 24h+
thought you were referring to project. haha.

But yeah, the thumbnail is also cool!

Ansell Maximilian

Awesome! 🔥🔥

I always love your video demos!

Nikoloz Turazashvili (@axrisi)

Thanks, friend! You would do a big favour to me if subscribe on my YouTube channel!

Prasant

great demo

Marian

I can already imagine yelling, Stupefy! at my screen xD. Gotta try this. Is it open source?

Nikoloz Turazashvili (@axrisi)

Ahahah, I can't wait to Unlock Avada Kedavra really!!!

"Harry Potter... The Boy who Lived, come to die."

Alanis

I love how you combined real-time voice recognition with game commands. Did you test it on any specific games or is it adaptable to multiple ones?

Nikoloz Turazashvili (@axrisi)

As you can see in Demo video it is for Harry potter game Hogwarts Legacy, but obviously commands can do whatever you like in any game.

sahra 💫 • Edited

This looks awesome✨. So, to execute the commands in the game, are you running the script concurrently while playing it? Or did you integrate it somehow with the game?

Nikoloz Turazashvili (@axrisi)

Thank you!) 🙏
Yes you just run it concurrently. Didn't complicate it with extra rules and code. Just run it when already in the game

sahra 💫

Awesome👍
