TinyTTS — The Neural Module That Makes Photos Talk

Description

When we flip through old photos, our minds fill in the sounds: laughter in the kitchen, the hum of a train station, someone saying “remember?” We built this project so those voices could return instead of living only in memory. Now a picture can speak for itself in short, warm phrases, as if someone you love were right there again. It began with a Family Storyteller: a few childhood photos and a couple of sentences about funny moments. Press Play, and the frame speaks, completely offline. Then came the idea of Talking Places: open a travel snapshot and hear the whisper of a story, as if the city itself were telling you where to go next.

Details

All of it runs fully on-device, with no cloud and no network delay. The speech is generated right on the TinyTTS module inside the frame.
You are the director: control pacing, pauses, and sequence — or just let the story unfold.
The formula is simple: photo + text → voice. That’s it. The rest is emotion.

We don’t just share a tutorial — we hand you a template for your own stories: family, museum, travel, educational.
Add a few lines, press play, and watch faces light up when the frame starts to talk. The first time it speaks — it really does feel like magic.

How It Works

Most “talking” projects stream audio from the cloud.
This one shows true on-device TTS — real speech, fully generated on a microcontroller.

CrowPanel handles UI, SD read, mode switching, and UART transmission.

TinyTTS receives lines via UART, generates voice, and signals completion.

Data flow: CrowPanel (UART) → TinyTTS → Speaker

CrowPanel reads image and text files.
Text is sent over UART to TinyTTS.
TinyTTS generates speech.
Audio output from TinyTTS goes to the speaker.
Everything runs fully offline.
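
The hand-off described above (send a line, wait for the completion signal) can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual protocol: the newline terminator, the `TTS_DONE_BYTE` value, and the names `tts_link_t`, `tts_speak_line`, and `fake_module_t` are all hypothetical. The transport is abstracted behind two callbacks so the logic can run on a host; on the CrowPanel they would presumably wrap ESP-IDF's `uart_write_bytes` and `uart_read_bytes`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical framing values: the real TinyTTS protocol is not described here. */
#define TTS_LINE_END  '\n'
#define TTS_DONE_BYTE 0x06

/* Transport callbacks, so the same logic can sit on top of any UART driver. */
typedef struct {
    void (*write)(void *ctx, const char *data, size_t len);
    int  (*read_byte)(void *ctx);   /* returns a byte, or -1 if nothing is pending */
    void *ctx;
} tts_link_t;

/* Send one text line, then poll up to max_polls times for the completion byte. */
static bool tts_speak_line(const tts_link_t *link, const char *text, int max_polls)
{
    const char end = TTS_LINE_END;
    link->write(link->ctx, text, strlen(text));
    link->write(link->ctx, &end, 1);

    for (int i = 0; i < max_polls; ++i) {
        if (link->read_byte(link->ctx) == TTS_DONE_BYTE) {
            return true;    /* module reported that speech finished */
        }
    }
    return false;           /* no completion seen: treat as a timeout */
}

/* Host-side stand-in for the module: records bytes and answers once per line. */
typedef struct {
    char   rx[128];
    size_t rx_len;
    bool   line_complete;
} fake_module_t;

static void fake_write(void *ctx, const char *data, size_t len)
{
    fake_module_t *m = ctx;
    for (size_t i = 0; i < len && m->rx_len < sizeof m->rx; ++i) {
        m->rx[m->rx_len++] = data[i];
        if (data[i] == TTS_LINE_END) {
            m->line_complete = true;
        }
    }
}

static int fake_read_byte(void *ctx)
{
    fake_module_t *m = ctx;
    if (!m->line_complete) {
        return -1;
    }
    m->line_complete = false;
    return TTS_DONE_BYTE;
}
```

Waiting for the completion signal before sending the next line keeps the UI from queueing text faster than the module can speak it; a bounded poll count stands in for a real timeout.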

What’s Next

The same setup can evolve into museum guides, educational kits, talking toys, or home displays — all running fully offline.

No internet, no network round trips, no cloud dependencies: just local voice and logic you can adapt to any storytelling scenario.

Create the first chapter of your talking album today.
Flash the ready demo, load five photos and five short lines, press Play.

The rest of the story you’ll discover along the way.

Resources

GitHub Repository — Crowpanel_ImgTTS_cards
TinyTTS Kit: on Elecrow; on Tindie
CrowPanel Series
Grovety Official

Build Instructions

Step 1. Quick Start: Ready Demo
💡 Ready binaries for instant start.

You’ll Need
- TinyTTS kit (Elecrow or Tindie): MCU module with embedded neural TTS
- CrowPanel Advanced (Elecrow): ESP32-S3 display controller
- Speaker, connected to the TinyTTS audio out

1. Connect the CrowPanel (ESP32-S3) to your computer via USB/Serial.
Set the panel’s function-select switch to “WM(0,1)” (UART1-OUT mode).

Do not connect the TinyTTS module yet.

2. Flash the firmware.

Option A — Use prebuilt images:
In firmware/, pick
- binaries_album/ for the album scenario, or
- binaries_travel/ for the traveling scenario.
Flash using the provided flasher (see firmware/flash_tool.md).

Option B — Build from source:
Install ESP-IDF v5.4, clone the repo, choose the scenario in
components/ui/scenario_build.h, then run:
```
idf.py fullclean
idf.py build
idf.py -p PORT flash
```
3. After flashing:
Connect the TinyTTS module to the CrowPanel (UART0 ↔ UART1-OUT) with the 4-pin cable.
Attach the audio output (3.5 mm jack) from TinyTTS to a speaker or headphones.

4. Run the demo:

The screen shows a photo card.
- Press Play - the card text is spoken by TinyTTS.
- Press "→" - next card appears (looped list).
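
The looped list behind the "→" button can be pictured as simple wrap-around indexing. The helper below is a hypothetical illustration (the name `next_card` is not taken from the repository):

```c
#include <stddef.h>

/* Advance to the next card, wrapping from the last card back to the first. */
static size_t next_card(size_t current, size_t count)
{
    if (count == 0) {
        return 0;   /* empty album: stay at index 0 */
    }
    return (current + 1) % count;
}
```

With five cards, pressing "→" on card index 4 returns to index 0.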
More details in the Readme.

Step 2. Make Your Own Album (example scenario: third)
(Based on the Add Scenario Guide)

1. Create Scenario Folders

In the repository root, add:
```
components/ui/third/          ← new folder for visuals
spiffs_root/assets_third/     ← new folder for binary images
```
Both folder names must use the lowercase scenario name, here third.

2. Enable Scenario in Code

Edit components/ui/include/scenario_build.h:
```
#define UI_SCENARIO_ALBUM  0
#define UI_SCENARIO_TRAVEL 0
#define UI_SCENARIO_THIRD  1   // activate the new one
```
Only one scenario can be set to 1.
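
If you want the compiler to enforce that rule, a preprocessor guard along these lines could sit after the defines. This is an optional sketch and an assumption on our part; the repository may already perform a similar check elsewhere.

```c
/* Values as in the scenario_build.h example above. */
#define UI_SCENARIO_ALBUM  0
#define UI_SCENARIO_TRAVEL 0
#define UI_SCENARIO_THIRD  1

/* Hypothetical guard: stop the build unless exactly one scenario is enabled. */
#if (UI_SCENARIO_ALBUM + UI_SCENARIO_TRAVEL + UI_SCENARIO_THIRD) != 1
#error "Set exactly one UI_SCENARIO_* macro to 1"
#endif
```

Because the macros are 0 or 1, their sum equals the number of enabled scenarios, so both "none selected" and "two selected" fail at compile time.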

3. Add Assets

Place .bin images (RAW format) inside
spiffs_root/assets_third/.

You can generate them using SquareLine Studio or the LVGL Image Converter
(Color format: True color (RGB565), Output: Binary).
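
For reference, RGB565 stores each pixel in 16 bits: 5 for red, 6 for green, 5 for blue. The converter does this for you; the sketch below only illustrates the pixel packing and does not reproduce the converter's file layout or byte order. The function name is hypothetical.

```c
#include <stdint.h>

/* Pack 8-bit R, G, B into one RGB565 pixel by keeping the top 5, 6, and 5 bits. */
static uint16_t rgb888_to_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

A 320 x 240 image in this format takes 320 * 240 * 2 = 153,600 bytes of pixel data, which is worth keeping in mind when sizing the SPIFFS partition.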

4. Add Texts and Visuals

Texts:
Create components/ui/builtin_texts_third.c, which
defines the kThirdTexts[] array with your text strings.
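
A texts file might look roughly like the sketch below. Only the array name kThirdTexts and the CASE_TXT_* identifiers come from this guide; the element type, the enum definition (in the project these IDs come from its own headers), the sample sentences, and the helper `third_text_for` are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the project's card ID enum. */
typedef enum {
    CASE_TXT_01,
    CASE_TXT_02,
    CASE_TXT_03,
    CASE_TXT_COUNT
} case_txt_id_t;

/* One spoken line per card, indexed by card ID. */
const char *const kThirdTexts[CASE_TXT_COUNT] = {
    [CASE_TXT_01] = "This is the garden where we had breakfast.",
    [CASE_TXT_02] = "Grandpa fixed this bicycle three times that summer.",
    [CASE_TXT_03] = "We missed the last train and walked home.",
};

/* Return the text for a card, or NULL for an out-of-range ID. */
static const char *third_text_for(int id)
{
    if (id < 0 || id >= CASE_TXT_COUNT) {
        return NULL;
    }
    return kThirdTexts[id];
}
```

Using designated initializers keyed by card ID keeps each text aligned with its image even if entries are reordered.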

Visuals:
Create components/ui/third/visuals_third.c, which
maps each text to an image (imported from img_third_*.c).

Example:
```
const case_visual_t kVisuals[CASE_TXT_COUNT] = {
    [CASE_TXT_01] = { &ui_img_third_01, ui_img_third_01_load },
    [CASE_TXT_02] = { &ui_img_third_02, ui_img_third_02_load },
    [CASE_TXT_03] = { &ui_img_third_03, ui_img_third_03_load },
};
```
5. Update CMakeLists.txt

Add the new block inside components/ui/CMakeLists.txt:
```
elseif(_SCEN_SELECTED_NAME_LOWER STREQUAL "third")
    list(APPEND UI_SCENARIO_SRCS
        ${CMAKE_CURRENT_LIST_DIR}/builtin_texts_third.c
        ${CMAKE_CURRENT_LIST_DIR}/third/visuals_third.c
    )
```
6. Build and Flash

Clean, build, and flash:
```
idf.py fullclean
idf.py build
idf.py -p PORT flash monitor
```
Expected log output:
```
 -- UI scenario selected: THIRD (lower='third')
```
