mcp-tts

MCP Server for TTS (Text-to-Speech)

What? 🤔

Adds text-to-speech to MCP clients such as Claude Desktop and Cursor IDE.

It registers four TTS tools:

say_tts
elevenlabs_tts
google_tts
openai_tts

`say_tts`

Uses the macOS `say` binary to speak the text with built-in system voices (macOS only).
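As a rough illustration, the `say` invocation could be assembled as in the sketch below. The `--rate 200` value mirrors the debug log in the Test section; `buildSayArgs`, its parameters, and the optional `-v` voice argument are assumptions for illustration, not the project's actual code.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"strconv"
)

// buildSayArgs assembles the argument list for the macOS `say` binary.
// Hypothetical helper: the name and signature are illustrative only.
func buildSayArgs(text string, rate int, voice string) []string {
	args := []string{"--rate", strconv.Itoa(rate)}
	if voice != "" {
		args = append(args, "-v", voice) // -v selects a system voice
	}
	return append(args, text)
}

func main() {
	args := buildSayArgs("Hello, world!", 200, "")
	if runtime.GOOS != "darwin" {
		// `say` only exists on macOS, so just show what would be run.
		fmt.Println("would run: say", args)
		return
	}
	if err := exec.Command("say", args...).Run(); err != nil {
		fmt.Println("say failed:", err)
	}
}
```

Passing the text as its own argument (rather than through a shell string) avoids quoting problems with punctuation in the spoken text.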

`elevenlabs_tts`

Uses the ElevenLabs text-to-speech API to speak the text with premium AI voices

`google_tts`

Uses Google's Gemini TTS models to speak the text with 30 high-quality voices. Available voices include:

Zephyr (Bright), Puck (Upbeat), Charon (Informative)
Kore (Firm), Fenrir (Excitable), Leda (Youthful)
Orus (Firm), Aoede (Breezy), Callirhoe (Easy-going)
Autonoe (Bright), Enceladus (Breathy), Iapetus (Clear)
And 18 more voices with various characteristics

`openai_tts`

Uses OpenAI's Text-to-Speech API to speak the text with 11 natural-sounding voices:

alloy (Warm, conversational, modern)
ash (Confident, assertive, slightly textured)
ballad (Gentle, melodious, slightly lyrical)
coral (Cheerful, fresh, upbeat)
echo (Neutral, calm, balanced)
fable (Storyteller-like, expressive)
nova (Clear, precise, slightly formal)
onyx (Deep, authoritative, resonant)
sage (Soothing, empathetic, reassuring)
shimmer (Bright, animated, playful)
verse (Versatile, expressive)

Supports three quality models:

gpt-4o-mini-tts - Default, optimized quality and speed
tts-1 - Standard quality, faster generation
tts-1-hd - High definition audio, premium quality

Additional features:

Speed control from 0.25x to 4.0x (default: 1.0x)
Custom voice instructions (e.g., "Speak in a cheerful and positive tone") via parameter or OPENAI_TTS_INSTRUCTIONS environment variable
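The options above map fairly directly onto the JSON body of OpenAI's `POST /v1/audio/speech` endpoint. The sketch below only builds and validates that body; it does not call the API. `speechRequest` and `newSpeechRequest` are hypothetical names, and the rule that an explicit parameter wins over `OPENAI_TTS_INSTRUCTIONS` is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// speechRequest mirrors the fields of the speech endpoint's JSON body
// that this README mentions.
type speechRequest struct {
	Model        string  `json:"model"`
	Input        string  `json:"input"`
	Voice        string  `json:"voice"`
	Speed        float64 `json:"speed,omitempty"`
	Instructions string  `json:"instructions,omitempty"`
}

// newSpeechRequest applies the documented defaults and limits.
// Hypothetical helper, not the project's real function.
func newSpeechRequest(text, voice string, speed float64, instructions string) (speechRequest, error) {
	if speed == 0 {
		speed = 1.0 // documented default
	}
	if speed < 0.25 || speed > 4.0 {
		return speechRequest{}, errors.New("speed must be between 0.25 and 4.0")
	}
	if instructions == "" {
		// Assumption: fall back to the environment variable only when
		// no instructions were passed explicitly.
		instructions = os.Getenv("OPENAI_TTS_INSTRUCTIONS")
	}
	return speechRequest{
		Model:        "gpt-4o-mini-tts", // documented default model
		Input:        text,
		Voice:        voice,
		Speed:        speed,
		Instructions: instructions,
	}, nil
}

func main() {
	req, err := newSpeechRequest("Hello!", "nova", 1.2, "Speak in a cheerful and positive tone")
	if err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	body, _ := json.MarshalIndent(req, "", "  ")
	fmt.Println(string(body))
}
```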

Configuration

Sequential vs Concurrent TTS

By default, the TTS server enforces sequential speech: only one TTS request can play audio at a time. This prevents multiple agents from speaking simultaneously and producing an unintelligible cacophony. Subsequent requests wait in a queue until the current speech completes.
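One way to picture the sequential behavior is a mutex wrapped around playback. This is a minimal sketch, not the server's actual implementation: `speechGate`, `speak`, and `maxOverlap` are hypothetical names, and a plain `sync.Mutex` does not guarantee that waiters are served in arrival order.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// speechGate serializes playback unless concurrency is explicitly allowed.
type speechGate struct {
	mu              sync.Mutex
	allowConcurrent bool
}

// speak runs play while holding the gate, so only one call plays at a time
// in sequential mode.
func (g *speechGate) speak(play func()) {
	if !g.allowConcurrent {
		g.mu.Lock()
		defer g.mu.Unlock()
	}
	play()
}

// maxOverlap fires n simulated requests at once and reports the highest
// number that were "playing" at the same moment.
func maxOverlap(g *speechGate, n int) int {
	var (
		wg      sync.WaitGroup
		statsMu sync.Mutex
		active  int
		peak    int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.speak(func() {
				statsMu.Lock()
				active++
				if active > peak {
					peak = active
				}
				statsMu.Unlock()
				time.Sleep(10 * time.Millisecond) // stand-in for audio playback
				statsMu.Lock()
				active--
				statsMu.Unlock()
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("sequential peak:", maxOverlap(&speechGate{}, 4))
	fmt.Println("concurrent peak:", maxOverlap(&speechGate{allowConcurrent: true}, 4))
}
```

In sequential mode the reported peak is always 1; with `allowConcurrent` set, the simulated requests are free to overlap.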

Multi-Instance Protection: The mutex works both within a single MCP server process and across multiple Claude Desktop instances. When several instances run at once, they coordinate via a system-wide file lock to prevent overlapping speech.
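Cross-process coordination of this kind is commonly built on an advisory file lock. The sketch below uses `flock(2)` through Go's `syscall` package, which is available on macOS and Linux but not Windows. The lock file path and the `withFileLock` helper are assumptions; the README does not say which path or locking call the server really uses.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// lockPath returns a hypothetical lock file location. The path that
// mcp-tts actually uses is not documented here.
func lockPath() string {
	return filepath.Join(os.TempDir(), "mcp-tts-example.lock")
}

// withFileLock runs fn while holding an exclusive advisory lock on path.
// Other processes calling it with the same path block until fn returns.
func withFileLock(path string, fn func()) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()

	// LOCK_EX blocks until no other open file holds the lock.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

	fn()
	return nil
}

func main() {
	err := withFileLock(lockPath(), func() {
		fmt.Println("holding lock: safe to play audio")
	})
	if err != nil {
		fmt.Println("lock failed:", err)
	}
}
```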

To allow concurrent TTS operations (multiple speeches playing simultaneously):

Environment Variable:

export MCP_TTS_ALLOW_CONCURRENT=true

Command Line Flag:

mcp-tts --sequential-tts=false

Note: Concurrent TTS may result in overlapping audio that's difficult to understand. Use this option only when you explicitly want multiple TTS operations to run simultaneously.

Suppressing "Speaking:" Output

By default, TTS tools return a message like "Speaking: [text]" when speech completes. This can interfere with LLM responses. To suppress this output:

Environment Variable:

export MCP_TTS_SUPPRESS_SPEAKING_OUTPUT=true

Command Line Flag:

mcp-tts --suppress-speaking-output

When enabled, tools return "Speech completed" instead of echoing the spoken text.
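The two result messages described above amount to a small switch. In this sketch `resultText` is a hypothetical helper, and treating only the exact string "true" as enabled is an assumption about how the environment variable is parsed.

```go
package main

import (
	"fmt"
	"os"
)

// resultText returns the message a tool call would report once speech ends.
func resultText(text string, suppress bool) string {
	if suppress {
		return "Speech completed"
	}
	return "Speaking: " + text
}

func main() {
	// Assumption: only the literal value "true" enables suppression.
	suppress := os.Getenv("MCP_TTS_SUPPRESS_SPEAKING_OUTPUT") == "true"
	fmt.Println(resultText("Hello, world!", suppress))
}
```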

Getting Started

Install

go install github.com/blacktop/mcp-tts@latest

❱ mcp-tts --help
TTS (text-to-speech) MCP Server.
Provides multiple text-to-speech services via MCP protocol:
• say_tts - Uses macOS built-in 'say' command (macOS only)
• elevenlabs_tts - Uses ElevenLabs API for high-quality speech synthesis
• google_tts - Uses Google's Gemini TTS models for natural speech
• openai_tts - Uses OpenAI's TTS API with various voice options
Each tool supports different voices, rates, and configuration options.
Requires appropriate API keys for cloud-based services.
Designed to be used with the MCP (Model Context Protocol).
Usage:
  mcp-tts [flags]
Flags:
  -h, --help                       help for mcp-tts
      --sequential-tts             Enforce sequential TTS (prevent concurrent speech) (default true)
      --suppress-speaking-output   Suppress 'Speaking:' text output
  -v, --verbose                    Enable verbose debug logging

Set Claude Desktop Config

{
  "mcpServers": {
    "say": {
      "command": "mcp-tts",
      "env": {
        "ELEVENLABS_API_KEY": "********",
        "ELEVENLABS_VOICE_ID": "1SM7GgM6IMuvQlz2BwM3",
        "GOOGLE_AI_API_KEY": "********",
        "OPENAI_API_KEY": "********",
        "OPENAI_TTS_INSTRUCTIONS": "Speak in a cheerful and positive tone",
        "MCP_TTS_SUPPRESS_SPEAKING_OUTPUT": "true",
        "MCP_TTS_ALLOW_CONCURRENT": "false"
      }
    }
  }
}

Environment Variables

ELEVENLABS_API_KEY: Your ElevenLabs API key (required for elevenlabs_tts)
ELEVENLABS_VOICE_ID: ElevenLabs voice ID (optional, defaults to a built-in voice)
GOOGLE_AI_API_KEY or GEMINI_API_KEY: Your Google AI API key (required for google_tts)
OPENAI_API_KEY: Your OpenAI API key (required for openai_tts)
OPENAI_TTS_INSTRUCTIONS: Custom voice instructions for OpenAI TTS (optional, e.g., "Speak in a cheerful and positive tone")
MCP_TTS_SUPPRESS_SPEAKING_OUTPUT: Set to "true" to suppress "Speaking:" output (optional)
MCP_TTS_ALLOW_CONCURRENT: Set to "true" to allow concurrent TTS operations (optional, defaults to sequential)
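To make the variable list concrete, here is a sketch of loading these settings into a struct. The `config` type, `loadConfig`, and `firstNonEmpty` are hypothetical; checking `GOOGLE_AI_API_KEY` before `GEMINI_API_KEY` and accepting only the literal "true" for the boolean flags are assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// config holds the settings listed under Environment Variables.
type config struct {
	ElevenLabsAPIKey       string
	ElevenLabsVoiceID      string
	GoogleAPIKey           string
	OpenAIAPIKey           string
	OpenAIInstructions     string
	SuppressSpeakingOutput bool
	AllowConcurrent        bool
}

// firstNonEmpty returns the first value that is not the empty string.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// loadConfig reads settings through getenv, which makes it easy to swap
// in a fake environment for testing.
func loadConfig(getenv func(string) string) config {
	return config{
		ElevenLabsAPIKey:  getenv("ELEVENLABS_API_KEY"),
		ElevenLabsVoiceID: getenv("ELEVENLABS_VOICE_ID"),
		// Assumption: GOOGLE_AI_API_KEY takes priority when both are set.
		GoogleAPIKey:           firstNonEmpty(getenv("GOOGLE_AI_API_KEY"), getenv("GEMINI_API_KEY")),
		OpenAIAPIKey:           getenv("OPENAI_API_KEY"),
		OpenAIInstructions:     getenv("OPENAI_TTS_INSTRUCTIONS"),
		SuppressSpeakingOutput: getenv("MCP_TTS_SUPPRESS_SPEAKING_OUTPUT") == "true",
		AllowConcurrent:        getenv("MCP_TTS_ALLOW_CONCURRENT") == "true",
	}
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Println("elevenlabs_tts key set:", cfg.ElevenLabsAPIKey != "")
	fmt.Println("google_tts key set:", cfg.GoogleAPIKey != "")
	fmt.Println("openai_tts key set:", cfg.OpenAIAPIKey != "")
	fmt.Println("concurrent speech allowed:", cfg.AllowConcurrent)
}
```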

Test

Test macOS TTS

❱ cat test/say.json | go run main.go --verbose
2025/03/23 22:41:49 INFO Starting MCP server name="Say TTS Service" version=1.0.0
2025/03/23 22:41:49 DEBU Say tool called request="{Request:{Method:tools/call Params:{Meta:<nil>}} Params:{Name:say_tts Arguments:map[text:Hello, world!] Meta:<nil>}}"
2025/03/23 22:41:49 DEBU Executing say command args="[--rate 200 Hello, world!]"
2025/03/23 22:41:49 INFO Speaking text text="Hello, world!"

{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"Speaking: Hello, world!"}]}}
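The exchange above is plain JSON-RPC 2.0. The sketch below builds a `tools/call` request and extracts the text from the reply. The request shape is reconstructed from the debug log (method, tool name, arguments, id 3) and is only an assumption about what `test/say.json` contains; the type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// toolCall is the params object of a tools/call request.
type toolCall struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string   `json:"jsonrpc"`
	ID      int      `json:"id"`
	Method  string   `json:"method"`
	Params  toolCall `json:"params"`
}

// rpcResponse captures only the fields needed to read the tool's text output.
type rpcResponse struct {
	Result struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"result"`
}

// firstText returns the first text item in a tool-call response.
func firstText(raw []byte) (string, error) {
	var resp rpcResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", err
	}
	for _, c := range resp.Result.Content {
		if c.Type == "text" {
			return c.Text, nil
		}
	}
	return "", errors.New("no text content in response")
}

func main() {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      3,
		Method:  "tools/call",
		Params:  toolCall{Name: "say_tts", Arguments: map[string]any{"text": "Hello, world!"}},
	}
	out, _ := json.Marshal(req)
	fmt.Println(string(out))

	reply := []byte(`{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"Speaking: Hello, world!"}]}}`)
	text, err := firstText(reply)
	fmt.Println(text, err)
}
```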

Test Google TTS

❱ cat test/google_tts.json | go run main.go --verbose
2025/05/23 18:26:45 INFO Starting MCP server name="Say TTS Service" version=""
2025/05/23 18:26:45 DEBU Google TTS tool called request="{...}"
2025/05/23 18:26:45 DEBU Generating TTS audio model=gemini-2.5-flash-preview-tts voice=Kore text="Hello! This is a test of Google's TTS API. How does it sound?"
2025/05/23 18:26:49 INFO Playing TTS audio via beep speaker bytes=181006
2025/05/23 18:26:53 INFO Speaking via Google TTS text="Hello! This is a test of Google's TTS API. How does it sound?" voice=Kore

{"jsonrpc":"2.0","id":4,"result":{"content":[{"type":"text","text":"Speaking: Hello! This is a test of Google's TTS API. How does it sound? (via Google TTS with voice Kore)"}]}}

Test OpenAI TTS

❱ cat test/openai_tts.json | go run main.go --verbose
2025/05/23 19:15:32 INFO Starting MCP server name="Say TTS Service" version=""
2025/05/23 19:15:32 DEBU OpenAI TTS tool called request="{...}"
2025/05/23 19:15:32 DEBU Generating OpenAI TTS audio model=tts-1 voice=nova speed=1.2 text="Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed."
2025/05/23 19:15:34 DEBU Decoding MP3 stream from OpenAI
2025/05/23 19:15:34 DEBU Initializing speaker for OpenAI TTS sampleRate=22050
2025/05/23 19:15:36 INFO Speaking text via OpenAI TTS text="Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed." voice=nova model=tts-1 speed=1.2

{"jsonrpc":"2.0","id":5,"result":{"content":[{"type":"text","text":"Speaking: Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed. (via OpenAI TTS with voice nova)"}]}}

Test the mutex behavior with multiple TTS requests

# Sequential mode (default) - speeches play one after another
cat test/sequential.json | go run main.go --verbose
# Concurrent mode - allows overlapping speech
cat test/sequential.json | go run main.go --verbose --sequential-tts=false
