Top 23 Python Speech Projects

TTS

1 244 43,441 8.1 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Project mention: AI Twin — Voice Cloning with Text-to-Speech | dev.to | 2025-12-16

Coqui TTS - The amazing text-to-speech library that powers this project
MockingBird

2 9 36,745 5.6 Python

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
datasets

3 18 21,001 9.3 Python

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Project mention: Training with Big Data on Any Cloud | dev.to | 2025-06-20

Hugging Face Datasets -- the library that lets you download and manage datasets from the Hugging Face Hub, as well as being a convenient vendor-neutral interface for your own datasets.
whisperX

4 37 19,239 8.6 Python

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Project mention: Making AI Models Faster, Cheaper, and Greener — Here’s How | dev.to | 2025-11-03

2.3X speed improvement over WhisperX and a 3X speed boost compared to HuggingFace Pipeline with FlashAttention 2 (Insanely Fast Whisper)
AudioGPT

5 4 10,200 0.0 Python

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
modelscope

6 3 8,579 9.4 Python

ModelScope: bring the notion of Model-as-a-Service to life.
EmotiVoice

7 5 8,367 7.9 Python

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
silero-vad

8 15 7,669 8.4 Python

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

Silero VAD is the gold standard and pipecat has built-in support, so I have chosen that.
ultravox

9 6 4,294 6.1 Python

A fast multimodal LLM for real-time voice

Project mention: I Open-Sourced My AI Toy Company That Runs on ESP32 and OpenAI Realtime API | news.ycombinator.com | 2025-04-22

This looks like so much fun! I have recently gotten into working with electronics, so it seems like a nice little project to undertake.
I noticed that it depends on OpenAI's Realtime API, so it got me wondering what open alternatives there are.
I could only find ultravox (https://github.com/fixie-ai/ultravox) that would seem to really work in real time. It seems to be some model that wires up Llama and Whisper somehow, rather than treating them as separate steps, which is common with other projects.
What other options are available for this kind of real-time behaviour?
speech-to-speech

10 3 4,251 8.7 Python

Speech To Speech: an effort for an open-sourced and modular GPT4-o
metavoice-src

11 5 4,191 7.8 Python

Foundational model for human-like, expressive TTS
DeepFilterNet

12 13 3,407 7.3 Python

Noise suppression using deep filtering

Project mention: Show HN: Background noise removal in multimedia with a single command | news.ycombinator.com | 2025-10-06
whisper-asr-webservice

13 11 3,070 8.1 Python

OpenAI Whisper ASR Webservice API
VoxCPM

14 1 2,988 8.4 Python

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Project mention: VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and Voice Cloning | news.ycombinator.com | 2025-12-05
lingvo

15 1 2,856 6.0 Python

Lingvo
aeneas

16 4 2,781 0.0 Python

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
whisper-timestamped

17 2 2,700 6.1 Python

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
gTTS

18 3 2,568 2.7 Python

Python library and CLI tool to interface with Google Translate's text-to-speech API
IMS-Toucan

19 1 2,065 7.3 Python

Controllable and fast Text-to-Speech for over 7000 languages!
openai-edge-tts

20 1 1,478 7.4 Python

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

Project mention: Open source TTS by Resemble (claiming they are sota) | news.ycombinator.com | 2025-06-11

It can definitely run on CPU — but I'm not sure if it can run on a machine without a GPU _entirely_.
To be honest, it uses a decently large amount of resources. If you had a GPU, you could expect about 4-5 GB of memory usage. And given the optimizations for tensors on GPUs, I'm not sure how well things would work "CPU only".
If you try it, let me know. There are some "CPU" Docker builds in the repo you could look at for guidance.
If you want free TTS without using local resources, you could try edge-tts https://github.com/travisvn/openai-edge-tts
SALMONN

21 2 1,369 7.3 Python

SALMONN family: A suite of advanced multi-modal LLMs
voicefixer

22 2 1,252 3.5 Python

General Speech Restoration
StreamSpeech

23 3 1,213 3.3 Python

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Speech related posts

AI Twin — Voice Cloning with Text-to-Speech

2 projects | dev.to | 16 Dec 2025
Making AI Models Faster, Cheaper, and Greener — Here’s How

5 projects | dev.to | 3 Nov 2025
2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1)

7 projects | dev.to | 20 Sep 2025
Ask HN: What Speaker Diarization tools should I look into?

1 project | news.ycombinator.com | 23 Jul 2025
Training with Big Data on Any Cloud

4 projects | dev.to | 20 Jun 2025
Show HN: Mikey – No bot meeting notetaker for Windows

6 projects | news.ycombinator.com | 12 Feb 2025
Ask HN: Is Whisper Still Relevant?

2 projects | news.ycombinator.com | 12 Feb 2025

Index

What are some of the best open-source Speech projects in Python? This list will help you:

#	Project	Stars
1	TTS	43,441
2	MockingBird	36,745
3	datasets	21,001
4	whisperX	19,239
5	AudioGPT	10,200
6	modelscope	8,579
7	EmotiVoice	8,367
8	silero-vad	7,669
9	ultravox	4,294
10	speech-to-speech	4,251
11	metavoice-src	4,191
12	DeepFilterNet	3,407
13	whisper-asr-webservice	3,070
14	VoxCPM	2,988
15	lingvo	2,856
16	aeneas	2,781
17	whisper-timestamped	2,700
18	gTTS	2,568
19	IMS-Toucan	2,065
20	openai-edge-tts	1,478
21	SALMONN	1,369
22	voicefixer	1,252
23	StreamSpeech	1,213
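
The index above is tab-separated, so it can be loaded with the standard library alone. The following is a minimal sketch, assuming the first four rows are copied verbatim from the table; the names `INDEX_TSV`, `parse_index`, and `is_sorted_by_stars` are hypothetical helpers introduced for illustration, not part of any project listed here. It parses the rows and checks that they are ordered by star count, as the note on ordering states.

```python
import csv
import io

# First four rows copied from the index above (tab-separated).
# INDEX_TSV is an illustrative constant, not an official data export.
INDEX_TSV = "\n".join([
    "#\tProject\tStars",
    "1\tTTS\t43,441",
    "2\tMockingBird\t36,745",
    "3\tdatasets\t21,001",
    "4\twhisperX\t19,239",
])


def parse_index(tsv_text):
    """Parse the tab-separated index into a list of dicts with integer fields."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        {
            "rank": int(row["#"]),
            "project": row["Project"],
            # Star counts use a thousands separator, e.g. "43,441".
            "stars": int(row["Stars"].replace(",", "")),
        }
        for row in reader
    ]


def is_sorted_by_stars(rows):
    """Return True if rows appear in descending order of star count."""
    stars = [r["stars"] for r in rows]
    return stars == sorted(stars, reverse=True)


rows = parse_index(INDEX_TSV)
print(rows[0]["project"], rows[0]["stars"])
print(is_sorted_by_stars(rows))
```

The same helper should work on the full 23-row table if it is pasted in with its tab characters intact.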

Did you know that Python is the 2nd most popular programming language based on number of references?
