A Survey of Spoken Dialogue Models (60 pages)
streaming duplex speech moshi speech-representation encodec gpt-4o speech-language-model spoken-dialogue-models modal-alignment intreaction mini-omni llama-omni wavtokenizer
Updated Nov 28, 2024
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM