speech-to-speech

AI Agents & Assistants

An open‑source, modular speech‑to‑speech pipeline for building local voice agents with transformer models.

Python · Latest release: 2025 · 3mo ago

Features

  • Cascaded pipeline: VAD → STT → LLM → TTS
  • Supports multiple open‑source models via Hugging Face Transformers (e.g., Whisper, Parakeet‑TDT, Qwen3‑TTS)
  • Runs locally or as a server with WebSocket/realtime and TCP streaming modes
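The cascaded design listed above can be sketched as a chain of worker threads joined by queues, where each stage consumes the previous stage's output. This is a minimal illustration only: the stage functions (vad, stt, llm, tts) and the helpers run_stage and run_pipeline are hypothetical stand-ins that operate on strings, not the project's actual handlers or API.

```python
import queue
import threading

_END = object()  # sentinel that marks the end of the stream


def run_stage(fn, inbox, outbox):
    """Read items from inbox, apply fn, and forward non-None results."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)  # propagate shutdown to the next stage
            break
        result = fn(item)
        if result is not None:
            outbox.put(result)


# Hypothetical placeholder stages; a real pipeline would call models here.
def vad(chunk):
    """Drop 'silent' chunks (here: whitespace-only strings)."""
    return chunk if chunk.strip() else None


def stt(speech):
    """Pretend transcription: lowercase the input."""
    return speech.lower()


def llm(text):
    """Pretend language model: wrap the transcript in a reply."""
    return f"you said: {text}"


def tts(reply):
    """Pretend synthesis: encode the reply as bytes."""
    return reply.encode("utf-8")


def run_pipeline(chunks):
    """Run VAD -> STT -> LLM -> TTS, one thread per stage."""
    stages = [vad, stt, llm, tts]
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for chunk in chunks:
        queues[0].put(chunk)
    queues[0].put(_END)

    outputs = []
    while True:
        item = queues[-1].get()
        if item is _END:
            break
        outputs.append(item)
    for t in threads:
        t.join()
    return outputs


if __name__ == "__main__":
    print(run_pipeline(["Hello", "   ", "World"]))
```

Because every stage runs in its own thread, a later stage can start on the first utterance while earlier stages are still processing the next one, which is the usual reason for wiring a cascade this way.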

Recent releases

2025 New feature
Notable features
  • Add paraformer_zh ASR for Chinese speech recognition
  • Add ChatTTS with Chinese language support
  • Add DeepFilterNet for speech enhancement (clearer audio)
Full changelog

What's Changed

  • Minor doc fix. by @Vaibhavs10 in https://github.com/huggingface/speech-to-speech/pull/2
  • Fix missing sounddevice module by @AlexHayton in https://github.com/huggingface/speech-to-speech/pull/7
  • Update README.md by @RodriMora in https://github.com/huggingface/speech-to-speech/pull/23
  • fix issue with nltk by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/29
  • Dockerize by @codearranger in https://github.com/huggingface/speech-to-speech/pull/22
  • Add support to MPS by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/20
  • adding apache license by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/31
  • refactor arguments folder + run ruff by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/32
  • Allow LM selection and MLX Gemma by @RonanKMcGovern in https://github.com/huggingface/speech-to-speech/pull/40
  • Improvements mlx pipeline by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/41
  • refactor all the handlers - folder structure by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/43
  • add min new tokens by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/49
  • add warning to install flash attn by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/51
  • improve logging by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/52
  • feat:add paraformer_zh asr by @wuhongsheng in https://github.com/huggingface/speech-to-speech/pull/48
  • Assigning min new tokens to a compiled whisper graph on a thread brea… by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/58
  • Add paraformer - Chinese STT by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/53
  • feat:add chatTTS by @wuhongsheng in https://github.com/huggingface/speech-to-speech/pull/55
  • Add ChatTTS - Chinese support by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/59
  • feat:add DeepFilterNet for speech enhancement to obtain clear speech … by @wuhongsheng in https://github.com/huggingface/speech-to-speech/pull/61
  • improve documentation by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/77
  • Add support for multiple languages by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/60
  • Update module_arguments.py by @AgainstEntropy in https://github.com/huggingface/speech-to-speech/pull/78
  • Add language arg to lightning whisper handler by @rchan26 in https://github.com/huggingface/speech-to-speech/pull/84
  • Fix relative link in README by @rchan26 in https://github.com/huggingface/speech-to-speech/pull/85
  • fix by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/87
  • fix: Changed [True] to [False] in help text for audio_enhancement to align with actual default by @BrutalCoding in https://github.com/huggingface/speech-to-speech/pull/91
  • Update: Added multi-language support for macOS. by @ybm911 in https://github.com/huggingface/speech-to-speech/pull/93
  • Mac multi language by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/98
  • Refactor for inference by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/106
  • feat:Add rest call support similar to open-api style by @wuhongsheng in https://github.com/huggingface/speech-to-speech/pull/81
  • Improve auto language by @eustlb in https://github.com/huggingface/speech-to-speech/pull/112
  • Readme update + clarity improvements by @eustlb in https://github.com/huggingface/speech-to-speech/pull/113
  • updated readme for a small typo by @ankanpy in https://github.com/huggingface/speech-to-speech/pull/115
  • Fix hanging client on KeyboardInterrupt by @3manifold in https://github.com/huggingface/speech-to-speech/pull/121
  • made small fixes in arguments_classes and TTS folder by @ankanpy in https://github.com/huggingface/speech-to-speech/pull/116
  • Facebook mms merge by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/123
  • New new faster whisper by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/124
  • Add moonshine by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/127
  • set keras backend to torch. by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/129
  • Fixed typos in README.md by @sergiopaniego in https://github.com/huggingface/speech-to-speech/pull/137
  • Bugfix: can not concatenate str + GenerationResponse by @baldassarreFe in https://github.com/huggingface/speech-to-speech/pull/144
  • Update Parler-TTS base model and description by @ylacombe in https://github.com/huggingface/speech-to-speech/pull/147
  • adding more languages by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/148
  • multilingual improvements for parler by @andimarafioti in https://github.com/huggingface/speech-to-speech/pull/149
  • Improved Error Message for get_tts_handler by @Arslan-Mehmood1 in https://github.com/huggingface/speech-to-speech/pull/155

New Contributors

  • @Vaibhavs10 made their first contribution in https://github.com/huggingface/speech-to-speech/pull/2
  • @AlexHayton made their first contribution in https://github.com/huggingface/speech-to-speech/pull/7
  • @RodriMora made their first contribution in https://github.com/huggingface/speech-to-speech/pull/23
  • @andimarafioti made their first contribution in https://github.com/huggingface/speech-to-speech/pull/29
  • @codearranger made their first contribution in https://github.com/huggingface/speech-to-speech/pull/22
  • @RonanKMcGovern made their first contribution in https://github.com/huggingface/speech-to-speech/pull/40
  • @wuhongsheng made their first contribution in https://github.com/huggingface/speech-to-speech/pull/48
  • @rchan26 made their first contribution in https://github.com/huggingface/speech-to-speech/pull/84
  • @BrutalCoding made their first contribution in https://github.com/huggingface/speech-to-speech/pull/91
  • @ybm911 made their first contribution in https://github.com/huggingface/speech-to-speech/pull/93
  • @eustlb made their first contribution in https://github.com/huggingface/speech-to-speech/pull/112
  • @ankanpy made their first contribution in https://github.com/huggingface/speech-to-speech/pull/115
  • @3manifold made their first contribution in https://github.com/huggingface/speech-to-speech/pull/121
  • @sergiopaniego made their first contribution in https://github.com/huggingface/speech-to-speech/pull/137
  • @baldassarreFe made their first contribution in https://github.com/huggingface/speech-to-speech/pull/144
  • @ylacombe made their first contribution in https://github.com/huggingface/speech-to-speech/pull/147
  • @Arslan-Mehmood1 made their first contribution in https://github.com/huggingface/speech-to-speech/pull/155

Full Changelog: https://github.com/huggingface/speech-to-speech/commits/2025

About

Stars: 4,818
Forks: 574
Languages: Python, Dockerfile

Install & Platforms

Install via: pip
