Open source voice cloning powered by Qwen3-TTS. Create natural-sounding speech from text with near-perfect voice replication.
Voicebox is a local-first voice cloning studio with DAW-like features for professional voice synthesis. Think of it as a free, open-source, local alternative to ElevenLabs: download models, clone voices, and generate speech entirely on your machine.
Unlike cloud services that lock your voice data behind subscriptions, Voicebox gives you complete privacy, professional tools, and native performance. Download a voice model, clone any voice from a few seconds of audio, and compose multi-voice projects with studio-grade editing tools.
Uses Metal acceleration on macOS and CUDA acceleration on Windows and Linux for fast local inference.
Powered by Alibaba's Qwen3-TTS model for exceptional voice quality and accuracy.
Create multi-voice narratives with a timeline-based editor. Arrange tracks, trim clips, and mix conversations.
Combine multiple voice samples for higher quality and more natural-sounding results.
Run GPU inference locally or connect to a remote machine. One-click server setup.
Powered by Whisper for accurate speech-to-text. Extract reference text from voice samples automatically.
Available for macOS, Windows, and Linux. No Python installation required.