Runtime Text To Speech - offline, cross-platform TTS, over 35 languages and 900 voices (+Kokoro support)

gtreshchev · January 17, 2025, 10:43pm

Transform your game with real-time, offline, cross-platform text-to-speech synthesis!

Add offline text-to-speech to your project, with over 40 languages and 900 voices featuring more than 190 voice qualities. Speech is synthesized in real time without an internet connection, powered by Piper, Kokoro and ONNX Runtime.

New! Now featuring Kokoro voice models: high-quality, open-source TTS models with studio-level voice synthesis. Includes 49 models across 8 languages, offering natural and expressive speech output.

Quick links:

Fab link
Packaged Demo Project (Windows)
Documentation
YouTube video demonstration
Discord support chat
Custom Development: solutions@georgy.dev (tailored solutions for teams & organizations)

Key features:

Core Capabilities:

Complete offline text-to-speech synthesis
39 languages supported
900+ unique voices available
120+ voice qualities
Support for Kokoro voice models!
Cross-platform support: Windows, Linux, Mac, Android (including Oculus/Meta Quest), iOS
Experimental support for Meta Quest and Apple Vision Pro

Voice System:

One-click voice model downloads through editor interface
In-editor voice preview and testing
Runtime voice model selection
Raw PCM float audio output
Flexible integration with any audio playback solution
Built-in compatibility with Runtime Audio Importer
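
Since the output is raw PCM float data, feeding it to a playback or file-writing path that expects 16-bit integer samples needs a small conversion step. Below is a minimal, engine-independent sketch of that conversion. The function names are illustrative and not part of the plugin's API, and the sketch assumes samples are normalized to the [-1.0, 1.0] range, which is the usual convention for float PCM.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative helper (not a plugin function): converts normalized float
// samples to signed 16-bit PCM. Values outside [-1.0, 1.0] are clamped so
// they saturate instead of wrapping around; NaN samples become silence.
std::vector<std::int16_t> FloatPcmToInt16(const std::vector<float>& Samples)
{
    std::vector<std::int16_t> Out;
    Out.reserve(Samples.size());
    for (float Sample : Samples)
    {
        if (std::isnan(Sample))
        {
            Out.push_back(0);
            continue;
        }
        const float Clamped = std::max(-1.0f, std::min(1.0f, Sample));
        Out.push_back(static_cast<std::int16_t>(std::lround(Clamped * 32767.0f)));
    }
    return Out;
}

// Illustrative helper: duration in seconds of an interleaved buffer.
// SampleRate and NumChannels must come from wherever the audio was produced;
// no particular values are assumed here.
double BufferDurationSeconds(std::size_t NumSamples, int SampleRate, int NumChannels)
{
    return static_cast<double>(NumSamples) /
           (static_cast<double>(SampleRate) * static_cast<double>(NumChannels));
}
```

Scaling by 32767 rather than 32768 keeps the positive and negative peaks symmetric, at the cost of never producing the value -32768.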

Development Features:

Full Blueprint and C++ API support
Easy voice model management and packaging
Comprehensive voice metadata access
Simple voice model selection via dropdown
Automated voice model packaging with projects
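
To illustrate how voice metadata can drive runtime voice selection, here is a small standalone sketch. Everything in it is hypothetical: the `VoiceInfo` struct, its fields, and `FilterVoicesByLanguage` are made up for this example and do not reflect the plugin's actual types, so check the documentation for the real API.

```cpp
#include <string>
#include <vector>

// Hypothetical metadata record; the plugin's real types and fields may differ.
struct VoiceInfo
{
    std::string Name;
    std::string LanguageCode; // assumed format, e.g. "en_US"
    std::string Quality;      // assumed labels, e.g. "low", "medium", "high"
};

// Hypothetical helper: returns the voices whose language code matches exactly,
// preserving their original order.
std::vector<VoiceInfo> FilterVoicesByLanguage(const std::vector<VoiceInfo>& Voices,
                                              const std::string& LanguageCode)
{
    std::vector<VoiceInfo> Result;
    for (const VoiceInfo& Voice : Voices)
    {
        if (Voice.LanguageCode == LanguageCode)
        {
            Result.push_back(Voice);
        }
    }
    return Result;
}
```

A game could run a filter like this against the player's locale and then pick one of the matching voices by quality.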

Supported Languages:

English (United States) (with Kokoro models)
English (British) (with Kokoro models)
Simplified Chinese (简体中文) (with Kokoro models)
Spanish (Mexican / Español Mexicano)
Spanish (European / Español Europeo) (with Kokoro models)
Korean (한국어)
Russian (Русский)
Portuguese (Brazil / Português do Brasil) (with Kokoro models)
Portuguese (Portugal / Português de Portugal)
Hindi (हिन्दी) (with Kokoro models)
Malayalam (മലയാളം)
German (Deutsch)
French (Français) (with Kokoro models)
Turkish (Türkçe)
Polish (Polski)
Italian (Italiano)
Ukrainian (Украї́нська мо́ва)
Catalan (Català)
Czech (Čeština)
Welsh (Cymraeg)
Danish (Dansk)
Greek (Ελληνικά)
Farsi (فارسی)
Finnish (Suomi)
Hungarian (Magyar)
Icelandic (Íslenska)
Georgian (ქართული ენა)
Kazakh (Қазақша)
Luxembourgish (Lëtzebuergesch)
Latvian (Latviešu)
Nepali (नेपाली)
Dutch (Belgium / Vlaams)
Dutch (Netherlands / Nederlands)
Norwegian (Bokmål / Nynorsk)
Romanian (Română)
Slovak (Slovenčina)
Slovenian (Slovenščina)
Serbian (Srpski)
Swedish (Svenska)
Swahili (Kiswahili)
Vietnamese (Tiếng Việt)

Perfect for:

Accessible game interfaces
Dynamic NPC conversations
Voice-driven tutorials and hints
Procedurally generated content
Localization solutions
Assistive technologies
Interactive storytelling
Educational applications