Runtime Text To Speech - offline, cross-platform TTS, over 35 languages and 900 voices (+Kokoro support)

:speaking_head: Transform your game with real-time, offline, cross-platform text-to-speech synthesis!

Add powerful offline text-to-speech capabilities to your project with 39 languages, more than 900 voices, and more than 120 voice qualities. Synthesize speech in real time without an internet connection, powered by Piper, Kokoro and ONNX Runtime.

:rocket: New! Now featuring Kokoro voice models – high-quality, open-source TTS architectures with studio-level voice synthesis. Includes 45 models across 6 languages, offering natural and expressive speech output.

Key features:

:dart: Core Capabilities:

  • Complete offline text-to-speech synthesis
  • 39 languages supported
  • 900+ unique voices available
  • 120+ voice qualities
  • Support for Kokoro voice models!
  • Cross-platform support: Windows, Linux, Mac, Android (including Oculus/Meta Quest), iOS
  • Experimental support for Meta Quest and Apple Vision Pro

:zap: Voice System:

  • One-click voice model downloads through editor interface
  • In-editor voice preview and testing
  • Runtime voice model selection
  • Raw PCM float audio output
  • Flexible integration with any audio playback solution
  • Built-in compatibility with Runtime Audio Importer
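Because the output is raw PCM float data, it can be passed to almost any audio backend after a small conversion step. The sketch below is only an illustration of that idea in plain C++17: the function names `FloatPcmToInt16` and `PcmDurationSeconds` are hypothetical helpers, not part of the plugin's API, and it assumes samples are normalized to the range [-1.0, 1.0] and interleaved by channel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a plugin function): converts normalized float PCM
// samples to signed 16-bit PCM. Values outside [-1.0, 1.0] are clamped, and
// non-finite values (NaN, infinity) are written as silence.
std::vector<std::int16_t> FloatPcmToInt16(const std::vector<float>& samples)
{
    std::vector<std::int16_t> out;
    out.reserve(samples.size());
    for (float s : samples)
    {
        if (!std::isfinite(s))
        {
            out.push_back(0);
            continue;
        }
        const float clamped = std::clamp(s, -1.0f, 1.0f);
        out.push_back(static_cast<std::int16_t>(std::lround(clamped * 32767.0f)));
    }
    return out;
}

// Hypothetical helper: duration in seconds of an interleaved PCM buffer.
// sampleCount is the total number of samples across all channels.
double PcmDurationSeconds(std::size_t sampleCount, int sampleRate, int numChannels)
{
    if (sampleRate <= 0 || numChannels <= 0)
    {
        return 0.0;
    }
    return static_cast<double>(sampleCount) /
           (static_cast<double>(sampleRate) * static_cast<double>(numChannels));
}
```

The sample rate and channel count depend on the voice model in use, so they should be read from the synthesis result rather than hard-coded. For example, a mono buffer of 44,100 samples at an assumed 22,050 Hz lasts 2 seconds.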

:hammer_and_wrench: Development Features:

  • Full Blueprint and C++ API support
  • Easy voice model management and packaging
  • Comprehensive voice metadata access
  • Simple voice model selection via dropdown
  • Automated voice model packaging with projects

:earth_africa: Supported Languages:

  • :us: English (United States) (with Kokoro models)
  • :uk: English (British) (with Kokoro models)
  • :cn: Simplified Chinese (简体中文)
  • :mexico: Spanish (Mexican / Español Mexicano)
  • :es: Spanish (European / Español Europeo) (with Kokoro models)
  • :ru: Russian (Русский)
  • :brazil: Portuguese (Brazil / Português do Brasil) (with Kokoro models)
  • :portugal: Portuguese (Portugal / Português de Portugal)
  • :india: Hindi (हिन्दी) (with Kokoro models)
  • :de: German (Deutsch)
  • :fr: French (Français) (with Kokoro models)
  • :tr: Turkish (Türkçe)
  • :poland: Polish (Polski)
  • :it: Italian (Italiano)
  • :ukraine: Ukrainian (Украї́нська мо́ва)
  • :andorra: Catalan (Català)
  • :czech_republic: Czech (Čeština)
  • :wales: Welsh (Cymraeg)
  • :denmark: Danish (Dansk)
  • :greece: Greek (Ελληνικά)
  • :iran: Farsi (فارسی)
  • :finland: Finnish (Suomi)
  • :hungary: Hungarian (Magyar)
  • :iceland: Icelandic (Íslenska)
  • :georgia: Georgian (ქართული ენა)
  • :kazakhstan: Kazakh (Қазақша)
  • :luxembourg: Luxembourgish (Lëtzebuergesch)
  • :latvia: Latvian (Latviešu)
  • :nepal: Nepali (नेपाली)
  • :belgium: Dutch (Belgium / Vlaams)
  • :netherlands: Dutch (Netherlands / Nederlands)
  • :norway: Norwegian (Bokmål / Nynorsk)
  • :romania: Romanian (Română)
  • :slovakia: Slovak (Slovenčina)
  • :slovenia: Slovenian (Slovenščina)
  • :serbia: Serbian (Srpski)
  • :sweden: Swedish (Svenska)
  • :kenya: Swahili (Kiswahili)
  • :vietnam: Vietnamese (Tiếng Việt)

:video_game: Perfect for:

  • Accessible game interfaces
  • Dynamic NPC conversations
  • Voice-driven tutorials and hints
  • Procedurally generated content
  • Localization solutions
  • Assistive technologies
  • Interactive storytelling
  • Educational applications