kimilou - KMAudioCap - Offline Speech Recognition

KMAudioCap - UE Voice Recognition (Eng, Chi & Dialects)

KMAudioCap is a high-performance, fully local speech recognition plugin for Unreal Engine developers. It is built on the open-source sherpa-onnx speech recognition framework and includes Voice Activity Detection (VAD). It suits projects that need English or Chinese voice interaction, low-latency real-time recognition, or voice input that never leaves the user's machine. It adds Speech-to-Text to your UE project so players can interact by voice.

Key Features:

Chinese & English Speech Recognition: Optimized primarily for Mandarin Chinese, with English also supported.

High-Accuracy Voice Activity Detection (VAD): Intelligently identifies and filters out non-speech segments (e.g., silence, background noise), ensuring only valid speech data is processed. This significantly improves recognition accuracy and efficiency.
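To illustrate the general idea of VAD gating, here is a minimal sketch that drops quiet frames using a simple energy threshold. This is not how KMAudioCap or sherpa-onnx detects speech (model-based VAD is far more robust against noise); the names FrameRms and KeepSpeechFrames and the threshold value are hypothetical and exist only for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: root-mean-square level of one frame of mono float samples.
float FrameRms(const std::vector<float>& Frame)
{
    if (Frame.empty())
    {
        return 0.0f;
    }
    double SumSquares = 0.0;
    for (float Sample : Frame)
    {
        SumSquares += static_cast<double>(Sample) * Sample;
    }
    return static_cast<float>(std::sqrt(SumSquares / static_cast<double>(Frame.size())));
}

// Hypothetical gate: keeps only frames whose RMS level is at or above Threshold.
// A real VAD would also smooth decisions over time to avoid clipping word edges.
std::vector<std::vector<float>> KeepSpeechFrames(
    const std::vector<std::vector<float>>& Frames, float Threshold)
{
    std::vector<std::vector<float>> Speech;
    for (const auto& Frame : Frames)
    {
        if (FrameRms(Frame) >= Threshold)
        {
            Speech.push_back(Frame);
        }
    }
    return Speech;
}
```

With a threshold of 0.05, a silent frame and a frame of very faint noise are discarded, while a frame alternating between 0.5 and -0.5 (RMS 0.5) is kept and would be passed on to recognition.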

Local Offline Operation: Entirely based on local ONNX models, eliminating the need for any external network services. This provides:

  • Low Latency: Fast speech processing with responsive performance.

  • Data Privacy Assurance: User voice data does not need to be uploaded to the cloud, protecting privacy and security.

  • Stable & Reliable: Unaffected by network connectivity, ensuring consistent performance in any environment.

High Performance & Optimized Resources: Thread-safe audio processing and tuned VAD parameters keep CPU and memory usage low, minimizing the impact on your UE project's performance.
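As a rough picture of what thread-safe audio processing can look like, the sketch below hands audio chunks from a capture thread to a recognition thread through a mutex-protected queue. It is an assumption-laden illustration written in standard C++, not the plugin's actual implementation; the class name FAudioChunkQueue and its methods are hypothetical.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical queue: the capture side calls Push, the recognition side calls Pop.
class FAudioChunkQueue
{
public:
    // Adds one chunk of samples and wakes a waiting consumer.
    void Push(std::vector<float> Chunk)
    {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Chunks.push_back(std::move(Chunk));
        }
        Ready.notify_one();
    }

    // Blocks until a chunk is available or the queue is closed.
    // Returns std::nullopt once the queue is closed and fully drained.
    std::optional<std::vector<float>> Pop()
    {
        std::unique_lock<std::mutex> Lock(Mutex);
        Ready.wait(Lock, [this] { return !Chunks.empty() || bClosed; });
        if (Chunks.empty())
        {
            return std::nullopt;
        }
        std::vector<float> Chunk = std::move(Chunks.front());
        Chunks.pop_front();
        return Chunk;
    }

    // Signals that no more audio will arrive, releasing any blocked consumer.
    void Close()
    {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            bClosed = true;
        }
        Ready.notify_all();
    }

private:
    std::mutex Mutex;
    std::condition_variable Ready;
    std::deque<std::vector<float>> Chunks;
    bool bClosed = false;
};
```

In this design the game thread never touches the audio buffers directly; a worker thread loops on Pop until it returns std::nullopt. A production audio path would often prefer a lock-free ring buffer so the capture callback never waits on a lock.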

Memory Management: Built-in cache cleanup keeps performance stable during long sessions.

Easy Integration: Provides user-friendly and intuitive Blueprint and C++ interfaces, allowing developers to quickly integrate it into existing UE projects for voice commands, voice chat, AI voice interaction, and more.

Real-time Streaming Recognition: KMAudioCap is designed specifically for continuous recognition of real-time audio streams from microphones. It does not process pre-recorded audio files.
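Streaming recognition generally means consuming microphone audio in small pieces as it arrives. Microphone callbacks tend to deliver buffers of varying size, while recognizers and VADs usually expect fixed-size frames, so some buffering step is typically needed. The sketch below shows one way to do that; FFrameAccumulator is a hypothetical name and this is not taken from the plugin's source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: regroups arbitrarily sized sample buffers into fixed-size frames.
class FFrameAccumulator
{
public:
    // A frame size of zero is treated as one sample to avoid an endless loop.
    explicit FFrameAccumulator(std::size_t InFrameSize)
        : FrameSize(InFrameSize == 0 ? 1 : InFrameSize)
    {
    }

    // Appends incoming samples and returns every complete frame now available.
    // Leftover samples stay buffered until the next call.
    std::vector<std::vector<float>> Feed(const std::vector<float>& Samples)
    {
        Pending.insert(Pending.end(), Samples.begin(), Samples.end());

        std::vector<std::vector<float>> Frames;
        std::size_t Offset = 0;
        while (Pending.size() - Offset >= FrameSize)
        {
            const auto First = Pending.begin() + static_cast<std::ptrdiff_t>(Offset);
            Frames.emplace_back(First, First + static_cast<std::ptrdiff_t>(FrameSize));
            Offset += FrameSize;
        }
        Pending.erase(Pending.begin(), Pending.begin() + static_cast<std::ptrdiff_t>(Offset));
        return Frames;
    }

    // Number of samples waiting for the next complete frame.
    std::size_t PendingSamples() const { return Pending.size(); }

private:
    std::size_t FrameSize;
    std::vector<float> Pending;
};
```

For example, with a frame size of 4, feeding 3 samples yields no frames, and feeding 6 more yields two frames with one sample left pending.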

Ideal Use Cases:

  • In-Game Voice Commands: Enable voice control for character movement, actions, or interaction with NPCs.

  • Immersive AI Voice Interaction: Grant AI characters in games or VR applications the ability for natural spoken dialogue.

  • Voice Chat Systems: Integrate efficient Speech-to-Text chat functionality in multiplayer games.

  • Educational & Simulation Applications: Provide voice input methods for learning, training, and simulated environments.

  • General Voice Input: Any Unreal Engine project requiring conversion of user voice to text input.
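For the voice command use case above, recognized text usually has to be mapped to a game action. The following is a minimal sketch in standard C++ that assumes the recognizer hands you a plain text transcript; the enum, the function names, and the command phrases are all hypothetical, and how KMAudioCap actually delivers its results is described in the plugin's own documentation.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical set of actions a game might trigger by voice.
enum class EVoiceCommand { None, Jump, OpenDoor, Reload };

// Lowercases ASCII letters and trims surrounding whitespace.
// Non-ASCII bytes (for example UTF-8 encoded Chinese) pass through unchanged.
std::string NormalizeTranscript(std::string Text)
{
    std::transform(Text.begin(), Text.end(), Text.begin(),
        [](unsigned char C) { return static_cast<char>(std::tolower(C)); });

    const auto IsSpace = [](unsigned char C) { return std::isspace(C) != 0; };
    Text.erase(Text.begin(), std::find_if_not(Text.begin(), Text.end(), IsSpace));
    Text.erase(std::find_if_not(Text.rbegin(), Text.rend(), IsSpace).base(), Text.end());
    return Text;
}

// Exact-phrase lookup; returns None when the transcript is not a known command.
EVoiceCommand MatchCommand(const std::string& Transcript)
{
    static const std::unordered_map<std::string, EVoiceCommand> Commands = {
        {"jump", EVoiceCommand::Jump},
        {"open the door", EVoiceCommand::OpenDoor},
        {"reload", EVoiceCommand::Reload},
    };
    const auto It = Commands.find(NormalizeTranscript(Transcript));
    return It != Commands.end() ? It->second : EVoiceCommand::None;
}
```

Exact matching keeps the example short. Real transcripts contain filler words and recognition errors, so keyword or fuzzy matching is usually a better fit in practice.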

System Compatibility:

Unreal Engine Version: Compatible with UE 5.3 and later versions.

Supported Platform: Windows (64-bit) only.

System Requirements & Notes:

This plugin relies on Unreal Engine's built-in AudioCapture system for real-time microphone input. Make sure AudioCapture is enabled in your project; if it is disabled, the plugin cannot process microphone audio.

Quick Start:

  1. Unzip the KMAudioCap plugin and place it into your Unreal Engine project's Plugins folder.

  2. Enable the KMAudioCap plugin in the UE Editor.

  3. Refer to the detailed documentation and sample projects included with the plugin to start integrating speech recognition into your UE application.

KMAudioCap brings local, real-time voice interaction to your Unreal Engine projects.