lucoiso
(lucoiso)
February 7, 2022, 6:33pm
1
An Unreal Engine plugin that integrates Azure Speech Cognitive Services into the Engine by adding functions to perform recognition and synthesis via asynchronous tasks.
Editor Tool
AzSpeech also includes a new Editor Tool to generate audio as USoundWave assets directly in the Engine:
Links
Support me: Sponsor @lucoiso on GitHub Sponsors
16 Likes
lucoiso
(lucoiso)
March 8, 2022, 7:07pm
3
6 Likes
Any tutorial or even a detailed document on Speech to Text using this plugin would be helpful and appreciated! Thanks for the contribution!
6 Likes
lucoiso
(lucoiso)
March 11, 2022, 5:28pm
5
Thanks for the feedback, I’ll get to work on it! : )
1 Like
lucoiso
(lucoiso)
March 11, 2022, 5:30pm
6
Added a new function: Text to WAV!
This new function can save text to a .wav audio file at the specified location/path.
Look at the demo video:
Already on UE Marketplace! : )
Is it possible to play the audio asynchronously inside the game?
Also, can we get a .wav sound file out of the node?
1 Like
lucoiso
(lucoiso)
March 14, 2022, 7:51pm
9
Hey! : )
What’s going on?
This function automatically uses the default microphone from Windows settings.
Are you using the function via blueprint or C++?
And check your API Access Key from your Azure portal:
lucoiso
(lucoiso)
March 14, 2022, 7:58pm
10
Hey! : )
Not with this plugin, but Unreal Engine provides some classes that allow you to get a file from your computer.
Only in Editor
I created an example via C++ where I used a Sound Factory to create a USoundWave at runtime:
// Includes (USoundFactory lives in the Editor-only "AudioEditor" module):
#include "Factories/SoundFactory.h"
#include "Sound/SoundWave.h"
#include "Components/AudioComponent.h"

// Create a transient SoundFactory and import the .wav file as a USoundWave
USoundFactory* NewFactory = NewObject<USoundFactory>(this, TEXT("Sound Factory"));
bool bOperationCanceled = false;
UObject* MyRuntimeObject = NewFactory->FactoryCreateFile(USoundWave::StaticClass(), this, "RuntimeAudio",
    EObjectFlags::RF_NoFlags, TEXT("D:\\PC\\Music\\Captured Audios\\Test.wav"),
    TEXT(""), nullptr, bOperationCanceled);

// Play the imported sound on an existing Audio Component
USoundWave* MyRuntimeAudio = Cast<USoundWave>(MyRuntimeObject);
if (IsValid(MyRuntimeAudio))
{
    AudioCompEx->SetSound(MyRuntimeAudio);
    AudioCompEx->Play();
}
Note: AudioCompEx is a UAudioComponent.
Note 2: Add the “AudioEditor” module to your Build.cs to enable the USoundFactory class.
The result:
With this, you can load a file and play it at a specific location in the level, apply attenuation, etc., via your Audio Component/Sound Wave.
The “AudioEditor” module only works in the Editor. : (
There is a plugin that does this job: gtreshchev/RuntimeAudioImporter: Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime. (github.com)
KILSU77
(KILSU77)
March 15, 2022, 5:31am
12
I’m sorry… that was my fault. I tested Voice to Text with the ‘LanguageID = ko-KR’ setting and it doesn’t work, but with the ‘LanguageID = en-US’ setting it works very well.
1 Like
lucoiso
(lucoiso)
April 5, 2022, 4:35pm
13
Working with the official UE5 release!
4TC0
(4TC0)
April 9, 2022, 1:16pm
14
Hello!
During the Speech to Text process, I would like to set a Push to Talk button and receive the recognized string only after releasing the key, ensuring that my full phrase was parsed, rather than after a period of time without audio. Is there a way to do this?
Also, with my language (pt-BR), the recognized string comes back with incorrect character encoding: it returns “Ã©” instead of “é”, “Ã¡” instead of “á”, etc. Is there a way to change this?
Thanks!!
1 Like
lucoiso
(lucoiso)
April 9, 2022, 4:42pm
15
Released a fix for this:
Release AzSpeech v2.1.1 · lucoiso/UEAzSpeech (github.com)
This update will be on the Marketplace soon
Solution:
Changed the string conversion from wchar_t* to UTF8_TO_TCHAR(STRING.c_str())
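For context, here is a minimal sketch of that kind of conversion (illustrative only, not the plugin’s actual code), assuming the Azure Speech SDK hands back the recognized text as a UTF-8 encoded std::string:

#include <string>
#include "CoreMinimal.h"

// Illustrative helper: convert a UTF-8 std::string (e.g. a recognition result
// from the Azure SDK) into an FString.
FString ConvertRecognizedText(const std::string& RecognizedUtf8)
{
    // Wrong: treating the raw UTF-8 bytes as ANSI characters mangles accented
    // letters, e.g. “é” comes out as “Ã©”.
    // FString Broken(RecognizedUtf8.c_str());

    // Correct: decode the UTF-8 bytes into the engine's TCHAR representation.
    return UTF8_TO_TCHAR(RecognizedUtf8.c_str());
}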
1 Like
Noyag777
(Noyag777)
April 18, 2022, 10:40pm
16
Great plugin. It would be nice to have SSML support, and I am trying to figure out how to abort a sound that is playing. I can’t figure out the location of the ‘default audio component’ so far. Is it in the Game Instance? In the Level Blueprint? Or in the character that speaks the audio?
lucoiso
(lucoiso)
April 19, 2022, 3:31pm
17
Hello @Noyag777 !
This plugin uses the default output/input audio devices from the Windows (OS) settings; at this moment it is not possible to change this with the plugin.
But with the Text-To-WAV function, you can save speech into a .WAV file and stream it to a USoundWave to use inside an Audio Component. : )
Maybe this plugin could help you to stream a .WAV into a UAudioComponent/USoundWave: gtreshchev/RuntimeAudioImporter: Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime. (github.com)
1 Like
Noyag777
(Noyag777)
April 19, 2022, 9:37pm
18
This plugin uses the default output/input audio devices from the Windows (OS) settings
That’s what I was afraid of indeed.
Maybe this plugin could help you to stream a .WAV into a UAudioComponent/USoundWave
Aah, that’s great indeed. I was already trying to figure out if there was a way to do that. Thanks for the link!
1 Like
lucoiso
(lucoiso)
April 20, 2022, 2:38am
19
Some upcoming features:
I’ve committed this update to a separate branch (upcoming-test) and will release it when I think it’s fully functional and bug-free.
But if you want to test these features, feel free to use it and give me feedback! : )
Please note that these features are in testing, some things may change until the release date.
1 Like
Does this technology work with voice commands? I want to use it like Siri: “Hey Unreal, set the weather to thunder and rain.”
lucoiso
(lucoiso)
April 20, 2022, 3:19pm
21
It doesn’t support voice commands by itself, but there are functions that convert your speech into a string (VoiceToTextAsync) and convert a string into speech (TextToVoiceAsync and the upcoming TextToStreamAsync) that you can use to construct this logic.
For example, VoiceToTextAsync returns a string of your recognized speech that you can use inside comparison logic, etc.
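As a rough sketch (the handler name and how you bind it are placeholders; the exact completion callback exposed by VoiceToTextAsync depends on the plugin version), the comparison side could look like this in C++:

#include "CoreMinimal.h"

// Hypothetical handler: call this from the completion callback of VoiceToTextAsync,
// passing the recognized speech as a string.
static void HandleRecognizedSpeech(const FString& RecognizedText)
{
    // Simple case-insensitive keyword matching on the recognized phrase
    if (RecognizedText.Contains(TEXT("weather"), ESearchCase::IgnoreCase) &&
        RecognizedText.Contains(TEXT("rain"), ESearchCase::IgnoreCase))
    {
        UE_LOG(LogTemp, Display, TEXT("Voice command matched: thunder and rain"));
        // ...trigger your game's weather change here...
    }
}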
1 Like
lucoiso
(lucoiso)
April 20, 2022, 7:50pm
22
1 Like
lucoiso
(lucoiso)
April 27, 2022, 1:44pm
23
A short example:
Voice to Text blueprint:
When the M key is pressed, Azure will listen to your default microphone (from the Windows OS settings) and print the recognized speech as a string on your screen.
Getting the API Access Key and Region: lucoiso/UEAzSpeech (github.com)
More information: UEAzSpeech/README.md at main · lucoiso/UEAzSpeech (github.com)
1 Like