[FREE] AzSpeech plugin: Text-to-Speech, Speech-to-Text and more with Microsoft Azure Cognitive Services

lucoiso · February 7, 2022, 6:33pm

An Unreal Engine plugin that integrates Azure Speech Cognitive Services into the Engine by adding functions to perform recognition and synthesis via asynchronous tasks.

Editor Tool

AzSpeech also includes a new Editor Tool that generates audio as USoundWave assets directly in the Engine:

Links

Support me: Sponsor @lucoiso on GitHub Sponsors

lucoiso · March 8, 2022, 7:07pm

Plugin published on Unreal Marketplace!

Download: AzSpeech - Async Text-to-Voice and Voice-to-Text in Code Plugins - UE Marketplace (unrealengine.com)

UnrealFreak100 · March 9, 2022, 8:08am

Any tutorial or even a detailed document on Speech to Text using this plugin is helpful and appreciated! Thanks for the contribution!

lucoiso · March 11, 2022, 5:28pm

Thanks for the feedback, I’ll get to work on it! : )

lucoiso · March 11, 2022, 5:30pm

Added a new function: Text to WAV!

This new function converts text to speech and saves the result as a .wav audio file at the specified location/path.

Look at the demo video:

Already on UE Marketplace! : )

UnrealFreak100 · March 12, 2022, 4:23am

Is there a possibility to play the audio async inside the game?
Also can we get sound wav file out from the node?

lucoiso · March 14, 2022, 7:51pm

Hey! : )
What’s going on?
This function automatically uses the default microphone from Windows settings.

Are you using the function via blueprint or C++?
And check your API Access Key from your Azure portal:

lucoiso · March 14, 2022, 7:58pm

Hey! : )
Not with this plugin, but Unreal Engine provides some classes that allow you to get a file from your computer.

Only in Editor

I created an example via C++ where I used a Sound Factory to create a USoundWave at runtime:
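// USoundFactory comes from the editor-only "AudioEditor" module.
USoundFactory* NewFactory = NewObject<USoundFactory>(this, TEXT("Sound Factory"));

// Out parameter of FactoryCreateFile; initialized so it never holds an indeterminate value.
bool bOperationCanceled = false;
UObject* MyRuntimeObject = NewFactory->FactoryCreateFile(USoundWave::StaticClass(), this, TEXT("RuntimeAudio"),
							EObjectFlags::RF_NoFlags, TEXT("D:\\PC\\Music\\Captured Audios\\Test.wav"),
							TEXT(""), nullptr, bOperationCanceled);

// Cast returns nullptr if the import failed or produced another type.
USoundWave* MyRuntimeAudio = Cast<USoundWave>(MyRuntimeObject);
if (IsValid(MyRuntimeAudio))
{
	AudioCompEx->SetSound(MyRuntimeAudio);
	AudioCompEx->Play();
}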
Note: AudioCompEx is a UAudioComponent.
Note 2: Add the “AudioEditor” module to your Build.cs to enable the USoundFactory class.

The result:

With this, you can load a file and play it at a specific location in the level, apply attenuation, etc., via your Audio Component/Sound Wave.

The “AudioEditor” module only works in the Editor. : (

There is a plugin that does this job: gtreshchev/RuntimeAudioImporter: Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime. (github.com)

KILSU77 · March 15, 2022, 5:31am

I’m sorry… that was my fault. I tested voice to text:
with the ‘LanguageID = ko-KR’ setting it doesn’t work, but with the ‘LanguageID = en-US’ setting it works very well.

lucoiso · April 5, 2022, 4:35pm

Works with the official UE5 release!

4TC0 · April 9, 2022, 1:16pm

Hello!

During the Speech to Text process, I would like to set up a push-to-talk button and receive the recognized string only after releasing the key, so that my full phrase is parsed, rather than after a period of time without audio. Is there a way to do this?
Also, with my language (pt-BR), the recognized string comes back with a character encoding that is not correct. The recognized string returns “Ã©” instead of “é”, “Ã¡” instead of “á”, etc. Is there a way to change this?

Thanks!!

lucoiso · April 9, 2022, 4:42pm

Released a fix to this:
Release AzSpeech v2.1.1 · lucoiso/UEAzSpeech (github.com)

This update will be on the Marketplace soon

Solution:

Changed the string conversion from wchar_t* to UTF8_TO_TCHAR(STRING.c_str())
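
To illustrate why the symptom above looks the way it does, here is a minimal sketch in plain standard C++ (no Unreal or Azure dependency). It assumes the recognized text arrives as a UTF-8 encoded std::string; the function names WidenBytes and DecodeUtf8 are hypothetical and exist only for this illustration. Widening each byte to its own character turns the two UTF-8 bytes of “é” (0xC3 0xA9) into “Ã©”, while decoding them as UTF-8, which is the kind of conversion UTF8_TO_TCHAR performs, yields the single character “é”.

```cpp
#include <cstddef>
#include <string>

// Wrong approach: treat every byte as one character.
// The UTF-8 bytes 0xC3 0xA9 become U+00C3 ('Ã') and U+00A9 ('©').
std::u32string WidenBytes(const std::string& Bytes)
{
    std::u32string Out;
    for (unsigned char Byte : Bytes)
    {
        Out.push_back(static_cast<char32_t>(Byte));
    }
    return Out;
}

// Right approach: decode multi-byte UTF-8 sequences into code points.
// Minimal sketch: malformed or truncated input yields U+FFFD.
std::u32string DecodeUtf8(const std::string& Bytes)
{
    std::u32string Out;
    for (std::size_t i = 0; i < Bytes.size();)
    {
        const unsigned char Lead = static_cast<unsigned char>(Bytes[i]);
        char32_t Code = 0;
        std::size_t Extra = 0; // number of continuation bytes expected

        if (Lead < 0x80)              { Code = Lead;        Extra = 0; }
        else if ((Lead >> 5) == 0x06) { Code = Lead & 0x1F; Extra = 1; }
        else if ((Lead >> 4) == 0x0E) { Code = Lead & 0x0F; Extra = 2; }
        else if ((Lead >> 3) == 0x1E) { Code = Lead & 0x07; Extra = 3; }
        else { Out.push_back(0xFFFD); ++i; continue; }

        if (i + Extra >= Bytes.size())
        {
            Out.push_back(0xFFFD); // sequence cut off at end of input
            break;
        }
        for (std::size_t k = 1; k <= Extra; ++k)
        {
            const unsigned char Next = static_cast<unsigned char>(Bytes[i + k]);
            Code = (Code << 6) | (Next & 0x3F);
        }
        Out.push_back(Code);
        i += Extra + 1;
    }
    return Out;
}
```

Calling WidenBytes("\xC3\xA9") gives the two characters U+00C3 U+00A9 (“Ã©”), whereas DecodeUtf8("\xC3\xA9") gives the single character U+00E9 (“é”).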

Noyag777 · April 18, 2022, 10:40pm

Great plugin. Would be nice to have SSML supported and I am trying to figure out how to abort some sound. Can’t figure out the location of the ‘default audio component’ so far. Is it at the Game Instance? Or at the Level Blueprint? Or at the character that speaks audio?

lucoiso · April 19, 2022, 3:31pm

Hello @Noyag777!

This plugin uses the default output/input audio devices from the Windows (OS) settings; at the moment it is not possible to change this with the plugin.

But with the Text-To-WAV function, you can save speech into a .WAV file and stream it to a USoundWave to use inside an Audio Component. : )

Maybe this plugin could help you to stream a .WAV into a UAudioComponent/USoundWave: gtreshchev/RuntimeAudioImporter: Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime. (github.com)

Noyag777 · April 19, 2022, 9:37pm

This plugin uses the default output/input audio devices from Windows (OS) Settings

That’s what I was afraid of indeed.

Maybe this plugin could help you to stream a .WAV into a UAudioComponent/USoundWave

Aah, that’s great indeed. I was already trying to figure out if there was a way to do that. Thanks for the link!

lucoiso · April 20, 2022, 2:38am

Some upcoming features:

I’ve committed this update to a separate branch (upcoming-test) and will release it when I think it’s fully functional and bug free.

But if you want to test these features, feel free to use it and give me feedback! : )

Please note that these features are in testing, some things may change until the release date.

Engineering_Visions · April 20, 2022, 4:36am

Does this technology work with voice commands? I want to use it like Siri: “Hey Unreal, set the weather to thunder and rain.”

lucoiso · April 20, 2022, 3:19pm

It doesn’t support voice commands by itself, but there are some functions that convert your speech into a string (VoiceToTextAsync) and convert a string into speech (TextToVoiceAsync and the upcoming TextToStreamAsync) that you can use to construct this logic.

For example, VoiceToTextAsync returns your recognized speech as a string, which you can use in comparison logic, etc.
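
As a rough sketch of that comparison logic, the standard C++ below matches a recognized string against a list of keywords. It is not part of the plugin: FVoiceCommand, ToLower, MatchCommand and the action names are all hypothetical, and the recognized text is assumed to be already available as a plain string (for example, after VoiceToTextAsync has completed).

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical pairing of a spoken keyword with the action it should trigger.
struct FVoiceCommand
{
    std::string Keyword;
    std::string Action;
};

// Lower-case copy (ASCII only) so matching ignores capitalization.
std::string ToLower(std::string Text)
{
    std::transform(Text.begin(), Text.end(), Text.begin(),
        [](unsigned char C) { return static_cast<char>(std::tolower(C)); });
    return Text;
}

// Returns the action of the first command whose keyword occurs in the
// recognized text, or an empty string when nothing matches.
std::string MatchCommand(const std::string& Recognized,
                         const std::vector<FVoiceCommand>& Commands)
{
    const std::string Haystack = ToLower(Recognized);
    for (const FVoiceCommand& Command : Commands)
    {
        if (Haystack.find(ToLower(Command.Keyword)) != std::string::npos)
        {
            return Command.Action;
        }
    }
    return {};
}
```

With the commands {"thunder", "SetWeatherThunder"} and {"sunny", "SetWeatherSunny"}, the phrase “Hey Unreal, set the weather to thunder and rain.” maps to SetWeatherThunder. Inside Unreal the same idea could be written with FString::Contains instead of std::string::find.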

lucoiso · April 20, 2022, 7:50pm

New features released:

Marketplace: AzSpeech - Text and Voice in Code Plugins - UE Marketplace (unrealengine.com)
Github: Release AzSpeech v3.0 · lucoiso/UEAzSpeech (github.com)

lucoiso · April 27, 2022, 1:44pm

A short example:

Voice to Text blueprint:

[Image: Voice to Text blueprint]

When the M key is pressed, the plugin listens on your default microphone (from the Windows OS settings), sends the audio to Azure, and prints the recognized speech as a string on your screen.

Getting the API Access Key and Region: lucoiso/UEAzSpeech (github.com)


More information: UEAzSpeech/README.md at main · lucoiso/UEAzSpeech (github.com)