[FREE] AzSpeech plugin: Text-to-Speech, Speech-to-Text and more with Microsoft Azure Cognitive Services

SpeechGPT

image1920×1032 21.3 KB

About

Created an example project of a basic “voice assistant” mixing 2 plugins that I’ve created:

HttpGPT: An plugin that allows the dev to send HTTP requests to OpenAI GPT.

AzSpeech: An plugin that integrates Azure Speech Cognitive Services to Unreal Engine.

Project Repository: UESpeechGPT (github.com)

Video

lucoiso · February 5, 2023, 9:14pm

AzSpeech v1.3.12

Release: AzSpeech v1.3.12 (github.com)
Marketplace: AzSpeech - Voice and Text in Code Plugins - UE Marketplace (unrealengine.com)
Pull Request: v1.3.12 by @lucoiso in v1.3.12 by lucoiso · Pull Request #105 · lucoiso/UEAzSpeech · GitHub

What’s Changed

Hotfix: Set param as default to self to avoid multiple inputs in blueprint pure node by @lucoiso in hotfix: set param as default to self to avoid multiple inputs in blueprint pure node by lucoiso · Pull Request #100 · lucoiso/UEAzSpeech · GitHub
Exposed settings: Candidate Languages; Phrase List Map; Recognition Map; String Delimiters by @lucoiso in Exposed settings: Candidate Languages; Phrase List Map; Recognition Map; String Delimiters by lucoiso · Pull Request #102 · lucoiso/UEAzSpeech · GitHub
Marketplace: Change (Again) the way the dependencies linkages occur when the plugin is targeting editor builds if installed via marketplace by @lucoiso in Marketplace: Change (Again) the way the dependencies linkages occur when the plugin is targeting editor builds if installed via marketplace by lucoiso · Pull Request #104 · lucoiso/UEAzSpeech · GitHub

Full Changelog: Comparing v1.3.11...v1.3.12 · lucoiso/UEAzSpeech · GitHub

Experimental Multi-Platform Support: experimental/MULTI-PLATFORM (github.com)

New Example Project: SpeechGPT

Example Project that uses both OpenAI GPT-3 and Microsoft Azure Speech Services

Links:

lucoiso · February 7, 2023, 12:17am

Related to the new example project, created a new post for HttpGPT! : )

lucoiso · February 19, 2023, 4:35pm

Hi @BeehiveBob !

I made some changes in the Sound Wave generation, but I can’t test it on the Steam Deck because I don’t have it, hehehe.

Ref. commit: Set Sound Wave generation to allow storing the file in project content ( (github.com)
Branch: development (github.com)

About the changes

Added more information to generated audio and a way to save generated Soundwaves to allow users to generate audio that can be packaged into the project as USoundWaves.

This gave me the opportunity to create an Editor Tool to generate and save the audios without the need to have a blueprint and generate them in simulations, this will help to avoid costs in runtime, as we will be able to generate the audios before packaging, hehehehe. I will start working on it soon.

These changes were made in the ‘Convert … to Sound Wave’ functions:

Example:

Notes

These changes are currently only in development branch: development (github.com)
Tasks that uses the Sound Wave generation are already using these changes but will continue generating Transient Sound Waves.
To generate the Sound Waves and save in project’s content, you can use the ‘… to Audio Data’ tasks and generate a Sound Wave using the Audio Data as parameter in the modified functions.

lucoiso · February 19, 2023, 10:09pm

Currently only tested on UE5.1, but it’s coming!

And including a new function to get the available voices:

Issues

Branch: feature/EDITOR-TOOL-113 (github.com)

lucoiso · February 20, 2023, 1:51am

AzSpeech v1.4.0

Release: AzSpeech v1.4.0 (github.com)
Marketplace: Cancelled - Sent a newer version
Pull Request: v1.4.0 by lucoiso · Pull Request #117 · lucoiso/UEAzSpeech · GitHub

Changes

Notes

Documentation will be updated soon: Update the documentation with new Features · Issue #116 · lucoiso/UEAzSpeech · GitHub

Known Issues

Add checks: Verify Settings before performing some tasks · Issue #118 · lucoiso/UEAzSpeech · GitHub

Screenshots

New Editor Tool: Audio Generator

New Function: Get Available Voices

New Settings

lucoiso · February 20, 2023, 2:22pm

AzSpeech v1.4.1

Release: AzSpeech v1.4.1 (github.com)
Marketplace: AzSpeech - Voice and Text in Code Plugins - UE Marketplace (unrealengine.com)
Pull Request: v1.4.1 by @lucoiso in v1.4.1 by lucoiso · Pull Request #122 · lucoiso/UEAzSpeech · GitHub

What’s Changed

Fix Audio Generator Tool: Avoid multiple calls if there's no change in locale param · Issue #120 · lucoiso/UEAzSpeech · GitHub
Fix Audio Generator Tool: Change the Modules path finding to only get the owning plugins with Content enabled · Issue #119 · lucoiso/UEAzSpeech · GitHub
Fix Add checks: Verify Settings before performing some tasks · Issue #118 · lucoiso/UEAzSpeech · GitHub

Full Changelog: Comparing v1.4.0...v1.4.1 · lucoiso/UEAzSpeech · GitHub

Post Commits

v1.4.1 Marketplace Hotfix: Add missing includes · lucoiso/UEAzSpeech@daf9316 · GitHub

lucoiso · February 20, 2023, 10:32pm

AzSpeech v1.4.2

Release: AzSpeech v1.4.2 (github.com)
Marketplace: AzSpeech - Voice and Text in Code Plugins - UE Marketplace (unrealengine.com)
Pull Request: v1.4.2 Hotfix: Fix Audio Generator’ Generation Button Status (github.com)

What’s Changed

v1.4.2 Hotfix: Fix Audio Generator’ Generation Button Status by @lucoiso in v1.4.2 Hotfix: Fix Audio Generator' Generation Button Status by lucoiso · Pull Request #124 · lucoiso/UEAzSpeech · GitHub

Full Changelog: Comparing v1.4.1...v1.4.2 · lucoiso/UEAzSpeech · GitHub

lucoiso · February 21, 2023, 4:57pm

AzSpeech v1.4.3

Release: AzSpeech v1.4.3 (github.com)
Marketplace: Waiting for approval
Pull Request: v1.4.3 by @lucoiso in v1.4.3 by lucoiso · Pull Request #132 · lucoiso/UEAzSpeech · GitHub

What’s Changed

Full Changelog: Comparing v1.4.2...v1.4.3 · lucoiso/UEAzSpeech · GitHub

nietolerancyjny · February 22, 2023, 7:35am

Hey,
Great plugin, love it.

I have a few problems though:

when I use the text to speech function it only gets the second part of the sentence as result. E.g. if I say “lorem ipsum one two three four five” I get the string as “three four five”.
the same problem occurs if I use the button to speak (when I press the button it calls the text to speech function)
I decided to work around the truncated sentence problem by recording the audio using the audio component to a wav file. And then using the wav file to text function. Unfortunately in this case the wav file to text function does not work at all, i.e. I do not get any result.

The logic is as follows:
KeyPressed: Audio Capture Enable → Start Recording Output
KeyReleased: Audio Capture Disable → Finish Recording Output (export type wav file) → delay (1sec) → Wav file to text

To make sure the wav file was saved correctly, I changed the wav file to text function to convert wav file to usoundwave and then plugged its output into play sound 2d and the wav file was read correctly. Unfortunately the wav file to text function does not work for me at all.

Logs:
UE4.27
AzSpeech v1.4.3

PIE View:

Task: WavFileToText (155787)
ActivationTime: 6milliseconds
ActiveTime (102 second) -> Never stops
Current recognised string:

OutputLog:

LogAzSpeech: Display: Task: WavFileToText (155787); Function: Activate; Message: Activating task
LogAzSpeech_Internal: Display: Task: WavFileToText (155787); Function: StartAzureTaskWork; Message: Starting Azure SDK task
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Init; Message: Initializing runnable thread
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: CanInitializeTask; Message: Checking if can initialize task in current context
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Run; Message: Running runnable thread work
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: InitializeAzureObject; Message: Initializing Azure Object
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: InitializeAzureObject; Message: Creating recognizer object
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: CreateSpeechConfig; Message: Creating Azure SDK speech config
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: ApplySDKSettings; Message: Applying Azure SDK Settings
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: EnableLogInConfiguration; Message: Enabling Azure SDK log
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: ApplySDKSettings; Message: Using language: en-PL
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Run; Message: Starting recognition

lucoiso · February 22, 2023, 3:36pm

Hi @nietolerancyjny ! : )

I was testing here (UE4.27 in editor) and I couldn’t reproduce the issue using Text to Speech:

But when i try to use .wav to Text or Speech to Text using en-PL as language or candidate language, the tasks doesn’t recognize anything.

Searching the supported languages page, I didn’t find en-PL: Language support - Speech service - Azure Cognitive Services | Microsoft Learn

I’m using en-PL because I saw in your log that you are using it, hehehe

One of the problems is possibly because of the unsupported language, could you test using another one like en-US instead of en-PL?

I will continue to investigate the reason for the first issue mentioned. But could you send me the SDK logs? : )

You can enable it in Project Settings → Plugins → AzSpeech → Enable SDK Logs
The logs are generated in: PROJECT_DIRECTORY/Saved/Logs/AzSpeech

nietolerancyjny · February 24, 2023, 8:12am

I couldn’t reproduce the issue using Text to Speech

maybe it’s not a problem with the plugin but with the internet connection or some other settings? I’ve also noticed that using the built-in editor function “AzSpeech audio generator” not always, but sometimes cuts off my first words of text (both in Polish and English). For example, I set your text to the built-in AZSpeech audio generator engine:

"One of the problems is possibly because of the unsupported language, could you test using another one like en-US instead of en-PL?
I will continue to investigate the reason for the first issue mentioned. But could you send me the SDK logs? : )"

And in the wav file I received I can only hear “DK logs”
If I use the generate audio button again, I already get the whole speech

UPDATE:
ok, I think I found the cause, now it doesn’t cut off the first part of my sentence. I disabled the micforon input device (live gamer portable 2 plus) in the windows settings and left only Vive Pro Multimedia Audio.

But when i try to use .wav to Text or Speech to Text using en-PL as language or candidate language, the tasks doesn’t recognize anything.

I use pl-PL not en-PL

Searching the supported languages page, I didn’t find en-PL

as above. I use pl-PL and that is on the supported list.

AZ settings

logs:
UEAzSpeech 2023.02.24-07.54.53.log (72.1 KB)
UEAzSpeech 2023.02.24-07.54.13.log (123.1 KB)
UEAzSpeech 2023.02.24-07.52.43.log (105.1 KB)
UEAzSpeech 2023.02.24-07.47.39.log (98.1 KB)
UEAzSpeech 2023.02.24-07.47.13.log (97.8 KB)

BTW.
How can I force speech to text to work continually without having to use any buttons? I’ve done something like this, but when i exit PIE my editor crash

lucoiso · February 25, 2023, 5:49pm

I’ll check this soon

lucoiso · February 25, 2023, 5:50pm

AzSpeech v1.4.4

Release: AzSpeech v1.4.4 (github.com)
Marketplace: Publish failed
Pull Request: v1.4.4 (github.com)

Changes

Add new settings: Change Recognition Format & Change Synthesis Format
Set Compression to use Default Compression Type from Project Settings → Audio
Categorize Settings
Add Scope Locks in Runnable’ funcs

lucoiso · February 26, 2023, 4:36pm

AzSpeech v1.4.5

Release: AzSpeech v1.4.5 (github.com)
Marketplace: AzSpeech - Voice and Text in Code Plugins - UE Marketplace
Pull Request: v1.4.5 by @lucoiso in v1.4.5 by lucoiso · Pull Request #139 · lucoiso/UEAzSpeech · GitHub

What’s Changed

Hotfix: Fix compiling in UE5.0 by setting sound wave generation to use Bink Compression

Full Changelog: Comparing v1.4.4...v1.4.5 · lucoiso/UEAzSpeech · GitHub

IkaranX07 · March 2, 2023, 8:14am

I love this plugin! It’s working well for me. Already doing some simple tests with it. Tech Update: Building a YouTube Cohost - Real-time conversation with AI Assistant in Unreal Engine - YouTube

thank you!

Any chance you’ll be updating it to use the new ChatGPT API connections soon?

lucoiso · March 2, 2023, 3:16pm

I’m already working on it, hehehe

Here: GitHub - lucoiso/UESpeechGPT at development
And here: GitHub - lucoiso/UEHttpGPT at development

Started to change this sample project to use the new Chat API with a chat history, etc. : )

lucoiso · March 2, 2023, 3:36pm

Sample project updated

SpeechGPT v1.1.0

image1920×1031 75.5 KB

Release: SpeechGPT v1.1.0 (github.com)

Pull Request: v1.1.0 by @lucoiso in v1.1.0 by lucoiso · Pull Request #1 · lucoiso/UESpeechGPT · GitHub

What’s Changed

Now using the new Chat API

More settings, delegates & logs

Chat history

New options

Full Changelog: Comparing v1.0.0...v1.1.0 · lucoiso/UESpeechGPT · GitHub