[FREE] AzSpeech plugin: Text-to-Speech, Speech-to-Text and more with Microsoft Azure Cognitive Services

AzSpeech v1.3.10

What’s Changed

Full Changelog: Comparing v1.3.9...v1.3.10 · lucoiso/UEAzSpeech · GitHub


Experimental Multi-Platform Support: experimental/MULTI-PLATFORM (github.com)


AzSpeech v1.3.11

What’s Changed

Full Changelog: Comparing v1.3.10...v1.3.11 · lucoiso/UEAzSpeech · GitHub


Post Commits


Experimental Multi-Platform Support: experimental/MULTI-PLATFORM (github.com)


New example project created using AzSpeech

SpeechGPT

About

I created an example project of a basic “voice assistant” combining two plugins that I’ve created:

  • HttpGPT: A plugin that lets developers send HTTP requests to OpenAI GPT.
  • AzSpeech: A plugin that integrates Azure Speech Cognitive Services into Unreal Engine.

Project Repository: UESpeechGPT (github.com)

Video


AzSpeech v1.3.12

What’s Changed

Full Changelog: Comparing v1.3.11...v1.3.12 · lucoiso/UEAzSpeech · GitHub


Experimental Multi-Platform Support: experimental/MULTI-PLATFORM (github.com)


New Example Project: SpeechGPT

Example Project that uses both OpenAI GPT-3 and Microsoft Azure Speech Services

Links:


Related to the new example project, I’ve created a new post for HttpGPT! : )


Hi @BeehiveBob !

I made some changes to the Sound Wave generation, but I can’t test them on the Steam Deck because I don’t have one, hehehe.

About the changes

Added more information to the generated audio, plus a way to save generated Sound Waves, so users can generate audio that can be packaged into the project as USoundWaves.

This also gave me the opportunity to create an Editor Tool that generates and saves the audio without needing a Blueprint or a running simulation. That will help avoid runtime costs, since the audio can be generated before packaging, hehehehe. I will start working on it soon.

These changes were made in the ‘Convert … to Sound Wave’ functions:

Example:

(screenshot of the modified ‘Convert … to Sound Wave’ function)

Notes

  • These changes are currently only in the development branch: development (github.com)
  • Tasks that use Sound Wave generation already include these changes, but will continue generating transient Sound Waves.
  • To generate Sound Waves and save them in the project’s content, you can use the ‘… to Audio Data’ tasks and then generate a Sound Wave by passing the Audio Data to the modified functions.
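The Audio Data → Sound Wave step is essentially wrapping raw PCM samples in a playable container so they can be saved as an asset. As a rough illustration of the idea (a stdlib-Python sketch with hypothetical names, not the plugin’s actual C++ API, which works with USoundWave assets):

```python
# Illustration only: conceptually what turning raw "Audio Data" into a
# saveable audio file involves. Names here are hypothetical.
import io
import wave

def audio_data_to_wav(pcm_bytes: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
    return buf.getvalue()

# One second of silence standing in for audio data from a synthesis task.
raw = b"\x00\x00" * 16000
wav_bytes = audio_data_to_wav(raw)
print(f"{len(raw)} PCM bytes -> {len(wav_bytes)} WAV bytes")
```

The point of saving the result is that the container can be written once at edit time instead of re-synthesizing the speech on every run.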

Currently only tested on UE5.1, but it’s coming! :eyes:

And I’m including a new function to get the available voices:

Issues

Branch: feature/EDITOR-TOOL-113 (github.com)


AzSpeech v1.4.0

Changes

Notes

Known Issues

Screenshots

New Editor Tool: Audio Generator

New Function: Get Available Voices

New Settings


AzSpeech v1.4.1

What’s Changed

Full Changelog: Comparing v1.4.0...v1.4.1 · lucoiso/UEAzSpeech · GitHub

Post Commits


AzSpeech v1.4.2

What’s Changed

Full Changelog: Comparing v1.4.1...v1.4.2 · lucoiso/UEAzSpeech · GitHub


AzSpeech v1.4.3

What’s Changed

Full Changelog: Comparing v1.4.2...v1.4.3 · lucoiso/UEAzSpeech · GitHub


Hey,
Great plugin, love it.

I have a few problems though:

  1. When I use the text-to-speech function, it only returns the second part of the sentence. E.g. if I say “lorem ipsum one two three four five”, I get the string “three four five”.

  2. The same problem occurs if I use a button to speak (pressing the button calls the text-to-speech function).

  3. I decided to work around the truncated-sentence problem by recording the audio to a WAV file using the audio component and then using the WAV-file-to-text function. Unfortunately, in this case the WAV-file-to-text function does not work at all, i.e. I get no result.

The logic is as follows:
KeyPressed: Audio Capture Enable → Start Recording Output
KeyReleased: Audio Capture Disable → Finish Recording Output (export type wav file) → delay (1sec) → Wav file to text

To make sure the WAV file was saved correctly, I swapped the WAV-file-to-text function for convert-WAV-file-to-USoundWave and plugged its output into Play Sound 2D, and the WAV file played correctly. Unfortunately, the WAV-file-to-text function still does not work for me at all.

Logs:
UE4.27
AzSpeech v1.4.3

PIE View:

Task: WavFileToText (155787)
ActivationTime: 6 milliseconds
ActiveTime: 102 seconds -> never stops
Current recognised string:

OutputLog:

LogAzSpeech: Display: Task: WavFileToText (155787); Function: Activate; Message: Activating task
LogAzSpeech_Internal: Display: Task: WavFileToText (155787); Function: StartAzureTaskWork; Message: Starting Azure SDK task
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Init; Message: Initializing runnable thread
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: CanInitializeTask; Message: Checking if can initialize task in current context
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Run; Message: Running runnable thread work
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: InitializeAzureObject; Message: Initializing Azure Object
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: InitializeAzureObject; Message: Creating recognizer object
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: CreateSpeechConfig; Message: Creating Azure SDK speech config
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: ApplySDKSettings; Message: Applying Azure SDK Settings
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: EnableLogInConfiguration; Message: Enabling Azure SDK log
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: ApplySDKSettings; Message: Using language: en-PL
LogAzSpeech_Internal: Display: Thread: AzSpeech_WavFileToText_155787; Function: Run; Message: Starting recognition

Hi @nietolerancyjny ! : )

I was testing here (UE4.27 in editor) and I couldn’t reproduce the issue using Text to Speech:

But when I try to use .wav to Text or Speech to Text with en-PL as the language or candidate language, the tasks don’t recognize anything.

Searching the supported languages page, I didn’t find en-PL: Language support - Speech service - Azure Cognitive Services | Microsoft Learn

I’m using en-PL because I saw in your log that you are using it, hehehe

One of the problems is possibly due to the unsupported language; could you test with another one, like en-US instead of en-PL?
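As an aside, these locale codes are easy to mix up: they are language-REGION pairs, and the service only accepts the specific combinations on its Language support page. A tiny guard sketched in Python (the allow-list below is a small illustrative subset, not the real list):

```python
import re

# Illustrative subset of Azure Speech speech-to-text locales; the full,
# authoritative list is on the "Language support" docs page.
KNOWN_STT_LOCALES = {"en-US", "en-GB", "pl-PL", "de-DE", "pt-BR"}

def check_stt_locale(code: str) -> str:
    """Classify a locale string before handing it to a recognition task."""
    if not re.fullmatch(r"[a-z]{2,3}-[A-Z]{2}", code):
        return "malformed: expected language-REGION, e.g. pl-PL"
    if code not in KNOWN_STT_LOCALES:
        return "well-formed but not a supported speech locale"
    return "ok"

print(check_stt_locale("en-PL"))  # well-formed but not a supported speech locale
print(check_stt_locale("pl-PL"))  # ok
```

The trap in this thread is exactly the middle case: en-PL looks valid but isn’t a speech locale, so the recognizer just returns nothing.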

I will continue to investigate the reason for the first issue mentioned. But could you send me the SDK logs? : )

You can enable it in Project Settings → Plugins → AzSpeech → Enable SDK Logs
The logs are generated in: PROJECT_DIRECTORY/Saved/Logs/AzSpeech

I couldn’t reproduce the issue using Text to Speech

Maybe it’s not a problem with the plugin but with the internet connection or some other settings? I’ve also noticed that the built-in editor tool “AzSpeech Audio Generator” sometimes (not always) cuts off the first words of my text, both in Polish and English. For example, I fed your text into the built-in AzSpeech Audio Generator:

"One of the problems is possibly because of the unsupported language, could you test using another one like en-US instead of en-PL?
I will continue to investigate the reason for the first issue mentioned. But could you send me the SDK logs? : )"

And in the WAV file I received I can only hear “DK logs” :slight_smile:
If I press the Generate Audio button again, I get the whole speech.

UPDATE:
OK, I think I found the cause; it no longer cuts off the first part of my sentence. I disabled the microphone input device (Live Gamer Portable 2 Plus) in the Windows settings and left only the Vive Pro multimedia audio.


But when I try to use .wav to Text or Speech to Text with en-PL as the language or candidate language, the tasks don’t recognize anything.

I use pl-PL, not en-PL.

Searching the supported languages page, I didn’t find en-PL

As above: I use pl-PL, and that is on the supported list.

AZ settings

logs:
UEAzSpeech 2023.02.24-07.54.53.log (72.1 KB)
UEAzSpeech 2023.02.24-07.54.13.log (123.1 KB)
UEAzSpeech 2023.02.24-07.52.43.log (105.1 KB)
UEAzSpeech 2023.02.24-07.47.39.log (98.1 KB)
UEAzSpeech 2023.02.24-07.47.13.log (97.8 KB)

BTW, how can I make speech-to-text run continuously without having to use any buttons? I’ve done something like this, but when I exit PIE my editor crashes.


I’ll check this soon :hushed:

AzSpeech v1.4.4

Changes

  • Add new settings: Change Recognition Format & Change Synthesis Format
  • Set Compression to use Default Compression Type from Project Settings → Audio
  • Categorize Settings
  • Add Scope Locks in Runnable functions


AzSpeech v1.4.5

What’s Changed

  • Hotfix: Fix compiling in UE5.0 by setting sound wave generation to use Bink Compression

Full Changelog: Comparing v1.4.4...v1.4.5 · lucoiso/UEAzSpeech · GitHub


I love this plugin! It’s working well for me. Already doing some simple tests with it. Tech Update: Building a YouTube Cohost - Real-time conversation with AI Assistant in Unreal Engine - YouTube

thank you!

Any chance you’ll be updating it to use the new ChatGPT API connections soon?

I’m already working on it, hehehe

Here: GitHub - lucoiso/UESpeechGPT at development
And here: GitHub - lucoiso/UEHttpGPT at development

I’ve started changing this sample project to use the new Chat API, with chat history, etc. : )


Sample project updated

SpeechGPT v1.1.0

What’s Changed

  • Now using the new Chat API
  • More settings, delegates & logs
  • Chat history
  • New options

Full Changelog: Comparing v1.0.0...v1.1.0 · lucoiso/UESpeechGPT · GitHub
