[FREE] AzSpeech plugin: Text-to-Speech, Speech-to-Text and more with Microsoft Azure Cognitive Services

lucoiso · March 20, 2023, 6:24pm

Sample Project Updated:

SpeechGPT v1.2.1

Release: SpeechGPT v1.2.1 (github.com)

Pull Request: v1.2.1 by lucoiso · Pull Request #4 · lucoiso/UESpeechGPT (github.com)

Changes

HttpGPT Update: HttpGPT v1.4.2 Latest (github.com)

Now including the new Editor Tool: HttpGPT Chat

lucoiso · March 22, 2023, 3:00pm

AzSpeech v1.6.1

Release: AzSpeech v1.6.1 (github.com)
Marketplace: Waiting for approval
Pull Request: v1.6.1 by lucoiso · Pull Request #162 · lucoiso/UEAzSpeech · GitHub

Changes

Move tasks update back to Game Thread to avoid delays in getters after a broadcast
Reorder settings initialization
Implement a template function (protected - not exposed) to get speech result properties

lucoiso · March 23, 2023, 3:51pm

AzSpeech v1.6.2

Release: AzSpeech v1.6.2 (github.com)
Marketplace: Unreal Engine Marketplace
Pull Request: v1.6.2 by lucoiso · Pull Request #165 · lucoiso/UEAzSpeech · GitHub

Changes

Fix module availability checking on Sound Wave generation
Adjust categories of helper functions (adding sub-categories to them)
Rename OutputModulePath param to OutputModule
Add new functions: Get Plugin Name, Get Plugin Version + Is Content Module Available + Cast to AzSpeech Task Base

New Sample Project

Link: GitHub - lucoiso/UEAzSpeechSampleProject: Unreal Engine 5.1 sample project for the AzSpeech plugin.

Rising2014 · March 24, 2023, 4:51pm

Hello, when I was using AzSpeech 1.6.2, I found that running the blueprint node SSML To Sound Wave with Custom Options caused an Unhandled Exception and crashed the engine, as shown in the figure below. I have emailed you the crash log.

In addition, when I was using the blueprint node Speech to Text with Default Options, I found that no matter what value I set for Attempt Timeout in Seconds, the task was always forcibly interrupted after running for about 35 seconds. I don’t know the reason. As shown below.

lucoiso · March 24, 2023, 8:12pm

Thanks for the info. Checking right now and saw the logs that you sent via email! : )

lucoiso · March 24, 2023, 9:59pm

Hello again @Rising2014 ! : )

Found the cause of the bug:

In UE5.0, for some weird reason, bUsePrivateEndpoint is always true. I’m still trying to identify the cause of this behavior.

Due to this behavior, the tasks were trying to use the empty endpoint URL and throwing an exception with the message: Invalid URL Scheme.

I’m adding an endpoint validator to the settings to change the default value (when bUsePrivateEndpoint is false in Default Options) to https://REGION_ID.api.cognitive.microsoft.com/sts/v1.0/issuetoken to avoid the crashes while this issue is still occurring.

Until the update is on the marketplace, you can set the Private Endpoint value in Project Settings → Plugins → AzSpeech to: https://southeastasia.api.cognitive.microsoft.com/sts/v1.0/issuetoken

To edit the private endpoint, you’ll need to enable Use Private Endpoint first. You can disable it again after setting the endpoint URL.
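
The fallback described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the plugin’s actual code: ResolveEndpoint is a hypothetical helper name, and the only thing taken from this post is the URL pattern https://REGION_ID.api.cognitive.microsoft.com/sts/v1.0/issuetoken.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of AzSpeech): picks the endpoint a task should use.
// A non-empty private endpoint is returned unchanged; an empty one falls back to
// the regional token URL, so a task never receives an empty URL.
std::string ResolveEndpoint(const std::string& PrivateEndpoint, const std::string& RegionID)
{
    if (!PrivateEndpoint.empty())
    {
        return PrivateEndpoint;
    }

    // The fallback URL cannot be built without a region.
    assert(!RegionID.empty());

    return "https://" + RegionID + ".api.cognitive.microsoft.com/sts/v1.0/issuetoken";
}
```

With an empty private endpoint and the region southeastasia, this returns the same URL as the manual workaround above.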

lucoiso · March 24, 2023, 11:15pm

AzSpeech v1.6.3

Release: AzSpeech v1.6.3 (github.com)
Marketplace: Unreal Engine Marketplace
Pull Request: v1.6.3 by lucoiso · Pull Request #168 · lucoiso/UEAzSpeech · GitHub

Changes

Add endpoint validation (Fix: UE5.0 always setting bUsePrivateEndpoint to true and crashing if the endpoint is empty)
Remove some unnecessary properties & functions
Adjust includes & remove unnecessary ones
Remove unnecessary async calls in broadcast of some tasks (they’re already being called in the game thread)

lucoiso · March 24, 2023, 11:33pm

Sent a fix for the crash to the marketplace, and it’s already merged in the main branch.

About the Speech to Text cancellation, I’ll continue investigating this issue.
Tested here and noticed that Azure is sending the reason 3 when the speech reaches 30s in duration:

    /// <summary>
    /// Indicates the speech result contains final text that has been recognized.
    /// Speech Recognition is now complete for this phrase.
    /// </summary>
    RecognizedSpeech = 3,

This means that the task is completed. But it’s being received even if I’m still speaking.

In this part of the documentation they say that single-shot recognition has a 15s duration limit, but I haven’t found anything related to a maximum duration for Continuous Recognition yet.

The TimeOutInSeconds property only applies to the attempt to initialize the task, and it’s being used in C++ std::future::wait_for(…). I need to adjust the comment.

Ref:

    /* Time limit in seconds to wait for related asynchronous tasks to complete */
    UPROPERTY(GlobalConfig, EditAnywhere, Category = "Tasks", Meta = (DisplayName = "Attempt Timeout in Seconds", ClampMin = "1", UIMin = "1", ClampMax = "600", UIMax = "600"))
    int32 TimeOutInSeconds;
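
To illustrate why this setting does not limit how long a recognition can run, here is a minimal standard C++ sketch, assuming the timeout is simply forwarded to std::future::wait_for as described above. WaitForInitialization is a hypothetical name, not a function from the plugin.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper (not part of AzSpeech): waits up to TimeOutInSeconds for an
// asynchronous initialization step. Returns true only if the future became ready
// within the limit. It says nothing about how long the task runs afterwards.
template <typename T>
bool WaitForInitialization(const std::future<T>& Future, int TimeOutInSeconds)
{
    // Calling wait_for on an invalid future is undefined behavior.
    assert(Future.valid());

    const std::future_status Status = Future.wait_for(std::chrono::seconds(TimeOutInSeconds));
    return Status == std::future_status::ready;
}
```

In this sketch the limit only bounds the wait for the future to become ready, so a recognition that stops at around 30s would have to be ended by something else, such as the result reason sent by the service.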

Rising2014 · March 25, 2023, 3:31am

Hi, Lucoiso. Thank you very much, looking forward to your new version.

lucoiso · April 8, 2023, 4:20pm

Sorry for the lack of updates, I was unable to access this thread. The UE forum was freezing every time I tried to open this page .-.

Made some updates that you can check here: Releases 34 (github.com)

I’m adding support for UE5.2 and it’s working fine in the new version after a small change in a .Build.cs file!

Issue: Add support for UE5.2 (github.com)
Branch: feature/UE52-SUPPORT-185 (github.com)

Rising2014 · April 12, 2023, 4:27am

Hi, lucoiso. I emailed you before about an on/off control for the mic, and I want to explain again why I’m making this suggestion. When using the two blueprint nodes Speech to Text with Custom Options or Speech to Text with Default Options, the only way to stop them is the Stop AzSpeech Task node. But because Speech to Text processing has a delay, I can’t use the Stop AzSpeech Task node immediately; I can only wait for the Speech to Text processing to end. During that time the microphone is very likely still receiving voice, which may not be what the user wants. If you could add a microphone switch or volume control, it would solve this problem very well. Have a nice day.

lucoiso · April 13, 2023, 8:38pm

Hi! Thanks for your suggestion! : )

Replied to the email! hehe

lucoiso · April 13, 2023, 9:20pm

AzSpeech v1.6.6

Release: AzSpeech v1.6.6 (github.com)
Marketplace: Waiting for approval

Changes

Update Azure SDK to v1.27.0
Adjust MacOS libraries [Experimental]

Rising2014 · April 14, 2023, 1:26pm

Hi, does AzSpeech 1.6.6 add the switch control for the microphone?

lucoiso · April 14, 2023, 2:35pm

Hi! : )

Nope, but the next version will contain something that could help a little bit: Before finishing the tasks, the signals will be disconnected to avoid undesired updates while closing the connection.
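
The ordering described here (disconnect first, then close) can be sketched with a tiny hand-written signal type. Everything in this block is hypothetical and for illustration only: TSignal, FSketchTask and their members are not the plugin’s or the Azure SDK’s actual types.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal signal: stores callbacks and calls them on Broadcast.
template <typename... Args>
class TSignal
{
public:
    void Connect(std::function<void(Args...)> Callback)
    {
        assert(Callback); // empty callbacks would throw when invoked
        Callbacks.push_back(std::move(Callback));
    }

    void DisconnectAll() { Callbacks.clear(); }

    void Broadcast(Args... Values) const
    {
        for (const auto& Callback : Callbacks)
        {
            Callback(Values...);
        }
    }

private:
    std::vector<std::function<void(Args...)>> Callbacks;
};

// Hypothetical task: stops listening before it starts shutting down.
struct FSketchTask
{
    TSignal<int> Recognizing;
    bool bIsActive = true;

    void Stop()
    {
        Recognizing.DisconnectAll(); // 1. drop listeners so late events are ignored
        bIsActive = false;           // 2. placeholder for closing the connection
    }
};
```

With this ordering, an event that arrives while the connection is still closing no longer reaches any listener, so no update is delivered after Stop is called.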

lucoiso · April 14, 2023, 2:36pm

AzSpeech v1.6.7

Release: AzSpeech v1.6.7 (github.com)
Marketplace: AzSpeech - Voice and Text in Code Plugins - UE Marketplace (unrealengine.com)
Pull Request: v1.6.7 by lucoiso · Pull Request #193 · lucoiso/UEAzSpeech · GitHub

Changes

Disconnect signals before trying to end the task to avoid undesired updates while closing the connection
Move Recognition Started delegate broadcast to a new signal callback: Session Started

Rising2014 · April 18, 2023, 8:55am

Hi, has AzSpeech v1.6.7 not been approved on the Unreal Marketplace yet? The current version on the Marketplace is still 1.6.6.

Wachaoo · April 18, 2023, 9:32am

Hello Lucoiso!
First of all, thank you for the great plugin. It helps me a lot with my project, and I can’t wait to see your next version.
Besides that, I want to ask you about the Text To Speech function. I want to detect when the task is done, but the Is Task Still Valid function always returns True and the Is Task Ready To Destroy function always returns False, even though the speech has stopped.
Can we get the duration of the speech or something?

lucoiso · April 18, 2023, 2:01pm

That’s strange. They already approved it. :0

I’ll send a new version to update the current one

lucoiso · April 18, 2023, 2:17pm

The next version will contain some functions to get the duration of synthesis and recognitions in milliseconds, hehe