Speech Recognition Plugin - Sphinx-UE4

@n00854180t
I knew that the audio stream pocketsphinx expects is little-endian 16 kHz mono.
I wasn’t sure how to get the microphone audio from UE at the time, nor what format it would be in.
I just whipped it together, and it seemed easiest to reuse the existing method from the pocketsphinx examples to obtain the audio stream.
I’d say it’s worth finding out what sampling rate/format the OVR lip sync plugin requires, to ensure compatibility with the stream from UE.
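For anyone who wants to try feeding UE’s own mic stream into pocketsphinx instead of the pocketsphinx example capture code, here is a minimal sketch using the engine’s voice capture interface (the same VoiceCapture mentioned further down for the OVR port). The engine’s VoIP path captures 16-bit little-endian 16 kHz mono by default, which conveniently matches pocketsphinx; error handling omitted:

    #include "Voice.h"  // FVoiceModule / IVoiceCapture; add "Voice" to your Build.cs module dependencies

    TSharedPtr<IVoiceCapture> VoiceCapture = FVoiceModule::Get().CreateVoiceCapture();
    if (VoiceCapture.IsValid())
    {
        VoiceCapture->Start();
    }

    // Later (e.g., each Tick): drain whatever PCM is available and hand it to the recognizer.
    uint32 AvailableBytes = 0;
    if (VoiceCapture->GetCaptureState(AvailableBytes) == EVoiceCaptureState::Ok && AvailableBytes > 0)
    {
        TArray<uint8> PCM;
        PCM.SetNumUninitialized(AvailableBytes);
        uint32 BytesRead = 0;
        VoiceCapture->GetVoiceData(PCM.GetData(), AvailableBytes, BytesRead);
        // PCM now holds little-endian 16-bit mono samples, suitable for ps_process_raw().
    }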

@BigVulpes
Funny you should mention this: JTensai committed a change to add events that trigger when an utterance has started, and when it has stopped.
I have merged this into the experimental branch, and also updated the binary DLLs.
Let me know of any problems.
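For anyone wiring these up from C++ rather than Blueprint, it would be along these lines; note the delegate names here are my assumption, so check JTensai’s commit for the exact ones (the handlers must be UFUNCTIONs):

    // Assuming the plugin exposes the new events as dynamic multicast delegates
    // on the listener actor (names are a guess - see the experimental branch).
    SpeechActor->OnUtteranceStarted.AddDynamic(this, &AMyActor::HandleUtteranceStarted);
    SpeechActor->OnUtteranceStopped.AddDynamic(this, &AMyActor::HandleUtteranceStopped);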

Regarding general changes, I have a number of big changes in the works.
I should have these complete and tested in the next few days.

This includes:

  • Adding grammar support.
  • Setting additional pocketsphinx parameters at run-time.
  • Allowing keyword sets to be switched at run-time.
    This will let you detect only the phrases that are context-specific; see the sketch below.
    E.g., one set of phrases will be detectable in front of a door, another set at a locker.
    This improves accuracy, as you are matching against a smaller set of phrases.
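To make the keyword-set idea concrete: raw pocketsphinx keyword lists are just phrase/threshold pairs, so a context switch amounts to swapping which set is active. The file names and threshold values below are purely illustrative:

    door.kws:
        open the door /1e-20/
        lock the door /1e-25/

    locker.kws:
        open locker /1e-20/
        store weapon /1e-30/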

In the meantime, if people want to test grammar mode detection, download the following zip, extract it, and run listen.bat to see an example of grammar detection.

https://drive.google.com/open?id=0BxR5qe2wdwSLSnMtMmNhOHpvZlE

You can use this to get started writing and testing your own grammar files.
I still feel that for most cases, keyword mode detection will be preferable, due to the custom tolerance settings for phrases.
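If you haven’t written a grammar file before, pocketsphinx grammars use the JSGF format. A tiny hand-rolled example to start from (rule names and phrases are arbitrary):

    #JSGF V1.0;
    grammar commands;

    public <command> = <action> the <object>;
    <action> = open | close | lock;
    <object> = door | window | locker;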

@ What do you mean by “experimental branch”? Does that mean you have already implemented it in the most recent version?

@n00854180t I copied the stuff and renamed it, as you said, but the compile still fails. It tells me that FWordSpokenSignature is an unknown override specifier. I even tried commenting out the WordSpoken stuff, thinking that it might cause a problem if there are two dynamic multicast delegates, but that won’t help either. I attach pictures of the files as they are now.
First the .cpp file:
[screenshot: the .cpp file]
and the .h file:
[screenshot: the .h file]

@BigVulpes - I suggest just using the branch that was mentioned.

That said, you’re almost there!

The only thing I see is that you should uncomment the part in the header that declares the WordSpoken method.
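For reference, the header needs the delegate type declared before the UCLASS that uses it; otherwise the compiler hits FWordSpokenSignature before it knows what it is, which is exactly the “unknown override specifier” error. Roughly like this (the parameter list is an assumption; match whatever the Broadcast call in the .cpp passes):

    // Above the UCLASS: declare the delegate type first.
    DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FWordSpokenSignature, FString, Text);

    // Inside the UCLASS body: the assignable event itself.
    UPROPERTY(BlueprintAssignable, Category = "Audio")
    FWordSpokenSignature WordSpoken;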

@BigVulpes
Yes, that’s what I meant.
https://github.com/shanecolb/sphinx-ue4/tree/experimental

Take a look at the code from the following commit.

Cheers,
Shane

Hey guys, so I finally got the OVR Lip Sync library working in UE4.

To use it, create a blueprint of class VisemeGenerationActor. In your BP, call “Init” on Begin Play and “Shutdown” on End Play. Then, also on Begin Play, assign an event to VisemeGenerated, which will give you an FOVRLipSyncFrame containing an array of viseme values that can be fed into the appropriate morph targets.
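A rough C++ equivalent of that Blueprint setup, assuming VisemeActor is a pointer to the placed VisemeGenerationActor (the handler must be a UFUNCTION for AddDynamic to bind):

    void AMyCharacter::BeginPlay()
    {
        Super::BeginPlay();
        VisemeActor->Init();
        VisemeActor->VisemeGenerated.AddDynamic(this, &AMyCharacter::OnVisemeFrame);
    }

    void AMyCharacter::EndPlay(const EEndPlayReason::Type EndPlayReason)
    {
        VisemeActor->Shutdown();
        Super::EndPlay(EndPlayReason);
    }

    void AMyCharacter::OnVisemeFrame(FOVRLipSyncFrame Frame)
    {
        // Frame carries the per-viseme weights; see the morph target loop further down.
    }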

I’m eventually going to change this to be a component, as well as allow passing in arbitrary data, but for now it works well enough for people to check it out.

w00t!! Good job!

Does it work on Gear VR (Android)?

Are there any tutorials about using it with characters in UE4?

I see Oculus got a hold of you over at GitHub. I posted about your plugin in the UE4 section of the Oculus forums (apparently not a whole lot of people use UE4 with Oculus VR, so I post about anything new that comes out). Sorry if I put you on the spot :o

It’s not currently set up to work on Android (there are some defines/checks in the code that look for WIN32/64), but there is a libOVRLipSync.so file available in the Unity package, so it would mainly be an issue of fixing the defines and setting up the Build.cs to find the correct file (the .so) when on Android.
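For the defines part, it would mostly be a matter of widening the platform guards; PLATFORM_ANDROID is the engine’s standard macro. Something like the sketch below (the Build.cs would also need a branch that stages libOVRLipSync.so for the Android target):

    // Existing guards look roughly like:
    #if defined(WIN32) || defined(WIN64)
        // ... OVRLipSync calls, backed by the Windows DLL ...
    #endif

    // They would need an Android branch as well:
    #if defined(WIN32) || defined(WIN64) || PLATFORM_ANDROID
        // ... the same calls, backed by libOVRLipSync.so on Android ...
    #endif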

The other thing is that I’m not sure if the VoiceCapture interface operates the same way on Android, so that might be a sticking point.

As far as tutorials for setting it up go, there aren’t any currently - I’m going to try and upload an example project here soon that uses the Unity example head mesh.

That said, it’s quite simple really - the event returns an FOVRLipSyncFrame, which holds an array of 0-1 float values corresponding to viseme morphs (mouth shapes) - e.g., sil, FF, PP, oh, etc. You then just feed these values into the appropriate morphs in your custom event (e.g., a foreach loop that sets each morph value on your mesh).
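In code, that custom event body would be something along these lines. I’m assuming here that the frame’s array is called Visemes and that your mesh’s morph targets are named after the OVR viseme names:

    // Push each viseme weight onto a same-named morph target (names/ordering assumed).
    static const TCHAR* VisemeNames[] = { TEXT("sil"), TEXT("PP"), TEXT("FF"), TEXT("TH"), TEXT("oh") /* ... */ };

    const int32 Count = FMath::Min(Frame.Visemes.Num(), (int32)ARRAY_COUNT(VisemeNames));
    for (int32 i = 0; i < Count; ++i)
    {
        Mesh->SetMorphTarget(FName(VisemeNames[i]), Frame.Visemes[i]);
    }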

I will try and get an example project up shortly.

No worries! I was planning to contact them to try and get this into an official release at some point anyhow. :slight_smile:

I was actually thinking of using OVRLipSync with NPCs. I know it was made to be used on player avatars, but since I am not really planning on getting into the multiplayer side of things just yet, I figured it would be cool to have NPCs talking to the player (or among each other) with their lips moving more or less properly.

That should work, though it’s not set up for that right now (it always just gets the data from the local mic). There’s no reason the functions can’t be exposed to BP or whatever and used on an arbitrary audio buffer, though.
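The eventual hook might look like a BlueprintCallable entry point that takes a buffer instead of reading the mic - entirely hypothetical, nothing like this exists in the port yet:

    // Hypothetical future API: feed decoded 16-bit PCM (e.g., from a sound file)
    // into the lip sync context instead of the live mic stream.
    UFUNCTION(BlueprintCallable, Category = "LipSync")
    void ProcessAudioBuffer(const TArray<uint8>& PCMData, int32 SampleRate);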

BTW, I will be putting up a thread specifically for the port later today, along with the example project, so we don’t keep derailing this thread :stuck_out_tongue:

Sounds like a plan! :smiley:

Any luck with the Android version? :o

Hey bud, I put up a separate thread for the OVRLipSync plugin; let’s move discussion there so as not to clutter this thread up XD OVRLipSync Plugin for UE4 - Plugins - Epic Developer Community Forums

As for Android, I haven’t had time to spend messing with it, but it’s honestly just a matter of changing some defines to account for Android and putting the .so from the Unity package in the right folder. If you want to try your hand at it, add me on Steam (same nick as here) and I’ll walk you through it.

I know, I posted there before. I was actually asking @ :wink:

Added you, thanks, but I’m going to have to wait until you add sound input (from a sound file, not the mic), as I am interested in talking NPCs (single player) :o

motorsep, yes, it’s in the works. Sorry, I just haven’t had much time, and the work has been very annoying, since each testing cycle requires build/deploy/test/debug on the device…much longer than Windows testing.
I have speech recognition working on Android, within an Unreal Engine environment.
It’s just the communication between the two that I am currently having some issues with.
I still think any usage will require some manual steps and a development background. It won’t be as simple as setting it up for Windows.
I’ll have some free time over the weekend to work on it further.

yay! Sounds like you are almost there!

I am pretty sure it’s possible to build it as a .so lib, something like OVRLipSync, but either way, if you can make a short step-by-step write-up on how to set it up, and it doesn’t require an engine rebuild, that should be fine.

Sphinx-UE4 (Version 1.0):
I have made a number of changes to the plugin. Please check out the following link to download the example project and test away.
Feel free to provide feedback and suggestions within the thread.

Updated the speech recognition plugin I wrote, and improved the demo project.

  • Fixed a bug preventing Chinese from working.
  • Compiled for UE4.13 and UE4.14.
  • Added a few more languages.
  • Fixed a bug where a minor stutter would occur when initially loading the plugin.
  • Added an experimental method to obtain an approximation of the volume.

The demo project has been expanded; I personally like how the grammar mode allows a super crude calculator to exist.
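For anyone curious what a grammar along the lines of that calculator might look like in JSGF (my own guess, not the file shipped with the demo):

    #JSGF V1.0;
    grammar calculator;

    public <calc> = <number> <operator> <number>;
    <operator> = plus | minus | times | divided by;
    <number> = zero | one | two | three | four | five | six | seven | eight | nine;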

Check it out :slight_smile:

Nice update!

Any news about Android support, by chance? :o