Hey guys,
I spent the weekend researching voice recognition APIs and how to integrate them with UE4. I’ve worked out most of the technical stuff (in the C# SAPI library, for example), but how the plugin will actually be used kinda dictates how I expose it to you, the user.
In the following video (early testing, still only a C# console application), I defined two phrase sets:
“Find”
“restaurants”, “hotels”, “gas stations”
“near”
“Seattle”, “Boston”, “Dallas”
and
“Team”
“alpha”, “bravo”, “charlie”, “delta”, “echo”, “foxtrot”
“attack”, “defend”, “retreat from”
“Sydney”, “Brisbane”, “Melbourne”
Because it’s phrase-directed rather than going for pure dictation, it’s pretty fast. I did screw it up a bit by being hasty and sitting too far back from the microphone:
https://www.youtube.com/watch?v=7jBSRmDL_s0
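For anyone curious, this is roughly how that first phrase set is wired up in the C# SAPI wrapper (System.Speech) used for the console test. It's a minimal sketch of the demo setup, not the plugin code itself:

using System;
using System.Globalization;
using System.Speech.Recognition;

class PhraseDemo
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // "Find <place> near <city>" - phrase-directed, so the engine only has to
            // pick between a handful of alternatives instead of doing open dictation.
            var places = new Choices("restaurants", "hotels", "gas stations");
            var cities = new Choices("Seattle", "Boston", "Dallas");

            var phrase = new GrammarBuilder("Find");
            phrase.Append(places);
            phrase.Append("near");
            phrase.Append(cities);

            recognizer.LoadGrammar(new Grammar(phrase));
            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.ReadLine(); // keep the console app alive while it listens
        }
    }
}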
I’m intending to allow you to bind Blueprint and C++ functions to the ‘speech recognized’ callbacks, so that whenever something is recognized you get notified (along with the phrase itself). Possibly also a wrapper on top of that, so you can bind entire functions to phrases and let the plugin invoke them under the hood (i.e. you bind your ‘Select All’ function to a phrase, and the plugin only calls ‘Select All’ when that phrase is recognized… which would probably be the phrase “Select All”, but that would be up to you).
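To make that wrapper idea a bit more concrete, here's a rough sketch in the same C# test-harness context. All the names here are made up for illustration; the real plugin would route this through Blueprint/C++ delegates rather than plain C# Actions:

using System;
using System.Collections.Generic;
using System.Speech.Recognition;

// Hypothetical "bind a function to a phrase" wrapper, just to show the shape of it.
class PhraseBindings
{
    private readonly Dictionary<string, Action> _bindings =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    public void Bind(string phrase, Action callback)
    {
        _bindings[phrase] = callback;
    }

    // Hook this up to SpeechRecognitionEngine.SpeechRecognized.
    public void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Action callback;
        if (_bindings.TryGetValue(e.Result.Text, out callback))
            callback();
    }
}

// Usage (illustrative only):
// var bindings = new PhraseBindings();
// bindings.Bind("Select All", SelectAll);
// recognizer.SpeechRecognized += bindings.OnSpeechRecognized;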
The problem with any of these approaches is that it can be hard to add new phrases on the fly. It’s possible, but I need to be sure that that’s actually desired.
Would you prefer to set up the speech in a document and have that loaded in, to specify it via Blueprint nodes/C++ functions, or via properties/member variables? The former is easier for large amounts of text, wildcards, and tricky combinations, but it makes binding callbacks more complicated (and makes the more advanced callback options less likely to make it in as a feature).
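For reference, the document route wouldn't be exotic on the SAPI side: it already supports loading SRGS XML grammar files, so in the C# test harness it boils down to something like this (the file name is just a placeholder):

using System.Speech.Recognition;

class GrammarFromFile
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            // Load the phrases from an SRGS XML grammar file instead of building
            // them in code. "Commands.grxml" is a placeholder name.
            recognizer.LoadGrammar(new Grammar("Commands.grxml"));
            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            System.Console.ReadLine();
        }
    }
}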
Throw your feedback at me!