HateDread
(HateDread)
March 10, 2015, 4:18am
5
Are these APIs English only? For someone whose native tongue is not English there are two problems with all these voice command games:
1. You probably have some kind of accent that screws up the parser. This leads to you having to put a lot of effort into getting the words right. Most of the time a voice command system is meant to give you an extra mode of input, other than your hands, but the combined result is that you spend so much focus on getting the voice commands right that you forget about your hands and everything they’re doing in the game.
2. It feels unnatural to speak these robotic phrases in a language that isn’t your first one (i.e. to speak English when you otherwise wouldn’t). You can compare it to taking a sip from a glass of water, where speaking the words is the sip. If the language is your own, the glass is next to you, well within reach, and it’s easy to drink. If the language is not your own, the glass is in a different room, and each time you want a sip you have to get up, walk over to the glass, take a sip, leave the glass in the other room and walk back to where you were before. At some point it simply becomes more effort than it’s worth.
I’d suggest you put on a really thick foreign accent and see how hard it is to get it to work. Any sort of language support or calibration will do a lot to make this more accessible to the masses.
Either way this is cool and a good alternative for people who are unable or do not wish to play with their hands (or feet).
While it’s not something I have to worry about (as can be heard from the video), it’s definitely something to think about. I’m at the mercy of Microsoft and the voice packs they’ve released. If I’m not mistaken, these are the available packs:
MSSpeech_SR_en-US_TELE.msi
MSSpeech_SR_ca-ES_TELE.msi
MSSpeech_SR_da-DK_TELE.msi
MSSpeech_SR_de-DE_TELE.msi
MSSpeech_SR_en-AU_TELE.msi
MSSpeech_SR_en-CA_TELE.msi
MSSpeech_SR_en-GB_TELE.msi
MSSpeech_SR_en-IN_TELE.msi
MSSpeech_SR_es-ES_TELE.msi
MSSpeech_SR_es-MX_TELE.msi
MSSpeech_SR_fi-FI_TELE.msi
MSSpeech_SR_fr-CA_TELE.msi
MSSpeech_SR_fr-FR_TELE.msi
MSSpeech_SR_it-IT_TELE.msi
MSSpeech_SR_ja-JP_TELE.msi
MSSpeech_SR_ko-KR_TELE.msi
MSSpeech_SR_nb-NO_TELE.msi
MSSpeech_SR_nl-NL_TELE.msi
MSSpeech_SR_pl-PL_TELE.msi
MSSpeech_SR_pt-BR_TELE.msi
MSSpeech_SR_pt-PT_TELE.msi
MSSpeech_SR_ru-RU_TELE.msi
MSSpeech_SR_sv-SE_TELE.msi
MSSpeech_SR_zh-CN_TELE.msi
MSSpeech_SR_zh-HK_TELE.msi
MSSpeech_SR_zh-TW_TELE.msi
I’m not sure if they determine the words to recognize or the accent (or both), but I’ll be taking a look for sure. Not sure if I can calibrate with this API, either.
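Whatever the packs control internally, the file names do carry a language-region tag (en-GB, pt-BR and so on), so a game could at least pick the closest installed pack for the player's locale. Below is a rough sketch of that idea, assuming the naming pattern in the list above holds. `pack_locale`, `choose_pack` and the shortened `PACKS` list are made up for illustration and are not part of any Microsoft API.

```python
import re

# Assumed naming pattern, taken from the installer list above:
# MSSpeech_SR_<language>-<REGION>_TELE.msi
PACK_NAME = re.compile(r"^MSSpeech_SR_([a-z]{2})-([A-Z]{2})_TELE\.msi$")

# Shortened, illustrative subset of the packs listed above.
PACKS = [
    "MSSpeech_SR_en-US_TELE.msi",
    "MSSpeech_SR_en-GB_TELE.msi",
    "MSSpeech_SR_sv-SE_TELE.msi",
    "MSSpeech_SR_pt-BR_TELE.msi",
]


def pack_locale(filename):
    """Return (language, region) parsed from a pack file name, or None."""
    m = PACK_NAME.match(filename)
    return (m.group(1), m.group(2)) if m else None


def choose_pack(available, wanted):
    """Pick a pack for a tag such as 'sv-SE'.

    Prefers an exact language-region match, then falls back to the first
    pack with the same language, then gives up and returns None.
    """
    lang, _, region = wanted.partition("-")
    lang, region = lang.lower(), region.upper()
    same_language = None
    for name in available:
        loc = pack_locale(name)
        if loc is None:
            continue  # not a recognisable pack name
        if loc == (lang, region):
            return name
        if loc[0] == lang and same_language is None:
            same_language = name
    return same_language
```

With this, `choose_pack(PACKS, "en-GB")` picks the British pack, while a locale with no exact match falls back to another region of the same language if one is available. Whether a same-language pack copes well with a different accent is exactly the open question above, so the fallback is a guess rather than a guarantee.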
My intention for such things in-game is more as a supplementary tool than a replacement for commanding by hand.
How flexible are these systems? Can they ignore filler or unimportant words like Siri and others do, or is it really a fixed comparison against a set of preloaded phrases? I experimented with something like this a few years ago, but it was a bit inconvenient having to remember and use exact wordings. I don’t know how much has improved since then, though.
That’s the thing - if I allow the speech to be set up via some sort of file, it’s easy to include wildcards in your phrases. That makes the phrases rather easy to say, at the cost of more work on the developer’s end (where it should be!).
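To make the wildcard idea concrete, here is a minimal sketch of file-driven phrases matched against text that has already been recognized. It does not touch the speech API at all. The phrase file format (`<phrase> = <command>`, with `*` standing for any run of filler words), the command names, and the helper functions are all assumptions made up for this example.

```python
import re

# Hypothetical phrase file contents. '*' means "any number of filler words".
PHRASE_FILE = """
* alpha * attack * = ALPHA_ATTACK
* alpha * hold * = ALPHA_HOLD
* everyone * retreat * = ALL_RETREAT
"""


def compile_phrase(phrase):
    """Turn a phrase with '*' wildcards into a regex over space-separated words."""
    pieces = []
    for token in phrase.lower().split():
        if token == "*":
            pieces.append(r"(?:\S+\s+)*")  # zero or more whole filler words
        else:
            pieces.append(re.escape(token) + r"\s+")  # one required word
    return re.compile("".join(pieces))


def load_phrases(text):
    """Parse '<phrase> = <command>' lines into (regex, command) pairs."""
    table = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        phrase, command = line.rsplit("=", 1)
        table.append((compile_phrase(phrase), command.strip()))
    return table


def match_command(table, spoken):
    """Return the first command whose phrase matches the spoken text, else None."""
    # Normalise case and spacing; the trailing space lets every word end in \s+.
    normalized = " ".join(spoken.lower().split()) + " "
    for pattern, command in table:
        if pattern.fullmatch(normalized):
            return command
    return None


TABLE = load_phrases(PHRASE_FILE)
```

For example, `match_command(TABLE, "Alpha squad please attack now")` returns `"ALPHA_ATTACK"`, while `match_command(TABLE, "alpha dance")` returns `None`. The player only has to hit the key words in order, and the effort of deciding which words matter moves into the phrase file.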