Hi. May I know how you did this? Do you provide this as a service?
What exactly are you interested in? Speech, animation, tracking or dialogue?
Speech recognition uses Vosk, an open-source toolkit that works locally; speech synthesis is a cloud service (I used Yandex SpeechKit, but Google should work better for English).
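In case it helps, basic Vosk usage in Python looks like this (a minimal sketch of the standard API, not my exact code; the model path and wave file are placeholders):

```python
import json
import wave

from vosk import Model, KaldiRecognizer

# Placeholder path to a downloaded Vosk model directory
model = Model("model")

wf = wave.open("speech.wav", "rb")  # 16 kHz mono PCM works best
rec = KaldiRecognizer(model, wf.getframerate())

# Feed the audio in chunks; each completed utterance yields a result
while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        print(json.loads(rec.Result())["text"])

print(json.loads(rec.FinalResult())["text"])
```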
Facial animation has three layers.
- a few pre-defined states: friendly, annoyed, thinking (while waiting for speech synthesis)
- lip-sync (I use my own plugin; it's on the marketplace now)
- facial animation while speaking. This is a (poorly trained) neural net. I used iPhone Live Link to capture facial animation and PyTorch to train it (a rough sketch of the idea follows below).
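Very roughly, the idea of that third layer looks like this (a simplified sketch, not my actual network; the feature sizes and training data here are placeholders):

```python
import torch
import torch.nn as nn

# Simplified stand-in for the real model: map a window of audio
# features to ARKit-style blendshape weights (52 curves, the kind
# of data you get from iPhone Live Link recordings).
class FaceNet(nn.Module):
    def __init__(self, n_audio_features=80, n_blendshapes=52):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_audio_features, 256),
            nn.ReLU(),
            nn.Linear(256, n_blendshapes),
            nn.Sigmoid(),  # blendshape weights live in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

model = FaceNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# In reality these come from aligned audio analysis and Live Link
# captures; random tensors are used here just to show the shapes.
audio_features = torch.randn(32, 80)
target_curves = torch.rand(32, 52)

pred = model(audio_features)
loss = loss_fn(pred, target_curves)
loss.backward()
optimizer.step()
```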
For dialogue I used another old plugin of mine, but it's possible to do this in UE4 without third-party solutions. It also makes sense to connect Dialogflow instead.
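If you go the Dialogflow route, a request from a backend looks roughly like this (Google's official Python client; the project and session IDs are placeholders):

```python
from google.cloud import dialogflow

def detect_intent(project_id: str, session_id: str, text: str) -> str:
    """Send user text to Dialogflow and return the agent's reply."""
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)

    text_input = dialogflow.TextInput(text=text, language_code="en-US")
    query_input = dialogflow.QueryInput(text=text_input)

    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result.fulfillment_text

print(detect_intent("my-gcp-project", "user-123", "Hello!"))
```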
Tracking & AR: an iPhone running the NDI HX app serves as the camera. It isn't a good solution; you can see a mistiming between video and tracking. For tracking I use SteamVR (Vive Trackers) attached to both the chair and the iPhone.
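Just to illustrate what SteamVR exposes: reading tracker poses with the pyopenvr bindings looks roughly like this (a sketch only; I consume the poses inside UE4, and exact call signatures may differ between pyopenvr versions):

```python
import openvr

openvr.init(openvr.VRApplication_Other)
system = openvr.VRSystem()

# Poses for all tracked devices, indexed by device id
poses = system.getDeviceToAbsoluteTrackingPose(
    openvr.TrackingUniverseStanding, 0, openvr.k_unMaxTrackedDeviceCount
)

for i, pose in enumerate(poses):
    if not pose.bPoseIsValid:
        continue
    if system.getTrackedDeviceClass(i) != openvr.TrackedDeviceClass_GenericTracker:
        continue
    m = pose.mDeviceToAbsoluteTracking  # 3x4 row-major transform
    print(f"tracker {i}: position = ({m[0][3]:.3f}, {m[1][3]:.3f}, {m[2][3]:.3f})")

openvr.shutdown()
```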
No, I don't provide any service. I just had an interesting idea and I did it.
Good luck!
I would like to integrate Dialogflow and lip-sync. Your lip-sync plugin cannot run at runtime. Any suggestions or solutions for this?
My lip-sync works at runtime (and you can see it in the video), but it doesn't run in real time, i.e. it can't animate lips from live microphone input: it takes some time to recognize a word.
Hi, thanks for your reply. Can we keep feeding it new wave files? Will the MetaHuman keep speaking the new wave files when we replace them at runtime (in a published game)?
Yes, and you can actually download the executable demo from the marketplace page and test it. It can play wave files from your PC.
Just a small update on my efforts: UE4 MetaHuman: Automatic Lip-sync + Facial Animation - YouTube
Unfortunately, it's not in a state I can share. But I'm thinking about sharing the tools I developed for working with MetaHuman facial animation.
Hi! Does it work with Spanish from Mexico?
No, unfortunately.
And just in case: the last video was captured in my personal project. Lip-sync and facial animation like this aren't part of my plugin on the marketplace.
I've used AWS Polly for a simple text-to-speech iteration, but it has some very advanced and useful features. I hope this helps. FYI, some of you folks, with your knowledge and projects, are fantastic.
Do you have a tutorial or something that could help me? Thanks in advance.
Well, thank you.
I used the AWSCore-Polly plugin from the marketplace.
Polly supports three Spanish voices, and one Mexican female voice:
https://docs.aws.amazon.com/polly/latest/dg/voicelist.html
The speech marks allow you to activate the required viseme pose at the right time (see my image above). If you need more details, tell me which, and I can post more screenshots of the solution here.
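For example, requesting viseme speech marks with boto3 looks like this (standard Polly API; Mia is the es-MX voice, and the sample text is just an illustration):

```python
import json

import boto3

polly = boto3.client("polly")

text = "Hola, ¿cómo estás?"

# Viseme speech marks: one JSON object per line, each with a time
# offset in milliseconds and a viseme name to drive the face pose.
marks = polly.synthesize_speech(
    Text=text,
    VoiceId="Mia",            # es-MX female voice
    OutputFormat="json",
    SpeechMarkTypes=["viseme"],
)
for line in marks["AudioStream"].read().decode("utf-8").splitlines():
    mark = json.loads(line)
    print(mark["time"], mark["value"])

# The audio itself is a separate request.
audio = polly.synthesize_speech(Text=text, VoiceId="Mia", OutputFormat="mp3")
with open("speech.mp3", "wb") as f:
    f.write(audio["AudioStream"].read())
```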
If you have anything working to connect MetaHumans with Dialogflow, I would like to connect. We would like to use it to work with/speak with patients. Contact me at csilva@sphinxmedtech.com
Hi everyone. First post, so the "noob" can stay implied. I have a quick question: while it may not supply a complete solution, and certainly doesn't provide an end-to-end process for speech input → reaction by agent → speech output (with facial animation and lip-sync), why hasn't anybody here mentioned the MetaHuman SDK from the Unreal Marketplace?
It's free, and certainly seems to provide some of the same services as Nvidia Omniverse Audio2Face… unless I'm missing something? Anyway, downloading it now. Happy for any correction if I'm talking BS.
re: noob: thx.
re: end to end: we are also willing to pay for assistance.
re: Nvidia Omniverse Audio2Face: checking it out.
re: MetaHuman SDK from the Unreal Marketplace: having a look.
the short answer is that the MetaHuman concept kicks a** so well, we really want to try deploying it with test patients.
Hi! Yes, actually I'm working with the AWSCore-Polly plugin too, but I've found it hard to do the lip-sync.
Since I'm fairly new to UE, I was wondering if you'd mind helping me. Do I have to define the visemes in a separate Blueprint and call it from a Level Sequence to animate, or can the animation be achieved in real time using the audio and visemes?
Please let me know if I can write you via e-mail.
Thanks in advance.
Regards.
Hi! You need a webhook to achieve communication between Dialogflow and UE/MetaHumans.
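A minimal sketch of such a webhook in Python/Flask, assuming the standard Dialogflow V2 fulfillment format (the endpoint path and the reply text are placeholders):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    req = request.get_json(force=True)
    # Dialogflow V2 sends the user text and matched intent in queryResult.
    user_text = req["queryResult"]["queryText"]
    intent = req["queryResult"]["intent"]["displayName"]

    # Build whatever reply your MetaHuman should speak.
    reply = f"You said: {user_text} (intent: {intent})"
    return jsonify({"fulfillmentText": reply})

if __name__ == "__main__":
    app.run(port=5000)
```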
Got it! Where can we locate the webhook API info to run MetaHumans standalone via a browser?