This video demonstrates using Safari on an iPhone to enter English text, which produces speech audio from CereProc and complementary mouth and facial animation for MetaHumans in Unreal Engine.
Sorry for not seeing this question earlier! I am a terrible person.
All the TTS videos I have posted here do happen in real time. However, when you request TTS from CereVoice Cloud, there is obviously some amount of delay before you get a reply. The reply provides a URL for the wav file and a URL for a metadata file, which contains timing information for the words in the audio file. The length of this delay increases with the word count of the request, but it is pretty quick.
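For anyone curious, the reply handling described above might look roughly like this. To be clear, the field names and URLs here are assumptions for illustration only, not CereProc's actual API shape:

```python
import json

def parse_tts_reply(reply_json: str) -> dict:
    """Extract the wav URL and word-timing metadata URL from a
    hypothetical cloud TTS reply (field names are assumed)."""
    reply = json.loads(reply_json)
    return {
        "audio_url": reply["audio_url"],        # URL of the synthesized wav file
        "metadata_url": reply["metadata_url"],  # URL of the word-timing metadata
    }

# Mocked reply shaped like the one described above (not a real response):
mock_reply = json.dumps({
    "audio_url": "https://example.com/tts/clip.wav",
    "metadata_url": "https://example.com/tts/clip_timings.json",
})
urls = parse_tts_reply(mock_reply)
```

In the real pipeline you would of course fetch both URLs and feed the timing data to the animation system; this sketch only shows the reply-parsing step.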
More recently I have made it so that all the dialogue URLs and metadata URLs are generated before the animation is played back; however, this still downloads and plays back the audio file in as close to real time as possible.
I just posted here about an alternative option: simply entering the words into a Sequencer timeline alongside a standard wav file.
Thanks for watching and commenting. I promise to reply much more swiftly next time…
I’m surprised I didn’t see this sooner. The quality of what you’re doing is very lovely and I cannot wait to see this grow further. I would be a little curious to see more expressions as this develops further.
Has it been a little easier to work on in recent months?
Hi @The_M0ss_Man, thanks for watching and commenting. I can get distracted by the many other things UE can accomplish, so I have just come back to this after a break. Yeah, I have also added general facial expressions which can be layered on top of the mouth animations, such as happy, sad, quizzical, angry, etc. They are added simply via another Sequencer event timeline with the string for the expression. In general, it is something that just gets better with more time spent working on it…
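The layering idea mentioned above could be sketched like this. This is not Unreal code, and the curve names, expression presets, and weights are all invented for illustration; in the actual project the blending happens inside UE's animation system, driven by the string passed from the Sequencer event:

```python
# Hypothetical expression presets: facial curve name -> additive weight.
EXPRESSIONS = {
    "happy":     {"mouthSmile": 0.6, "browRaise": 0.3},
    "sad":       {"mouthFrown": 0.5, "browDown": 0.4},
    "quizzical": {"browRaiseLeft": 0.7},
}

def layer_expression(mouth_curves: dict, expression: str) -> dict:
    """Additively blend a named expression preset over the per-frame
    mouth-animation curves, clamping each curve to the [0, 1] range."""
    blended = dict(mouth_curves)
    for curve, weight in EXPRESSIONS.get(expression, {}).items():
        blended[curve] = min(1.0, blended.get(curve, 0.0) + weight)
    return blended

# One frame of mouth animation with "happy" layered on top:
frame = layer_expression({"jawOpen": 0.8, "mouthSmile": 0.5}, "happy")
```

The point is just that the mouth animation stays untouched while the expression adds on top of it, which matches the layering described in the reply.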