Kirill and I (Snayss) recently made a music video in Unreal Engine 5, exploring the new advances in Metahumans, Lumen Lighting, and Face Capture technology. This was a very cool exploration in digital production, and left us feeling like we had just been "on set" for one month :) I want to go through some of the details of the project in this article, as I think it will be interesting to those with eyes on the metaverse!
Our drive to start this project actually emerged from an introduction to the Metahuman software for Unreal Engine 5. I had seen the amazing advances in Unreal Engine over the last few years, but could not convince myself to learn this beast of a program until I saw THE METAHUMANS. Metahuman Creator is a web app that allows you to create and customize an avatar. During this process, the app streams a video feed to you, which avoids any client-side render costs. When you are ready, you can export all the textures / models / shaders / blueprints for a functional metahuman and load them into Unreal Engine.
It’s very mesmerizing to use this tool, blending between different faces and sculpting the face and textures to your taste. It’s really hard to look away from her, watching her animate shyly, looking at me, looking away…
We did the first pass at recreating Kiki Yago. We found this task quite challenging, as the tool always seemed to clamp the features we wanted to sculpt. For example, we could not find a way to set the avatar’s eyes wider apart. Anyway, we sent this draft to Kiki and she said that it was cool but definitely did not look like her, and that she would give it a try.
I was skeptical that it would be possible to create an avatar that looks like yourself, since we all have some sort of self-perception deficiencies and biases… but I was proven wrong! Kiki’s versions capture the essence of her facial features way better than ours, even with our restriction that she must be bald :)
I am now very curious to create my own avatar, and I think it could be a cool exercise in self-perception for all of us. It’s the first avatar creator that can actually produce a photorealistic likeness of you, but do you even think you look like you? :0
The goal was to get Kiki to sing and record her facial capture for the entire song, keeping in mind that Kiki was in Saint Petersburg, Russia, and we were in San Francisco, USA. We decided to use the FaceCap application, since our friend had used it for her Little Martian project with promising results, and all it needed was an iPhone X or newer! Kiki downloaded it on her iPhone and could send us the FBX over Telegram!
It’s amazing that, with the technologies we have today and the power of virtual production, we were able to come up with a face capture solution that did not require any studio equipment or even being in the same country!
Kiki sent us some short initial test recordings, and we spent a few days struggling through the technical details of getting the animation to map to the Metahuman facial rig -- this was mostly because it was our first time using Unreal Engine for... anything, and also because Unreal Engine completely changed its animation retargeting system in UE5.
However, thanks to watching this video, its beautiful elevator soundtrack, and the FaceCap morph target re-mapping data that the author generously shared, we were able to get an initial working demo!
This was super promising! Meta Kiki was alive! But obviously there was a lot of work to be done on adjusting the remapping of FaceCap to Metahuman Rig, specifically in the mouth and jaw area.
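To give a feel for what adjusting the remapping involves, here is a minimal Python sketch of the idea, not the actual setup we used in Unreal. It assumes each captured frame is a set of blendshape weights in the 0..1 range under ARKit-style names such as jawOpen. The target curve names, gains, and offsets are hypothetical placeholders; the real values depend on your rig and on a lot of trial and error against reference footage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurveMap:
    target: str          # hypothetical name of the curve on the target rig
    gain: float = 1.0    # multiplier to exaggerate or soften the motion
    offset: float = 0.0  # constant added after the gain


# Hypothetical remap table; real names and values depend on your rig.
REMAP = {
    "jawOpen": CurveMap("target_jaw_open", gain=1.3),
    "mouthClose": CurveMap("target_mouth_close", gain=1.5, offset=0.05),
    "eyeBlinkLeft": CurveMap("target_eye_blink_l"),
}


def remap_frame(weights: dict[str, float]) -> dict[str, float]:
    """Map one frame of source weights to target curves, clamped to 0..1."""
    out: dict[str, float] = {}
    for name, value in weights.items():
        rule = REMAP.get(name)
        if rule is None:
            continue  # source shapes without a mapping are dropped
        scaled = value * rule.gain + rule.offset
        out[rule.target] = min(1.0, max(0.0, scaled))
    return out
```

Tuning the mouth and jaw then comes down to editing the gain and offset of a few entries and comparing the result with the reference video.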
We asked Kiki to send us, over Telegram (our Remote Face Capture Studio), the FaceCap FBX along with a recording of herself from a separate camera and a WAV file. This gave us a complete picture of the data: reference video to consult while adjusting the Metahuman facial rig remapping, and reference audio for aligning the face capture to the main audio track.
After we got a good pipeline going, we could focus on ways to improve the overall effect of the face capture. We found that eye acting and head rotations came across really well and were a good place to focus the performance efforts. It was also critical to over-articulate words and press the lips together, which was actually a difficult task for this song, where Kiki is pushing her lyrical alliterations at 100 words/second :)
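One way to use the reference audio for alignment is to slide the take along the main track and keep the shift where the two waveforms agree best. The Python sketch below illustrates that idea with a brute-force cross-correlation; it is an assumption about how this could be automated, not a description of what we did. All function names are made up for the example, and both signals are assumed to be mono sample lists at the same sample rate.

```python
def alignment_score(track: list[float], take: list[float], shift: int) -> float:
    """Sum of products of overlapping samples when take[0] sits at track[shift]."""
    total = 0.0
    for i, sample in enumerate(take):
        j = i + shift
        if 0 <= j < len(track):
            total += track[j] * sample
    return total


def find_offset(track: list[float], take: list[float], max_shift: int) -> int:
    """Return the shift in samples (within +/- max_shift) with the highest score."""
    shifts = range(-max_shift, max_shift + 1)
    return max(shifts, key=lambda s: alignment_score(track, take, s))


def offset_to_frames(offset_samples: int, sample_rate: int, fps: float) -> int:
    """Convert an audio offset in samples to the nearest animation frame."""
    return round(offset_samples / sample_rate * fps)
```

This version is far too slow for full-length audio at a typical sample rate, so in practice you would downsample both signals first or switch to an FFT-based correlation.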
Kiki pushed forward on this, coming up with some crazy fast eye blinking action and creepy doll expressions. She split the song into 12 sections of around 20 seconds each, because we found that smaller recordings were much easier to process in Unreal Engine.
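The arithmetic behind the split is simple enough to sketch in Python. The 240-second duration used in the usage note is only an assumption that follows from 12 sections of about 20 seconds, not the measured length of the song, and the function name is made up for the example.

```python
def split_sections(total_seconds: float, target_seconds: float) -> list[tuple[float, float]]:
    """Split a duration into equal sections that are close to target_seconds long."""
    if total_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    count = max(1, round(total_seconds / target_seconds))
    length = total_seconds / count
    # Each tuple is (start, end) in seconds, with no gaps between sections.
    return [(i * length, (i + 1) * length) for i in range(count)]
```

Calling split_sections(240, 20) gives 12 back-to-back sections, each of which can be recorded and processed on its own.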
Once we were confident in our remote Face Capture pipeline, we moved on to the environment design and visual effects! This is a big and slightly unrelated topic, since most of this work was done in HoudiniFX -- so I’m going to split that into another article :)
We are really excited about this technology, and about taking this whole experience to the next level. Since the metahumans are so realistic, we definitely need to put some work into 3D environment, shader, and geometry styles that are in harmony with them. This is an exploration that I’m very much looking forward to, with some shorter concept video renders.
@snayss - Realtime Unreal Engine and graphics dev
@xenofontova_dasha - Vocals, realtime face capture
@kif11 - Environment artist and pipeline
A copy of this article also exists on mirror.xyz and medium.com