
GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar

International Conference on 3D Vision (3DV) 2024

1Max Planck Institute for Intelligent Systems, Tübingen, Germany; 2Max Planck Institute for Informatics, Germany; 3University of Tübingen; 4Technical University of Darmstadt; 5Meta Reality Labs

Digital humans and, especially, 3D facial avatars have attracted a lot of attention in the past years, as they are the backbone of several applications such as immersive telepresence in AR or VR. Despite this progress, facial avatars reconstructed from commodity hardware are incomplete and miss parts of the side and back of the head, severely limiting the usability of the avatar. This limitation in prior work stems from their requirement of face tracking, which fails for profile and back views. To address this issue, we propose to learn person-specific animatable avatars from images without assuming access to precise facial expression tracking. At the core of our method, we leverage a 3D-aware generative model that is trained to reproduce the distribution of facial expressions in the training data. To train this appearance model, we assume only a collection of 2D images with corresponding camera parameters. To control the model, we learn a mapping from 3DMM facial expression parameters to the latent space of the generative model. This mapping is learned by sampling the latent space of the appearance model and reconstructing the facial parameters from a normalized frontal view, where facial expression estimation performs well. With this scheme, we decouple 3D appearance reconstruction from animation control to achieve high fidelity in image synthesis. In a series of experiments, we compare our proposed technique to state-of-the-art monocular methods and show superior quality while not requiring expression tracking of the training data.
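
The sketch below is a rough, hypothetical illustration of the expression-control scheme described above: sample latents from the trained appearance model, render a normalized frontal view, estimate 3DMM expression parameters from that rendering, and regress the latent from the expression. The objects generator (with sample_latent and render methods), frontal_camera, and estimate_3dmm_expression are placeholders for components the method assumes; their names and signatures are illustrative and do not correspond to the authors' released code.

import torch
import torch.nn as nn

class ExpressionToLatent(nn.Module):
    """MLP mapping 3DMM expression parameters to the GAN latent space."""
    def __init__(self, expr_dim=50, latent_dim=512, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(expr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, expr):
        return self.net(expr)

def train_mapping(generator, estimate_3dmm_expression, frontal_camera,
                  expr_dim=50, latent_dim=512, steps=10_000, batch=8, device="cuda"):
    mapping = ExpressionToLatent(expr_dim, latent_dim).to(device)
    opt = torch.optim.Adam(mapping.parameters(), lr=1e-4)

    for _ in range(steps):
        with torch.no_grad():
            # 1) Sample latents from the pre-trained appearance model.
            w = generator.sample_latent(batch).to(device)        # (B, latent_dim), placeholder API
            # 2) Render a normalized frontal view, where expression tracking works well.
            frontal = generator.render(w, frontal_camera)        # (B, 3, H, W), placeholder API
            # 3) Estimate 3DMM expression parameters from the frontal rendering.
            expr = estimate_3dmm_expression(frontal)             # (B, expr_dim), placeholder API

        # 4) Regress the latent from the expression parameters.
        w_pred = mapping(expr)
        loss = torch.nn.functional.mse_loss(w_pred, w)

        opt.zero_grad()
        loss.backward()
        opt.step()

    return mapping

At animation time, expression parameters tracked from a driving sequence would be pushed through the learned mapping and rendered by the 3D-aware generator from an arbitrary camera, so full-head novel views are controllable without ever tracking expressions on the training images.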


Video

BibTeX

@inproceedings{kabadayi24ganavatar,
      title = {GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar},
      author = {Kabadayi, Berna and Zielonka, Wojciech and Bhatnagar, Bharat Lal and Pons-Moll, Gerard and Thies, Justus},
      booktitle = {International Conference on 3D Vision (3DV)},
      month = {March},
      year = {2024},
}