GASP: Gaussian Avatars with Synthetic Priors
This post was written by Claude 4.6 as a summary of Jack Saunders’ paper.
Gaussian Splatting has become one of the most exciting techniques in real-time 3D rendering — producing photorealistic results at high frame rates. Applying it to animatable human avatars is a natural next step, and recent Gaussian Avatar methods have produced impressive results. But there’s a catch: the best-quality models require expensive multi-camera rigs. Models trained on a single camera look great from that viewpoint, but degrade significantly when viewed from other angles.
GASP (Gaussian Avatars with Synthetic Priors) closes this gap.
The problem with single-camera data.
When training on a single viewpoint, a model simply doesn’t see most of the head — the back, the sides, the top. Without this information, it can’t learn to reconstruct those regions faithfully. The missing data problem is fundamental.
The solution: a synthetic prior.
GASP’s key insight is to train a generative prior over Gaussian Avatars using synthetic data. Synthetic datasets can be perfectly annotated (exact camera parameters, full 360° coverage, clean correspondences to a 3D morphable model) and arbitrarily diverse — none of which is true for real captured datasets.
The prior learns correlations between Gaussians across the head. When you later fit it to a real person using a single camera, these correlations allow the model to infer unseen regions from what it can see — updating the back of the hair from the front, filling in the ears from the cheeks.
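The mechanism behind "infer unseen regions from what you can see" can be illustrated with a toy sketch. Here a frozen linear map stands in for the learned prior decoder (the paper uses a neural decoder; all names and dimensions below are illustrative, not from the paper). Because every Gaussian's parameters are produced from one shared identity latent, a loss computed only on the visible front still moves the unseen back of the head:

```python
import numpy as np

# Toy stand-in for the prior: a frozen decoder mapping a shared
# identity latent to one parameter per Gaussian on the head.
rng = np.random.default_rng(0)
n_gaussians, latent_dim = 8, 3
W = rng.normal(size=(n_gaussians, latent_dim))  # frozen "prior" decoder

target = rng.normal(size=n_gaussians)   # the real person's parameters
visible = np.arange(4)                  # single camera: front half only

z = np.zeros(latent_dim)                # identity latent to fit
before_unseen = (W @ z)[4:].copy()      # unseen params before fitting

for _ in range(500):
    # Loss is computed ONLY on the visible Gaussians ...
    residual = (W @ z)[visible] - target[visible]
    grad = W[visible].T @ residual      # gradient of 0.5 * ||residual||^2
    z -= 0.05 * grad

after = W @ z
# ... yet the unseen back-of-head parameters moved too, because they
# share the latent z with the visible front.
print(np.abs(after[4:] - before_unseen).max() > 0)
```

The same coupling is what lets GASP's prior update the back of the hair from the front view: correlations across the head are baked into the decoder, so observations anywhere constrain the avatar everywhere.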
Three-stage fitting.
User enrolment uses a three-stage process: first optimise the latent identity code; then refine the decoder MLP, exploiting per-Gaussian feature correlations; finally fine-tune the Gaussians directly. The prior is only needed during fitting; inference runs without it, enabling real-time rendering at 70 fps on commercial hardware.
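The three stages can be sketched as progressively loosening what is optimised, again with a linear decoder standing in for the paper's MLP (all names, sizes, and learning rates here are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2
W = rng.normal(size=(n, k))   # prior decoder weights (toy stand-in for the MLP)
target = rng.normal(size=n)   # the real person's per-Gaussian parameters
vis = np.arange(3)            # single camera: only the front is observed

def fit(params, grad_fn, lr, steps):
    """Plain gradient descent; grad_fn returns d(loss)/d(params)."""
    for _ in range(steps):
        params = params - lr * grad_fn(params)
    return params

# Stage 1: optimise the latent identity code, decoder frozen.
z = fit(np.zeros(k),
        lambda z: W[vis].T @ ((W @ z)[vis] - target[vis]), 0.05, 300)

# Stage 2: refine the decoder itself, latent frozen. The visible-only
# residual still updates whole rows of W, so unseen Gaussians stay
# coupled to what the camera sees.
def decoder_grad(Wc):
    r = np.zeros(n)
    r[vis] = (Wc @ z)[vis] - target[vis]
    return np.outer(r, z)
W = fit(W.copy(), decoder_grad, 0.05, 300)

# Stage 3: fine-tune the Gaussian parameters directly. The prior has
# done its job; inference afterwards no longer needs it.
g = W @ z
g[vis] = fit(g[vis], lambda gv: gv - target[vis], 0.5, 100)
```

Dropping the prior after stage 3 is what makes the 70 fps figure possible: at render time the avatar is just a set of Gaussians, with no decoder in the loop.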
The work was conducted during Jack's internship at Microsoft, and GASP was accepted at CVPR 2025.
Links: Project Page · arXiv · Video · CVPR