Hand-drawn character animation is a vibrant research area in computer graphics and presents unique challenges in achieving geometric consistency while conveying expressive motion details. Traditional skeletal animation methods maintain geometric consistency but often struggle with complex non-rigid elements such as flowing hair and skirts, resulting in unnatural deformation and missing secondary dynamics. In contrast, video diffusion models effectively synthesize physically plausible dynamics but, because of the domain gap, produce real-human-like characteristics and geometric distortions when applied to stylized drawings. In this work, we propose a novel hybrid animation system that integrates the strengths of skeletal animation and video diffusion priors. The core idea is to first generate coarse images from characters retargeted with skeletal animations, which serve as geometric consistency guidance, and then enhance the texture details and secondary dynamics of these images using video diffusion priors. We formulate the enhancement of coarse images as an inpainting task and propose a domain-adapted diffusion model to refine user-masked regions requiring improvement, particularly those involving secondary dynamics. To further enhance motion realism, we propose a Secondary Dynamics Injection (SDI) strategy that, during the denoising process, incorporates latent features from a pre-trained diffusion model enriched with human motion priors. Additionally, to address unnatural deformation artifacts caused by the integrated hair-body geometry in low-poly single-mesh character modeling, we introduce a Hair Layering Modeling (HLM) technique that employs segmentation maps to separate hair from the body in implicit fields, enabling more natural animation of challenging long-hair characters. Through extensive experiments, we demonstrate that our system outperforms state-of-the-art methods in both quantitative and qualitative evaluations.
@inproceedings{zhouqu2025waving,
author = {Zhou, Jie and Qu, Linzi and Lam, Miu-Ling and Fu, Hongbo},
title = {From Rigging to Waving: 3D-Guided Diffusion for Natural Animation of Hand-Drawn Characters},
booktitle = {ACM Transactions on Graphics (TOG)},
year = {2025},
}