@InProceedings{Zhang_2024_ACCV,
  author    = {Zhang, Haosong and Leong, Mei Chee and Li, Liyuan and Lin, Weisi},
  title     = {RD-Diff: RLTransformer-based Diffusion Model with Diversity-Inducing Modulator for Human Motion Prediction},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {3531-3551}
}
RD-Diff: RLTransformer-based Diffusion Model with Diversity-Inducing Modulator for Human Motion Prediction
Abstract
Human Motion Prediction (HMP) is crucial for applications such as human-robot collaboration, surveillance, and autonomous driving. Recently, diffusion models have shown promising progress due to their ease of training and realistic generation capabilities. To enhance both the accuracy and diversity of diffusion models in HMP, we present RD-Diff: an RLTransformer-based Diffusion model with a Diversity-inducing modulator. First, to improve the transformer's effectiveness on the frequency representation of human motion obtained via the Discrete Cosine Transform (DCT), we introduce a novel Regulated Linear Transformer (RLTransformer) with a specially designed linear-attention mechanism. Next, to further enhance performance, we propose a Diversity-Inducing Modulator (DIM) that generates noise-modulated observation conditions for a pretrained diffusion model. Experimental results show that RD-Diff establishes new state-of-the-art performance in both accuracy and diversity compared to existing methods.
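The abstract names linear attention as the core of the RLTransformer. As a rough illustration only (the paper's "regulated" variant is not specified here, and the feature map below is an assumption), a generic linear-attention step replaces the O(T²) softmax attention with an O(T) kernelized form:

```python
import numpy as np

def phi(x):
    # ELU(x) + 1 feature map: a common positivity-preserving choice
    # in linear-attention literature (assumed, not from the paper).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    # softmax(QK^T)V is approximated by phi(Q) (phi(K)^T V),
    # so the T x T attention matrix is never materialized.
    Qp, Kp = phi(Q), phi(K)                    # (T, d) each
    kv = Kp.T @ V                              # (d, d_v) key-value summary
    z = Qp @ Kp.sum(axis=0, keepdims=True).T  # (T, 1) normalizer
    return (Qp @ kv) / (z + eps)

# Hypothetical shapes: T DCT coefficients of a motion sequence,
# each projected to a d-dimensional token.
T, d = 16, 8
rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, T, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (16, 8)
```

Because the normalizer makes each output row a convex combination of value rows, the result stays bounded by the values; the linear cost in sequence length is what makes such attention attractive for long frequency-domain token sequences.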