@InProceedings{Ke_2026_CVPR,
    author    = {Ke, Zundong and Chen, Junlin and Zhu, Jiayi and Xia, Kuanhao and Zhao, Boyi and Gu, Jiayuan},
    title     = {REACH: Explicit Recovery Behavior for Diffusion Policies},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {38498-38508}
}
REACH: Explicit Recovery Behavior for Diffusion Policies
Abstract
Diffusion policies have emerged as a powerful paradigm for robot learning, but their inherent multi-modality can yield a diverse set of plausible, though not always optimal, actions from a single observation. We posit that for a given task, an optimal action exists within this distribution. Inspired by negative prompting in generative models, we introduce a novel method that leverages an error detector to identify out-of-distribution (OOD) execution histories and uses them to construct negative action prompts. This allows our policy to steer away from suboptimal behaviors and converge towards higher-performance actions. We present a comprehensive ablation study demonstrating the effectiveness of positive and negative prompts, and validate our approach on a suite of simulated benchmarks and real-world robotic tasks. Our results show that the proposed Negative-Prompt-guided Diffusion Policy achieves a significant improvement in task performance by effectively filtering undesirable action modes.