REACH: Explicit Recovery Behavior for Diffusion Policies

Ke, Zundong; Chen, Junlin; Zhu, Jiayi; Xia, Kuanhao; Zhao, Boyi; Gu, Jiayuan

Zundong Ke, Junlin Chen, Jiayi Zhu, Kuanhao Xia, Boyi Zhao, Jiayuan Gu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 38498-38508

Abstract

Diffusion policies have emerged as a powerful paradigm for robot learning, but their inherent multi-modality can produce a diverse set of plausible, though not always optimal, actions from a single observation. We posit that for a given task, an optimal action exists within this distribution. Inspired by negative prompting in generative models, we introduce a novel method that leverages an error detector to identify out-of-distribution (OOD) execution histories and uses them to construct negative action prompts. This allows our policy to steer away from suboptimal behaviors and converge towards higher-performing actions. We present a comprehensive ablation study demonstrating the effectiveness of positive and negative prompts, and validate our approach on a suite of simulated benchmarks and real-world robotic tasks. Our results show that the proposed Negative-Prompt-guided Diffusion Policy achieves a significant improvement in task performance by effectively filtering undesirable action modes.
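The abstract does not state the guidance rule itself. As an illustration only, negative prompting in image diffusion models is commonly implemented by replacing the unconditional branch of classifier-free guidance with a prediction conditioned on the negative prompt. The sketch below assumes that form carries over to action denoising; the function name, argument names, and the guidance scale are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def negative_prompt_guidance(
    eps_pos: Sequence[float],
    eps_neg: Sequence[float],
    scale: float,
) -> List[float]:
    """Combine two noise predictions, classifier-free-guidance style.

    eps_pos: noise predicted under the positive (desired) condition.
    eps_neg: noise predicted under the negative prompt, here assumed to be
             built from an execution history flagged as OOD.
    scale:   hypothetical guidance weight; 1.0 returns eps_pos unchanged,
             larger values push further away from eps_neg.
    """
    if len(eps_pos) != len(eps_neg):
        raise ValueError("predictions must have the same length")
    # eps = eps_neg + scale * (eps_pos - eps_neg)
    return [n + scale * (p - n) for p, n in zip(eps_pos, eps_neg)]
```

With `scale = 1.0` the negative prompt has no effect; values above 1.0 extrapolate past the positive prediction in the direction away from the negative one, which is the sense in which such guidance "steers away" from an undesired mode.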

Related Material

[bibtex]

@InProceedings{Ke_2026_CVPR,
    author    = {Ke, Zundong and Chen, Junlin and Zhu, Jiayi and Xia, Kuanhao and Zhao, Boyi and Gu, Jiayuan},
    title     = {REACH: Explicit Recovery Behavior for Diffusion Policies},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {38498-38508}
}