E$^2$-SCI: Elastic Edge-Cloud Speculative Decoding via Credit Inertia

Senyao Li, Haozhao Wang, Zhaobai Jiang, Zhanbo Jin, Hao Fan, Ruixuan Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 12944-12954

Abstract


In edge-cloud environments, the efficiency of speculative decoding is heavily constrained by uplink transmission and cloud-side verification. In this work, we identify a phenomenon we term credit inertia, in which the acceptance rates of adjacent token windows exhibit strong temporal consistency: tokens following recently well-performing windows are likely to pass verification, whereas tokens following poorly performing windows are likely to fail. Motivated by this observation, we propose E$^2$-SCI, an elastic edge-cloud speculative decoding framework that dynamically adjusts draft-token verification thresholds based on recent historical performance. This adaptive mechanism allows the system to be more permissive for windows with strong historical performance and stricter for windows with weak performance, leveraging temporal consistency to reduce overall latency. We further introduce Progressive Lookahead Concurrency (PLC), which pipelines draft generation and verification asynchronously to hide latency. Experiments across multiple benchmarks show that E$^2$-SCI achieves over 9.4 tokens/s on DeepSeek-R1-Distill-Qwen (1.5B/32B), an 88.5% speed improvement over the FSD baseline, while maintaining accuracy. Notably, E$^2$-SCI integrates seamlessly with existing frameworks, demonstrating broad applicability and superior efficiency-quality trade-offs.
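The threshold adaptation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name CreditInertiaThreshold, the linear update rule, and the parameters base, gain, history, lo, and hi are all assumptions introduced for illustration. The sketch further assumes that a lower threshold means a more permissive verification, and that "recent historical performance" is the mean acceptance rate over the last few windows.

```python
from collections import deque


class CreditInertiaThreshold:
    """Hypothetical controller mapping recent acceptance rates to a threshold.

    Assumption: a lower threshold is more permissive toward draft tokens.
    """

    def __init__(self, base=0.5, gain=0.4, history=4, lo=0.1, hi=0.9):
        self.base = base    # threshold used when there is no history
        self.gain = gain    # how strongly past performance shifts the threshold
        self.lo = lo        # lower clamp on the threshold
        self.hi = hi        # upper clamp on the threshold
        self.rates = deque(maxlen=history)  # acceptance rates of recent windows

    def update(self, accepted, proposed):
        """Record the outcome of one verified window of draft tokens."""
        if proposed <= 0:
            raise ValueError("proposed must be positive")
        self.rates.append(accepted / proposed)

    def credit(self):
        """Mean acceptance rate over the retained windows (0.5 if none)."""
        if not self.rates:
            return 0.5
        return sum(self.rates) / len(self.rates)

    def threshold(self):
        """High credit lowers the threshold; low credit raises it."""
        raw = self.base - self.gain * (self.credit() - 0.5)
        return min(self.hi, max(self.lo, raw))


# Usage: a window in which all 8 draft tokens were accepted makes the
# controller more permissive; a window with none accepted makes it stricter.
good = CreditInertiaThreshold()
good.update(accepted=8, proposed=8)

bad = CreditInertiaThreshold()
bad.update(accepted=0, proposed=8)

print(good.threshold(), bad.threshold())
```

The bounded history (a deque with maxlen) is one simple way to make the threshold track only adjacent windows, which is where the abstract reports temporal consistency; an exponential moving average would be an equally plausible choice.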

Related Material


[bibtex]
@InProceedings{Li_2026_CVPR,
    author    = {Li, Senyao and Wang, Haozhao and Jiang, Zhaobai and Jin, Zhanbo and Fan, Hao and Li, Ruixuan},
    title     = {E$^2$-SCI: Elastic Edge-Cloud Speculative Decoding via Credit Inertia},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {12944-12954}
}