MSCC: Multi-Scale Transformers for Camera Calibration

Xu Song, Hao Kang, Atsunori Moteki, Genta Suzuki, Yoshie Kobayashi, Zhiming Tan; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 3262-3271

Abstract


Camera calibration is important for vision tasks such as rendering 3D scenes, environment reconstruction, and self-localization. In this paper, we propose a multi-scale transformer framework for camera calibration. Given a single input image, the multi-scale features output by the model's backbone are used to estimate camera parameters. By studying the attention response for horizon line estimation, we also show that a coarse-to-fine strategy is effective at locating both global structures and detailed features in the image. Moreover, deep supervision is shown to yield more precise results and faster training. Our method outperforms state-of-the-art methods in objective and subjective experiments on the Google Street View and Pano360 datasets.
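The deep supervision mentioned above can be illustrated with a minimal sketch: each scale's output is compared against the same ground-truth camera parameters, and the per-scale losses are summed with weights. The abstract does not state the loss function, the weights, or the parameterization, so the L1 loss, the weight values, and the names `l1_loss` and `deep_supervision_loss` are assumptions made for illustration only, not the authors' implementation.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between predicted and ground-truth parameters.

    The choice of L1 is an assumption; the abstract does not name a loss.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def deep_supervision_loss(
    preds_per_scale: Sequence[Sequence[float]],
    target: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-scale losses, ordered coarse to fine.

    Every scale is supervised with the same target, so intermediate
    (coarse) outputs receive a training signal as well as the final one.
    """
    if len(preds_per_scale) != len(weights):
        raise ValueError("need exactly one weight per scale")
    return sum(
        w * l1_loss(pred, target)
        for pred, w in zip(preds_per_scale, weights)
    )


# Hypothetical example: two scales, each predicting two parameters.
coarse = [1.0, 2.0]   # prediction from the coarse scale
fine = [0.5, 1.0]     # prediction from the fine scale
truth = [0.0, 1.0]    # ground-truth parameters (made-up values)
total = deep_supervision_loss([coarse, fine], truth, weights=[0.5, 1.0])
```

Giving the coarse scale a smaller weight than the fine scale is one common design choice; equal weights would work equally well in this sketch.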

Related Material


[bibtex]
@InProceedings{Song_2024_WACV,
    author    = {Song, Xu and Kang, Hao and Moteki, Atsunori and Suzuki, Genta and Kobayashi, Yoshie and Tan, Zhiming},
    title     = {MSCC: Multi-Scale Transformers for Camera Calibration},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {3262-3271}
}