CrackFormer: Transformer Network for Fine-Grained Crack Detection

Huajun Liu, Xiangyu Miao, Christoph Mertz, Chengzhong Xu, Hui Kong; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 3783-3792


Cracks are irregular line structures that are of interest in many computer vision applications. Crack detection (e.g., from pavement images) is a challenging task due to intensity inhomogeneity, topological complexity, low contrast, and noisy backgrounds. The overall crack detection accuracy can be significantly affected by the detection performance on fine-grained cracks. In this work, we propose a Crack Transformer network (CrackFormer) for fine-grained crack detection. The CrackFormer is composed of novel attention modules in a SegNet-like encoder-decoder architecture. Specifically, it consists of novel self-attention modules with 1x1 convolutional kernels for efficient contextual information extraction across feature channels, and efficient positional embedding to capture large-receptive-field contextual information for long-range interactions. It also introduces new scaling-attention modules that combine outputs from the corresponding encoder and decoder blocks to suppress non-semantic features and sharpen semantic cracks. The CrackFormer is trained and evaluated on three classical crack datasets. The experimental results show that CrackFormer achieves ODS values of 0.871, 0.877, and 0.881, respectively, on the three datasets and outperforms the state-of-the-art methods.
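The abstract reports results as ODS values without defining the metric. As commonly used in edge and crack detection, ODS (optimal dataset scale) is the best F-measure obtained when a single binarization threshold is shared across the whole dataset. The following is a minimal sketch under that assumption, not the authors' evaluation code: it matches pixels exactly, with no distance tolerance, and the function names `counts_at_threshold` and `ods_f_measure` are hypothetical. The paper's actual protocol may differ.

```python
from typing import List, Sequence, Tuple

# A 2D map given as rows of values: predicted crack probabilities,
# or a 0/1 ground-truth mask.
Image = Sequence[Sequence[float]]


def counts_at_threshold(pred: Image, truth: Image, t: float) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one image."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, g in zip(pred_row, truth_row):
            detected = p >= t      # pixel predicted as crack at this threshold
            is_crack = g >= 0.5    # pixel labeled as crack in the ground truth
            if detected and is_crack:
                tp += 1
            elif detected:
                fp += 1
            elif is_crack:
                fn += 1
    return tp, fp, fn


def ods_f_measure(
    preds: Sequence[Image], truths: Sequence[Image], thresholds: List[float]
) -> Tuple[float, float]:
    """Best dataset-level F-measure over one shared threshold.

    Returns (best F-measure, threshold that achieved it).
    Assumption: exact pixel matching, no tolerance band around cracks.
    """
    best_f, best_t = 0.0, thresholds[0]
    for t in thresholds:
        tp = fp = fn = 0
        # Pool the counts over every image before computing F.
        for pred, truth in zip(preds, truths):
            a, b, c = counts_at_threshold(pred, truth, t)
            tp, fp, fn = tp + a, fp + b, fn + c
        denom = 2 * tp + fp + fn
        # 2TP / (2TP + FP + FN) equals 2PR / (P + R).
        f = 2 * tp / denom if denom else 0.0
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t
```

Pooling the counts over all images before computing the F-measure is what makes this a dataset-scale score; choosing the best threshold separately for each image and averaging would instead give the related OIS (optimal image scale) measure.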

Related Material

@InProceedings{Liu_2021_ICCV,
    author    = {Liu, Huajun and Miao, Xiangyu and Mertz, Christoph and Xu, Chengzhong and Kong, Hui},
    title     = {CrackFormer: Transformer Network for Fine-Grained Crack Detection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3783-3792}
}