CrackFormerSV2: Advanced Pavement Crack Segmentation with Swin Transformer V2 and Dual Attention Mechanisms

VILLANUEVA, Fredy Gabriel Ramírez; NOGUERA, Jose Luis Vazquez; AYALA, Horacio Andrés Legal; ROMÁN, Julio Cesar Mello; ESTIGARRIBIA, Pastor Enmanuel Pérez

Abstract

Pavement crack detection and segmentation are critical for effective infrastructure maintenance. Despite promising advances driven by deep learning, significant challenges persist because of the inherent complexity of pavement cracks. This paper introduces CrackFormerSV2, a novel encoder-decoder architecture designed for robust pavement crack segmentation. Its encoder builds on the hierarchical feature extraction of Swin Transformer V2. The architecture also incorporates two attention mechanisms: the Convolutional Block Attention Module (CBAM), placed within the decoder blocks to refine feature maps, and a novel Skip Attention module, which enhances traditional skip connections by applying cross-attention between corresponding encoder and decoder features. An Atrous Spatial Pyramid Pooling (ASPP) module at the bottleneck aggregates the multi-scale contextual information needed to capture diverse crack patterns. The model is trained with a learning rate schedule that uses distinct rates for the pre-trained encoder and the decoder. In evaluations on established public benchmarks and a proprietary dataset, CrackFormerSV2 outperforms a baseline UNet-ResNet model in Intersection over Union (IoU), recall, precision, and F1 score. Continued optimization, including gradual unfreezing and the exploration of advanced loss functions, suggests that CrackFormerSV2 could match or surpass current state-of-the-art results in pavement crack segmentation.
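
The abstract names its evaluation metrics without defining them. As a minimal sketch, assuming the usual pixel-wise definitions for binary masks (1 = crack, 0 = background), the four scores can be computed from true positives, false positives, and false negatives. The function name `crack_metrics` and the toy masks are illustrative only and are not taken from the paper.

```python
def crack_metrics(pred, target):
    """Pixel-wise IoU, precision, recall and F1 for binary masks (1 = crack).

    `pred` and `target` are 2D lists of 0/1 values with the same shape.
    This is an assumed, conventional definition, not the authors' code.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # crack pixel correctly predicted
            elif p and not t:
                fp += 1  # background predicted as crack
            elif t and not p:
                fn += 1  # crack pixel missed

    def safe_div(num, den):
        # Return 0.0 instead of failing when a mask has no positive pixels.
        return num / den if den else 0.0

    return {
        "iou": safe_div(tp, tp + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


# Toy example: two pixels agree, one false positive, one false negative.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 0, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
scores = crack_metrics(pred, target)
```

Because cracks occupy only a small fraction of the pixels in a typical pavement image, these overlap-based scores are generally more informative than plain pixel accuracy, which a model could maximize by predicting background everywhere.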

Programme

16:45 to 17:45 on 09/18/2025

Foyer, 12th floor - ST…

Institutions

¹ Universidad Nacional de Caaguazú (UNCA)
² Facultad Politécnica, Universidad Nacional de Asunción
³ Facultad Politécnica, Universidad Nacional de Asunción
⁴ Facultad Politécnica, Universidad Nacional de Asunción

Track

ST03 - Scientific Computing

Keywords

Crack Segmentation

Deep Learning

Vision Transformer

Swin Transformer V2

Attention Mechanism

CNMAC-2025

Book of abstracts of the XLIV National Congress of Applied and Computational Mathematics
