CrackFormerSV2: Advanced Pavement Crack Segmentation with Swin Transformer V2 and Dual Attention Mechanisms

VILLANUEVA, Fredy Gabriel Ramírez; NOGUERA, Jose Luis Vazquez; AYALA, Horacio Andrés Legal; ROMÁN, Julio Cesar Mello; ESTIGARRIBIA, Pastor Enmanuel Pérez

Abstract

Pavement crack detection and segmentation are critical tasks for effective infrastructure maintenance. Despite promising advances driven by deep learning, significant challenges persist due to the inherent complexities of pavement crack characteristics. This paper introduces CrackFormerSV2, a novel encoder-decoder architecture specifically designed for robust pavement crack segmentation. A key feature of CrackFormerSV2 is its integration of the hierarchical feature extraction capabilities of Swin Transformer V2. Furthermore, the architecture incorporates dual attention mechanisms: the Convolutional Block Attention Module (CBAM) within the decoder blocks to refine feature maps, and a novel Skip Attention module that enhances traditional skip connections through a cross-attention strategy between corresponding encoder and decoder features. An Atrous Spatial Pyramid Pooling (ASPP) module is utilized at the bottleneck to effectively aggregate multi-scale contextual information crucial for capturing diverse crack patterns. The model is trained using a strategic learning rate schedule, employing distinct rates for the pre-trained encoder and the decoder. Evaluations conducted on established public benchmarks and a proprietary dataset demonstrate that CrackFormerSV2 achieves significant performance improvements across key metrics, including Intersection over Union (IoU), recall, precision, and F1 score, outperforming a baseline UNet-ResNet model. Continued optimization, including gradual unfreezing and the exploration of advanced loss functions, suggests a strong potential for CrackFormerSV2 to achieve or even surpass current state-of-the-art results in pavement crack segmentation.
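The four metrics named in the abstract can be illustrated with a minimal, dependency-free sketch for binary masks. The abstract does not describe the evaluation protocol, so everything here is an assumption for illustration: the function names (`confusion_counts`, `safe_div`, `segmentation_metrics`) are hypothetical, masks are nested lists with 1 for crack pixels and 0 for background, and a metric whose denominator is zero is reported as 0.0.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, and false negatives
    for the crack class (label 1) over two same-shaped 2D masks."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1:
                fp += 1
            elif t == 1:
                fn += 1
    return tp, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero
    (an assumed convention, e.g. for images with no crack pixels)."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, target):
    """Compute IoU, precision, recall, and F1 for the crack class."""
    tp, fp, fn = confusion_counts(pred, target)
    return {
        "iou": safe_div(tp, tp + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Toy 2x3 masks: two crack pixels agree, one is a false alarm, one is missed.
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    ground_truth = [[1, 0, 0],
                    [0, 1, 1]]
    print(segmentation_metrics(predicted, ground_truth))
```

IoU is the strictest of the four because it penalizes false positives and false negatives in the same denominator, which is one reason it is commonly reported alongside F1 for thin structures such as cracks.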

Schedule

16:45 to 17:45 on 18 September 2025

Foyer, 12th floor - ST…

Institutions

¹ Universidad Nacional de Caaguazú (UNCA)
² Facultad Politécnica, Universidad Nacional de Asunción
³ Facultad Politécnica, Universidad Nacional de Asunción
⁴ Facultad Politécnica, Universidad Nacional de Asunción

Thematic Track

ST03 - Scientific Computing

Keywords

Crack Segmentation

Deep Learning

Vision Transformer

Swin Transformer V2

Attention Mechanism

CNMAC-2025

Book of Abstracts of the XLIV Congresso Nacional de Matemática Aplicada e Computacional
