A critical review and comparative study on image segmentation-based techniques for pavement crack detection

N Kheradmandi, V Mehranfar - Construction and Building Materials, 2022 - Elsevier
The prompt detection of early decay in the pavement could be an auspicious technique in
road maintenance. Admittedly, early crack detection allows preventive measures to be taken …

A review of multimodal image matching: Methods and applications

X Jiang, J Ma, G Xiao, Z Shao, X Guo - Information Fusion, 2021 - Elsevier
Multimodal image matching, which refers to identifying and then corresponding the same or
similar structure/content from two or more images that are of significant modalities or …

Segment anything

A Kirillov, E Mintun, N Ravi, H Mao… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …

Adding conditional control to text-to-image diffusion models

L Zhang, A Rao, M Agrawala - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We present ControlNet, a neural network architecture to add spatial conditioning controls to
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …

Visual ChatGPT: Talking, drawing and editing with visual foundation models

C Wu, S Yin, W Qi, X Wang, Z Tang, N Duan - arXiv preprint arXiv …, 2023 - arxiv.org
ChatGPT is attracting a cross-field interest as it provides a language interface with
remarkable conversational competency and reasoning capabilities across many domains …

Uni-ControlNet: All-in-one control to text-to-image diffusion models

S Zhao, D Chen, YC Chen, J Bao… - Advances in …, 2024 - proceedings.neurips.cc
Text-to-Image diffusion models have made tremendous progress over the past two years,
enabling the generation of highly realistic images based on open-domain text descriptions …

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

Swin Transformer embedding UNet for remote sensing image semantic segmentation

X He, Y Zhou, J Zhao, D Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Global context information is essential for the semantic segmentation of remote sensing (RS)
images. However, most existing methods rely on a convolutional neural network (CNN) …

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

J Chen, J Yu, C Ge, L Yao, E Xie, Y Wu, Z Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The most advanced text-to-image (T2I) models require significant training costs (e.g., millions
of GPU hours), seriously hindering the fundamental innovation for the AIGC community …

LayerCAM: Exploring hierarchical class activation maps for localization

PT Jiang, CB Zhang, Q Hou… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The class activation maps are generated from the final convolutional layer of CNN. They can
highlight discriminative object regions for the class of interest. These discovered object …