Magic ELF: Image deraining meets association learning and transformer

K Jiang, Z Wang, C Chen, Z Wang, L Cui… - arXiv preprint arXiv …, 2022 - arxiv.org
Convolutional neural network (CNN) and Transformer have achieved great success in
multimedia applications. However, little effort has been made to effectively and efficiently …

Refign: Align and refine for adaptation of semantic segmentation to adverse conditions

D Brüggemann, C Sakaridis… - Proceedings of the …, 2023 - openaccess.thecvf.com
Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse
visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) …

Open scene understanding: Grounded situation recognition meets segment anything for helping people with visual impairments

R Liu, J Zhang, K Peng, J Zheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Grounded Situation Recognition (GSR) is capable of recognizing and interpreting
visual scenes in a contextually intuitive way, yielding salient activities (verbs) and the …

Multi-scale fusion and decomposition network for single image deraining

Q Wang, K Jiang, Z Wang, W Ren… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) and self-attention (SA) have demonstrated
remarkable success in low-level vision tasks, such as image super-resolution, deraining …

Refined semantic enhancement towards frequency diffusion for video captioning

X Zhong, Z Li, S Chen, K Jiang, C Chen… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Video captioning aims to generate natural language sentences that describe the given video
accurately. Existing methods obtain favorable generation by exploring richer visual …

Only a few classes confusing: Pixel-wise candidate labels disambiguation for foggy scene understanding

L Liao, W Chen, Z Zhang, J Xiao, Y Yang… - Proceedings of the …, 2023 - ojs.aaai.org
Not all semantics become confusing when deploying a semantic segmentation model for
real-world scene understanding of adverse weather. The true semantics of most pixels have …

Screening the stones of Venice: Mapping social perceptions of cultural significance through graph-based semi-supervised classification

N Bai, P Nourian, R Luo, T Cheng… - ISPRS Journal of …, 2023 - Elsevier
Mapping cultural significance of heritage properties in urban environment from the
perspective of the public has become an increasingly relevant process, as highlighted by the …

High temporal frequency vehicle counting from low-resolution satellite images

L Liao, J Xiao, Y Yang, X Ma, Z Wang… - ISPRS Journal of …, 2023 - Elsevier
Frequent object counting at a specific location (FOC@Loc) is becoming a newly emerging
but highly demanded task since the evolution of human activities can provide crucial …

Exploring the point feature relation on point cloud for multi-view stereo

R Zhao, X Han, X Guo, L Kuang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Learning-based multi-view stereo (MVS) is gaining prominence as a method for 3D
reconstruction. However, existing methods in the process of feature learning fail to focus on …

A Comprehensive Study on Self-Learning Methods and Implications to Autonomous Driving

J Xing, D Wei, S Zhou, T Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
As artificial intelligence (AI) has already seen numerous successful applications, the
upcoming challenge lies in how to realize artificial general intelligence (AGI). Self-learning …