A multimodal saliency model for videos with high audio-visual correspondence

X Min, G Zhai, J Zhou, XP Zhang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Audio information has been bypassed by most of current visual attention prediction studies.
However, sound could have influence on visual attention and such influence has been …

Joint learning of audio–visual saliency prediction and sound source localization on multi-face videos

M Qiao, Y Liu, M Xu, X Deng, B Li, W Hu… - International Journal of …, 2023 - Springer
Visual and audio events simultaneously occur and both attract attention. However, most
existing saliency prediction works ignore the influence of audio and only consider vision …

Predicting video saliency with object-to-motion CNN and two-layer convolutional LSTM

L Jiang, M Xu, Z Wang - arXiv preprint arXiv:1709.06316, 2017 - arxiv.org
Over the past few years, deep neural networks (DNNs) have exhibited great success in
predicting the saliency of images. However, there are few works that apply DNNs to predict …

A review of machine learning in scanpath analysis for passive gaze-based interaction

A Mohamed Selim, M Barz, OS Bhatti… - Frontiers in Artificial …, 2024 - frontiersin.org
The scanpath is an important concept in eye tracking. It refers to a person's eye movements
over a period of time, commonly represented as a series of alternating fixations and …

A novel lightweight audio-visual saliency model for videos

D Zhu, X Shao, Q Zhou, X Min, G Zhai… - ACM Transactions on …, 2023 - dl.acm.org
Audio information has not been considered an important factor in visual attention models
regardless of many psychological studies that have shown the importance of audio …

Learning to predict salient faces: A novel visual-audio saliency model

Y Liu, M Qiao, M Xu, B Li, W Hu, A Borji - Computer Vision–ECCV 2020 …, 2020 - Springer
Recently, video streams have occupied a large proportion of Internet traffic, most of which
contain human faces. Hence, it is necessary to predict saliency on multiple-face videos …

LAVS: A lightweight audio-visual saliency prediction model

D Zhu, D Zhao, X Min, T Han, Q Zhou… - … on Multimedia and …, 2021 - ieeexplore.ieee.org
Audio information is essential for guiding human attention and visual perception, which has
been verified by many comprehensive psychological studies. However, the audio modality …

DeepVS2.0: A saliency-structured deep learning method for predicting dynamic visual attention

L Jiang, M Xu, Z Wang, L Sigal - International Journal of Computer Vision, 2021 - Springer
Deep neural networks (DNNs) have exhibited great success in image saliency prediction.
However, few works apply DNNs to predict the saliency of generic videos. In this paper, we …

EyeTrackUAV2: A large-scale binocular eye-tracking dataset for UAV videos

AF Perrin, V Krassanakis, L Zhang, V Ricordel… - Drones, 2020 - mdpi.com
The fast and tremendous evolution of the unmanned aerial vehicle (UAV) imagery gives
place to the multiplication of applications in various fields such as military and civilian …

Visual Saliency Modeling with Deep Learning: A Comprehensive Review

SE Abraham, BC Kovoor - Journal of Information & Knowledge …, 2023 - World Scientific
Visual saliency models mimic the human visual system to gaze towards fixed pixel positions
and capture the most conspicuous regions in the scene. They have proved their efficacy in …