A behaviorally inspired fusion approach for computational audiovisual saliency modeling

Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org

Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …

Cited by 62 · All 2 versions

A comprehensive survey on video saliency detection with auditory information: the audio-visual consistency perceptual is the key!

C Chen, M Song, W Song, L Guo… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Video saliency detection (VSD) aims at fast locating the most attractive
objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied …

Cited by 26 · All 5 versions

STAViS: Spatio-temporal audiovisual saliency network

A Tsiami, P Koutras, P Maragos - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com

We introduce STAViS, a spatio-temporal audiovisual saliency network that combines
spatio-temporal visual and auditory information in order to efficiently address the problem of …

Cited by 85 · All 12 versions

Listen to look into the future: Audio-visual egocentric gaze anticipation

B Lai, F Ryan, W Jia, M Liu, JM Rehg - European Conference on Computer …, 2025 - Springer

Egocentric gaze anticipation serves as a key building block for the emerging capability of
Augmented Reality. Notably, gaze behavior is driven by both visual cues and audio signals …

Cited by 5 · All 2 versions

A novel lightweight audio-visual saliency model for videos

D Zhu, X Shao, Q Zhou, X Min, G Zhai… - ACM Transactions on …, 2023 - dl.acm.org

Audio information has not been considered an important factor in visual attention models
regardless of many psychological studies that have shown the importance of audio …

Cited by 11 · All 2 versions

Audio–visual collaborative representation learning for dynamic saliency prediction

H Ning, B Zhao, Z Hu, L He, E Pei - Knowledge-Based Systems, 2022 - Elsevier

The Dynamic Saliency Prediction (DSP) task simulates the human selective
attention mechanism to perceive a dynamic scene, which is significant and imperative in …

Cited by 12 · All 4 versions

Can consumers' visual attention be predictable? A saliency modelling-based approach on fashion advertisements

SH Lee, Y Liang, Y Chen, A Mahdi… - International Journal of …, 2021 - Taylor & Francis

As collaborative research between engineering and fashion, the purpose of this study was to
investigate if saliency models can be applied for predicting consumers' visual attention to …

Cited by 9 · All 2 versions

NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos

Z Yang, S Ren, Z Wu, N Zhao, J Wang, J Qin… - Proceedings of the 31st …, 2023 - dl.acm.org

Non-photorealistic videos are in demand with the wave of the metaverse, but lack of
sufficient research studies. This work aims to take a step forward to understand how humans …

Cited by 2 · All 4 versions

Human attention based movie summarization: Dataset and baseline model

D Zhao, D Zhu, X Min, J Yue, K Zhang, Q Zhou, G Zhai… - Neurocomputing, 2023 - Elsevier

A movie summarization model can automatically edit a condensed version of a movie by
selecting keyframes. Some previous works have proposed some movie summarizers based …

Cited by 1 · All 4 versions

TM2SP: A Transformer-based Multi-Level Spatiotemporal Feature Pyramid Network for Video Saliency Prediction

C Li, S Liu - IEEE Transactions on Circuits and Systems for …, 2025 - ieeexplore.ieee.org

This paper proposes an end-to-end video saliency prediction network model, termed TM2SP-
Net (Transformer-based Multi-level Spatiotemporal Feature Pyramid Network). Leveraging …
