Translation, accessibility and the viewing experience of foreign, deaf and blind audiences have long been a neglected area of research within film studies. The same applies to the film …
We propose the ViNet architecture for audio-visual saliency prediction. ViNet is a fully convolutional encoder-decoder architecture. The encoder uses visual features from a …
Conversation scenes are a typical example in which classical models of visual attention dramatically fail to predict eye positions. Indeed, these models rarely consider faces as …
X Min, G Zhai, K Gu, X Yang - ACM Transactions on Multimedia …, 2016 - dl.acm.org
In this article, we propose to predict human eye fixation by incorporating both audio and visual cues. Traditional visual attention models generally make the most of stimuli's …
How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies showed that scanpath, i.e., the sequence …
Across the academy, scholars are debating the question of what bearing scientific inquiry has upon the humanities. The latest addition to the AFI Film Readers series, Cognitive …
The human face is central to our everyday social interactions. Recent studies have shown that while gazing at faces, each one of us has a particular eye-scanning pattern, highly …
Eye tracking and the analysis of gaze behaviour are established tools to produce insights into how humans observe their surroundings and consume visual multimedia content. For …
This article presents two studies that deepen the theme of how soundtracks shape our interpretation of audiovisuals. Embracing a multivariate perspective, Study 1 (N = 118) …