Multimodal saliency and fusion for movie summarization based on aural, visual, and textual attention

G Evangelopoulos, A Zlatintsi… - IEEE Transactions …, 2013 - ieeexplore.ieee.org
Multimodal streams of sensory information are naturally parsed and integrated by humans
using signal-level feature extraction and higher level cognitive processes. Detection of …

COGNIMUSE: A multimodal video database annotated with saliency, events, semantics and emotion with application to summarization

A Zlatintsi, P Koutras, G Evangelopoulos… - EURASIP Journal on …, 2017 - Springer
Research related to computational modeling for machine-based understanding requires
ground truth data for training, content analysis, and evaluation. In this paper, we present a …

A perceptually based spatio-temporal computational framework for visual saliency estimation

P Koutras, P Maragos - Signal Processing: Image Communication, 2015 - Elsevier
The purpose of this paper is to demonstrate a perceptually based spatio-temporal
computational framework for visual saliency estimation. We have developed a new spatio …

Human pose and path estimation from aerial video using dynamic classifier selection

AG Perera, YW Law, J Chahl - Cognitive Computation, 2018 - Springer
We consider the problem of estimating human pose and trajectory by an aerial robot with a
monocular camera in near real time. We present a preliminary solution whose distinguishing …

The importance of multiple temporal scales in motion recognition: when shallow model can support deep multi scale models

V D'Amato, L Oneto, A Camurri… - 2022 International Joint …, 2022 - ieeexplore.ieee.org
The execution of a human movement involves different muscles that are activated and
coordinated by the brain at different temporal scales in a complex cognitive process. For this …

A behaviorally inspired fusion approach for computational audiovisual saliency modeling

A Tsiami, P Koutras, A Katsamanis, A Vatakis… - Signal Processing …, 2019 - Elsevier
Human attention is highly influenced by multi-modal combinations of perceived sensory
information and especially audiovisual information. Although systematic behavioral …

Smoke detection in endoscopic surgery videos: a first step towards retrieval of semantic events

C Loukas, E Georgiou - The International Journal of Medical …, 2015 - Wiley Online Library
Background: Event‐based annotation of surgical operations has not received much attention,
mainly due to diversity of the visual content. As a first attempt at retrieval of surgical events …

A real-time active pedestrian tracking system inspired by the human visual system

Y Wang, Q Zhao, B Wang, S Wang, Y Zhang… - Cognitive …, 2016 - Springer
Pedestrian detection and tracking play a significant role in surveillance. Despite the
numerous detection and tracking methods proposed in the literature, when the pedestrian is …

Serious games: supporting occupational engagement of people aged 50+ based on intelligent tutoring systems

MA Bruno, L Griffiths - Ingeniare. Revista chilena de ingeniería, 2014 - redalyc.org
This paper offers an overview of the requirements of serious games to support occupational
engagement for people at age 50 and above, and the features for such a serious game. A …

An iteratively reweighting algorithm for dynamic video summarization

P Dong, Y Xia, S Wang, L Zhuo, DD Feng - Multimedia Tools and …, 2015 - Springer
Information explosion has imposed unprecedented challenges on the conventional
ways of video data consumption. Hence providing condensed and meaningful video …