Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Video summarization with long short-term memory

K Zhang, WL Chao, F Sha, K Grauman - … 14, 2016, Proceedings, Part VII 14, 2016 - Springer
We propose a novel supervised learning technique for summarizing videos by automatically
selecting keyframes or key subshots. Casting the task as a structured prediction problem …

Video summarization via multi-view representative selection

J Meng, S Wang, H Wang, J Yuan… - Proceedings of the …, 2017 - openaccess.thecvf.com
Video contents are inherently heterogeneous. To exploit different feature modalities in a
diverse video collection for video summarization, we propose to formulate the task as a multi …

Event-based large scale surveillance video summarization

X Song, L Sun, J Lei, D Tao, G Yuan, M Song - Neurocomputing, 2016 - Elsevier
Recent advances in sensor manufacture and computer vision technologies have stimulated
the applications of intelligent transportation systems, while a key yet under-addressed issue …

Online video summarization: Predicting future to better summarize present

S Lal, S Duggal, I Sreedevi - 2019 IEEE Winter Conference on …, 2019 - ieeexplore.ieee.org
Automatically generating the summary of a video is a challenging problem due to its
subjective nature. Most of the previous works in the field consider the entire video to extract …

MSKVS: Adaptive mean shift-based keyframe extraction for video summarization and a new objective verification approach

R Hannane, A Elboushaki, K Afdel - Journal of Visual Communication and …, 2018 - Elsevier
Video abstraction is an interesting topic that aims at briefly representing the entire video
stream by producing a short summary either statically or dynamically. In this paper, we …

Video classification using compacted dataset based on selected keyframe

RF Rachmadi, K Uchimura… - 2016 IEEE Region 10 …, 2016 - ieeexplore.ieee.org
Shared human actions in the video are the biggest problem for video classification systems.
For example, a long jump sports video will share a running action with the long jump or …

Community-aware federated video summarization

F Wan, J Wang, H Duan, Y Song… - … Joint Conference on …, 2023 - ieeexplore.ieee.org
Video summarization aims to extract representative frames to retain high-level information.
Increasing concerns about privacy issues have been raised because conventional large …

Foundations of Multisensory Artificial Intelligence

PP Liang - arXiv preprint arXiv:2404.18976, 2024 - arxiv.org
Building multisensory AI systems that learn from multiple sensory inputs such as text,
speech, video, real-world sensors, wearable devices, and medical data holds great promise …