Deep multimodal learning: A survey on recent advances and trends

D Ramachandram, GW Taylor - IEEE signal processing …, 2017 - ieeexplore.ieee.org
The success of deep learning has been a catalyst to solving increasingly complex machine-
learning problems, which often involve multiple data modalities. We review recent advances …

Learn to combine modalities in multimodal deep learning

K Liu, Y Li, N Xu, P Natarajan - arXiv preprint arXiv:1805.11730, 2018 - arxiv.org
Combining complementary information from multiple modalities is intuitively appealing for
improving the performance of learning-based approaches. However, it is challenging to fully …

Recent advances and trends in multimodal deep learning: A review

J Summaira, X Li, AM Shoib, S Li, J Abdul - arXiv preprint arXiv …, 2021 - arxiv.org
Deep Learning has implemented a wide range of applications and has become increasingly
popular in recent years. The goal of multimodal deep learning is to create models that can …

A survey of multimodal hybrid deep learning for computer vision: Architectures, applications, trends, and challenges

K Bayoudh - Information Fusion, 2023 - Elsevier
In recent years, deep learning algorithms have rapidly revolutionized artificial intelligence,
particularly machine learning, enabling researchers and practitioners to extend previously …

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

[HTML][HTML] Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Multimodal intelligence: Representation learning, information fusion, and applications

C Zhang, Z Yang, X He, L Deng - IEEE Journal of Selected …, 2020 - ieeexplore.ieee.org
Deep learning methods haverevolutionized speech recognition, image recognition, and
natural language processing since 2010. Each of these tasks involves a single modality in …

What makes multi-modal learning better than single (provably)

Y Huang, C Du, Z Xue, X Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …

A survey on deep multimodal learning for computer vision: advances, trends, applications, and datasets

K Bayoudh, R Knani, F Hamdaoui, A Mtibaa - The Visual Computer, 2022 - Springer
The research progress in multimodal learning has grown rapidly over the last decade in
several areas, especially in computer vision. The growing potential of multimodal data …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …