Learn to combine modalities in multimodal deep learning

K Liu, Y Li, N Xu, P Natarajan - arXiv preprint arXiv:1805.11730, 2018 - arxiv.org
Combining complementary information from multiple modalities is intuitively appealing for
improving the performance of learning-based approaches. However, it is challenging to fully …

Meta-transformer: A unified framework for multimodal learning

Y Zhang, K Gong, K Zhang, H Li, Y Qiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Multimodal learning aims to build models that can process and relate information from
multiple modalities. Despite years of development in this field, it still remains challenging to …

What makes multi-modal learning better than single (provably)

Y Huang, C Du, Z Xue, X Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …

Deep multimodal learning: A survey on recent advances and trends

D Ramachandram, GW Taylor - IEEE signal processing …, 2017 - ieeexplore.ieee.org
The success of deep learning has been a catalyst to solving increasingly complex machine-
learning problems, which often involve multiple data modalities. We review recent advances …

Modality competition: What makes joint training of multi-modal network fail in deep learning? (provably)

Y Huang, J Lin, C Zhou, H Yang… - … conference on machine …, 2022 - proceedings.mlr.press
Despite the remarkable success of deep multi-modal learning in practice, it has not been
well-explained in theory. Recently, it has been observed that the best uni-modal network …

Recent advances and trends in multimodal deep learning: A review

J Summaira, X Li, AM Shoib, S Li, J Abdul - arXiv preprint arXiv …, 2021 - arxiv.org
Deep Learning has implemented a wide range of applications and has become increasingly
popular in recent years. The goal of multimodal deep learning is to create models that can …

Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Multimodal representation learning by alternating unimodal adaptation

X Zhang, J Yoon, M Bansal… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Multimodal learning, which integrates data from diverse sensory modes, plays a pivotal role
in artificial intelligence. However, existing multimodal learning methods often struggle with …

EmbraceNet: A robust deep learning architecture for multimodal classification

JH Choi, JS Lee - Information Fusion, 2019 - Elsevier
Classification using multimodal data arises in many machine learning applications. It is
crucial not only to model cross-modal relationship effectively but also to ensure robustness …

Improved multimodal deep learning with variation of information

K Sohn, W Shang, H Lee - Advances in neural information …, 2014 - proceedings.neurips.cc
Deep learning has been successfully applied to multimodal representation learning
problems, with a common strategy to learning joint representations that are shared across …