A survey of multimodal deep generative models

M Suzuki, Y Matsuo - Advanced Robotics, 2022 - Taylor & Francis
Multimodal learning is a framework for building models that make predictions based on
different types of modalities. Important challenges in multimodal learning are the inference of …

Joint multimodal learning with deep generative models

M Suzuki, K Nakayama, Y Matsuo - arXiv preprint arXiv:1611.01891, 2016 - arxiv.org
We investigate deep generative models that can exchange multiple modalities
bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently …

Variational mixture-of-experts autoencoders for multi-modal deep generative models

Y Shi, B Paige, P Torr - Advances in neural information …, 2019 - proceedings.neurips.cc
Learning generative models that span multiple data modalities, such as vision and
language, is often motivated by the desire to learn more useful, generalisable …

On the limitations of multimodal VAEs

I Daunhawer, TM Sutter, K Chin-Cheong… - arXiv preprint arXiv …, 2021 - arxiv.org
Multimodal variational autoencoders (VAEs) have shown promise as efficient generative
models for weakly-supervised data. Yet, despite their advantage of weak supervision, they …

Multimodal generative learning utilizing Jensen-Shannon-divergence

T Sutter, I Daunhawer, J Vogt - Advances in neural …, 2020 - proceedings.neurips.cc
Learning from different data types is a long-standing goal in machine learning research, as
multiple information sources co-occur when describing natural phenomena. However …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

Learning factorized multimodal representations

YHH Tsai, PP Liang, A Zadeh, LP Morency… - arXiv preprint arXiv …, 2018 - arxiv.org
Learning multimodal representations is a fundamentally complex research problem due to
the presence of multiple heterogeneous sources of information. Although the presence of …

Improved multimodal deep learning with variation of information

K Sohn, W Shang, H Lee - Advances in neural information …, 2014 - proceedings.neurips.cc
Deep learning has been successfully applied to multimodal representation learning
problems, with a common strategy to learning joint representations that are shared across …

Prompt tuning for generative multimodal pretrained models

H Yang, J Lin, A Yang, P Wang, C Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Prompt tuning has become a new paradigm for model tuning and it has demonstrated
success in natural language pretraining and even vision pretraining. In this work, we explore …

Variational methods for conditional multimodal deep learning

G Pandey, A Dukkipati - 2017 international joint conference on …, 2017 - ieeexplore.ieee.org
In this paper, we address the problem of conditional modality learning, whereby one is
interested in generating one modality given the other. While it is straightforward to learn a …