Multimodal fusion on low-quality data: A comprehensive survey

Q Zhang, Y Wei, Z Han, H Fu, X Peng, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal fusion focuses on integrating information from multiple modalities with the goal of
more accurate prediction, which has achieved remarkable progress in a wide range of …

Diagnosing and Re-learning for Balanced Multimodal Learning

Y Wei, S Li, R Feng, D Hu - European Conference on Computer Vision, 2025 - Springer
To overcome the imbalanced multimodal learning problem, where models prefer the training
of specific modalities, existing methods propose to control the training of uni-modal …

MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance

Y Wei, D Hu - arXiv preprint arXiv:2405.17730, 2024 - arxiv.org
Multimodal learning methods with targeted unimodal learning objectives have exhibited
their superior efficacy in alleviating the imbalanced multimodal learning problem. However …

Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation

R Feng, D Hu, W Ma, X Li - arXiv preprint arXiv:2408.01366, 2024 - arxiv.org
Humans possess a remarkable talent for flexibly alternating to different senses when
interacting with the environment. Picture a chef skillfully gauging the timing of ingredient …

On-the-fly Modulation for Balanced Multimodal Learning

Y Wei, D Hu, H Du, JR Wen - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Multimodal learning is expected to boost model performance by integrating information from
different modalities. However, its potential is not fully exploited because the widely-used …

Frequency-Decoupled Cross-Modal Knowledge Distillation

J Liu, Y Zhang, T Huang, H Wang, W Xu, S Zhang… - openreview.net
Knowledge distillation (KD) has proven highly effective for compressing large models and
enhancing the performance of smaller ones. However, its effectiveness diminishes in cross …