A systematic literature review on multimodal machine learning: Applications, challenges, gaps and future directions

A Barua, MU Ahmed, S Begum - IEEE Access, 2023 - ieeexplore.ieee.org
Multimodal machine learning (MML) is a tempting multidisciplinary research area where
heterogeneous data from multiple modalities and machine learning (ML) are combined to …

Discriminant correlation analysis: Real-time feature level fusion for multimodal biometric recognition

M Haghighat, M Abdel-Mottaleb… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Information fusion is a key step in multimodal biometric systems. The fusion of information
can occur at different levels of a recognition system, i.e., at the feature level, matching-score …

Recent progress on tactile object recognition

H Liu, Y Wu, F Sun, D Guo - International Journal of …, 2017 - journals.sagepub.com
Conventional visual perception technology is subject to many restrictions, such as
illumination, background clutter, and occlusion. Many intrinsic properties of objects, like …

Joint medical image fusion, denoising and enhancement via discriminative low-rank sparse dictionaries learning

H Li, X He, D Tao, Y Tang, R Wang - Pattern Recognition, 2018 - Elsevier
Medical image fusion is important in image-guided medical diagnostics, treatment, and other
computer vision tasks. However, most current approaches assume that the source images …

Memory-augmented deep unfolding network for guided image super-resolution

M Zhou, K Yan, J Pan, W Ren, Q Xie, X Cao - International Journal of …, 2023 - Springer
Guided image super-resolution (GISR) aims to obtain a high-resolution (HR) target image by
enhancing the spatial resolution of a low-resolution (LR) target image under the guidance of …

Revisiting multimodal representation in contrastive learning: from patch and token embeddings to finite discrete tokens

Y Chen, J Yuan, Y Tian, S Geng, X Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive learning-based vision-language pre-training approaches, such as CLIP, have
demonstrated great success in many vision-language tasks. These methods achieve cross …

Multi-level feature abstraction from convolutional neural networks for multimodal biometric identification

S Soleymani, A Dabouei, H Kazemi… - 2018 24th …, 2018 - ieeexplore.ieee.org
In this paper, we propose a deep multimodal fusion network to fuse multiple modalities (face,
iris, and fingerprint) for person identification. The proposed deep multimodal fusion …

Online data organizer: micro-video categorization by structure-guided multimodal dictionary learning

M Liu, L Nie, X Wang, Q Tian… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Micro-videos have rapidly become one of the most dominant trends in the era of social
media. Accordingly, how to organize them draws our attention. Distinct from the traditional …

Enhancing micro-video understanding by harnessing external sounds

L Nie, X Wang, J Zhang, X He, H Zhang… - Proceedings of the 25th …, 2017 - dl.acm.org
Different from traditional long videos, micro-videos are much shorter and usually recorded at
a specific place with mobile devices. To better understand the semantics of a micro-video …

Scalable multi-view semi-supervised classification via adaptive regression

H Tao, C Hou, F Nie, J Zhu, D Yi - IEEE Transactions on Image …, 2017 - ieeexplore.ieee.org
With the advent of multi-view data, multi-view learning has become an important research
direction in machine learning and image processing. Considering the difficulty of obtaining …