Multimodal machine learning: A survey and taxonomy

T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …

A survey of multimodal deep generative models

M Suzuki, Y Matsuo - Advanced Robotics, 2022 - Taylor & Francis
Multimodal learning is a framework for building models that make predictions based on
different types of modalities. Important challenges in multimodal learning are the inference of …

Content-aware local GAN for photo-realistic super-resolution

JK Park, S Son, KM Lee - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Recently, GAN has successfully contributed to making single-image super-resolution (SISR)
methods produce more realistic images. However, natural images have complex distribution …

FFDNet: Toward a fast and flexible solution for CNN-based image denoising

K Zhang, W Zuo, L Zhang - IEEE Transactions on Image …, 2018 - ieeexplore.ieee.org
Due to the fast inference and good performance, discriminative learning methods have been
widely studied in image denoising. However, these methods mostly learn a specific model …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

Structural deep network embedding

D Wang, P Cui, W Zhu - Proceedings of the 22nd ACM SIGKDD …, 2016 - dl.acm.org
Network embedding is an important method to learn low-dimensional representations of
vertexes in networks, aiming to capture and preserve the network structure. Almost all the …

Deep joint-semantics reconstructing hashing for large-scale unsupervised cross-modal retrieval

S Su, Z Zhong, C Zhang - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
Cross-modal hashing encodes the multimedia data into a common binary hash space in
which the correlations among the samples from different modalities can be effectively …

Deep cross-modal hashing

QY Jiang, WJ Li - Proceedings of the IEEE conference on …, 2017 - openaccess.thecvf.com
Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been
widely used for similarity search in multimedia retrieval applications. However, most existing …

Multi-modal hashing for efficient multimedia retrieval: A survey

L Zhu, C Zheng, W Guan, J Li, Y Yang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the explosive growth of multimedia contents, multimedia retrieval is facing
unprecedented challenges on both storage cost and retrieval speed. Hashing technique can …

Weakly-supervised semantic guided hashing for social image retrieval

Z Li, J Tang, L Zhang, J Yang - International Journal of Computer Vision, 2020 - Springer
Hashing has been widely investigated for large-scale image retrieval due to its search
effectiveness and computation efficiency. In this work, we propose a novel Semantic Guided …