Multimodal data fusion for systems improvement: A review

N Gaw, S Yousefi, MR Gahrooei - … from the Air Force Institute of …, 2022 - taylorfrancis.com
In recent years, information available from multiple data modalities has become increasingly
common for industrial engineering and operations research applications. There have been a …

Cross-modal retrieval: a systematic review of methods and future directions

T Wang, F Li, L Zhu, J Li, Z Zhang, HT Shen - arXiv preprint arXiv …, 2023 - arxiv.org
With the exponential surge in diverse multi-modal data, traditional uni-modal retrieval
methods struggle to meet the needs of users seeking access to data across various …

A survey on deep multimodal learning for computer vision: advances, trends, applications, and datasets

K Bayoudh, R Knani, F Hamdaoui, A Mtibaa - The Visual Computer, 2022 - Springer
The research progress in multimodal learning has grown rapidly over the last decade in
several areas, especially in computer vision. The growing potential of multimodal data …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

Deep supervised cross-modal retrieval

L Zhen, P Hu, X Wang, D Peng - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Cross-modal retrieval aims to enable flexible retrieval across different modalities. The core
of cross-modal retrieval is how to measure the content similarity between different types of …
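
The "content similarity" this entry centers on is most commonly measured by projecting each modality into a shared embedding space and comparing the embeddings there (the paper itself learns these projections under supervision). The sketch below only illustrates that general idea: the projection matrices W_img and W_txt are random placeholders standing in for learned networks, and the feature dimensions are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative projection heads mapping each modality into a shared
# d_shared-dimensional space. The weights are random placeholders,
# not the learned parameters of any published model.
d_img, d_txt, d_shared = 2048, 768, 256
W_img = rng.normal(size=(d_img, d_shared)) / np.sqrt(d_img)
W_txt = rng.normal(size=(d_txt, d_shared)) / np.sqrt(d_txt)

def embed(x, W):
    """Project features into the shared space and L2-normalize."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# A query image feature against a small candidate set of text features.
image_query = rng.normal(size=(d_img,))
text_candidates = rng.normal(size=(5, d_txt))

q = embed(image_query, W_img)
c = embed(text_candidates, W_txt)

# Cosine similarity in the shared space ranks the text candidates.
scores = c @ q
ranking = np.argsort(-scores)
print("candidate ranking:", ranking, "scores:", np.round(scores[ranking], 3))
```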

DetFusion: A detection-driven infrared and visible image fusion network

Y Sun, B Cao, P Zhu, Q Hu - Proceedings of the 30th ACM international …, 2022 - dl.acm.org
Infrared and visible image fusion aims to utilize the complementary information between the
two modalities to synthesize a new image containing richer information. Most existing works …
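
To make the notion of "fusing complementary information" concrete, the toy sketch below combines two registered grayscale images with a fixed per-pixel rule. This is only a naive baseline for illustration; DetFusion itself learns the fusion with a detection-driven deep network rather than a hand-set weight.

```python
import numpy as np

def fuse_weighted(infrared, visible, alpha=0.5):
    """Naive per-pixel weighted average of two registered grayscale images.

    alpha is a fixed illustrative weight; learned fusion methods replace
    this with data-dependent, spatially varying combinations.
    """
    return alpha * infrared + (1.0 - alpha) * visible

def fuse_max(infrared, visible):
    """Keep the stronger response at each pixel, e.g. hot targets from the
    infrared image and texture highlights from the visible image."""
    return np.maximum(infrared, visible)

# Toy example: two 4x4 "images" with intensities in [0, 1].
rng = np.random.default_rng(1)
ir = rng.random((4, 4))
vis = rng.random((4, 4))
print(fuse_weighted(ir, vis).shape, fuse_max(ir, vis).shape)
```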

Cross-modal attention with semantic consistence for image–text matching

X Xu, T Wang, Y Yang, L Zuo, F Shen… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
The task of image-text matching refers to measuring the visual-semantic similarity between
an image and a sentence. Recently, the fine-grained matching methods that explore the …
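
The fine-grained matching this entry refers to typically attends each word over a set of image-region features and scores the attended visual context against that word. The sketch below captures only that generic cross-attention pattern; the dimensions, temperature, and the simple averaging of word scores are illustrative choices, not the exact formulation of the cited paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_similarity(regions, words, temperature=4.0):
    """Fine-grained image-text similarity via cross-modal attention.

    regions: (R, d) L2-normalized image-region features
    words:   (W, d) L2-normalized word features
    Each word attends over the image regions, an attended visual vector is
    formed, scored against the word, and the word scores are averaged.
    """
    attn = softmax(temperature * (words @ regions.T), axis=-1)  # (W, R)
    attended = attn @ regions                                   # (W, d)
    attended /= np.linalg.norm(attended, axis=-1, keepdims=True)
    word_scores = np.sum(attended * words, axis=-1)             # (W,)
    return float(word_scores.mean())

rng = np.random.default_rng(2)
regions = rng.normal(size=(36, 128))
regions /= np.linalg.norm(regions, axis=-1, keepdims=True)
words = rng.normal(size=(12, 128))
words /= np.linalg.norm(words, axis=-1, keepdims=True)
print("image-text similarity:", round(cross_modal_similarity(regions, words), 3))
```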

Survey on deep multi-modal data analytics: Collaboration, rivalry, and fusion

Y Wang - ACM Transactions on Multimedia Computing …, 2021 - dl.acm.org
With the development of web technology, multi-modal or multi-view data has surged as a
major stream for big data, where each modality/view encodes an individual property of the data …

Ternary adversarial networks with self-supervision for zero-shot cross-modal retrieval

X Xu, H Lu, J Song, Y Yang… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Given a query instance from one modality (e.g., image), cross-modal retrieval aims to find
semantically similar instances from another modality (e.g., text). To perform cross-modal …

CM-GANs: Cross-modal generative adversarial networks for common representation learning

Y Peng, J Qi - ACM Transactions on Multimedia Computing …, 2019 - dl.acm.org
It is known that the inconsistent distributions and representations of different modalities, such
as image and text, cause the heterogeneity gap, which makes it very challenging to correlate …