Variational causal inference network for explanatory visual question answering

D Xue, S Qian, C Xu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Explanatory Visual Question Answering (EVQA) is a recently proposed multimodal
reasoning task that requires answering visual questions and generating multimodal …

A survey on cross-media search based on user intention understanding in social networks

L Shi, J Luo, C Zhu, F Kou, G Cheng, X Liu - Information Fusion, 2023 - Elsevier
With the increasing popularity of online social networks, more and more people are posting
information, updating their statuses, and searching for topics there. Massive cross-media big …

Adversarial representation with intra-modal and inter-modal graph contrastive learning for multimodal emotion recognition

Y Shou, T Meng, W Ai, K Li - arXiv preprint arXiv:2312.16778, 2023 - arxiv.org
With the release of increasing open-source emotion recognition datasets on social media
platforms and the rapid development of computing resources, multimodal emotion …

Adversarial alignment and graph fusion via information bottleneck for multimodal emotion recognition in conversations

Y Shou, T Meng, W Ai, F Zhang, N Yin, K Li - Information Fusion, 2024 - Elsevier
With the rapid development of social media and human–computer interaction, multimodal
emotion recognition in conversations (MERC) tasks have begun to receive widespread …

Open-world social event classification

S Qian, H Chen, D Xue, Q Fang, C Xu - Proceedings of the ACM Web …, 2023 - dl.acm.org
With the rapid development of Internet and the expanding scale of social media, social event
classification has attracted increasing attention. The key to social event classification is …

EduCross: Dual adversarial bipartite hypergraph learning for cross-modal retrieval in multimodal educational slides

M Li, S Zhou, Y Chen, C Huang, Y Jiang - Information Fusion, 2024 - Elsevier
In the digital education landscape, cross-modal retrieval (CMR) from multimodal educational
slides represents a significant challenge, particularly because of the complex nature of …

Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval

F Zhang, XS Hua, C Chen… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper studies the problem of semi-supervised 2D-3D retrieval which aims to align both
labeled and unlabeled 2D and 3D data into the same embedding space. The problem is …

Multi-level Contrastive Learning: Hierarchical Alleviation of Heterogeneity in Multimodal Sentiment Analysis

C Fan, K Zhu, J Tao, G Yi, J Xue… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recently, multimodal fusion efforts have achieved remarkable success in Multimodal
Sentiment Analysis (MSA). However, most of the existing methods are based on model-level …

Deep supervised multi-view learning with graph priors

P Hu, L Zhen, X Peng, H Zhu, J Lin… - … on Image Processing, 2023 - ieeexplore.ieee.org
This paper presents a novel method for supervised multi-view representation learning,
which projects multiple views into a latent common space while preserving the …

Graph-guided deep hashing networks for similar patient retrieval

Y Gu, X Yang, M Sun, C Wang, H Yang, C Yang… - Computers in Biology …, 2024 - Elsevier
With the rapid growth and widespread application of electronic health records (EHRs),
similar patient retrieval has become an important task for downstream clinical decision …