Few-Shot Joint Multimodal Entity-Relation Extraction via Knowledge-Enhanced Cross-modal Prompt Model

L Yuan, Y Cai, J Huang - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org
Joint Multimodal Entity-Relation Extraction (JMERE) is a challenging task that aims to extract
entities and their relations from text-image pairs in social media posts. Existing methods for …

Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor

J Chen, X Hei, Y Xue, Y Wei, J Xie, Y Cai… - Proceedings of the 32nd …, 2024 - dl.acm.org
Large multimodal models (LMMs) have shown remarkable performance in the visual
commonsense reasoning (VCR) task, which aims to answer a multiple-choice question …

FRVA: Fact-Retrieval and Verification Augmented Entailment Tree Generation for Explainable Question Answering

Y Fan, H Zhang, R Li, Y Wang, H Tan… - Findings of the …, 2024 - aclanthology.org
Structured entailment tree can exhibit the reasoning chains from knowledge facts to
predicted answers, which is important for constructing an explainable question answering …