J Chen, X Hei, Y Xue, Y Wei, J Xie, Y Cai… - Proceedings of the 32nd …, 2024 - dl.acm.org
Large multimodal models (LMMs) have shown remarkable performance in the visual commonsense reasoning (VCR) task, which aims to answer a multiple-choice question …
Y Fan, H Zhang, R Li, Y Wang, H Tan… - Findings of the …, 2024 - aclanthology.org
Structured entailment trees can exhibit the reasoning chains from knowledge facts to predicted answers, which is important for constructing an explainable question answering …