Competence-based multimodal curriculum learning for medical report generation

F Liu, S Ge, Y Zou, X Wu - arXiv preprint arXiv:2206.14579, 2022 - arxiv.org
Medical report generation task, which targets to produce long and coherent descriptions of
medical images, has attracted growing research interests recently. Different from the general …

CLOVA: A closed-loop visual assistant with tool usage and update

Z Gao, Y Du, X Zhang, X Ma, W Han… - Proceedings of the …, 2024 - openaccess.thecvf.com
Utilizing large language models (LLMs) to compose off-the-shelf visual tools represents a
promising avenue of research for developing robust visual assistants capable of addressing …

From easy to hard: Learning language-guided curriculum for visual question answering on remote sensing data

Z Yuan, L Mou, Q Wang, XX Zhu - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Visual question answering (VQA) for remote sensing scene has great potential in intelligent
human–computer interaction system. Although VQA in computer vision has been widely …

Building public trust in Islamic school through adaptive curriculum

H Baharun, C Muali, F Rozi, MW Fajry - Jurnal Pendidikan …, 2022 - risbang.unuja.ac.id
This study aims to investigate an adaptive curriculum serving as a medium to build public
trust at Islamic Elementary School (SDI) Tompokersan Lumajang, East Java. It is qualitative …

Re-attention for visual question answering

W Guo, Y Zhang, J Yang, X Yuan - IEEE Transactions on Image …, 2021 - ieeexplore.ieee.org
A simultaneous understanding of questions and images is crucial in Visual Question
Answering (VQA). While the existing models have achieved satisfactory performance by …

Closed loop neural-symbolic learning via integrating neural perception, grammar parsing, and symbolic reasoning

Q Li, S Huang, Y Hong, Y Chen… - … on Machine Learning, 2020 - proceedings.mlr.press
The goal of neural-symbolic computation is to integrate the connectionist and symbolist
paradigms. Prior methods learn the neural-symbolic models using reinforcement learning …

Learning by fixing: Solving math word problems with weak supervision

Y Hong, Q Li, D Ciao, S Huang, SC Zhu - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Previous neural solvers of math word problems (MWPs) are learned with full supervision
and fail to generate diverse solutions. In this paper, we address this issue by introducing a …

ComPhy: Compositional physical reasoning of objects and events from videos

Z Chen, K Yi, Y Li, M Ding, A Torralba… - arXiv preprint arXiv …, 2022 - arxiv.org
Objects' motions in nature are governed by complex interactions and their properties. While
some properties, such as shape and material, can be identified via the object's visual …

CLCL: Non-compositional expression detection with contrastive learning and curriculum learning

J Zhou, Z Zeng, S Bhat - Proceedings of the 61st Annual Meeting …, 2023 - aclanthology.org
Non-compositional expressions present a substantial challenge for natural language
processing (NLP) systems, necessitating more intricate processing compared to general …

Suppressing biased samples for robust VQA

N Ouyang, Q Huang, P Li, Y Cai, B Liu… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Most existing visual question answering (VQA) models strongly rely on language bias to
answer questions, i.e., they always tend to fit question-answer pairs on the train split and …