W Zheng, L Yan, FY Wang - IEEE Transactions on Systems …, 2023 - ieeexplore.ieee.org
While texts related to images convey fundamental messages for scene understanding and
reasoning, text-based visual question answering tasks concentrate on visual questions that …