Due to the flexible representation of arbitrary-shaped scene text and simple pipeline, bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection …
Recent advanced Table Structure Recognition (TSR) models adopt image-to-text solutions to parse table structure. These methods can be formulated as image caption …
G Zeng, Y Zhang, Y Zhou, B Fang, G Zhao… - Proceedings of the 31st …, 2023 - dl.acm.org
Recently, generative Text-based visual question answering (TextVQA) methods, which are often based on language models, have exhibited impressive results and drawn increasing …
Z Hu, P Yang, Y Jiang, Z Bai - Pattern Recognition, 2024 - Elsevier
Existing studies apply Large Language Model (LLM) to knowledge-based Visual Question Answering (VQA) with encouraging results. Due to the insufficient input …
W Zheng, L Yan, FY Wang - IEEE Transactions on Systems …, 2023 - ieeexplore.ieee.org
While texts related to images convey fundamental messages for scene understanding and reasoning, text-based visual question answering tasks concentrate on visual questions that …
L Qiao, C Li, Z Cheng, Y Xu, Y Niu, X Li - Pattern Recognition, 2024 - Elsevier
Reading order detection aims to arrange the text logically, which is essential in understanding visual documents. Current methods mostly model the problem as a sequence …
S Zhang, Y Wu, X Zhang, Z Feng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Textbook question answering (TQA) task aims to infer answers for given questions from a multimodal context, including text and diagrams. The existing studies have aggregated …
X Yang, D Yang, Y Zhou, Y Guo… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
The application of text recognition in the automatic analysis of invoices, contracts and other documents has significantly raised office efficiency, but the stamps overlapping with the texts …
Y Shu, S Liu, Y Zhou, H Xu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Text detection in natural scenarios has made significant progress with the deep learning architecture. Towards arbitrary-shaped text detection, fracture detection is the major concern …