Y Liu, B Yang, Q Liu, Z Li, Z Ma, S Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks,
including document question answering (DocVQA) and scene text analysis. Our approach …