Beyond OCR + VQA: Involving OCR into the Flow for Robust and Accurate TextVQA G Zeng, Y Zhang, Y Zhou, X Yang Proceedings of the 29th ACM International Conference on Multimedia, 376-385, 2021 | 41 | 2021 |
PororoGAN: An Improved Story Visualization Model on Pororo-SV Dataset G Zeng, Z Li, Y Zhang Proceedings of the 2019 3rd International Conference on Computer Science and …, 2019 | 27 | 2019 |
Beyond OCR + VQA: Towards End-to-End Reading and Reasoning for Robust and Accurate TextVQA G Zeng, Y Zhang, Y Zhou, X Yang, N Jiang, G Zhao, W Wang, XC Yin Pattern Recognition 138, 109337, 2023 | 22 | 2023 |
TextBlock: Towards Scene Text Spotting without Fine-grained Detection J Wei, Y Zhang, Y Zhou, G Zeng, Z Qiao, Y Guo, H Wu, H Wang, W Wang Proceedings of the 30th ACM International Conference on Multimedia, 5892-5902, 2022 | 14 | 2022 |
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering C Fang, G Zeng, Y Zhou, D Wu, C Ma, D Hu, W Wang 2022 ICME, 2022 | 12 | 2022 |
A Survey of Temporal Activity Localization via Language in Untrimmed Videos Y Yang, Z Li, G Zeng 2020 International Conference on Culture-oriented Science & Technology …, 2020 | 10 | 2020 |
Filling in the blank: Rationale-augmented prompt tuning for TextVQA G Zeng, Y Zhang, Y Zhou, B Fang, G Zhao, X Wei, W Wang Proceedings of the 31st ACM International Conference on Multimedia, 1261-1272, 2023 | 8 | 2023 |
A Cost-Efficient Framework for Scene Text Detection in the Wild G Zeng, Y Zhang, Y Zhou, X Yang PRICAI 2021: Trends in Artificial Intelligence: 18th Pacific Rim …, 2021 | 7 | 2021 |
Feature Enhancement with Text-Specific Region Contrast for Scene Text Detection X Sun, J Lyu, Y Zhang, G Zeng, B Fang, Y Zhou, E Xie, C Ma Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 3-14, 2023 | 3 | 2023 |
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval G Zeng, Y Zhang, J Wei, D Yang, P Zhang, Y Gao, X Qin, Y Zhou Proceedings of the 32nd ACM International Conference on Multimedia, 2525-2534, 2024 | 2 | 2024 |
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model J Lyu, J Wei, G Zeng, Z Li, E Xie, W Wang, Y Zhou arXiv preprint arXiv:2403.10047, 2024 | 2 | 2024 |
Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues Y Zhang, G Zeng, H Shen, D Wu, Y Zhou, C Ma arXiv preprint arXiv:2412.12502, 2024 | 1 | 2024 |
Towards better video services: An EEG-based interpretable model for functional quality of experience evaluation Y Niu, K Di, G Zeng, T Wei, Y Zhang, X Wu Displays 82, 102657, 2024 | 1 | 2024 |
Show Exemplars and Tell Me What You See: In-context Learning with Frozen Large Language Models for TextVQA Y Zhang, G Zeng, H Shen, C Ma, Y Zhou PRCV, 2024 | 1* | 2024 |
Perception-Enhanced Generative Transformer for Key Information Extraction from Documents R Zhao, JJ Ou Yang, C Gao, X Qin, G Zeng, X Hu, P Zhang International Conference on Pattern Recognition, 91-106, 2025 | | 2025 |
Improving Multimodal Rumor Detection via Dynamic Graph Modeling X Wu, X Hu, X Qin, P Zhang, G Zeng, Y Guo, R Zhao, X Huang International Conference on Pattern Recognition, 242-258, 2025 | | 2025 |
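A Survey of Text-Centric Image Understanding Techniques (in Chinese) Y Zhang, Q Li, H Shen, G Zeng, Y Zhou, C Ma, Y Zhang, W Wang Journal of Image and Graphics 28 (8), 2253-2275, 2023 | | 2023 |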