Summary of ChatGPT-related research and perspective towards the future of large language models

Y Liu, T Han, S Ma, J Zhang, Y Yang, J Tian, H He, A Li… - Meta-Radiology, 2023 - Elsevier
This paper presents a comprehensive survey of ChatGPT-related (GPT-3.5 and GPT-4)
research, state-of-the-art large language models (LLM) from the GPT series, and their …

On the hidden mystery of OCR in large multimodal models

Y Liu, Z Li, B Yang, C Li, X Yin, C Liu, L Jin… - arXiv preprint arXiv …, 2023 - arxiv.org
Large models have recently played a dominant role in natural language processing and
multimodal vision-language learning. However, their effectiveness in text-related visual …

Benchmarking ChatGPT-4 on ACR radiation oncology in-training (TXIT) exam and red journal gray zone cases: Potentials and challenges for AI-assisted medical …

Y Huang, A Gomaa, S Semrau, M Haderlein… - arXiv preprint arXiv …, 2023 - arxiv.org
The potential of large language models in medicine for education and decision making
purposes has been demonstrated as they achieve decent scores on medical exams such as …

LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding

C Luo, Y Shen, Z Zhu, Q Zheng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recently, leveraging large language models (LLMs) or multimodal large language models
(MLLMs) for document understanding has been proven very promising. However, previous …

Read, diagnose and chat: Towards explainable and interactive LLMs-augmented depression detection in social media

W Qin, Z Chen, L Wang, Y Lan, W Ren… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper proposes a new depression detection system based on LLMs that is both
interpretable and interactive. It not only provides a diagnosis, but also diagnostic evidence …

T-SciQ: Teaching multimodal chain-of-thought reasoning via large language model signals for science question answering

L Wang, Y Hu, J He, X Xu, N Liu, H Liu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Large Language Models (LLMs) have recently demonstrated exceptional
performance in various Natural Language Processing (NLP) tasks. They have also shown …

Layout and task aware instruction prompt for zero-shot document image question answering

W Wang, Y Li, Y Ou, Y Zhang - arXiv preprint arXiv:2306.00526, 2023 - arxiv.org
Layout-aware pre-trained models have achieved significant progress on document image
question answering. They introduce extra learnable modules into existing language models …

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

Z Zhao, J Tang, C Lin, B Wu, C Huang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Scene text recognition (STR) in the wild frequently encounters challenges when coping with
domain variations, font diversity, shape deformations, etc. A straightforward solution is …

Understanding Strategies and Challenges of Conducting Daily Data Analysis (DDA) Among Blind and Low-vision People

C Jiang, W Lei, E Kuang, T Han, M Fan - Proceedings of the 25th …, 2023 - dl.acm.org
Being able to analyze and derive insights from data, which we call Daily Data Analysis
(DDA), is an increasingly important skill in everyday life. While the accessibility community …

Is ChatGPT capable of crafting gamification strategies for software engineering tasks?

T Fulcini, M Torchiano - Proceedings of the 2nd International Workshop …, 2023 - dl.acm.org
Gamification has gained significant attention in the last decade for its potential to enhance
engagement and motivation in various domains. During the last year, ChatGPT, a state-of-the …