Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models with vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants …

DRESS: Instructing large vision-language models to align and interact with humans via natural language feedback

Y Chen, K Sikka, M Cogswell, H Ji… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present DRESS, a large vision-language model (LVLM) that innovatively exploits natural language feedback (NLF) from large language models to enhance its alignment and …

Fine-tuning ChatGPT for automatic scoring

E Latif, X Zhai - Computers and Education: Artificial Intelligence, 2024 - Elsevier
This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring student-written constructed responses using example assessment tasks in science …

Exploring OCR capabilities of GPT-4V(ision): A quantitative and in-depth evaluation

Y Shi, D Peng, W Liao, Z Lin, X Chen, C Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents a comprehensive evaluation of the Optical Character Recognition
(OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model …

MUFFIN: Curating multi-faceted instructions for improving instruction following

R Lou, K Zhang, J Xie, Y Sun, J Ahn, H Xu… - The Twelfth …, 2023 - openreview.net
In the realm of large language models (LLMs), enhancing instruction-following capability
often involves curating expansive training data. This is achieved through two primary …

WaveCoder: Widespread and versatile enhanced instruction tuning with refined data generation

Z Yu, X Zhang, N Shang, Y Huang, C Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent work demonstrates that, after being fine-tuned on a high-quality instruction dataset,
the resulting model can obtain impressive capabilities to address a wide range of tasks …

Tuning LayerNorm in Attention: Towards efficient multi-modal LLM finetuning

B Zhao, H Tu, C Wei, J Mei, C Xie - arXiv preprint arXiv:2312.11420, 2023 - arxiv.org
This paper introduces an efficient strategy to transform Large Language Models (LLMs) into
Multi-Modal Large Language Models (MLLMs). By conceptualizing this transformation as a …

CoGenesis: A framework collaborating large and small language models for secure context-aware instruction following

K Zhang, J Wang, E Hua, B Qi, N Ding… - arXiv preprint arXiv …, 2024 - arxiv.org
With the advancement of language models (LMs), their exposure to private data is
increasingly inevitable, and their deployment (especially for smaller ones) on personal …

Stabilizing RLHF through advantage model and selective rehearsal

B Peng, L Song, Y Tian, L Jin, H Mi, D Yu - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have revolutionized natural language processing, yet
aligning these models with human values and preferences using RLHF remains a significant …

Automatic instruction optimization for open-source LLM instruction tuning

Y Liu, S Tao, X Zhao, M Zhu, W Ma, J Zhu, C Su… - arXiv preprint arXiv …, 2023 - arxiv.org
Instruction tuning is crucial for enabling large language models (LLMs) to respond to
human instructions. The quality of the instruction pairs used for tuning greatly affects the …