Instruction tuning for large language models: A survey

S Zhang, L Dong, X Li, S Zhang, X Sun, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper surveys research in the rapidly advancing field of instruction tuning (IT), a
crucial technique to enhance the capabilities and controllability of large language models …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Zhongjing: Enhancing the Chinese medical capabilities of large language model through expert feedback and real-world multi-turn dialogue

S Yang, H Zhao, S Zhu, G Zhou, H Xu, Y Jia… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Recent advances in Large Language Models (LLMs) have achieved remarkable
breakthroughs in understanding and responding to user intents. However, their performance …

How far can camels go? Exploring the state of instruction tuning on open resources

Y Wang, H Ivison, P Dasigi, J Hessel… - Advances in …, 2023 - proceedings.neurips.cc
In this work we explore recent advances in instruction-tuning language models on a range of
open instruction-following datasets. Despite recent claims that open models can be on par …

State of what art? A call for multi-prompt LLM evaluation

M Mizrahi, G Kaplan, D Malkin, R Dror… - Transactions of the …, 2024 - direct.mit.edu
Recent advances in LLMs have led to an abundance of evaluation benchmarks, which
typically rely on a single instruction template per task. We create a large-scale collection of …

Can large language models reason about medical questions?

V Liévin, CE Hother, AG Motzfeldt, O Winther - Patterns, 2024 - cell.com
Although large language models often produce impressive outputs, it remains unclear how
they perform in real-world scenarios requiring strong reasoning skills and expert domain …

CMMLU: Measuring massive multitask language understanding in Chinese

H Li, Y Zhang, F Koto, Y Yang, H Zhao, Y Gong… - arXiv preprint arXiv …, 2023 - arxiv.org
As the capabilities of large language models (LLMs) continue to advance, evaluating their
performance becomes increasingly crucial and challenging. This paper aims to bridge this …

Jais and Jais-chat: Arabic-centric foundation and instruction-tuned open generative large language models

N Sengupta, SK Sahu, B Jia, S Katipomu, H Li… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and
instruction-tuned open generative large language models (LLMs). The models are based on …

Benchmarking and defending against indirect prompt injection attacks on large language models

J Yi, Y Xie, B Zhu, E Kiciman, G Sun, X Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent remarkable advancements in large language models (LLMs) have led to their
widespread adoption in various applications. A key feature of these applications is the …

Large language models in biomedical natural language processing: benchmarks, baselines, and recommendations

Q Chen, J Du, Y Hu, V Kuttichi Keloth, X Peng… - arXiv e …, 2023 - ui.adsabs.harvard.edu
Biomedical literature is growing rapidly, making it challenging to curate and extract
knowledge manually. Biomedical natural language processing (BioNLP) techniques that …