Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast number of open instruction datasets, naively training an LLM on …
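
As one concrete instance of the selection problem this survey covers, the sketch below filters a candidate pool by perplexity under a small reference model, a common data-quality proxy. The pool contents, the gpt2 scorer, and the keep-half budget are illustrative assumptions, not the survey's prescription.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small reference model used only to score candidate examples (assumption: gpt2).
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the reference model; lower is (heuristically) cleaner."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token cross-entropy
    return torch.exp(loss).item()

pool = ["Explain mixup in one sentence. Mixup interpolates pairs of training examples."]
scored = sorted(pool, key=perplexity)          # rank candidates by the proxy score
selected = scored[: max(1, len(scored) // 2)]  # keep the lowest-perplexity half (illustrative budget)
```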

Instruction following without instruction tuning

J Hewitt, NF Liu, P Liang, CD Manning - arXiv preprint arXiv:2409.14254, 2024 - arxiv.org
Instruction tuning commonly means finetuning a language model on instruction-response
pairs. We discover two forms of adaptation (tuning) that are deficient compared to instruction …
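
For contrast with the deficient adaptations this paper studies, a minimal sketch of the standard instruction-tuning objective it defines: next-token loss applied only to response tokens, with the prompt masked out. It assumes a Hugging Face-style causal LM and tokenizer, where -100 is the ignore index.

```python
def instruction_tuning_loss(model, tok, instruction, response):
    """Standard instruction tuning: supervise only the response tokens."""
    prompt_ids = tok(instruction, return_tensors="pt").input_ids
    full_ids = tok(instruction + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # mask prompt positions out of the loss
    return model(input_ids=full_ids, labels=labels).loss  # model shifts labels internally
```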

Codebook llms: Adapting political science codebooks for llm use and adapting llms to follow codebooks

A Halterman, KA Keith - arXiv preprint arXiv:2407.10747, 2024 - arxiv.org
Codebooks--documents that operationalize constructs and outline annotation procedures--
are used almost universally by social scientists when coding unstructured political texts …

Lions: An empirically optimized approach to align language models

X Yu, Q Wu, Y Li, Z Yu - arXiv preprint arXiv:2407.06542, 2024 - arxiv.org
Alignment is a crucial step in enhancing the instruction-following and conversational abilities
of language models. Despite many recent works proposing new algorithms, datasets, and …

A Closer Look at Machine Unlearning for Large Language Models

X Yuan, T Pang, C Du, K Chen, W Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) may memorize sensitive or copyrighted content, raising
privacy and legal concerns. Due to the high cost of retraining from scratch, researchers …
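
A hedged sketch of the kind of approximate-unlearning baseline such work builds on: gradient ascent on a forget set, optionally regularized by an ordinary loss on a retain set. The argument names, the alpha weight, and the ascent-plus-retain combination are generic assumptions, not this paper's specific proposal.

```python
def unlearning_step(model, optimizer, forget_ids, retain_ids=None, alpha=1.0):
    """One approximate-unlearning step: ascend the LM loss on content to forget."""
    loss = -model(input_ids=forget_ids, labels=forget_ids).loss  # gradient ascent on forget set
    if retain_ids is not None:
        # Common regularizer: keep the ordinary loss low on data we want to retain.
        loss = loss + alpha * model(input_ids=retain_ids, labels=retain_ids).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```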

Understanding the Role of User Profile in the Personalization of Large Language Models

B Wu, Z Shi, HA Rahmani, V Ramineni… - arXiv preprint arXiv …, 2024 - arxiv.org
Utilizing user profiles to personalize Large Language Models (LLMs) has been shown to
enhance performance on a wide range of tasks. However, the precise role of user …

Understanding likelihood over-optimisation in direct alignment algorithms

Z Shi, S Land, A Locatelli, M Geist, M Bartolo - arXiv preprint arXiv …, 2024 - arxiv.org
Direct Alignment Algorithms (DAAs), such as Direct Preference Optimisation (DPO) and
Identity Preference Optimisation (IPO), have emerged as alternatives to online …
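
For reference, the DPO objective such analyses study, written over summed response log-probabilities under the policy and a frozen reference model; beta = 0.1 is an illustrative value, and IPO replaces the log-sigmoid with a squared term.

```python
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO: push the policy's chosen/rejected log-ratio above the reference model's."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()
```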

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following

GKM Liu, B Shi, A Caciularu, I Szpektor… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-document (MD) processing is crucial for LLMs to handle real-world tasks such as
summarization and question-answering across large sets of documents. While LLMs have …

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

Y Xiao, S Zhang, W Zhou, M Ghassemi… - arXiv preprint arXiv …, 2024 - arxiv.org
To induce desired behaviors in large language models (LLMs) for interaction-driven tasks,
the instruction-tuning stage typically trains LLMs on instruction-response pairs using the next …
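
As a rough illustration of the mixup family the title refers to: the snippet is cut off before SFTMix's actual recipe, so the sketch below is classic mixup applied to input embeddings, not necessarily the paper's variant.

```python
import torch

def mixup_pair(emb_a, emb_b, alpha=0.2):
    """Classic mixup: convex combination of two examples' embedding sequences.

    The mixed sequence is trained against both sources' next-token losses,
    weighted by lam and (1 - lam), mirroring mixup's label interpolation.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * emb_a + (1 - lam) * emb_b, lam
```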

All-in-One Tuning and Structural Pruning for Domain-Specific LLMs

L Lu, Z Wang, R Bao, M Wang, F Li, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing pruning techniques for large language models (LLMs) targeting domain-specific
applications typically follow a two-stage process: pruning the pretrained general-purpose …
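
For context on what the pruning stage of that two-stage baseline can look like, a minimal magnitude-based structural-pruning sketch over one transformer FFN. The L2 scoring rule and keep_ratio are illustrative assumptions; the paper's all-in-one approach is precisely an alternative to running this separately from tuning.

```python
import torch

def prune_ffn_channels(up_proj, down_proj, keep_ratio=0.5):
    """Drop whole FFN hidden channels (structural pruning), scored by weight magnitude.

    up_proj:   (d_ff, d_model) weight matrix
    down_proj: (d_model, d_ff) weight matrix
    """
    scores = up_proj.norm(dim=1) + down_proj.norm(dim=0)  # one L2 score per hidden channel
    k = max(1, int(keep_ratio * scores.numel()))
    keep = scores.topk(k).indices.sort().values           # channels to retain, in order
    return up_proj[keep, :], down_proj[:, keep]
```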