Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast number of open instruction datasets, naively training an LLM on …
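
As one concrete instance of the selection problem this survey covers, the sketch below filters a candidate pool by perplexity under a small reference model, a common data-quality proxy. The pool contents, the gpt2 scorer, and the keep-half budget are illustrative assumptions, not the survey's prescription.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small reference model used only to score candidate examples (assumption: gpt2).
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the reference model; lower is (heuristically) cleaner."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token cross-entropy
    return torch.exp(loss).item()

pool = ["Explain mixup in one sentence. Mixup interpolates pairs of training examples."]
scored = sorted(pool, key=perplexity)          # rank candidates by the proxy score
selected = scored[: max(1, len(scored) // 2)]  # keep the lowest-perplexity half (illustrative budget)
```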

Instruction following without instruction tuning

J Hewitt, NF Liu, P Liang, CD Manning - arXiv preprint arXiv:2409.14254, 2024 - arxiv.org
Instruction tuning commonly means finetuning a language model on instruction-response
pairs. We discover two forms of adaptation (tuning) that are deficient compared to instruction …
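
For contrast with the deficient adaptations this paper studies, a minimal sketch of the standard instruction-tuning objective it defines: next-token loss applied only to response tokens, with the prompt masked out. It assumes a Hugging Face-style causal LM and tokenizer, where -100 is the ignore index.

```python
def instruction_tuning_loss(model, tok, instruction, response):
    """Standard instruction tuning: supervise only the response tokens."""
    prompt_ids = tok(instruction, return_tensors="pt").input_ids
    full_ids = tok(instruction + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # mask prompt positions out of the loss
    return model(input_ids=full_ids, labels=labels).loss  # model shifts labels internally
```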

Codebook llms: Adapting political science codebooks for llm use and adapting llms to follow codebooks

A Halterman, KA Keith - arXiv preprint arXiv:2407.10747, 2024 - arxiv.org
Codebooks--documents that operationalize constructs and outline annotation procedures--
are used almost universally by social scientists when coding unstructured political texts …

Lions: An empirically optimized approach to align language models

X Yu, Q Wu, Y Li, Z Yu - arXiv preprint arXiv:2407.06542, 2024 - arxiv.org
Alignment is a crucial step in enhancing the instruction-following and conversational abilities
of language models. Despite many recent works proposing new algorithms, datasets, and …

A Closer Look at Machine Unlearning for Large Language Models

X Yuan, T Pang, C Du, K Chen, W Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) may memorize sensitive or copyrighted content, raising
privacy and legal concerns. Due to the high cost of retraining from scratch, researchers …
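
A hedged sketch of the kind of approximate-unlearning baseline such work builds on: gradient ascent on a forget set, optionally regularized by an ordinary loss on a retain set. The argument names, the alpha weight, and the ascent-plus-retain combination are generic assumptions, not this paper's specific proposal.

```python
def unlearning_step(model, optimizer, forget_ids, retain_ids=None, alpha=1.0):
    """One approximate-unlearning step: ascend the LM loss on content to forget."""
    loss = -model(input_ids=forget_ids, labels=forget_ids).loss  # gradient ascent on forget set
    if retain_ids is not None:
        # Common regularizer: keep the ordinary loss low on data we want to retain.
        loss = loss + alpha * model(input_ids=retain_ids, labels=retain_ids).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```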

Understanding the Role of User Profile in the Personalization of Large Language Models

B Wu, Z Shi, HA Rahmani, V Ramineni… - arXiv preprint arXiv …, 2024 - arxiv.org
Utilizing user profiles to personalize Large Language Models (LLMs) has been shown to
enhance performance on a wide range of tasks. However, the precise role of user …

Understanding likelihood over-optimisation in direct alignment algorithms

Z Shi, S Land, A Locatelli, M Geist, M Bartolo - arXiv preprint arXiv …, 2024 - arxiv.org
Direct Alignment Algorithms (DAAs), such as Direct Preference Optimisation (DPO) and
Identity Preference Optimisation (IPO), have emerged as alternatives to online …
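
For reference, the DPO objective such analyses study, written over summed response log-probabilities under the policy and a frozen reference model; beta = 0.1 is an illustrative value, and IPO replaces the log-sigmoid with a squared term.

```python
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO: push the policy's chosen/rejected log-ratio above the reference model's."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()
```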

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following

GKM Liu, B Shi, A Caciularu, I Szpektor… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-document (MD) processing is crucial for LLMs to handle real-world tasks such as
summarization and question-answering across large sets of documents. While LLMs have …

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

Y Xiao, S Zhang, W Zhou, M Ghassemi… - arXiv preprint arXiv …, 2024 - arxiv.org
To induce desired behaviors in large language models (LLMs) for interaction-driven tasks,
the instruction-tuning stage typically trains LLMs on instruction-response pairs using the next …
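
As a rough illustration of the mixup family the title refers to: the snippet is cut off before SFTMix's actual recipe, so the sketch below is classic mixup applied to input embeddings, not necessarily the paper's variant.

```python
import torch

def mixup_pair(emb_a, emb_b, alpha=0.2):
    """Classic mixup: convex combination of two examples' embedding sequences.

    The mixed sequence is trained against both sources' next-token losses,
    weighted by lam and (1 - lam), mirroring mixup's label interpolation.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * emb_a + (1 - lam) * emb_b, lam
```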

All-in-One Tuning and Structural Pruning for Domain-Specific LLMs

L Lu, Z Wang, R Bao, M Wang, F Li, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing pruning techniques for large language models (LLMs) targeting domain-specific
applications typically follow a two-stage process: pruning the pretrained general-purpose …
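
For context on what the pruning stage of that two-stage baseline can look like, a minimal magnitude-based structural-pruning sketch over one transformer FFN. The L2 scoring rule and keep_ratio are illustrative assumptions; the paper's all-in-one approach is precisely an alternative to running this separately from tuning.

```python
import torch

def prune_ffn_channels(up_proj, down_proj, keep_ratio=0.5):
    """Drop whole FFN hidden channels (structural pruning), scored by weight magnitude.

    up_proj:   (d_ff, d_model) weight matrix
    down_proj: (d_model, d_ff) weight matrix
    """
    scores = up_proj.norm(dim=1) + down_proj.norm(dim=0)  # one L2 score per hidden channel
    k = max(1, int(keep_ratio * scores.numel()))
    keep = scores.topk(k).indices.sort().values           # channels to retain, in order
    return up_proj[keep, :], down_proj[:, keep]
```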