P Liu, C Shi, WW Sun - arXiv preprint arXiv:2410.02504, 2024 - arxiv.org
Aligning large language models (LLMs) with human preferences is critical to recent advances in generative artificial intelligence. Reinforcement learning from human feedback …
Z Li, C Chen, T Xu, Z Qin, J Xiao, R Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models rely on Supervised Fine-Tuning (SFT) to specialize in downstream tasks. Cross Entropy (CE) loss is the de facto choice in SFT, but it often leads to overfitting …
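The cross-entropy objective mentioned in this snippet can be sketched concretely. The following is a minimal, illustrative implementation of the standard token-level CE loss used in SFT (average negative log-likelihood of target tokens under a softmax over logits); it is not the paper's proposed alternative, and the toy logits and vocabulary size are made up for demonstration:

```python
import numpy as np

def cross_entropy(logits, targets):
    """Token-level CE loss.

    logits: (T, V) array of unnormalized scores for T positions over a V-token vocab.
    targets: (T,) int array of gold token ids.
    Returns the mean negative log-likelihood of the targets.
    """
    # Numerically stable log-softmax: subtract the row max before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability assigned to each gold token, then average.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Toy example: 2 positions, vocabulary of 3 tokens (hypothetical values).
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0]])
targets = np.array([0, 1])
loss = cross_entropy(logits, targets)
```

Minimizing this loss pushes all probability mass toward the single gold token at each position, which is one intuition for the overfitting behavior the abstract alludes to.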
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-trained models directly in weight space, by adding the fine-tuned weights of different tasks …
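The weight-space editing described above can be sketched in a few lines. This is a generic illustration of task arithmetic, not the cited paper's exact procedure: a "task vector" is the element-wise difference between fine-tuned and pre-trained weights, and adding (optionally scaled) task vectors to the pre-trained weights merges the corresponding task behaviors. The parameter dictionaries and the `scale` coefficient here are hypothetical:

```python
import numpy as np

def task_vector(pretrained, finetuned):
    """Per-parameter task vector: finetuned weights minus pre-trained weights."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained, vectors, scale=1.0):
    """Add scaled task vectors to the pre-trained weights in place of retraining."""
    merged = {k: v.copy() for k, v in pretrained.items()}
    for tv in vectors:
        for k in merged:
            merged[k] += scale * tv[k]
    return merged

# Toy model with a single 2x2 weight matrix (made-up values).
pre  = {"w": np.zeros((2, 2))}
ft_a = {"w": 1.0 * np.ones((2, 2))}  # weights after fine-tuning on task A
ft_b = {"w": 2.0 * np.ones((2, 2))}  # weights after fine-tuning on task B

tv_a = task_vector(pre, ft_a)
tv_b = task_vector(pre, ft_b)
merged = apply_task_vectors(pre, [tv_a, tv_b], scale=0.5)
print(merged["w"][0, 0])  # 0.5*1.0 + 0.5*2.0 = 1.5
```

The appeal noted in the snippet follows directly: merging is a few tensor additions, with no gradient updates or training data required.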
Knowledge bases play a vital role in the modern world, offering a systematic and structured approach to integrate various entities, concepts, rules, and relationships …