A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

A survey on stability of learning with limited labelled data and its sensitivity to the effects of randomness

B Pecher, I Srba, M Bielikova - ACM Computing Surveys, 2024 - dl.acm.org
Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-
learning, or few-shot learning, aims to effectively train a model using only a small amount of …

Yi: Open foundation models by 01.AI

A Young, B Chen, C Li, C Huang, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce the Yi model family, a series of language and multimodal models that
demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Llama-moe: Building mixture-of-experts from llama with continual pre-training

T Zhu, X Qu, D Dong, J Ruan, J Tong… - Proceedings of the …, 2024 - aclanthology.org
Mixture-of-Experts (MoE) has gained increasing popularity as a promising
framework for scaling up large language models (LLMs). However, training MoE from …

Less: Selecting influential data for targeted instruction tuning

M Xia, S Malladi, S Gururangan, S Arora… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning has unlocked powerful capabilities in large language models (LLMs),
effectively using combined datasets to develop general-purpose chatbots. However, real …

A survey of multimodal large language model from a data-centric perspective

T Bai, H Liang, B Wan, Y Xu, X Li, S Li, L Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) enhance the capabilities of standard large
language models by integrating and processing data from multiple modalities, including text …

Dart-math: Difficulty-aware rejection tuning for mathematical problem-solving

Y Tong, X Zhang, R Wang, R Wu, J He - arXiv preprint arXiv:2407.13690, 2024 - arxiv.org
Solving mathematical problems requires advanced reasoning abilities and presents notable
challenges for large language models. Previous works usually synthesize data from …

Clustering and ranking: Diversity-preserved instruction selection through expert-aligned quality estimation

Y Ge, Y Liu, C Hu, W Meng, S Tao, X Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
With contributions from the open-source community, a vast amount of instruction tuning (IT)
data has emerged. Given the significant resource allocation required for training and …

A survey of table reasoning with large language models

X Zhang, D Wang, L Dou, Q Zhu, W Che - Frontiers of Computer Science, 2025 - Springer
Table reasoning aims to generate inference results based on the user's requirement and the
provided table. Enhancing the table reasoning capability of the model can aid in obtaining …