A survey on data synthesis and augmentation for large language models

K Wang, J Zhu, M Ren, Z Liu, S Li, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The success of Large Language Models (LLMs) is inherently linked to the availability of vast,
diverse, and high-quality data for training and evaluation. However, the growth rate of high …

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

H Li, Q Dong, J Chen, H Su, Y Zhou, Q Ai, Z Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of Large Language Models (LLMs) has driven their expanding
application across various fields. One of the most promising applications is their role as …

MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

D Fu, B Qi, Y Gao, C Jiang, G Dong, B Zhou - arXiv preprint arXiv …, 2024 - arxiv.org
Long-term memory is significant for agents, in which insights play a crucial role. However,
the emergence of irrelevant insight and the lack of general insight can greatly undermine the …

EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

LG Research, S An, K Bae, E Choi, K Choi… - arXiv preprint arXiv …, 2024 - arxiv.org
This technical report introduces the EXAONE 3.5 instruction-tuned language models,
developed and released by LG AI Research. The EXAONE 3.5 language models are offered …

CareBot: A Pioneering Full-Process Open-Source Medical Language Model

L Zhao, W Zeng, X Shi, H Zhou - arXiv preprint arXiv:2412.15236, 2024 - arxiv.org
Recently, both closed-source LLMs and open-source communities have made significant
strides, outperforming humans in various general domains. However, their performance in …

Aquila-Med LLM: Pioneering Full-Process Open-Source Medical Language Models

L Zhao, W Zeng, X Shi, H Zhou, D Hao, Y Lin - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, both closed-source LLMs and open-source communities have made significant
strides, outperforming humans in various general domains. However, their performance in …

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Y Ding, X Shi, X Liang, J Li, Q Zhu, M Zhang - arXiv preprint arXiv …, 2024 - arxiv.org
The availability of high-quality data is one of the most important factors in improving the
reasoning capability of LLMs. Existing works have demonstrated the effectiveness of …

AIGS: Generating Science from AI-Powered Automated Falsification

Z Liu, K Liu, Y Zhu, X Lei, Z Yang, Z Zhang, P Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Rapid development of artificial intelligence has drastically accelerated the development of
scientific discovery. Trained with large-scale observation data, deep neural networks extract …