A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

Llemma: An open language model for mathematics

Z Azerbayev, H Schoelkopf, K Paster… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Llemma, a large language model for mathematics. We continue pretraining
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …

Can LLMs generate novel research ideas? A large-scale human study with 100+ NLP researchers

C Si, D Yang, T Hashimoto - arXiv preprint arXiv:2409.04109, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have sparked optimism about their
potential to accelerate scientific discovery, with a growing number of works proposing …

Building machines that learn and think with people

KM Collins, I Sucholutsky, U Bhatt, K Chandra… - Nature human …, 2024 - nature.com
What do we want from machine intelligence? We envision machines that are not just tools
for thought but partners in thought: reasonable, insightful, knowledgeable, reliable and …

OpenWebMath: An open dataset of high-quality mathematical web text

K Paster, MD Santos, Z Azerbayev, J Ba - arXiv preprint arXiv:2310.06786, 2023 - arxiv.org
There is growing evidence that pretraining on high quality, carefully thought-out tokens such
as code or mathematics plays an important role in improving the reasoning abilities of large …

Towards responsible development of generative AI for education: An evaluation-driven approach

I Jurenka, M Kunesch, KR McKee, D Gillick… - arXiv preprint arXiv …, 2024 - arxiv.org
A major challenge facing the world is the provision of equitable and universal access to
quality education. Recent advances in generative AI (gen AI) have created excitement about …

Does ChatGPT enhance student learning? A systematic review and meta-analysis of experimental studies

R Deng, M Jiang, X Yu, Y Lu, S Liu - Computers & Education, 2024 - Elsevier
Chat Generative Pre-Trained Transformer (ChatGPT) has generated excitement
and concern in education. While cross-sectional studies have highlighted correlations …

From computation to adjudication: Evaluating large language model judges on mathematical reasoning and precision calculation

D Yanid, A Davenport, X Carmichael, N Thompson - 2024 - files.osf.io
Recent developments in language models have sparked interest in their potential
applications beyond natural language tasks, including domains that require precise …