Large language models as general pattern machines

S Mirchandani, F Xia, P Florence, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences--from arbitrary ones procedurally generated by …

Generative pre-trained transformer (GPT) in research: A systematic review on data augmentation

F Sufi - Information, 2024 - mdpi.com
GPT (Generative Pre-trained Transformer) represents advanced language models that have
significantly reshaped the academic writing landscape. These sophisticated language …

Hypothesis search: Inductive reasoning with language models

R Wang, E Zelikman, G Poesia, Y Pu, N Haber… - arXiv preprint arXiv …, 2023 - arxiv.org
Inductive reasoning is a core problem-solving capacity: humans can identify underlying
principles from a few examples, which can then be robustly generalized to novel scenarios …

Phenomenal yet puzzling: Testing inductive reasoning capabilities of language models with hypothesis refinement

L Qiu, L Jiang, X Lu, M Sclar, V Pyatkin… - arXiv preprint arXiv …, 2023 - arxiv.org
The ability to derive underlying principles from a handful of observations and then
generalize to novel situations--known as inductive reasoning--is central to human …

Comparing humans, GPT-4, and GPT-4V on abstraction and reasoning tasks

M Mitchell, AB Palmarini, A Moskvichev - arXiv preprint arXiv:2311.09247, 2023 - arxiv.org
We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4,
using the ConceptARC benchmark [10], which is designed to evaluate robust understanding …

Efficient causal graph discovery using large language models

T Jiralerspong, X Chen, Y More, V Shah… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a novel framework that leverages LLMs for full causal graph discovery. While
previous LLM-based methods have used a pairwise query approach, this requires a …

Examining the potential and pitfalls of ChatGPT in science and engineering problem-solving

KD Wang, E Burkholder, C Wieman, S Salehi… - Frontiers in …, 2024 - frontiersin.org
The study explores the capabilities of OpenAI's ChatGPT in solving different types of physics
problems. ChatGPT (with GPT-4) was queried to solve a total of 40 problems from a college …

CodeIt: Self-improving language models with prioritized hindsight replay

N Butt, B Manczak, A Wiggers, C Rainone… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models are increasingly solving tasks that are commonly believed to
require human-level reasoning ability. However, these models still perform very poorly on …

ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning

H Lee, S Kim, S Lee, S Hwang, J Lee, BJ Lee… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces ARCLE, an environment designed to facilitate reinforcement learning
research on the Abstraction and Reasoning Corpus (ARC). Addressing this inductive …

Large language model (LLM) as a system of multiple expert agents: An approach to solve the Abstraction and Reasoning Corpus (ARC) challenge

JCM Tan, M Motani - arXiv preprint arXiv:2310.05146, 2023 - arxiv.org
We attempt to solve the Abstraction and Reasoning Corpus (ARC) Challenge using Large
Language Models (LLMs) as a system of multiple expert agents. Using the flexibility of LLMs …