Mixture of parrots: Experts improve memorization more than reasoning

S Jelassi, C Mohri, D Brandfonbrener, A Gu… - arXiv preprint arXiv …, 2024 - arxiv.org
The Mixture-of-Experts (MoE) architecture enables a significant increase in the total number
of model parameters with minimal computational overhead. However, it is not clear what …

Reasoning in large language models: A geometric perspective

R Cosentino, S Shekkizhar - arXiv preprint arXiv:2407.02678, 2024 - arxiv.org
The advancement of large language models (LLMs) for real-world applications hinges
critically on enhancing their reasoning capabilities. In this work, we explore the reasoning …

A theory for compressibility of graph transformers for transductive learning

H Shirzad, H Lin, A Velingker, B Venkatachalam… - arXiv preprint arXiv …, 2024 - arxiv.org
Transductive tasks on graphs differ fundamentally from typical supervised machine learning
tasks, as the independent and identically distributed (iid) assumption does not hold among …

Lost-in-Distance: Impact of Contextual Proximity on LLM Performance in Graph Tasks

H Firooz, M Sanjabi, W Jiang, X Zhai - arXiv preprint arXiv:2410.01985, 2024 - arxiv.org
Despite significant advancements, Large Language Models (LLMs) exhibit blind spots that
impair their ability to retrieve and process relevant contextual data effectively. We …

The CLRS-Text Algorithmic Reasoning Language Benchmark

L Markeeva, S McLeish, B Ibarz, W Bounsi… - arXiv preprint arXiv …, 2024 - arxiv.org
Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path
towards building intelligent systems. Most recent studies dedicated to reasoning focus on …

Graph Reasoning with LLMs (GReaL)

A Tsitsulin, B Perozzi, B Fatemi… - Proceedings of the 30th …, 2024 - dl.acm.org
Graphs are a powerful tool for representing and analyzing complex relationships in real-world applications. Large Language Models (LLMs) have demonstrated impressive …

Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights

Z Chen, H Mao, J Liu, Y Song, B Li, W Jin… - arXiv preprint arXiv …, 2024 - arxiv.org
Given the ubiquity of graph data and its applications in diverse domains, building a Graph
Foundation Model (GFM) that can work well across different graphs and tasks with a unified …

GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration

X Li, Q Chu, Y Chen, Y Liu, Y Liu, Z Yu, W Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Graphs are widely used for modeling relational data in real-world scenarios, such as social
networks and urban computing. Existing LLM-based graph analysis approaches either …

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

S Park, A Panigrahi, Y Cheng, D Yu, A Goyal… - arXiv preprint arXiv …, 2025 - arxiv.org
While Vision Language Models (VLMs) are impressive in tasks such as visual question
answering (VQA) and image captioning, their ability to apply multi-step reasoning to images …

Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

B Fatemi, M Kazemi, A Tsitsulin, K Malkan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have showcased remarkable reasoning capabilities, yet
they remain susceptible to errors, particularly in temporal reasoning tasks involving complex …