LLaMA-Berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning

D Zhang, J Wu, J Lei, T Che, J Li, T Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents an advanced mathematical problem-solving framework, LLaMA-Berry,
for enhancing the mathematical reasoning ability of Large Language Models (LLMs). The …

Towards a unified view of preference learning for large language models: A survey

B Gao, F Song, Y Miao, Z Cai, Z Yang, L Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) exhibit remarkably powerful capabilities. One of the crucial
factors in achieving this success is aligning the LLM's output with human preferences. This …

AutoRE: document-level relation extraction with large language models

L Xue, D Zhang, Y Dong, J Tang - … of the 62nd Annual Meeting of …, 2024 - aclanthology.org
Large Language Models (LLMs) have demonstrated exceptional abilities in
comprehending and generating text, motivating numerous researchers to utilize them for …

Towards building specialized generalist AI with System 1 and System 2 fusion

K Zhang, B Qi, B Zhou - arXiv preprint arXiv:2407.08642, 2024 - arxiv.org
In this perspective paper, we introduce the concept of Specialized Generalist Artificial
Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence …

Fast best-of-n decoding via speculative rejection

H Sun, M Haider, R Zhang, H Yang, J Qiu, M Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
The safe and effective deployment of Large Language Models (LLMs) involves a critical step
called alignment, which ensures that the model's responses are in accordance with human …

Enhancing multi-step reasoning abilities of language models through direct Q-function optimization

G Liu, K Ji, R Zheng, Z Wu, C Dun, Q Gu… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs)
with human preferences and improving their ability to perform complex tasks. However …

Beyond examples: High-level automated reasoning paradigm in in-context learning via MCTS

J Wu, M Feng, S Zhang, F Che, Z Wen, J Tao - arXiv preprint arXiv …, 2024 - arxiv.org
In-context Learning (ICL) enables large language models (LLMs) to tackle downstream
tasks through sophisticated prompting and high-quality demonstrations. However, this …

Progressive multimodal reasoning via active retrieval

G Dong, C Zhang, M Deng, Y Zhu, Z Dou… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-step multimodal reasoning tasks pose significant challenges for multimodal large
language models (MLLMs), and finding effective ways to enhance their performance in such …

Towards a science exocortex

KG Yager - Digital Discovery, 2024 - pubs.rsc.org
Artificial intelligence (AI) methods are poised to revolutionize intellectual work, with
generative AI enabling automation of text analysis, text generation, and simple decision …

Scaling inference-time search with vision value model for improved visual comprehension

X Wang, Z Yang, L Li, H Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant advancements in vision-language models (VLMs), there is a lack of effective
approaches to enhance response quality by scaling inference-time computation. This …