LLaMA-Berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning

D Zhang, J Wu, J Lei, T Che, J Li, T Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents an advanced mathematical problem-solving framework, LLaMA-Berry,
for enhancing the mathematical reasoning ability of Large Language Models (LLMs). The …

Towards a unified view of preference learning for large language models: A survey

B Gao, F Song, Y Miao, Z Cai, Z Yang, L Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) exhibit remarkably powerful capabilities. One of the crucial
factors in achieving this success is aligning the LLM's output with human preferences. This …

AutoRE: document-level relation extraction with large language models

L Xue, D Zhang, Y Dong, J Tang - … of the 62nd Annual Meeting of …, 2024 - aclanthology.org
Large Language Models (LLMs) have demonstrated exceptional abilities in
comprehending and generating text, motivating numerous researchers to utilize them for …

Towards building specialized generalist AI with System 1 and System 2 fusion

K Zhang, B Qi, B Zhou - arXiv preprint arXiv:2407.08642, 2024 - arxiv.org
In this perspective paper, we introduce the concept of Specialized Generalist Artificial
Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence …

Fast best-of-n decoding via speculative rejection

H Sun, M Haider, R Zhang, H Yang, J Qiu, M Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
The safe and effective deployment of Large Language Models (LLMs) involves a critical step
called alignment, which ensures that the model's responses are in accordance with human …

Enhancing multi-step reasoning abilities of language models through direct Q-function optimization

G Liu, K Ji, R Zheng, Z Wu, C Dun, Q Gu… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs)
with human preferences and improving their ability to perform complex tasks. However …

Beyond examples: High-level automated reasoning paradigm in in-context learning via MCTS

J Wu, M Feng, S Zhang, F Che, Z Wen, J Tao - arXiv preprint arXiv …, 2024 - arxiv.org
In-context Learning (ICL) enables large language models (LLMs) to tackle downstream
tasks through sophisticated prompting and high-quality demonstrations. However, this …

Progressive multimodal reasoning via active retrieval

G Dong, C Zhang, M Deng, Y Zhu, Z Dou… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-step multimodal reasoning tasks pose significant challenges for multimodal large
language models (MLLMs), and finding effective ways to enhance their performance in such …

Towards a science exocortex

KG Yager - Digital Discovery, 2024 - pubs.rsc.org
Artificial intelligence (AI) methods are poised to revolutionize intellectual work, with
generative AI enabling automation of text analysis, text generation, and simple decision …

Scaling inference-time search with vision value model for improved visual comprehension

X Wang, Z Yang, L Li, H Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant advancements in vision-language models (VLMs), there is a lack of effective
approaches to enhance response quality by scaling inference-time computation. This …