PPL-MCTS: Constrained textual generation through discriminator-guided MCTS decoding

LM Antunes, KT Butler, R Grau-Crespo - Nature Communications, 2024 - nature.com

The generation of plausible crystal structures is often the first step in predicting the structure
and properties of a material from its chemical composition. However, most current methods …

Cited by 43 - Related articles - All 2 versions

[PDF] arxiv.org

Planning with large language models for code generation

S Zhang, Z Chen, Y Shen, M Ding… - arXiv preprint arXiv …, 2023 - arxiv.org

Existing large language model-based code generation pipelines typically use beam search
or sampling algorithms during the decoding process. Although the programs they generate …

Cited by 141 - Related articles - All 4 versions

[PDF] neurips.cc

Transformer-based planning for symbolic regression

P Shojaee, K Meidani… - Advances in Neural …, 2023 - proceedings.neurips.cc

Symbolic regression (SR) is a challenging task in machine learning that involves finding a
mathematical expression for a function based on its values. Recent advancements in SR …

Cited by 27 - Related articles - All 7 versions

[PDF] arxiv.org

Controlled decoding from language models

S Mudgal, J Lee, H Ganapathy, YG Li, T Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

We propose controlled decoding (CD), a novel off-policy reinforcement learning method to
control the autoregressive generation from language models towards high reward …

Cited by 55 - Related articles - All 4 versions

[PDF] openreview.net

AlphaZero-like tree-search can guide large language model decoding and training

Z Wan, X Feng, M Wen, SM McAleer, Y Wen… - … on Machine Learning, 2024 - openreview.net

Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment
the multi-step reasoning capabilities of LLMs by using tree-search algorithms. These …

Cited by 11 - Related articles

[PDF] arxiv.org

WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off

E Giboulot, T Furon - arXiv preprint arXiv:2403.04808, 2024 - arxiv.org

Watermarking is a technical means to dissuade malfeasant usage of Large Language
Models. This paper proposes a novel watermarking scheme, so-called WaterMax, that …

Cited by 16 - Related articles - All 2 versions

[PDF] arxiv.org

Making PPO even better: Value-guided Monte-Carlo tree search decoding

J Liu, A Cohen, R Pasunuru, Y Choi… - arXiv preprint arXiv …, 2023 - arxiv.org

Inference-time search algorithms such as Monte-Carlo Tree Search (MCTS) may seem
unnecessary when generating natural language text based on state-of-the-art reinforcement …

Cited by 18 - Related articles - All 2 versions

[PDF] arxiv.org

ARGS: Alignment as reward-guided search

M Khanov, J Burapacheep, Y Li - arXiv preprint arXiv:2402.01694, 2024 - arxiv.org

Aligning large language models with human objectives is paramount, yet common
approaches including RLHF suffer from unstable and resource-intensive training. In …

Cited by 30 - Related articles - All 3 versions

[PDF] arxiv.org

DeTikZify: Synthesizing graphics programs for scientific figures and sketches with TikZ

J Belouadi, SP Ponzetto, S Eger - arXiv preprint arXiv:2405.15306, 2024 - arxiv.org

Creating high-quality scientific figures can be time-consuming and challenging, even though
sketching ideas on paper is relatively easy. Furthermore, recreating existing figures that are …

Cited by 3 - Related articles - All 2 versions

[PDF] arxiv.org

Deliberate reasoning for LLMs as structure-aware planning with accurate world model

S Xiong, A Payani, Y Yang, F Fekri - arXiv preprint arXiv:2410.03136, 2024 - arxiv.org

Enhancing the reasoning capabilities of large language models (LLMs) remains a key
challenge, especially for tasks that require complex, multi-step decision-making. Humans …

Cited by 2 - Related articles - All 2 versions
