Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking

F Yu, L Jiang, H Kang, S Hao, L Qin - arXiv preprint arXiv:2406.05673, 2024 - arxiv.org
Divergent thinking, the cognitive process of generating diverse solutions, is a hallmark of
human creativity and problem-solving. For machines, sampling diverse solution trajectories …

Rectifying Reinforcement Learning for Reward Matching

H He, E Bengio, Q Cai, L Pan - arXiv preprint arXiv:2406.02213, 2024 - arxiv.org
The Generative Flow Network (GFlowNet) is a probabilistic framework in which an agent
learns a stochastic policy and flow functions to sample objects with probability proportional …

Generative Flow Networks: Theory and Applications to Structure Learning

T Deleu - arXiv preprint arXiv:2501.05498, 2025 - arxiv.org
Without any assumptions about data generation, multiple causal models may explain our
observations equally well. To avoid selecting a single arbitrary model that could result in …

Improving GFlowNets with Monte Carlo Tree Search

N Morozov, D Tiapkin, S Samsonov, A Naumov… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative Flow Networks (GFlowNets) treat sampling from distributions over compositional
discrete spaces as a sequential decision-making problem, training a stochastic policy to …

Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks

R Hu, Y Zhang, Z Li, L Huang - arXiv preprint arXiv:2410.02596, 2024 - arxiv.org
Generative Flow Networks (GFlowNets) are a novel class of generative models designed to
sample from unnormalized distributions and have found applications in various important …