Machine learning-guided protein engineering

P Kouba, P Kohout, F Haddadi, A Bushuiev… - ACS …, 2023 - ACS Publications
Recent progress in engineering highly promising biocatalysts has increasingly involved
machine learning methods. These methods leverage existing experimental and simulation …

Let the flows tell: Solving graph combinatorial problems with GFlowNets

D Zhang, H Dai, N Malkin… - Advances in …, 2024 - proceedings.neurips.cc
Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact
algorithms, making them a tempting domain to apply machine learning methods. The highly …

Large language models are zero shot hypothesis proposers

B Qi, K Zhang, H Li, K Tian, S Zeng, ZR Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Significant scientific discoveries have driven the progress of human civilisation. The
explosion of scientific literature and data has created information barriers across disciplines …

Amortizing intractable inference in large language models

EJ Hu, M Jain, E Elmoznino, Y Kaddar, G Lajoie… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive large language models (LLMs) compress knowledge from their training data
through next-token conditional distributions. This limits tractable querying of this knowledge …

Diffusion generative flow samplers: Improving learning signals through partial trajectory optimization

D Zhang, RTQ Chen, CH Liu, A Courville… - arXiv preprint arXiv …, 2023 - arxiv.org
We tackle the problem of sampling from intractable high-dimensional density functions, a
fundamental task that often appears in machine learning and statistics. We extend recent …

Local search GFlowNets

M Kim, T Yun, E Bengio, D Zhang, Y Bengio… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a
distribution over discrete objects proportional to their rewards. GFlowNets exhibit a …

Learning to scale logits for temperature-conditional GFlowNets

M Kim, J Ko, D Zhang, L Pan, T Yun, W Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
GFlowNets are probabilistic models that learn a stochastic policy that sequentially generates
compositional structures, such as molecular graphs. They are trained with the objective of …

Causal machine learning for single-cell genomics

A Tejada-Lapuerta, P Bertin, S Bauer, H Aliee… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in single-cell omics allow for unprecedented insights into the transcription profiles
of individual cells. When combined with large-scale perturbation screens, through which …

Pre-training and fine-tuning generative flow networks

L Pan, M Jain, K Madan, Y Bengio - arXiv preprint arXiv:2310.03419, 2023 - arxiv.org
Generative Flow Networks (GFlowNets) are amortized samplers that learn stochastic
policies to sequentially generate compositional objects from a given unnormalized reward …

Discrete probabilistic inference as control in multi-path environments

T Deleu, P Nouri, N Malkin, D Precup… - arXiv preprint arXiv …, 2024 - arxiv.org
We consider the problem of sampling from a discrete and structured distribution as a
sequential decision problem, where the objective is to find a stochastic policy such that …