Learning to transduce with unbounded memory

X Yang, Y Wang, R Byrne, G Schneider… - Chemical …, 2019 - ACS Publications

Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides
opportunities for the discovery and development of innovative drugs. Various machine …

Cited by 751 - Related articles - All 9 versions

[PDF] arxiv.org

The best of both worlds: Combining recent advances in neural machine translation

MX Chen, O Firat, A Bapna, M Johnson… - arXiv preprint arXiv …, 2018 - arxiv.org

The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling
for Machine Translation (MT). The classic RNN-based approaches to MT were first out …

Cited by 524 - Related articles - All 6 versions

[PDF] arxiv.org

Grokking: Generalization beyond overfitting on small algorithmic datasets

A Power, Y Burda, H Edwards, I Babuschkin… - arXiv preprint arXiv …, 2022 - arxiv.org

In this paper we propose to study generalization of neural networks on small algorithmically
generated datasets. In this setting, questions about data efficiency, memorization …

Cited by 239 - Related articles - All 4 versions

[PDF] researchgate.net

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer

In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Cited by 177 - Related articles - All 8 versions

[PDF] researchgate.net

Relational inductive biases, deep learning, and graph networks

PW Battaglia, JB Hamrick, V Bapst… - arXiv preprint arXiv …, 2018 - arxiv.org

Artificial intelligence (AI) has undergone a renaissance recently, making major progress in
key domains such as vision, language, control, and decision-making. This has been due, in …

Cited by 3660 - Related articles - All 8 versions

[PDF] science.org

Deep reinforcement learning for de novo drug design

M Popova, O Isayev, A Tropsha - Science advances, 2018 - science.org

We have devised and implemented a novel computational strategy for de novo design of
molecules with desired properties termed ReLeaSE (Reinforcement Learning for Structural …

Cited by 1166 - Related articles - All 10 versions

[PDF] arxiv.org

Scaling transformer to 1m tokens and beyond with rmt

A Bulatov, Y Kuratov, Y Kapushev… - arXiv preprint arXiv …, 2023 - arxiv.org

A major limitation for the broader scope of problems solvable by transformers is the
quadratic scaling of computational complexity with input size. In this study, we investigate …

Cited by 59 - Related articles - All 3 versions

[PDF] researchgate.net

[BOOK][B] Neural network methods for natural language processing

Y Goldberg - 2022 - books.google.com

Neural networks are a family of powerful machine learning models. This book focuses on the
application of neural network models to natural language data. The first half of the book …

Cited by 1815 - Related articles - All 13 versions

[PDF] arxiv.org

The concrete distribution: A continuous relaxation of discrete random variables

CJ Maddison, A Mnih, YW Teh - arXiv preprint arXiv:1611.00712, 2016 - arxiv.org

The reparameterization trick enables optimizing large scale stochastic computation graphs
via gradient descent. The essence of the trick is to refactor each stochastic node into a …

Cited by 2737 - Related articles - All 6 versions

[PDF] arxiv.org

Simulation intelligence: Towards a new generation of scientific methods

A Lavin, D Krakauer, H Zenil, J Gottschlich… - arXiv preprint arXiv …, 2021 - arxiv.org

The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific
computing, where a motif is an algorithmic method that captures a pattern of computation …

Cited by 109 - Related articles - All 7 versions
