Attention is all you need A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Advances in neural information processing systems 30, 2017 | 131479 | 2017 |
Advances in neural information processing systems A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Attention is all you need, 2017 | 1832 | 2017 |
Tensor2tensor for neural machine translation A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ... arXiv preprint arXiv:1803.07416, 2018 | 624 | 2018 |
The reversible residual network: Backpropagation without storing activations AN Gomez, M Ren, R Urtasun, RB Grosse Advances in neural information processing systems 30, 2017 | 561 | 2017 |
Attention is all you need. Advances in neural information processing systems A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Advances in neural information processing systems 30 (2017), 2017 | 464 | 2017 |
Disease variant prediction with deep generative models of evolutionary data J Frazer, P Notin, M Dias, A Gomez, JK Min, K Brock, Y Gal, DS Marks Nature 599 (7883), 91-95, 2021 | 449 | 2021 |
One model to learn them all L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ... arXiv preprint arXiv:1706.05137, 2017 | 388 | 2017 |
Depthwise Separable Convolutions for Neural Machine Translation L Kaiser, AN Gomez, F Chollet International Conference on Learning Representations, 2018 | 360 | 2018 |
Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval P Notin, M Dias, J Frazer, JM Hurtado, AN Gomez, D Marks, Y Gal International Conference on Machine Learning, 16990-17017, 2022 | 139 | 2022 |
A systematic comparison of Bayesian deep learning robustness in diabetic retinopathy tasks A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ... arXiv preprint arXiv:1912.10481, 2019 | 134* | 2019 |
Learning Sparse Networks Using Targeted Dropout AN Gomez, I Zhang, S Rao Kamalakara, D Madaan, K Swersky, Y Gal, ... arXiv preprint arXiv:1905.13678, 2019 | 120 | 2019 |
Self-attention between datapoints: Going beyond individual input-output pairs in deep learning J Kossen, N Band, C Lyle, AN Gomez, T Rainforth, Y Gal Advances in Neural Information Processing Systems 34, 28742-28756, 2021 | 105 | 2021 |
Prioritized training on points that are learnable, worth learning, and not yet learnt S Mindermann, JM Brauner, MT Razzak, M Sharma, A Kirsch, W Xu, ... International Conference on Machine Learning, 15630-15649, 2022 | 97 | 2022 |
The difficulty of training sparse neural networks U Evci, F Pedregosa, A Gomez, E Elsen arXiv preprint arXiv:1906.10732, 2019 | 93 | 2019 |
Unsupervised cipher cracking using discrete GANs AN Gomez, S Huang, I Zhang, BM Li, M Osama, L Kaiser arXiv preprint arXiv:1801.04883, 2018 | 79 | 2018 |
Attention Is All You Need. NIPS’17 A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Proceedings of the 31st International Conference on Neural Information …, 2017 | 64 | 2017 |
Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17) A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Curran Associates Inc., 2017 | 62 | 2017 |
Attention is all you need (arXiv: 1706.03762). arXiv A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... | 61 | 2017 |
Attention is all you need A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, Ł Kaiser, I Polosukhin Advances in neural information processing systems | 55 | |
Wat zei je? detecting out-of-distribution translations with variational transformers TZ Xiao, AN Gomez, Y Gal arXiv preprint arXiv:2006.08344, 2020 | 36* | 2020 |