Large batch optimization for deep learning: Training BERT in 76 minutes Y You, J Li, S Reddi, J Hseu, S Kumar, S Bhojanapalli, X Song, J Demmel, ... arXiv preprint arXiv:1904.00962, 2019 | 987 | 2019 |
Reducing BERT pre-training time from 3 days to 76 minutes Y You, J Li, J Hseu, X Song, J Demmel, CJ Hsieh arXiv preprint arXiv:1904.00962 12, 2, 2019 | 106 | 2019 |
Large-batch training for LSTM and beyond Y You, J Hseu, C Ying, J Demmel, K Keutzer, CJ Hsieh Proceedings of the International Conference for High Performance Computing …, 2019 | 103 | 2019 |
Large batch optimization for deep learning: Training BERT in 76 minutes. arXiv 2019 Y You, J Li, S Reddi, J Hseu, S Kumar, S Bhojanapalli, X Song, J Demmel, ... arXiv preprint arXiv:1904.00962, 0 | 21 | |
Large batch optimization for deep learning: Training BERT in 76 minutes. arXiv Y You, J Li, S Reddi, J Hseu, S Kumar, S Bhojanapalli, X Song, J Demmel, ... arXiv preprint arXiv:1904.00962, 2019 | 15 | 2019 |
From Rogue to MicroRogue A Stump, R Besand, JC Brodman, J Hseu, B Kinnersley Electronic Notes in Theoretical Computer Science 117, 69-87, 2005 | 3 | 2005 |