On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima NS Keskar, D Mudigere, J Nocedal, M Smelyanskiy, PTP Tang International Conference on Learning Representations (ICLR), 2017, 2016 | 3528 | 2016 |
Deep learning recommendation model for personalization and recommendation systems M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ... arXiv preprint arXiv:1906.00091, 2019 | 677 | 2019 |
A study of BFLOAT16 for deep learning training D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ... arXiv preprint arXiv:1905.12322, 2019 | 317 | 2019 |
The architectural implications of Facebook's DNN-based personalized recommendation U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ... 2020 IEEE International Symposium on High Performance Computer Architecture …, 2020 | 288 | 2020 |
Distributed deep learning using synchronous stochastic gradient descent D Das, S Avancha, D Mudigere, K Vaidynathan, S Sridharan, D Kalamkar, ... arXiv preprint arXiv:1602.06709, 2016 | 210 | 2016 |
RecNMP: Accelerating personalized recommendation with near-memory processing L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ... 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 201 | 2020 |
Mixed Precision Training of Convolutional Neural Networks using Integer Operations D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ... International Conference on Learning Representations (ICLR), 2018, 2018 | 191 | 2018 |
A progressive batching L-BFGS method for machine learning R Bollapragada, J Nocedal, D Mudigere, HJ Shi, PTP Tang International Conference on Machine Learning, 620-629, 2018 | 166 | 2018 |
Vector evaluated particle swarm optimization (VEPSO) for multi-objective design optimization of composite structures SN Omkar, D Mudigere, GN Naik, S Gopalakrishnan Computers & structures 86 (1-2), 1-14, 2008 | 153 | 2008 |
Ternary neural networks with fine-grained quantization N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey arXiv preprint arXiv:1705.01462, 2017 | 130 | 2017 |
Software-hardware co-design for fast and scalable training of deep learning recommendation models D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 112* | 2022 |
Compositional embeddings using complementary partitions for memory-efficient recommendation systems HJM Shi, D Mudigere, M Naumov, J Yang Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020 | 112 | 2020 |
Mixed dimension embeddings with application to memory-efficient recommendation systems AA Ginart, M Naumov, D Mudigere, J Yang, J Zou 2021 IEEE International Symposium on Information Theory (ISIT), 2786-2791, 2021 | 100 | 2021 |
Machine learning accelerator mechanism A Bleiweiss, A Ramesh, A Mishra, D Marr, J Cook, S Sridharan, ... US Patent 11,373,088, 2022 | 92 | 2022 |
Deep learning training in Facebook data centers: Design of scale-up and scale-out systems M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ... arXiv preprint arXiv:2003.09518, 2020 | 84 | 2020 |
Crop classification using biologically-inspired techniques with high resolution satellite image SN Omkar, J Senthilnath, M Dheevatsa Journal of the Indian Society of Remote Sensing 36 (2), 175-182, 2008 | 79* | 2008 |
Fine-grain compute communication execution for deep learning frameworks S Sridharan, D Mudigere US Patent App. 15/869,502, 2018 | 69 | 2018 |
Unity: Accelerating DNN training through joint optimization of algebraic transformations and parallelization C Unger, Z Jia, W Wu, S Lin, M Baines, CEQ Narvaez, V Ramakrishnaiah, ... 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022 | 68 | 2022 |
Dynamic precision management for integer deep learning primitives N Mellempudi, D Mudigere, D Das, S Sridharan US Patent 10,643,297, 2020 | 57 | 2020 |
Performance optimizations for scalable implicit RANS calculations with SU2 TD Economon, D Mudigere, G Bansal, A Heinecke, F Palacios, J Park, ... Computers & Fluids 129, 146-158, 2016 | 56 | 2016 |