A study of BFLOAT16 for deep learning training D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ... arXiv preprint arXiv:1905.12322, 2019 | 326 | 2019 |
Quasi-Monte Carlo feature maps for shift-invariant kernels J Yang, V Sindhwani, H Avron, MW Mahoney International Conference on Machine Learning (ICML 2014), 2014 | 199* | 2014 |
Sub-sampled Newton methods with non-uniform sampling P Xu, J Yang, F Roosta, C Ré, MW Mahoney Advances in Neural Information Processing Systems 29, 2016 | 139 | 2016 |
Software-hardware co-design for fast and scalable training of deep learning recommendation models D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 138* | 2022 |
Compositional embeddings using complementary partitions for memory-efficient recommendation systems HJM Shi, D Mudigere, M Naumov, J Yang Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020 | 111 | 2020 |
Mixed dimension embeddings with application to memory-efficient recommendation systems AA Ginart, M Naumov, D Mudigere, J Yang, J Zou 2021 IEEE International Symposium on Information Theory (ISIT), 2786-2791, 2021 | 101 | 2021 |
Towards automated neural interaction discovery for click-through rate prediction Q Song, D Cheng, H Zhou, J Yang, Y Tian, X Hu Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020 | 88 | 2020 |
Deep learning training in Facebook data centers: Design of scale-up and scale-out systems M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ... arXiv preprint arXiv:2003.09518, 2020 | 86 | 2020 |
Matrix factorizations at scale: A comparison of scientific data analytics in Spark and C+MPI using three case studies A Gittens, A Devarakonda, E Racah, M Ringenburg, L Gerhardt, ... 2016 IEEE International Conference on Big Data (Big Data), 204-213, 2016 | 86 | 2016 |
Implementing randomized matrix algorithms in parallel and distributed environments J Yang, X Meng, MW Mahoney Proceedings of the IEEE 104 (1), 58-92, 2015 | 69 | 2015 |
Online modified greedy algorithm for storage control under uncertainty J Qin, Y Chow, J Yang, R Rajagopal IEEE Transactions on Power Systems 31 (3), 1729-1743, 2015 | 63 | 2015 |
Random Laplace feature maps for semigroup kernels on histograms J Yang, V Sindhwani, Q Fan, H Avron, MW Mahoney Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2014 | 63 | 2014 |
Quantile regression for large-scale applications J Yang, X Meng, MW Mahoney International Conference on Machine Learning, 881-887, 2013 | 62 | 2013 |
Weighted SGD for Regression with Randomized Preconditioning J Yang, YL Chow, C Ré, MW Mahoney Journal of Machine Learning Research 18 (211), 1-43, 2018 | 53 | 2018 |
Distributed online modified greedy algorithm for networked storage operation under uncertainty J Qin, Y Chow, J Yang, R Rajagopal IEEE Transactions on Smart Grid 7 (2), 1106-1118, 2015 | 43 | 2015 |
Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions J Yang, O Rubel, Prabhat, MW Mahoney, BP Bowen Analytical chemistry 87 (9), 4658-4666, 2015 | 39 | 2015 |
Post-training 4-bit quantization on embedding tables H Guan, A Malevich, J Yang, J Park, H Yuen arXiv preprint arXiv:1911.02079, 2019 | 26 | 2019 |
Modeling and online control of generalized energy storage networks J Qin, Y Chow, J Yang, R Rajagopal Proceedings of the 5th international conference on Future energy systems, 27-38, 2014 | 21* | 2014 |
Understanding and improving failure tolerant training for deep learning recommendation with partial recovery K Maeng, S Bharuka, I Gao, M Jeffrey, V Saraph, BY Su, C Trippel, J Yang, ... Proceedings of Machine Learning and Systems 3, 637-651, 2021 | 20 | 2021 |
Training with low-precision embedding tables J Zhang, J Yang, H Yuen Systems for Machine Learning Workshop at NeurIPS 2018, 2018 | 18 | 2018 |