Deep learning with limited numerical precision S Gupta, A Agrawal, K Gopalakrishnan, P Narayanan Proceedings of the 32nd International Conference on Machine Learning (ICML …, 2015 | 2555 | 2015 |
Adacomp: Adaptive residual gradient compression for data-parallel distributed training CY Chen, J Choi, D Brand, A Agrawal, W Zhang, K Gopalakrishnan Thirty-Second AAAI Conference on Artificial Intelligence, 2018 | 188 | 2018 |
Ultra-Low Precision 4-bit Training of Deep Neural Networks. X Sun, N Wang, CY Chen, J Ni, A Agrawal, X Cui, S Venkataramani, ... NeurIPS, 2020 | 180 | 2020 |
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference B Fleischer, S Shukla, M Ziegler, J Silberman, J Oh, V Srinivasan, J Choi, ... 2018 IEEE Symposium on VLSI Circuits, 35-36, 2018 | 152 | 2018 |
Approximate computing: Challenges and opportunities A Agrawal, J Choi, K Gopalakrishnan, S Gupta, R Nair, J Oh, DA Prener, ... 2016 IEEE International Conference on Rebooting Computing (ICRC), 1-8, 2016 | 122 | 2016 |
A 19Gb/s serial link receiver with both 4-tap FFE and 5-tap DFE functions in 45nm SOI CMOS A Agrawal, J Bulzacchelli, T Dickson, Y Liu, J Tierno, D Friedman Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 …, 2012 | 115 | 2012 |
Dlfloat: A 16-b floating point format designed for deep learning training and inference A Agrawal, SM Mueller, BM Fleischer, X Sun, N Wang, J Choi, ... 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH), 92-95, 2019 | 91 | 2019 |
RaPiD: AI accelerator for ultra-low precision training and inference S Venkataramani, V Srinivasan, W Wang, S Sen, J Zhang, A Agrawal, ... 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021 | 74 | 2021 |
9.1 A 7nm 4-core AI chip with 25.6 TFLOPS hybrid FP8 training, 102.4 TOPS INT4 inference and workload-aware throttling A Agrawal, SK Lee, J Silberman, M Ziegler, M Kang, S Venkataramani, ... 2021 IEEE International Solid-State Circuits Conference (ISSCC) 64, 144-146, 2021 | 74 | 2021 |
An integrated silicon photonics technology for O-band datacom NB Feilchenfeld, FG Anderson, T Barwicz, S Chilstedt, Y Ding, ... International Electron Devices Meeting (IEDM), 2015, 2015 | 60 | 2015 |
Efficient AI System Design With Cross-Layer Approximate Computing S Venkataramani, X Sun, N Wang, CY Chen, J Choi, M Kang, A Agarwal, ... Proceedings of the IEEE 108 (12), 2232-2250, 2020 | 52 | 2020 |
An 8x3.2 Gb/s Parallel Receiver with Collaborative Timing Recovery A Agrawal, PK Hanumolu, GY Wei Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical …, 2008 | 51* | 2008 |
A 1.4 pJ/bit, Power-Scalable 16× 12 Gb/s Source-Synchronous I/O With DFE Receiver in 32 nm SOI CMOS Technology TO Dickson, Y Liu, SV Rylov, A Agrawal, S Kim, PH Hsieh, JF Bulzacchelli, ... IEEE Journal of Solid-State Circuits 50 (8), 1917-1931, 2015 | 47 | 2015 |
An 8x5 Gb/s Parallel Receiver With Collaborative Timing Recovery A Agrawal, A Liu, PK Hanumolu, GY Wei Solid-State Circuits, IEEE Journal of 44 (11), 3120-3130, 2009 | 45* | 2009 |
Monolithic silicon photonics at 25 Gb/s JS Orcutt, DM Gill, J Proesel, J Ellis-Monaghan, F Horst, T Barwicz, ... 2016 Optical Fiber Communications Conference and Exhibition (OFC), 1-3, 2016 | 43 | 2016 |
Monolithic Silicon Photonics at 25Gb/s J Orcutt, D Gill, J Proesel, J Ellis-Monaghan, F Horst, T Barwicz, C Xiong ... OFC 2016, Th4.1, 2016 | 43* | 2016 |
A 3.0 TFLOPS 0.62 V Scalable Processor Core for High Compute Utilization AI Training and Inference J Oh, SK Lee, M Kang, M Ziegler, J Silberman, A Agrawal, ... 2020 IEEE Symposium on VLSI Circuits, 1-2, 2020 | 41 | 2020 |
A 1.8-pJ/bit 16× 16-Gb/s source synchronous parallel interface in 32nm SOI CMOS with receiver redundancy for link recalibration TO Dickson, Y Liu, A Agrawal, JF Bulzacchelli, H Ainspan, Z Toprak-Deniz, ... Custom Integrated Circuits Conference (CICC), 2015 IEEE, 1-4, 2015 | 41* | 2015 |
Accumulation bit-width scaling for ultra-low precision training of deep networks C Sakr, N Wang, CY Chen, J Choi, A Agrawal, N Shanbhag, ... arXiv preprint arXiv:1901.06588, 2019 | 39 | 2019 |