TVM: An automated End-to-End optimizing compiler for deep learning T Chen, T Moreau, Z Jiang, L Zheng, E Yan, H Shen, M Cowan, L Wang, ... 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2018 | 1719 | 2018 |
Learning to optimize tensor programs T Chen, L Zheng, E Yan, Z Jiang, T Moreau, L Ceze, C Guestrin, ... Advances in Neural Information Processing Systems 31, 2018 | 424 | 2018 |
TVM: end-to-end optimization stack for deep learning T Chen, T Moreau, Z Jiang, H Shen, EQ Yan, L Wang, Y Hu, L Ceze, ... arXiv preprint arXiv:1802.04799 11 (2018), 20, 2018 | 263 | 2018 |
SNNAP: Approximate computing on programmable SoCs via neural acceleration T Moreau, M Wyse, J Nelson, A Sampson, H Esmaeilzadeh, L Ceze, ... 2015 IEEE 21st International Symposium on High Performance Computer …, 2015 | 183 | 2015 |
A hardware–software blueprint for flexible deep learning specialization T Moreau, T Chen, L Vega, J Roesch, E Yan, L Zheng, J Fromm, Z Jiang, ... IEEE Micro 39 (5), 8-16, 2019 | 169 | 2019 |
ACCEPT: A programmer-guided compiler framework for practical approximate computing A Sampson, A Baixo, B Ransford, T Moreau, J Yip, L Ceze, M Oskin University of Washington Technical Report UW-CSE-15-01 1 (2), 1-14, 2015 | 160 | 2015 |
VTA: an open hardware-software stack for deep learning T Moreau, T Chen, Z Jiang, L Ceze, C Guestrin, A Krishnamurthy arXiv preprint arXiv:1807.04188 10, 2018 | 83 | 2018 |
Exploiting errors for efficiency: A survey from circuits to applications P Stanley-Marbell, A Alaghi, M Carbin, E Darulova, L Dolecek, ... ACM Computing Surveys (CSUR) 53 (3), 1-39, 2020 | 71 | 2020 |
MATIC: Learning around errors for efficient low-voltage neural network accelerators S Kim, P Howe, T Moreau, A Alaghi, L Ceze, V Sathe 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6, 2018 | 69 | 2018 |
A taxonomy of general purpose approximate computing techniques T Moreau, J San Miguel, M Wyse, J Bornholt, A Alaghi, L Ceze, NE Jerger, ... IEEE Embedded Systems Letters 10 (1), 2-5, 2017 | 63 | 2017 |
Energy-efficient neural network acceleration in the presence of bit-level memory errors S Kim, P Howe, T Moreau, A Alaghi, L Ceze, VS Sathe IEEE Transactions on Circuits and Systems I: Regular Papers 65 (12), 4285-4298, 2018 | 53 | 2018 |
Automatic generation of high-performance quantized machine learning kernels M Cowan, T Moreau, T Chen, J Bornholt, L Ceze Proceedings of the 18th ACM/IEEE International Symposium on Code Generation …, 2020 | 49 | 2020 |
Approximate computing: Making mobile systems more efficient T Moreau, A Sampson, L Ceze IEEE Pervasive Computing 14 (2), 9-13, 2015 | 40 | 2015 |
Relay: A high-level compiler for deep learning J Roesch, S Lyubomirsky, M Kirisame, L Weber, J Pollock, L Vega, ... arXiv preprint arXiv:1904.08368, 2019 | 26 | 2019 |
Leveraging the VTA-TVM hardware-software stack for FPGA acceleration of 8-bit ResNet-18 inference T Moreau, T Chen, L Ceze Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament …, 2018 | 22 | 2018 |
Exploring computation-communication tradeoffs in camera systems A Mazumdar, T Moreau, S Kim, M Cowan, A Alaghi, L Ceze, M Oskin, ... 2017 IEEE International Symposium on Workload Characterization (IISWC), 177-186, 2017 | 19 | 2017 |
Automating generation of low precision deep learning operators M Cowan, T Moreau, T Chen, L Ceze arXiv preprint arXiv:1810.11066, 2018 | 16 | 2018 |
REACT: A framework for rapid exploration of approximate computing techniques M Wyse, A Baixo, T Moreau, B Zorn, J Bornholt, A Sampson, L Ceze, ... Workshop on Approximate Computing Across the Stack (WAX w/PLDI), 7-9, 2015 | 12 | 2015 |
Relay: A high-level IR for deep learning J Roesch, S Lyubomirsky, M Kirisame, J Pollock, L Weber, Z Jiang, ... arXiv preprint arXiv:1904.08368, 2019 | 11 | 2019 |
QAPPA: A framework for navigating quality-energy tradeoffs with arbitrary quantization T Moreau, F Augusto, P Howe, A Alaghi, L Ceze Technical Report UW-CSE-17-03-02, 2017 | 11 | 2017 |