Authors
Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina
Publication date
2019/6/23
Workshop paper
4th edition of the Workshop on Energy Efficient Machine Learning and Cognitive Computing, co-located with the 46th International Symposium on Computer Architecture (ISCA)
Description
Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints. As a result, there is a need for compression techniques that can achieve significant compression without negatively impacting inference run-time and task accuracy. This paper explores a new compressed RNN cell implementation called Hybrid Matrix Decomposition (HMD) that achieves this dual objective. HMD creates dense matrices that result in output features where the upper sub-vector carries "richer" features while the lower sub-vector carries "constrained" features. On the benchmarks evaluated in this paper, this results in faster inference run-time than pruning and better accuracy than matrix factorization for compression factors of 2-4x.
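The abstract describes the idea only at a high level, so here is a minimal sketch of one plausible reading: the upper rows of an RNN weight matrix are kept fully dense (the "richer" upper features), while the lower rows are replaced by a low-rank product (the "constrained" lower features). The function names (hmd_compress, hmd_matvec), the split/rank parameters, and the use of truncated SVD to form the low-rank factors are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def hmd_compress(W, dense_rows, rank):
    """Hypothetical HMD-style split of a weight matrix W (m x n).

    Top `dense_rows` rows stay dense; the remaining rows are
    approximated by a rank-`rank` factorization (here via truncated
    SVD, which is an assumption -- the factors could also be trained).
    """
    D = W[:dense_rows, :]                       # dense block -> "richer" upper features
    L = W[dense_rows:, :]                       # block to compress
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    A = U[:, :rank] * s[:rank]                  # (m - dense_rows) x rank
    B = Vt[:rank, :]                            # rank x n
    return D, A, B

def hmd_matvec(D, A, B, x):
    """y ~= W @ x: one dense GEMV for the upper block,
    two thin GEMVs for the low-rank lower block."""
    y_upper = D @ x                             # unconstrained features
    y_lower = A @ (B @ x)                       # "constrained" low-rank features
    return np.concatenate([y_upper, y_lower])

# Illustrative usage on a 256x256 matrix (~1.7x fewer parameters here).
W = np.random.randn(256, 256).astype(np.float32)
D, A, B = hmd_compress(W, dense_rows=96, rank=32)
x = np.random.randn(256).astype(np.float32)
y = hmd_matvec(D, A, B, x)
```

Note how the dense block keeps inference as plain matrix-vector products, which is consistent with the abstract's claim of faster run-time than unstructured pruning at comparable compression.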
Total citations
[Citations-per-year chart, 2019–2024; per-year counts not recoverable from the extraction]
Articles in Google Scholar
U Thakker, J Beu, D Gope, G Dasika, M Mattina - 2019 2nd Workshop on Energy Efficient Machine …, 2019