Wireless network intelligence at the edge

J Park, S Samarakoon, M Bennis… - Proceedings of the …, 2019 - ieeexplore.ieee.org
Fueled by the availability of more data and computing power, recent breakthroughs in cloud-
based machine learning (ML) have transformed every aspect of our lives from face …

Efficient neural audio synthesis

N Kalchbrenner, E Elsen, K Simonyan… - International …, 2018 - proceedings.mlr.press
Sequential models achieve state-of-the-art results in audio, visual and textual domains with
respect to both estimating the data distribution and generating desired samples. Efficient …

Structured compression by weight encryption for unstructured pruning and quantization

SJ Kwon, D Lee, B Kim, P Kapoor… - Proceedings of the …, 2020 - openaccess.thecvf.com
Model compression techniques, such as pruning and quantization, are becoming
increasingly important to reduce memory footprints and the amount of computation …
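
The snippet only names pruning and quantization; for context, here is a minimal sketch of the generic forms of both, magnitude-based unstructured pruning followed by symmetric uniform quantization of the surviving weights. It is not the weight-encryption scheme the paper proposes, and all function names are illustrative.

```python
# Illustrative only: generic magnitude pruning + uniform quantization,
# not the structured weight-encryption method of the cited paper.
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(np.floor(sparsity * w.size))
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def uniform_quantize(w: np.ndarray, n_bits: int = 4):
    """Symmetric uniform quantization; dequantize as codes * scale."""
    scale = np.max(np.abs(w)) / (2 ** (n_bits - 1) - 1)
    codes = np.round(w / scale).astype(np.int8)
    return codes, scale

w = np.random.randn(256, 256).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.9)
codes, scale = uniform_quantize(w_pruned, n_bits=4)
print("nonzero weights:", np.count_nonzero(w_pruned), "of", w.size)
```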

CSCNN: Algorithm-hardware co-design for CNN accelerators using centrosymmetric filters

J Li, A Louri, A Karanth… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are at the core of many state-of-the-art deep learning
models in computer vision, speech, and text processing. Training and deploying such CNN …

DeepTwist: Learning model compression via occasional weight distortion

D Lee, P Kapoor, B Kim - arXiv preprint arXiv:1810.12823, 2018 - arxiv.org
Model compression has been introduced to reduce the required hardware resources while
maintaining model accuracy. Many techniques for model compression, such as …

Learning low-rank approximation for CNNs

D Lee, SJ Kwon, B Kim, GY Wei - arXiv preprint arXiv:1905.10145, 2019 - arxiv.org
Low-rank approximation is an effective model compression technique that reduces not only
parameter storage requirements but also computation. For convolutional neural …
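
For reference, the sketch below shows the textbook truncated-SVD form of low-rank approximation applied to a dense weight matrix; how the cited paper selects or learns the factors is not reflected here, and the rank of 64 is an arbitrary illustrative choice.

```python
# Illustrative only: truncated-SVD low-rank factorization of a weight matrix,
# the textbook form of the technique named in the snippet above.
import numpy as np

def low_rank_factorize(w: np.ndarray, rank: int):
    """Approximate w (m x n) as a @ b with a: m x rank and b: rank x n."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]      # absorb singular values into the left factor
    b = vt[:rank, :]
    return a, b

w = np.random.randn(512, 512).astype(np.float32)
a, b = low_rank_factorize(w, rank=64)
# Parameters drop from m*n = 262144 to rank*(m + n) = 65536.
err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
print(f"relative reconstruction error: {err:.3f}")
```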

FleXOR: Trainable fractional quantization

D Lee, SJ Kwon, B Kim, Y Jeon… - Advances in neural …, 2020 - proceedings.neurips.cc
Quantization based on binary codes is gaining attention because each quantized bit can
be directly utilized for computations without dequantization using look-up tables. Previous …
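
The first sentence refers to multi-bit binary-code quantization, where a weight tensor is approximated as a sum of scaled {-1, +1} bit-planes so that inner products can operate on the bits directly. The greedy residual fit below is a common baseline form of that idea, not FleXOR's fractional-bit scheme; all names are illustrative.

```python
# Illustrative only: greedy multi-bit binary-code quantization
# (w ~= sum_i alpha_i * b_i with b_i in {-1, +1}); not FleXOR itself.
import numpy as np

def binary_code_quantize(w: np.ndarray, n_bits: int):
    """Fit each bit-plane to the current residual: b = sign(r), alpha = mean(|r|)."""
    residual = w.astype(np.float32).copy()
    bit_planes, scales = [], []
    for _ in range(n_bits):
        b = np.where(residual >= 0, 1.0, -1.0)
        alpha = float(np.mean(np.abs(residual)))
        bit_planes.append(b)
        scales.append(alpha)
        residual -= alpha * b
    return np.stack(bit_planes), np.array(scales)

w = np.random.randn(1024).astype(np.float32)
planes, alphas = binary_code_quantize(w, n_bits=3)
w_hat = (alphas[:, None] * planes).sum(axis=0)
print("mse:", float(np.mean((w - w_hat) ** 2)))
```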

GST: Group-sparse training for accelerating deep reinforcement learning

J Lee, S Kim, S Kim, W Jo, HJ Yoo - arXiv preprint arXiv:2101.09650, 2021 - arxiv.org
Deep reinforcement learning (DRL) has shown remarkable success in sequential decision-
making problems but suffers from long training times to reach such performance …
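
The snippet is cut off before the method itself, so only the generic ingredient suggested by the title is sketched here: a group-lasso penalty over output channels, a standard way to induce group-sparse weights during training. This is an assumption for illustration, not GST's actual scheme.

```python
# Illustrative only: a generic group-lasso penalty with one group per output
# channel of a conv layer; adding it to the training loss drives whole groups
# toward zero. This is not the specific mechanism of the cited paper.
import numpy as np

def group_lasso_penalty(w: np.ndarray, lam: float = 1e-4) -> float:
    """w: conv weight of shape (out_ch, in_ch, kh, kw); one group per output channel."""
    group_norms = np.sqrt((w.reshape(w.shape[0], -1) ** 2).sum(axis=1))
    return lam * float(group_norms.sum())

w = np.random.randn(64, 32, 3, 3).astype(np.float32)
print("group-lasso penalty:", group_lasso_penalty(w))
```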

Double Viterbi: Weight encoding for high compression ratio and fast on-chip reconstruction for deep neural network

D Ahn, D Lee, T Kim, JJ Kim - International Conference on Learning …, 2018 - openreview.net
Weight pruning has been introduced as an efficient model compression technique. Even
though pruning removes a significant amount of weights in a network, memory requirement …
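
The truncated second sentence points at the usual caveat that, after unstructured pruning, the positions of the surviving weights still have to be stored. The CSR sketch below illustrates that index overhead for a generic pruned matrix; it says nothing about the Viterbi-based encoding the paper proposes.

```python
# Illustrative only: CSR storage of a magnitude-pruned matrix, showing that
# index metadata keeps the footprint well above "values only". This is generic
# sparse storage, not the Viterbi-based weight encoding of the cited paper.
import numpy as np
from scipy.sparse import csr_matrix

w = np.random.randn(1024, 1024).astype(np.float32)
threshold = np.quantile(np.abs(w), 0.95)      # keep roughly 5% of the weights
w[np.abs(w) < threshold] = 0.0

sp = csr_matrix(w)
value_bytes = sp.data.nbytes                            # surviving weight values
index_bytes = sp.indices.nbytes + sp.indptr.nbytes      # column indices + row pointers
print(f"values: {value_bytes} B, indices: {index_bytes} B "
      f"({index_bytes / value_bytes:.2f}x overhead)")
```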

Layerwise sparse coding for pruned deep neural networks with extreme compression ratio

X Liu, W Li, J Huo, L Yao, Y Gao - Proceedings of the AAAI Conference on …, 2020 - aaai.org
Deep neural network compression is important and increasingly studied, especially in
resource-constrained environments such as autonomous drones and wearable devices …