Distributed artificial intelligence empowered by end-edge-cloud computing: A survey

S Duan, D Wang, J Ren, F Lyu, Y Zhang… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org
As the computing paradigm shifts from cloud computing to end-edge-cloud computing, it
also supports artificial intelligence evolving from a centralized manner to a distributed one …

Pareto-optimal quantized ResNet is mostly 4-bit

AA Abdolrashidi, L Wang, S Agrawal… - Proceedings of the …, 2021 - openaccess.thecvf.com
Quantization has become a popular technique to compress neural networks and reduce
compute cost, but most prior work focuses on studying quantization without changing the …

Efficient and effective methods for mixed precision neural network quantization for faster, energy-efficient inference

D Bablani, JL Mckinstry, SK Esser… - arXiv preprint arXiv …, 2023 - arxiv.org
For efficient neural network inference, it is desirable to achieve state-of-the-art accuracy with
the simplest networks requiring the least computation, memory, and power. Quantizing …

Design automation for fast, lightweight, and effective deep learning models: A survey

D Zhang, K Chen, Y Zhao, B Yang, L Yao… - arXiv preprint arXiv …, 2022 - arxiv.org
Deep learning technologies have demonstrated remarkable effectiveness in a wide range of
tasks, and deep learning holds the potential to advance a multitude of applications …

NeuLens: spatial-based dynamic acceleration of convolutional neural networks on edge

X Hou, Y Guan, T Han - Proceedings of the 28th Annual International …, 2022 - dl.acm.org
Convolutional neural networks (CNNs) play an important role in today's mobile and edge
computing systems for vision-based tasks like object classification and detection. However …

MWQ: Multiscale wavelet quantized neural networks

Q Sun, Y Ren, L Jiao, X Li, F Shang, F Liu - arXiv preprint arXiv …, 2021 - arxiv.org
Model quantization can reduce the model size and computational latency, it has become an
essential technique for the deployment of deep neural networks on resource-constrained …

Accelerable lottery tickets with the mixed-precision quantization

Z Li, Y Gong, Z Zhang, X Xue, T Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
In recent years, the lottery tickets hypothesis has gained widespread popularity as a means
of network compression. However, the practical application of lottery tickets for hardware …

Accelerating quantized DNNs with dedicated hardware accelerators and RISC-V processors using precision-scalable multipliers

L Urbinati - 2024 - tesidottorato.depositolegale.it
Summary: Mixed-Precision Quantization (MPQ) and Transprecision Computing (TC)
represent two valuable techniques used to optimize Deep Neural Networks (DNNs) …

Efficient neural architecture and mixed-precision search for quantized neural networks based on the number of model paths

黃名善 - 2023 - tdr.lib.ntu.edu.tw
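With the rapid development of artificial intelligence and its applications, neural network models have become more complex, and the computation and parameter counts of their architectures are
more than thousands of times those of the past. This makes manually searching for or building architectures extremely difficult. For this reason …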

One model for all quantization: A quantized network supporting hot-swap bit-width adjustment

Q Sun, X Li, Y Ren, Z Huang, X Liu, L Jiao… - arXiv preprint arXiv …, 2021 - arxiv.org
As an effective technique to achieve the implementation of deep neural networks in edge
devices, model quantization has been successfully applied in many practical applications …