Distributed artificial intelligence empowered by end-edge-cloud computing: A survey

S Duan, D Wang, J Ren, F Lyu, Y Zhang… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org
As the computing paradigm shifts from cloud computing to end-edge-cloud computing, it
also supports artificial intelligence evolving from a centralized manner to a distributed one …

[HTML][HTML] Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …

Gpt3. int8 (): 8-bit matrix multiplication for transformers at scale

T Dettmers, M Lewis, Y Belkada… - Advances in Neural …, 2022 - proceedings.neurips.cc
Large language models have been widely adopted but require significant GPU memory for
inference. We develop a procedure for Int8 matrix multiplication for feed-forward and …

Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery

Y Wen, N Jain, J Kirchenbauer… - Advances in …, 2024 - proceedings.neurips.cc
The strength of modern generative models lies in their ability to be controlled through
prompts. Hard prompts comprise interpretable words and tokens, and are typically hand …

Beyond transmitting bits: Context, semantics, and task-oriented communications

D Gündüz, Z Qin, IE Aguerri, HS Dhillon… - IEEE Journal on …, 2022 - ieeexplore.ieee.org
Communication systems to date primarily aim at reliably communicating bit sequences.
Such an approach provides efficient engineering designs that are agnostic to the meanings …

A crossbar array of magnetoresistive memory devices for in-memory computing

S Jung, H Lee, S Myung, H Kim, SK Yoon, SW Kwon… - Nature, 2022 - nature.com
Implementations of artificial neural networks that borrow analogue techniques could
potentially offer low-power alternatives to fully digital approaches,–. One notable example is …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …

Dynamic neural networks: A survey

Y Han, G Huang, S Song, L Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Dynamic neural network is an emerging research topic in deep learning. Compared to static
models which have fixed computational graphs and parameters at the inference stage …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …