From DNNs to GANs: Review of efficient hardware architectures for deep learning

G Bhattacharya - arXiv preprint arXiv:2107.00092, 2021 - arxiv.org
In recent times, the trends in the very large scale integration (VLSI) industry have been multi-dimensional: reduced energy consumption, smaller footprint, precise results, lower power dissipation, and faster response. To meet these needs, hardware architectures must be reliable and robust against these constraints. Neural networks and deep learning have begun to significantly influence the current research paradigm; such models involve parameters on the order of millions, nonlinear activation functions, convolutional operations for feature extraction, regression for classification, and generative adversarial networks. These operations impose enormous computation and memory overhead. Presently available DSP processors are ill-suited to such workloads and commonly suffer from memory overhead, performance drops, and compromised accuracy. Moreover, if a large silicon area is powered to accelerate these operations through parallel computation, the ICs face a significant risk of burning out due to the considerable heat generated. Hence, the dark silicon constraint has been introduced to limit heat dissipation without sacrificing accuracy. Likewise, different algorithms have been adapted to design DSP processors capable of fast performance on neural networks, activation functions, convolutional neural networks, and generative adversarial networks. In this review, we survey recent developments in hardware for accelerating the efficient implementation of deep learning networks with enhanced performance. The techniques investigated in this review are expected to direct future research on hardware optimization for high-performance computation.
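
To make the scale of that overhead concrete, here is a minimal back-of-envelope sketch in Python (not from the paper; all layer dimensions are assumed, loosely modeled on a VGG-style convolution) that counts the multiply-accumulate (MAC) operations and parameter memory of a single convolutional layer:

    # Illustrative cost of one convolutional layer (all sizes assumed):
    # a 3x3 convolution mapping 128 input channels to 256 output channels
    # on a 56x56 output feature map.
    c_in, c_out = 128, 256          # input/output channels (assumed)
    k = 3                           # kernel height/width (assumed)
    h, w = 56, 56                   # output feature-map size (assumed)

    # Each output element costs c_in * k * k multiply-accumulates.
    macs = c_out * h * w * c_in * k * k
    params = c_out * c_in * k * k + c_out      # weights + biases

    print(f"MACs per forward pass: {macs:,}")       # ~0.92 billion
    print(f"Parameters: {params:,}")                # ~295 thousand
    print(f"Weight memory (fp32): {params * 4 / 2**20:.2f} MiB")  # ~1.13 MiB

A single such layer already needs roughly a billion MACs per input, and a full network stacks dozens of them, which is why general-purpose DSP processors run into the memory and throughput limits the review describes.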