Weight initialization based‐rectified linear unit activation function to improve the performance of a convolutional neural network model

B Olimov, S Karshiev, E Jang, S Din… - Concurrency and Computation: Practice and Experience, 2021 - Wiley Online Library
Abstract
Convolutional Neural Networks (CNNs) have made a great impact on attaining state‐of‐the‐art results in image classification tasks. Weight initialization is one of the fundamental steps in formulating a CNN model, and it can determine the failure or success of the model. In this paper, we study the mathematical background of different weight initialization strategies to determine which performs best. For smooth training, we expect the activations of each layer of the CNN model to follow the standard normal distribution, with a mean of 0 and a standard deviation of 1; this prevents gradients from vanishing and leads to smoother training. However, we found that even with an appropriate weight initialization technique, the regular Rectified Linear Unit (ReLU) activation function increases the mean activation value. We address this issue by proposing the weight initialization based (WIB)‐ReLU activation function. The proposed method results in smoother training. Moreover, our experiments show that WIB‐ReLU outperforms the ReLU, Leaky ReLU, parametric ReLU, and exponential linear unit activation functions, yielding up to a 20% decrease in loss value and a 5% increase in accuracy score on both the Fashion‐MNIST and CIFAR‐10 datasets.
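The abstract's key observation, that ReLU raises the mean activation above zero even under a suitable weight initialization, is straightforward to verify numerically. The sketch below is a minimal NumPy toy, not the paper's code: it stacks He-initialized fully connected layers (the paper's experiments use CNNs, and the WIB-ReLU formula itself is not given in this abstract) and prints per-layer activation statistics, showing the pre-activation mean staying near 0 while the post-ReLU mean drifts positive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): fully connected layers of equal
# width; the paper's actual models are CNNs on Fashion-MNIST and CIFAR-10.
n_layers, width, batch = 10, 512, 1024

x = rng.standard_normal((batch, width))  # inputs drawn from N(0, 1)

for layer in range(1, n_layers + 1):
    # He initialization: W ~ N(0, 2 / fan_in), the standard choice for ReLU.
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
    pre = x @ W               # pre-activations: mean stays near 0
    x = np.maximum(pre, 0.0)  # ReLU zeroes the negative half, so the
                              # post-activation mean rises above 0 even
                              # though the scale neither blows up nor vanishes
    print(f"layer {layer:2d}: pre-act mean {pre.mean():+.3f}, "
          f"post-act mean {x.mean():+.3f}, post-act std {x.std():.3f}")
```

Running this shows the post-ReLU mean settling around 0.5 to 0.6 while the pre-activation mean stays near zero at every depth, which is precisely the drift away from the standard normal distribution that the proposed WIB‐ReLU activation is designed to counteract.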