Convolutional Neural Networks (CNNs) have achieved state‐of‐the‐art results in image classification tasks. Weight initialization is one of the fundamental steps in formulating a CNN model, and it can determine the success or failure of training. In this paper, we study the mathematical background of different weight initialization strategies to identify the one with the best performance. For smooth training, the activations of each layer of the CNN should approximately follow a standard normal distribution with mean 0 and standard deviation 1, which prevents gradients from vanishing. However, we find that even with an appropriate weight initialization technique, the standard Rectified Linear Unit (ReLU) activation function shifts the activation mean upward. We address this issue by proposing the weight initialization based (WIB)‐ReLU activation function, which results in smoother training. Experiments show that WIB‐ReLU outperforms the ReLU, Leaky ReLU, Parametric ReLU, and Exponential Linear Unit activation functions, yielding up to a 20% decrease in loss and a 5% increase in accuracy on both the Fashion‐MNIST and CIFAR‐10 datasets.
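
To make the mean-shift claim concrete, the short NumPy sketch below (an illustration only, not the paper's code; the layer width, sample count, and scaling are arbitrary assumptions) draws pre-activations that are approximately standard normal and shows that applying a plain ReLU pushes the activation mean well above zero, which is the effect the proposed WIB-ReLU is designed to counteract.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# even when weights are scaled so pre-activations are close to N(0, 1),
# a plain ReLU produces outputs with a clearly positive mean.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_samples = 512, 512, 10_000  # arbitrary layer sizes

x = rng.standard_normal((n_samples, n_in))                # inputs ~ N(0, 1)
W = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)    # scaling keeps pre-activations ~ N(0, 1)

pre_act = x @ W                    # pre-activations: mean ~ 0, std ~ 1
post_act = np.maximum(pre_act, 0)  # plain ReLU keeps only the positive part

print("pre-ReLU  mean/std:", pre_act.mean(), pre_act.std())
print("post-ReLU mean/std:", post_act.mean(), post_act.std())
# For standard-normal inputs, E[max(z, 0)] = 1/sqrt(2*pi) ~ 0.40,
# so the post-ReLU activations are no longer zero-centered.
```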