Author
Minhyeok Lee
Publication date
2023
Journal
Journal of Mathematics
Volume
2023
Issue
1
Pages
4229924
Publisher
Hindawi
Description
Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian error linear unit (GELU) activation function has emerged as a dominant method, surpassing traditional functions such as the rectified linear unit (ReLU) in various applications. This study presents a rigorous mathematical investigation of the GELU activation function, exploring its differentiability, boundedness, stationarity, and smoothness properties in detail. In addition, we conduct an extensive experimental comparison of the GELU function against a broad range of alternative activation functions, utilizing a residual convolutional network trained on the CIFAR-10, CIFAR-100, and STL-10 datasets as the empirical testbed. Our results demonstrate the superior performance of GELU compared to other …
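For reference, below is a minimal sketch of the activation function the abstract discusses, assuming the standard definition GELU(x) = x · Φ(x) with Φ the standard normal CDF, together with the widely used tanh approximation from Hendrycks & Gimpel (2016) and ReLU for comparison. The function names (gelu, gelu_tanh, relu) are illustrative; this is not the paper's own experimental code.

import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF (via erf)."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Common tanh approximation of GELU (Hendrycks & Gimpel, 2016)."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def relu(x: float) -> float:
    """ReLU baseline: max(0, x)."""
    return max(0.0, x)

if __name__ == "__main__":
    # Compare the three activations at a few sample points.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  gelu_tanh={gelu_tanh(x):+.4f}  relu={relu(x):+.4f}")

Unlike ReLU, GELU is smooth everywhere and takes small negative values for negative inputs, which is the kind of differentiability and smoothness behavior the study analyzes.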