Authors
Fangqi Zhu, Qilian Liang
Publication date
2021/6/15
Journal
IEEE Internet of Things Journal
Volume
9
Issue
3
Pages
1962-1975
Publisher
IEEE
Description
The parametrization of recurrent neural networks (RNNs) to solve the gradient vanishing and exploding problem is critical for sequential learning. The reason lies in the eigenvalues of the gradient of the loss function with respect to the recurrent weight matrix. To control these eigenvalues, an orthogonal constraint is imposed on the recurrent weight matrix. In this article, we analyze the design mechanism of the decomposition methods of three orthogonal-constrained RNNs (OCRNNs) and derive their corresponding training algorithms. We compare the performance of the four OCRNNs and the standard long short-term memory (LSTM) network on the synthetic baseline copying and adding tasks. We find that the coordinate-descent-based iteration of the two OCRNNs proposed by us can achieve comparable or better testing error and convergence speed than the remaining models. The above two OCRNN models …
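A note on the mechanism the abstract refers to: imposing orthogonality on the recurrent weight matrix keeps its eigenvalues on the unit circle, so repeated multiplication during backpropagation through time neither shrinks nor blows up gradients. The Python sketch below is a minimal, hypothetical illustration using one standard parametrization (the matrix exponential of a skew-symmetric matrix); it is not taken from the paper and need not match any of the decomposition methods the authors analyze. All names and dimensions are illustrative.

    import numpy as np
    from scipy.linalg import expm

    # Minimal sketch, assuming the common exponential-map parametrization:
    # expm(A) is orthogonal whenever A is skew-symmetric (A^T = -A).
    rng = np.random.default_rng(0)
    n = 6                                 # hidden-state dimension (illustrative)
    B = rng.standard_normal((n, n))
    A = B - B.T                           # skew-symmetric generator
    W = expm(A)                           # orthogonal recurrent weight matrix

    # Orthogonality places every eigenvalue of W on the unit circle,
    # which is the property the abstract credits with controlling
    # gradient vanishing/exploding.
    print(np.allclose(W.T @ W, np.eye(n)))   # True
    print(np.abs(np.linalg.eigvals(W)))      # all entries ~1.0

In practice, training updates the unconstrained generator A (or uses an iterative scheme such as the coordinate-descent updates the abstract mentions) so that W stays exactly orthogonal at every step.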
Total citations