A Comparison of the Effects of Optimization Learning Rates using a Modified Learning Process for Generalized Neural Network

  • Yoon, Yeochang (Department of Information Security, Woosuk University)
  • Lee, Sungduck (Department of Information and Statistics, Chungbuk National University)
  • Received : 2013.10.01
  • Accepted : 2013.10.16
  • Published : 2013.10.31

Abstract

We propose a modified learning process for generalized neural networks based on the learning algorithm of Liu et al. (2001). We examine the effects of initial weights, training results, and learning errors under the modified learning process. We employ an incremental training procedure in which training patterns are learned systematically. The algorithm starts with a single training pattern and a single hidden-layer neuron. During training, we attempt to escape from local minima using a weight-scaling technique, and we allow the network to grow by adding a hidden-layer neuron only after several consecutive failed escape attempts. The optimization procedure tends to bring the network within the error tolerance with little or no further training after a hidden-layer neuron is added. Simulation results with suitable initial weights indicate that the proposed constructive algorithm obtains neural networks very close to minimal structures and that convergence to a solution in neural network training can be guaranteed. We tested the algorithm extensively with small training sets.
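
The constructive loop the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the network shape, the scaling factor 1.2, the failure threshold of 3, the learning rate, and the XOR toy set are all choices made only for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_epoch(X, y, W1, b1, W2, b2, lr):
    """One epoch of batch gradient descent on squared error (in-place updates)."""
    H = sigmoid(X @ W1 + b1)                 # hidden activations
    out = sigmoid(H @ W2 + b2)               # network output
    err = out - y
    d_out = err * out * (1.0 - out)          # output-layer delta
    d_hid = (d_out @ W2.T) * H * (1.0 - H)   # hidden-layer delta
    W2 -= lr * H.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid; b1 -= lr * d_hid.sum(axis=0)
    return 0.5 * np.sum(err ** 2)

# XOR as a small toy training set (the paper reports tests on small sets).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

tol, lr = 1e-3, 0.5
W1 = rng.uniform(-0.5, 0.5, (2, 1)); b1 = np.zeros(1)   # one hidden neuron
W2 = rng.uniform(-0.5, 0.5, (1, 1)); b2 = np.zeros(1)

fails, prev = 0, np.inf
for epoch in range(50000):
    loss = train_epoch(X, y, W1, b1, W2, b2, lr)
    if loss < tol:
        break                                 # error tolerance reached
    if prev - loss < 1e-9:                    # stalled: likely a local minimum
        W1 *= 1.2; W2 *= 1.2                  # weight-scaling escape attempt
        fails += 1
        if fails >= 3:                        # several consecutive failures:
            W1 = np.hstack([W1, rng.uniform(-0.5, 0.5, (2, 1))])  # grow by one
            b1 = np.append(b1, 0.0)                               # hidden neuron
            W2 = np.vstack([W2, rng.uniform(-0.5, 0.5, (1, 1))])
            fails = 0
    else:
        fails = 0
    prev = loss

print(f"hidden neurons: {W1.shape[1]}, final loss: {loss:.5f}")
```

Because the network grows only after repeated failed escape attempts, the final hidden-layer size tends toward the minimal structure the abstract mentions.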

In this study, we propose a modified learning process that obtains a generalized network by combining the learning algorithm of Liu et al., the initial-weight range setting of Wu and Zhang, and the results of Gunaseeli and Karthikeyan on initial weights, and we compare the training efficiency of the modified learning process under optimized learning rates. Training with the proposed algorithm begins with the simplest training pattern and a single hidden layer. When training converges to a local minimum, the problem is addressed by adjusting the weight range; if escape from the local minimum remains difficult, hidden nodes are added one at a time as training continues. At each step, the initial weights of the newly added node are chosen by an optimization procedure based on quadratic programming. This procedure can satisfy the given error tolerance without simply increasing the number of training iterations in the network that results from adding a hidden node. Applying the existing results on initial weights within the modified algorithm improves convergence in neural network training and yields a generalized network with a near-minimal simple structure. Through simulations that vary these learning rates, we compare training efficiency with previous results and suggest directions for future research.
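
The quadratic-programming step mentioned above has a simple prototypical form: with the existing weights frozen and a linear output layer assumed, the output weights that minimize the squared training error solve a small positive-definite quadratic problem with a closed-form solution. The sketch below illustrates that idea; the function name, the ridge term, and the linear-output assumption are ours for illustration, not the authors' exact procedure.

```python
import numpy as np

def optimal_output_weights(H, y, ridge=1e-6):
    """Closed-form minimizer of ||H w - y||^2 + ridge * ||w||^2.

    H : (m, n) hidden activations, including the newly added node
    y : (m, 1) training targets, assuming a linear output layer
    """
    n = H.shape[1]
    A = H.T @ H + ridge * np.eye(n)   # positive-definite normal matrix
    return np.linalg.solve(A, H.T @ y)
```

Because the new output weights come from solving this small system directly rather than from further gradient steps, the grown network can meet the error tolerance with little or no additional training, which is the behavior both abstracts describe.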

References

  1. Al-Shareef, A. J. and Abbod, M. F. (2010). Neural Network Initial Weights Optimisation, 12th International Conference on Computer Modelling and Simulation, 57-61.
  2. Anthony, M. and Bartlett, P. L. (2009). Neural Network Learning: Theoretical Foundations, Cambridge University Press.
  3. Diotalevi, F. and Valle, M. (2001). Weight Perturbation Learning Algorithm with Local Learning Rate Adaptation for the Classification of Remote-Sensing Images, Proceedings of European Symposium on Artificial Neural Networks, 217-222.
  4. Framling, K. (2004). Scaled Gradient Descent Learning Rate: Reinforcement Learning with Light-seeking Robot, Proceedings of International Conference on Informatics in Control, Automation and Robotics, 3-11.
  5. Fukuoka, Y., Matsuki, H., Minamitani, H. and Ishida, A. (1998). A modified back-propagation method to avoid false local minima, Neural Networks, 11, 1059-1072. https://doi.org/10.1016/S0893-6080(98)00087-2
  6. Gunaseeli, N. and Karthikeyan, M. (2007). A Constructive Approach of Modified Standard Backpropagation Algorithm with Optimum Initialization for Feedforward Neural Networks, International Conference on Computational Intelligence and Multimedia Applications, 325-331.
  7. Haykin, S. (2010). Neural Networks and Learning Machines, 3rd Ed., PHI Learning Private Limited.
  8. Liu, D., Chang, T. S. and Zhang, Y. (2001). A New Learning Algorithm for Feedforward Neural Networks, Proceedings of the IEEE International Symposium on Intelligent Control, 39-44.
  9. Maasoumi, E., Khotanzad, A. and Abaye, A. (1994). Artificial neural networks for some macroeconomic series: A first report, Econometric Reviews, 13, 105-122. https://doi.org/10.1080/07474939408800276
  10. Maghami, P. G. and Sparks, D. W. (2000). Design of neural networks for fast convergence and accuracy: Dynamics and control, IEEE Trans. Neural Networks, 11, 113-123. https://doi.org/10.1109/72.822515
  11. Parekh, R., Yang, J. and Honavar, V. (2000). Constructive neural-network learning algorithms for pattern classification, IEEE Trans. Neural Networks, 11, 436-451. https://doi.org/10.1109/72.839013
  12. RoyChowdhury, P., Singh, Y. P. and Chansarkar, R. (1999). Dynamic tunneling technique for efficient training of multilayer perceptrons, IEEE Trans. Neural Networks, 10, 48-55. https://doi.org/10.1109/72.737492
  13. Sharda, R. and Patil, R. B. (1990). Neural Networks as Forecasting Experts: An Empirical Test, Proceedings of the IJCNN Meeting, 491-494.
  14. White, H. (1988). Economic Prediction Using Neural Networks: The Case of IBM Stock Prices, Proceedings of the Second Annual IEEE Conference on Neural Networks, 2, 451-458.
  15. Wu, Y. and Zhang, L. (2002). The Effect of Initial Weight, Learning Rate and Regularization on Generalization Performance and Efficiency, Proceedings of ICSP, 1191-1194.
  16. Yam, J. Y. F. and Chow, T. W. S. (2000). A Weight Initialization Method for Improving Training Speed in Feedforward Neural Network, Neurocomputing, 30, 219-232.
  17. Zhang, Y. and Jiang, Q. (2010). An Improved Initial Method for Clustering High-Dimensional Data, 2nd International Workshop on Database Technology and Applications, 1-4.