A Function Approximation Method for Q-learning of Reinforcement Learning

Journal of KIISE:Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 31 Issue 11 / Pages 1431-1438 / 2004 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

강화학습의 Q-learning을 위한 함수근사 방법

이영아 (Department of Computer Engineering, Kyung Hee University);
정태충 (Department of Computer Engineering, Kyung Hee University)

Published : 2004.11.01

Abstract

Reinforcement learning learns policies for accomplishing a task's goal from experience gained through interaction between an agent and its environment. Q-learning, a basic algorithm of reinforcement learning, suffers from the curse of dimensionality and from slow learning in the initial stage of learning. To solve these problems, new function approximation methods suited to reinforcement learning should be studied. In this paper, we propose the Fuzzy Q-Map algorithm, which is based on online fuzzy clustering. Fuzzy Q-Map is a function approximation method suited to reinforcement learning: it can learn online and can express the uncertainty of the environment. We applied Fuzzy Q-Map to the mountain car problem, and the results show that learning is accelerated in the initial stage of learning.

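Reinforcement learning learns a policy for achieving a goal through online interaction with the environment. To accelerate Q-learning, the basic algorithm of reinforcement learning, a function approximation method is needed that can handle very large state spaces (the curse of dimensionality) and that suits the characteristics of reinforcement learning. To address these problems, this paper proposes Fuzzy Q-Map, which is based on online fuzzy clustering. Fuzzy Q-Map is a function approximation method suited to reinforcement learning that can learn online and can express the uncertainty of the environment. We applied Fuzzy Q-Map to the mountain car problem and showed that learning is accelerated in the early stage of learning.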
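
The abstract does not give the update rules of Fuzzy Q-Map, so the following is only a minimal sketch of the general idea it names: approximating Q-values over a continuous state space with fuzzy memberships in a small set of cluster prototypes. It assumes fuzzy c-means style memberships, a fixed set of prototypes (the online clustering step that would move or add prototypes is omitted), and a standard Q-learning target. The names `memberships`, `q_values`, and `q_update`, and the parameters `m`, `alpha`, and `gamma`, are illustrative choices and are not taken from the paper.

```python
import math


def memberships(state, prototypes, m=2.0):
    """Fuzzy c-means style membership of `state` in each prototype (sums to 1)."""
    dists = [math.dist(state, p) for p in prototypes]
    # A state that coincides with a prototype belongs to it fully.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(prototypes))]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def q_values(state, prototypes, q_table):
    """Approximate Q(state, a) as the membership-weighted mean of prototype Q-values."""
    mu = memberships(state, prototypes)
    n_actions = len(q_table[0])
    return [
        sum(mu[i] * q_table[i][a] for i in range(len(prototypes)))
        for a in range(n_actions)
    ]


def q_update(state, action, reward, next_state, prototypes, q_table,
             alpha=0.5, gamma=0.9):
    """One Q-learning step; the TD error is shared among prototypes by membership."""
    mu = memberships(state, prototypes)
    target = reward + gamma * max(q_values(next_state, prototypes, q_table))
    td_error = target - q_values(state, prototypes, q_table)[action]
    for i in range(len(prototypes)):
        q_table[i][action] += alpha * mu[i] * td_error
    return td_error


# Hypothetical usage: two prototypes in a 2-D state space, two actions.
prototypes = [(0.0, 0.0), (1.0, 0.0)]
q_table = [[0.0, 0.0], [0.0, 0.0]]
td = q_update((0.5, 0.0), 0, 1.0, (0.5, 0.0), prototypes, q_table)
```

Because every prototype receives a share of each update, one experience changes the value estimate for a whole neighborhood of states, which is one plausible reason such a scheme could speed up early learning. Whether the paper's method works this way is an assumption here.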
