Labeling Q-Learning for Maze Problems with Partially Observable States

  • Lee, Hae-Yeon (Dept. of Electrical and Communication Engineering, Graduate School of Engineering, Tohoku Univ.) ;
  • Hiroyuki Kamaya (Dept. of Electrical Engineering, Hachinohe National College of Technology) ;
  • Kenichi Abe (Dept. of Electrical and Communication Engineering, Graduate School of Engineering, Tohoku Univ.)
  • Published : 2000.10.01

Abstract

Recently, Reinforcement Learning (RL) methods have been used for learning problems in Partially Observable Markov Decision Process (POMDP) environments. Conventional RL methods, however, have limited applicability to POMDPs. To overcome the partial observability, several algorithms have been proposed [5], [7]. The aim of this paper is to extend our previous algorithm for POMDPs, called Labeling Q-learning (LQ-learning), which augments the agent's incomplete perception with labels. Namely, in LQ-learning, the agent perceives the current state as a pair of an observation and its label, so that it can distinguish more precisely between states that yield the same observation. Labeling is carried out by a hash-like function, which we call the Labeling Function (LF). Numerous labeling functions can be considered; in this paper, we introduce several labeling functions based on only the two or three most recent observations. We briefly introduce the basic idea of LQ-learning, apply it to maze problems (simple POMDP environments), and demonstrate its effectiveness with empirical results that compare favorably with conventional RL algorithms.
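To make the idea concrete, the following is a minimal sketch (ours, not the authors' implementation) of how such a labeled Q-learner might be structured: the agent's effective state is the pair (observation, label), where the label comes from a hash-like function over the most recent observations. The class name, number of label buckets, and learning parameters are all illustrative assumptions.

```python
import random
from collections import defaultdict, deque

class LQAgent:
    """Illustrative sketch of a Labeling Q-learning agent.

    The effective state is the pair (observation, label); the label is
    produced by a hash-like labeling function (LF) over the two most
    recent observations. Parameter values are assumptions for the sketch.
    """

    def __init__(self, actions, history_len=2, n_labels=16,
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions
        self.n_labels = n_labels
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.history = deque(maxlen=history_len)  # immediate past observations
        self.q = defaultdict(float)               # Q[((obs, label), action)]

    def label(self, obs):
        """Hash-like labeling function over the short observation history."""
        self.history.append(obs)
        # Bucket the recent history into one of n_labels labels
        # (stable within a run; Python's hash is salted across runs).
        return hash(tuple(self.history)) % self.n_labels

    def act(self, state):
        """Epsilon-greedy action selection over the labeled state."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update applied to labeled states."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In use, two maze cells that produce the same observation (e.g., the same local wall pattern) can receive different labels if the agent arrived via different observation histories, so their Q-values are learned separately rather than conflated.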

Keywords