MDP Modeling for the Prediction of Agent Movement in Limited Space

MDP Model for Predicting Agent Behavior in an Enclosed Space

  • Received : 2015.05.08
  • Accepted : 2015.06.09
  • Published : 2015.08.31


This paper addresses the problem of predicting the movement of an agent in an enclosed space using a Markov Decision Process (MDP). Recent research on optimal path finding has largely been confined to deriving the shortest path with deterministic algorithms such as $A^*$ or Dijkstra. In contrast, this study takes a stochastic approach, predicting the path an agent chooses over time while attempting to escape a confined space. Building the MDP reward structure from GIS (Geographic Information System) data is what makes the model feasible. Applied to the route of a previous armed guerilla infiltration, the model demonstrated high predictive accuracy.
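The prediction mechanism the abstract describes, an MDP whose rewards are derived from terrain data and which is solved to find the route an agent is likely to take, can be sketched as follows. This is a minimal illustrative toy, not the paper's actual model: the 4x4 grid, the reward values standing in for GIS-derived terrain costs, and the escape point are all hypothetical assumptions.

```python
# Minimal sketch (illustrative only): value iteration on a toy grid MDP
# whose per-cell rewards stand in for terrain costs derived from GIS data.
import numpy as np

N = 4                       # grid is N x N; a state is a cell (row, col)
GOAL = (3, 3)               # hypothetical escape point on the boundary
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

# Illustrative terrain rewards: ordinary cells cost -1 per step, one
# "steep" band (row 1) costs -5, and reaching the goal yields +10.
reward = -1.0 * np.ones((N, N))
reward[1, :] = -5.0
reward[GOAL] = 10.0

def clamp(r, c):
    """Moves off the grid leave the agent in place."""
    return min(max(r, 0), N - 1), min(max(c, 0), N - 1)

def value_iteration(gamma=0.95, tol=1e-6):
    """Solve the MDP: V(s) = max_a [ R(s') + gamma * V(s') ]."""
    V = np.zeros((N, N))
    while True:
        V_new = V.copy()
        for r in range(N):
            for c in range(N):
                if (r, c) == GOAL:
                    continue        # goal is absorbing, V(goal) stays 0
                V_new[r, c] = max(
                    reward[clamp(r + dr, c + dc)]
                    + gamma * V[clamp(r + dr, c + dc)]
                    for dr, dc in ACTIONS
                )
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def greedy_path(start, V, gamma=0.95, max_steps=20):
    """Follow the greedy policy from `start`; this is the predicted route."""
    path, (r, c) = [start], start
    for _ in range(max_steps):
        if (r, c) == GOAL:
            break
        r, c = max(
            (clamp(r + dr, c + dc) for dr, dc in ACTIONS),
            key=lambda s: reward[s] + gamma * V[s],
        )
        path.append((r, c))
    return path

V = value_iteration()
path = greedy_path((0, 0), V)
```

Unlike a deterministic shortest-path query, the converged value function yields a predicted route from *any* starting cell at once, which is what makes this framing suit prediction rather than planning.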


Keywords: MDP; Optimal Path; Deterministic Algorithm; Stochastic; GIS

