Path Planning of Unmanned Aerial Vehicle based Reinforcement Learning using Deep Q Network under Simulated Environment

Korean title (translated): Reinforcement Learning-Based Path Planning for Unmanned Aerial Vehicles Using DQN in a Simulated Environment

  • 이근형 (Department of Computer Science, Yonsei University)
  • 김신덕 (Department of Computer Science, Yonsei University)
  • Received : 2017.09.21
  • Accepted : 2017.09.23
  • Published : 2017.09.30

Abstract

In this paper, we present a path planning method for the autonomous flight of unmanned aerial vehicles (UAVs) based on reinforcement learning in a simulated environment. We design a simulator for UAV reinforcement learning and implement an interface that makes the simulator compatible with a Deep Q-Network (DQN). Training is carried out through the simulator and the DQN using Q-learning, a reinforcement learning algorithm. We verify the performance of the combined DQN-simulator system experimentally, evaluate the learning results, and suggest a path planning strategy based on reinforcement learning.
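
As an illustration of the Q-learning update that the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical 4x4 grid with fixed obstacles in place of the UAV simulator, and it stores action values in a table, whereas a DQN would approximate them with a neural network. The grid layout, reward values, hyperparameters, and all function names are assumptions made for this example.

```python
import random

# Hypothetical 4x4 grid standing in for the simulator (not the authors' environment).
SIZE = 4
START, GOAL = (0, 0), (3, 3)
OBSTACLES = {(1, 1), (2, 1), (1, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return state, -1.0, False  # blocked move: stay in place, small penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True  # assumed goal reward
    return (r, c), -0.1, False     # assumed per-step cost


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(SIZE) for c in range(SIZE) for a in range(n)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(n)  # explore
            else:
                a = max(range(n), key=lambda i: q[(state, i)])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best value of the next state.
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n))
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Running the script prints the sequence of grid cells visited by the greedy policy after training. In a DQN setting, the table `q` would be replaced by a network that maps a state observation to one value per action, with the same update target used as the regression label.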
