
Autonomous and Asynchronous Triggered Agent Exploratory Path-planning Via a Terrain Clutter-index using Reinforcement Learning

  • Kim, Min-Suk (Department of Human Intelligence and Robot Engineering, Sangmyung University) ;
  • Kim, Hwankuk (Department of Information Security Engineering, Sangmyung University)
  • Received : 2022.05.31
  • Accepted : 2022.08.17
  • Published : 2022.09.30

Abstract

An intelligent distributed multi-agent system (IDMS) using reinforcement learning (RL) poses a challenging and intricate problem: single or multiple agents must reach their specific goals (sub-goals and a final goal) by moving through a complex, cluttered environment. The IDMS environment provides a cumulative optimal reward for each action according to the policy of the learning process. Most actions involve interacting with the given IDMS environment, which therefore defines the following elements: a starting agent state, multiple obstacles, agent goals, and a clutter index. The environment's reward also feeds back to the RL-based agents, which can move randomly or intelligently to reach their respective goals, thereby improving learning performance. We extend the intelligent multi-agent systems of our previous works in two directions: (a) a proposed environment clutter index for agent sub-goal selection, with an analysis of its effect, and (b) a newly proposed RL reward scheme based on the environment clutter index, used to identify and analyze the prerequisites and conditions for improving the overall system.
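As a rough illustration of the clutter-index-based reward scheme in (b), the sketch below runs tabular Q-learning on a small grid world whose reward penalizes movement through cluttered cells. The grid, the neighborhood-based clutter measure, and all weights (step cost, clutter weight, learning rate) are illustrative assumptions, not the paper's exact formulation.

```python
import random

# 0 = free cell, 1 = obstacle (an illustrative 4x4 environment)
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def clutter_index(grid, cell, radius=1):
    """Fraction of obstacle cells among the in-bounds neighbors of `cell`."""
    r, c = cell
    rows, cols = len(grid), len(grid[0])
    neigh = [(r + dr, c + dc)
             for dr in range(-radius, radius + 1)
             for dc in range(-radius, radius + 1)
             if (dr, dc) != (0, 0)
             and 0 <= r + dr < rows and 0 <= c + dc < cols]
    return sum(grid[i][j] for i, j in neigh) / len(neigh)

def reward(grid, cell, goal):
    """Clutter-aware reward: the goal pays +1; any other cell pays a small
    step cost plus a penalty proportional to its local clutter index."""
    if cell == goal:
        return 1.0
    return -0.04 - 0.5 * clutter_index(grid, cell)

def step(grid, cell, action):
    """Move only if the target cell is inside the grid and obstacle-free."""
    r, c = cell[0] + action[0], cell[1] + action[1]
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
        return (r, c)
    return cell  # bumped into a wall or obstacle: stay in place

def q_learn(grid, start, goal, episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular epsilon-greedy Q-learning using the clutter-aware reward."""
    q, rng = {}, random.Random(0)
    for _ in range(episodes):
        cell = start
        for _ in range(100):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((cell, i), 0.0))
            nxt = step(grid, cell, ACTIONS[a])
            target = reward(grid, nxt, goal) + gamma * max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            q[(cell, a)] = q.get((cell, a), 0.0) + alpha * (target - q.get((cell, a), 0.0))
            if nxt == goal:
                break
            cell = nxt
    return q

def greedy_path(q, grid, start, goal, max_steps=20):
    """Follow the learned greedy policy from start toward goal."""
    cell, path = start, [start]
    for _ in range(max_steps):
        if cell == goal:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q.get((cell, i), 0.0))
        cell = step(grid, cell, ACTIONS[a])
        path.append(cell)
    return path
```

Because cluttered cells carry an extra negative reward, increasing the clutter weight in `reward` pushes the learned paths further away from obstacle clusters, which is the qualitative behavior the clutter-index reward scheme aims at.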

Acknowledgement

This research was funded by a 2021 research grant from Sangmyung University.

References

  1. M. -S. Kim, "A study of collaborative and distributed multi-agent path-planning using reinforcement learning," Journal of The Korea Society of Computer and Information, vol. 26, no. 3, pp. 9-17, Mar. 2021. DOI: 10.9708/jksci.2021.26.03.009.
  2. D. B. Megherbi, M. Kim, and M. Madera, "A study of collaborative distributed multi-goal and multi-agent based systems for large critical key infrastructures and resources (CKIR) dynamic monitoring and surveillance," in IEEE International Conference on Technologies for Homeland Security, Waltham, MA, USA, pp. 687-692, 2013. DOI: 10.1109/THS.2013.6699087.
  3. Y. Bicen and F. Aras, "Intelligent condition monitoring platform combined with multi-agent approach for complex systems," in 2014 IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems Proceedings, Naples, Italy, pp. 1-4, 2014. DOI: 10.1109/EESMS.2014.6923283.
  4. M. Saim, S. Ghapani, W. Ren, K. Munawar, and U. M. Al-Saggaf, "Distributed average tracking in multi-agent coordination: extensions and experiments," IEEE Systems Journal, vol. 12, no. 3, pp. 2428-2436, Apr. 2018. DOI: 10.1109/JSYST.2017.2685465.
  5. D. B. Megherbi and V. Malaya, "A hybrid cognitive/reactive intelligent agent autonomous path planning technique in a networked-distributed unstructured environment for reinforcement learning," The Journal of Supercomputing, vol. 59, no. 3, pp. 1188-1217, Dec. 2012. DOI: 10.1007/s11227-010-0510-3.
  6. Z. Li, L. Gao, W. Chen, and Y. Xu, "Distributed adaptive cooperative tracking of uncertain nonlinear fractional-order multi-agent systems," IEEE/CAA Journal of Automatica Sinica, vol. 7, no. 1, pp. 292-300, Jan. 2020. DOI: 10.1109/JAS.2019.1911858.
  7. D. B. Megherbi and M. Kim, "A hybrid P2P and master-slave cooperative distributed multi-agent reinforcement learning system with asynchronously triggered exploratory trials and clutter-index-based selected sub-goals," in 2016 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), Budapest, Hungary, pp. 1-6, 2016. DOI: 10.1109/CIVEMSA.2016.7524249.
  8. H. Lee and S. W. Cha, "Reinforcement learning based on equivalent consumption minimization strategy for optimal control of hybrid electric vehicles," IEEE Access, vol. 9, pp. 860-871, 2021. DOI: 10.1109/ACCESS.2020.3047497.
  9. K. Zhang, Z. Yang, and T. Basar, "Multi-agent reinforcement learning: a selective overview of theories and algorithms," in Handbook of Reinforcement Learning and Control (Studies in Systems, Decision and Control, vol. 325), Springer, Cham, 2021.
  10. J. B. Kim, H. -K. Lim, C. -M. Kim, M. -S. Kim, Y. -G. Hong, and Y. -H. Han, "Imitation reinforcement learning-based remote rotary inverted pendulum control in OpenFlow network," IEEE Access, vol. 7, pp. 36682-36690, Mar. 2019. DOI: 10.1109/ACCESS.2019.2905621.
  11. S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 4th ed. Pearson, 2021.
  12. J. Blumenthal, D. B. Megherbi, and R. Lussier, "Unsupervised machine learning via Hidden Markov Models for accurate clustering of plant stress levels based on imaged chlorophyll fluorescence profiles & their rate of change in time," Computers and Electronics in Agriculture, vol. 174, Jul. 2020. DOI: 10.1016/j.compag.2019.105064.
  13. D. Xu and T. Ushio, "On stability of consensus control of discrete-time multi-agent systems by multiple pinning agents," IEEE Control Systems Letters, vol. 3, no. 4, pp. 1038-1043, Oct. 2019. DOI: 10.1109/LCSYS.2019.2920207.
  14. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. MIT Press, 2018.
  15. M. Madera and D. B. Megherbi, "An interconnected dynamical system composed of dynamics-based reinforcement learning agents in a distributed environment: A case study," in Proceedings of IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, Tianjin, China, pp. 63-68, 2012. DOI: 10.1109/CIMSA.2012.6269597.
  16. J. C. Bol and J. Leiby, "Status motives and agent-to-agent information sharing: how evolutionary psychology shapes agents' responses to control system design," AAA 2016 Management Accounting Section (MAS) Meeting Paper, Aug. 2015. DOI: 10.2139/ssrn.2645804.
  17. H. S. Al-Dayaa and D. B. Megherbi, "Reinforcement learning technique using agent state occurrence frequency with analysis of knowledge sharing on the agent's learning process in multi-agent environments," The Journal of Supercomputing, vol. 59, no. 1, pp. 526-547, Jun. 2010. DOI: 10.1007/s11227-010-0451-x.
  18. H. S. Al-Dayaa and D. B. Megherbi, "Towards a multiple-lookahead-levels reinforcement-learning technique and its implementation in integrated circuits," The Journal of Supercomputing, vol. 62, no. 1, pp. 588-615, Jan. 2012. DOI: 10.1007/s11227-011-0738-6.
  19. Y. Duan, N. Wang, and J. Wu, "Minimizing training time of distributed machine learning by reducing data communication," IEEE Transactions on Network Science and Engineering, vol. 8, no. 2, pp. 1802-1814, Apr. 2021. DOI: 10.1109/TNSE.2021.3073897.
  20. W. Wang, W. Zhang, C. Yan, and Y. Fang, "Distributed adaptive bipartite time-varying formation control for heterogeneous unknown nonlinear multi-agent systems," IEEE Access, vol. 9, pp. 52698-52707, Mar. 2021. DOI: 10.1109/ACCESS.2021.3068966.
  21. D. Bertsekas, "Multiagent reinforcement learning: Rollout and policy iteration," IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 2, pp. 249-272, Feb. 2021. DOI: 10.1109/JAS.2021.1003814.
  22. X. Gan, H. Guo, and Z. Li, "A new multi-agent reinforcement learning method based on evolving dynamic correlation matrix," IEEE Access, vol. 7, pp. 162127-162138, Oct. 2019. DOI: 10.1109/ACCESS.2019.2946848.
  23. D. B. Megherbi and M. Kim, "A collaborative distributed multi-agent reinforcement learning technique for dynamic agent shortest path planning via selected sub-goals in complex cluttered environments," in 2015 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision, Orlando, FL, USA, pp. 118-124, 2015. DOI: 10.1109/COGSIMA.2015.7108185.
  24. H. Qie, D. Shi, T. Shen, X. Xu, Y. Li, and L. Wang, "Joint optimization of multi-UAV target assignment and path planning based on multi-agent reinforcement learning," IEEE Access, vol. 7, pp. 146264-146272, Sep. 2019. DOI: 10.1109/ACCESS.2019.2943253.
  25. L. Canese, G. C. Cardarilli, L. D. Nunzio, R. Fazzolari, D. Giardino, M. Re, and S. Spano, "Multi-agent reinforcement learning: A review of challenges and applications," Applied Sciences, vol. 11, no. 11, p. 4948, May 2021. DOI: 10.3390/app11114948.
  26. S. Zheng and H. Liu, "Improved multi-agent deep deterministic policy gradient for path planning-based crowd simulation," IEEE Access, vol. 7, pp. 147755-147770, Oct. 2019. DOI: 10.1109/ACCESS.2019.2946659.
  27. B. Brito, M. Everett, J. P. How, and J. Alonso-Mora, "Where to go next: Learning a subgoal recommendation policy for navigation in dynamic environments," IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 4616-4623, Jul. 2021. DOI: 10.1109/LRA.2021.3068662.
  28. C. Liu, F. Zhu, Q. Liu, and Y. Fu, "Hierarchical reinforcement learning with automatic sub-goal identification," IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 10, pp. 1686-1696, Oct. 2021. DOI: 10.1109/JAS.2021.1004141.