An Artificial Intelligence Game Agent Using CNN Based Records Learning and Reinforcement Learning

Jeon, Youngjin; Cho, Youngwan

doi:10.7471/ikeee.2019.23.4.1187

Journal of IKEEE (전기전자학회논문지)

Volume 23 Issue 4 / Pages 1187-1194 / 2019 / 1226-7244 (pISSN) / 2288-243X (eISSN)

Institute of Korean Electrical and Electronics Engineers (한국전기전자학회)



CNN 기반 기보학습 및 강화학습을 이용한 인공지능 게임 에이전트

Jeon, Youngjin (Dept. of Computer Engineering, Seokyeong University) ;
Cho, Youngwan (Dept. of Computer Engineering, Seokyeong University)

전영진 ;
조영완

Received : 2019.11.20
Accepted : 2019.12.18
Published : 2019.12.31

https://doi.org/10.7471/ikeee.2019.23.4.1187

Abstract

This paper proposes a CNN architecture for the value function network of an artificial intelligence Othello game agent, together with a learning scheme based on reinforcement learning. The value function network is first built by training a CNN on the records of real games played by professional players, and its parameters are then refined by learning from self-play with a reinforcement learning algorithm. The CNN value function network was compared with an existing ANN by having two agents, one using each network, play against each other; the CNN agent won 69.7% of its games as black and 72.1% as white. After reinforcement learning was applied, the agent won 100% and 78% of its games against the ANN-based and CNN-based agents trained without reinforcement learning, respectively, which shows that its performance improved.
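The abstract does not name the reinforcement learning algorithm, so the following is only a minimal sketch, assuming a TD(0)-style update applied backward over one finished self-play game. The table `values`, the function `td0_update`, and the step size `alpha` are illustrative names that do not come from the paper; a lookup table stands in for the CNN value function network here purely to keep the example small.

```python
from collections import defaultdict


def td0_update(values, states, outcome, alpha=0.1):
    """Update value estimates from one finished self-play game.

    values  : dict mapping a hashable state to its estimated result for
              black, in [-1, 1] (assumed stand-in for the value network)
    states  : states visited during the game, in playing order
    outcome : final result from black's view (+1 win, 0 draw, -1 loss)
    alpha   : step size (assumed hyperparameter)
    """
    target = outcome  # the last position is pulled toward the game result
    for state in reversed(states):
        values[state] += alpha * (target - values[state])
        target = values[state]  # earlier positions bootstrap from later ones
    return values


# Usage: three positions from one game that black won.
values = defaultdict(float)
td0_update(values, ["s0", "s1", "s2"], outcome=1.0, alpha=0.5)
print(dict(values))
```

With a network in place of the table, one possible adaptation would be to use each `target` as the regression target for the corresponding position; whether the paper does this is not stated in the abstract.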

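To implement an artificial intelligence Othello game agent, this paper trains a CNN on the game records of real professional players, uses the trained network as the basis for evaluating board positions, and adopts a decision-making structure that finds the best move from the current state by minimax search. To improve on this, it proposes and applies a self-play learning method based on reinforcement learning theory. To evaluate the records learning, the proposed value evaluation network was compared with an existing ANN-based method by playing games between the two; the proposed agent won 69.7% of its games as black and 72.1% as white. In addition, the network obtained with the proposed reinforcement learning was compared with ANN-based and CNN-based value evaluation network agents trained without reinforcement learning, and it recorded winning rates of 100% and 78%, respectively, confirming that performance improved.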


