Controller Learning Method of Self-driving Bicycle Using State-of-the-art Deep Reinforcement Learning Algorithms

Choi, Seung-Yoon; Le, Tuyen Pham; Chung, Tae-Choong

doi:10.9708/jksci.2018.23.10.023

Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)

Volume 23, Issue 10 / Pages 23-31 / 2018 / 1598-849X (pISSN) / 2383-9945 (eISSN)

Korean Society of Computer Information (한국컴퓨터정보학회)


Choi, Seung-Yoon (Dept. of Computer Engineering, Kyung Hee University)
Le, Tuyen Pham (Dept. of Computer Engineering, Kyung Hee University)
Chung, Tae-Choong (Dept. of Computer Engineering, Kyung Hee University)

Received : 2018.09.17
Accepted : 2018.10.01
Published : 2018.10.31

https://doi.org/10.9708/jksci.2018.23.10.023


Abstract

Machine learning has been studied extensively in recent years, and reinforcement learning is one of its most active areas. In this study, we propose a bicycle controller based on DDPG (Deep Deterministic Policy Gradient), a recent deep reinforcement learning algorithm. We redefine the reward function for the bicycle dynamics and the neural network used to train the agent. Unlike the existing method, a controller trained with the proposed method keeps the bicycle from falling over and also reaches a given destination that is farther away. For performance evaluation, we tested the proposed algorithm in several settings: fixed speed, random speed, a specified target point, and no predetermined target. The results confirm that the proposed algorithm performs better than the conventional neural-network-based algorithms NAF and PPO.
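The abstract names DDPG but gives no implementation details. As a rough illustration only, the sketch below shows three standard ingredients of DDPG as described in "Continuous control with deep reinforcement learning" (Lillicrap et al.): Ornstein-Uhlenbeck exploration noise, the critic's bootstrapped target, and the soft target-network update. The names `OUNoise`, `soft_update`, `td_target`, and `noisy_action` are hypothetical, the weights are plain Python lists rather than network parameters, and the default hyperparameters are the ones reported in that paper, not values taken from this study.

```python
import math
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, theta=0.15, sigma=0.2, mu=0.0, dt=1.0, seed=None):
        # theta, sigma: defaults from the DDPG paper, assumed here for illustration.
        self.theta, self.sigma, self.mu, self.dt = theta, sigma, mu, dt
        self.rng = random.Random(seed)
        self.x = mu

    def reset(self):
        """Restart the process at its mean, e.g. at the start of an episode."""
        self.x = self.mu

    def sample(self):
        # Euler-Maruyama step:
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = self.theta * (self.mu - self.x) * self.dt
        dx += self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += dx
        return self.x


def noisy_action(policy_action, noise, low=-1.0, high=1.0):
    """Add exploration noise to a deterministic action and clip it to [low, high]."""
    return max(low, min(high, policy_action + noise.sample()))


def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')); no bootstrap at terminal states."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target, online, tau=0.001):
    """Move each target weight a fraction tau toward the matching online weight."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]
```

In a bicycle setting, `policy_action` would presumably be a continuous control output such as a steering command, and `done` would be set when the bicycle falls over. Both are assumptions about the environment, not details given in the abstract.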


