Research Trends on Deep Reinforcement Learning

심층 강화학습 기술 동향

  • Published : 2019.08.01

Abstract

Recent work on deep reinforcement learning (DRL) has considerably improved DRL algorithms in terms of performance, learning stability, and computational efficiency. The range of scenarios that DRL covers has also expanded greatly, and now includes partial observability; cooperation, competition, coexistence, and communication among multiple agents; multi-task learning; and decentralized intelligence. These advances have stimulated research on multi-agent reinforcement learning. DRL applications are also spreading beyond robotics, natural language processing, and computer vision into a wide array of fields such as finance, healthcare, chemistry, and even art. In this report, we briefly summarize various DRL techniques and research directions.

Acknowledgement

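Grant : Core Source Technologies for Distributed Intelligence in Hyper-Connected Spaces (초연결 공간의 분산 지능 핵심원천 기술)

Supported by : Electronics and Telecommunications Research Institute (ETRI)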
