Advanced LwF Model based on Knowledge Transfer in Continual Learning

  • Kang, Seok-Hoon (Department of Embedded Systems Engineering, Incheon National University) ;
  • Park, Seong-Hyeon (Department of Embedded Systems Engineering, Incheon National University)
  • Received : 2021.12.20
  • Accepted : 2022.02.03
  • Published : 2022.03.31

Abstract

To mitigate forgetting in continual learning, this paper proposes an improved LwF model based on a knowledge transfer method and demonstrates its effectiveness experimentally. When LwF is applied to continual learning, a change in the domain or complexity of the incoming data degrades the accuracy of previously learned tasks through forgetting, and the degradation tends to be worse when learning proceeds from complex data to simple data. To ensure that previous learning results are sufficiently transferred to the LwF model, we apply a knowledge transfer method and propose an algorithm for using it efficiently. As a result, forgetting was reduced by an average of about 8% compared with the original LwF, and the method remained effective as the sequence of learning tasks grew longer. In particular, when complex data was learned first, performance improved by more than 30% over LwF.
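The mechanism the abstract describes combines LwF's output-preserving objective [3] with soft-target knowledge distillation [9]. The minimal PyTorch sketch below illustrates that general idea only, not the paper's specific knowledge-transfer algorithm, which is not reproduced on this page: a frozen copy of the pre-task network supplies soft targets that regularize the shared network while it learns the new task. The two-headed `model`, the loss weight `lam`, and the temperature `T` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target distillation loss (Hinton et al. [9]).

    Matches the student's softened output distribution to the
    teacher's; T is the softmax temperature.
    """
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between the softened distributions; the T^2 factor
    # keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def lwf_step(model, old_model, x, y_new, optimizer, lam=1.0, T=2.0):
    """One LwF-style training step on a new task (illustrative sketch).

    Assumes `model(x)` returns (new_task_logits, old_task_logits) and
    `old_model` is a frozen copy of the network taken before training
    on the new task; its responses to the current batch serve as soft
    targets so that earlier knowledge is retained.
    """
    model.train()
    optimizer.zero_grad()
    new_logits, old_head_logits = model(x)
    with torch.no_grad():
        teacher_logits = old_model(x)  # "old" responses, fixed during the step
    loss = F.cross_entropy(new_logits, y_new) \
        + lam * distillation_loss(old_head_logits, teacher_logits, T)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Raising `lam` preserves more of the old tasks at the cost of plasticity on the new one; the paper's contribution lies in transferring the previous learning results into this setup more effectively than plain LwF.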

Keywords

References

  1. R. M. French, "Catastrophic forgetting in connectionist networks," Trends in Cognitive Sciences, vol. 3, no. 4, pp. 128-135, Apr. 1999. https://doi.org/10.1016/S1364-6613(99)01294-2
  2. G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter, "Continual lifelong learning with neural networks: A review," Neural Networks, vol. 113, pp. 54-71, May 2019. https://doi.org/10.1016/j.neunet.2019.01.012
  3. Z. Li and D. Hoiem, "Learning without forgetting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2935-2947, Dec. 2018. https://doi.org/10.1109/tpami.2017.2773081
  4. J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell, "Overcoming catastrophic forgetting in neural networks," Proceedings of the National Academy of Sciences, vol. 114, no. 13, pp. 3521-3526, Mar. 2017. https://doi.org/10.1073/pnas.1611835114
  5. F. Zenke, B. Poole, and S. Ganguli, "Continual learning through synaptic intelligence," in Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 3987-3995, 2017.
  6. Y. Hsu, Y. Liu, A. Ramasamy, and Z. Kira, "Re-evaluating continual learning scenarios: A categorization and case for strong baselines," arXiv preprint, arXiv:1810.12488, 2018.
  7. J. Yoon, E. Yang, J. Lee, and S. J. Hwang, "Lifelong learning with dynamically expandable networks," arXiv preprint, arXiv:1708.01547, 2017.
  8. H. Shin, J. K. Lee, J. Kim, and J. Kim, "Continual Learning with Deep Generative Replay," arXiv preprint, arXiv:1705.08690, 2017.
  9. G. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network," NIPS Workshop, 2014.
  10. K. McRae and P. A. Hetherington, "Catastrophic Interference is Eliminated in Pretrained Networks," in Proceedings of the 15th Annual Conference of the Cognitive Science Society, pp. 723-728, 1993.
  11. S. Zagoruyko and N. Komodakis, "Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer," arXiv preprint, arXiv:1612.03928, 2016.
  12. B. Heo, M. Lee, S. Yun, and J. Y. Choi, "Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, 2019.