• Title/Summary/Keyword: Knowledge distillation

47 search results

Performance analysis of Object detection using Self-Knowledge distillation method (자가 지식 증류 기법을 적용한 객체 검출 기법의 성능 분석)

  • Dong-Jun Kim;Seunghyun Lee;Byung-Cheol Song
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.126-128 / 2022
  • Knowledge distillation, one of the model lightweighting techniques, has recently been applied to the object detection task. Knowledge distillation is divided into three categories, and among them, self-knowledge distillation alleviates the dependency on a pre-trained teacher found in conventional knowledge distillation. Self-knowledge distillation has also been applied to the object detection task, where it reduces training cost and achieves better performance than classical teacher-based methods.

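The abstract above does not spell out the exact formulation, but the core idea of self-knowledge distillation, a model acting as its own teacher so that no pre-trained teacher is needed, can be illustrated with a minimal PyTorch sketch. The auxiliary head, temperature `T`, and weighting `alpha` below are illustrative assumptions, not the authors' setup.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(main_logits, aux_logits, labels, T=4.0, alpha=0.5):
    """Generic self-KD objective: the network's own final predictions
    act as the 'teacher' for an auxiliary (shallower) head."""
    # Hard-label loss for both heads.
    ce = F.cross_entropy(main_logits, labels) + F.cross_entropy(aux_logits, labels)
    # Soft-label loss: the auxiliary head mimics the softened final predictions.
    soft_teacher = F.softmax(main_logits.detach() / T, dim=1)
    soft_student = F.log_softmax(aux_logits / T, dim=1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    return ce + alpha * kd

if __name__ == "__main__":
    main = torch.randn(8, 20)          # final-head logits, 20 classes (toy values)
    aux = torch.randn(8, 20)           # auxiliary-head logits
    y = torch.randint(0, 20, (8,))
    print(self_distillation_loss(main, aux, y).item())
```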

Performance Improvement of SRGAN's Discriminator via Mutual Distillation (상호증류를 통한 SRGAN 판별자의 성능 개선)

  • Yeojin Lee;Hanhoon Park
    • Journal of the Institute of Convergence Signal Processing / v.23 no.3 / pp.160-165 / 2022
  • Mutual distillation is a knowledge distillation method that guides a cohort of neural networks to learn cooperatively by transferring knowledge between them, without the help of a teacher network. This paper aims to confirm whether mutual distillation is also applicable to super-resolution networks. To this end, we conduct experiments that apply mutual distillation to the discriminators of SRGANs and analyze its effect on improving SRGAN's performance. The experiments confirmed that SRGANs whose discriminators share knowledge through mutual distillation can produce super-resolution images of higher quality in both quantitative and qualitative evaluations.
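
The paper's exact formulation for SRGAN discriminators is not reproduced above; as a generic illustration of mutual distillation, the sketch below shows two peer networks that each combine their own supervised loss with a KL term toward the other peer's softened prediction. The two-class setup, temperature, and weighting are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_losses(logits_a, logits_b, labels, T=1.0):
    """Deep-mutual-learning style objective: each peer is trained with its
    own supervised loss plus a KL term toward the other peer's prediction."""
    kl_a = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                    F.softmax(logits_b.detach() / T, dim=1),
                    reduction="batchmean") * (T * T)
    kl_b = F.kl_div(F.log_softmax(logits_b / T, dim=1),
                    F.softmax(logits_a.detach() / T, dim=1),
                    reduction="batchmean") * (T * T)
    loss_a = F.cross_entropy(logits_a, labels) + kl_a
    loss_b = F.cross_entropy(logits_b, labels) + kl_b
    return loss_a, loss_b  # each peer is updated with its own loss

if __name__ == "__main__":
    la, lb = torch.randn(4, 2), torch.randn(4, 2)  # e.g. real/fake logits
    y = torch.randint(0, 2, (4,))
    print([loss.item() for loss in mutual_distillation_losses(la, lb, y)])
```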

Knowledge Distillation Based Continual Learning for PCB Part Detection (PCB 부품 검출을 위한 Knowledge Distillation 기반 Continual Learning)

  • Gang, Su Myung;Chung, Daewon;Lee, Joon Jae
    • Journal of Korea Multimedia Society / v.24 no.7 / pp.868-879 / 2021
  • PCB (Printed Circuit Board) inspection using a deep learning model requires a large amount of data and storage. As the amount of stored data increases, problems such as long training times and insufficient storage space occur. In this study, the existing object detection model is converted into a continual learning model so that it can recognize and classify PCB components that are constantly being added. By restructuring the object detection model around knowledge distillation, we propose a method that distills knowledge about previously classified parts while simultaneously learning information about new components. In the classification scenario, the transfer learning model achieves 75.9%, while the continual learning model proposed in this study reaches 90.7%.
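
The abstract does not give the exact loss; a common way to combine knowledge distillation with continual learning for a classification head (a sketch under assumptions, not the paper's method) is to add a distillation term that keeps the expanded model's predictions on the old classes close to those of the frozen previous model:

```python
import torch
import torch.nn.functional as F

def continual_kd_loss(new_logits, old_logits, labels, n_old_classes, T=2.0, lam=1.0):
    """New-task loss plus a distillation term that keeps the new model's
    predictions on the old classes close to the frozen previous model."""
    # Supervised loss over all (old + new) classes with the current labels.
    ce = F.cross_entropy(new_logits, labels)
    # Distill only the old-class logits from the frozen previous model.
    kd = F.kl_div(F.log_softmax(new_logits[:, :n_old_classes] / T, dim=1),
                  F.softmax(old_logits[:, :n_old_classes].detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    return ce + lam * kd

if __name__ == "__main__":
    n_old, n_new = 10, 5
    new_out = torch.randn(8, n_old + n_new)   # current model with an expanded head
    old_out = torch.randn(8, n_old)           # frozen previous model
    y = torch.randint(0, n_old + n_new, (8,))
    print(continual_kd_loss(new_out, old_out, y, n_old).item())
```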

Text Classification Using Heterogeneous Knowledge Distillation

  • Yu, Yerin;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.29-41 / 2022
  • Recently, with the development of deep learning technology, a variety of huge models with excellent performance have been devised by pre-training on massive amounts of text data. However, for such a model to be used in real-life services, its inference speed must be fast and its computational cost low, so model compression techniques are attracting attention. Knowledge distillation, a representative model compression technique, transfers the knowledge already learned by a teacher model to a relatively small student model and can be applied in a variety of ways. However, conventional knowledge distillation struggles with problems that have low similarity to the previously learned data, because the teacher model learns only the knowledge needed for the given task and distills it to the student model from the same point of view. Therefore, we propose a heterogeneous knowledge distillation method in which the teacher model learns a higher-level concept than the task the student model needs to solve and then distills this knowledge to the student model. In addition, through classification experiments on about 18,000 documents, we confirmed that the heterogeneous knowledge distillation method outperforms traditional knowledge distillation in both learning efficiency and accuracy.
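
Because the teacher in heterogeneous knowledge distillation is trained on a higher-level concept than the student's task, their label spaces differ and direct logit matching does not apply. One plausible realization, shown here purely as an assumption-laden sketch rather than the authors' method, is to match intermediate representations through a learned projection:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDistiller(nn.Module):
    """Project the student's hidden representation into the teacher's
    feature space and penalize the distance (the teacher is frozen)."""
    def __init__(self, student_dim, teacher_dim):
        super().__init__()
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_feat, teacher_feat):
        return F.mse_loss(self.proj(student_feat), teacher_feat.detach())

if __name__ == "__main__":
    distiller = FeatureDistiller(student_dim=128, teacher_dim=768)
    s = torch.randn(16, 128)       # student text embedding (assumed size)
    t = torch.randn(16, 768)       # teacher text embedding (assumed size)
    task_loss = torch.tensor(0.0)  # placeholder for the student's own CE loss
    loss = task_loss + distiller(s, t)
    print(loss.item())
```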

Knowledge Distillation based-on Internal/External Correlation Learning

  • Hun-Beom Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.31-39 / 2023
  • In this paper, we propose Internal/External Knowledge Distillation (IEKD), which utilizes both external correlations between feature maps of heterogeneous models and internal correlations between feature maps of the same model for transferring knowledge from a teacher model to a student model. To achieve this, we transform feature maps into a sequence format and, through a transformer that considers internal and external correlations, extract new feature maps suitable for knowledge distillation. Distilling the extracted feature maps lets the student learn both internal and external correlations, and using them for feature matching further improves the student's accuracy. To demonstrate the effectiveness of the proposed method, we achieved 76.23% top-1 image classification accuracy on the CIFAR-100 dataset with the "ResNet-32×4/VGG-8" teacher and student combination, outperforming state-of-the-art KD methods.
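
The following is only a rough sketch of the general idea described above (sequence-form feature maps refined by a transformer, then used for feature matching); the dimensions, layer counts, and loss are assumptions and do not reproduce the IEKD architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CorrelationDistiller(nn.Module):
    """Flatten teacher and student feature maps into token sequences, process
    them jointly with a transformer so both internal (within-model) and
    external (cross-model) correlations can be attended to, then match the
    refined student tokens to the refined teacher tokens."""
    def __init__(self, s_channels, t_channels, d_model=128, nhead=4, layers=2):
        super().__init__()
        self.s_embed = nn.Linear(s_channels, d_model)
        self.t_embed = nn.Linear(t_channels, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, s_feat, t_feat):
        # (B, C, H, W) -> (B, H*W, C) token sequences
        s_tokens = self.s_embed(s_feat.flatten(2).transpose(1, 2))
        t_tokens = self.t_embed(t_feat.flatten(2).transpose(1, 2))
        refined = self.encoder(torch.cat([s_tokens, t_tokens], dim=1))
        s_ref, t_ref = refined.split([s_tokens.size(1), t_tokens.size(1)], dim=1)
        return F.mse_loss(s_ref, t_ref.detach())

if __name__ == "__main__":
    dist = CorrelationDistiller(s_channels=64, t_channels=256)
    s = torch.randn(2, 64, 8, 8)    # student feature map (assumed shape)
    t = torch.randn(2, 256, 8, 8)   # teacher feature map (assumed shape)
    print(dist(s, t).item())
```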

Research on apply to Knowledge Distillation for Crowd Counting Model Lightweight (Crowd Counting 경량화를 위한 Knowledge Distillation 적용 연구)

  • Yeon-Joo Hong;Hye-Ryung Jeon;Yu-Yeon Kim;Hyun-Woo Kang;Min-Gyun Park;Kyung-June Lee
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.918-919 / 2023
  • As deep learning technology advances, model complexity is also increasing. In this study, knowledge distillation is applied to a crowd counting model for the purpose of model lightweighting. Adopting M-SFANet as the teacher model and the MCNN model, which has far fewer parameters, as the student model, applying knowledge distillation improved performance over the original MCNN model. With substantial gains in both accuracy and memory efficiency, the resulting model can run on devices with limited computing resources and is therefore expected to be widely applicable.
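
The abstract names M-SFANet as the teacher and MCNN as the student but does not state the loss; a minimal sketch of density-map distillation for crowd counting, with an assumed MSE formulation and weighting, might look like this:

```python
import torch
import torch.nn.functional as F

def crowd_counting_kd_loss(student_density, teacher_density, gt_density, lam=0.5):
    """Regression-style KD for density-map models: the student matches the
    ground-truth density map and, with weight lam, the teacher's prediction."""
    hard = F.mse_loss(student_density, gt_density)
    soft = F.mse_loss(student_density, teacher_density.detach())
    return hard + lam * soft

if __name__ == "__main__":
    s = torch.rand(4, 1, 96, 128)   # student (e.g. MCNN-style) density maps
    t = torch.rand(4, 1, 96, 128)   # teacher (e.g. M-SFANet-style) density maps
    gt = torch.rand(4, 1, 96, 128)  # ground-truth density maps
    print(crowd_counting_kd_loss(s, t, gt).item())
```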

Anchor Free Object Detection Continual Learning According to Knowledge Distillation Layer Changes (Knowledge Distillation 계층 변화에 따른 Anchor Free 물체 검출 Continual Learning)

  • Gang, Sumyung;Chung, Daewon;Lee, Joon Jae
    • Journal of Korea Multimedia Society / v.25 no.4 / pp.600-609 / 2022
  • In supervised learning, all data must be labeled; in particular, for object detection, every object in an image that is to be learned has to be labeled. Because of this, continual learning, which accumulates previously learned knowledge while minimizing catastrophic forgetting, has recently attracted attention. In this study, a continual learning model is proposed that accumulates previously learned knowledge and enables learning about new objects. The proposed method is applied to CenterNet, an anchor-free object detection model, and uses a knowledge distillation algorithm to enable continual learning. In particular, it is assumed that all output layers of the model have to be distilled for the method to be most effective. Compared to LWF, the proposed method improves mAP by 23.3%p in the 19+1 scenario and by 28.8%p in the 15+5 scenario.
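
A minimal sketch of distilling all output layers of an anchor-free detector such as CenterNet (class heatmap, object size, and center offset heads) from the frozen previous model; the head names, loss choices, and weights below are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn.functional as F

def centernet_distill_loss(new_out, old_out, w=(1.0, 0.1, 1.0)):
    """Distill all three CenterNet output heads (class heatmap, object size,
    center offset) from the frozen previous model to the current model."""
    hm_loss = F.binary_cross_entropy_with_logits(
        new_out["heatmap"], torch.sigmoid(old_out["heatmap"].detach()))
    wh_loss = F.l1_loss(new_out["wh"], old_out["wh"].detach())
    off_loss = F.l1_loss(new_out["offset"], old_out["offset"].detach())
    return w[0] * hm_loss + w[1] * wh_loss + w[2] * off_loss

if __name__ == "__main__":
    def fake_out(n_classes):
        return {"heatmap": torch.randn(2, n_classes, 96, 96),
                "wh": torch.randn(2, 2, 96, 96),
                "offset": torch.randn(2, 2, 96, 96)}
    # Distill only the 19 old classes in a 19+1 scenario (illustrative).
    new, old = fake_out(20), fake_out(19)
    new["heatmap"] = new["heatmap"][:, :19]
    print(centernet_distill_loss(new, old).item())
```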

Performance Analysis of Hint-KD Training Approach for the Teacher-Student Framework Using Deep Residual Networks (딥 residual network를 이용한 선생-학생 프레임워크에서 힌트-KD 학습 성능 분석)

  • Bae, Ji-Hoon;Yim, Junho;Yu, Jaehak;Kim, Kwihoon;Kim, Junmo
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.5 / pp.35-41 / 2017
  • In this paper, we analyze the performance of the recently introduced hint-based knowledge distillation (Hint-KD) training approach within the teacher-student framework for knowledge distillation and knowledge transfer. As the deep neural network (DNN) considered in this paper, the deep residual network (ResNet), currently regarded as a state-of-the-art DNN, is used for the teacher-student framework. When implementing Hint-KD training, we investigate the impact of the weight given to the KD information, governed by the softening factor, on classification accuracy, using the widely used open-source deep learning framework Caffe. As a result, the recognition accuracy of the student model improves when the weight of the KD information is kept fixed during training rather than gradually decreased.
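
A minimal sketch of a Hint-KD-style objective, FitNets-like hint regression on intermediate features plus softened-logit KD, together with the fixed-versus-decaying KD-weight comparison the abstract reports; the schedule, temperature, and layer sizes are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def kd_weight(epoch, total_epochs, schedule="fixed", lam=0.5):
    """The abstract's comparison: keep the KD weight fixed vs. decay it."""
    if schedule == "fixed":
        return lam
    return lam * (1.0 - epoch / total_epochs)  # linear decay (illustrative)

def hint_kd_loss(student_logits, teacher_logits, student_hint, teacher_hint,
                 regressor, labels, T=4.0, lam=0.5, beta=1.0):
    """FitNets-style hint loss on intermediate features plus softened-logit KD."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    hint = F.mse_loss(regressor(student_hint), teacher_hint.detach())
    return ce + lam * kd + beta * hint

if __name__ == "__main__":
    reg = nn.Conv2d(16, 64, kernel_size=1)   # maps the student hint to the teacher width
    s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
    s_hint, t_hint = torch.randn(8, 16, 8, 8), torch.randn(8, 64, 8, 8)
    y = torch.randint(0, 10, (8,))
    lam = kd_weight(epoch=3, total_epochs=10, schedule="fixed")
    print(hint_kd_loss(s_logits, t_logits, s_hint, t_hint, reg, y, lam=lam).item())
```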

Ensemble Knowledge Distillation for Classification of 14 Thorax Diseases using Chest X-ray Images (흉부 X-선 영상을 이용한 14 가지 흉부 질환 분류를 위한 Ensemble Knowledge Distillation)

  • Ho, Thi Kieu Khanh;Jeon, Younghoon;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.313-315 / 2021
  • Timely and accurate diagnosis of lung diseases using chest X-ray images has gained much attention from the computer vision and medical imaging communities. Although previous studies have demonstrated the capability of deep convolutional neural networks with competitive binary classification results, their models were seemingly unreliable for effectively distinguishing multiple disease groups using a large number of X-ray images. In this paper, we aim to build an advanced approach, so-called Ensemble Knowledge Distillation (EKD), that boosts classification accuracy over traditional KD methods by distilling knowledge from a cumbersome teacher model into an ensemble of lightweight student models with parallel branches trained with ground-truth labels. Learning features in the different branches of the student models enables the network to learn diverse patterns and improves the quality of the final predictions through an ensemble learning solution. Although experiments on the well-established ChestX-ray14 dataset showed that traditional KD improves classification over the base transfer learning approach, EKD would be expected to further enhance classification accuracy and model generalization, especially with an imbalanced dataset and the interdependency of 14 weakly annotated thorax diseases.

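A minimal sketch of the ensemble-distillation idea described above: one cumbersome teacher distills into several lightweight student branches trained with ground-truth (multi-label) labels, and the ensemble prediction averages the branches. The multi-label BCE formulation and the weighting are assumptions:

```python
import torch
import torch.nn.functional as F

def ensemble_kd_loss(branch_logits, teacher_logits, labels, alpha=0.5):
    """Each lightweight student branch is trained with the ground-truth labels
    plus a term pulling it toward the cumbersome teacher's (multi-label)
    predictions; the ensemble prediction averages the branches."""
    soft_targets = torch.sigmoid(teacher_logits.detach())
    total = 0.0
    for logits in branch_logits:
        hard = F.binary_cross_entropy_with_logits(logits, labels)
        soft = F.binary_cross_entropy_with_logits(logits, soft_targets)
        total = total + (1 - alpha) * hard + alpha * soft
    return total / len(branch_logits)

if __name__ == "__main__":
    n_classes = 14                                        # 14 thorax diseases
    branches = [torch.randn(4, n_classes) for _ in range(3)]
    teacher = torch.randn(4, n_classes)
    labels = torch.randint(0, 2, (4, n_classes)).float()  # multi-label ground truth
    print(ensemble_kd_loss(branches, teacher, labels).item())
    # Inference-time ensemble: average of the branch probabilities.
    ensemble_pred = torch.stack([torch.sigmoid(b) for b in branches]).mean(0)
```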

A Study of Lightening SRGAN Using Knowledge Distillation (지식증류 기법을 사용한 SRGAN 경량화 연구)

  • Lee, Yeojin;Park, Hanhoon
    • Journal of Korea Multimedia Society / v.24 no.12 / pp.1598-1605 / 2021
  • Recently, convolutional neural networks (CNNs) have been widely used with excellent performance in various computer vision fields, including super-resolution (SR). However, CNNs are computationally intensive and require a lot of memory, making them difficult to deploy on limited hardware resources such as mobile or Internet of Things devices. To overcome these limitations, network lightening studies have been actively conducted to reduce the depth or size of pre-trained deep CNN models while maintaining their performance as much as possible. This paper aims to lighten the SR CNN model SRGAN using knowledge distillation, one of the network lightening technologies; it proposes four techniques that differ in how the teacher network's knowledge is transferred to the student network and presents experiments to compare and analyze the performance of each technique. The quantitative and qualitative evaluation results confirmed that student networks trained with knowledge transfer performed better than those without it, and among the four techniques, the one that conducts adversarial learning after transferring knowledge from the teacher generator to the student generator showed the best performance.
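
The best-performing variant reported above first transfers knowledge from the teacher generator to the student generator and then performs adversarial learning. A minimal sketch of the first stage, with an assumed L1 formulation and weighting (not the paper's exact losses), is:

```python
import torch
import torch.nn.functional as F

def generator_kd_loss(student_sr, teacher_sr, hr_image, lam=1.0):
    """Stage 1 of the best-performing variant described in the abstract:
    the student generator is trained to reproduce both the ground-truth HR
    image and the (frozen) teacher generator's super-resolved output.
    Adversarial training against a discriminator would follow as stage 2."""
    recon = F.l1_loss(student_sr, hr_image)               # supervised SR loss
    distill = F.l1_loss(student_sr, teacher_sr.detach())  # imitate the teacher
    return recon + lam * distill

if __name__ == "__main__":
    s = torch.rand(2, 3, 96, 96)   # student generator output (assumed size)
    t = torch.rand(2, 3, 96, 96)   # teacher generator output
    hr = torch.rand(2, 3, 96, 96)  # ground-truth high-resolution patch
    print(generator_kd_loss(s, t, hr).item())
```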