• Title/Summary/Keyword: concept drift detection


An Effective Concept Drift Detection Method on Streaming Data Using Probability Estimates (스트리밍 데이터에서 확률 예측치를 이용한 효과적인 개념 변화 탐지 방법)

  • Kim, Young-In; Park, Cheong Hee
    • Journal of KIISE / v.43 no.6 / pp.718-723 / 2016
  • In streaming data analysis, detecting concept drift accurately is important for maintaining the performance of a classification model. Error rates are usually used for concept drift detection. However, describing prediction results with only the binary values 0 and 1 can lose useful information about the behavior pattern of a classifier. In this paper, we propose an effective concept drift detection method that describes the performance pattern of a classifier by using probability estimates for class prediction and detects a significant change in classifier behavior. Experimental results on synthetic and real streaming data show the efficiency of the proposed method in detecting the occurrence of concept drift.
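
The contrast the abstract draws between binary error values and probability estimates can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not give their statistic, so the score `1 - p(true class)`, the Welch-style z statistic, the threshold of 3, and all function names below are assumptions chosen for illustration.

```python
import math
from statistics import mean, variance


def binary_error(prob_true_class, threshold=0.5):
    """0 if the (binary) prediction is correct, 1 otherwise."""
    return 0 if prob_true_class > threshold else 1


def confidence_loss(prob_true_class):
    """Assumed graded score: how far the estimate is from full confidence."""
    return 1.0 - prob_true_class


def mean_shift_z(reference, recent):
    """Welch-style z statistic for a change in mean between two windows."""
    se = math.sqrt(variance(reference) / len(reference)
                   + variance(recent) / len(recent))
    if se == 0.0:
        return 0.0 if mean(reference) == mean(recent) else math.inf
    return (mean(recent) - mean(reference)) / se


# Hypothetical probability estimates for the true class, before and after
# the classifier becomes less certain. Every prediction is still correct.
before = [0.90, 0.85, 0.95, 0.88, 0.92] * 10
after = [0.60, 0.55, 0.65, 0.58, 0.62] * 10

z_binary = mean_shift_z([binary_error(p) for p in before],
                        [binary_error(p) for p in after])
z_confidence = mean_shift_z([confidence_loss(p) for p in before],
                            [confidence_loss(p) for p in after])

print("z from 0/1 errors:", z_binary)
print("z from probability estimates:", z_confidence)
print("change flagged from probabilities:", z_confidence > 3.0)
```

In this toy stream the 0/1 error sequence is identical in both windows, so an error-rate monitor sees nothing, while the probability-based score shifts clearly.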

Concept Drift Based on CNN Probability Vector in Data Stream Environment

  • Kim, Tae Yeun; Bae, Sang Hyun
    • Journal of Integrative Natural Science / v.13 no.4 / pp.147-151 / 2020
  • In this paper, we propose a method for detecting concept drift with a Convolutional Neural Network (CNN) in a data stream environment. Conventional methods compare only the final output value of the CNN and report concept drift whenever that value differs, so they react sensitively to inputs that do not actually differ significantly and raise false detections. To reduce such errors, the proposed method uses the probability vector as well as the output value of the CNN. First, the data entering the stream are converted into patterns and learned by the neural network model; then the output value and probability vector for the current data are compared with those for the historical data under the learned model to detect concept drift. The proposed method was confirmed to reduce detection errors compared with detecting concept drift from CNN output values alone.

A novel window strategy for concept drift detection in seasonal time series (계절성 시계열 자료의 concept drift 탐지를 위한 새로운 창 전략)

  • Do Woon Lee; Sumin Bae; Kangsub Kim; Soonhong An
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.377-379 / 2023
  • Detecting concept drift in a data stream is a major issue in maintaining the performance of a machine learning model. Because an online stream is a function of time, classical statistical methods are hard to apply. In the particular case of seasonal time series, however, a novel window strategy based on Fourier analysis makes it possible to adapt the classical methods to the series. We explore adapting the KS test to periodic time series and show that this strategy lets a complicated time series be handled like an ordinary tabular dataset. We verify that detection with this strategy ranks second in time delay and achieves the best false alarm rate and detection accuracy compared with arbitrary window sizes.
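
One possible reading of a Fourier-based window strategy is sketched below in Python: the window length is set to the dominant period found by a discrete Fourier transform, and each later window is compared with a reference window using the two-sample KS statistic. The abstract does not describe the actual procedure, so the function names, the non-overlapping windows, and the asymptotic 5% critical coefficient 1.358 are all assumptions. The critical value also presumes independent samples, which a time series window only approximates.

```python
import cmath
import math


def dominant_period(series):
    """Period (in samples) of the strongest non-zero DFT frequency."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return round(n / best_k)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def detect_drift(series, reference_len, coeff=1.358):
    """Return the estimated period and the start indices of flagged windows."""
    period = dominant_period(series[:reference_len])
    reference = series[:period]
    # Asymptotic critical value for two samples of equal size `period`.
    critical = coeff * math.sqrt(2.0 / period)
    alarms = []
    for start in range(reference_len, len(series) - period + 1, period):
        window = series[start:start + period]
        if ks_statistic(reference, window) > critical:
            alarms.append(start)
    return period, alarms


# Hypothetical seasonal stream: period 20, with a level shift from t = 240.
series = [math.sin(2 * math.pi * t / 20) + (3.0 if t >= 240 else 0.0)
          for t in range(300)]
period, alarms = detect_drift(series, reference_len=200)
print(period, alarms)
```

Because every window spans one full season, the seasonal swing itself does not change the window's value distribution, so only the level shift is flagged.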

Quantitative Estimation Method for ML Model Performance Change, Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An; Hoon-Suk Lee; Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model once it is in the business service stage, so the operational department cannot manage model performance effectively. Academically, various studies have addressed concept drift detection methods for determining whether the model status is appropriate. The operational department wants to know the performance of the operating model quantitatively, but concept drift detection can only detect the state of the model in relation to the data; it cannot estimate the model's performance quantitatively. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision from concept drift statistics. The proposed model induces artificial drift in sampling data extracted from the training data, measures the precision on the sampling data, builds a dataset of drift and precision, and is trained on it. Then, the difference between the actual and predicted precision on the test data is used to correct the error of the performance prediction model. The proposed PPM was applied to two models that can be used in real business, a loan underwriting model and a credit card fraud detection model, and was confirmed to predict precision effectively.

Application of an Adaptive Incremental Classifier for Streaming Data (스트리밍 데이터에 대한 적응적 점층적 분류기의 적용)

  • Park, Cheong Hee
    • Journal of KIISE / v.43 no.12 / pp.1396-1403 / 2016
  • In streaming data analysis, where the underlying data distribution may change or the concept of interest may drift over time, the ability to adapt to concept drift can be very powerful, especially in incremental learning. In this paper, we develop a general framework for an adaptive incremental classifier on data streams with concept drift. A distribution representing the performance pattern of a classifier is constructed from the distance between the confidence score of the classifier and a class indicator vector. A hypothesis test is then performed for concept drift detection. Based on the estimated p-value, the weight of outdated data is set automatically when updating the classifier. We apply the proposed method to two types of linear discriminant classifiers. Experimental results on streaming data with concept drift demonstrate that the proposed adaptive incremental learning method substantially improves the prediction accuracy of an incremental classifier.
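
The three ingredients named in the abstract (a distance between the confidence score and the class indicator vector, a hypothesis test, and a p-value-driven weight on outdated data) can be sketched in Python as follows. The abstract does not specify the distance, the test, or the weighting rule, so the Euclidean distance, the normal-approximation test, the choice of using the p-value directly as the weight, and every function name are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def indicator_distance(confidence, true_class):
    """Euclidean distance between a confidence vector and the one-hot
    indicator vector of the true class."""
    return math.sqrt(sum((c - (1.0 if k == true_class else 0.0)) ** 2
                         for k, c in enumerate(confidence)))


def two_sided_p_value(old, new):
    """Normal-approximation p-value for a change in mean distance
    (an assumed test, not necessarily the one used in the paper)."""
    se = math.sqrt(variance(old) / len(old) + variance(new) / len(new))
    if se == 0.0:
        return 1.0 if mean(old) == mean(new) else 0.0
    z = (mean(new) - mean(old)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def weighted_mean_update(old_mean, old_count, new_values, old_weight):
    """Blend an old running mean with a non-empty batch of new values,
    scaling the old sample count by old_weight in [0, 1]."""
    effective_old = old_weight * old_count
    total = effective_old + len(new_values)
    return (effective_old * old_mean + sum(new_values)) / total


# Hypothetical distances before and after a change in classifier behavior.
old_dist = [indicator_distance([p, 1.0 - p], 0)
            for p in (0.90, 0.85, 0.95, 0.88, 0.92)]
new_dist = [indicator_distance([p, 1.0 - p], 0)
            for p in (0.60, 0.55, 0.65, 0.58, 0.62)]

p_value = two_sided_p_value(old_dist, new_dist)
# Assumption: a small p-value (strong evidence of drift) shrinks the
# influence of outdated data on the updated statistic.
updated = weighted_mean_update(mean(old_dist), len(old_dist),
                               new_dist, old_weight=p_value)
print(p_value, updated)
```

With strong evidence of drift the p-value is close to zero, so the updated statistic is driven almost entirely by the new batch; with no evidence it stays near one and old and new data are pooled.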

Study on the Operational Concept of Underwater Acoustic Measurement System in Korean Sea (한국 환경에 적합한 기동형 수중음향측정체계 운용 개념 연구)

  • Dho, Kyeong-Cheol; Son, Kweon; Choi, Jae-Yong
    • Journal of the Korea Institute of Military Science and Technology / v.6 no.2 / pp.45-54 / 2003
  • The radiated noise of a combat ship is very important for detection and vulnerability assessment, and several kinds of underwater acoustic measurement methods have therefore been developed. This paper reviews the various measurement concepts and proposes a procedure for selecting the more suitable one given the measurement conditions. It recommends the portable drift type, which has a vertical line array, as the most efficient measurement method in Korean seas.

A Framework for Early Detection and Interpretation of Concept Drift (컨셉 드리프트를 고려한 조기탐지 및 해석 프레임워크)

  • Min-Jung Kang; Su-Bin Oh; Sang-Min Lee
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.701-704 / 2023
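  • This study proposes a framework for early detection of the point at which available production capacity declines in the semiconductor manufacturing process. To this end, an online learning approach is used so that the model can maintain optimal performance without retraining in an environment where data patterns fluctuate irregularly and frequently. The Augmented Dickey-Fuller test is used to test whether the data are stationary, and when the data change, the learning model is continuously updated. In particular, the upper-limit work-in-process (WIP) inventory is a key indicator directly tied to production volume, so it is important to identify the main causal variables at the points where it is predicted to be low. Accordingly, Shapley additive explanations (SHAP) are applied to the proposed technique, which showed the best performance among the compared models in terms of accuracy and efficiency, to analyze the variables that cause problems when production declines.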