Frequent Items Mining based on Regression Model in Data Streams

Prediction of Frequent Items Based on Regression Analysis in Stream Data

  • 이욱현 (Department of Computer Information, Hanbuk University)
  • Published : 2009.01.28


Data in a stream environment is massive, continuous, and unbounded. However, stream processing tasks such as query processing and data analysis must run with limited disk or memory capacity. In such an environment, traditional frequent pattern discovery designed for transaction databases cannot be applied, because it is difficult to continuously maintain the information needed to decide whether each item in a continuous stream is frequent. In this paper, we propose a method that predicts frequent items by applying a regression model to continuous stream data. Constructing a regression model on the stream yields a prediction model for items whose frequency is not yet determined. Through a variety of experiments, we show that the proposed method can be used efficiently in a stream data environment.
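The abstract does not specify the form of the regression model, so the following is only a minimal sketch of the general idea under stated assumptions: the stream is cut into fixed-size windows, each item's support per window is treated as a time series, and a simple least-squares line is extrapolated one window ahead. The names `window_counts`, `fit_line`, `predict_frequent`, `window_size`, and `min_support` are hypothetical and do not come from the paper.

```python
from collections import Counter


def window_counts(stream, window_size):
    """Count items in each consecutive, non-overlapping window of the stream."""
    return [Counter(stream[i:i + window_size])
            for i in range(0, len(stream) - window_size + 1, window_size)]


def fit_line(ys):
    """Ordinary least squares fit of y = a + b*t for t = 0..n-1."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    b = sxy / sxx if sxx else 0.0  # a single window gives a flat line
    a = y_mean - b * t_mean
    return a, b


def predict_frequent(stream, window_size, min_support):
    """Predict which items will be frequent in the next (unseen) window.

    Assumption: support is modeled as a linear function of the window index.
    The predicted support is clamped to the valid range [0, 1].
    """
    counts = window_counts(stream, window_size)
    n = len(counts)
    items = set().union(*counts)
    predicted = {}
    for item in items:
        supports = [c[item] / window_size for c in counts]
        a, b = fit_line(supports)
        predicted[item] = min(1.0, max(0.0, a + b * n))
    frequent = {item for item, s in predicted.items() if s >= min_support}
    return frequent, predicted


# Three windows of four items each; "a" grows while "b" and "c" fade.
stream = ["a", "a", "b", "c",
          "a", "a", "a", "b",
          "a", "a", "a", "a"]
frequent, predicted = predict_frequent(stream, window_size=4, min_support=0.5)
```

Only the per-window supports are kept per item, not the raw stream, which is the property that makes a regression-based predictor attractive when memory is limited. A sketch like this still tracks every distinct item, so it does not by itself bound memory on a stream with an unbounded item domain.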


Stream Data;Frequent Item;Regression Models;Prediction

