Evaluation of Multivariate Stream Data Reduction Techniques

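  • 정훈조 (Department of Computer Information, Hanseo University);
  • 서성보;
  • 최경주 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University);
  • 박정석 (School of Electrical, Electronic and Information Engineering, Chungju National University);
  • 류근호 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University)
  • Published: 2006.12.31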

Abstract

Although user requests and data characteristics in sensor networks differ by application area, existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than on the inherent characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features so that reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques, namely Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), and experimental data sets consisting of multivariate time series, synthetic data, and robot execution failure data. The experimental results show that the SVD and Sampling methods are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, selecting the reduction method to suit each experimental data set performed well. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications that involve multivariate stream data.

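Although data characteristics and user requirements in sensor networks vary by application area, existing research on stream data reduction focuses on improving the performance of a particular reduction technique rather than on the intrinsic characteristics of the data. This paper introduces a hierarchical/distributed sensor network architecture and data model, and comparatively evaluates multivariate data reduction techniques suited to data characteristics and user requirements so that reduction techniques can be applied selectively. To compare and analyze the performance of multivariate data reduction techniques, we use standardized multivariate reduction techniques such as wavelets, HCL (Hierarchical Clustering), SVD (Singular Value Decomposition), and sampling. The experimental data consist of multidimensional time series data and robot sensor data. In the experiments, SVD and sampling outperformed the wavelet and HCL techniques in terms of relative error ratio and execution performance. In particular, because the relative error ratio of each data reduction technique depends on the characteristics of the input data, applying the reduction techniques selectively yielded good performance. This paper will be useful for applications that design and build sensor networks in which multidimensional sensor data are collected.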
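
As a rough illustration of the kind of comparison the abstract describes, the following sketch reduces a small multivariate series with a truncated SVD and with uniform sampling, then measures a relative error for each. It is not the authors' implementation. The function names (`svd_reduce`, `sample_reduce`, `relative_error`), the synthetic test signal, and the choice of the Frobenius-norm ratio as the error measure are all assumptions made for this example; the abstract does not define its relative error ratio.

```python
import numpy as np


def svd_reduce(data, k):
    """Return the rank-k SVD approximation of a (time steps x variables) matrix."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]


def sample_reduce(data, step):
    """Keep every `step`-th row and hold each kept row until the next one."""
    kept = data[::step]
    return np.repeat(kept, step, axis=0)[: len(data)]


def relative_error(original, approx):
    """Assumed error measure: ||original - approx||_F / ||original||_F."""
    return np.linalg.norm(original - approx) / np.linalg.norm(original)


# Hypothetical data: six correlated sensor channels driven by two latent signals.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
latent = np.column_stack([np.sin(t), np.cos(t)])
data = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))

err_svd = relative_error(data, svd_reduce(data, k=2))
err_sample = relative_error(data, sample_reduce(data, step=4))
print(f"SVD (k=2) relative error:        {err_svd:.4f}")
print(f"Sampling (step=4) relative error: {err_sample:.4f}")
```

Which method wins in such a test depends on the data: strongly correlated channels favor a low-rank SVD, while slowly varying signals tolerate coarse sampling. This is consistent with the abstract's point that the reduction method should be chosen according to data characteristics.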
