Data processing techniques applying data mining based on enterprise cloud computing

  • Kang, In-Seong (Dept. of Information Management Engineering, Korea University) ;
  • Kim, Tae-Ho (Dept. of Information Management Engineering, Korea University) ;
  • Lee, Hong-Chul (Dept. of Industrial Management Engineering, Korea University)
  • Received : 2011.03.21
  • Reviewed : 2011.05.06
  • Published : 2011.08.31

Abstract

Cloud computing has recently drawn attention as a service that will lead the digital revolution. Its industrial impact is large because it can be used anytime and anywhere through an Internet connection, and because it lets users share data easily across information and communication devices such as smartphones, netbooks, and PDAs. As work in business departments is progressively integrated through cloud-computing-based collaboration systems, the volume of data shared between related departments grows, so practitioners need a way to find and use the data they need more easily. Previous studies simplified the search process through clustering. This paper instead proposes a system that uses a hash function to remove the data duplication that frequently occurs between related departments and to improve system performance, and that uses a Bayesian network, a data mining technique, so that information about changed data is reflected dynamically and data suited to each practitioner is classified. Compared with the existing method, the proposed system showed improved search capability and was also confirmed to be efficient in system performance, including CPU and network bandwidth usage.
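
The two mechanisms named in the abstract, hash-based duplicate removal and Bayesian classification with dynamic updates, can be illustrated with a minimal Python sketch. It is not the authors' implementation. SHA-256 is an assumed choice of hash function, and a naive Bayes classifier (the simplest Bayesian-network structure, with the class as the only parent of every feature) stands in for the paper's Bayesian network. The file names, department labels, and the helper names `deduplicate` and `NaiveBayes` are hypothetical.

```python
import hashlib
import math
from collections import Counter, defaultdict


def deduplicate(documents):
    """Keep the first copy of each distinct content, keyed by its SHA-256 digest."""
    seen = {}
    for name, content in documents:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        seen.setdefault(digest, (name, content))  # later duplicates are ignored
    return list(seen.values())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over whitespace tokens."""

    def __init__(self):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()

    def update(self, text, label):
        # Incremental update: new or changed data adjusts the counts directly,
        # so no full retraining pass is needed.
        words = text.lower().split()
        self.label_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior of the label
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical shared files: the second is a byte-identical copy of the first.
shared = [
    ("sales/q1.txt", "quarterly revenue forecast by region"),
    ("finance/q1_copy.txt", "quarterly revenue forecast by region"),
    ("hr/hiring.txt", "open positions and interview schedule"),
]
unique_docs = deduplicate(shared)

model = NaiveBayes()
model.update("quarterly revenue forecast by region", "sales")
model.update("invoice revenue budget report", "sales")
model.update("open positions and interview schedule", "hr")
model.update("employee interview onboarding schedule", "hr")
predicted = model.predict("revenue forecast report")
```

In this sketch the copy under `finance/` is dropped because its digest matches the first file, and each `update` call changes the classifier's counts immediately, which is one simple way to reflect changed data dynamically.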
