Automatic Identification of Database Workloads by using SVM Workload Classifier

Korean title: Automated Database Workload Identification Using an SVM Workload Classifier

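  • Soyeon Kim (Department of Computer Science, Yonsei University) ;
  • Hongchan Roh (Department of Computer Science, Yonsei University) ;
  • Sanghyun Park (Department of Computer Science, Yonsei University)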
  • Received : 2009.11.24
  • Accepted : 2010.01.15
  • Published : 2010.04.28


DBMSs are used for a range of applications, from data warehousing to on-line transaction processing. As a result of this demand, DBMSs have continued to grow in size. This growth makes manual performance tuning of a DBMS one of the most important issues. DBMS tuning should adapt to the type of workload placed on the system, but identifying workloads in mixed database applications can be quite difficult. A method for identifying workloads in a mixed database environment is therefore necessary. In this paper, we propose an SVM workload classifier that automatically identifies a DBMS workload. Database workloads were collected from the TPC-C and TPC-W benchmarks while varying the resource parameters. The parameters of the SVM workload classifier, C and the kernel parameter, were chosen experimentally. The experiments showed that the accuracy of the proposed SVM workload classifier is about 9% higher than that of the decision tree, Naive Bayes, multilayer perceptron, and K-NN classifiers.
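The experimental choice of the penalty parameter C mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a linear soft-margin SVM trained by stochastic subgradient descent (so there is no kernel parameter to tune), synthetic two-feature data standing in for real TPC-C and TPC-W snapshots, and k-fold cross-validation as the selection criterion. All function names, feature values, and the C grid are hypothetical.

```python
import random

def make_synthetic_workloads(n_per_class=60, seed=0):
    """Hypothetical 2-feature workload snapshots (NOT real TPC-C/TPC-W data).
    Label +1 stands for one workload type, -1 for the other."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)], 1))
        data.append(([rng.gauss(-2.0, 1.0), rng.gauss(-2.0, 1.0)], -1))
    rng.shuffle(data)
    return data

def decision_value(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(samples, C, epochs=50, lr=0.01, seed=0):
    """Approximately minimise 0.5*||w||^2 + C * sum(hinge loss)
    with stochastic subgradient descent (assumed training method)."""
    rng = random.Random(seed)
    n = len(samples)
    w, b = [0.0] * len(samples[0][0]), 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            violated = y * decision_value(w, b, x) < 1.0
            # Regulariser gradient, spread over the n per-sample steps.
            w = [wj - lr * wj / n for wj in w]
            if violated:
                # Hinge-loss subgradient for a margin-violating sample.
                w = [wj + lr * C * y * xj for wj, xj in zip(w, x)]
                b += lr * C * y
    return w, b

def accuracy(model, samples):
    """Fraction of samples whose predicted sign matches the label."""
    w, b = model
    hits = sum(1 for x, y in samples
               if (1 if decision_value(w, b, x) >= 0 else -1) == y)
    return hits / len(samples)

def cross_val_accuracy(samples, C, k=5):
    """Mean held-out accuracy over k folds."""
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(train_linear_svm(train, C), folds[i]))
    return sum(scores) / k

def select_C(samples, grid=(0.01, 0.1, 1.0, 10.0)):
    """Return the C with the best cross-validated accuracy, plus all scores."""
    results = {C: cross_val_accuracy(samples, C) for C in grid}
    best = max(results, key=results.get)
    return best, results

data = make_synthetic_workloads()
best_C, cv_results = select_C(data)
print("best C:", best_C)
print("cross-validated accuracy per C:", cv_results)
```

With a kernel SVM, the same loop would range over pairs of C and the kernel parameter instead of C alone. Scoring each candidate on held-out folds rather than on the training data keeps the chosen setting from simply reflecting overfitting.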


Workload Classification; Support Vector Machine; Database Management System; Database Tuning


Supported by: National Research Foundation of Korea

