Improving Process Mining with Trace Clustering

자취 군집화를 통한 프로세스 마이닝의 성능 개선 (Korean title: "Improving the Performance of Process Mining through Trace Clustering")

  • Song, Min-Seok (Faculty of Technology Management, Eindhoven University of Technology) ;
  • Günther, C.W. (Faculty of Technology Management, Eindhoven University of Technology) ;
  • van der Aalst, W.M.P. (Faculty of Technology Management, Eindhoven University of Technology) ;
  • Jung, Jae-Yoon (Department of Industrial Engineering, Kyung Hee University)
  • Received : 2008.07.02
  • Accepted : 2008.10.11
  • Published : 2008.12.31

Abstract

Process mining aims to extract valuable information from process execution records (called "event logs"). Although process mining techniques have proven valuable, the models mined from real process logs are usually too complex to interpret. The main cause of this complexity is the diversity within process logs. To address this issue, this paper proposes a trace clustering approach that splits a process log into homogeneous subsets and applies existing process mining techniques to each subset. The approach derives log profiles from the process log and applies existing clustering techniques to these profiles to form the clusters. The approach is implemented in the ProM framework, and a real-life case study is presented to illustrate it.
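
The splitting step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the ProM implementation: the profile used here (a count of each activity per trace), the choice of k-means as the clustering technique, and all names (`activity_profile`, `k_means`, `split_log`, `example_log`) are hypothetical and do not come from the abstract.

```python
from collections import Counter


def activity_profile(log):
    """Turn each trace into a vector of activity counts (assumed profile)."""
    activities = sorted({a for trace in log for a in trace})
    vectors = []
    for trace in log:
        counts = Counter(trace)
        vectors.append([counts[a] for a in activities])
    return activities, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two profile vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def k_means(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Seed the next centroid with the vector farthest from all current ones.
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def split_log(log, k):
    """Split a log into k sub-logs, one per cluster of similar traces."""
    _, vectors = activity_profile(log)
    labels = k_means(vectors, k)
    sublogs = [[] for _ in range(k)]
    for trace, lab in zip(log, labels):
        sublogs[lab].append(trace)
    return sublogs


# Hypothetical toy log: each trace is the ordered list of activities of one case.
example_log = [
    ["register", "check", "approve", "archive"],
    ["register", "check", "check", "approve", "archive"],
    ["register", "reject", "notify"],
    ["register", "reject", "notify", "notify"],
]
sublogs = split_log(example_log, k=2)
for i, sublog in enumerate(sublogs):
    print(i, sublog)
```

Each resulting sub-log could then be passed separately to an existing process discovery technique, which is the step where simpler models would be expected. Other profiles or clustering techniques could be substituted without changing the overall structure.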

Keywords

Process Mining; Trace Clustering; Workflow; Data Mining; SOM
